I have a column family with a secondary index 'pointer'. How do I remove multiple rows that have the same 'pointer' value (e.g. abc)?
The only option I know is:

    expr = create_index_expression('pointer', 'abc')
    clause = create_index_clause([expr])
    for key, user in cassandra_cf.get_indexed_slices(clause):
        cassandra_cf.remove(key)

But I know this is very inefficient and can take a long time if I have thousands of rows with the same 'pointer' value. Are there any other options?
Best answer
You can remove multiple rows at once:

    from pycassa.index import create_index_expression, create_index_clause

    expr = create_index_expression('pointer', 'abc')
    clause = create_index_clause([expr])
    # Queue the removes on a batch mutator instead of sending one request per row.
    with cassandra_cf.batch() as b:
        for key, user in cassandra_cf.get_indexed_slices(clause):
            b.remove(key)

This groups the removes into batches of 100 (the default queue size). When the batch object is used as a context manager, as it is here, it automatically sends any remaining mutations once the with block is left.
You can read more about this in the pycassa.batch API docs.
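
To make the batching behaviour concrete, here is a small toy sketch of the idea. It does not use pycassa or a real cluster: `ToyBatch` and `sent_batches` are hypothetical names invented for illustration, and the class only mimics the behaviour described above (flush every `queue_size` removes, then flush whatever is left when the with block exits).

```python
class ToyBatch:
    """Hypothetical stand-in for a batch mutator; not pycassa's implementation."""

    def __init__(self, send, queue_size=100):
        self._send = send              # callable that receives a list of row keys
        self._queue_size = queue_size  # how many removes to queue before sending
        self._pending = []

    def remove(self, key):
        # Queue the removal and send a full batch once the queue is full.
        self._pending.append(key)
        if len(self._pending) >= self._queue_size:
            self.flush()

    def flush(self):
        # Send whatever is queued, if anything, and start a new queue.
        if self._pending:
            self._send(list(self._pending))
            self._pending = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the with block sends the remaining queued removes.
        self.flush()
        return False


sent_batches = []  # records each batch that would go to the server
with ToyBatch(sent_batches.append, queue_size=100) as b:
    for i in range(250):
        b.remove("row%d" % i)

batch_sizes = [len(batch) for batch in sent_batches]
print(batch_sizes)  # [100, 100, 50]
```

So 250 removes turn into three round trips instead of 250. With the real library, the batch size should be adjustable through the queue_size argument of batch(). It is also worth checking the count argument of create_index_clause: as far as I recall it defaults to 100 and caps how many rows the index query returns, so for thousands of matching rows you would need to raise it or repeat the query until nothing comes back.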