I am developing a custom aggregator on ES 1.3.4. It extends the NumericMetricsAggregator.MultiValue class, and its code structure closely resembles that of the Stats aggregator. My use case requires that the overridden collect() method receive doc IDs in ascending order. For most queries, that is what I get. For bool queries with multiple should clauses, however, the doc IDs arrive in descending order. How can I fix this? Is this a bug?
Best answer
I asked the same question on GitHub and got an answer that worked for me. Here is the solution:
You can call aggregationContext.ensureScoreDocsInOrder(); to make sure that docs arrive in order. For an example, have a look at ReverseNestedAggregator, which uses this method.
Queries are indeed allowed to emit documents out of order when they are permitted to and when doing so makes things faster. I believe the only case where this happens today is when you get Lucene's BooleanScorer, which is used for top-level disjunctions, so your observation makes sense.
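To illustrate the pattern, here is a minimal, self-contained sketch. It does not use the real Elasticsearch classes: SketchContext, OrderSensitiveAggregator, and OrderSketch are hypothetical stand-ins, and the toy search loop only imitates the effect of the flag by sorting (the real engine would presumably pick an in-order scorer instead). The point it tries to show is that the aggregator declares its ordering requirement in its constructor, before any document is collected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the aggregation context; only the ordering flag is modeled.
final class SketchContext {
    private boolean scoreDocsInOrder = false;

    void ensureScoreDocsInOrder() { scoreDocsInOrder = true; }

    boolean scoreDocsInOrder() { return scoreDocsInOrder; }
}

// Hypothetical aggregator whose collect() depends on ascending doc IDs.
final class OrderSensitiveAggregator {
    private int lastDoc = -1;
    private final List<Integer> seen = new ArrayList<>();

    OrderSensitiveAggregator(SketchContext context) {
        // Declare the ordering requirement up front, before collection starts.
        context.ensureScoreDocsInOrder();
    }

    void collect(int doc) {
        // Fail fast if the ordering assumption is ever violated.
        if (doc <= lastDoc) {
            throw new IllegalStateException("doc " + doc + " arrived after doc " + lastDoc);
        }
        lastDoc = doc;
        seen.add(doc);
    }

    List<Integer> seen() { return seen; }
}

final class OrderSketch {
    // Toy search loop: emits matches in arbitrary order unless the context asks for in-order docs.
    static void run(int[] matches, SketchContext context, OrderSensitiveAggregator aggregator) {
        int[] docs = matches.clone();
        if (context.scoreDocsInOrder()) {
            Arrays.sort(docs);
        }
        for (int doc : docs) {
            aggregator.collect(doc);
        }
    }
}
```

In a real aggregator the same idea would amount to calling aggregationContext.ensureScoreDocsInOrder() from the constructor, as ReverseNestedAggregator is said to do above. The ordering check inside collect() is an optional safeguard added for this sketch and is not part of the original answer.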
Link to the issue: https://github.com/elasticsearch/elasticsearch/issues/8216