I have a movies index in which each document has this structure:

    {
      "color": "Color",
      "director_name": "Sam Raimi",
      "actor_2_name": "James Franco",
      "movie_title": "Spider-Man 2",
      "actor_3_name": "Brad Pitt",
      "actor_1_name": "J.K. Simmons"
    }

I need to calculate the number of movies for each actor (an actor can appear in any of the actor_1_name, actor_2_name, or actor_3_name fields).
The mapping for these three fields is:

    "mappings": {
      "properties": {
        "actor_1_name": {
          "type": "text",
          "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
        },
        "actor_2_name": {
          "type": "text",
          "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
        },
        "actor_3_name": {
          "type": "text",
          "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
        }
      }
    }

Is there a way to get an aggregated result that combines the terms from all three actor fields into a single aggregation?
Currently I create a separate aggregation for each actor field and merge the results into one in my Java code.
Search query using separate aggregations:

    {
      "aggs": {
        "actor1_count": { "terms": { "field": "actor_1_name.keyword" } },
        "actor2_count": { "terms": { "field": "actor_2_name.keyword" } },
        "actor3_count": { "terms": { "field": "actor_3_name.keyword" } }
      }
    }

Sample result:

    "aggregations": {
      "actor1_count": { "buckets": [ { "key": "Johnny Depp", "doc_count": 2 } ] },
      "actor2_count": { "buckets": [ { "key": "Johnny Depp", "doc_count": 1 } ] },
      "actor3_count": { "buckets": [ { "key": "Johnny Depp", "doc_count": 3 } ] }
    }

So, instead of creating separate aggregations, is it possible to have Elasticsearch combine the results of all three into a single aggregation?
Basically, this is what I want:

    "aggregations": {
      "actor_count": {
        "buckets": [ { "key": "Johnny Depp", "doc_count": 6 } ]
      }
    }

(The doc_count for Johnny Depp should be the sum across all three fields, actor_1_name, actor_2_name, and actor_3_name, wherever he appears.)
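For reference, the client-side merge described above amounts to summing doc_count per bucket key across the three terms aggregations. The question's Java code is not shown, so this is only a minimal sketch in Python; the function name `merge_term_buckets` is hypothetical, and the input is the sample response quoted above.

```python
from collections import Counter


def merge_term_buckets(aggregations):
    """Sum doc_count per bucket key across all terms aggregations in a response.

    `aggregations` is the "aggregations" object of a search response,
    already parsed from JSON into a dict.
    """
    totals = Counter()
    for agg in aggregations.values():
        for bucket in agg.get("buckets", []):
            totals[bucket["key"]] += bucket["doc_count"]
    return dict(totals)


# Sample response taken from the question.
aggregations = {
    "actor1_count": {"buckets": [{"key": "Johnny Depp", "doc_count": 2}]},
    "actor2_count": {"buckets": [{"key": "Johnny Depp", "doc_count": 1}]},
    "actor3_count": {"buckets": [{"key": "Johnny Depp", "doc_count": 3}]},
}

merged = merge_term_buckets(aggregations)
print(merged)  # {'Johnny Depp': 6}
```

One caveat with this approach: a terms aggregation returns only the top 10 buckets by default, so an actor who falls outside the top buckets of one field is undercounted in the merged total unless `size` is raised on each aggregation.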
I tried doing this with a script, but it did not work correctly.

    {
      "aggregations": {
        "name": {
          "terms": {
            "script": "doc['actor_1_name.keyword'].value + ' ' + doc['actor_2_name.keyword'].value + ' ' + doc['actor_2_name.keyword'].value"
          }
        }
      }
    }

It concatenates the actor names into a single string and buckets on that combined string:

    "buckets": [
      { "key": "Steve Buscemi Adam Sandler Adam Sandler", "doc_count": 6 },
      { "key": "Leonard Nimoy Nichelle Nichols Nichelle Nichols", "doc_count": 4 }
    ]

Recommended answer

This is not going to work with terms. You will have to resort to scripted_metric, I think:
    GET actors/_search
    {
      "size": 0,
      "aggs": {
        "merged_actors": {
          "scripted_metric": {
            "init_script": "state.actors_map = [:]",
            "map_script": """
              def actor_keys = ['actor_1_name', 'actor_2_name', 'actor_3_name'];
              for (def key : actor_keys) {
                def field = doc[key + '.keyword'];
                // Only count the field when the document has a value for it
                if (field.size() > 0) {
                  def actor_name = field.value;
                  if (state.actors_map.containsKey(actor_name)) {
                    state.actors_map[actor_name] += 1;
                  } else {
                    state.actors_map[actor_name] = 1;
                  }
                }
              }
            """,
            "combine_script": "return state",
            "reduce_script": "return states"
          }
        }
      }
    }

yielding
    ...
    "aggregations": {
      "merged_actors": {
        "value": [
          {
            "actors_map": {
              "Brad Pitt": 5,
              "J.K. Simmons": 1,
              "James Franco": 3
            }
          }
        ]
      }
    }
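Because the reduce_script returns `states` unchanged, `value` is a list with one entry per shard state rather than a single merged map. On a multi-shard index the same actor can therefore appear in several entries, and the counts still need to be added up on the client. A minimal sketch of that last step in Python, assuming the response has already been parsed from JSON (the function name `merge_shard_states` is hypothetical, and the `None` check is a defensive assumption rather than something the answer states):

```python
from collections import Counter


def merge_shard_states(value):
    """Add up the per-shard actors_map dicts returned by the scripted_metric."""
    totals = Counter()
    for state in value:
        if state:  # defensively skip empty or missing shard states
            totals.update(state.get("actors_map", {}))
    return totals


# "value" as shown in the answer's sample response (a single shard state).
value = [
    {"actors_map": {"Brad Pitt": 5, "J.K. Simmons": 1, "James Franco": 3}}
]

totals = merge_shard_states(value)
for actor, count in totals.most_common():
    print(actor, count)
```

`Counter.update` adds counts instead of overwriting them, which is what makes the per-shard maps safe to fold together; `most_common()` then gives the actors ordered by movie count, similar to the bucket ordering of a terms aggregation.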