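There is no perfect program in the world, but that does not discourage us, because writing programs is a process of continually pursuing perfection. - 侯氏工坊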
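- Principle: compare how often a term occurs in the foreground set (the documents matching the query) with how often it occurs in the background set (the whole index). Terms that are noticeably more frequent in the foreground are reported as significant.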
significant_text
from elasticsearch import Elasticsearch
import urllib3
# Suppress the TLS warnings triggered by verify_certs=False (test environments only)
urllib3.disable_warnings()
# PUT es_significant_text
# {
#   "mappings": {
#     "properties": {
#       "name": {"type": "text"},
#       "type": {"type": "keyword"}
#     }
#   }
# }
# POST es_significant_text/_bulk
# {"index": {"_id": 1}}
# {"name": "es hello good", "type": "lan"}
# {"index": {"_id": 2}}
# {"name": "good ttt ml", "type": "lan"}
# {"index": {"_id": 3}}
# {"name": "es kkk ksdl", "type": "lan"}
# {"index": {"_id": 4}}
# {"name": "elastic title", "type": "lan"}
# {"index": {"_id": 5}}
# {"name": "es jnlsjdin", "type": "te"}
# {"index": {"_id": 6}}
# {"name": "good dsfsd", "type": "te"}
# GET es_significant_text/_search
# {
#   "query": {
#     "term": {
#       "type": {
#         "value": "te"
#       }
#     }
#   },
#   "size": 0,
#   "aggs": {
#     "my_significant_text": {
#       "significant_text": {
#         "field": "name",
#         "min_doc_count": 1
#       }
#     }
#   }
# }
# Create the Elasticsearch client instance
es = Elasticsearch(
    "https://192.168.2.64:9200",
    verify_certs=False,
    basic_auth=("elastic", "MuZkDqdW--VsfDjTcoex"),
    request_timeout=60,
    max_retries=3,
    retry_on_timeout=True,
    node_selector_class="round_robin",
)
# Refresh the index so the bulk-indexed documents are visible to search
es.indices.refresh(index="es_significant_text")
significant_text = {
    "my_significant_text": {
        "significant_text": {
            "field": "name",
            "min_doc_count": 1
        }
    }
}
query = {
    "term": {
        "type": {
            "value": "te"
        }
    }
}
resp = es.search(index="es_significant_text", size=0, query=query, aggregations=significant_text)
print(resp['aggregations']['my_significant_text']['buckets'])
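Printing the raw bucket list is hard to read, so a small helper can flatten it into rows. The sketch below assumes each bucket carries the `key`, `doc_count`, `bg_count` and `score` fields that the significant text aggregation normally returns. `summarize_buckets` is a hypothetical helper name, and the sample response is hand-built for illustration, not output captured from a cluster.

```python
def summarize_buckets(resp, agg_name="my_significant_text"):
    """Flatten significant_text buckets into (key, doc_count, bg_count, score) rows."""
    buckets = resp["aggregations"][agg_name]["buckets"]
    return [
        # .get() is used because this sketch does not rely on every field being present.
        (b["key"], b["doc_count"], b.get("bg_count"), b.get("score"))
        for b in buckets
    ]


# Hand-built stand-in for a response (illustrative values, not real cluster output)
sample_resp = {
    "aggregations": {
        "my_significant_text": {
            "buckets": [
                {"key": "dsfsd", "doc_count": 1, "bg_count": 1, "score": 1.0},
                {"key": "good", "doc_count": 1, "bg_count": 2, "score": 0.25},
            ]
        }
    }
}

for key, doc_count, bg_count, score in summarize_buckets(sample_resp):
    print(f"{key}: doc_count={doc_count} bg_count={bg_count} score={score}")
```

The same call should work on the `resp` object returned by `es.search(...)` above, since it supports the same key-based access used in the `print` statement.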