ES Autocomplete

Updated: 2024-10-09 09:17:12


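Pinyin Tokenization

To autocomplete from typed letters, documents must be tokenized by pinyin.

A pinyin analysis plugin for elasticsearch is available on GitHub.

Address: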

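Custom Analyzer

An analyzer in elasticsearch consists of three parts:

  • character filters: process the text before the tokenizer, for example by deleting or replacing characters

  • tokenizer: splits the text into terms according to a set of rules. For example, keyword does not split the text at all; ik_smart is another tokenizer

  • token filters: further process the terms that the tokenizer emits, for example case conversion, synonym handling, or pinyin conversion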
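The three stages above can be pictured with a small toy pipeline. This is only a sketch in plain Java, not Elasticsearch code: the class name AnalyzerPipelineSketch and its methods are hypothetical, and the concrete rules (replace "&", split on whitespace, lowercase) are arbitrary stand-ins for a character filter, a tokenizer, and a token filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model of an analyzer: character filter -> tokenizer -> token filter.
// This is NOT Elasticsearch code; every name here is hypothetical.
class AnalyzerPipelineSketch {

    // Character filter: rewrites the raw text before tokenization.
    static String replaceAmpersand(String text) {
        return text.replace("&", " and ");
    }

    // Tokenizer: splits the filtered text into terms (here: on whitespace).
    static List<String> whitespaceTokenizer(String text) {
        List<String> terms = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                terms.add(part);
            }
        }
        return terms;
    }

    // Token filter: post-processes each term (here: lowercasing).
    static String lowercase(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    // Runs the three stages in the order an analyzer applies them.
    static List<String> analyze(String text) {
        List<String> result = new ArrayList<>();
        for (String term : whitespaceTokenizer(replaceAmpersand(text))) {
            result.add(lowercase(term));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Tom & Jerry"));
    }
}
```

The order matters: the character filter sees the whole string, while the token filter only ever sees individual terms.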

We can configure a custom analyzer through settings when creating the index:

PUT /test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "ik_max_word",
          "filter": "py"
        }
      },
      "filter": {
        "py": {
          "type": "pinyin",
          "keep_full_pinyin": false,
          "keep_joined_full_pinyin": true,
          "keep_original": true,
          "limit_first_letter_length": 16,
          "remove_duplicated_term": true,
          "none_chinese_pinyin_tokenize": false
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "my_analyzer",
        "search_analyzer": "ik_smart"
      }
    }
  }
}

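The pinyin analyzer is suitable for building the inverted index, but it should not be used at search time. If a Chinese query were also converted to pinyin, it would match every homophone in the index. This is why the mapping above sets search_analyzer to ik_smart while analyzer is my_analyzer.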
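To see why pinyin conversion belongs at index time only, here is a minimal sketch of a toy inverted index in plain Java. It does not use the pinyin plugin: the class PinyinSearchSketch, its methods, and the hard-coded two-entry pinyin table are all hypothetical and exist only for this illustration. The two sample terms, 狮子 (lion) and 虱子 (louse), are both read as "shizi".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index showing the homophone problem.
// Hypothetical names throughout; this is not the pinyin plugin.
class PinyinSearchSketch {

    // Hard-coded pinyin table for the two demo terms (an assumption).
    static final Map<String, String> PINYIN = Map.of("狮子", "shizi", "虱子", "shizi");

    // Pinyin-style analysis: keep the original term and add its joined pinyin.
    static List<String> withPinyin(String term) {
        List<String> terms = new ArrayList<>();
        terms.add(term);
        String pinyin = PINYIN.get(term);
        if (pinyin != null) {
            terms.add(pinyin);
        }
        return terms;
    }

    // Index time: every document is analyzed with pinyin.
    static Map<String, Set<Integer>> buildIndex(Map<Integer, String> docs) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (Map.Entry<Integer, String> doc : docs.entrySet()) {
            for (String term : withPinyin(doc.getValue())) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(doc.getKey());
            }
        }
        return index;
    }

    // Search time: the query is analyzed with or without pinyin.
    static Set<Integer> search(Map<String, Set<Integer>> index, String query,
                               boolean pinyinAtSearchTime) {
        List<String> queryTerms = pinyinAtSearchTime ? withPinyin(query) : List.of(query);
        Set<Integer> hits = new TreeSet<>();
        for (String term : queryTerms) {
            hits.addAll(index.getOrDefault(term, Set.of()));
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> index = buildIndex(Map.of(1, "狮子", 2, "虱子"));
        System.out.println(search(index, "狮子", true));   // pinyin applied to the query
        System.out.println(search(index, "狮子", false));  // query left as typed
        System.out.println(search(index, "shizi", false)); // user typed letters
    }
}
```

With pinyin applied to the query, searching for 狮子 also hits the 虱子 document. Without it, the Chinese query matches only its own document, and typing "shizi" still finds both because the pinyin terms were stored at index time.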

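Completion Suggester Query

elasticsearch provides the Completion Suggester query to implement autocomplete. It matches terms that begin with the user's input and returns them.

To keep completion queries efficient, the fields in the document must satisfy some constraints:

  • A field that takes part in a completion query must be of type completion.

  • The field's content is usually an array of the terms to be offered as completions.

The index therefore has to be set up accordingly. For example: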

PUT /test2
{
  "mappings": {
    "properties": {
      "title": {
        "type": "completion"
      }
    }
  }
}

POST /test2/_doc/1
{
  "title": ["Sony", "performance"]
}

POST /test2/_doc/2
{
  "title": ["SK-II", "PITERA"]
}

POST /test2/_doc/3
{
  "title": ["Nintendo", "switch"]
}

POST /test2/_doc/4
{
  "title": ["Sony", "Nintendo"]
}

The query looks like this:

GET /test2/_search
{
  "suggest": {
    "title_suggest": {
      "text": "s",
      "completion": {
        "field": "title",
        "skip_duplicates": true,
        "size": 10
      }
    }
  }
}
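The semantics of this query (prefix match, skip_duplicates, size) can be approximated with a few lines of plain Java. This is only a rough sketch: the class CompletionSketch and its suggest method are hypothetical, and it ignores weights, scoring, and Elasticsearch's actual result ordering. Case-insensitive matching is an assumption made here so that the prefix "s" matches "Sony".

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Rough model of prefix completion over array-valued fields.
// Hypothetical names; not how Elasticsearch implements it internally.
class CompletionSketch {

    // Returns up to `size` inputs that start with `prefix` (case-insensitive),
    // optionally dropping repeated texts, in document order.
    static List<String> suggest(List<List<String>> docs, String prefix,
                                boolean skipDuplicates, int size) {
        List<String> options = new ArrayList<>();
        if (size <= 0) {
            return options;
        }
        String needle = prefix.toLowerCase(Locale.ROOT);
        Set<String> seen = new HashSet<>();
        for (List<String> inputs : docs) {
            for (String input : inputs) {
                if (!input.toLowerCase(Locale.ROOT).startsWith(needle)) {
                    continue; // not a prefix match
                }
                if (skipDuplicates && !seen.add(input)) {
                    continue; // same text already suggested
                }
                options.add(input);
                if (options.size() == size) {
                    return options;
                }
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // Same sample data as the four documents indexed into test2.
        List<List<String>> docs = List.of(
                List.of("Sony", "performance"),
                List.of("SK-II", "PITERA"),
                List.of("Nintendo", "switch"),
                List.of("Sony", "Nintendo"));
        System.out.println(suggest(docs, "s", true, 10));
    }
}
```

With skipDuplicates enabled, "Sony" is offered once even though it appears in two documents, which mirrors the intent of skip_duplicates in the query.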

Implementing Search Box Autocomplete

Java API

Now let's look at how the result is parsed:

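// Imports used by this snippet (Elasticsearch High Level REST Client)
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.search.suggest.Suggest;
import org.elasticsearch.search.suggest.SuggestBuilder;
import org.elasticsearch.search.suggest.SuggestBuilders;
import org.elasticsearch.search.suggest.completion.CompletionSuggestion;

// Autocomplete
// "client" is assumed to be a RestHighLevelClient field initialized elsewhere.
@Test
public void testSuggest() throws Exception {
    // 1. Create the request
    SearchRequest request = new SearchRequest("hotel");
    // 2. Prepare the DSL parameters
    request.source().suggest(new SuggestBuilder().addSuggestion(
            "hotelSuggestion",
            SuggestBuilders.completionSuggestion("suggestion")
                    .prefix("r")
                    .skipDuplicates(true)
                    .size(10)));
    // 3. Send the request
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // 4. Parse the response: look up the suggestion by the name used in step 2
    Suggest suggest = response.getSuggest();
    CompletionSuggestion suggestion = suggest.getSuggestion("hotelSuggestion");
    for (CompletionSuggestion.Entry.Option option : suggestion.getOptions()) {
        String text = option.getText().string();
        System.out.println(text);
    }
}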

IHotelService

List<String> getSuggestions(String prefix) throws IOException;

HotelController

@GetMapping("/suggestion")
public List<String> getSuggestions(@RequestParam("key") String prefix) throws IOException {
    return hotelService.getSuggestions(prefix);
}
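A caller of this endpoint has to pass the typed prefix in the key query parameter, URL-encoded. The sketch below shows one way a Java client might build that URL. The class SuggestionUriSketch and the base URL http://localhost:8080/hotel are assumptions for illustration only, since the source shows just the "/suggestion" mapping and not the controller's base path or host.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Builds the request URI for the /suggestion endpoint.
// The base URL is a hypothetical value supplied by the caller.
class SuggestionUriSketch {

    static URI buildSuggestionUri(String baseUrl, String prefix) {
        // Encode the prefix so characters such as '&' or spaces
        // cannot break the query string.
        String encoded = URLEncoder.encode(prefix, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/suggestion?key=" + encoded);
    }

    public static void main(String[] args) {
        // Hypothetical host, port, and base path.
        System.out.println(buildSuggestionUri("http://localhost:8080/hotel", "r"));
    }
}
```

Encoding matters here because users type arbitrary text into a search box, including Chinese characters and punctuation.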

HotelService

@Override
public List<String> getSuggestions(String prefix) throws IOException {
    // 1. Create the request
    SearchRequest request = new SearchRequest("hotel");
    // 2. Prepare the DSL parameters
    request.source().suggest(new SuggestBuilder().addSuggestion(
            "hotelSuggestion",
            SuggestBuilders.completionSuggestion("suggestion")
                    .prefix(prefix)
                    .skipDuplicates(true)
                    .size(10)));
    // 3. Send the request
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // 4. Parse the response and collect the suggested texts
    Suggest suggest = response.getSuggest();
    CompletionSuggestion suggestion = suggest.getSuggestion("hotelSuggestion");
    List<String> list = new ArrayList<>();
    for (CompletionSuggestion.Entry.Option option : suggestion.getOptions()) {
        String text = option.getText().string();
        list.add(text);
    }
    return list;
}

Published: 2024-02-19 14:13:41