Elasticsearch: how to search using different analyzers?

Updated: 2024-10-27 19:21:36

Problem description

I'm using my custom analyzer autocomplete_analyzer with an edgeNGram filter, so the mapping looks like this:

"acts_as_taggable_on_tags" : {
  "acts_as_taggable_on/tag" : {
    "properties" : {
      "name" : {
        "type" : "string",
        "boost" : 10.0,
        "analyzer" : "autocomplete_analyzer"
      }
    }
  }
}

When I search using query_string, it works like autocomplete. For example, the query "lon" returns ["lon", "long", "london",...].

But sometimes I need exact matching. How can I get only the exactly matching word "lon"? Can I use another analyzer (e.g. simple or standard) when making a search query?

Recommended answer

I think you will need to store the data in 2 separate fields. One would contain the tokens necessary for doing autocomplete queries, the other for the full search queries.

If you have only one field with the tokens [lon, lond, londo, london] then if you search against this field you cannot say "please only match the token london as this is the full word/longest token".
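To see why a single field cannot serve both purposes, here is a minimal Python sketch of edge n-gram tokenization. It does not call Elasticsearch; the function name edge_ngrams, the min_gram=3 setting and the sample tag names are assumptions for illustration, since the actual autocomplete_analyzer settings are not shown in the question.

```python
def edge_ngrams(word, min_gram=3, max_gram=20):
    """Return the prefixes of `word` from min_gram up to max_gram characters."""
    upper = min(len(word), max_gram)
    return [word[:n] for n in range(min_gram, upper + 1)]


# Tokens that would be indexed for each tag name (hypothetical data).
index = {name: set(edge_ngrams(name)) for name in ["lon", "long", "london"]}

# A search for the token "lon" hits every name, because "lon" is one of
# the indexed prefixes of each of them.
matches = sorted(name for name, tokens in index.items() if "lon" in tokens)

print(edge_ngrams("london"))
print(matches)
```

Once the prefixes are in the index, nothing marks "lon" as a complete word in one document and a mere prefix in another, which is why a second, plainly analyzed field is needed.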

The multi_field type can set up both fields for you. Take a look at the Elasticsearch docs on multi-field; the official documentation is pretty good on this.

Mapping

I would probably do this:

"acts_as_taggable_on_tags" : {
  "acts_as_taggable_on/tag" : {
    "properties" : {
      "name" : {
        "type" : "multi_field",
        "fields" : {
          "name" : {
            "type" : "string",
            "boost" : 10.0
          },
          "autocomplete" : {
            "type" : "string",
            "analyzer" : "autocomplete_analyzer",
            "boost" : 10.0
          }
        }
      }
    }
  }
}

Querying

for autocomplete queries:

"query": { "query_string": { "query" : "lon", "default_field": "name.autocomplete" } }

for full search queries:

"query": { "query_string": { "query" : "lon", "default_field": "name" } }

Note the difference in "default_field".
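Since the two request bodies differ only in default_field, they can be built by one small helper. This is a sketch only: the function name tag_query and its autocomplete flag are hypothetical, and the code just builds the JSON body without sending it to a cluster.

```python
import json


def tag_query(text, autocomplete=False):
    """Build a query_string body against name.autocomplete or plain name."""
    field = "name.autocomplete" if autocomplete else "name"
    return {"query": {"query_string": {"query": text, "default_field": field}}}


# Autocomplete lookup: searches the edge n-gram sub-field.
print(json.dumps(tag_query("lon", autocomplete=True)))

# Full-word lookup: searches the sub-field without the autocomplete analyzer.
print(json.dumps(tag_query("lon")))
```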

The other answer given would not work. A different search_analyzer only means that a search for 'london' is not tokenized into lon, lond, londo, london. It does not stop a search for 'lon' from matching documents with a name of 'london', and stopping that is, I think, what you want.
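The point about search_analyzer can be checked with a small simulation. This is a rough model rather than Elasticsearch itself: edge_ngrams and whole_word are hypothetical stand-ins for the autocomplete analyzer and a plain analyzer, and a document counts as a match when any search token equals an indexed token.

```python
def edge_ngrams(word, min_gram=3):
    """Stand-in for the autocomplete analyzer: prefixes of the word."""
    return [word[:n] for n in range(min_gram, len(word) + 1)]


def whole_word(word):
    """Stand-in for a plain analyzer: the word itself is the only token."""
    return [word]


def is_match(doc_name, query, index_analyzer, search_analyzer):
    """True if any token of the analyzed query is among the indexed tokens."""
    indexed = set(index_analyzer(doc_name))
    return any(token in indexed for token in search_analyzer(query))


# One field, edge n-grams at index time, plain analyzer at search time:
# "lon" is still an indexed prefix of "london", so the document matches.
single_field = is_match("london", "lon", edge_ngrams, whole_word)

# Separate plain field: only the whole word is indexed, so "lon" misses.
plain_field = is_match("london", "lon", whole_word, whole_word)

print(single_field, plain_field)
```

In this model only the separately indexed plain field rejects 'lon' for a document named 'london', which is the behaviour the multi-field mapping provides.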