I'm currently working on a program that sends queries to a Solr service, and I'm stuck on a problem with how Solr matches strings.
For example, when I send MaterialType:buc, it matches all entries with MaterialType = "buc" and also those with "ebuc". Is there anything I can do to tell Solr that I want an exact match only (or that it should only match strings that start with "buc")?
Is this even possible without changing the configuration of the Solr service?
Regards, Tobias
Best answer
How the match is performed depends on how your field is defined. You're probably querying a TextField with an analysis chain attached that reduces buc and ebuc to a common term (for example through an n-gram filter such as NGramFilter or EdgeNGramFilter, or a stemming filter).
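To see why a query for buc can hit ebuc, here is a toy model of n-gram analysis in Python. It is not Solr code: the function name `ngrams` and the gram sizes 2 and 3 are assumptions picked for illustration, and it only mimics the kind of terms an n-gram filter might emit at index time.

```python
def ngrams(token, min_size=2, max_size=3):
    """Return every substring of token whose length is in [min_size, max_size]."""
    grams = set()
    for size in range(min_size, max_size + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


# Terms that a (hypothetical) n-gram chain would index for the value "ebuc".
indexed_terms = ngrams("ebuc")
print(sorted(indexed_terms))

# The query term "buc" is one of the indexed terms, so the document matches.
print("buc" in indexed_terms)
```

Under these assumptions the value ebuc is indexed under the term buc as well, which is one plausible explanation for the behaviour described in the question.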
You can either use a StrField directly (which only gives you exact matches, case-sensitive and all), or use a TextField with a KeywordTokenizer (which leaves the whole value intact as a single token) and apply only a LowerCaseFilter to make the match case-insensitive.
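The difference between the two options can be sketched with a small Python model. This is only an illustration of the matching semantics, not how Solr is implemented; the function names are hypothetical.

```python
def analyze_string_field(value):
    # Models a StrField: the value is kept verbatim as a single term.
    return value


def analyze_keyword_lowercase(value):
    # Models KeywordTokenizer plus a lowercase filter: one term, lowercased.
    return value.lower()


def matches(analyze, indexed_value, query_value):
    # A document matches when both sides produce the same single term.
    return analyze(indexed_value) == analyze(query_value)


print(matches(analyze_string_field, "buc", "buc"))       # exact value
print(matches(analyze_string_field, "ebuc", "buc"))      # different value
print(matches(analyze_string_field, "Buc", "buc"))       # differs only in case
print(matches(analyze_keyword_lowercase, "Buc", "buc"))  # case folded first
```

In this model neither option lets ebuc match buc; the only difference is whether Buc and buc count as the same value.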
If you want a prefix match, you can append a "*" to the query term for a StrField (for example MaterialType:buc*), or you can use an EdgeNGramFilter, which generates n-grams only from the beginning of the token.
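On the client side, the exact and prefix queries could be built roughly like this. It is a minimal sketch: the base URL and core name are placeholders, the helper names are made up for this example, and the exact query only behaves as an exact match if MaterialType is a StrField (or a similarly untokenized field).

```python
from urllib.parse import urlencode

# Characters with special meaning in the standard query parser, plus space.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/ ')


def escape_term(value):
    """Backslash-escape query syntax characters in a user-supplied value."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in value)


def exact_query(field, value):
    # Exact match, assuming the field is indexed as a single untokenized term.
    return f"{field}:{escape_term(value)}"


def prefix_query(field, prefix):
    # The trailing "*" is deliberately left unescaped so it acts as a wildcard.
    return f"{field}:{escape_term(prefix)}*"


def select_url(base_url, query):
    # base_url is a placeholder such as "http://localhost:8983/solr/mycore".
    return base_url + "/select?" + urlencode({"q": query})


print(exact_query("MaterialType", "buc"))
print(prefix_query("MaterialType", "buc"))
print(select_url("http://localhost:8983/solr/mycore",
                 prefix_query("MaterialType", "buc")))
```

Building the wildcard query needs no schema change, but whether it returns only values starting with buc still depends on how the field is analyzed.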
All these changes would need to be applied to your schema (schema.xml) and your content would normally have to be reindexed after the schema has changed.