Approximate string matching

Updated: 2024-10-28 16:23:28
Problem description

I'm searching for a library that does approximate string matching, for example, searching a dictionary for the word "motorcycle" and also getting back similar strings like "motorcicle". Is there such a library?

Solution

This algorithm is called Soundex. Here is one implementation example:
aspn.activestate/ASPN/Coo...n/Recipe/52213
Here is another:
effbot/librarybook/soundex.htm
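
For readers who cannot reach the links, here is a minimal sketch of the classic American Soundex coding. It is not the code from either linked recipe; the names soundex and _CODES are my own, and it assumes plain ASCII input (anything outside a-z is ignored).

```python
# Illustrative sketch of American Soundex; not taken from the linked recipes.
# Consonants are grouped by similar sound and each group gets one digit.
_CODES = {}
for digit, letters in (("1", "bfpv"), ("2", "cgjkqsxz"), ("3", "dt"),
                       ("4", "l"), ("5", "mn"), ("6", "r")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of word (empty string if no letters)."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()          # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")    # its code still suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue                     # h and w do not separate equal codes
        code = _CODES.get(ch, "")        # vowels and y have no code
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code                      # a vowel resets prev, so repeats count again
    return result.ljust(4, "0")          # pad short codes with zeros


if __name__ == "__main__":
    for w in ("Smith", "Smythe", "motorcycle", "motorcicle", "cat", "cet", "mat"):
        print(w, soundex(w))
```

With this sketch, "motorcycle" and "motorcicle" receive the same code, as do "Smith" and "Smythe". Because the first letter is always kept, "cat" and "cet" both come out as C300 while "mat" becomes M300.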

el*******@hotmail wrote:

> This algorithm is called soundex. Here is one implementation example.
> aspn.activestate/ASPN/Coo...n/Recipe/52213
> here is another:
> effbot/librarybook/soundex.htm

Soundex is *one* particular algorithm for approximate string matching. It is optimised for matching Anglo-American names (like Smith/Smythe), and is considered to be quite old and obsolete for all but the most trivial applications -- or so I'm told.

Soundex will not match arbitrary changes -- it will match both cat and cet, but it won't match cat and mat.

A more sophisticated approximate string matching algorithm will use the Levenshtein distance. You can find a Useless implementation here:

www.uselesspython/download.php?script_id=108

Given a function levenshtein(s1, s2) that returns the distance between two strings, you could use it for approximate matching like this:

    def approx_matching(strlist, target, dist=1):
        """Approximately match strings in strlist against
        a target string.

        Returns a list of strings, where each string
        matched is no further than an edit distance of
        dist from the target.
        """
        found = []
        for s in strlist:
            # keep every candidate within the allowed edit distance
            if levenshtein(s, target) <= dist:
                found.append(s)
        return found

--
Steven.
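
The function above assumes that levenshtein(s1, s2) already exists. In case the linked script is unavailable, here is a minimal sketch of one way to write it, using the usual dynamic-programming recurrence and keeping only two rows of the table. It is my own illustration, not the Useless Python implementation, and the word list in the usage part is made up.

```python
def levenshtein(s1, s2):
    """Return the edit distance (insertions, deletions, substitutions) between s1 and s2."""
    if len(s1) < len(s2):
        s1, s2 = s2, s1                  # iterate over the longer string, keep rows short
    # previous[j] is the distance between the first i-1 chars of s1 and the first j of s2
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]                    # deleting i chars of s1 yields the empty prefix
        for j, c2 in enumerate(s2, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            substitute_cost = previous[j - 1] + (c1 != c2)
            current.append(min(insert_cost, delete_cost, substitute_cost))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Hypothetical word list, for illustration only.
    words = ["motorcycle", "motorcicle", "motorbike", "bicycle"]
    print([w for w in words if levenshtein(w, "motorcycle") <= 1])
```

Unlike Soundex, this treats a change in the first letter like any other edit, so cat and mat are at distance 1 and would match with dist=1. The cost is roughly len(s1) * len(s2) operations per comparison, which is worth keeping in mind when scanning a large dictionary.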

Published: 2023-11-12 07:57:34
Link: https://www.elefans.com/category/jswz/34/1580939.html