Fuzzy matching Japanese strings in Python?

Updated: 2024-10-14 14:14:39

Problem description

This problem has had me stumped for the whole day.

I have two Japanese strings that I want to fuzzy match in Python 2.7. Currently I'm using fuzzywuzzy like this:

    # -*- coding: utf-8 -*-
    from fuzzywuzzy import process

    jpnStr = "日本語".encode('utf-8')
    jpnList = ["日本語1".encode('utf-8'), "日本語2".encode('utf-8'), "日本語3".encode('utf-8')]
    bestmatch = process.extractOne(jpnStr, jpnList)

but the best match always ends up being

("日本語1",0)

How would I go about resolving this issue, or is there a best practice that I'm totally missing here? Sorry if I sound frustrated; it's been a roadblock for a while. Thanks in advance.

Recommended answer

OK, I'm not sure how helpful this is, but I've found a workaround.

I found that I could fuzzy match Japanese strings using fuzzywuzzy.

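First, you take the Unicode Japanese strings, e.g. "日本語です".

Then you write them out as ASCII text to a text file. The output will look something like "/uf34/ufeac/uewa3/..." and so on.

Then you read the text file back in and compare the ASCII representations of the Japanese strings ("/uf34/ufeac/uewa3/") against each other. This gives a workable fuzzy match rating.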
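The steps above can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the answerer's actual code. It is written for Python 3 (in Python 2.7 the literals would need a `u` prefix). It skips the round trip through a text file and escapes the strings in memory with the standard `unicode_escape` codec. It also uses the standard library's `difflib.SequenceMatcher` as a stand-in scorer so the example runs without fuzzywuzzy installed. The helper names `to_ascii_escapes`, `similarity`, and `best_match` are hypothetical.

```python
from difflib import SequenceMatcher


def to_ascii_escapes(text):
    """Return text as pure ASCII, with each non-ASCII character
    written as a backslash-u escape of its code point."""
    return text.encode("unicode_escape").decode("ascii")


def similarity(a, b):
    """Score two strings from 0 to 100 by comparing their ASCII-escaped forms.

    SequenceMatcher is a stand-in scorer here (an assumption of this sketch);
    the escaped strings could equally be handed to fuzzywuzzy.
    """
    ratio = SequenceMatcher(None, to_ascii_escapes(a), to_ascii_escapes(b)).ratio()
    return int(round(100 * ratio))


def best_match(query, choices):
    """Return the (choice, score) pair with the highest score."""
    scored = [(choice, similarity(query, choice)) for choice in choices]
    return max(scored, key=lambda pair: pair[1])


query = "日本語"
choices = ["日本語1", "日本語2", "日本語3"]
match, score = best_match(query, choices)
print(match, score)
```

Because every Japanese character becomes a six-character ASCII escape, the comparison runs over the escape text rather than over the characters themselves, so the scores are only an approximation of character-level similarity. That fits the answer's own caveat that the method is workable rather than ideal.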

It's probably not an ideal method, but it works and is fairly accurate. Hope this helps somebody.
