Word comparison algorithm

Updated: 2024-10-11 21:25:04
Question

I am building a CSV import tool for the project I'm working on. The client needs to be able to enter the data in Excel, export it as CSV, and upload it to the database. For example, I have this CSV record:

1, John Doe, ACME Comapny (the typo is on purpose)

Of course, the companies are kept in a separate table and linked with a foreign key, so I need to discover the correct company ID before inserting. I plan to do this by comparing the company names in the database with the company names in the CSV. The comparison should return 0 if the strings are exactly the same, and return a value that gets bigger as the strings get more different, but strcmp doesn't cut it here because:

"Acme Company" and "Acme Comapny" should have a very small difference index, but "Acme Company" and "Cmea Mpnyaco" should have a very big difference index. "Acme Company" and "Acme Comp." should also have a small difference index, even though the character count is different. Also, "Acme Company" and "Company Acme" should return 0.

So if the client makes a typo while entering data, I could prompt him to choose the name he most probably wanted to insert.

Is there a known algorithm to do this, or maybe we can invent one :) ?

Recommended answer

You might want to check out the Levenshtein Distance algorithm as a starting point. It will rate the "distance" between two words.
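To make the idea concrete, here is a minimal sketch of the classic dynamic-programming form of Levenshtein distance. The function name levenshtein is just an illustrative choice, and this is one common way to write it, not a tuned implementation.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("Acme Company", "Acme Company"))
print(levenshtein("Acme Company", "Acme Comapny"))
print(levenshtein("Acme Company", "Acme Comp."))
```

Identical strings give 0, and the distance grows as the strings diverge. Note that plain Levenshtein counts a swapped pair of letters as two edits, and it does not treat "Company Acme" as equal to "Acme Company"; handling word order would need an extra normalization step before comparing.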

This SO thread on implementing a Google-style "Do you mean...?" system may provide some ideas as well.
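As a rough illustration of such a "Do you mean...?" prompt for the CSV import case, the sketch below uses Python's standard difflib module, which ranks candidates by a similarity ratio rather than by Levenshtein distance. The helper names normalize and suggest_companies, the sample company list, and the 0.6 cutoff are all assumptions made for this example.

```python
import difflib


def normalize(name):
    """Lowercase, replace punctuation with spaces, and sort the words
    so that word order and case do not affect the comparison."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(sorted(cleaned.split()))


def suggest_companies(entered, known_names, limit=3, cutoff=0.6):
    """Return up to `limit` known company names that resemble `entered`,
    best match first. An empty list means nothing was close enough."""
    # Map each normalized form back to the name stored in the database.
    by_key = {normalize(name): name for name in known_names}
    close_keys = difflib.get_close_matches(
        normalize(entered), list(by_key), n=limit, cutoff=cutoff
    )
    return [by_key[key] for key in close_keys]


# Hypothetical company table contents.
companies = ["Acme Company", "Globex Corporation", "Initech"]

print(suggest_companies("ACME Comapny", companies))
print(suggest_companies("Company Acme", companies))
print(suggest_companies("Acme Comp.", companies))
```

Sorting the words before comparing is what makes "Company Acme" an exact match for "Acme Company". The returned list can be shown to the client as choices, and the cutoff can be raised or lowered depending on how tolerant the import should be.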

Published: 2023-11-30 14:29:58
Link: https://www.elefans.com/category/jswz/34/1650329.html