Is approximate string matching / fuzzy string searching possible with BigQuery?

Updated: 2024-10-16 22:23:07

Question

Thanks to Google for delivering BigQuery, it's great! Is Approximate String Matching / Fuzzy String Searching possible with BigQuery? Does Google have plans to add this functionality to BigQuery?

Surely Google's proprietary approximate string matching algorithms could be used to deliver this capability to BigQuery while still protecting Google's intellectual property. We've searched all the BigQuery documentation and Stack Overflow questions. Of course there are many algorithms to do this, but how would one integrate them with BigQuery?

Our need is simple: to compare two strings that are mostly the same but may differ slightly. For example:

"Rhodes USA" vs. "Rhodes USA, LLC", vs. "Rhodes USA LLC".

From our BigQuery tests it appears two strings need to match EXACTLY for BigQuery to JOIN them, even down to the number of trailing spaces in each string. The addition of this functionality or guidance for integration with BigQuery would be greatly appreciated. This is in support of Milwaukee Jets, a regional, innovative, fractional jet ownership company in Milwaukee, WI. Thanks again Google for delivering BigQuery.

Thank you very much and best regards, Andrew Paullin (414) 212-5372

Recommended answer

Unfortunately, approximate string matching is not supported. The closest you can get is by using regular expressions. Your best bet may be to normalize the data before it gets to BigQuery, i.e., transform "Rhodes USA" and "Rhodes, USA. " into the same string. I'll add a feature request bug for this support, however.
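
The normalization suggested above could look roughly like the following Python sketch, run on the data before it is loaded into BigQuery. It is only an illustration under stated assumptions: the function name `normalize_name` and the `LEGAL_SUFFIXES` set are hypothetical, and whether stripping suffixes such as "LLC" is acceptable depends on your data.

```python
import re

# Hypothetical set of legal suffixes to ignore; adjust for your own data.
LEGAL_SUFFIXES = {"llc", "inc", "corp", "ltd", "co"}


def normalize_name(name: str) -> str:
    """Reduce a company name to a canonical key suitable for an exact JOIN."""
    # Lowercase, then turn every run of characters that is not a
    # lowercase letter, digit, or space into a single space.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    # split() with no arguments also discards leading, trailing,
    # and repeated whitespace.
    tokens = cleaned.split()
    # Drop trailing legal suffixes (assumption: they carry no meaning here).
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


names = ["Rhodes USA", "Rhodes USA, LLC", "Rhodes USA LLC", "Rhodes, USA. "]
keys = [normalize_name(n) for n in names]
```

Storing the normalized key in its own column alongside the original name keeps the raw value available while giving the exact-match JOIN a stable value to compare. This handles punctuation, case, spacing, and suffix differences only; it is not true fuzzy matching and will not catch misspellings.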

Published: 2023-10-23 05:30:35
Article link: https://www.elefans.com/category/jswz/34/1519931.html