Python: Automatically Extracting the Phone Number, Name, and Province/City/District from Tmall, Taobao, and JD Shipping Information

Updated: 2024-10-12 01:32:41


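For a recent work task, I needed to extract the name, phone number, and province/city/district from a single string, with landline numbers supported as well. Some example strings:

橱之友,包卫贞,13600000000,浙江省宁波市,慈溪市,庵东镇 杭州湾新区世纪城翠湖苑5栋2000

【Recognized】name "包卫贞", phone number "13600000000", province/city/district "浙江省 宁波市 慈溪市"

marui谢sir,谢义海,13900000000,江苏省南京市,秦淮区,健康路1号水游城5楼1号

【Recognized】name "谢义海", phone number "13900000000", province/city/district "江苏省 南京市 秦淮区"

刘伟,13120000000,北京 北京市 朝阳区 东湖街道 利泽中园二区208号,000000

【Recognized】name "刘伟", phone number "13120000000", province/city/district "北京 北京市 朝阳区"

I searched online extensively and found nothing that solved the problem, so below are my approach and the Python code.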

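【Solution approach】

1. Recognize the 11-digit mobile number first; this is the easiest part.

2. Next, recognize the name. This is harder, and I did not find a good general method, so I match on the surname.

3. Finally, recognize the province/city/district. This is also hard; after trying many methods, I finally found one that works well.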
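The order of the three steps can be sketched offline as follows. This is only an illustration of the flow, not the article's implementation: `PHONE_RE` is a simplified pattern, `SURNAMES` is a tiny hypothetical subset of a real surname list, and `Parsed` and `parse_record` are names introduced here. Step 3 only isolates the address text; resolving it to province/city/district is left to a later step.

```python
import re
from typing import NamedTuple

# Simplified pattern for illustration: 11-digit mobile or hyphenated landline.
PHONE_RE = re.compile(r"1[3-9]\d{9}|0\d{2,4}-\d{7,8}")
SURNAMES = ("包", "谢", "刘")  # hypothetical subset of a full surname list


class Parsed(NamedTuple):
    name: str
    phone: str
    address_text: str


def parse_record(record: str) -> Parsed:
    fields = [f.strip() for f in record.split(",") if f.strip()]

    # Step 1: the phone number is the easiest field to pin down.
    phone = next((f for f in fields if PHONE_RE.fullmatch(f)), "")

    # Step 2: a short field that starts with a known surname is taken as the name.
    name = next(
        (f for f in fields if f != phone and len(f) <= 4 and f.startswith(SURNAMES)),
        "",
    )

    # Step 3: everything after the phone field is treated as address text.
    if phone:
        address_fields = fields[fields.index(phone) + 1:]
    else:
        address_fields = [f for f in fields if f != name]
    return Parsed(name, phone, ",".join(address_fields))
```

Doing the phone first helps the later steps: once that field is known, it can be excluded from the name search, and the fields that follow it are a reasonable first guess at the address.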

 

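【Python code】

1. Recognize the phone number

import re

def getPhoneNumber(address):
    # Mobile numbers (11 digits, by prefix) or landlines such as 010-81698565
    mobile = re.findall(r'(13\d{9}|14[57]\d{8}|15\d{9}|166\d{8}|17[367]\d{8}|18\d{9}|0\d{2,4}-\d{7,8})', address)
    if len(mobile) >= 1:
        # Return the first match found in the string
        return mobile.pop(0)
    return ""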
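One thing a plain alternation does not guard against is an 11-digit run that sits inside a longer digit sequence, such as an order number. A minimal sketch of one way to handle that, assuming the same prefixes as above: the digit lookarounds `(?<!\d)` and `(?!\d)` are the only addition, and `PHONE_RE` and `find_phone` are names introduced here for illustration.

```python
import re

# Same prefixes as the pattern above, wrapped in digit lookarounds so that
# a match cannot start or end in the middle of a longer run of digits.
PHONE_RE = re.compile(
    r"(?<!\d)(?:1(?:3\d|4[57]|5\d|66|7[367]|8\d)\d{8}|0\d{2,4}-\d{7,8})(?!\d)"
)


def find_phone(text: str) -> str:
    """Return the first mobile or landline number in text, or an empty string."""
    match = PHONE_RE.search(text)
    return match.group(0) if match else ""
```

With this variant, a string such as "订单2013600000000123" yields no phone number, whereas a pattern without the lookarounds would pick "13600000000" out of the middle of it.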

2. Recognize the name

def getUserName(address):
    # Single-character and compound Chinese surnames
    firstNames = {
        "赵","钱","孙","李","周","吴","郑","王","冯","陈","褚","卫","蒋","沈","韩","杨","朱","秦","尤","许","何","吕","施","张","孔","曹","严","华","金","魏","陶","姜",
        "戚","谢","邹","喻","柏","水","窦","章","云","苏","潘","葛","奚","范","彭","郎","鲁","韦","昌","马","苗","凤","花","方","俞","任","袁","柳","酆","鲍","史","唐",
        "费","廉","岑","薛","雷","贺","倪","汤","滕","殷","罗","毕","郝","邬","安","常","乐","于","时","傅","皮","卞","齐","康","伍","余","元","卜","顾","孟","平","黄",
        "和","穆","萧","尹","姚","邵","湛","汪","祁","毛","禹","狄","米","贝","明","臧","计","伏","成","戴","谈","宋","茅","庞","熊","纪","舒","屈","项","祝","董","梁",
        "杜","阮","蓝","闵","席","季","麻","强","贾","路","娄","危","江","童","颜","郭","梅","盛","林","刁","钟","徐","邱","骆","高","夏","蔡","田","樊","胡","凌","霍",
        "虞","万","支","柯","昝","管","卢","莫","经","房","裘","缪","干","解","应","宗","丁","宣","贲","邓","郁","单","杭","洪","包","诸","左","石","崔","吉","钮","龚",
        "程","嵇","邢","滑","裴","陆","荣","翁","荀","羊","於","惠","甄","麴","家","封","芮","羿","储","靳","汲","邴","糜","松","井","段","富","巫","乌","焦","巴","弓",
        "牧","隗","山","谷","车","侯","宓","蓬","全","郗","班","仰","秋","仲","伊","宫","宁","仇","栾","暴","甘","钭","厉","戎","祖","武","符","刘","景","詹","束","龙",
        "叶","幸","司","韶","郜","黎","蓟","薄","印","宿","白","怀","蒲","邰","从","鄂","索","咸","籍","赖","卓","蔺","屠","蒙","池","乔","阴","欎","胥","能","苍","双",
        "闻","莘","党","翟","谭","贡","劳","逄","姬","申","扶","堵","冉","宰","郦","雍","舄","璩","桑","桂","濮","牛","寿","通","边","扈","燕","冀","郏","浦","尚","农",
        "温","别","庄","晏","柴","瞿","阎","充","慕","连","茹","习","宦","艾","鱼","容","向","古","易","慎","戈","廖","庾","终","暨","居","衡","步","都","耿","满","弘",
        "匡","国","文","寇","广","禄","阙","东","欧","殳","沃","利","蔚","越","夔","隆","师","巩","厍","聂","晁","勾","敖","融","冷","訾","辛","阚","那","简","饶","空",
        "曾","毋","沙","乜","养","鞠","须","丰","巢","关","蒯","相","查","後","荆","红","游","竺","权","逯","盖","益","桓","公",
        "万俟","司马","上官","欧阳","夏侯","诸葛","闻人","东方","赫连","皇甫","尉迟","公羊","澹台","公冶","宗政","濮阳","淳于","单于","太叔","申屠","公孙","仲孙",
        "轩辕","令狐","钟离","宇文","长孙","慕容","鲜于","闾丘","司徒","司空","亓官","司寇","仉","督","子车","颛孙","端木","巫马","公西","漆雕","乐正","壤驷","公良",
        "拓跋","夹谷","宰父","谷梁","晋","楚","闫","法","汝","鄢","涂","钦","段干","百里","东郭","南门","呼延","归","海","羊舌","微生","岳","帅","缑",
        "亢","况","后","有","琴","梁丘","左丘","东门","西门","商","牟","佘","佴","伯","赏","南宫","墨","哈","谯","笪","年","爱","阳","佟",
    }
    # The name is the first comma-separated field of at most 4 characters
    # that starts with a known surname
    for nametmp in address.split(","):
        if nametmp == "" or len(nametmp) > 4:
            continue
        for firstName in firstNames:
            if nametmp.startswith(firstName):
                return nametmp
    return ""
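Because many surnames are also the first character of place names (for example 江 in 江苏省 or 宿 in 宿城区), a surname check alone can pick up a short address field when the real name is missing or its surname is not in the list. The sketch below adds two extra filters as a possible safeguard. It is a heuristic under stated assumptions, not part of the original code: `SURNAMES` is a small illustrative subset, `REGION_SUFFIXES` and `find_name` are names introduced here, and a real name that happens to end in one of these characters would be rejected.

```python
SURNAMES = ("杨", "江", "宿", "欧阳")  # small illustrative subset of the full list
REGION_SUFFIXES = ("省", "市", "区", "县", "镇")  # assumed administrative suffixes


def find_name(record: str) -> str:
    """Return the first field that looks like a personal name, or an empty string."""
    for field in record.split(","):
        field = field.strip()
        # Names are expected to be 2 to 4 characters long.
        if not 2 <= len(field) <= 4:
            continue
        # Fields containing digits are house numbers, postcodes, or nicknames.
        if any(ch.isdigit() for ch in field):
            continue
        # Short fields ending in an administrative suffix are place names.
        if field.endswith(REGION_SUFFIXES):
            continue
        if field.startswith(SURNAMES):
            return field
    return ""
```

For a record with no name field, such as "dailingyang001,13800000000,江苏省,宿城区", this returns an empty string instead of "江苏省".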

3. Recognize the province/city/district

import json
import requests

ak = ""  # API key for the geocoding service; fill in your own

# Reverse geocoding: turn a "lat,lng" string into structured address components
def getposition(LatitudeAndLongitude):
    url = "http://***/?location=" + LatitudeAndLongitude + "&output=json&ak=" + ak
    r = requests.get(url)
    obj = json.loads(r.text)
    result = obj.get("result")
    addressComponent = result.get("addressComponent")
    return addressComponent

# Geocoding: turn a free-text address into a "lat,lng" string
def getGeocoderLatitude(address, city):
    url = "http://***/?address=" + address + "&output=json&ak=" + ak + "&city=" + city
    r = requests.get(url)
    # Sample response:
    # {'status': 0, 'result': {'location': {'lng': 116.30761990129483, 'lat': 40.059534945702765}, 'precise': 1, 'confidence': 75, 'comprehension': 100, 'level': '楼号'}}
    obj = json.loads(r.text)
    result = obj.get("result")
    location = result.get("location")
    return str(location.get("lat")) + "," + str(location.get("lng"))

# Geocode the raw string, then reverse-geocode the coordinates to get
# "province,city,district,street"
def getAddress(address):
    latitudeAndLongitude = getGeocoderLatitude(address, "")
    # print(latitudeAndLongitude)
    jsonNode = getposition(latitudeAndLongitude)
    return jsonNode.get("province") + "," + jsonNode.get("city") + "," + jsonNode.get("district") + "," + jsonNode.get("street")
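The functions above assume every request succeeds; if the service returns an error, `result` is missing and the chained `.get` calls raise an exception. A minimal sketch of a more defensive way to read the reverse-geocoding response is shown below. The field names (`status`, `result`, `addressComponent`, `province`, `city`, `district`) are the ones used in the code and sample response above; treating any non-zero `status` as a failure is an assumption, and `region_from_response` is a name introduced here.

```python
from typing import Optional


def region_from_response(payload: dict) -> Optional[str]:
    """Return 'province,city,district', or None when the response is unusable."""
    # Assumption: status 0 means success, as in the sample response above.
    if payload.get("status") != 0:
        return None
    component = (payload.get("result") or {}).get("addressComponent") or {}
    parts = [component.get(key, "") for key in ("province", "city", "district")]
    # Reject responses where any of the three levels is missing or empty.
    if not all(parts):
        return None
    return ",".join(parts)
```

A caller can then skip or log records for which this returns `None` instead of failing halfway through a batch.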

Test code:

# The calls below imply that the three functions above are saved in modules
# named userName.py, userMobile.py, and userAddress.py
import userName
import userMobile
import userAddress

if __name__ == '__main__':
    datas = [
        "dailingyang001,杨兆芳,13800000000,江苏省,宿城区,古城街道便民方舟1号楼1000房间",
        "tb508969_2013,蒋微,18900000000,湖南省长沙市,岳麓区,观沙岭街道 德润园小区润恒苑2栋2单元1000号",
        "橱之友,包卫贞,13600000000,浙江省宁波市,慈溪市,庵东镇 杭州湾新区世纪城翠湖苑5栋2000",
        "marui谢sir,谢义海,010-81698565,江苏省南京市,秦淮区,健康路1号水游城5楼1号",
        "男孩不哭19,元亮,15000000000,河南省安阳市,殷都区,铁西路街道 河南省 安阳市 殷都区 铁西路中段",
        "刘伟,13120000000,北京 北京市 朝阳区 东湖街道 利泽中园二区208号,000000",
        "张三,13000000000,河南省郑州市高新区国家大学科技园东区1号楼",
    ]
    # print(datas[0]+"//"+datas[len(datas)-1])
    for data in datas:
        # "解析后分别为" means "parsed as"; output format is name//phone//address
        print("【" + data + "】解析后分别为:---------" + userName.getUserName(data) + "//" + userMobile.getPhoneNumber(data) + "//" + userAddress.getAddress(data))

Result:

【dailingyang001,杨兆芳,13800000000,江苏省,宿城区,古城街道便民方舟1号楼1000房间】解析后分别为:---------杨兆芳//13800000000//江苏省,宿迁市,宿城区,洪泽湖路辅路
【tb508969_2013,蒋微,18900000000,湖南省长沙市,岳麓区,观沙岭街道 德润园小区润恒苑2栋2单元1000号】解析后分别为:---------蒋微//18900000000//湖南省,长沙市,岳麓区,支路二
【橱之友,包卫贞,13600000000,浙江省宁波市,慈溪市,庵东镇 杭州湾新区世纪城翠湖苑5栋2000】解析后分别为:---------包卫贞//13600000000//浙江省,宁波市,慈溪市,芦汀路
【marui谢sir,谢义海,010-81698565,江苏省南京市,秦淮区,健康路1号水游城5楼1号】解析后分别为:---------谢义海//010-81698565//江苏省,南京市,秦淮区,中华路
【男孩不哭19,元亮,15000000000,河南省安阳市,殷都区,铁西路街道 河南省 安阳市 殷都区 铁西路中段】解析后分别为:---------元亮//15000000000//河南省,安阳市,龙安区,铁西路
【刘伟,13120000000,北京 北京市 朝阳区 东湖街道 利泽中园二区208号,000000】解析后分别为:---------刘伟//13120000000//北京市,北京市,朝阳区,利泽中二路
【张三,13000000000,河南省郑州市高新区国家大学科技园东区1号楼】解析后分别为:---------张三//13000000000//河南省,郑州市,中原区,王屋路
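In a few of the results above, the geocoder's answer contains a component that does not appear in the input: 宿迁市 is filled in for the first record, while the fifth record says 殷都区 but comes back as 龙安区, and the last says 高新区 but comes back as 中原区. A small check like the following can flag such records for review. It is a sketch only; `unconfirmed_parts` is a name introduced here, and a flagged component may be either a helpful completion or a genuine mismatch.

```python
def unconfirmed_parts(record: str, region: str) -> list:
    """List province/city/district components that do not appear in the input text.

    region is the "province,city,district,street" string produced above;
    the street is ignored because it is rarely written the same way in the input.
    """
    return [part for part in region.split(",")[:3] if part and part not in record]
```

For the fifth test record and its result "河南省,安阳市,龙安区,铁西路", this returns the single component 龙安区.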
 

For project details, see .html


Published: 2024-02-14 13:36:29
Article link: https://www.elefans.com/category/jswz/34/1763742.html