The data under consideration comes from an API, which means it's highly inconsistent: sometimes it returns unexpected content, sometimes it returns nothing, etc.
What I'm interested in is the ISO 3166-2 data associated with each record.
The data (when it doesn't encounter an error) generally looks something like this:
    {"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
    {"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
    {"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
    {"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
    {"countryCode": "RO", "adminCode1": "10", "countryName": "Romania", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "10"}, {"type": "ISO3166-2", "code": "B"}], "adminName1": "Bucure\u015fti"}
    {"countryCode": "DE", "adminCode1": "07", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "07"}, {"type": "ISO3166-2", "code": "NW"}], "adminName1": "North Rhine-Westphalia"}
    {"countryCode": "DE", "adminCode1": "01", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "01"}, {"type": "ISO3166-2", "code": "BW"}], "adminName1": "Baden-W\u00fcrttemberg"}
    {"countryCode": "DE", "adminCode1": "02", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "02"}, {"type": "ISO3166-2", "code": "BY"}], "adminName1": "Bavaria"}

Let's take one record for example:
    {"countryCode": "DE", "adminCode1": "01", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "01"}, {"type": "ISO3166-2", "code": "BW"}], "adminName1": "Baden-W\u00fcrttemberg"}

From this I'd like to extract the ISO 3166-2 representation, i.e. DE-BW.
I've been trying different ways of extracting this information with Python. One attempt looked like this:

    coord = response.get('codes', {}).get('type', {}).get('ISO3166-2', None)

Another attempt looked like this:

    print(json.dumps(response["codes"]["ISO3166-2"]))

However, neither of those methods worked.
How can I take a record such as:

    {"countryCode": "DE", "adminCode1": "01", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "01"}, {"type": "ISO3166-2", "code": "BW"}], "adminName1": "Baden-W\u00fcrttemberg"}

and extract only DE-BW using Python, while also handling records that don't look exactly like that, for instance extracting GB-ENG from:

    {"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}

and, of course, not crashing if it gets something that doesn't look like either of those, i.e. exception handling.
FULL FILE
    import json
    import requests
    from collections import defaultdict
    from pprint import pprint

    # open up the output of 'data-processing.py'
    with open('job-numbers-by-location.txt') as data_file:
        for line in data_file:
            identifier, name, coords, number_of_jobs = line.split("|")
            coords = coords[1:-1]
            lat, lng = coords.split(",")
            # print("lat: " + lat, "lng: " + lng)
            response = requests.get("api.geonames/countrySubdivisionJSON?lat="+lat+"&lng="+lng+"&username=s.matthew.english").json()

            codes = response.get('codes', [])
            for code in codes:
                if code.get('type') == 'ISO3166-2':
                    print('{}-{}'.format(response.get('countryCode', 'UNKNOWN'), code.get('code', 'UNKNOWN')))

Solution
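'ISO3166-2' is a dictionary value, not a key. response['codes'] is a list of dictionaries, and 'ISO3166-2' appears as the value stored under each dictionary's 'type' key.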
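To see why both attempts from the question fail, here is a small sketch that uses the DE record from the question as a hard-coded dictionary (no API call is made; the `errors` list is only there for illustration). The first attempt calls .get() on a list, and the second indexes a list with a string:

```python
# The DE record from the question, hard-coded so no API call is needed.
response = {
    "countryCode": "DE",
    "adminCode1": "01",
    "countryName": "Germany",
    "distance": 0,
    "codes": [
        {"type": "FIPS10-4", "code": "01"},
        {"type": "ISO3166-2", "code": "BW"},
    ],
    "adminName1": "Baden-W\u00fcrttemberg",
}

errors = []

# Attempt 1: response.get('codes', {}) returns a list, and lists have no .get()
try:
    response.get('codes', {}).get('type', {}).get('ISO3166-2', None)
except AttributeError as exc:
    errors.append(type(exc).__name__)

# Attempt 2: a list can only be indexed by integers or slices, not by a string
try:
    response["codes"]["ISO3166-2"]
except TypeError as exc:
    errors.append(type(exc).__name__)

print(errors)  # ['AttributeError', 'TypeError']
```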
So iterate over the list and pick the entry whose 'type' is 'ISO3166-2':

    codes = response.get('codes', [])
    for code in codes:
        if code.get('type') == 'ISO3166-2':
            print('{}-{}'.format(response.get('countryCode', 'UNKNOWN'), code.get('code', 'UNKNOWN')))
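For the "not crashing" part of the question, the same loop can be wrapped in a small helper that returns None for anything that does not have the expected shape. This is only a sketch: the function name extract_iso_3166_2 and the choice to return None are illustrative assumptions, not part of the original code or of the GeoNames API, and the error-like payload in the sample list is made up.

```python
def extract_iso_3166_2(record):
    """Return e.g. 'DE-BW' for a record, or None if it cannot be built.

    Hypothetical helper: the name and the None fallback are illustrative.
    """
    # Anything that is not a dictionary (None, a list, a string) is rejected.
    if not isinstance(record, dict):
        return None

    country = record.get('countryCode')
    codes = record.get('codes')
    # Both pieces are required, and 'codes' must be a list of entries.
    if not isinstance(country, str) or not isinstance(codes, list):
        return None

    for entry in codes:
        # Skip malformed entries instead of raising.
        if isinstance(entry, dict) and entry.get('type') == 'ISO3166-2':
            code = entry.get('code')
            if isinstance(code, str):
                return '{}-{}'.format(country, code)
    return None


records = [
    {"countryCode": "DE", "codes": [{"type": "FIPS10-4", "code": "01"},
                                    {"type": "ISO3166-2", "code": "BW"}]},
    {"countryCode": "GB", "codes": [{"type": "ISO3166-2", "code": "ENG"}]},
    {"status": {"message": "no result"}},   # made-up error-like payload
    None,                                   # nothing came back at all
]

results = [extract_iso_3166_2(r) for r in records]
print(results)  # ['DE-BW', 'GB-ENG', None, None]
```

The network call and response.json() can still raise (for example requests.RequestException, or ValueError when the body is not JSON), so in the full script those would need their own try/except around the request.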