将嵌套的JSON解析为数据框

编程入门 行业动态 更新时间:2024-10-11 19:14:20
本文介绍了将嵌套的JSON解析为数据框的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

我正在尝试将JSON字符串的最低粒度解析为熊猫数据帧.

I am trying to parse a JSON string to its lowest granularity to a panda dataframe.

首先,我尝试了read_json:

First I tried read_json:

jsonData = pd.read_json(apiRequest)

但是大部分数据仍嵌套在networkRank下.

But a large chunk of the data is still nested under networkRank.

然后我尝试了json_normalize,但是这次我错过了更高一级的数据,例如纬度和经度.

Then I tried json_normalize, but this time I am missing the data one level higher such as latitude and longitude.

result = json_normalize(json_data['networkRank'])

我还尝试将解析"为嵌套结构并从头开始构建数据帧,但是此代码会导致错误:

I also tried to parse "into" the nested structure and construct the data frame from scratch, but this code results in error:

result_nested = json_normalize(json_data, 'networkRank', ['longitude', 'latitude', ['networkRank', 'type3G', 'downloadSpeed']])

目标

要将JSON数据解析为具有所有字段的平面表,这意味着将纬度,经度和距离数据附加到图 2 .

{'apiVersion': '2', 'distance': 10, 'latitude': '-6.162959', 'longitude': '35.751607', 'networkRank': [{'networkId': '6402', 'networkName': 'Vodacom', 'type3G': {'averageRssiAsu': '9.5429091136', 'averageRssiDb': '-69.5664329624972', 'downloadSpeed': '1508.1304', 'networkId': '6402', 'networkName': 'Vodacom', 'networkType': '3', 'pingTime': '320.9600', 'reliability': '0.804236452826138', 'sampleSizeRSSI': '948', 'sampleSizeSpeed': '29', 'uploadSpeed': '893.7692'}}, {'networkId': '6400', 'networkName': 'tiGO', 'type3G': {'averageRssiAsu': '15.3537142857', 'averageRssiDb': '-61.4563389583101', 'downloadSpeed': '516.0000', 'networkId': '6400', 'networkName': 'tiGO', 'networkType': '3', 'pingTime': '259.0000', 'reliability': '0.911904765537807', 'sampleSizeRSSI': '935', 'sampleSizeSpeed': '21', 'uploadSpeed': '320.4211'}}, {'networkId': '6403', 'networkName': 'Airtel', 'type3G': {'averageRssiAsu': '13.2729999375', 'averageRssiDb': '-58.1521092977699', 'downloadSpeed': '1080.2500', 'networkId': '6403', 'networkName': 'Airtel', 'networkType': '3', 'pingTime': '194.5556', 'reliability': '0.554680264185345', 'sampleSizeRSSI': '587', 'sampleSizeSpeed': '21', 'uploadSpeed': '572.1579'}}], 'network_type': None, 'perMinuteCurrent': 0, 'perMinuteLimit': 10, 'perMonthCurrent': 0, 'perMonthLimit': 2000}

推荐答案

此函数以递归方式调用自身以拼合字典和列表.

This function recursively calls itself to flatten dictionaries and lists.

from collections import OrderedDict def flatten(json_object, container=None, name=''): if container is None: container = OrderedDict() if isinstance(json_object, dict): for key in json_object: flatten(json_object[key], container=container, name=name + key + '_') elif isinstance(json_object, list): for n, item in enumerate(json_object, 1): flatten(item, container=container, name=name + str(n) + '_') else: container[str(name[:-1])] = str(json_object) return container

示例:

flatten([1, 2, 3]) OrderedDict([('1', '1'), ('2', '2'), ('3', '3')]) flatten([1, 2, 3], name='x') OrderedDict([('x1', '1'), ('x2', '2'), ('x3', '3')]) flatten({'a': [1, 2, 3], 'b': 4, 'c': {'d': [5, 6], 'e': 7}}, name='x') OrderedDict([('xa_1', '1'), ('xa_2', '2'), ('xa_3', '3'), ('xc_e', '7'), ('xc_d_1', '5'), ('xc_d_2', '6'), ('xb', '4')])

响应:

# j = json string >>> pd.DataFrame(flatten(j), index=[0]).T 0 perMinuteLimit 10 distance 10 perMonthCurrent 0 longitude 35.751607 perMonthLimit 2000 latitude -6.162959 perMinuteCurrent 0 networkRank_1_networkId 6402 networkRank_1_type3G_sampleSizeSpeed 29 networkRank_1_type3G_averageRssiAsu 9.5429091136 networkRank_1_type3G_pingTime 320.9600 networkRank_1_type3G_networkType 3 networkRank_1_type3G_averageRssiDb -69.5664329624972 networkRank_1_type3G_networkName Vodacom networkRank_1_type3G_networkId 6402 networkRank_1_type3G_downloadSpeed 1508.1304 networkRank_1_type3G_uploadSpeed 893.7692 networkRank_1_type3G_reliability 0.804236452826138 networkRank_1_type3G_sampleSizeRSSI 948 networkRank_1_networkName Vodacom networkRank_2_networkId 6400 networkRank_2_type3G_sampleSizeSpeed 21 networkRank_2_type3G_averageRssiAsu 15.3537142857 networkRank_2_type3G_pingTime 259.0000 networkRank_2_type3G_networkType 3 networkRank_2_type3G_averageRssiDb -61.4563389583101 networkRank_2_type3G_networkName tiGO networkRank_2_type3G_networkId 6400 networkRank_2_type3G_downloadSpeed 516.0000 networkRank_2_type3G_uploadSpeed 320.4211 networkRank_2_type3G_reliability 0.911904765537807 networkRank_2_type3G_sampleSizeRSSI 935 networkRank_2_networkName tiGO networkRank_3_networkId 6403 networkRank_3_type3G_sampleSizeSpeed 21 networkRank_3_type3G_averageRssiAsu 13.2729999375 networkRank_3_type3G_pingTime 194.5556 networkRank_3_type3G_networkType 3 networkRank_3_type3G_averageRssiDb -58.1521092977699 networkRank_3_type3G_networkName Airtel networkRank_3_type3G_networkId 6403 networkRank_3_type3G_downloadSpeed 1080.2500 networkRank_3_type3G_uploadSpeed 572.1579 networkRank_3_type3G_reliability 0.554680264185345 networkRank_3_type3G_sampleSizeRSSI 587 networkRank_3_networkName Airtel network_type None apiVersion 2

更多推荐

将嵌套的JSON解析为数据框

本文发布于:2023-11-30 19:27:41,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1651229.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:嵌套   数据   JSON

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!