I am extracting some data from an API and having challenges transforming it into a proper dataframe.
The resulting DataFrame df is arranged as such:
Index  Column
0      {'email@email': [{'action': 'data', 'date': 'date'}, {'action': 'data', 'date': 'date'}]}
1      {'different-email@email': [{'action': 'data', 'date': 'date'}]}
I am trying to split the emails into one column and the list into a separate column:
Index  Column1      Column2
0      email@email  [{'action': 'data', 'date': 'date'}, {'action': 'data', 'date': 'date'}]
Ideally, each 'action'/'date' pair would have its own separate row; however, I believe I can do the further unpacking myself.
After looking around I tried/failed lots of solutions such as:
df.apply(pd.Series)  # does nothing
pd.DataFrame(df['column'].values.tolist())  # makes each dictionary key a separate column where most of the rows are NaN except the one that holds the paired value
Since many people asked about the initial format of the data from the API, it is a list of dictionaries:
[{'email@email': [{'action': 'data', 'date': 'date'}, {'action': 'data', 'date': 'date'}]}, {'different-email@email': [{'action': 'data', 'date': 'date'}]}]
Thanks
Recommended answer
A naive approach is as follows:
import pandas as pd

inp = [{'email@email': [{'action': 'data', 'date': 'date'}, {'action': 'data', 'date': 'date'}]},
       {'different-email@email': [{'action': 'data', 'date': 'date'}]}]

index = 0
df = pd.DataFrame()
for each in inp:  # iterate through the list of dicts
    for k, v in each.items():  # take each key/value pair
        for eachv in v:  # the value is a list, so iterate through each entry
            print(str(eachv))
            # .loc assigns the cell and grows the frame as needed
            # (DataFrame.set_value was removed in pandas 1.0)
            df.loc[index, 'Column1'] = k
            df.loc[index, 'Column2'] = str(eachv)
            index += 1
I am sure there might be a better way of writing this. Hope this helps :)
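One possible tidier variant, offered only as a sketch: collect the rows in plain Python first and build the DataFrame once, instead of setting cells one at a time. It assumes the API payload has exactly the shape shown in the question (a list of single-key dicts mapping an email to a list of action/date dicts). The names `wide`, `long`, and the `email` column are made up for illustration and do not come from the original post.

```python
import pandas as pd

# Same sample payload as in the question: a list of single-key dicts
inp = [
    {'email@email': [{'action': 'data', 'date': 'date'},
                     {'action': 'data', 'date': 'date'}]},
    {'different-email@email': [{'action': 'data', 'date': 'date'}]},
]

# One row per email: the email in Column1, its list of events in Column2
wide = pd.DataFrame(
    [(email, events) for record in inp for email, events in record.items()],
    columns=['Column1', 'Column2'],
)

# One row per action/date pair, with the dict keys as their own columns
long = pd.DataFrame(
    [{'email': email, **event}
     for record in inp
     for email, events in record.items()
     for event in events]
)

print(wide)
print(long)
```

Here `wide` matches the two-column layout the question asks for, while `long` goes one step further and gives each action/date pair its own row. Unlike the loop above, the inner dicts stay as real values rather than being converted to strings.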