pandas: pivoting on rank
Question
Given this data:

    import pandas as pd

    df = pd.DataFrame({'id': ['aaa', 'aaa', 'abb', 'abb', 'abb', 'acd', 'acd', 'acd'],
                       'loc': ['US', 'UK', 'FR', 'US', 'IN', 'US', 'CN', 'CN']})

        id loc
    0  aaa  US
    1  aaa  UK
    2  abb  FR
    3  abb  US
    4  abb  IN
    5  acd  US
    6  acd  CN
    7  acd  CN

How do I pivot it into this:

    id   loc1  loc2  loc3
    aaa  US    UK    None
    abb  FR    US    IN
    acd  US    CN    CN

I am looking for the most idiomatic method.
Solution

I think you can create a new column cols with groupby and cumcount, convert it to string with astype, and finally use pivot:
    # Number the rows within each id (1, 2, 3, ...) and build the column labels
    df['cols'] = 'loc' + (df.groupby('id')['id'].cumcount() + 1).astype(str)
    print(df)

        id loc  cols
    0  aaa  US  loc1
    1  aaa  UK  loc2
    2  abb  FR  loc1
    3  abb  US  loc2
    4  abb  IN  loc3
    5  acd  US  loc1
    6  acd  CN  loc2
    7  acd  CN  loc3

    print(df.pivot(index='id', columns='cols', values='loc'))

    cols loc1 loc2  loc3
    id
    aaa    US   UK  None
    abb    FR   US    IN
    acd    US   CN    CN

If you want to remove the index and column names, use rename_axis:
    print(df.pivot(index='id', columns='cols', values='loc')
            .rename_axis(None)
            .rename_axis(None, axis=1))

        loc1 loc2  loc3
    aaa   US   UK  None
    abb   FR   US    IN
    acd   US   CN    CN

All together (thank you, Colin):
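    # pd.pivot is called here with three Series (index, columns, values),
    # a form that older pandas releases accepted
    print(pd.pivot(df['id'],
                   'loc' + (df.groupby('id').cumcount() + 1).astype(str),
                   df['loc'])
            .rename_axis(None)
            .rename_axis(None, axis=1))

        loc1 loc2  loc3
    aaa   US   UK  None
    abb   FR   US    IN
    acd   US   CN    CN

I tried rank, but I get an error in version 0.18.0: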
    print(df.groupby('id')['loc'].transform(lambda x: x.rank(method='first')))
    # ValueError: first not supported for non-numeric data
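For readers on a current pandas release, the steps above can be collected into one small function. This is only a sketch under a few assumptions: the helper name pivot_by_order and its key and value parameters are made up for illustration, DataFrame.pivot is called with keyword arguments, and rename_axis is given index= and columns= keywords, which needs a reasonably recent pandas. cumcount numbers the rows by order of appearance, so it stands in for rank(method='first') without needing numeric data.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['aaa', 'aaa', 'abb', 'abb', 'abb', 'acd', 'acd', 'acd'],
    'loc': ['US', 'UK', 'FR', 'US', 'IN', 'US', 'CN', 'CN'],
})


def pivot_by_order(frame, key='id', value='loc'):
    """Hypothetical helper: spread `value` into value1, value2, ... per `key`."""
    # Position of each row inside its group, starting at 1.
    position = frame.groupby(key).cumcount() + 1
    # Build the new column labels ('loc1', 'loc2', ...) without mutating `frame`.
    labelled = frame.assign(cols=value + position.astype(str))
    # One row per key, one column per label; missing cells stay empty (NaN/None).
    wide = labelled.pivot(index=key, columns='cols', values=value)
    # Drop the axis names ('id' and 'cols') left behind by pivot.
    return wide.rename_axis(index=None, columns=None)


wide = pivot_by_order(df)
print(wide)
```

One thing to keep in mind: pivot sorts the column labels as strings, so a group with ten or more rows would place loc10 before loc2. If that matters, pivoting on the integer position and adding the 'loc' prefix afterwards is one way around it.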