Python Pandas:在多个条件下合并数据框

编程入门 行业动态 更新时间:2024-10-27 20:35:04
本文介绍了Python Pandas:在多个条件下合并数据框的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧! 问题描述

我希望合并在多种情况下通过sql获取的数据帧.

I wish to merge data frames as fetched via sql under multiple condition.

  • df1:第一个df包含客户ID,群集ID和客户区域ID.
  • 第二个df包含投诉ID,注册号.

df1和df2如下所示:

The df1 and df2 are shown below:

df1

Customer ID Cluster ID Customer Zone ID CUS1001.A CUS1001.X CUS1000 CUS1001.B CUS1001.X CUS1000 CUS1001.C CUS1001.X CUS1000 CUS1001.D CUS1001.X CUS1000 CUS1001.E CUS1001.X CUS1000 CUS2001.A CUS2001.X CUS2000

df2:

Complain ID RegistrationNumber Status CUS3501.A 99231 open CUS1001.B 21340 open CUS1001.X 32100 open

我希望在满足以下条件的情况下合并这两个数据框:

I wish to merge these two data frame with following condition:

if(Complain ID == Customer ID): Merge on Customer ID Elif(Complain ID == Cluster ID): Merge on Customer ID Elif (Complain ID == Customer Zone ID): Merge on Customer ID Else: Merge empty row.

最终结果应如下所示:

Customer ID Cluster ID Customer Zone ID Complain ID Regi ID Status CUS1001.A CUS1001.X CUS1000 CUS1001.X 32100 open CUS1001.B CUS1001.X CUS1000 CUS1001.B 21340 open CUS1001.C CUS1001.X CUS1000 CUS1001.X 32100 open . . . . . . . . . . . . CUS2001.A CUS2001.X CUS2000 0 0 0

请帮助!

推荐答案

尝试使用... ...使用pandas:melt,merge和concat

Try this ...using pandas: melt, merge and concat

df=pd.melt(df1) df=df.merge(df2,left_on='value',right_on='Complain ID',how='left') df['number']=df.groupby('variable').cumcount() df=df.groupby('number').bfill() Target=pd.concat([df1,df.iloc[:5,2:6]],axis=1).fillna(0).drop('number',axis=1) Target Out[39]: Customer ID Cluster ID Customer Zone ID Complain ID RegistrationNumber \ 0 CUS1001.A CUS1001.X CUS1000 CUS1001.X 32100.0 1 CUS1001.B CUS1001.X CUS1000 CUS1001.B 21340.0 2 CUS1001.C CUS1001.X CUS1000 CUS1001.X 32100.0 3 CUS1001.D CUS1001.X CUS1000 CUS1001.X 32100.0 4 CUS1001.E CUS1001.X CUS1000 CUS1001.X 32100.0 5 CUS2001.A CUS2001.X CUS2000 0 0.0 Status 0 open 1 open 2 open 3 open 4 open 5 0

更新 通过使用numpy的intersect1d,我个人比以前更喜欢这种方法.

Update By using numpy's intersect1d, Personally I like this approach most than the previous one .

df1.MatchId=[np.intersect1d(x,df2.ComplainID.values) for x in df1[['CustomerID','ClusterID']].values] df1.MatchId=df1.MatchId.apply(pd.Series) df1 Out[307]: CustomerID ClusterID CustomerZoneID MatchId 0 CUS1001.A CUS1001.X CUS1000 CUS1001.X 1 CUS1001.B CUS1001.X CUS1000 CUS1001.B 2 CUS1001.C CUS1001.X CUS1000 CUS1001.X 3 CUS1001.D CUS1001.X CUS1000 CUS1001.X 4 CUS1001.E CUS1001.X CUS1000 CUS1001.X 5 CUS2001.A CUS2001.X CUS2000 NaN df1.merge(df2,left_on='MatchId',right_on='ComplainID',how='left') Out[311]: CustomerID ClusterID CustomerZoneID MatchId ComplainID \ 0 CUS1001.A CUS1001.X CUS1000 CUS1001.X CUS1001.X 1 CUS1001.B CUS1001.X CUS1000 CUS1001.B CUS1001.B 2 CUS1001.C CUS1001.X CUS1000 CUS1001.X CUS1001.X 3 CUS1001.D CUS1001.X CUS1000 CUS1001.X CUS1001.X 4 CUS1001.E CUS1001.X CUS1000 CUS1001.X CUS1001.X 5 CUS2001.A CUS2001.X CUS2000 NaN NaN RegistrationNumber Status 0 32100.0 open 1 21340.0 open 2 32100.0 open 3 32100.0 open 4 32100.0 open 5 NaN NaN

更多推荐

Python Pandas:在多个条件下合并数据框

本文发布于:2023-10-16 11:33:39,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1497440.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:多个   条件下   数据   Python   Pandas

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!