Merge 2 Pandas DataFrame, Take One Record Over Another if Index Matches

Updated: 2024-10-19 01:25:40

I want to merge two Pandas DataFrames, but anywhere an index matches I only want to merge in the row from a specific df.

So if I have

    df1
                   A    B
    type   model
    apple  v1     10  xyz
    orange v2     11  pqs

    df2
                   A    B
    type   model
    apple  v3     11  xyz
    grape  v4     12  def

I would get

    df3
                   A    B
    type   model
    apple  v1     10  xyz
    orange v2     11  pqs
    grape  v4     12  def

Because df1.ix['apple'] takes precedence over df2.ix['apple'], and orange and grape are unique.

I have been trying to make some index comparison work, but df2.drop(df1.index[[0]]) is just removing the entire contents of df2.

Both data frames are multi-indexed with a similar structure, created by:

pd.read_csv(..., index_col=[3, 1])

Which results in an index like this:

    MultiIndex(
        levels=[[u'apple', u'orange', u'grape', ...], [u'v1', u'v2', u'v3', ... ]],
        labels=[[0, 1, 2, 3, 4, 6, 7, 8, 9, 10, ...]],
        names=[u'type', u'model']
    )

Best answer

That's what DataFrame.combine_first() is for:

    import pandas as pd

    df1 = pd.DataFrame({'A': [10, 11], 'B': ['xyz', 'pqs']},
                       index=['apple', 'orange'])
    df2 = pd.DataFrame({'A': [11, 12], 'B': ['xyz', 'def']},
                       index=['apple', 'grape'])

    df3 = df1.combine_first(df2)

yields

    df3
               A    B
    apple   10.0  xyz
    grape   12.0  def
    orange  11.0  pqs
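One thing worth knowing before relying on this: combine_first() works cell by cell, not row by row. Where an index label exists in both frames, a missing value in df1 is filled from df2, while every non-missing value in df1 is kept. The sketch below uses made-up data (the NaN and the values 99 and 'zzz' are not from the question) to show that behaviour:

```python
import numpy as np
import pandas as pd

# Hypothetical data: both frames share the same labels, and df1 has one gap.
df1 = pd.DataFrame({'A': [10, np.nan], 'B': ['xyz', 'pqs']},
                   index=['apple', 'orange'])
df2 = pd.DataFrame({'A': [11, 99], 'B': ['zzz', 'def']},
                   index=['apple', 'orange'])

patched = df1.combine_first(df2)

# apple:  both cells come from df1, because neither is missing there.
# orange: 'A' is missing in df1, so it is filled from df2; 'B' stays from df1.
print(patched)
```

If df1 has no missing values, as in the question, the result is the same as taking whole rows from df1.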

EDIT: The question was substantially modified after I posted the answer above — adding the model level to the index, effectively turning it into a MultiIndex.

    import pandas as pd

    # Create the df1 in the question
    df1 = pd.DataFrame({'model': ['v1', 'v2'], 'A': [10, 11], 'B': ['xyz', 'pqs']},
                       index=['apple', 'orange'])
    df1.index.name = 'type'
    df1.set_index('model', append=True, inplace=True)

    # Create the df2 in the question
    df2 = pd.DataFrame({'model': ['v3', 'v4'], 'A': [11, 12], 'B': ['xyz', 'def']},
                       index=['apple', 'grape'])
    df2.index.name = 'type'
    df2.set_index('model', append=True, inplace=True)

    # Solution: remove the `model` from the index and apply the above
    # technique. Restore it to the index at the end if you want.
    df1.reset_index(level=1, inplace=True)
    df2.reset_index(level=1, inplace=True)
    df3 = df1.combine_first(df2).set_index('model', append=True)

Result:

    df3
                     A    B
    type   model
    apple  v1     10.0  xyz
    grape  v4     12.0  def
    orange v2     11.0  pqs
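An alternative that leaves the MultiIndex in place is to filter df2 down to the types df1 does not already have, then concatenate. This is not part of the answer above, only a minimal sketch. It assumes precedence is decided by the `type` level alone, and `make_frame` is a hypothetical helper used here just to rebuild the frames from the question:

```python
import pandas as pd


def make_frame(types, models, a, b):
    """Hypothetical helper: build a frame indexed by (type, model)."""
    index = pd.MultiIndex.from_arrays([types, models], names=['type', 'model'])
    return pd.DataFrame({'A': a, 'B': b}, index=index)


df1 = make_frame(['apple', 'orange'], ['v1', 'v2'], [10, 11], ['xyz', 'pqs'])
df2 = make_frame(['apple', 'grape'], ['v3', 'v4'], [11, 12], ['xyz', 'def'])

# Types already present in df1 take precedence, so keep only the df2 rows
# whose `type` does not appear in df1.
taken = df1.index.get_level_values('type')
is_new = ~df2.index.get_level_values('type').isin(taken)

# Whole rows are taken from df1 first, then the remaining rows from df2.
df3 = pd.concat([df1, df2[is_new]])
print(df3)
```

Because rows are selected whole rather than patched cell by cell, missing values in df1 are left as they are, and the rows keep the order df1 first, then df2, instead of being sorted by index.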

Published: 2023-08-04 11:24:00
Article link: https://www.elefans.com/category/jswz/34/1415268.html