根据其他行值的差异在相同的熊猫数据框中创建新的行值(Creating new row values within same pandas dataframe based on difference o

根据其他行值的差异在相同的熊猫数据框中创建新的行值(Creating new row values within same pandas dataframe based on difference of other row values)

以下是现有的df

data = np.array([['','Market','Product Code','Week','Sales','Units'], ['Total Customers',123,1,500,400], ['Total Customers',123,2,400,320], ['Major Customer 1',123,1,100,220], ['Major Customer 1',123,2,230,230], ['Major Customer 2',123,1,130,30], ['Major Customer 2',123,2,20,10], ['Total Customers',456,1,500,400], ['Total Customers',456,2,400,320], ['Major Customer 1',456,1,100,220], ['Major Customer 1',456,2,230,230], ['Major Customer 2',456,1,130,30], ['Major Customer 2',456,2,20,10]]) df =pd.DataFrame(data)

我希望根据“市场”列（总客户）中的行值与“市场”列（主要客户1 +主要客户2）中的行值之间的价值差异创建新行。我希望将“市场”列中的新行值指定为“剩余客户”并附加在同一个DF内。

总体而言，我基本上正在努力解决市场上剩余的销售和单位差距问题

这是我迄今为止使用loc试过的，但我一直在收到一个关键错误。谁能帮忙？

df.loc[df['Market'] == 'Remaining Customers'] = df.loc[df['Market'] == 'Total Customers']- (df.loc[df['Market'] == 'Major Customer 1']+df.loc[df['Market'] == 'Major Customer 2'])

Below is an existing df

data = np.array([['','Market','Product Code','Week','Sales','Units'], ['Total Customers',123,1,500,400], ['Total Customers',123,2,400,320], ['Major Customer 1',123,1,100,220], ['Major Customer 1',123,2,230,230], ['Major Customer 2',123,1,130,30], ['Major Customer 2',123,2,20,10], ['Total Customers',456,1,500,400], ['Total Customers',456,2,400,320], ['Major Customer 1',456,1,100,220], ['Major Customer 1',456,2,230,230], ['Major Customer 2',456,1,130,30], ['Major Customer 2',456,2,20,10]]) df =pd.DataFrame(data)

I wish to create new rows based on the value difference between the row value in the 'Market' column (Total Customers) and the row values in the 'Market' Column (Major Customer 1 + Major Customer 2). I wish to assign the new row value in 'Market' Column as 'Remaining Customers' and append within the same df.

Overall, I'm basically trying to work out the remaining Sales and Unit 'Gap' of the market

This is what I have tried so far using loc but I keep getting a key error. Can anyone help?

df.loc[df['Market'] == 'Remaining Customers'] = df.loc[df['Market'] == 'Total Customers']- (df.loc[df['Market'] == 'Major Customer 1']+df.loc[df['Market'] == 'Major Customer 2'])