Goal
I have a Pandas DataFrame, shown below, with multiple columns, and I would like to get the total of one column, MyColumn.
DataFrame df:

    print df

       X  MyColumn      Y      Z
    0  A        84   13.0   69.0
    1  B        76   77.0  127.0
    2  C        28   69.0   16.0
    3  D        28   28.0   31.0
    4  E        19   20.0   85.0
    5  F        84  193.0   70.0
My attempt:
I tried to get the sum of the column using groupby and .sum():
    Total = df.groupby['MyColumn'].sum()
    print Total

This results in the following error:

    TypeError: 'instancemethod' object has no attribute '__getitem__'
Expected Output
I would expect the output to be as follows:

    319

Or alternatively, I would like df to be edited with a new row entitled TOTAL containing the total:
           X  MyColumn      Y      Z
    0      A        84   13.0   69.0
    1      B        76   77.0  127.0
    2      C        28   69.0   16.0
    3      D        28   28.0   31.0
    4      E        19   20.0   85.0
    5      F        84  193.0   70.0
    TOTAL          319

Recommended answer
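Before the fix, it may help to see why the attempt fails: groupby is a method, so it has to be called with parentheses rather than subscripted with square brackets. The following is a minimal sketch, assuming Python 3 (where the wording of the error message differs from the Python 2 message quoted in the question) and rebuilding only the X and MyColumn columns from the question:

```python
import pandas as pd

# Only the two columns needed for the illustration, copied from the question.
df = pd.DataFrame({
    "X": list("ABCDEF"),
    "MyColumn": [84, 76, 28, 28, 19, 84],
})

# Subscripting the groupby method itself (square brackets) raises TypeError.
try:
    df.groupby["MyColumn"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Calling groupby correctly works, but it groups rows by each distinct value
# of MyColumn, which is not a column total.
group_sizes = df.groupby("MyColumn").size()
print(group_sizes)

# Summing the column directly gives the single total the question asks for.
column_total = df["MyColumn"].sum()
print(column_total)
```

Even when called correctly, groupby("MyColumn") splits the rows by the distinct values in that column, so it is the wrong tool for a single column total.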
You should use sum:
    Total = df['MyColumn'].sum()
    print(Total)

    319

Then you can use loc with a Series. In that case, the index of the Series should be set to the name of the column you are summing:
    import pandas as pd

    df.loc['Total'] = pd.Series(df['MyColumn'].sum(), index=['MyColumn'])
    print(df)

             X  MyColumn      Y      Z
    0        A      84.0   13.0   69.0
    1        B      76.0   77.0  127.0
    2        C      28.0   69.0   16.0
    3        D      28.0   28.0   31.0
    4        E      19.0   20.0   85.0
    5        F      84.0  193.0   70.0
    Total  NaN     319.0    NaN    NaN

This is needed because if you pass a scalar instead, it fills every column of the new row:
    df.loc['Total'] = df['MyColumn'].sum()
    print(df)

             X  MyColumn      Y      Z
    0        A        84   13.0   69.0
    1        B        76   77.0  127.0
    2        C        28   69.0   16.0
    3        D        28   28.0   31.0
    4        E        19   20.0   85.0
    5        F        84  193.0   70.0
    Total  319       319  319.0  319.0

Two other solutions use at and ix; see the applications below:
    df.at['Total', 'MyColumn'] = df['MyColumn'].sum()
    print(df)

             X  MyColumn      Y      Z
    0        A      84.0   13.0   69.0
    1        B      76.0   77.0  127.0
    2        C      28.0   69.0   16.0
    3        D      28.0   28.0   31.0
    4        E      19.0   20.0   85.0
    5        F      84.0  193.0   70.0
    Total  NaN     319.0    NaN    NaN

    df.ix['Total', 'MyColumn'] = df['MyColumn'].sum()
    print(df)

             X  MyColumn      Y      Z
    0        A      84.0   13.0   69.0
    1        B      76.0   77.0  127.0
    2        C      28.0   69.0   16.0
    3        D      28.0   28.0   31.0
    4        E      19.0   20.0   85.0
    5        F      84.0  193.0   70.0
    Total  NaN     319.0    NaN    NaN
Note: ix has been deprecated since Pandas v0.20 and was removed in v1.0. Use loc or iloc instead.
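For readers on a pandas version where ix is no longer available, the same single-cell assignment can be written with loc. The sketch below is self-contained and rebuilds the data frame from the question; the variable names total and result are illustrative choices and do not come from the original answer:

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "X": list("ABCDEF"),
    "MyColumn": [84, 76, 28, 28, 19, 84],
    "Y": [13.0, 77.0, 69.0, 28.0, 20.0, 193.0],
    "Z": [69.0, 127.0, 16.0, 31.0, 85.0, 70.0],
})

# Total of the single column.
total = df["MyColumn"].sum()

# Work on a copy so the original frame keeps its six rows.
result = df.copy()

# loc with a new row label and an existing column label adds the row and
# sets only that cell; the other columns in the new row are left missing.
result.loc["Total", "MyColumn"] = total
print(result)
```

Working on a copy is optional; it simply keeps df free of the Total row so that later aggregations over df do not count the total twice.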