Custom grouping by months with pandas

I am trying to group monthly cash flows from a pandas dataframe on an irregular basis: for instance, at the end of months whose index is in [4, 9, 12]. I have been messing around with pd.groupby() and pd.resample() but can't figure out a syntax that works.

    import datetime as dt
    import pandas as pd
    import numpy as np

    # Twelve month-end dates for 2015 with random integer-valued cash flows
    index = pd.date_range(dt.datetime(2015, 1, 1), periods=12, freq='M')
    data = (10 * np.random.rand(12)).round()
    df = pd.DataFrame(data, index=index, columns=['A'])
    print(df)

Basically, from a list [4, 9, 12], I am looking for a syntax that would sum together the months 1-4, 5-9 and 10-12. For instance, from the column A below, I would like to get either the column B or C with a running-sum ending on these given months:

                A   B   C
    2015-01-31  9   9   0
    2015-02-28  7  16   0
    2015-03-31  5  21   0
    2015-04-30  4  25  25
    2015-05-31  6   6   0
    2015-06-30  8  14   0
    2015-07-31  2  16   0
    2015-08-31  7  23   0
    2015-09-30  0  23  23
    2015-10-31  5   5   0
    2015-11-30  5  10   0
    2015-12-31  8  18  18

Thank you for your help!

Jean

Best answer

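This uses some string formatting to build the index labels, and it works for any combination of months, as long as the first month is explicitly included in the list.

It avoids looping over the full pandas DataFrame and loops over the required months only. Each sum runs from one listed month up to the month before the next listed one (here months 1-3, 4-8 and 9-11) and is written to column B on the row of that next listed month.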

    year = 2015
    months = [1, 4, 9, 12]
    for num, month in enumerate(months[:-1]):
        # Sum column A from this listed month up to the month before the next listed one
        csum = df['{0}-{1}'.format(year, month):'{0}-{1}'.format(
            year, months[num+1]-1)].A.sum()
        # Store the total on the row of the next listed month
        df.loc['{0}-{1}'.format(year, months[num+1]), 'B'] = csum
    print(df)

                A    B
    2015-01-31  5  NaN
    2015-02-28  1  NaN
    2015-03-31  3  NaN
    2015-04-30  7    9
    2015-05-31  6  NaN
    2015-06-30  9  NaN
    2015-07-31  4  NaN
    2015-08-31  4  NaN
    2015-09-30  9   30
    2015-10-31  3  NaN
    2015-11-30  0  NaN
    2015-12-31  9   12
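The output above shows the loop summing months 1-3, 4-8 and 9-11, whereas the question asked for windows that end on the listed months (1-4, 5-9, 10-12). As a possible alternative, here is a minimal sketch that labels each row with the window it belongs to and then uses a grouped cumulative sum. It is not the answerer's method. The names `month_ends` and `group` are introduced here for illustration, and the values in column A are copied from the question's example table rather than drawn at random so the result can be compared with the expected columns B and C.

```python
import numpy as np
import pandas as pd

# Month-end dates for 2015; the offset object avoids relying on a frequency alias.
index = pd.date_range("2015-01-31", periods=12, freq=pd.offsets.MonthEnd())

# Column A copied from the question's example table (assumption: fixed data
# instead of random values, so the output is reproducible).
df = pd.DataFrame({"A": [9, 7, 5, 4, 6, 8, 2, 7, 0, 5, 5, 8]}, index=index)

# Hypothetical name: the last month of each window, in ascending order.
month_ends = [4, 9, 12]

# Label each row with the position of the first boundary >= its month,
# so months 1-4 -> 0, 5-9 -> 1, 10-12 -> 2.
months = np.asarray(df.index.month)
group = np.searchsorted(month_ends, months)

# B: running sum that restarts at the beginning of each window.
df["B"] = df["A"].groupby(group).cumsum()

# C: the window total on the boundary months, 0 everywhere else.
df["C"] = df["B"].where(np.isin(months, month_ends), 0)

print(df)
```

This sketch assumes a single calendar year. For data spanning several years, grouping on both `df.index.year` and `group` should keep the windows from running across year boundaries.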
