This article describes how to add a row for each month in a dataframe based on its date columns.
Problem description
I am dealing with financial data that I need to break out across different months. Here is my dataframe:
invoice_id,date_from,date_to
30492,2019-02-04,2019-09-18

I want to split this row into one row per month between date_from and date_to. Each added row should run from the start of the month to the end of the month, except that the first row starts at date_from and the last row ends at date_to. The final output should look like:
invoice_id,date_from,date_to
30492,2019-02-04,2019-02-28
30492,2019-03-01,2019-03-31
30492,2019-04-01,2019-04-30
30492,2019-05-01,2019-05-31
30492,2019-06-01,2019-06-30
30492,2019-07-01,2019-07-31
30492,2019-08-01,2019-08-31
30492,2019-09-01,2019-09-18

The leap-year case needs to be handled as well. Is there a native method in the pandas datetime functionality that I can use to get the desired output?
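On the leap-year point, a small sketch may help: pandas date offsets follow the calendar, so a month-end offset lands on 29 February in a leap year without any special handling. The two sample dates and the `month_ends` name are illustrative assumptions, not part of the original question.

```python
import pandas as pd

# MonthEnd(0) rolls a date forward to the end of its own month
# (a date that is already a month end is left unchanged).
sample_days = ["2019-02-04", "2020-02-04"]  # illustrative dates: non-leap and leap year
month_ends = {
    day: (pd.Timestamp(day) + pd.offsets.MonthEnd(0)).date().isoformat()
    for day in sample_days
}

for day, month_end in month_ends.items():
    print(day, "->", month_end)
# 2019-02-04 -> 2019-02-28
# 2020-02-04 -> 2020-02-29
```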
Recommended answer
Use:
import numpy as np
import pandas as pd

# date_from and date_to are assumed to already be datetime64 columns
print (df)
   invoice_id  date_from    date_to
0       30492 2019-02-04 2019-09-18
1       30493 2019-01-20 2019-03-10

# build the month starts between date_from and date_to for each invoice
df1 = pd.concat([pd.Series(r.invoice_id, pd.date_range(r.date_from, r.date_to, freq='MS'))
                 for r in df.itertuples()]).reset_index()
df1.columns = ['date_from', 'invoice_id']

# append the month starts to the original rows, then sort so each row is in the correct position
df2 = (pd.concat([df[['invoice_id', 'date_from']], df1], sort=False, ignore_index=True)
         .sort_values(['invoice_id', 'date_from'])
         .reset_index(drop=True))

# use the month end for every row except the last row of each invoice, which gets date_to
mask = df2['invoice_id'].duplicated(keep='last')
s = df2['invoice_id'].map(df.set_index('invoice_id')['date_to'])
df2['date_to'] = np.where(mask, df2['date_from'] + pd.offsets.MonthEnd(), s)

print (df2)
    invoice_id  date_from    date_to
0        30492 2019-02-04 2019-02-28
1        30492 2019-03-01 2019-03-31
2        30492 2019-04-01 2019-04-30
3        30492 2019-05-01 2019-05-31
4        30492 2019-06-01 2019-06-30
5        30492 2019-07-01 2019-07-31
6        30492 2019-08-01 2019-08-31
7        30492 2019-09-01 2019-09-18
8        30493 2019-01-20 2019-01-31
9        30493 2019-02-01 2019-02-28
10       30493 2019-03-01 2019-03-10
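Two edge cases in the approach above are worth checking against your own data. If date_from already falls on the first of a month, the 'MS' date range repeats that date and produces a duplicate row. If date_from falls on the last day of a month, `MonthEnd()` moves it to the end of the following month. The following is a minimal alternative sketch, not the answerer's method: it iterates over calendar months with `pd.period_range` and clips each month to the invoice's own range. The helper name `split_by_month` is hypothetical, and the sketch assumes date_from <= date_to and dates without a time component.

```python
import pandas as pd

def split_by_month(df):
    """Return one row per invoice and calendar month covered by [date_from, date_to].

    Hypothetical helper; assumes date_from <= date_to and midnight timestamps.
    """
    rows = []
    for r in df.itertuples(index=False):
        start, end = pd.Timestamp(r.date_from), pd.Timestamp(r.date_to)
        # One monthly Period for every calendar month the range touches.
        for month in pd.period_range(start, end, freq="M"):
            rows.append({
                "invoice_id": r.invoice_id,
                # Clip the month boundaries to the invoice's own range.
                "date_from": max(start, month.start_time),
                "date_to": min(end, month.end_time.normalize()),
            })
    return pd.DataFrame(rows, columns=["invoice_id", "date_from", "date_to"])

# Sample data taken from the question and answer above.
df = pd.DataFrame({
    "invoice_id": [30492, 30493],
    "date_from": pd.to_datetime(["2019-02-04", "2019-01-20"]),
    "date_to": pd.to_datetime(["2019-09-18", "2019-03-10"]),
})
result = split_by_month(df)
print(result)
```

The explicit Python loop favors readability over speed; for a very large number of invoices a vectorized variant would likely be preferable.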