I have a DataFrame that looks like this:

                     date name
    0 2015-06-13 00:21:25    a
    1 2015-06-13 01:00:25    b
    2 2015-06-13 02:54:48    c
    3 2015-06-15 14:38:15    a
    4 2015-06-15 15:29:28    b

I want to count the occurrences of dates against a specific date range, including dates that do not appear in the column (and ignoring whatever is in the name column). For example, I might have a date range that looks like this:

    periods = pd.date_range('2015-06-13', '2015-06-16', freq='D')

Then I want an output that looks something like this:

          date  count
    2015-06-13      3
    2015-06-14      0
    2015-06-15      2
    2015-06-16      0

I haven't been able to find any function that lets me keep the 0 rows.
Best answer
I think you can first count the occurrences of each day in the date column with value_counts, then reindex by periods so that the missing days appear, and fill them with 0 using fillna. Finally, convert the resulting floats back to int with astype and call reset_index:

    # Count rows per calendar day. dt.normalize() truncates each timestamp to
    # midnight and keeps a datetime index, so the labels line up with periods
    # (dt.date yields plain date objects, which newer pandas versions may not
    # match against Timestamps when reindexing).
    counts = df['date'].dt.normalize().value_counts()
    print(counts)

This prints (the exact header and Name line depend on the pandas version):

    2015-06-13    3
    2015-06-15    2
    Name: date, dtype: int64

Then reindex against the full range:

    periods = pd.date_range('2015-06-13', '2015-06-16', freq='D')
    # Days absent from counts become NaN, hence fillna(0) and the cast to int
    result = counts.reindex(periods).fillna(0).astype(int).reset_index()
    result.columns = ['date', 'count']
    print(result)

Output:

            date  count
    0 2015-06-13      3
    1 2015-06-14      0
    2 2015-06-15      2
    3 2015-06-16      0
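The steps in the answer can also be sketched as one self-contained script. This is only a sketch under some assumptions: the sample frame is rebuilt by hand from the question, the helper name count_per_day is hypothetical, and reindex is given fill_value=0 instead of the fillna/astype pair, which keeps the counts as integers throughout.

```python
import pandas as pd

# Sample data rebuilt by hand from the question
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2015-06-13 00:21:25",
        "2015-06-13 01:00:25",
        "2015-06-13 02:54:48",
        "2015-06-15 14:38:15",
        "2015-06-15 15:29:28",
    ]),
    "name": ["a", "b", "c", "a", "b"],
})

periods = pd.date_range("2015-06-13", "2015-06-16", freq="D")


def count_per_day(dates, periods):
    """Hypothetical helper: count rows per day, keeping days with no rows."""
    # Truncate each timestamp to midnight so labels match the daily range
    daily = dates.dt.normalize().value_counts()
    # fill_value=0 inserts 0 for missing days without converting to float
    filled = daily.reindex(periods, fill_value=0)
    # Name the index "date" and the values "count" when flattening to a frame
    return filled.rename_axis("date").reset_index(name="count")


result = count_per_day(df["date"], periods)
print(result)
# The count column is [3, 0, 2, 0] for the four days in periods
```

Passing fill_value=0 is a design choice rather than a requirement; the fillna(0).astype(int) chain from the answer gives the same counts.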