I'm new to data frames and am struggling to figure out how to accomplish the following:
I have a dataframe already as a time series like so:
timestamp            source
2017-06-18 10:43:54  two
2017-06-20 03:38:23  three
2017-06-18 07:37:02  one
2017-06-07 16:49:51  two
2017-06-15 22:36:10  two
2017-06-07 16:49:51  two
2017-06-18 22:36:10  two
I am trying to 1) resample into daily and 2) get a % of each category for that day. Like so:
timestamp   One  Two   Three
2017-06-18  33%  66%   0%
2017-06-20  0%   0%    100%
2017-06-07  0%   100%  0%
2017-06-15  0%   100%  0%
I can do basic things, such as getting a count of 'source' resampled to daily, but that doesn't break the count down by category.
Can anyone help point me in the right direction? Greatly appreciated.
Recommended answer
groupby + value_counts + unstack
(df.groupby(df.timestamp.dt.date).source.value_counts(normalize=True)*100).unstack().fillna(0)

source            one  three         two
timestamp
2017-06-07   0.000000    0.0  100.000000
2017-06-15   0.000000    0.0  100.000000
2017-06-18  33.333333    0.0   66.666667
2017-06-20   0.000000  100.0    0.000000
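For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample rows from the question and applies the same groupby + value_counts + unstack idea. It assumes the timestamp column is already a datetime dtype (hence the pd.to_datetime call, which is my addition), and the name daily_pct is only illustrative.

```python
import pandas as pd

# Sample rows copied from the question. Parsing the strings with
# pd.to_datetime is an assumption about how the original frame was built.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-06-18 10:43:54",
        "2017-06-20 03:38:23",
        "2017-06-18 07:37:02",
        "2017-06-07 16:49:51",
        "2017-06-15 22:36:10",
        "2017-06-07 16:49:51",
        "2017-06-18 22:36:10",
    ]),
    "source": ["two", "three", "one", "two", "two", "two", "two"],
})

# Group by calendar day, take each category's share within that day,
# scale to percent, then pivot the categories into columns.
daily_pct = (
    df.groupby(df["timestamp"].dt.date)["source"]
      .value_counts(normalize=True)
      .mul(100)
      .unstack(fill_value=0)
)

print(daily_pct.round(1))
```

Passing fill_value=0 to unstack does the same job as the trailing fillna(0): categories that never appear on a given day show up as 0 rather than NaN.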
pivot_table
df2 = df.pivot_table(index=df.timestamp.dt.date, columns='source', aggfunc='size')
df2 = df2.divide(df2.sum(1), axis=0).fillna(0)*100
pd.crosstab
pd.crosstab(df.timestamp.dt.date, df.source, normalize='index')*100
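If the goal is a table with literal percent signs like the one in the question, the crosstab result can be rounded and converted to strings. This is only a sketch: the sample frame is rebuilt from the question's rows, the names pct and labels are made up for illustration, and rounding to whole percentages is my assumption (so two out of three shows as 67% rather than the 66% written in the question).

```python
import pandas as pd

# Sample rows copied from the question, parsed to datetime (an assumption).
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-06-18 10:43:54",
        "2017-06-20 03:38:23",
        "2017-06-18 07:37:02",
        "2017-06-07 16:49:51",
        "2017-06-15 22:36:10",
        "2017-06-07 16:49:51",
        "2017-06-18 22:36:10",
    ]),
    "source": ["two", "three", "one", "two", "two", "two", "two"],
})

# normalize='index' makes each row (day) sum to 1; scale to percent.
pct = pd.crosstab(df["timestamp"].dt.date, df["source"], normalize="index") * 100

# Round to whole numbers and append a percent sign for display only.
labels = pct.round().astype(int).astype(str) + "%"

print(labels)
```

Keep pct around for any further arithmetic; labels holds strings, so it is only suitable for display.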