I have an Orders database for an online shopping platform.
The table I'm working with looks like this, where each line corresponds to one customer/item/date.
    OrderHistory <- data.frame(
      date     = c("2015-02-01", "2015-03-01", "2015-04-01", "2015-03-01",
                   "2015-04-01", "2015-05-01", "2015-05-01"),
      customer = c("A", "A", "A", "B", "B", "B", "B"),
      item     = c("Candy", "Coffee", "Coffee", "Candy", "Candy", "Candy", "Coffee")
    )
What I would like to get is a running count of the number of times each member has ordered the specific item so I can run analysis on which items are ordered repeatedly by the same customers and which ones are ordered once and never again.
The output would look like this:
    out <- data.frame(
      date   = c("2015-02-01", "2015-03-01", "2015-04-01", "2015-03-01",
                 "2015-04-01", "2015-05-01", "2015-05-01"),
      member = c("A", "A", "A", "B", "B", "B", "B"),
      item   = c("Candy", "Coffee", "Coffee", "Candy", "Candy", "Candy", "Coffee"),
      count  = c(1, 1, 2, 1, 2, 3, 1)
    )
I would love a dplyr solution but I'm open to any suggestions! The exact items on the platform are constantly changing, so the solution would have to be dynamic to account for that.
Recommended answer
I believe this should give you what you want:
    library(dplyr)

    # Number the rows 1, 2, 3, ... within each customer/item group
    OrderHistory %>%
      group_by(customer, item) %>%
      mutate(count = seq(n()))

    Source: local data frame [7 x 4]
    Groups: customer, item

            date customer   item count
    1 2015-02-01        A  Candy     1
    2 2015-03-01        A Coffee     1
    3 2015-04-01        A Coffee     2
    4 2015-03-01        B  Candy     1
    5 2015-04-01        B  Candy     2
    6 2015-05-01        B  Candy     3
    7 2015-05-01        B Coffee     1
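Since the question is open to non-dplyr suggestions, here is a minimal base R sketch of the same per-group running count using `ave()`. It assumes the rows are already sorted by date within each customer/item pair, as they are in the sample data. The `max_orders` summary at the end is a hypothetical follow-up step, not part of the original answer.

```r
# Same sample data as in the question.
OrderHistory <- data.frame(
  date     = c("2015-02-01", "2015-03-01", "2015-04-01", "2015-03-01",
               "2015-04-01", "2015-05-01", "2015-05-01"),
  customer = c("A", "A", "A", "B", "B", "B", "B"),
  item     = c("Candy", "Coffee", "Coffee", "Candy", "Candy", "Candy", "Coffee")
)

# ave() applies FUN within each customer/item combination and puts the
# results back in the original row order. seq_along() numbers the rows of
# each group 1, 2, 3, ...
# Assumption: rows are already in date order within each group.
OrderHistory$count <- ave(
  seq_len(nrow(OrderHistory)),
  OrderHistory$customer, OrderHistory$item,
  FUN = seq_along
)

# Hypothetical follow-up: the highest count per customer/item pair is the
# total number of times that customer ordered that item, so a value of 1
# marks an item that was ordered once and never again.
max_orders <- aggregate(count ~ customer + item, data = OrderHistory, FUN = max)
```

Because the grouping is driven by whatever values appear in `customer` and `item`, nothing here needs to change when new items are added to the platform.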