How to create a column that cumulatively adds the sum of two previous rows based on a condition?

Updated: 2024-10-17 02:57:08
Problem description


I tried asking this question before, but it was poorly stated. This is a new attempt because I haven't solved it yet.

I have a dataset with winners, losers, date, winner_points and loser_points.

For each row, I want two new columns, one for the winner and one for the loser, showing how many points that player has scored so far (counting points earned both as a winner and as a loser).

Example data:

winner <- c(1,2,3,1,2,3,1,2,3)
loser <- c(3,1,1,2,1,1,3,1,2)
date <- c("2017-10-01","2017-10-02","2017-10-03","2017-10-04","2017-10-05","2017-10-06","2017-10-07","2017-10-08","2017-10-09")
winner_points <- c(2,1,2,1,2,1,2,1,2)
loser_points <- c(1,0,1,0,1,0,1,0,1)
test_data <- data.frame(winner, loser, date = as.Date(date), winner_points, loser_points)

I want the output to be:

winner_points_sum <- c(0, 0, 1, 3, 1, 3, 5, 3, 5)
loser_points_sum <- c(0, 2, 2, 1, 4, 5, 4, 7, 4)
test_data <- data.frame(winner, loser, date = as.Date(date), winner_points, loser_points, winner_points_sum, loser_points_sum)

So far I've solved it with a for loop such as:

library(dplyr)

test_data$winner_points_sum_loop <- 0
test_data$loser_points_sum_loop <- 0

for (i in row.names(test_data)) {
  test_data[i, ]$winner_points_sum_loop <- (
    test_data %>%
      dplyr::filter(winner == test_data[i, ]$winner & date < test_data[i, ]$date) %>%
      dplyr::summarise(points = sum(winner_points, na.rm = TRUE)) +
    test_data %>%
      dplyr::filter(loser == test_data[i, ]$winner & date < test_data[i, ]$date) %>%
      dplyr::summarise(points = sum(loser_points, na.rm = TRUE))
  )
}

test_data$winner_points_sum_loop <- unlist(test_data$winner_points_sum_loop)

Any suggestions on how to tackle this problem? The queries take quite some time as the number of rows grows. I've tried working with the ave function: I can use it for one column to sum a player's points as winner, but I can't figure out how to add their points as loser.

Solution

winner <- c(1,2,3,1,2,3,1,2,3)
loser <- c(3,1,1,2,1,1,3,1,2)
date <- c("2017-10-01","2017-10-02","2017-10-03","2017-10-04","2017-10-05","2017-10-06","2017-10-07","2017-10-08","2017-10-09")
winner_points <- c(2,1,2,1,2,1,2,1,2)
loser_points <- c(1,0,1,0,1,0,1,0,1)
test_data <- data.frame(winner, loser, date = as.Date(date), winner_points, loser_points)

library(dplyr)
library(tidyr)

test_data %>%
  unite(winner, winner, winner_points) %>%                 # unite winner columns
  unite(loser, loser, loser_points) %>%                    # unite loser columns
  gather(type, pl_pts, winner, loser, -date) %>%           # reshape
  separate(pl_pts, c("player","points"), convert = TRUE) %>%  # separate columns
  arrange(date) %>%                                        # order dates (in case it's not)
  group_by(player) %>%                                     # for each player
  mutate(sum_points = cumsum(points) - points) %>%         # get points up to that date
  ungroup() %>%                                            # forget the grouping
  unite(pl_pts_sumpts, player, points, sum_points) %>%     # unite columns
  spread(type, pl_pts_sumpts) %>%                          # reshape
  separate(loser, c("loser", "loser_points", "loser_points_sum"), convert = TRUE) %>%   # separate columns and give appropriate names
  separate(winner, c("winner", "winner_points", "winner_points_sum"), convert = TRUE) %>%
  select(winner, loser, date, winner_points, loser_points, winner_points_sum, loser_points_sum)   # select the order you prefer

# # A tibble: 9 x 7
#   winner loser date       winner_points loser_points winner_points_sum loser_points_sum
# *  <int> <int> <date>             <int>        <int>             <int>            <int>
# 1      1     3 2017-10-01             2            1                 0                0
# 2      2     1 2017-10-02             1            0                 0                2
# 3      3     1 2017-10-03             2            1                 1                2
# 4      1     2 2017-10-04             1            0                 3                1
# 5      2     1 2017-10-05             2            1                 1                4
# 6      3     1 2017-10-06             1            0                 3                5
# 7      1     3 2017-10-07             2            1                 5                4
# 8      2     1 2017-10-08             1            0                 3                7
# 9      3     2 2017-10-09             2            1                 5                4
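Since the question mentions trying the ave function, here is a minimal base R sketch of the same idea as the pipeline above: stack winners and losers into one long table, take a per-player running total, and subtract the current row's points. This is not part of the original answer. The helper names (long, match, role, prior) are made up for illustration, and the sketch assumes each player appears in at most one match per date, which holds for the sample data.

```r
# Example data from the question
winner <- c(1, 2, 3, 1, 2, 3, 1, 2, 3)
loser  <- c(3, 1, 1, 2, 1, 1, 3, 1, 2)
date   <- as.Date("2017-10-01") + 0:8
winner_points <- c(2, 1, 2, 1, 2, 1, 2, 1, 2)
loser_points  <- c(1, 0, 1, 0, 1, 0, 1, 0, 1)
test_data <- data.frame(winner, loser, date, winner_points, loser_points)

n <- nrow(test_data)

# Long format: one row per (match, player); winners first, then losers.
# `match`, `role` and `prior` are hypothetical helper columns.
long <- data.frame(
  match  = rep(seq_len(n), times = 2),
  role   = rep(c("winner", "loser"), each = n),
  player = c(test_data$winner, test_data$loser),
  date   = rep(test_data$date, times = 2),
  points = c(test_data$winner_points, test_data$loser_points)
)

# Sort by date so the running total follows chronological order
long <- long[order(long$date), ]

# Per player: running total minus the current row = points before this match
long$prior <- ave(long$points, long$player, FUN = function(x) cumsum(x) - x)

# Put the results back in the original row order via the match index
w <- long[long$role == "winner", ]
l <- long[long$role == "loser", ]
test_data$winner_points_sum <- w$prior[order(w$match)]
test_data$loser_points_sum  <- l$prior[order(l$match)]

print(test_data)
```

On the sample data this gives the winner_points_sum and loser_points_sum vectors the question asks for. If a player can appear in several matches on the same date, the strict date < comparison in the original loop and the running total used here would no longer agree, so that case would need separate handling.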

Published: 2023-11-21 23:34:51