I want to subtract the value in the row labelled "baseline" from all the other rows for the same user in a long-format data frame. This is easy to do in two steps using a left_join with the "baseline" subset. However, I could not figure out how to combine vas_1 and vas_diff into one chain.

    library(dplyr)

    # Create test data
    n_users = 5
    vas = tibble(   # tibble() replaces the deprecated data_frame()
      user = rep(letters[1:n_users], each = 3),
      group = rep(c("baseline", "early", "late"), n_users),
      vas = round(rgamma(n_users * 3, 10, 1.4))
    )

    # The above data are given
    # Assume some other operations are required
    vas_1 = vas %>%
      mutate(vas = vas * 2)

    # I want to put the following into one
    # chain with the above

    # Use self-join to subtract baseline
    vas_diff = vas_1 %>%
      filter(group != "baseline") %>%
      # Problem is vas_1 here. Using . gives error here
      # Adding copy = TRUE does not help
      # left_join(. %>% filter(group == "baseline"), by = c("user")) %>%
      left_join(vas_1 %>% filter(group == "baseline"), by = c("user")) %>%
      mutate(vas = vas.x - vas.y) %>%   # compute offset
      select(user, group.x, vas)        # remove temporary variables
    vas_diff

Best answer
I use an anonymous function when . needs to be used more than once:

    ... %>% (function(df) { ... }) %>% ...

Hence, in your case:

    vas_diff = vas_1 %>%
      filter(group != "baseline") %>%
      (function(df) left_join(df, df %>% filter(group == "baseline"), by = c("user"))) %>%
      mutate(vas = vas.x - vas.y) %>%   # compute offset
      select(user, group.x, vas)

(This will not produce the desired result, because the baseline rows have already been filtered out before the join, but it shows how to use an anonymous function.)

But you probably want this:

    vas_diff = vas_1 %>%
      left_join(
        x = filter(., group != "baseline"),
        y = filter(., group == "baseline"),
        by = c("user")
      ) %>%
      mutate(vas = vas.x - vas.y) %>%   # compute offset
      select(user, group.x, vas)        # remove temporary variables
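One caveat about the last snippet: when . appears only inside nested calls, magrittr still passes the left-hand side as an extra first argument. Here that means vas_1 is handed to left_join in addition to x and y, and whether that is harmless may depend on the dplyr version. A possible variant, sketched below, wraps the join in braces, which magrittr treats as a block where . is available but nothing is inserted automatically. The small data set is made up for illustration (it is not the question's random data), and the whole thing is a sketch rather than the answerer's method.

```r
library(dplyr)

# Small deterministic stand-in for the question's data (values are made up)
vas <- tibble(
  user  = rep(c("a", "b"), each = 3),
  group = rep(c("baseline", "early", "late"), times = 2),
  vas   = c(5, 7, 9, 4, 10, 6)
)

vas_diff <- vas %>%
  mutate(vas = vas * 2) %>%          # the "other operations" step
  {
    # Braces stop magrittr from also inserting the left-hand side as the
    # first argument, so `.` can be used for both sides of the self-join.
    left_join(
      filter(., group != "baseline"),
      filter(., group == "baseline"),
      by = "user"
    )
  } %>%
  mutate(vas = vas.x - vas.y) %>%    # subtract each user's baseline
  select(user, group = group.x, vas) # drop the temporary join columns

print(vas_diff)
```

With this toy input the remaining rows are early and late for users a and b, with differences 4, 8, 12 and 4. The design choice is only about where . is bound: the braces keep everything in one chain without naming an intermediate object.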