Recoding multiple columns with dplyr

Updated: 2024-10-28 11:29:09
Problem description

I have a data frame in which I recoded several columns so that 999 is set to NA:

    dfB <- dfA %>%
      mutate(adhere = if_else(adhere == 999, as.numeric(NA), adhere)) %>%
      mutate(engage = if_else(engage == 999, as.numeric(NA), engage)) %>%
      mutate(quality = if_else(quality == 999, as.numeric(NA), quality)) %>%
      mutate(undrstnd = if_else(undrstnd == 999, as.numeric(NA), undrstnd)) %>%
      mutate(sesspart = if_else(sesspart == 999, as.numeric(NA), sesspart)) %>%
      mutate(attended = if_else(attended >= 9, as.integer(NA), attended))

I would like to use mutate_at() with a range of columns and recode() instead of if_else(), but I am stuck on how to specify the condition. Based on some mutate_all examples, I think it should be something like 999 = NA, but the NA also has to match the type of .x, and I am not sure how to make it type-sensitive.

I tried:

    y <- data.frame(y1 = c(1, 2, 999, 3, 4),
                    y2 = c(1L, 2L, 999L, 3L, 4L),
                    y3 = c(T, T, F, F, T))

    z <- y %>%
      mutate_at(vars(y1:y2), funs(recode(., `999` = as.numeric(NA))))

But I get the warning "Unreplaced values treated as NA as .x is not compatible. Please specify replacements exhaustively or supply .default", and I can see that it worked for the numeric column y1 but not for the integer column y2:

    > z
      y1 y2    y3
    1  1 NA  TRUE
    2  2 NA  TRUE
    3 NA NA FALSE
    4  3 NA FALSE
    5  4 NA  TRUE

Recommended answer

Currently, according to the dplyr documentation:

across() supersedes the family of "scoped variants" like summarise_at(), summarise_if(), and summarise_all().

So using mutate() together with across() is now the recommended approach.
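As a minimal sketch of that pattern on the data frame from the question: the lambda below uses base R's replace(), which is swapped in here for illustration and is not part of the quoted answer. It relies on the fact that assigning a bare NA into a vector keeps that vector's type, so the double column and the integer column each stay as they were.

```r
library(dplyr)

# Example data from the question: y1 is double, y2 is integer, y3 is logical.
y <- data.frame(
  y1 = c(1, 2, 999, 3, 4),
  y2 = c(1L, 2L, 999L, 3L, 4L),
  y3 = c(TRUE, TRUE, FALSE, FALSE, TRUE)
)

# across() applies the same lambda to every column in the range y1:y2.
# replace() is base R; the bare NA is coerced to the type of .x,
# so no typed NA (as.numeric(NA), as.integer(NA)) has to be chosen by hand.
z <- y %>%
  mutate(across(y1:y2, ~ replace(.x, .x == 999, NA)))

# Inspect the column types after the replacement.
str(z)
```

Columns outside the selected range (here y3) are left untouched.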

As Chris LeBoa said, if you only want to convert a single unwanted value to NA, the function na_if() is probably the best choice:

    y <- data.frame(y1 = c(1, 2, 999, 3, 4),
                    y2 = c(1L, 2L, 999L, 3L, 4L),
                    y3 = c(T, T, F, F, T))
    y
    #>    y1  y2    y3
    #> 1   1   1  TRUE
    #> 2   2   2  TRUE
    #> 3 999 999 FALSE
    #> 4   3   3 FALSE
    #> 5   4   4  TRUE

    z <- y %>%
      mutate(across(y1:y2, ~na_if(., 999)))
    z
    #>   y1 y2    y3
    #> 1  1  1  TRUE
    #> 2  2  2  TRUE
    #> 3 NA NA FALSE
    #> 4  3  3 FALSE
    #> 5  4  4  TRUE
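na_if() only tests for equality, so the last step of the original chain (attended >= 9) needs a different lambda. The sketch below is one possible translation of that chain to across(). The contents of dfA are invented for illustration, since the question never shows them, and only three of the original columns are included to keep it short.

```r
library(dplyr)

# Hypothetical stand-in for dfA; the real data is not shown in the question.
dfA <- data.frame(
  adhere   = c(1, 999, 3),
  engage   = c(999, 2, 2),
  attended = c(3L, 9L, 12L)
)

dfB <- dfA %>%
  mutate(
    # Equality test: 999 becomes NA in each listed column.
    across(c(adhere, engage), ~ na_if(.x, 999)),
    # Range test: na_if() cannot express >=, so base replace() is used instead.
    across(attended, ~ replace(.x, .x >= 9, NA))
  )

dfB
```

Both lambdas keep the column's own type, so attended stays an integer column.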

Similarly, if you really want to recode values in multiple columns, you can follow the example from bcarothers:

    df1 <- tibble(Q7_1 = 1:5,
                  Q7_1_TEXT = c("let's", "see", "grogu", "this", "week"),
                  Q8_1 = 6:10,
                  Q8_1_TEXT = rep("grogu", 5),
                  Q8_2 = 11:15,
                  Q8_2_TEXT = c("grogu", "is", "the", "absolute", "best"))

    df2 <- df1 %>%
      mutate(across(starts_with("Q8") & ends_with("TEXT"),
                    ~recode(., "grogu" = "mando")))
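Returning to the warning in the question: it appears when the replacement passed to recode() has a different type from the column, as with as.numeric(NA) on the integer column y2. One way to keep using recode() is to give each column type its own typed NA. This is only a sketch under that reading of the warning; splitting the columns by type with where() is a choice made here, not something the quoted answer prescribes. Recent dplyr releases also mark recode() as superseded, so na_if() remains the simpler option for this task.

```r
library(dplyr)

# Example data from the question.
y <- data.frame(
  y1 = c(1, 2, 999, 3, 4),
  y2 = c(1L, 2L, 999L, 3L, 4L),
  y3 = c(TRUE, TRUE, FALSE, FALSE, TRUE)
)

z <- y %>%
  mutate(
    # Double columns get a double NA as the replacement value...
    across(where(is.double),  ~ recode(.x, `999` = NA_real_)),
    # ...and integer columns get an integer NA, so the replacement type
    # matches .x and unreplaced values are kept as they are.
    across(where(is.integer), ~ recode(.x, `999` = NA_integer_))
  )

str(z)
```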

Published: 2023-10-23 19:31:12
Link: https://www.elefans.com/category/jswz/34/1521805.html