R/tidyr::complete - filling missing values dynamically
Question
I'm using tidyr::complete() to include missing rows in a data frame with many columns, which leads to NA values. How can I instruct the fill option to replace the NA values with 0 if I don't have an explicit list of column names?
Example:
library(tidyr)

df <- data.frame(year = c(2010, 2013:2015),
                 age.21 = runif(4, 0, 10),
                 age.22 = runif(4, 0, 10),
                 age.23 = runif(4, 0, 10),
                 age.24 = runif(4, 0, 10),
                 age.25 = runif(4, 0, 10))

# replaces missing values with NA - not what I want
df.complete <- complete(df, year = 2010:2015)

# replaces missing values with 0 - works, but needs explicit list
df.complete <- complete(df, year = 2010:2015, fill = list(age.21 = 0, age.22 = 0,
                                                          age.23 = 0, age.24 = 0,
                                                          age.25 = 0))

# throws error (is.list(replace) is not TRUE)
df.complete <- complete(df, year = 2010:2015, fill = 0)

# replaces missing values with NA - not what I want (the list is unnamed)
df.complete <- complete(df, year = 2010:2015, fill = list(rep(0, 6)))

A workaround could be to use df.complete[is.na(df.complete)] <- 0, but that risks replacing too many values: it also overwrites any NA that was already in the data before the rows were added.
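To make the risk of the blanket replacement concrete, here is a minimal sketch. The data frame df_na and its values are made up for illustration and are not part of the original example; it assumes one value is genuinely unknown (NA) in a year that does exist.

```r
library(tidyr)

# Hypothetical data: age.21 is genuinely unknown (NA) for 2013
df_na <- data.frame(year   = c(2010, 2013),
                    age.21 = c(1.5, NA),
                    age.22 = c(2.5, 3.5))

# Adds rows for 2011 and 2012; their value columns are NA
completed <- complete(df_na, year = 2010:2013)

# Blanket replacement: touches every NA in the data frame,
# not only the ones in the newly added rows
blanket <- completed
blanket[is.na(blanket)] <- 0

# The pre-existing NA for 2013 is now indistinguishable from a real 0
blanket$age.21[blanket$year == 2013]
```

After the replacement, a value that was unknown for 2013 can no longer be told apart from a measured zero, which is the "too many values" problem described above.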
Recommended answer
Here's a way to do it by reshaping the data first:
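library(tidyr)  # also provides the %>% pipe

df %>%
  gather("var", "val", -year) %>%                            # wide to long: one row per year/column pair
  complete(year = 2010:2015, var, fill = list(val = 0)) %>%  # add missing year/var pairs with val = 0
  spread(var, val)                                           # long back to wide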
Source: local data frame [6 x 6]

   year   age.21   age.22    age.23   age.24    age.25
  (dbl)    (dbl)    (dbl)     (dbl)    (dbl)     (dbl)
1  2010 8.940997 7.787210 1.5747435 9.874449 5.2228670
2  2011 0.000000 0.000000 0.0000000 0.000000 0.0000000
3  2012 0.000000 0.000000 0.0000000 0.000000 0.0000000
4  2013 2.965928 6.495460 0.8966319 2.849262 0.2430174
5  2014 4.608676 1.946671 1.5765912 8.551907 0.3146824
6  2015 7.359407 4.414294 4.3419163 4.082509 1.5770299
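As an alternative to reshaping, the named list that fill expects can also be built from the column names at run time. This is only a sketch, not part of the answer above: the helper names value_cols and fill_list are made up for illustration, the example uses fewer columns than the question, and it assumes every column other than year should be filled with 0.

```r
library(tidyr)

set.seed(1)  # only so the random example is reproducible
df <- data.frame(year   = c(2010, 2013:2015),
                 age.21 = runif(4, 0, 10),
                 age.22 = runif(4, 0, 10),
                 age.23 = runif(4, 0, 10))

# Every column except the key column "year" gets 0 as its fill value
value_cols <- setdiff(names(df), "year")
fill_list  <- setNames(rep(list(0), length(value_cols)), value_cols)

# Equivalent to writing fill = list(age.21 = 0, age.22 = 0, age.23 = 0) by hand
df.complete <- complete(df, year = 2010:2015, fill = fill_list)
df.complete
```

Note that fill is also applied to NAs that were already present in the data. Recent tidyr versions (1.2.0 and later, as far as I know) add an explicit = FALSE argument to complete() that limits the fill to the newly created rows.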