Cleaning column data in R
Hi, I wrote this function to clean up my data in R:

    periodCleanse <- function(x) {
      if (x == ""){
        return ("");
      } else if (substr(x, nchar(x), nchar(x)) == "M"){
        return(30*as.numeric(substr(x, 1, nchar(x)-1)));
      } else if (substr(x, nchar(x), nchar(x)) == "Y"){
        return(365*as.numeric(substr(x, 1, nchar(x)-1)));
      } else if (substr(x, nchar(x), nchar(x)) == "D"){
        return (as.numeric(substr(x, 1, nchar(x)-1)));
      }
    }

My df looks something like this:

    period
    3M
    5Y
    1D
    7M

I want to call

    df$period <- periodCleanse(df$period)

but I am getting:

    Warning message:
    In if (x == "") { :
      the condition has length > 1 and only the first element will be used

and nothing happens. What should I do?
Best answer
I would just write a vectorized function. That saves you both from writing an endless if/else chain and from running it in a loop (sapply):

    periodCleanse2 <- function(x){
      # Lookup table: unit letter -> number of days.
      # You can take this part out of the function to improve speed.
      matchDat <- data.frame(A = c("M", "Y", "D"), B = c(30, 365, 1))
      indx <- gsub("\\d", "", x)                   # strip the digits, keep the unit letter
      indx2 <- as.numeric(gsub("[A-Z]", "", x))    # strip the letter, keep the number
      matchDat$B[match(indx, matchDat$A)] * indx2  # days per unit * count
    }

    periodCleanse2(df$period)
    ## [1] 90 1825 NA 1 210
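For context on the warning: `if` evaluates a single condition, so passing the whole column to the original function tests only its first element (in R 4.2.0 and later this is an error rather than a warning). The following is a minimal, self-contained sketch of the same vectorized idea, using a named vector as the lookup instead of a data frame. The sample data, the `days` column, and the names `daysPerUnit` and `periodToDays` are assumptions made for illustration; the empty string is included only because the NA in the output above suggests the real column contains one.

```r
# Hypothetical sample data (not from the original post); the "" entry is an
# assumption based on the NA that appears in the answer's output.
df <- data.frame(period = c("3M", "5Y", "", "1D", "7M"),
                 stringsAsFactors = FALSE)

# Days per unit, with the same multipliers the answer uses.
daysPerUnit <- c(M = 30, Y = 365, D = 1)

# Hypothetical helper: converts strings such as "3M" to a number of days.
periodToDays <- function(x) {
  x <- as.character(x)                             # guard against factor columns
  unit <- substr(x, nchar(x), nchar(x))            # last character, "" for empty input
  count <- as.numeric(substr(x, 1, nchar(x) - 1))  # leading number, NA for empty input
  unname(daysPerUnit[unit]) * count                # unknown or empty units give NA
}

df$days <- periodToDays(df$period)
# df$days now holds 90, 1825, NA, 1, 210
```

Because `substr`, `nchar`, `as.numeric`, and indexing by name all operate on whole vectors, no `if` or loop is needed. Writing to a new column rather than overwriting `df$period` is a choice made here so the original values stay available for checking.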