Cleaning column data in R

Updated: 2024-10-27 11:18:22

Hi, I wrote this function to clean up my data in R:

    periodCleanse <- function(x) {
      if (x == "") {
        return("")
      } else if (substr(x, nchar(x), nchar(x)) == "M") {
        return(30 * as.numeric(substr(x, 1, nchar(x) - 1)))
      } else if (substr(x, nchar(x), nchar(x)) == "Y") {
        return(365 * as.numeric(substr(x, 1, nchar(x) - 1)))
      } else if (substr(x, nchar(x), nchar(x)) == "D") {
        return(as.numeric(substr(x, 1, nchar(x) - 1)))
      }
    }

My df looks something like this:

    period
    3M
    5Y
    1D
    7M

I want to call

    df$period <- periodCleanse(df$period)

but I am getting:

    Warning message:
    In if (x == "") { :
      the condition has length > 1 and only the first element will be used

and nothing happens. What should I do?
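
A note on the warning: `if` expects a single TRUE or FALSE, while `df$period == ""` produces one logical value per row, so only the first one was used. From R 4.2.0 onward the same call is an error rather than a warning. The sketch below reproduces the mismatch on a made-up vector (the values, including the empty string, are assumptions for illustration) and shows `ifelse()` as the element-wise counterpart.

```r
# Hypothetical sample column; the empty string is an assumed missing entry.
period <- c("3M", "5Y", "", "1D", "7M")

# `==` is vectorized: it returns one logical value per element.
is_empty <- period == ""
print(is_empty)
print(length(is_empty))

# `if` needs exactly one TRUE/FALSE, so reduce the vector first
# (any(), all()) or test a single element.
if (any(is_empty)) {
  message("at least one entry is empty")
}

# ifelse() applies the test element by element instead.
status <- ifelse(is_empty, "empty", "has value")
print(status)
```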

Best answer

I would just write a vectorized function. That saves you both from writing an endless if/else chain and from running the function in a loop (sapply).

    periodCleanse2 <- function(x) {
      # You can take that part out of the function for improving speed
      matchDat <- data.frame(A = c("M", "Y", "D"), B = c(30, 365, 1))
      indx <- gsub("\\d", "", x)
      indx2 <- as.numeric(gsub("[A-Z]", "", x))
      matchDat$B[match(indx, matchDat$A)] * indx2
    }

    periodCleanse2(df$period)
    ## [1] 90 1825 NA 1 210
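
As a variation on the same idea, the lookup table can be a named numeric vector instead of a data frame, and the result can be assigned straight back to the column. This is only a sketch: the names `days_per_unit` and `period_to_days`, the sample data frame, and its empty third entry are assumptions made for illustration, not part of the answer above. Like the answer, it counts a month as 30 days and a year as 365.

```r
# Hypothetical lookup: days per unit, stored in a named vector.
days_per_unit <- c(D = 1, M = 30, Y = 365)

period_to_days <- function(x) {
  x <- as.character(x)                       # also accepts factor columns
  unit  <- sub("^\\d+", "", x)               # trailing letter, e.g. "M"
  count <- as.numeric(sub("[A-Z]$", "", x))  # leading number, e.g. 3
  # Indexing by name is vectorized; unknown or empty units give NA.
  unname(days_per_unit[unit]) * count
}

# Assumed sample data, with one empty entry.
df <- data.frame(period = c("3M", "5Y", "", "1D", "7M"),
                 stringsAsFactors = FALSE)
df$period <- period_to_days(df$period)
print(df$period)
```

With this sample input the column should hold 90, 1825, NA, 1 and 210, where the NA comes from the empty entry. Note that the column becomes numeric, whereas the original function returned "" for empty input.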

Published: 2023-08-04 16:58:00
Link: https://www.elefans.com/category/jswz/34/1417773.html