I have an easy question to figure out:

    value
    1000
    2500
    5080
    10009

I want to assign each value to an interval:

    value  Range
    1000   0-1000
    2500   1001-5000
    5080   5001-10000
    10009  10001-20000

I tried the following:

    dt[, Range := ifelse(value < 1001, "0-1000",
                  ifelse(1000 < value < 5001, "1001-5000",
                  ifelse(5000 < value < 10001, "5001-10000", "10001-20000")))

However, I get the error:

    unexpected '<' in "dt[, Range := ifelse(value < 1001, "0-1000", ifelse(1000 < value <"

Any help?
Edit:

This question is not asking for the best way to convert a continuous variable to a factor. It is asking for debugging help with the reproducible example:

    library(data.table)
    dt <- data.table(value = c(1000, 2500, 5080, 10009))
    dt[, Range := ifelse(value < 1001, "0-1000",
                  ifelse(1000 < value < 5001, "1001-5000",
                  ifelse(5000 < value < 10001, "5001-10000", "10001-20000")))
    # produces the error above

Recommended answer
Like many (some) errors, it means what it says. Unlike Python, R can't interpret 1000 < value < 5001. Instead, you need to use 1000 < value & value < 5001:

    library(data.table)
    dt <- data.table(value = c(1000, 2500, 5080, 10009))
    dt[, Range := ifelse(value < 1001, "0-1000",
                  ifelse(1000 < value & value < 5001, "1001-5000",
                  ifelse(5000 < value & value < 10001, "5001-10000", "10001-20000")))]
    dt
       value       Range
    1:  1000      0-1000
    2:  2500   1001-5000
    3:  5080  5001-10000
    4: 10009 10001-20000
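To see why the `&` form works, it can help to evaluate a single condition on its own. The following is a minimal base R sketch, not part of the original answer; the variable name `in_range` is only illustrative.

```r
value <- c(1000, 2500, 5080, 10009)

# R's grammar does not allow chained comparisons such as 1000 < value < 5001,
# so each bound is tested separately and the two logical vectors are combined.
# `&` works element-wise, which is what ifelse() needs; `&&` would not.
in_range <- 1000 < value & value < 5001

print(in_range)
# [1] FALSE  TRUE FALSE FALSE
```

Each element of `in_range` corresponds to one row of `value`, so the result can be passed directly as the `test` argument of `ifelse()`.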
As @akrun mentioned, you may be better off with a factor. Here's an example:

    dt[, Range := cut(value,
                      breaks = c(0, 1001, 5001, 10001, 20001),
                      labels = c("0-1000", "1001-5000", "5001-10000", "10001-20000"),
                      right = FALSE)]  # intervals are [lower, upper), so 1001 falls in "1001-5000"

This produces a data.table that displays the same way, but extracting the Range column will give you a factor corresponding to the ranges.
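The difference between the character column and the factor can be checked directly. This is a small base R sketch under some assumptions: it works on a plain vector instead of a data.table, the names `rng` and `range_labels` are illustrative, and it passes `right = FALSE` so that boundary values such as 1001 are binned the same way as in the `ifelse()` version.

```r
value <- c(1000, 2500, 5080, 10009)
range_labels <- c("0-1000", "1001-5000", "5001-10000", "10001-20000")

# cut() returns a factor; with right = FALSE the bins are [0, 1001), [1001, 5001), ...
rng <- cut(value,
           breaks = c(0, 1001, 5001, 10001, 20001),
           labels = range_labels,
           right = FALSE)

print(is.factor(rng))     # TRUE
print(levels(rng))        # the four labels, in break order
print(as.character(rng))  # plain strings, comparable to the ifelse() result
```

A factor keeps the ranges in break order, which tends to be convenient for sorting, tabulating, and plotting; `as.character()` recovers plain strings if those are needed.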