Apply a function to every specified column in a data.table and update by reference
Question
I have a data.table with which I'd like to perform the same operation on certain columns. The names of these columns are given in a character vector. In this particular example, I'd like to multiply all of these columns by -1.
Some toy data and a vector specifying relevant columns:
    library(data.table)
    dt <- data.table(a = 1:3, b = 1:3, d = 1:3)
    cols <- c("a", "b")
Right now I'm doing it this way, looping over the character vector:
    for (col in 1:length(cols)) {
      dt[ , eval(parse(text = paste0(cols[col], ":=-1*", cols[col])))]
    }
Is there a way to do this directly without the for loop?
Recommended answer
This seems to work:
    dt[ , (cols) := lapply(.SD, "*", -1), .SDcols = cols]
The result is
        a  b d
    1: -1 -1 1
    2: -2 -2 2
    3: -3 -3 3
There are a few tricks here:
- Because there are parentheses in (cols) :=, the result is assigned to the columns specified in cols, instead of to some new variable named "cols".
- .SDcols tells the call that we're only looking at those columns, and allows us to use .SD, the Subset of the Data associated with those columns.
- lapply(.SD, ...) operates on .SD, which is a list of columns (like all data.frames and data.tables). lapply returns a list, so in the end j looks like cols := list(...).
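The first two points can be checked with a small experiment. This is only a sketch on the toy data from the question; the copies `dt1` and `dt2` are names introduced here for illustration, and `function(x) -x` is used in place of `"*", -1` simply to keep the columns as integers.

```r
library(data.table)

dt <- data.table(a = 1:3, b = 1:3, d = 1:3)
cols <- c("a", "b")

# Without parentheses, the left-hand side is read as a literal column name,
# so a new column called "cols" is added and a, b are left alone.
dt1 <- copy(dt)
dt1[, cols := -1]
names(dt1)

# With parentheses, `cols` is evaluated first, so the columns it names
# ("a" and "b") are the ones that get overwritten.
dt2 <- copy(dt)
dt2[, (cols) := lapply(.SD, function(x) -x), .SDcols = cols]
names(dt2)

# .SD only contains the columns listed in .SDcols.
sd_names <- dt[, names(.SD), .SDcols = cols]
sd_names
```

Working on copies keeps the original `dt` untouched, which matters because `:=` modifies its table by reference.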
EDIT: Here's another way that is probably faster, as @Arun mentioned:
    for (j in cols) set(dt, j = j, value = -dt[[j]])
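The same loop should work for any function, not just negation. Below is a minimal sketch of that idea; `apply_in_place` is a hypothetical helper written for this example, not something data.table provides. It relies only on `set()`, which assigns by reference without going through `[.data.table`, which is presumably why this route can be faster.

```r
library(data.table)

# Hypothetical helper (not part of data.table): apply `f` to each column
# named in `cols` and overwrite that column in place using set().
apply_in_place <- function(dt, cols, f, ...) {
  for (j in cols) {
    set(dt, j = j, value = f(dt[[j]], ...))
  }
  invisible(dt)  # dt was modified by reference; returned only for convenience
}

dt <- data.table(a = 1:3, b = 1:3, d = 1:3)
apply_in_place(dt, c("a", "b"), function(x) -x)
dt
```

Because `set()` updates by reference, the caller's `dt` changes even though the helper's return value is not assigned.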