I have the following problem in R:
Let's assume the following data frame:
      a b c d    e
    1 1 1 1 1 15.5
    2 1 1 1 2  8.3
    3 1 1 2 1 12.4
    4 1 1 2 2  3.2
    ...

I want to apply a function f(x, y) to the numbers in column e, where x and y are taken from the two rows that have the same values in all columns except d (and e, of course).
The output should be a new data frame, in which column d is dropped (as the "merge" made that column irrelevant) and column e is the result of the applied function.
So in the example above, assuming f(x,y) is addition, the new data frame should look like this:
      a b c    e
    1 1 1 1 23.8
    3 1 1 2 15.6
    ...

What I have tried so far looks something like the following, which feels very inelegant:
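    # Rows with d == 1; each one is paired with its d == 2 counterpart
    data.d1 <- subset(data, d == 1)
    for (index in seq_len(nrow(data.d1))) {
      row1 <- data.d1[index, ]
      row2 <- data[data$a == row1$a & data$b == row1$b & data$c == row1$c & data$d == 2, ]
      # Store the combined value in the d == 1 row
      data.d1[index, "e"] <- f(row1$e, row2$e)
    }
    # Drop column d, which is no longer meaningful
    data <- data.d1[-match("d", names(data.d1))]

Does somebody have a cleaner solution, using apply() and the like? Thanks in advance!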
Best answer
Here are two examples:

    > ddply(x, .(a, b, c), summarize, e = sum(e))
      a b c    e
    1 1 1 1 23.8
    2 1 1 2 15.6
    > aggregate(e ~ a + b + c, sum, data = x)
      a b c    e
    1 1 1 1 23.8
    2 1 1 2 15.6

ddply is a function in the plyr package, so load it with library(plyr) first. aggregate is part of base R.
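Both examples use sum, which does not care about the order of its inputs. If f(x, y) is a general two-argument function where order matters, one possible approach is to pair the d == 1 and d == 2 rows with merge and then apply f to the paired columns. The following is only a sketch: the subtraction used for f is a hypothetical stand-in, the name x for the data frame follows the answer above, and f is assumed to be vectorized.

```r
# Example data from the question
x <- data.frame(
  a = c(1, 1, 1, 1),
  b = c(1, 1, 1, 1),
  c = c(1, 1, 2, 2),
  d = c(1, 2, 1, 2),
  e = c(15.5, 8.3, 12.4, 3.2)
)

# Hypothetical two-argument function; replace with your own f(x, y).
# Assumed to be vectorized over its arguments.
f <- function(x, y) x - y

# Pair each d == 1 row with the d == 2 row that agrees on a, b and c.
# The two e columns come out as e.1 (from d == 1) and e.2 (from d == 2).
pairs <- merge(
  x[x$d == 1, c("a", "b", "c", "e")],
  x[x$d == 2, c("a", "b", "c", "e")],
  by = c("a", "b", "c"),
  suffixes = c(".1", ".2")
)

# Keep the key columns and replace the two e columns with f applied to them
result <- data.frame(pairs[c("a", "b", "c")], e = f(pairs$e.1, pairs$e.2))
result
```

If f is not vectorized, mapply(f, pairs$e.1, pairs$e.2) can be used in place of the direct call.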