Applying the same function to multiple columns with the plyr package

Updated: 2024-10-28 06:23:55

Problem description

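I want to apply the same function to multiple columns using the ddply function, but I'm tired of writing every column out in one line. Is there a better way of doing this?

Here's a simple version of the data: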

library(plyr)
data <- data.frame(TYPE = as.integer(runif(20, 1, 3)), A_MEAN_WEIGHT = runif(20, 1, 100), B_MEAN_WEIGHT = runif(20, 1, 10))

I want to find the sums of the columns A_MEAN_WEIGHT and B_MEAN_WEIGHT for each TYPE by doing this:

ddply(data,.(TYPE),summarise,MEAN_A=sum(A_MEAN_WEIGHT),MEAN_B=sum(B_MEAN_WEIGHT))

But in my actual data I have more than 8 "*_MEAN_WEIGHT" columns, and I'm tired of writing them out 8 times like this:

ddply(data,.(TYPE),summarise,MEAN_A=sum(A_MEAN_WEIGHT),MEAN_B=sum(B_MEAN_WEIGHT),MEAN_C=sum(C_MEAN_WEIGHT),MEAN_D=sum(D_MEAN_WEIGHT),MEAN_E=sum(E_MEAN_WEIGHT),MEAN_F=sum(F_MEAN_WEIGHT),MEAN_G=sum(G_MEAN_WEIGHT),MEAN_H=sum(H_MEAN_WEIGHT))

Is there a better way to write this? Thank you for your help!

Solution

The plyr-centred approach is to use colwise, for example:

ddply(data, .(TYPE), colwise(sum))
  TYPE A_MEAN_WEIGHT B_MEAN_WEIGHT
1    1      319.8977      60.80317
2    2      621.6745      37.05863

If you only want a subset of the columns, you can pass the column names as the .cols argument of colwise.
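
As a minimal sketch of the subset case, assuming the toy data frame from the question (the seed and the `wanted` variable are introduced here purely for illustration), the columns can be named explicitly or picked by a name pattern:

```r
library(plyr)

set.seed(1)  # illustrative seed so the toy data is reproducible
data <- data.frame(
  TYPE = as.integer(runif(20, 1, 3)),
  A_MEAN_WEIGHT = runif(20, 1, 100),
  B_MEAN_WEIGHT = runif(20, 1, 10)
)

# Name the columns to summarise explicitly ...
res_named <- ddply(data, .(TYPE),
                   colwise(sum, .cols = c("A_MEAN_WEIGHT", "B_MEAN_WEIGHT")))

# ... or select every column whose name ends in _MEAN_WEIGHT
wanted <- grep("_MEAN_WEIGHT$", names(data), value = TRUE)
res_pattern <- ddply(data, .(TYPE), colwise(sum, .cols = wanted))
```

The pattern-based version should scale to the eight or more "*_MEAN_WEIGHT" columns mentioned in the question without listing each one by hand.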

You can also use numcolwise or catcolwise to act on numeric or categorical columns only.
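
To illustrate where numcolwise helps, here is a small sketch that adds a hypothetical non-numeric column, SITE, to the toy data (SITE is not part of the original question). A plain colwise(sum) would fail on that column, whereas numcolwise(sum) only touches the numeric ones:

```r
library(plyr)

set.seed(1)  # illustrative seed so the toy data is reproducible
data <- data.frame(
  TYPE = as.integer(runif(20, 1, 3)),
  SITE = sample(c("north", "south"), 20, replace = TRUE),  # hypothetical text column
  A_MEAN_WEIGHT = runif(20, 1, 100),
  B_MEAN_WEIGHT = runif(20, 1, 10)
)

# Sum only the numeric columns within each TYPE; SITE is dropped from the result
res_num <- ddply(data, .(TYPE), numcolwise(sum))
```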

Note that you could use sapply in place of the most basic use of colwise:

ddply(data, .(TYPE), sapply, FUN = 'mean')

The idiomatic data.table approach is to use lapply(.SD, fun), for example:

library(data.table)
dt <- data.table(data)
dt[, lapply(.SD, sum), by = TYPE]
   TYPE A_MEAN_WEIGHT B_MEAN_WEIGHT
1:    2      621.6745      37.05863
2:    1      319.8977      60.80317
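
If only some columns should be summarised, data.table's .SDcols argument restricts what .SD contains. The following is a sketch under the same toy-data assumption (the seed and the `weight_cols` variable are illustrative, not from the original post):

```r
library(data.table)

set.seed(1)  # illustrative seed so the toy data is reproducible
data <- data.frame(
  TYPE = as.integer(runif(20, 1, 3)),
  A_MEAN_WEIGHT = runif(20, 1, 100),
  B_MEAN_WEIGHT = runif(20, 1, 10)
)
dt <- as.data.table(data)

# Pick the columns by name pattern, then sum just those within each TYPE
weight_cols <- grep("_MEAN_WEIGHT$", names(dt), value = TRUE)
res_dt <- dt[, lapply(.SD, sum), by = TYPE, .SDcols = weight_cols]

# by = keeps groups in order of first appearance; sort explicitly if needed
setorder(res_dt, TYPE)
```

Using keyby = TYPE instead of by = TYPE is another way to get the groups back in sorted order.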

Published: 2023-07-26 07:02:13
Article link: https://www.elefans.com/category/jswz/34/1214853.html