Setting the values in a matrix row from within a parallel sapply in R

Updated: 2024-10-27 14:35:02

I have a matrix (origmatrix), on which I want to perform a function per column. I want to put the results of this function into another matrix (newmatrix), with the row number of this row corresponding to the column number in the original matrix. In the real dataset there are 20000 rows with a complex function, so I'd like to use a type of apply in order to be able to parallelize the project. Is there a way for me to get the data from within the apply into newmatrix? Any help would be greatly appreciated!

    origmatrix = matrix(1:50, 10, 5)
    colnames(origmatrix) = letters[1:5]
    newmatrix = matrix(0, 5, 2)
    colnames(newmatrix) = c("Identifier", "mean")

    boertje = function(x) {
      # intended to fill the row of newmatrix that matches column x of origmatrix
      newmatrix[which(colnames(origmatrix) == x), 2] = mean(origmatrix[, x])
    }
    sapply(colnames(origmatrix), boertje)
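An assignment made inside a function called by sapply only changes a local copy, so it never reaches the global newmatrix (and a separate worker process could not reach it either). Below is a minimal sketch of the usual workaround, assuming it is acceptable for the function to return its value and for the result to be assembled once, outside the apply call. The name col_mean is hypothetical and used only for illustration.

```r
origmatrix <- matrix(1:50, 10, 5)
colnames(origmatrix) <- letters[1:5]

# The function only computes and returns; it does not modify any outer object.
col_mean <- function(x) mean(origmatrix[, x])

# sapply over the column names returns a named numeric vector,
# one entry per column of origmatrix.
means <- sapply(colnames(origmatrix), col_mean)

# Assemble the result in one step. A data frame is used here (an assumption,
# not part of the question) because a matrix cannot hold character
# identifiers next to numeric means without coercing everything to character.
newmatrix <- data.frame(Identifier = names(means), mean = unname(means))
newmatrix
```

Because the function has no side effects, the same col_mean can be handed to a parallel apply variant without changes.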

Best answer

How about using the multicore version of lapply, which is parallel::mclapply (or multicore::mclapply on older R versions that predate the parallel package), and then making a data frame out of the results? Note that mclapply parallelizes by forking, so on Windows it only runs with mc.cores = 1. You can make a data frame when you return multiple values, like so:

    require(parallel)

    res <- mclapply(1:ncol(origmatrix), mc.cores = 1, function(x) {
      c(mean(origmatrix[, x]), sd(origmatrix[, x]), var(origmatrix[, x]))
    })

    # So the first element of the resulting list looks like
    res[[1]]
    # [1] 5.500000 3.027650 9.166667

    df <- as.data.frame(res)
    rownames(df) <- c("mean", "sd", "var")
    colnames(df) <- colnames(origmatrix)
    #             a         b         c         d         e
    # mean 5.500000 15.500000 25.500000 35.500000 45.500000
    # sd   3.027650  3.027650  3.027650  3.027650  3.027650
    # var  9.166667  9.166667  9.166667  9.166667  9.166667
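The question asks for one row per original column, whereas the data frame above has one column per original column. Here is a small sketch of one way to get the requested orientation, assuming the list returned by mclapply is simply bound row-wise with do.call(rbind, ...). mc.cores = 1 is kept from the answer so the snippet also runs where forking is unavailable; raising it is left to the reader.

```r
library(parallel)

origmatrix <- matrix(1:50, 10, 5)
colnames(origmatrix) <- letters[1:5]

# Each call returns a named vector of statistics for one column.
res <- mclapply(seq_len(ncol(origmatrix)), mc.cores = 1, function(x) {
  c(mean = mean(origmatrix[, x]), sd = sd(origmatrix[, x]))
})

# Stack the per-column vectors so that row i describes column i of origmatrix.
newmatrix <- do.call(rbind, res)
rownames(newmatrix) <- colnames(origmatrix)
newmatrix
```

Naming the elements of the returned vector lets rbind pick up the column names, so no separate colnames assignment is needed.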

mclapply does come with this warning in the help pages though...

Warning: It is strongly discouraged to use these functions in GUI or embedded environments, because it leads to several processes sharing the same GUI which will likely cause chaos (and possibly crashes). Child processes should never use on-screen graphics devices.
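Where that warning applies (for example in a GUI session, or on Windows where forking is unavailable), the parallel package also provides socket clusters through makeCluster and parLapply. The following is a sketch under that assumption and is not part of the original answer; the worker count of 2 is an arbitrary choice for illustration.

```r
library(parallel)

origmatrix <- matrix(1:50, 10, 5)
colnames(origmatrix) <- letters[1:5]

# Start two separate R worker processes (no forking involved).
cl <- makeCluster(2)

# Workers start with empty workspaces, so the matrix is passed explicitly
# as an extra argument (m) instead of being looked up as a global.
res <- parLapply(cl, seq_len(ncol(origmatrix)), function(x, m) {
  c(mean = mean(m[, x]), sd = sd(m[, x]))
}, m = origmatrix)

# Release the worker processes once the results are back.
stopCluster(cl)

# One row per column of origmatrix, as in the question.
newmatrix <- do.call(rbind, res)
rownames(newmatrix) <- colnames(origmatrix)
newmatrix
```

Starting workers and shipping data has a cost, so this tends to pay off only when the per-column function is expensive, as the question suggests it is.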

Published: 2023-07-28 23:42:00
Link: https://www.elefans.com/category/jswz/34/1310281.html