Removing duplicate columns?

Updated: 2024-10-27 02:30:28
Problem description

I am collating multiple Excel files into one using data frames. The files contain duplicate columns. Is it possible to merge only the unique columns?

Here is my code:

    library(rJava)
    library(XLConnect)

    data.files = list.files(pattern = "*.xls")

    # Read the first file
    df = readWorksheetFromFile(file=data.files[1], sheet=1, check.names=F)

    # Loop through the remaining files and merge them to the existing data frame
    for (file in data.files[-1]) {
      newFile = readWorksheetFromFile(file=file, sheet=1, check.names=F)
      df = merge(df, newFile, all = TRUE, check.names=F)
    }

Recommended answer

First of all, if you apply merge correctly, there shouldn't be any duplicated columns, provided that the duplicated columns also have exactly the same names in the Excel files. Since you use merge, there must be at least one column in the Excel files that has exactly the same name and contains the values used to merge them.
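To illustrate how merge treats identically named columns, here is a minimal sketch with two toy data frames standing in for two Excel sheets. The data and the names a, b, m_default and m_by_id are made up for illustration. By default merge joins on every column name the two frames share; if the join is restricted with by, a shared non-key column is kept twice with suffixes.

```r
# Two toy data frames standing in for two Excel sheets (hypothetical data).
a <- data.frame(id = 1:3, height = c(1.5, 1.6, 1.7))
b <- data.frame(id = 2:4, height = c(1.6, 1.7, 1.8), weight = c(60, 65, 70))

# Default: join on all column names the two frames have in common
# (here id and height), so neither column is duplicated.
m_default <- merge(a, b, all = TRUE)
names(m_default)  # "id" "height" "weight"

# Join on id only: the shared non-key column height now appears twice,
# distinguished by the default suffixes .x and .y.
m_by_id <- merge(a, b, by = "id", all = TRUE)
names(m_by_id)    # "id" "height.x" "height.y" "weight"
```

Columns that hold the same values under different names are not detected by merge at all, which is the case the rest of the answer deals with.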

So I reckon you want to check whether the resulting data frame contains duplicate columns, based on the values in each column. For this, you could use the following:

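    keepUnique <- function(x){
      # all pairs of column names
      combs <- combn(names(x), 2)
      # TRUE where both columns of a pair hold identical values
      dups <- mapply(identical, x[combs[1,]], x[combs[2,]])
      # the second column of every identical pair is dropped
      drop <- combs[2,][dups]
      x[ !names(x) %in% drop ]
    }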

Which gives:

    > mydf <- cbind(iris,iris[,3])[1:5,]
    > mydf
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species iris[, 3]
    1          5.1         3.5          1.4         0.2  setosa       1.4
    2          4.9         3.0          1.4         0.2  setosa       1.4
    3          4.7         3.2          1.3         0.2  setosa       1.3
    4          4.6         3.1          1.5         0.2  setosa       1.5
    5          5.0         3.6          1.4         0.2  setosa       1.4
    > keepUnique(mydf)
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
    3          4.7         3.2          1.3         0.2  setosa
    4          4.6         3.1          1.5         0.2  setosa
    5          5.0         3.6          1.4         0.2  setosa
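As an alternative to comparing every pair of columns, base R's duplicated() also accepts a list, so columns with repeated contents can be dropped in one step. The following is a sketch, not part of the original answer: dropDuplicateColumns is a hypothetical helper name, and it assumes that two columns count as duplicates when their contents are the same regardless of their names.

```r
# Hypothetical helper: keep the first occurrence of each distinct column.
dropDuplicateColumns <- function(x) {
  # as.list() turns the data frame into a list of its columns;
  # duplicated() flags every column whose contents already appeared earlier.
  x[!duplicated(as.list(x))]
}

# Same example data as above: iris plus a copy of its third column.
mydf <- cbind(iris, iris[, 3])[1:5, ]
result <- dropDuplicateColumns(mydf)
names(result)
```

Unlike the combn-based version, this sketch does not rely on the column names, so it should also cope with data frames that have a single column or repeated names.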

You can use this after reading in a file, i.e. add the line

    newFile <- keepUnique(newFile)

to your own code.
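Putting the pieces together, the whole workflow might look like the sketch below. It uses small in-memory data frames in place of the Excel sheets so that it runs without XLConnect. The data and the names sheets and cleaned are made up for illustration; in the real loop each element would come from readWorksheetFromFile.

```r
# keepUnique as defined in the answer.
keepUnique <- function(x) {
  combs <- combn(names(x), 2)
  dups <- mapply(identical, x[combs[1, ]], x[combs[2, ]])
  drop <- combs[2, ][dups]
  x[!names(x) %in% drop]
}

# Hypothetical stand-ins for sheets read from the Excel files.
sheets <- list(
  data.frame(id = 1:3, score = c(10, 20, 30), score_copy = c(10, 20, 30)),
  data.frame(id = 1:3, grade = c("a", "b", "c"))
)

# Drop value-duplicated columns in each sheet, then merge the sheets
# one after another on the column names they share (here: id).
cleaned <- lapply(sheets, keepUnique)
df <- Reduce(function(x, y) merge(x, y, all = TRUE), cleaned)
names(df)  # "id" "score" "grade"
```

Note that keepUnique relies on combn(names(x), 2), so this sketch assumes every sheet has at least two columns.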

Published: 2023-10-17 03:00:16
Link: https://www.elefans.com/category/jswz/34/1499579.html