Reordering columns in data frame once again

I want to re-order my columns in my data frame, but what I found so far is not satisfactory.

My dataframe looks like:

    cnt <- as.factor(c("Country 1", "Country 2", "Country 3", "Country 1", "Country 2", "Country 3"))
    bnk <- as.factor(c("bank 1", "bank 2", "bank 3", "bank 1", "bank 2", "bank 3"))
    mayData <- data.frame(age = c(10, 12, 13, 10, 11, 15), Country = cnt, Bank = bnk,
                          q10 = c(1, 1, 1, 2, 2, 2), q11 = c(1, 1, 1, 2, 2, 2),
                          q1 = c(1, 1, 1, 2, 2, 2), q9 = c(1, 1, 1, 2, 2, 2),
                          q6 = c(1, 1, 1, 2, 2, 2),
                          year = c(1950, 1960, 1970, 1980, 1990, 2000))

      age   Country   Bank q10 q11 q1 q9 q6 year
    1  10 Country 1 bank 1   1   1  1  1  1 1950
    2  12 Country 2 bank 2   1   1  1  1  1 1960
    3  13 Country 3 bank 3   1   1  1  1  1 1970
    4  10 Country 1 bank 1   2   2  2  2  2 1980
    5  11 Country 2 bank 2   2   2  2  2  2 1990
    6  15 Country 3 bank 3   2   2  2  2  2 2000

but I want to re-arrange the columns to look like this:

        Country   Bank year age q1 q6 q9 q10 q11
    1 Country 1 bank 1 1950  10  1  1  1   1   1
    2 Country 2 bank 2 1960  12  1  1  1   1   1
    3 Country 3 bank 3 1970  13  1  1  1   1   1
    4 Country 1 bank 1 1980  10  2  2  2   2   2
    5 Country 2 bank 2 1990  11  2  2  2   2   2
    6 Country 3 bank 3 2000  15  2  2  2   2   2

My real dataframe has a lot of columns, so rearranging the column order "manually" using the index or the name of each column is not optimal.

Notice also that for the column names that begin with q I want to have them in ascending order, that is from q1 to q11. The problem is that R sorts the names as character strings, so it fails to understand that q6 - which stands for "question 6" - should precede q10. To see this deficiency, look at the following example:

    mayData <- mayData[, order(colnames(mayData), decreasing = FALSE)]

      age   Bank   Country q1 q10 q11 q6 q9 year
    1  10 bank 1 Country 1  1   1   1  1  1 1950
    2  12 bank 2 Country 2  1   1   1  1  1 1960
    3  13 bank 3 Country 3  1   1   1  1  1 1970
    4  10 bank 1 Country 1  2   2   2  2  2 1980
    5  11 bank 2 Country 2  2   2   2  2  2 1990
    6  15 bank 3 Country 3  2   2   2  2  2 2000

So, essentially, I want to reorder my columns by first placing a few columns in some flexible way according to my preference and then applying an ordering criterion to the rest, but a "logical" one that lets R sort the q columns properly.

Best answer

We can use mixedsort from gtools to arrange the 'q' columns.

    library(gtools)
    i1 <- grep("q\\d+", names(mayData))
    nm1 <- mixedsort(names(mayData)[i1])
    mayData[c(setdiff(names(mayData), nm1), nm1)]
    #  age   Country   Bank year q1 q6 q9 q10 q11
    #1  10 Country 1 bank 1 1950  1  1  1   1   1
    #2  12 Country 2 bank 2 1960  1  1  1   1   1
    #3  13 Country 3 bank 3 1970  1  1  1   1   1
    #4  10 Country 1 bank 1 1980  2  2  2   2   2
    #5  11 Country 2 bank 2 1990  2  2  2   2   2
    #6  15 Country 3 bank 3 2000  2  2  2   2   2

NOTE: Using only base R functions and a single package.

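Or, as @Cath mentioned, removing the "q" prefix with sub (or gsub) and converting the rest to numeric can be used to order as well:

    sort(as.numeric(sub("^q", "", names(mayData)[i1])))
    # gives the question numbers in numeric order: 1 6 9 10 11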
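
To get from those sorted numbers to the column order asked for in the question, one option is to use order instead of sort and index the names with the result. The following is only a base R sketch under some assumptions: the helper name reorder_cols and its arguments front and pattern are made up for illustration (they are not part of the original answer or of any package), and the question columns are assumed to be named "q" followed by digits.

```r
# Hypothetical helper, not from the original answer:
# put the `front` columns first, then any remaining non-question columns,
# then the question columns sorted by their numeric suffix.
reorder_cols <- function(df, front = character(0), pattern = "^q[0-9]+$") {
  nms <- names(df)
  # Names of the question columns (assumed to look like q1, q10, ...)
  q_nms <- grep(pattern, nms, value = TRUE)
  # Drop the non-digit prefix and order by the remaining number
  q_nms <- q_nms[order(as.numeric(sub("^[^0-9]*", "", q_nms)))]
  # Everything that is neither a preferred nor a question column
  rest <- setdiff(nms, c(front, q_nms))
  df[c(front, rest, q_nms)]
}

# Same data as in the question
mayData <- data.frame(
  age = c(10, 12, 13, 10, 11, 15),
  Country = as.factor(rep(c("Country 1", "Country 2", "Country 3"), 2)),
  Bank = as.factor(rep(c("bank 1", "bank 2", "bank 3"), 2)),
  q10 = c(1, 1, 1, 2, 2, 2), q11 = c(1, 1, 1, 2, 2, 2),
  q1 = c(1, 1, 1, 2, 2, 2), q9 = c(1, 1, 1, 2, 2, 2),
  q6 = c(1, 1, 1, 2, 2, 2),
  year = c(1950, 1960, 1970, 1980, 1990, 2000)
)

result <- reorder_cols(mayData, front = c("Country", "Bank", "year"))
names(result)
```

Only the few preferred columns have to be spelled out by hand; the rest keep their original relative order and the q columns follow in numeric order, which should scale to a data frame with many question columns.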
