Selecting other row elements after aggregating in R

Updated: 2024-10-26 07:27:50

Problem description

I would like to select the youngest person in each group, broken down by gender.

Here is my initial data:

data1
   ID Age Gender Group
1 A01  25      m     a
2 A02  35      f     b
3 B03  45      m     b
4 C99  50      m     b
5 F05  60      f     a
6 X05  65      f     a

This is the result I want:

Gender Group Age  ID
     m     a  25 A01
     f     a  60 F05
     m     b  45 B03
     f     b  35 A02

I tried the aggregate function, but I don't know how to attach the ID to its result:

aggregate(Age ~ Gender + Group, data1, min)
Gender Group Age
     m     a  25
     f     a  60
     m     b  45
     f     b  35

Recommended answer

We can use data.table. First convert the data.frame to a data.table with setDT(data1). To get the row with the minimum 'Age' in each group, use which.min to find the row index of the minimum 'Age' within each 'Gender', 'Group' combination, then use that index to subset the rows (.SD[which.min(Age)]).

library(data.table)
setDT(data1)[, .SD[which.min(Age)], by = .(Gender, Group)]
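To make the per-group step more concrete, here is a minimal base R sketch of the same idea: split the data by 'Gender' and 'Group', then keep the row at which.min(Age) in each piece. It is not part of the original answer; the names `groups` and `youngest` are illustrative, and `data1` is rebuilt from the question's table so the block runs on its own.

```r
# Sample data copied from the question.
data1 <- data.frame(
  ID     = c("A01", "A02", "B03", "C99", "F05", "X05"),
  Age    = c(25, 35, 45, 50, 60, 65),
  Gender = c("m", "f", "m", "m", "f", "f"),
  Group  = c("a", "b", "b", "b", "a", "a"),
  stringsAsFactors = FALSE
)

# One data frame per Gender/Group combination;
# drop = TRUE discards combinations that have no rows.
groups <- split(data1, list(data1$Gender, data1$Group), drop = TRUE)

# In each piece, keep the single row with the smallest Age.
# which.min returns the first minimum, so ties yield one row.
youngest <- do.call(rbind, lapply(groups, function(g) g[which.min(g$Age), ]))
rownames(youngest) <- NULL
youngest
```

Because which.min returns only the first minimum, a group with two people of the same lowest age would still contribute a single row.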

Another option is to order by 'Gender', 'Group', 'Age', and then take the first row of each 'Gender', 'Group' combination using unique.

unique(setDT(data1)[order(Gender,Group,Age)], by = c('Gender', 'Group'))
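The sort-then-deduplicate idea can also be sketched in base R, assuming `data1` is an ordinary data.frame. This is an illustration rather than the answer's own code; `sorted` and `first_rows` are made-up names, and duplicated() plays the role that unique(..., by = ...) plays in data.table.

```r
# Sample data copied from the question.
data1 <- data.frame(
  ID     = c("A01", "A02", "B03", "C99", "F05", "X05"),
  Age    = c(25, 35, 45, 50, 60, 65),
  Gender = c("m", "f", "m", "m", "f", "f"),
  Group  = c("a", "b", "b", "b", "a", "a"),
  stringsAsFactors = FALSE
)

# Sort so that the youngest person comes first within each Gender/Group.
sorted <- data1[order(data1$Gender, data1$Group, data1$Age), ]

# Keep only the first row seen for each Gender/Group combination.
first_rows <- sorted[!duplicated(sorted[, c("Gender", "Group")]), ]
first_rows
```

The rows come back in sorted order ('Gender', then 'Group'), not in the order of the original data.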

Or, using the same approach with dplyr, we group by 'Gender', 'Group' and use slice with which.min to get the row with the minimum 'Age' in each group.

library(dplyr)
data1 %>%
  group_by(Gender, Group) %>%
  slice(which.min(Age))

Or we can arrange by 'Gender', 'Group', 'Age', then group by 'Gender', 'Group' and take the first row of each group.

data1 %>% arrange(Gender,Group, Age) %>% group_by(Gender,Group) %>% slice(1L)
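For readers who would rather keep the aggregate call from the question, one possible base R sketch is to merge its result back onto the original data to recover 'ID'. This is an alternative technique, not something the answer above proposes; `youngest_age` and `result` are illustrative names.

```r
# Sample data copied from the question.
data1 <- data.frame(
  ID     = c("A01", "A02", "B03", "C99", "F05", "X05"),
  Age    = c(25, 35, 45, 50, 60, 65),
  Gender = c("m", "f", "m", "m", "f", "f"),
  Group  = c("a", "b", "b", "b", "a", "a"),
  stringsAsFactors = FALSE
)

# Minimum Age per Gender/Group, exactly as in the question.
youngest_age <- aggregate(Age ~ Gender + Group, data1, min)

# Join back on all three columns to pick up the matching ID.
result <- merge(youngest_age, data1, by = c("Gender", "Group", "Age"))
result
```

Unlike the which.min variants, this join returns every row that shares the minimum age, so a group with ties would contribute more than one row.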
