I would like to select the youngest person within each group, separately for each gender.
This is my initial data:

    data1
       ID Age Gender Group
    1 A01  25      m     a
    2 A02  35      f     b
    3 B03  45      m     b
    4 C99  50      m     b
    5 F05  60      f     a
    6 X05  65      f     a

I would like to get this:

    Gender Group Age  ID
         m     a  25 A01
         f     a  60 F05
         m     b  45 B03
         f     b  35 A02

So I tried the aggregate function, but I don't know how to attach the ID to the result:

    aggregate(Age~Gender+Group,data1,min)
    Gender Group Age
         m     a  25
         f     a  60
         m     b  45
         f     b  35

Recommended answer
We can use data.table. We convert the data.frame to a data.table with setDT(data1). To get the row with the minimum 'Age' in each group, we use which.min to find the row index of the minimum 'Age' within each 'Gender', 'Group' combination, and then use that index to subset the rows (.SD[which.min(Age)]).
    library(data.table)
    setDT(data1)[, .SD[which.min(Age)], by = .(Gender, Group)]

Another option is to order by 'Gender', 'Group', and 'Age', and then take the first row of each group using unique.
    unique(setDT(data1)[order(Gender,Group,Age)], by = c('Gender', 'Group'))
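The same sort-then-keep-first idea can also be sketched in base R, without any package. This is only an illustration, not part of the original answer: the data frame is rebuilt from the question, and the names `sorted` and `first_rows` are made up for this example.

```r
# Rebuild the example data from the question.
data1 <- data.frame(
  ID     = c("A01", "A02", "B03", "C99", "F05", "X05"),
  Age    = c(25, 35, 45, 50, 60, 65),
  Gender = c("m", "f", "m", "m", "f", "f"),
  Group  = c("a", "b", "b", "b", "a", "a"),
  stringsAsFactors = FALSE
)

# Sort so that the youngest person comes first within each Gender/Group pair.
sorted <- data1[order(data1$Gender, data1$Group, data1$Age), ]

# Keep only the first row of each Gender/Group pair.
first_rows <- sorted[!duplicated(sorted[c("Gender", "Group")]), ]
print(first_rows)
```

Like which.min, this keeps a single row per group even if several people share the minimum age.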
Or, using the same approach with dplyr, we use slice with which.min to get the row with the minimum 'Age', grouped by 'Gender' and 'Group'.
    library(dplyr)
    data1 %>%
      group_by(Gender, Group) %>%
      slice(which.min(Age))

Or we can arrange by 'Gender', 'Group', 'Age' and then take the first row of each group.
    data1 %>%
      arrange(Gender, Group, Age) %>%
      group_by(Gender, Group) %>%
      slice(1L)
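Since the question started from aggregate, one way to attach the ID to that result in base R is to merge the aggregated minimum ages back onto the original data. This is a minimal sketch rather than part of the original answer: the data frame is rebuilt from the question, and `min_age` and `youngest` are names chosen for the example.

```r
# Rebuild the example data from the question.
data1 <- data.frame(
  ID     = c("A01", "A02", "B03", "C99", "F05", "X05"),
  Age    = c(25, 35, 45, 50, 60, 65),
  Gender = c("m", "f", "m", "m", "f", "f"),
  Group  = c("a", "b", "b", "b", "a", "a"),
  stringsAsFactors = FALSE
)

# Minimum age per Gender/Group combination, as in the question.
min_age <- aggregate(Age ~ Gender + Group, data = data1, FUN = min)

# Join back on Gender, Group and Age to recover the matching ID(s).
youngest <- merge(min_age, data1, by = c("Gender", "Group", "Age"))
print(youngest)
```

Unlike which.min, the merge keeps every row that ties for the minimum age in a group, which may or may not be what you want.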