I have data arranged like this in R:

    indv time mass
       1   10    7
       2    5    3
       1    5    1
       2    4    4
       2   14   14
       1   15   15

where indv identifies an individual in a population. I want to add columns for initial mass (mass_i) and final mass (mass_f). I learned yesterday that I can add a column for initial mass using ddply in plyr:

    sorted <- ddply(test, .(indv, time), sort)
    sorted2 <- ddply(sorted, .(indv), transform, mass_i = mass[1])

which gives a table like:

      indv mass time mass_i
    1    1    1    5      1
    2    1    7   10      1
    3    1   10   15      1
    4    2    4    4      4
    5    2    3    5      4
    6    2    8   14      4
    7    2    9   20      4

However, this same method will not work for finding the final mass (mass_f), as I have a different number of observations for each individual. Can anyone suggest a method for finding the final mass when the number of observations may vary?
Best answer
You can simply use length(mass) as the index of the last element:

    sorted2 <- ddply(sorted, .(indv), transform, mass_i = mass[1], mass_f = mass[length(mass)])

As suggested by mb3041023 and discussed in the comments below, you can achieve similar results without sorting your data frame:

    ddply(test, .(indv), transform, mass_i = mass[which.min(time)], mass_f = mass[which.max(time)])

Except for the order of rows, this is the same as sorted2.
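For readers who want to try both ideas without installing plyr, here is a minimal base R sketch of the same split-apply-combine logic. It is not the plyr code from the answer: the helper names add_by_time and add_by_position, and the variables by_time and by_position, are made up for this illustration, and test is rebuilt from the six sample rows in the question.

```r
# Sample data from the question: six observations of two individuals.
test <- data.frame(
  indv = c(1, 2, 1, 2, 2, 1),
  time = c(10, 5, 5, 4, 14, 15),
  mass = c(7, 3, 1, 4, 14, 15)
)

# Hypothetical helper: pick masses by earliest and latest time (no sorting needed).
add_by_time <- function(d) {
  d$mass_i <- d$mass[which.min(d$time)]  # mass at the earliest time
  d$mass_f <- d$mass[which.max(d$time)]  # mass at the latest time
  d
}

# Hypothetical helper: pick masses by position (requires rows sorted by time).
add_by_position <- function(d) {
  d$mass_i <- d$mass[1]               # first observation
  d$mass_f <- d$mass[length(d$mass)]  # last observation, whatever the group size
  d
}

# Approach without sorting: split by individual, apply, recombine.
by_time <- do.call(rbind, lapply(split(test, test$indv), add_by_time))
rownames(by_time) <- NULL

# Approach with sorting: order by individual and time first.
sorted <- test[order(test$indv, test$time), ]
by_position <- do.call(rbind, lapply(split(sorted, sorted$indv), add_by_position))
rownames(by_position) <- NULL

print(by_time)
print(by_position)
```

Both results carry the same mass_i and mass_f for each individual and differ only in row order. Note that which.min and which.max return the first matching index, so if an individual has tied earliest or latest times, the first such row is the one used.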