Aggregating a list and grouping them together in R?

Updated: 2024-10-22 13:36:48


df <- USArrests
df$ID <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           3, 3, 3, 3, 3, 3, 3, 1, 1, 1,
           1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 3,
           3, 3, 3, 3, 3, 3, 2, 2, 2, 2)
df$Year <- c(2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017, 2017, 2015,
             2012, 2013, 2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017,
             2017, 2015, 2012, 2012, 2017, 2017, 2017, 2015, 2012, 2013,
             2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017, 2017, 2015,
             2012, 2013, 2017, 2017, 2017, 2015, 2011, 2015, 2012, 2013)

I have the following data. What I would like to do is create a list of all the states, based on their year and ID.

To show it visually, I've made a small mockup in Excel:

Since Alabama has the year 2017 and the ID 1, it gets put in that cell. The other states, Alaska and Arizona, similarly have the year 2017 and ID 1, so they get put in the same place. That entire cell will then make up a vector like:

c('Alabama', 'Alaska', 'Arizona')

When I say "cell", I of course mean an "element" in the R context. I would like R to fill in all the other elements in the same way and output the result as a data frame, with IDs as observations (rows) and years as variables (columns). I really have no idea how to go about doing this, though...

Best answer


One option is to use data.table and dcast. The keep.rownames = TRUE argument of setDT converts the row names (here, the state names) into a column named rn.

library(data.table)
setDT(df, keep.rownames = TRUE)
dcast(df, ID ~ Year, value.var = "rn", fun.aggregate = paste0, collapse = ",")
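For readers without data.table, a similar wide layout can be sketched in base R with tapply. This is not part of the original answer; it is an alternative under the assumption that one comma-separated string per ID/Year cell is acceptable. The names states and states_df are illustrative, and the rep() call is only a shorthand for the ID vector given in the question.

```r
# Rebuild the question's data; rep() reproduces the same 50 ID values.
df <- USArrests
df$ID <- rep(c(1, 3, 1, 2, 3, 2), times = c(10, 7, 5, 17, 7, 4))
df$Year <- c(2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017, 2017, 2015,
             2012, 2013, 2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017,
             2017, 2015, 2012, 2012, 2017, 2017, 2017, 2015, 2012, 2013,
             2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017, 2017, 2015,
             2012, 2013, 2017, 2017, 2017, 2015, 2011, 2015, 2012, 2013)

# Group the state names (row names) by ID and Year, and collapse each
# group into one comma-separated string. Rows are IDs, columns are years.
states <- tapply(rownames(df), list(ID = df$ID, Year = df$Year),
                 paste, collapse = ",")

# Convert the character matrix to a data frame (one row per ID).
states_df <- as.data.frame(states)

print(states[, c("2011", "2012", "2013")])
```

Unlike the dcast call, ID/Year combinations with no states come back as NA here rather than as an empty string.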

If only the number of states per cell is needed, the solution could be:

dcast(df, ID ~ Year, value.var = "rn", fun.aggregate = length)
#    ID 2011 2012 2013 2015 2017
# 1:  1    0    1    2    3    9
# 2:  2    1    5    3    3    9
# 3:  3    0    3    2    3    6
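The same counts can be cross-checked in base R with table. Again, this is only a sketch and not something the answer itself proposes; counts and counts_df are illustrative names, and the rep() call is a shorthand for the ID vector from the question.

```r
# Rebuild the question's data; rep() reproduces the same 50 ID values.
df <- USArrests
df$ID <- rep(c(1, 3, 1, 2, 3, 2), times = c(10, 7, 5, 17, 7, 4))
df$Year <- c(2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017, 2017, 2015,
             2012, 2013, 2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017,
             2017, 2015, 2012, 2012, 2017, 2017, 2017, 2015, 2012, 2013,
             2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017, 2017, 2015,
             2012, 2013, 2017, 2017, 2017, 2015, 2011, 2015, 2012, 2013)

# Cross-tabulate: number of states for every ID (rows) and Year (columns).
counts <- table(ID = df$ID, Year = df$Year)

# Wide data frame with one row per ID and one column per year.
counts_df <- as.data.frame.matrix(counts)

print(counts)
```

Empty combinations, such as ID 1 in 2011, show up as 0, matching the dcast output above.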

Updated based on feedback from OP:

dcast(df, ID ~ Year, value.var = "rn",
      fun.aggregate = function(x) paste0("Count: ", length(x), " ", paste(x, collapse = ",")))
#    ID 2011                2012                                                        2013
# 1:  1 Count: 0            Count: 1 California                                         Count: 2 Colorado,Louisiana
# 2:  2 Count: 1 Washington Count: 5 Minnesota,Mississippi,New Hampshire,Ohio,Wisconsin Count: 3 New Jersey,Oklahoma,Wyoming
# 3:  3 Count: 0            Count: 3 Hawaii,Kentucky,South Dakota                       Count: 2 Idaho,Tennessee
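The calls above collapse each cell into a single string. If each cell should instead hold an actual character vector, as the question describes, one possible base R sketch keeps the groups unsimplified. This is an assumption about what the asker wants, not part of the answer; cells is an illustrative name, and the result is a list matrix rather than a data frame.

```r
# Rebuild the question's data; rep() reproduces the same 50 ID values.
df <- USArrests
df$ID <- rep(c(1, 3, 1, 2, 3, 2), times = c(10, 7, 5, 17, 7, 4))
df$Year <- c(2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017, 2017, 2015,
             2012, 2013, 2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017,
             2017, 2015, 2012, 2012, 2017, 2017, 2017, 2015, 2012, 2013,
             2017, 2017, 2017, 2015, 2012, 2013, 2017, 2017, 2017, 2015,
             2012, 2013, 2017, 2017, 2017, 2015, 2011, 2015, 2012, 2013)

# simplify = FALSE keeps every group as its own character vector, so the
# result is a list with ID x Year dimensions instead of a character matrix.
cells <- tapply(rownames(df), list(ID = df$ID, Year = df$Year),
                identity, simplify = FALSE)

# Each element is the vector of state names for that ID and year.
print(cells[["1", "2013"]])

# Combinations with no states are NULL, so their length is 0.
print(length(cells[["1", "2011"]]))
```

Counts can then be derived from the vectors themselves, for example with length(cells[["2", "2012"]]).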


Published: 2023-08-07 20:15:00
Article link: https://www.elefans.com/category/jswz/34/1465980.html