Subsetting a data frame by a value of one of its columns


I have a rather large data frame. Here is a simplified example:

    Group Element Value Note
    1     AAA     11    Good
    1     ABA     12    Good
    1     AVA     13    Good
    2     CBA     14    Good
    2     FDA     14    Good
    3     JHA     16    Good
    3     AHF     16    Good
    3     AKF     17    Good

Here it is as a dput:

    dat <- structure(list(
      Group = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
      Element = structure(c(1L, 2L, 5L, 6L, 7L, 8L, 3L, 4L),
        .Label = c("AAA", "ABA", "AHF", "AKF", "AVA", "CBA", "FDA", "JHA"),
        class = "factor"),
      Value = c(11L, 12L, 13L, 14L, 14L, 16L, 16L, 17L),
      Note = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
        .Label = "Good", class = "factor")),
      .Names = c("Group", "Element", "Value", "Note"),
      class = "data.frame", row.names = c(NA, -8L))

I'm trying to split it into separate data frames based on the group. For example, Group 1 would be its own data frame:

    Group Element Value Note
    1     AAA     11    Good
    1     ABA     12    Good
    1     AVA     13    Good

Group 2:

    Group Element Value Note
    2     CBA     14    Good
    2     FDA     14    Good

and so on.

Best answer

You can use split for this.

    > dat
    ##   Group Element Value Note
    ## 1     1     AAA    11 Good
    ## 2     1     ABA    12 Good
    ## 3     1     AVA    13 Good
    ## 4     2     CBA    14 Good
    ## 5     2     FDA    14 Good
    ## 6     3     JHA    16 Good
    ## 7     3     AHF    16 Good
    ## 8     3     AKF    17 Good
    > x <- split(dat, dat$Group)

Then you can access each individual data frame by group number with x[[1]], x[[2]], etc. For example, here is group 2:

    > x[[2]]  ## x[2] would instead return a one-element list containing this data frame
    ##   Group Element Value Note
    ## 4     2     CBA    14 Good
    ## 5     2     FDA    14 Good
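Because split() names each list element after its group value, the pieces can also be looked up by name, which may be safer when the group values are not consecutive integers starting at 1. A minimal sketch, using a smaller stand-in data frame (dat_small and the other variable names are illustrative, not from the question):

```r
# Small stand-in for the data frame in the question (values are illustrative)
dat_small <- data.frame(
  Group   = c(1L, 1L, 5L, 5L, 9L),
  Element = c("AAA", "ABA", "CBA", "FDA", "JHA"),
  Value   = c(11L, 12L, 14L, 14L, 16L)
)

pieces <- split(dat_small, dat_small$Group)

# List names come from the group values, not from positions
group_names <- names(pieces)

# Look up a group by its value, given as a character string
group5 <- pieces[["5"]]

# Positional indexing picks the second piece, which here is also group 5
second <- pieces[[2]]

# Single brackets keep the list wrapper: the result is a list of length one
wrapped <- pieces[2]

# The same rows obtained by subsetting directly, without split()
direct <- dat_small[dat_small$Group == 5, ]
```

Direct subsetting is probably enough when only one group is needed; split() pays off when every group is wanted at once.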

ADD: Since you asked about it in the comments, you can write each individual data frame to a file with write.csv and lapply. The invisible wrapper is there only to suppress the output of lapply.

    > invisible(lapply(seq_along(x), function(i) {
    +     write.csv(x[[i]], file = paste0(i, ".csv"), row.names = FALSE)
    + }))
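If the group values are not simply 1, 2, 3, naming the files by position could be misleading. One possible variation, sketched here with a stand-in data frame and a temporary directory (dat_small, out_dir and the other names are illustrative assumptions), names each file after its group value instead:

```r
# Stand-in data with non-consecutive group values (illustrative)
dat_small <- data.frame(
  Group = c(1L, 1L, 5L, 5L, 9L),
  Value = c(11L, 12L, 14L, 14L, 16L)
)
pieces <- split(dat_small, dat_small$Group)

# Write into a temporary directory so the working directory stays clean
out_dir <- file.path(tempdir(), "groups_by_name")
dir.create(out_dir, showWarnings = FALSE)

# Build one file path per group, named after the group value
paths <- file.path(out_dir, paste0(names(pieces), ".csv"))

# Map() walks the pieces and paths in parallel; invisible() hides its return value
invisible(Map(
  function(piece, path) write.csv(piece, file = path, row.names = FALSE),
  pieces, paths
))

# List the files that were written
written <- sort(list.files(out_dir, pattern = "^[0-9]+\\.csv$"))
```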

We can see that the files were created by looking at list.files

    > list.files(pattern = "^[0-9]+\\.csv$")
    ## [1] "1.csv" "2.csv" "3.csv"

And we can see the data frame of the third group with read.csv

    > read.csv("3.csv")
    ##   Group Element Value Note
    ## 1     3     JHA    16 Good
    ## 2     3     AHF    16 Good
    ## 3     3     AKF    17 Good
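As a round-trip check, the per-group files can be read back into a list and stacked into a single data frame again. This is only a sketch with stand-in data and a temporary directory (all names here are illustrative), not part of the original answer:

```r
# Stand-in data (illustrative)
dat_small <- data.frame(
  Group = c(1L, 1L, 2L, 2L, 3L),
  Value = c(11L, 12L, 14L, 14L, 16L)
)
pieces <- split(dat_small, dat_small$Group)

# Write one CSV per group into a temporary directory
out_dir <- file.path(tempdir(), "groups_roundtrip")
dir.create(out_dir, showWarnings = FALSE)
for (nm in names(pieces)) {
  write.csv(pieces[[nm]],
            file = file.path(out_dir, paste0(nm, ".csv")),
            row.names = FALSE)
}

# Read every CSV in the directory into a list of data frames
files <- list.files(out_dir, pattern = "\\.csv$", full.names = TRUE)
read_back <- lapply(files, read.csv)

# Stack the pieces back into a single data frame
combined <- do.call(rbind, read_back)
```

Note that read.csv does not restore the original row names, since the files were written with row.names = FALSE.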
