R Multiple Columns by Delimiter in each Column, Match String to Value

Updated: 2024-06-14 17:01:34

Sorry for the noob question, but after a few days I still haven't been able to figure out how to do this in R. Simply put, I have two columns as follows:

A:B:C:D:F | 1.1:2.1:3.1:4.1:6.1
A:B:D:F   | 1.2:2.2:4.2:6.2
A:B:C:F   | 1.3:2.3:3.3:6.3
B:C:D:F   | 2.4:3.4:4.4:6.4

Note the delimiter is ':'. At the end I want to have this:

A   | B   | C   | D   | E  | F
1.1 | 2.1 | 3.1 | 4.1 | NA | 6.1
1.2 | 2.2 | NA  | 4.2 | NA | 6.2
1.3 | 2.3 | 3.3 | NA  | NA | 6.3
NA  | 2.4 | 3.4 | 4.4 | NA | 6.4

Why I can't solve it:

The values in the second column are different for each row, so I need to loop over the rows, check whether each string is present in column 1 of row x, and, if it is, insert the corresponding number from column 2 of row x. **I have just chosen 1.1, 1.2, etc. for the rows to make it easier to conceptualize.

0s and NAs aren't included in column 1, so I need to skip columns when values are missing; for example, in the toy example row 2 is missing the proposed C and E columns. The rows don't all have the same number of strings and corresponding values (row 1 has 5 strings, rows 2 through 4 have 4).

I would imagine something similar to the following, replacing "1" with the corresponding column 2, row x value, but I have no idea how to do this. Another approach I experimented with, which is how I came across the code snippet, was to create columns of 1s and 0s based only on whether a string was present, but I got stuck at inserting the column 2 values.

df$A <- ifelse(grepl("A", df$PASS, ignore.case = T), "1", "0")
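
For illustration only, the 1/0 idea in the snippet above could be extended to every label at once by splitting on ':' rather than pattern matching. This is a rough sketch, not a solution to the question: the column name PASS is taken from the snippet, while the toy data frame and the names keys, parts and flags are assumptions made here.

```r
# Toy data frame; PASS is the column name used in the snippet above.
df <- data.frame(
  PASS = c("A:B:C:D:F", "A:B:D:F", "A:B:C:F", "B:C:D:F"),
  stringsAsFactors = FALSE
)

# All labels that should become columns (assumed known in advance).
keys <- c("A", "B", "C", "D", "E", "F")

# Split each row's string into its individual labels.
parts <- strsplit(df$PASS, ":", fixed = TRUE)

# One 0/1 column per label: 1 if the label appears in that row, else 0.
flags <- t(sapply(parts, function(p) as.integer(keys %in% p)))
colnames(flags) <- keys
flags
```

Testing membership after splitting avoids partial matches that grepl could produce if the labels were longer than one character.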

Sorry for the long writeup, but I'm super stuck. I feel this is beyond my beginner level R. Major kudos to anyone that can solve this!

Best answer

Here's a solution in base R, without the magic of tidyverse. It assumes you can read in all your data as one big string, but it's not too hard to alter it for an input stream.

x <- "A:B:C:D:F | 1.1:2.1:3.1:4.1:6.1
A:B:D:F | 1.2:2.2:4.2:6.2
A:B:C:F | 1.3:2.3:3.3:6.3
B:C:D:F | 2.4:3.4:4.4:6.4"

# One element per input line
data <- unlist(strsplit(x, "\n"))

# Result matrix pre-filled with NA, one column per possible label
result <- matrix(NA_real_, nrow = length(data), ncol = 6)
colnames(result) <- c("A", "B", "C", "D", "E", "F")

for (i in seq_along(data)) {
  # Separate the labels (left of the pipe) from the values (right of it)
  split_data <- unlist(strsplit(data[i], " [|] "))
  print(split_data)
  indices <- unlist(strsplit(split_data[1], ":"))
  values <- unlist(strsplit(split_data[2], ":"))
  # Place each value in the column named by its label
  for (j in seq_along(indices)) {
    result[i, indices[j]] <- as.numeric(values[j])
  }
}
result
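
If the data is already in a data frame with two columns, as the question describes, the same matching can probably be done without parsing one big string. The following is a minimal sketch under stated assumptions: PASS is the column name from the question's snippet, VALUES is a made-up name for the second column, and the set of labels in keys is assumed to be known in advance.

```r
# Toy data frame; VALUES is a hypothetical name for the second column.
df <- data.frame(
  PASS   = c("A:B:C:D:F", "A:B:D:F", "A:B:C:F", "B:C:D:F"),
  VALUES = c("1.1:2.1:3.1:4.1:6.1", "1.2:2.2:4.2:6.2",
             "1.3:2.3:3.3:6.3", "2.4:3.4:4.4:6.4"),
  stringsAsFactors = FALSE
)

# Labels that should become columns (assumed known in advance).
keys <- c("A", "B", "C", "D", "E", "F")

# Split both columns on ':'; each becomes a list with one vector per row.
labels <- strsplit(df$PASS, ":", fixed = TRUE)
values <- lapply(strsplit(df$VALUES, ":", fixed = TRUE), as.numeric)

# For each row, name the values by their labels, then look them up by key.
# Labels that are absent from a row come back as NA.
wide <- t(mapply(function(l, v) setNames(v, l)[keys], labels, values))
colnames(wide) <- keys
wide
```

Indexing a named vector by a name it does not contain yields NA, which is what fills the gaps here; the explicit double loop in the answer achieves the same thing by starting from an all-NA matrix.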

Published: 2023-04-20 18:49:00
Link: https://www.elefans.com/category/dzcp/3d83ccc81e37f2978ff01aa95522fb24.html