I'm trying to merge information from two data frames. The problem is that they have different dimensions, and I want to match rows by the values in a column rather than by column index. The merge function in R and the joins in dplyr don't work with my data.
I have two data frames (one is a subset of the other, with updated info in the last column):
df1 = data.frame(Name = LETTERS[1:9], val = seq(1:3), Case = c("NA","1","NA","NA","1","NA","1","NA","NA"))

  Name val Case
1    A   1   NA
2    B   2    1
3    C   3   NA
4    D   1   NA
5    E   2    1
6    F   3   NA
7    G   1    1
8    H   2   NA
9    I   3   NA

Some rows in the Case column of df1 have to be changed using the info in df2 below:

df2 = data.frame(Name = c("A","D","H"), val = seq(1:3), Case = "1")

  Name val Case
1    A   1    1
2    D   2    1
3    H   3    1

There's nothing important in the val column. I added it to the examples to show that I have more than two columns, and my real data is much bigger than these examples.
Basically, I want to change specific rows by checking the values in the first column (in this case, unique letters), and in the end I still want df1 as the final data frame.
To explain it better, I want to see something like this:

  Name val Case
1    A   1    1
2    B   2    1
3    C   3   NA
4    D   1    1
5    E   2    1
6    F   3   NA
7    G   1    1
8    H   2    1
9    I   3   NA

Note the changed information for A, D and H.
Thanks.
Best answer
%in% from base R comes to the rescue. It flags which rows of df1 have an update in df2, and match() lines up each replacement value with the right Name, so the result does not depend on vector recycling.

df1 <- data.frame(Name = LETTERS[1:9], val = seq(1:3), Case = c("NA","1","NA","NA","1","NA","1","NA","NA"), stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("A","D","H"), val = seq(1:3), Case = "1", stringsAsFactors = FALSE)
# position of each df1 name in df2 (NA where there is no update)
idx <- match(df1$Name, df2$Name)
# take the df2 value where the name is present, otherwise keep the old value
df1$Case <- ifelse(df1$Name %in% df2$Name, df2$Case[idx], df1$Case)
df1

Output:

> df1
  Name val Case
1    A   1    1
2    B   2    1
3    C   3   NA
4    D   1    1
5    E   2    1
6    F   3   NA
7    G   1    1
8    H   2    1
9    I   3   NA
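Since the real data is said to be bigger and to have more columns, the same update-by-key idea can be wrapped in a small function. The following is only a sketch under stated assumptions: update_by_key and the upd table (with rows out of order and distinct Case values) are hypothetical names and data invented for illustration, not part of the original question or answer. It relies only on base R's match().

```r
# Hypothetical helper: overwrite column `col` of `target` with values from
# `updates`, pairing rows by the key column `key`. Keys are assumed unique
# in `updates`.
update_by_key <- function(target, updates, key, col) {
  idx <- match(target[[key]], updates[[key]])  # NA where the key has no update
  hit <- !is.na(idx)                           # rows of target to overwrite
  target[[col]][hit] <- updates[[col]][idx[hit]]
  target
}

df1 <- data.frame(Name = LETTERS[1:9], val = rep(1:3, 3),
                  Case = c("NA","1","NA","NA","1","NA","1","NA","NA"),
                  stringsAsFactors = FALSE)

# Illustrative update table: rows deliberately out of order, distinct values
upd <- data.frame(Name = c("H","A","D"), Case = c("3","1","2"),
                  stringsAsFactors = FALSE)

res <- update_by_key(df1, upd, key = "Name", col = "Case")
print(res)
```

Because the values are looked up by position via match(), the order of rows in the update table does not matter, and rows of df1 without a matching Name keep their old value. Other columns such as val are left untouched.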