I'm trying to get the unique combinations of var1 within each ID, but I keep getting a warning and the rows for each ID are not expanded.
```r
library(data.table)  # needed for data.table() and `:=`

ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,4,4,4,5,5,5,5,5,6,6,6,6)
var1 <- c("A","B","E","F","C","D","C","A","B","C","A","D","B","C",
          "A","B","C","A","D","C","A","B","C","E","F","G")
df1 <- data.frame(ID, var1)
df1 <- df1[order(df1$ID, df1$var1), ]
dd <- unique(df1)
dd <- data.table(dd)

# one new column per position of the 3-letter combination
dd[, new4 := t(combn(sort(var1), m = 3))[, 1], by = "ID"]
dd[, new5 := t(combn(sort(var1), m = 3))[, 2], by = "ID"]
dd[, new6 := t(combn(sort(var1), m = 3))[, 3], by = "ID"]
```

```
Warning message:
In `[.data.table`(dd, , `:=`(new4, t(combn(sort(var1), m = 3))[,  :
  RHS 1 is length 10 (greater than the size (5) of group 1). The last 5 element(s) will be discarded.

    ID var1 new4 new5 new6
 1:  1    A    A    B    C
 2:  1    B    A    B    E
 3:  1    C    A    B    F
 4:  1    E    A    C    E
 5:  1    F    A    C    F
 6:  2    A    A    B    C
 7:  2    B    A    B    D
 8:  2    C    A    C    D
 9:  2    D    B    C    D
10:  3    A    A    B    C
11:  3    B    A    B    D
12:  3    C    A    C    D
13:  3    D    B    C    D
14:  4    A    A    B    C
15:  4    B    A    B    C
16:  4    C    A    B    C
17:  5    A    A    B    C
18:  5    B    A    B    D
19:  5    C    A    C    D
20:  5    D    B    C    D
21:  6    C    C    E    F
22:  6    E    C    E    G
23:  6    F    C    F    G
24:  6    G    E    F    G
```
The output doesn't give all the combinations for each ID: for ID 1 (A, B, C, E, F) it gives only 5 combinations. Is there any way to fix this? The output I want for ID 1 has 10 combinations: (A B C) (A C F) (A B F) (A B E) (B C E) (B C F) (C A B) (C A E) (C A F) (E C F)
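The mismatch behind the warning can be checked directly. `:=` adds a column to the rows a group already has and does not add rows, while a group with n distinct letters has choose(n, 3) three-letter combinations. The following is a minimal base R sketch, rebuilding the data from the question, that compares the two counts per ID; the names `rows_per_id`, `combos_per_id`, and `check` are illustrative only.

```r
# data from the question
ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,4,4,4,5,5,5,5,5,6,6,6,6)
var1 <- c("A","B","E","F","C","D","C","A","B","C","A","D","B","C",
          "A","B","C","A","D","C","A","B","C","E","F","G")
df1 <- unique(data.frame(ID, var1))

# rows available per ID after unique() (one row per distinct letter)
rows_per_id <- tapply(df1$var1, df1$ID, length)
# number of 3-letter combinations each ID would need
combos_per_id <- choose(rows_per_id, 3)

check <- data.frame(ID     = names(rows_per_id),
                    rows   = as.vector(rows_per_id),
                    combos = as.vector(combos_per_id))
print(check)
```

For ID 1 this gives 5 rows against 10 combinations, which matches the "RHS 1 is length 10 (greater than the size (5) of group 1)" warning. For ID 4 there are 3 rows but only 1 combination, which is why A B C is repeated on rows 14 to 16 of the output.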
Recommended answer

@BIN Since the number of combinations will not usually match the number of unique letters in "var1", you can try the following:
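```r
library(data.table)  # uniqueN() and `:=`; dd is the data.table from the question
library(dplyr)       # provides %>%

dd[, var1 := as.character(var1)]
dd[, .(Numb.Combinations = choose(var1 %>% uniqueN, 3),
       ID1 = paste0(var1 %>% unique, collapse = ""),
       # collapse each 3-letter combination to a string, then join them with "-"
       Combinations = paste(combn(var1, 3, function(x) paste0(x, collapse = "")),
                            collapse = "-")),
   by = "ID"]
```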
The output is similar to the one that you requested at the very end:
```
   ID Numb.Combinations   ID1                            Combinations
1:  1                10 ABCEF ABC-ABE-ABF-ACE-ACF-AEF-BCE-BCF-BEF-CEF
2:  2                 4  ABCD                         ABC-ABD-ACD-BCD
3:  3                 4  ABCD                         ABC-ABD-ACD-BCD
4:  4                 1   ABC                                     ABC
5:  5                 4  ABCD                         ABC-ABD-ACD-BCD
6:  6                 4  CEFG                         CEF-CEG-CFG-EFG
```
Or if you prefer, as suggested by @akrun and @frank,
```r
dd <- dd[, c(ID1 = paste0(var1 %>% unique, collapse = ""),
             transpose(combn(sort(var1), 3, simplify = FALSE))),
         by = ID]
colnames(dd) <- c("ID", "ID1", "New1", "New2", "New3")
```
With output:
```
    ID   ID1 New1 New2 New3
 1:  1 ABCEF    A    B    C
 2:  1 ABCEF    A    B    E
 3:  1 ABCEF    A    B    F
 4:  1 ABCEF    A    C    E
 5:  1 ABCEF    A    C    F
 6:  1 ABCEF    A    E    F
 7:  1 ABCEF    B    C    E
 8:  1 ABCEF    B    C    F
 9:  1 ABCEF    B    E    F
10:  1 ABCEF    C    E    F
11:  2  ABCD    A    B    C
12:  2  ABCD    A    B    D
13:  2  ABCD    A    C    D
14:  2  ABCD    B    C    D
15:  3  ABCD    A    B    C
16:  3  ABCD    A    B    D
17:  3  ABCD    A    C    D
18:  3  ABCD    B    C    D
19:  4   ABC    A    B    C
20:  5  ABCD    A    B    C
21:  5  ABCD    A    B    D
22:  5  ABCD    A    C    D
23:  5  ABCD    B    C    D
24:  6  CEFG    C    E    F
25:  6  CEFG    C    E    G
26:  6  CEFG    C    F    G
27:  6  CEFG    E    F    G
    ID   ID1 New1 New2 New3
```
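For comparison, the same one-row-per-combination expansion can be sketched in base R, without data.table or dplyr. This is not part of the answer above, only an illustration under the assumption that every ID has at least 3 distinct letters (`combn` stops with an error otherwise). The helper `expand_one` and the names `pieces` and `res` are hypothetical.

```r
# data from the question
ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,4,4,4,5,5,5,5,5,6,6,6,6)
var1 <- c("A","B","E","F","C","D","C","A","B","C","A","D","B","C",
          "A","B","C","A","D","C","A","B","C","E","F","G")
df1 <- unique(data.frame(ID, var1, stringsAsFactors = FALSE))

# hypothetical helper: all 3-letter combinations for one ID, one per row
expand_one <- function(id, letters_in_id) {
  m <- t(combn(sort(letters_in_id), 3))  # one combination per row
  data.frame(ID = as.numeric(id),
             New1 = m[, 1], New2 = m[, 2], New3 = m[, 3],
             stringsAsFactors = FALSE)
}

# split the letters by ID, expand each group, then stack the pieces
groups <- split(df1$var1, df1$ID)
pieces <- Map(expand_one, names(groups), groups)
res <- do.call(rbind, unname(pieces))
rownames(res) <- NULL
print(res)
```

Because each group is expanded into its own data frame before the pieces are stacked, the number of rows per ID is free to differ from the number of input rows, which is what `:=` could not do in the question.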