  id name
1 11 rick
2 32  tom
3 37  joe

   id letters
1  11       r
2  11       i
3  11       c
4  11       k
5  32       t
6  32       o
7  32       m
8  37       j
9  37       o
10 37       e
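For reference, a table like the second one can be derived from the first in base R. This is only a sketch: the data frame name `people` and its contents are assumptions reconstructed from the printed output above, and this is not necessarily how the gist itself does it.

```r
# Hypothetical input, reconstructed from the printed table above
people <- data.frame(id = c(11, 32, 37),
                     name = c("rick", "tom", "joe"),
                     stringsAsFactors = FALSE)

# Split each name into its single characters (one character vector per name)
chars <- strsplit(people$name, "")

# Repeat each id once per letter, then flatten the list of letters
letters_df <- data.frame(id = rep(people$id, lengths(chars)),
                         letters = unlist(chars),
                         stringsAsFactors = FALSE)
letters_df
```

`lengths()` returns the number of letters per name, so `rep()` lines each id up with its letters.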
updated, deleted unneeded stuff
Clean solution. Thanks.
I have a more complicated example. How would I go about it if I wanted all combinations of two letters from the name?
Here's another example:
1 F 6,10 Cancer 6,10
2 F 8,10 Cancer 8,10
3 F 12,13 NoCancer 12,13
4 F 3,4,5,10 Cancer
5 F 7,10 Cancer 7,10
6 F 4,8 NoCancer 4,8
Which I would like to transform into:
1 F 6,10 Cancer 6,10
2 F 8,10 Cancer 8,10
3 F 12,13 NoCancer 12,13
4 F 3,4,5,10 Cancer 3,4
4 F 3,4,5,10 Cancer 3,5
4 F 3,4,5,10 Cancer 3,10
4 F 3,4,5,10 Cancer 4,5
4 F 3,4,5,10 Cancer 4,10
4 F 3,4,5,10 Cancer 5,10
5 F 7,10 Cancer 7,10
6 F 4,8 NoCancer 4,8
Note how entry #4 gets one row for each combination of two of its numbers.
I was trying something with:
combn(x, 2, FUN = function(x) paste(x, collapse = ","), simplify = FALSE)
I'm treating the comma separated numbers as characters. Any ideas?
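For a single entry, that `combn` call behaves as sketched below. The `strsplit` step is an assumption, since the comment does not show how `x` was built; the numbers stay character strings throughout.

```r
# Split one comma-separated entry into its parts (kept as character)
x <- strsplit("3,4,5,10", ",")[[1]]

# All pairs, each pasted back into a comma-separated string
pairs <- combn(x, 2, FUN = function(p) paste(p, collapse = ","), simplify = FALSE)
unlist(pairs)
# "3,4"  "3,5"  "3,10" "4,5"  "4,10" "5,10"
```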
Hi, I'm not sure I get the question. Why is row 4 the only one that gets split up? I'm also not sure what "name" is in this context. Is it the column with "Cancer" and "NoCancer"?
I'm looking for combinations of 2 numbers. In all the other rows the third column already holds exactly 2 numbers. Wherever it holds more than two, I'd like to split it into pairs and put each pair in the last column.
Here's the more elaborate question on SO: http://stackoverflow.com/questions/24662637/split-a-string-into-combinations-of-2-characters-and-expand-into-data-frame-in-r
Thanks
Maybe this:
df <- data.frame(
  iter = 1:6,
  a = rep("F", 6),
  b = c('6,10','8,10','12,13','3,4,5,10','7,10','4,8'),
  c = c('Cancer','Cancer','NoCancer','Cancer','Cancer','NoCancer'),
  d = c('6,10','8,10','12,13','','7,10','4,8'),
  stringsAsFactors = FALSE)

library(plyr)

# For one row: if column b holds more than two numbers, return one row per pair
foo <- function(x){
  tmp <- strsplit(x$b, ",")[[1]]
  if(length(tmp) > 2){
    combos <- combn(tmp, 2, simplify = FALSE)
    combos <- sapply(combos, function(y) paste0(y, collapse=","))
    # keep d as character so it matches the untouched rows
    data.frame(iter=x$iter, a=x$a, b=x$b, c=x$c, d=combos, stringsAsFactors = FALSE)
  } else { x }
}

# Apply foo to each row (grouped by iter) and stack the results
ddply(df, .(iter), foo)
iter a b c d
1 1 F 6,10 Cancer 6,10
2 2 F 8,10 Cancer 8,10
3 3 F 12,13 NoCancer 12,13
4 4 F 3,4,5,10 Cancer 3,4
5 4 F 3,4,5,10 Cancer 3,5
6 4 F 3,4,5,10 Cancer 3,10
7 4 F 3,4,5,10 Cancer 4,5
8 4 F 3,4,5,10 Cancer 4,10
9 4 F 3,4,5,10 Cancer 5,10
10 5 F 7,10 Cancer 7,10
11 6 F 4,8 NoCancer 4,8
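The same result can probably be had without plyr. The following is a base-R sketch, not the approach from the reply above: the helper name `pair_strings` is made up for illustration, and it assumes every value in `b` contains at least two numbers (`combn` errors otherwise).

```r
df <- data.frame(
  iter = 1:6,
  a = "F",
  b = c("6,10", "8,10", "12,13", "3,4,5,10", "7,10", "4,8"),
  c = c("Cancer", "Cancer", "NoCancer", "Cancer", "Cancer", "NoCancer"),
  stringsAsFactors = FALSE)

# Hypothetical helper: all 2-element combinations of one comma-separated string
pair_strings <- function(s) {
  parts <- strsplit(s, ",")[[1]]
  unlist(combn(parts, 2, FUN = paste, collapse = ",", simplify = FALSE))
}

# One character vector of pairs per row of df
pairs <- lapply(df$b, pair_strings)

# Repeat each row once per pair, then attach the pairs as column d
out <- df[rep(seq_len(nrow(df)), lengths(pairs)), ]
out$d <- unlist(pairs)
rownames(out) <- NULL
out
```

Rows that already hold exactly two numbers yield a single pair, so they pass through unchanged and no special case is needed.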
What about this?