Created June 14, 2017 20:16
alexpghayes/0e1e1e2fcc3050f2457f4b4bc25545a9
Looking for a better way to manipulate data in a gather-mutate-spread pattern
I have categorical data spread across multiple columns that I would like to aggregate.
library(tidyverse)

# tibble() replaces the deprecated data_frame()
data <- tibble(
  var1 = sample(LETTERS[1:2], 50, replace = TRUE),  # categorical A/B
  var2 = sample(LETTERS[1:2], 50, replace = TRUE),
  var3 = sample(LETTERS[1:2], 50, replace = TRUE),
  var4 = sample(LETTERS[3:4], 50, replace = TRUE),  # categorical C/D
  var5 = sample(LETTERS[3:4], 50, replace = TRUE),
  var6 = sample(LETTERS[3:4], 50, replace = TRUE)
) %>%
  mutate(id = row_number())
This solves my problem:
data %>%
  group_by(id) %>%
  gather(key, val, paste0("var", 1:3)) %>%
  mutate(numA = sum(val == "A")) %>%
  spread(key, val) %>%
  gather(key2, val2, paste0("var", 4:6)) %>%
  mutate(numC = sum(val2 == "C")) %>%
  spread(key2, val2)
Is there a way to do this with less boilerplate? After mutation, the var1:var6 columns are no longer necessary.
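One possible shortcut, offered as a sketch rather than a definitive answer: since only the per-row counts are needed, each block of columns can be compared against its target level in one step and summed across the row with base R's rowSums(), with no reshaping at all. The names a_cols, c_cols and counts, and the set.seed() call, are assumptions added for illustration and are not part of the original code.

```r
library(dplyr)

set.seed(42)  # assumption: seed added only to make the sketch reproducible

# Same shape as the example data above: three A/B columns, three C/D columns
data <- tibble(
  var1 = sample(LETTERS[1:2], 50, replace = TRUE),
  var2 = sample(LETTERS[1:2], 50, replace = TRUE),
  var3 = sample(LETTERS[1:2], 50, replace = TRUE),
  var4 = sample(LETTERS[3:4], 50, replace = TRUE),
  var5 = sample(LETTERS[3:4], 50, replace = TRUE),
  var6 = sample(LETTERS[3:4], 50, replace = TRUE)
) %>%
  mutate(id = row_number())

# Hypothetical helper vectors naming the two column blocks
a_cols <- paste0("var", 1:3)
c_cols <- paste0("var", 4:6)

# data[a_cols] == "A" yields a logical matrix; rowSums() counts TRUEs per row
counts <- tibble(
  id   = data$id,
  numA = as.integer(rowSums(data[a_cols] == "A")),
  numC = as.integer(rowSums(data[c_cols] == "C"))
)
```

Because counts is built from scratch, the var1:var6 columns are dropped automatically. If they should be kept, the same two rowSums() expressions could instead be assigned as new columns on data.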
hrbrmstr commented Jun 14, 2017