@alexpghayes
Created June 14, 2017 20:16
Looking for a better way to manipulate data in a gather-mutate-spread pattern
I have categorical data spread across multiple columns that I would like to aggregate.
library(tidyverse)

# tibble() replaces data_frame(), which is deprecated in current tibble releases
data <- tibble(
  var1 = sample(LETTERS[1:2], 50, replace = TRUE),  # categorical A/B
  var2 = sample(LETTERS[1:2], 50, replace = TRUE),
  var3 = sample(LETTERS[1:2], 50, replace = TRUE),
  var4 = sample(LETTERS[3:4], 50, replace = TRUE),  # categorical C/D
  var5 = sample(LETTERS[3:4], 50, replace = TRUE),
  var6 = sample(LETTERS[3:4], 50, replace = TRUE)
) %>%
  mutate(id = row_number())
This solves my problem:
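data %>%
  group_by(id) %>%
  # reshape var1:var3 to long form, count the "A" values per id, reshape back
  gather(key, val, paste0("var", 1:3)) %>%
  mutate(numA = sum(val == "A")) %>%
  spread(key, val) %>%
  # same pattern for var4:var6, counting the "C" values per id
  gather(key2, val2, paste0("var", 4:6)) %>%
  mutate(numC = sum(val2 == "C")) %>%
  spread(key2, val2)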
Is there a way to do this with less boilerplate? Once numA and numC have been computed, the var1:var6 columns are no longer needed.
@hrbrmstr

data %>%
  group_by(id) %>%
  gather(j, v, -id) %>%
  count(v) %>%
  spread(v, n) %>%
  mutate_all(coalesce, 0L) %>%
  ungroup()
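
The pipeline above returns one count column per letter (A, B, C, D) for each id. If only numA and numC from the question are wanted, a row-wise count may avoid reshaping altogether. This is a minimal sketch rather than a definitive answer: it assumes the column names var1 to var6 from the question, and the names `counts` and `set.seed(1)` are introduced here purely for illustration.

```r
library(tibble)

set.seed(1)  # assumption: fixed seed, added only so the sketch is reproducible

# Same shape as the data in the question
data <- tibble(
  var1 = sample(LETTERS[1:2], 50, replace = TRUE),
  var2 = sample(LETTERS[1:2], 50, replace = TRUE),
  var3 = sample(LETTERS[1:2], 50, replace = TRUE),
  var4 = sample(LETTERS[3:4], 50, replace = TRUE),
  var5 = sample(LETTERS[3:4], 50, replace = TRUE),
  var6 = sample(LETTERS[3:4], 50, replace = TRUE),
  id   = seq_len(50)
)

# Comparing a set of columns with a string gives a logical matrix;
# rowSums() then counts the TRUE values in each row.
counts <- tibble(
  id   = data$id,
  numA = as.integer(rowSums(data[paste0("var", 1:3)] == "A")),
  numC = as.integer(rowSums(data[paste0("var", 4:6)] == "C"))
)
```

Because each row already corresponds to one id, no grouping is needed, and the var1:var6 columns never appear in the result.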
