Skip to content

Instantly share code, notes, and snippets.

@aurielfournier
Created May 11, 2021 19:40
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save aurielfournier/abe96ddb8bd7bb6735c1d294f13a32ac to your computer and use it in GitHub Desktop.
Save aurielfournier/abe96ddb8bd7bb6735c1d294f13a32ac to your computer and use it in GitHub Desktop.
library(tidyr)
long <- data.frame(Lname = c("A","B","C"),
c = c("c1","c2","c2"),
country=c("NZ","NZ","AL"))
> pivot_wider(data=long, values_from="Lname",names_from="country")
# A tibble: 2 x 3
c NZ AL
<chr> <chr> <chr>
1 c1 A NA
2 c2 B C
@aurielfournier
Copy link
Author

library(tidyr)

long <- data.frame(Lname = c("A","B","A","D","A"),
c = c("c1","c2","c1","c1","c1"),
country=c("NZ","NZ","AL",NA,"AL"))

pivot_wider(data=long, values_from="Lname",names_from="country")

@aurielfournier
Copy link
Author

long_subset <- long %>%
group_by(Lname, country) %>%
sample_n(1)

basically what that is doing above is using group_by to in R's memory subset your dataset so that each unique combo of species name and country gets split into its own dataset, and then sample_n(1) grabs the first row from each of those, and then those individual rows are all put back together, so that the long_subset object just has one combo of each species and country

then the below works fine

pivot_wider(data=long_subset, values_from="Lname",names_from="country")

@embruna
Copy link

embruna commented May 11, 2021

library(tidyverse)

long <- data.frame(Lname = c("A","B","C","A"),
c = c("c1","c2","c2","c2"),
country=c("NZ","NZ","AL",NA))
long %>%
drop_na(country) %>%
group_by(country) %>%
pivot_wider(values_from="Lname",names_from="country")`

@embruna
Copy link

embruna commented May 11, 2021

`library(tidyverse)

long <- data.frame(Lname = c("A","B","C","A"),
c = c("c1","c2","c2","c2"),
country=c("NZ","NZ","AL",NA))
long %>%
drop_na(country) %>%
group_by(country) %>%
pivot_wider(values_from="Lname",names_from="country")`

@cataraea
Copy link

ok so I still can't get my data to work because it is of this form:

long <- data.frame(Lname = c("A","B","C","A","B", "C"),
c = c("c1","c1","c1","c2","c2", "c2"),
country=c("NZ","NZ","AL",NA,"AL", NA))

@cataraea
Copy link

cataraea commented May 11, 2021

I can filter out the NAs and get to here:

long2 = long %>% filter(country != "NA")
long2

long2
Lname c country
1 A c1 NZ
2 B c1 NZ
3 C c1 AL
4 B c2 AL

but pivot_wider still doesn't like that there are duplicate Lnames

@embruna
Copy link

embruna commented May 11, 2021

I feel like I must be missing something super obvious, but I hacked this solution

long %>% drop_na(country) %>% select(-c) %>%. group_by(country,Lname) %>% summarize(n=n()) %>% pivot_wider(names_from="country", values_from = "n") %>% mutate(AL = ifelse(AL == 1, Lname, AL)) %>% mutate(NZ = ifelse(NZ == 1, Lname, NZ)) %>% select(-Lname)

@cataraea
Copy link

cool, thanks so much!

@lovettbr
Copy link

I think this may give you what you want. If the c is not important, it can be removed from group_by().

long2 = long %>% 
  filter(country != "NA") %>%
  group_by(country, c) %>%
  summarize(Lname=toString(Lname)) %>%
  pivot_wider(values_from="Lname",names_from="country")

@cataraea
Copy link

ooh that is so close, but it ends up with multiple Lnames in a single cell for some reason?
output (without the c) is:
long2

A tibble: 1 x 2

AL NZ

1 C, B A, B

@lovettbr
Copy link

Change relevant bit to:

toString(unique(Lname))

@cataraea
Copy link

cataraea commented May 11, 2021

hmm, I get same output with that?

but I can deal with this, thanks so much all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment