danweitzel/b3af9cc6cd88253d833a88b21b9a99fa
Last active February 4, 2021 20:58
Example of transformation for event df merge
library("tidyverse")

# Event data: one row per event, with its first and last year
df_event <- data.frame(country = c("A", "A", "B"),
                       event = c(1,0,1),
                       min_date = c(1990,1994,1993),
                       max_date = c(1996, 1999, 1997))

# Country-year data: one row per country and year, 1989 to 2008
df <- data.frame(country = c(rep("A", 20), rep("B", 20)),
                 year = c(rep(seq(1989, 2008, by = 1),2)))

# Expand each event to one row per year it covers,
# then keep the unique country-years and flag them
df_event_long <-
  df_event %>%
  group_by(country, event) %>%
  pivot_longer(-c(country, event), names_to = "type", values_to = "year") %>%
  complete(year = seq(min(year), max(year), by=1)) %>%
  group_by(country, year) %>%
  select(-c(event, type)) %>%
  unique() %>%
  mutate(event = TRUE)

# Merge onto the country-year data; years without an event get event = NA
df <-
  df %>%
  left_join(df_event_long, by = c("country", "year"))
This file transforms and merges two data frames. The first is an event data frame whose unit of observation is an event in a country. Each event row lists the start and end year of that event. Note: not every country has an event in every year, and a country can have multiple events in the same year. The second is a country-year data frame with complete, unique country-year observations.
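The properties described here can be checked directly on the example inputs. The following is only a sketch in base R: it rebuilds the two data frames with the same values as the script, and the names `no_duplicates`, `full_grid`, `a`, and `overlap_years` are hypothetical helpers introduced for illustration.

```r
# Rebuild the two example inputs (same values as in the script).
df_event <- data.frame(country  = c("A", "A", "B"),
                       event    = c(1, 0, 1),
                       min_date = c(1990, 1994, 1993),
                       max_date = c(1996, 1999, 1997))
df <- data.frame(country = c(rep("A", 20), rep("B", 20)),
                 year    = rep(seq(1989, 2008, by = 1), 2))

# Country-year frame: no country-year pair appears twice ...
no_duplicates <- anyDuplicated(df[c("country", "year")]) == 0
# ... and every country is observed in every year (a full grid).
full_grid <- nrow(df) == length(unique(df$country)) * length(unique(df$year))

# Event frame: country A has two events, and their year ranges overlap.
a <- df_event[df_event$country == "A", ]
overlap_years <- seq(max(a$min_date), min(a$max_date))

no_duplicates
full_grid
overlap_years
```

The overlap in country A is the case that makes a plain join on start year insufficient: several events can cover the same country-year, so the event rows have to be expanded and de-duplicated before merging.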
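The code above generates three events in two countries and then transforms the event data frame. My solution approach: reshape the event data frame to long format, fill in every year between each event's start and end year, reduce the result to unique country-years flagged with event = TRUE, and left-join it onto the country-year data frame.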
Script written in response to: https://twitter.com/hilango/status/1352372636995739648
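For comparison, here is a possible variant of the same idea, not the script's own method. It assumes `purrr::map2()` and `tidyr::unnest()` in place of `pivot_longer()` plus `complete()`, and it additionally recodes unmatched country-years from `NA` to `FALSE`, which the original script does not do. The names `event_years` and `merged` are hypothetical.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Same example inputs as in the script.
df_event <- data.frame(country  = c("A", "A", "B"),
                       event    = c(1, 0, 1),
                       min_date = c(1990, 1994, 1993),
                       max_date = c(1996, 1999, 1997))
df <- data.frame(country = c(rep("A", 20), rep("B", 20)),
                 year    = rep(seq(1989, 2008, by = 1), 2))

# Build one vector of years per event, then unnest to one row per event-year.
event_years <- df_event %>%
  mutate(year = map2(min_date, max_date, ~ seq(.x, .y, by = 1))) %>%
  unnest(year) %>%
  distinct(country, year) %>%   # overlapping events collapse to one row
  mutate(event = TRUE)

# Merge, then turn the NA for "no event" into FALSE (an assumption, see above).
merged <- df %>%
  left_join(event_years, by = c("country", "year")) %>%
  mutate(event = coalesce(event, FALSE))

# Number of event years per country.
merged %>%
  group_by(country) %>%
  summarise(event_years = sum(event))
```

With the example data, country A should end up with event years 1990 through 1999 (the union of its two overlapping events) and country B with 1993 through 1997, while the merged frame keeps all 40 country-year rows.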