@danweitzel
Last active February 4, 2021 20:58
Example of transformation for event df merge
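library("tidyverse")

# Event-level data: one row per event, with its first and last year
df_event <- data.frame(country = c("A", "A", "B"),
                       event = c(1, 0, 1),
                       min_date = c(1990, 1994, 1993),
                       max_date = c(1996, 1999, 1997))

# Country-year panel: one row per country and year, 1989 to 2008
df <- data.frame(country = c(rep("A", 20), rep("B", 20)),
                 year = c(rep(seq(1989, 2008, by = 1), 2)))

df_event_long <-
  df_event %>%
  group_by(country, event) %>%
  # Stack min_date and max_date into a single "year" column
  pivot_longer(-c(country, event), names_to = "type", values_to = "year") %>%
  # Fill in every year between the start and end year of each event
  complete(year = seq(min(year), max(year), by = 1)) %>%
  group_by(country, year) %>%
  select(-c(event, type)) %>%
  # Overlapping events produce duplicate country-years; keep one of each
  unique() %>%
  mutate(event = TRUE)

# Country-years without an event get event = NA after the join
df <-
  df %>%
  left_join(df_event_long, by = c("country", "year"))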
danweitzel commented Jan 22, 2021

This file transforms and merges two data frames. The first is an event data frame in which the unit of observation is an event in a country. Each event row lists the start year and end year of that event. Note: not every country has an event in every year, and a country can have multiple events in the same year. The second is a complete country-year data frame with exactly one row per country-year.
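To make the "multiple events in one year" point concrete, here is a small base R illustration using the two events for country A from the example data. The object names (`event_1`, `event_2`, `overlap`, `covered`) are only illustrative and do not appear in the script.

```r
# Years covered by the two example events in country "A"
event_1 <- seq(1990, 1996)
event_2 <- seq(1994, 1999)

# Years that fall inside both events: 1994, 1995, 1996.
# These show up twice once each event is expanded to one row per year.
overlap <- intersect(event_1, event_2)

# Unique country-years with at least one event: 1990 through 1999
covered <- union(event_1, event_2)
```

This is why the expanded event data has to be deduplicated before it is merged into the country-year data frame.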

The code above generates three events in two countries and then transforms the event data frame. My solution approach:

  1. pivot the event data from wide to long to get start and end years in one column
  2. group by country and event and fill the series between min and max year
  3. group by country and year and remove duplicate observations
  4. add event dummy
  5. merge
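
As a rough alternative sketch of the same steps (not the script above), each event row can be expanded straight into a sequence of years without pivoting first. The names `df_event_years` and `df_merged` are made up for this example, and replacing `NA` with `FALSE` at the end is an extra assumption about what the merged dummy should look like; the script above leaves non-event years as `NA`.

```r
library(dplyr)
library(tidyr)

# Same toy data as in the script
df_event <- data.frame(country = c("A", "A", "B"),
                       event = c(1, 0, 1),
                       min_date = c(1990, 1994, 1993),
                       max_date = c(1996, 1999, 1997))
df <- data.frame(country = rep(c("A", "B"), each = 20),
                 year = rep(seq(1989, 2008, by = 1), times = 2))

df_event_years <- df_event %>%
  rowwise() %>%
  # One list element per event: every year from start to end
  mutate(year = list(seq(min_date, max_date, by = 1))) %>%
  ungroup() %>%
  # One row per event-year
  unnest(year) %>%
  # Drop country-years that are covered by more than one event
  distinct(country, year) %>%
  mutate(event = TRUE)

df_merged <- df %>%
  left_join(df_event_years, by = c("country", "year")) %>%
  # Assumption: treat country-years without an event as FALSE rather than NA
  mutate(event = coalesce(event, FALSE))
```

With the toy data, `df_merged` keeps all 40 country-year rows, and `event` is `TRUE` for 15 of them (1990 to 1999 in country A, 1993 to 1997 in country B).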

Script written in response to: https://twitter.com/hilango/status/1352372636995739648
