@avallecam
Forked from deanmarchiori/complete.r
Created July 28, 2023 17:24
Turning implicit missing values into explicit missing values.
# tidyr::complete() & tidyr::full_seq() -----------------------------------
# Turning implicit missing values into explicit missing values.
# Bonus: Filling in gaps in a date range
library(tidyr)
library(tibble)
library(dplyr)
# Making up some observations from two weather stations.
# Some fields are nested and shouldn't be crossed, e.g. station id and station name.
# Each station also has days with no row at all (implicit missing values).
weather <- tibble(
  weather_station_id = c(123, 123, 123, 123, 456, 456, 456),
  weather_station_name = c("Sydney", "Sydney", "Sydney", "Sydney", "Melbourne",
                           "Melbourne", "Melbourne"),
  dates = c("2020-01-01", "2020-01-02", "2020-01-04", "2020-01-07",
            "2020-01-02", "2020-01-04", "2020-01-07"),
  temp = c(29, 31, 27, 24, 32, 34, 35)
) %>%
  mutate(dates = as.Date(dates))
weather
# Nest the id and station name so that only combinations present in the data
# are kept, then expand dates. Rather than using only the dates present in the
# data, full_seq() fills in the entire daily sequence from the earliest to the
# latest date.
weather %>%
  complete(nesting(weather_station_id, weather_station_name),
           dates = full_seq(dates, 1))
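
# A minimal sketch of what full_seq() contributes on its own, assuming tidyr
# is installed. `observed` and `every_day` are illustrative names that are not
# part of the original script; the dates are the Sydney dates from above.
library(tidyr)
observed <- as.Date(c("2020-01-01", "2020-01-02", "2020-01-04", "2020-01-07"))
# full_seq() returns every value from min(observed) to max(observed) in steps
# of the given period (1 day here), so the days with no observation appear too.
every_day <- full_seq(observed, 1)
every_day
# Inside complete(), full_seq(dates, 1) sees the whole `dates` column, so every
# station is expanded over the same overall range, not its own first/last date.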
@avallecam

# To fill the missing values with 0 instead of NA
# (e.g. for a time series analysis),
# pass the `fill` argument a named list() that maps each
# variable name to its fill value (here temp = 0).

weather %>% 
  complete(nesting(weather_station_id, weather_station_name), 
           dates = full_seq(dates, 1),
           fill = list(temp=0))
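
# A hedged sketch of one caveat with `fill`, assuming tidyr >= 1.2.0 (where
# complete() has an `explicit` argument). `readings`, `filled_all` and
# `filled_new` are hypothetical names used only for this illustration.
library(tidyr)
library(tibble)
readings <- tibble(
  day  = c(1, 2, 4),
  temp = c(29, NA, 27)   # day 2 has a recorded (explicit) NA; day 3 has no row
)
# By default `fill` replaces both the explicit NA on day 2 and the
# newly created row for day 3.
filled_all <- complete(readings, day = full_seq(day, 1),
                       fill = list(temp = 0))
# With explicit = FALSE only the rows created by complete() are filled,
# so the NA that was already recorded on day 2 is left as NA.
filled_new <- complete(readings, day = full_seq(day, 1),
                       fill = list(temp = 0), explicit = FALSE)
filled_all
filled_new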
