Skip to content

Instantly share code, notes, and snippets.

@masnick
Last active July 31, 2022 00:11
Show Gist options
  • Save masnick/4975ec5860ccd98b6a517014bb006e6a to your computer and use it in GitHub Desktop.
Save masnick/4975ec5860ccd98b6a517014bb006e6a to your computer and use it in GitHub Desktop.
Fix up date columns in R
df_messy_dates <- tribble(
~col1_dt, ~col2_dt, ~other_data,
"2019-01-01", "2019-01-02", "a",
"2019-01-03", "1/4/2019", "b",
"01/05/2019", "2019-01-06", "c"
)
# A tibble: 3 x 3
# col1_dt col2_dt other_data
# <chr> <chr> <chr>
# 1 2019-01-01 2019-01-02 a
# 2 2019-01-03 1/4/2019 b
# 3 01/05/2019 2019-01-06 c
fn_date_fixer <- function(col) {
output <- c()
for(cell in col) {
if(is.na(cell) || cell == "") {
# "" will cause `lubridate` to fail, so replace with NA
output <- c(output, NA)
}
else if(typeof(cell) == "character") {
parsed <- readr::parse_date(cell, format="%Y-%m-%d")
if(is.na(parsed)) {
parsed <- readr::parse_date(cell, format="%m/%d/%Y")
}
if(is.na(parsed)) {
stop(str_interp("Could not parse ${cell}"))
}
output <- c(output, parsed)
}
else {
output <- c(output, cell)
}
}
return(lubridate::as_date(output))
}
df_clean_dates <- df_messy_dates %>%
mutate_at(
vars(ends_with("_dt")),
list(
~ fn_date_fixer(.)
)
)
df_clean_dates
# A tibble: 3 x 3
# col1_dt col2_dt other_data
# <date> <date> <chr>
# 1 2019-01-01 2019-01-02 a
# 2 2019-01-03 2019-04-01 b
# 3 2019-05-01 2019-01-06 c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment