@topepo
Created April 3, 2020 01:49
library(tidyverse)
library(lubridate)
# ------------------------------------------------------------------------------
set.seed(2427)
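hotels <-
  readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-11/hotels.csv') %>%
  # Keep only bookings that were not canceled
  filter(is_canceled == 0) %>%
  mutate(
    # Collapse children and babies into a two-level indicator
    children = case_when(children + babies > 0 ~ "children", TRUE ~ "none"),
    # Collapse the parking-space count into a two-level indicator
    required_car_parking_spaces =
      case_when(required_car_parking_spaces > 0 ~ "parking", TRUE ~ "none"),
    # Build a single arrival date from its year, month, and day-of-month parts
    arrival_date = paste(arrival_date_year, arrival_date_month, arrival_date_day_of_month, sep = "-"),
    arrival_date = ymd(arrival_date)
  ) %>%
  rename(average_daily_rate = adr) %>%
  # Drop cancellation-related columns, the date components, and ID-like columns
  select(-is_canceled, -reservation_status, -babies, -starts_with("arrival_date_"),
         -reservation_status_date, -agent, -company) %>%
  # Replace spaces in character values with underscores, then convert to factors
  mutate_if(is.character, ~ str_replace_all(.x, " ", "_")) %>%
  mutate_if(is.character, as.factor) %>%
  # Draw a random sample of 50,000 rows (reproducible via set.seed() above)
  sample_n(50000)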
# ------------------------------------------------------------------------------
write_csv(hotels, "hotels.csv")