@psobczyk
Created June 7, 2018 07:44
scraping data from timeanddate.com
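library(rvest)
library(dplyr)

# Collect the link to every country's holiday page
main_page <- read_html('https://www.timeanddate.com/holidays/')
all_countries <- main_page %>%
  html_nodes(xpath = '//div[@class="row"]//li//a') %>%
  html_attr(name = 'href')

# Start from an empty list so each country's table is stored as a list element
holidays <- list()
# Note: progress_estimated() is deprecated as of dplyr 1.0.0
pb <- progress_estimated(length(all_countries))
for (country in all_countries) {
  # url_absolute() resolves the href against the site root without doubling slashes
  url <- xml2::url_absolute(country, 'https://www.timeanddate.com/')
  # Keep the first table on the page and tag every row with its source link
  tmp <- read_html(url) %>%
    html_table() %>%
    .[[1]] %>%
    mutate(country = country)
  holidays[[country]] <- tmp
  pb$tick()$print()
}

# Stack the per-country tables into a single data frame
holidays <- bind_rows(holidays)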