@jrnold
Created January 19, 2019 12:54
Read/write XML sitemaps in R
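library("xml2")
library("purrr")   # lmap(), map(), map_dfr(), and the re-exported %>% pipe
library("tibble")  # as_tibble()
# Note: map_dfr() also needs the dplyr package to be installed.

# Convert one child of a <url> element (e.g. <loc>, <lastmod>) into a
# one-element named list, such as list(loc = "...").
handle_node <- function(x) {
  name <- xml_name(x)
  content <- xml_text(x)
  if (name == "lastmod") {
    # Assumes <lastmod> holds a date-only value (YYYY-MM-DD).
    content <- lubridate::ymd(content)
  }
  content <- list(content)
  names(content) <- name
  content
}

# Combine all children of a single <url> element into one named list.
# lmap() passes each child as a length-one nodeset and concatenates the
# lists returned by handle_node().
handle_url <- function(x) {
  lmap(xml_children(x), handle_node)
}

# Parse every child of the root element (each <url> under <urlset>).
handle_sitemap <- function(x) {
  map(xml_children(x), handle_url)
}

# Read a sitemap and return a tibble with one row per <url> and one
# column per child element.
read_sitemap <- function(path) {
  handle_sitemap(read_xml(path)) %>%
    map_dfr(as_tibble)
}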