@johnjosephhorton
Created November 11, 2010 16:30
How to parse HTML tables using R
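# from: http://learnr.wordpress.com/2010/01/21/ggplot2-crayola-crayon-colours/
library(XML)
library(ggplot2)  # used for plotting in the original post; not needed for parsing
theurl <- "http://en.wikipedia.org/wiki/List_of_Crayola_crayon_colors"
# htmlParse() fetches plain HTTP only; for an HTTPS page, download the text
# first (e.g. with RCurl::getURL) and pass it to htmlParse(text, asText = TRUE)
html <- htmlParse(theurl)
# readHTMLTable() returns a list with one data frame per <table> on the page
tables <- readHTMLTable(html, stringsAsFactors = FALSE)
# the colour table was the second one when this was written; the index may change
crayola <- tables[[2]]
crayola <- crayola[, c("Hex Code", "Issued", "Retired")]
names(crayola) <- c("colour", "issued", "retired")
# keep one row per hex colour
crayola <- crayola[!duplicated(crayola$colour), ]
# treat colours with no retirement year as still in production in 2010
crayola$retired[crayola$retired == ""] <- 2010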
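
The same parse-then-extract pattern can be tried without a network connection. The sketch below is only an illustration: the inline HTML string and its two rows are made-up placeholder values, not data from the Wikipedia page, and `page`, `doc`, and `demo` are hypothetical names. It assumes the `XML` package is installed and uses the `which` argument of `readHTMLTable()` to return a single data frame instead of a list.

```r
library(XML)

# Hypothetical stand-in for a downloaded page (placeholder values only).
page <- '<html><body>
<table>
  <tr><th>Name</th><th>Hex Code</th><th>Issued</th></tr>
  <tr><td>Sample A</td><td>#112233</td><td>1950</td></tr>
  <tr><td>Sample B</td><td>#445566</td><td>1960</td></tr>
</table>
</body></html>'

# asText = TRUE tells htmlParse() the argument is HTML content, not a URL or file.
doc <- htmlParse(page, asText = TRUE)

# which = 1 picks the first <table> and returns it directly as a data frame.
demo <- readHTMLTable(doc, which = 1, stringsAsFactors = FALSE)

# Column names come from the <th> row, spaces included, so quote them when subsetting.
demo <- demo[, c("Hex Code", "Issued")]
names(demo) <- c("colour", "issued")
print(demo)
```

Passing the page as text also gives one way to handle HTTPS sites: fetch the HTML with another package, then hand the resulting string to `htmlParse()` as above.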
@johnjosephhorton
Author

I got this from: http://learnr.wordpress.com/2010/01/21/ggplot2-crayola-crayon-colours/
Sticking it here b/c I find myself referring to it frequently.
