@jimhester
Created October 14, 2015 14:15
library(rvest)
reformat_table <- function(tbl) {
  # Fix the column names: the real header is the first row of the parsed table
  colnames(tbl) <- tbl[1, ]

  # Missing values for players who did not play
  dnp <- tbl[[2]] == "Did Not Play"
  tbl[dnp, -1] <- NA

  # Add a type column to signify starters and reserves
  reserves <- which(tbl[[1]] == "Reserves")[1]
  tbl$type <- "Reserve"
  tbl$type[seq_len(reserves - 1)] <- "Starter"
  tbl$type[dnp] <- "DNP"

  # Remove the header rows (first row and the "Reserves" row) and the summary row
  tbl[c(-1, -reserves, -NROW(tbl)), ]
}
# Parse the data
"http://www.basketball-reference.com/boxscores/201506140GSW.html" %>%
  read_html() %>%
  html_nodes(".stats_table[id*='_basic']") %>%
  html_table() %>%
  lapply(reformat_table)
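
To see what `reformat_table()` does without hitting the network, here is a minimal sketch that runs it on a small hand-made data frame. The `toy` table, its player names, and its numbers are invented for illustration; it only assumes that the parsed table arrives with placeholder column names and the real header in its first row, which is what the function expects. The function body is repeated so the sketch runs on its own.

```r
# Copy of reformat_table() from above, so this sketch is self-contained.
reformat_table <- function(tbl) {
  colnames(tbl) <- tbl[1, ]
  dnp <- tbl[[2]] == "Did Not Play"
  tbl[dnp, -1] <- NA
  reserves <- which(tbl[[1]] == "Reserves")[1]
  tbl$type <- "Reserve"
  tbl$type[seq_len(reserves - 1)] <- "Starter"
  tbl$type[dnp] <- "DNP"
  tbl[c(-1, -reserves, -NROW(tbl)), ]
}

# Hypothetical stand-in for one parsed box-score table (all values made up).
toy <- data.frame(
  X1 = c("Starters", "Player A", "Player B",
         "Reserves", "Player C", "Player D", "Team Totals"),
  X2 = c("MP", "35:10", "28:45", "MP", "12:05", "Did Not Play", "240"),
  X3 = c("PTS", "25", "10", "PTS", "4", "Did Not Play", "39"),
  stringsAsFactors = FALSE
)

out <- reformat_table(toy)

# Four player rows remain; the two header rows and the totals row are dropped.
print(out)
```

The `type` column ends up as `Starter` for rows above the "Reserves" marker, `Reserve` below it, and `DNP` for any player whose second column read "Did Not Play" (their stats are set to `NA`).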
@hadley

hadley commented Oct 14, 2015

See also tidyverse/rvest#111 ;)
