@seanjtaylor
Created October 1, 2016 21:03
Plotting Air Yards Auto-correlation
library(dplyr)
library(ggplot2)
library(rvest)
library(tidyr)
html.doc <- read_html('http://www.footballoutsiders.com/stat-analysis/2016/quarterbacks-and-progression-air-yards')
# Extract table
raw.table <- html.doc %>%
  html_table() %>%
  first
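# Drop the first two rows (header rows) so only the data rows remain
my.data <- raw.table %>% tail(-2)
# Use the second row of the raw table as the column names
colnames(my.data) <- as.character(unlist(raw.table[2, ]))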
# Clean up the table and convert to long format.
cleaned <- my.data %>%
  gather(year, air.yards, -Quarterback) %>%
  filter(year != 'AVG', air.yards != '-') %>%
  mutate(year = as.numeric(year),
         air.yards = as.numeric(air.yards))
# Density of air yards
cleaned %>%
  ggplot(aes(x = air.yards)) +
  geom_density()
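# Self-join to get scatter plot: shift each season back one year so it lines
# up with the same quarterback's previous season. After the join, air.yards
# is the previous season and air.yards2 is the season that followed it.
cleaned %>%
  mutate(year = (year - 1)) %>%
  rename(air.yards2 = air.yards) %>%
  inner_join(cleaned, by = c('Quarterback', 'year')) %>%
  ggplot(aes(x = air.yards, y = air.yards2)) +
  geom_point() +
  geom_smooth(method = 'lm') +
  xlab('Avg. Air Yards Previous Year') +
  ylab('Avg. Air Yards')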
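# A minimal sketch, not part of the original analysis: the same self-join can
# also give a single number for the year-over-year relationship. lagged.cor is
# a hypothetical helper name. It assumes a data frame shaped like `cleaned`
# (columns Quarterback, year, air.yards) and relies on dplyr being loaded.
lagged.cor <- function(df) {
  season.pairs <- df %>%
    mutate(year = year - 1) %>%
    rename(air.yards2 = air.yards) %>%
    inner_join(df, by = c('Quarterback', 'year'))
  # Pearson correlation between consecutive seasons of the same quarterback
  cor(season.pairs$air.yards, season.pairs$air.yards2)
}
# Usage (assumes the scrape above succeeded): lagged.cor(cleaned)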