library(nflscrapR)
library(tidyverse)
library(na.tools)
#assumes you're starting with pbp dataframe called 'games'
for_model <- games %>%
  select(half_seconds_remaining, yardline_100, down, ydstogo, goal_to_go, game_id, play_id)
#give credit for where fumble happened in EPA like how yards gained works
fix_fumbles <- function(d) {
  n <- d %>% filter(complete_pass == 1 & fumble_lost == 1 & !is.na(epa)) %>%
    select(desc, game_id, play_id, epa, posteam, half_seconds_remaining, yardline_100, down, ydstogo, yards_gained, goal_to_go, ep) %>%
    mutate(
      #save old stuff for testing/checking
      down_old = down, ydstogo_old = ydstogo, epa_old = epa,
      #update yard line, down, yards to go from play result
You need to have the roster data and rush-pass play-by-play data saved on your computer.
How to create and save those files is shown in this file here.
Run this code, making sure all the packages are installed first (run install.packages("package") for any that are missing).
Make sure to replace the instances of FILENAME where you want to save your data.
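If you are not sure which packages are already installed, a short check like the one below can help. This is only a sketch: missing_packages is a hypothetical helper name made up for illustration, and the package list is simply the three packages loaded by the code in this post.

```r
# Packages loaded by the code in this post.
needed <- c("tidyverse", "dplyr", "na.tools")

# Return the subset of `pkgs` that is not installed yet.
# `missing_packages` is a hypothetical helper, not part of any package.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(needed)

# Install only what is missing. The interactive() guard skips the install in
# batch runs, where a CRAN mirror may not be configured.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

Running this more than once is harmless, since packages that are already installed are skipped.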
library(tidyverse)
library(dplyr)
library(na.tools)
first <- 2009 #first season to grab. min available=2009
Ben Baldwin
In a follow-up to his excellent piece on the value of the run game in The Athletic (great website, highly recommended), Ted Nguyen shared the following:
"In-house NFL analytics crews track QB hits and the results of the accumulation of hits and how it affects offensive performance over the course of a game."
Does the accumulation of hits affect offensive performance over the game? Is this finally a feather in the cap for the run game defenders?
--> A beginner's guide to nflfastR <--
I get a lot of questions about how to get nflscrapR up and running. This guide is intended to help new users build interesting tables or charts from the ground up, starting from the raw nflscrapR data.
Quick word if you're new to programming: all of this is happening in R. Obviously, you need to install R on your computer to do any of this. Make sure you save what you're doing in a script (in R, File --> New script) so you can save your work and run multiple lines of code at once. To run code from a script, highlight what you want, right click, and select Run line. As you go through your R journey, you might get stuck and have to google a bunch of things, but that's totally okay and normal. That's how I wrote this thing!
@nflscrapR's Expected Points (EP) is a popular metric among analysts doing public research of play in the NFL. Detailed in the creators' research paper, the metric is derived from a model that was built as a part of a larger system designed to calculate individual wins above replacement values for offensive skill players.
The authors very graciously made public all of their data (nflscrapR-data) and code (nflWAR, nflscrapR-models, nflscrapR) for this project, including the code used to build the EP model. In the init_ep_fg_models.R file of the nflscrapR-models repository, we can see that the following variables are used