Jim Albert bayesball

@bayesball
bayesball / fivegameplayoffs.R
Created Oct 8, 2019
R script for blog post on five-game playoffs
View fivegameplayoffs.R
# plot my prior
curve(dnorm(x, .5, .05), .5, .8)
title("My Prior")
# write a function to compute the log likelihood
log_likelihood <- function(p, f){
  prob3 <- p ^ 3 + (1 - p) ^ 3
  prob4 <- 3 * p ^ 3 * (1 - p) +
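The preview is cut off partway through `prob4`. As a minimal sketch only, and not necessarily the author's code, the probabilities that a best-of-five series lasts 3, 4, or 5 games can be written out when one team wins each game independently with probability `p`. The names `series_length_probs` and `loglike_sketch`, and the reading of `f` as a vector of observed counts of 3-, 4-, and 5-game series, are assumptions made for illustration.

```r
# Hypothetical helper: probability that a best-of-five series lasts 3, 4, or 5
# games, assuming independent games and a constant win probability p for one team.
series_length_probs <- function(p) {
  q <- 1 - p
  c(
    games3 = p^3 + q^3,                  # a sweep by either team
    games4 = 3 * p^3 * q + 3 * q^3 * p,  # winner drops exactly one of the first three games
    games5 = 6 * p^2 * q^2               # series tied 2-2 after four games
  )
}

# Hypothetical multinomial log likelihood (up to an additive constant).
# ASSUMPTION: f holds the counts of 3-, 4-, and 5-game series, in that order.
loglike_sketch <- function(p, f) {
  sum(f * log(series_length_probs(p)))
}

probs <- series_length_probs(0.6)
print(probs)
print(sum(probs))  # the three probabilities sum to 1

# Made-up counts, for illustration only
print(loglike_sketch(0.6, f = c(10, 12, 8)))
```

Because swapping `p` and `1 - p` leaves all three probabilities unchanged, a likelihood built this way is symmetric about 0.5, which may be why the prior plot above is drawn only over values of 0.5 and larger.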
@bayesball
bayesball / home_run_train_test.R
Created Aug 17, 2019
Illustrates the use of a GAM fit to predict home runs and check the predictions on a new dataset
View home_run_train_test.R
# load several packages
library(tidyverse)
library(CalledStrike)
library(mgcv)
# read in Statcast data for five seasons
sc <- read_csv("five_seasons_data.csv")
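The description mentions fitting a GAM to predict home runs and checking the predictions on a new dataset. The sketch below shows one way that workflow might look with `mgcv`, using simulated data in place of `five_seasons_data.csv`. The column names `launch_speed`, `launch_angle`, and `HR`, the simulated relationship, and the train/test split are all assumptions for illustration, not the author's model.

```r
library(mgcv)

set.seed(1)
n <- 2000

# Simulated stand-in for batted-ball data (NOT real Statcast values)
sim <- data.frame(
  launch_speed = runif(n, 60, 115),
  launch_angle = runif(n, -20, 60)
)
# Made-up relationship: home runs are likelier for hard-hit balls near 28 degrees
eta <- -2 + 0.15 * (sim$launch_speed - 95) - 0.01 * (sim$launch_angle - 28)^2
sim$HR <- rbinom(n, 1, plogis(eta))

# Hold out the last 500 rows as the "new" dataset
train <- sim[1:1500, ]
test  <- sim[1501:2000, ]

# Logistic GAM with a smooth surface over launch speed and launch angle
fit <- gam(HR ~ s(launch_speed, launch_angle), family = binomial, data = train)

# Predicted home run probabilities on the held-out data
p_hat <- as.numeric(predict(fit, newdata = test, type = "response"))

# Compare the expected and observed home run counts in the test set
predicted_hr <- sum(p_hat)
observed_hr  <- sum(test$HR)
print(c(predicted = predicted_hr, observed = observed_hr))
```

Summing the predicted probabilities gives an expected home run count that can be set against the observed count in the held-out rows.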
@bayesball
bayesball / home_run_prediction_2.R
Last active Aug 2, 2019
Prediction of 2019 HR counts using random effects model
View home_run_prediction_2.R
################################################
## NEW WORK - August 2, 2019
################################################
#########################################
# data for 1631 games through August 1
#########################################
library(tidyverse)
library(CalledStrike)
@bayesball
bayesball / home_run_prediction.R
Created Jun 28, 2019
Prediction of 2019 MLB home run total at midseason (through games of June 27)
View home_run_prediction.R
# 2019 statcast data is in data frame sc2019
# collect number of home runs in each game
library(tidyverse)
sc2019 %>%
  group_by(game_pk) %>%
  summarize(HR = sum(events == "home_run",
                     na.rm = TRUE)) -> S
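Once the home runs per game are collected in `S`, a season total can be projected from them. The sketch below uses a simple bootstrap of the observed per-game counts, which is a swapped-in technique for illustration and not necessarily the method used in the gist. The data frame `S_sim`, its 1200 games, and the Poisson rate used to simulate it are made up; the only outside fact is that a full MLB regular season has 2430 games (30 teams, 162 games each, two teams per game).

```r
set.seed(2019)

# Simulated stand-in for the per-game summary S (NOT real 2019 data)
S_sim <- data.frame(game_pk = 1:1200,
                    HR = rpois(1200, lambda = 2.7))

total_games <- 30 * 162 / 2             # 2430 games in a full regular season
games_left  <- total_games - nrow(S_sim)

# Point projection: observed rate per game times the season length
point_estimate <- mean(S_sim$HR) * total_games

# Bootstrap the remaining games by resampling observed per-game counts
sim_totals <- replicate(
  1000,
  sum(S_sim$HR) + sum(sample(S_sim$HR, games_left, replace = TRUE))
)

# Rough 90% prediction interval for the season total
interval <- quantile(sim_totals, c(0.05, 0.95))
print(point_estimate)
print(interval)
```

Resampling whole games treats the remaining schedule as exchangeable with the games already played, so the interval ignores any change in the home run rate later in the season.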
@bayesball
bayesball / clean_statcast.R
Created Jun 26, 2019
R code for "cleaning" Statcast data
View clean_statcast.R
# read in SC data from the 2019 season
# through games of June 23, 2019
library(tidyverse)
library(CalledStrike)
sc2019 <- read_csv("~/Dropbox/2016 WORK/BLOG Baseball R/OTHER/StatcastData/statcast2019.csv")
sc2019_ip <- filter(sc2019, type == "X")
# focus on Statcast 2019 in-play data
@bayesball
bayesball / calledstrikework.R
Created Feb 8, 2019
R study of called balls and strikes -- uses CalledStrike package
View calledstrikework.R
# load some packages
library(baseballr)
library(tidyverse)
library(CalledStrike)
library(gridExtra)
# scrape data for four pitchers
Aaron <- scrape_statcast_savant(start_date = "2018-03-15",
@bayesball
bayesball / homerunvalue.R
Created Feb 3, 2019
Exploring values of home runs
View homerunvalue.R
# load in two packages
library(tidyverse)
library(WinProbability)
# assume dataset all2018.csv is in the current working
# directory -- these functions compute the run
# expectancies and WPA values
d2018 <- compute.runs.expectancy(2018)
@bayesball
bayesball / compute.win.probs.R
Last active Apr 22, 2019
Updated win probability functions
View compute.win.probs.R
compute.win.probs <- function(d, S){
  # adds variables P.OLD, P.NEW, and WPA
  # to retrosheet data with run expectancies
  invlogit <- function(x) exp(x) / (1 + exp(x))
  d %>%
    mutate(half.inning.row = 2 * INN_CT + BAT_HOME_ID,
           runs0 = ifelse(BAT_HOME_ID == 1,
                          HOME_SCORE_CT - AWAY_SCORE_CT + RUNS.STATE,
                          HOME_SCORE_CT - AWAY_SCORE_CT - RUNS.STATE)) -> d
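To make the two derived variables concrete, here is a small sketch that applies the same formulas to a toy data frame in base R. The three rows are invented, and only the column names come from the preview. It also compares the hand-written `invlogit` with base R's `plogis()`, which computes the same logistic function.

```r
# Toy rows with the Retrosheet-style columns used in the preview (values invented)
toy <- data.frame(
  INN_CT        = c(1, 1, 9),
  BAT_HOME_ID   = c(0, 1, 1),   # 1 when the home team is batting
  HOME_SCORE_CT = c(0, 0, 3),
  AWAY_SCORE_CT = c(0, 2, 4),
  RUNS.STATE    = c(0.5, 1.2, 0.3)
)

# Index of the half inning: visitors bat first (even), home team second (odd)
toy$half.inning.row <- 2 * toy$INN_CT + toy$BAT_HOME_ID

# Home-team lead adjusted by the batting team's run expectancy:
# added when the home team bats, subtracted when the visitors bat
toy$runs0 <- with(toy, ifelse(BAT_HOME_ID == 1,
                              HOME_SCORE_CT - AWAY_SCORE_CT + RUNS.STATE,
                              HOME_SCORE_CT - AWAY_SCORE_CT - RUNS.STATE))
print(toy)

# The hand-written inverse logit agrees with plogis() for moderate inputs
invlogit <- function(x) exp(x) / (1 + exp(x))
print(all.equal(invlogit(c(-2, 0, 2)), plogis(c(-2, 0, 2))))

# For very large inputs exp(x) overflows, so invlogit() returns NaN
# while plogis() stays finite
print(invlogit(800))
print(plogis(800))
```

Score differences in baseball are far too small to trigger the overflow, so the hand-written version is safe here; `plogis()` is simply a built-in alternative.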
@bayesball
bayesball / pitch_length.R
Created Jan 1, 2019
Explores impact of length of plate appearance using 2018 Retrosheet data
View pitch_length.R
# Load tidyverse packages and read in the Retrosheet data
library(tidyverse)
load("~/Dropbox/Google Drive/Retrosheet/pbp.2018.Rdata")
# If you have trouble getting the 2018 Retrosheet data, you can
# use one of several complete Retrosheet datasets from previous seasons;
# see http://www-math.bgsu.edu/~albert/retrosheet/
# Create new variables: pseq containing only the pitch
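The preview stops mid-comment, so the author's exact code is not shown. As a hedged sketch of one common way to measure the length of a plate appearance, the block below strips the non-pitch markers from Retrosheet-style pitch sequence strings and counts what remains. The sample strings and the choice of characters to remove (`.`, `>`, `1`, `2`, `3`, `N`, `+`, `*`) are assumptions for illustration.

```r
# Made-up pitch sequence strings in the Retrosheet style
pitch_seq <- c("CBFX", "B1.CX", "*B>S.X")

# Drop characters that mark pickoff throws, runner movement, and other
# non-pitch events, keeping only the characters that stand for pitches
pseq <- gsub("[.>123N+*]", "", pitch_seq)

# Number of pitches in each plate appearance
n_pitches <- nchar(pseq)

print(data.frame(pitch_seq, pseq, n_pitches))
```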
@bayesball
bayesball / erode_work.R
Created Dec 15, 2018
Does Plate Discipline Erode?
View erode_work.R
# read in Statcast data for two seasons
library(tidyverse)
sc <- read_csv("../StatcastData/statcast2018new.csv")
sc17 <- read_csv("../StatcastData/statcast2017.csv")
# erode function will compute individual regression estimates for all
# players who have seen 1000 called pitches
erode <- function(sc){