Jim Albert bayesball

bayesball / fivegameplayoffs.R
Created Oct 8, 2019
R script for blog post on five-game playoffs
View fivegameplayoffs.R
# plot my prior
curve(dnorm(x, .5, .05), .5, .8)
title("My Prior")
# write a function to compute the log likelihood
log_likelihood <- function(p, f){
  prob3 <- p ^ 3 + (1 - p) ^ 3
  prob4 <- 3 * p ^ 3 * (1 - p) +
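The preview above stops partway through `prob4`. As a rough sketch of the underlying idea, and assuming that one team wins each game independently with probability `p` and that `f` holds the counts of series lasting 3, 4, and 5 games (an assumption about the original argument), the series-length probabilities and a log likelihood might look like this. The names `series_length_probs` and `log_likelihood_sketch` are hypothetical and are not taken from the gist.

```r
# Sketch only: assumes independent games with a constant win probability p
# for one team. Function names here are hypothetical, not the gist's.
series_length_probs <- function(p) {
  q <- 1 - p
  prob3 <- p^3 + q^3                  # a sweep by either team
  prob4 <- 3 * p^3 * q + 3 * q^3 * p  # winner loses exactly one of the first three
  prob5 <- 6 * p^2 * q^2              # 2-2 after four games; game five decides
  c(prob3, prob4, prob5)
}

# Assumed data format: f = counts of series lasting 3, 4, and 5 games
log_likelihood_sketch <- function(p, f) {
  sum(f * log(series_length_probs(p)))
}

series_length_probs(0.5)
```

For evenly matched teams (`p = 0.5`) the three probabilities are 0.25, 0.375, and 0.375, which sum to one as they should.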
bayesball / home_run_train_test.R
Created Aug 17, 2019
Illustrates the use of a GAM fit to predict home runs and check the predictions on a new dataset
View home_run_train_test.R
# load several packages
library(tidyverse)
library(CalledStrike)
library(mgcv)
# read in Statcast data for five seasons
sc <- read_csv("five_seasons_data.csv")
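The description mentions fitting a GAM and checking its predictions on new data. A minimal sketch of that train/test pattern with `mgcv` follows. It uses simulated data rather than the Statcast file, and the model formula, the predictors `launch_angle` and `launch_speed`, and the simulation settings are all assumptions made for illustration, not the gist's actual model.

```r
library(mgcv)

# Simulated stand-in for batted-ball data (made-up values, illustration only)
set.seed(1)
n <- 2000
sim <- data.frame(launch_angle = runif(n, -20, 60),
                  launch_speed = runif(n, 60, 115))
eta <- -4 + 0.15 * (sim$launch_speed - 95) -
  0.01 * (sim$launch_angle - 28)^2
sim$HR <- rbinom(n, 1, plogis(eta))

# Hold out the last 500 rows as the "new" dataset
train <- sim[1:1500, ]
test  <- sim[1501:2000, ]

# Logistic GAM with a smooth surface over the two predictors
fit <- gam(HR ~ s(launch_angle, launch_speed),
           family = binomial, data = train)

# Predicted home run probabilities on the held-out rows
test$p_hr <- as.numeric(predict(fit, newdata = test, type = "response"))

# Compare the expected and observed home run counts in the test set
c(predicted = sum(test$p_hr), observed = sum(test$HR))
```

Comparing the sum of predicted probabilities with the observed count is one simple check; binning the predictions and comparing rates within bins would give a finer view of calibration.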
bayesball / home_run_prediction_2.R
Last active Aug 2, 2019
Prediction of 2019 HR counts using random effects model
View home_run_prediction_2.R
################################################
## NEW WORK - August 2, 2019
################################################
#########################################
# data for 1631 games through August 1
#########################################
library(tidyverse)
library(CalledStrike)
bayesball / home_run_prediction.R
Created Jun 28, 2019
Prediction of 2019 MLB home run total at midseason (through games of June 27)
View home_run_prediction.R
# 2019 statcast data is in data frame sc2019
# collect number of home runs in each game
library(tidyverse)
sc2019 %>%
  group_by(game_pk) %>%
  summarize(HR = sum(events == "home_run",
                     na.rm = TRUE)) -> S
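To make the per-game count concrete, here is a small base R sketch on a toy data frame (made-up rows, for illustration only), followed by a deliberately naive projection of a season total. The helper `project_total` is hypothetical and is not the prediction method used in the gist; it simply extends the observed mean rate over the remaining games.

```r
# Toy stand-in for Statcast rows (made-up values, for illustration only)
toy <- data.frame(
  game_pk = c(1, 1, 1, 2, 2, 3, 3, 3),
  events  = c("home_run", "single", NA, "strikeout", "home_run",
              "home_run", "home_run", "double")
)

# Home runs per game; na.rm = TRUE skips rows with no recorded event
hr_per_game <- tapply(toy$events == "home_run", toy$game_pk,
                      sum, na.rm = TRUE)

# Naive projection (hypothetical helper): observed total plus the mean
# rate applied to the remaining games. 2430 = 30 teams * 162 games / 2.
project_total <- function(hr, games_in_season = 2430) {
  sum(hr) + mean(hr) * (games_in_season - length(hr))
}

project_total(hr_per_game, games_in_season = 6)
```

A point projection like this ignores uncertainty in the rate, which is presumably why a model-based prediction is preferable for real use.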
bayesball / clean_statcast.R
Created Jun 26, 2019
R code for "cleaning" Statcast data
View clean_statcast.R
# read in Statcast data from the 2019 season
# through games of June 23, 2019
library(tidyverse)
library(CalledStrike)
sc2019 <- read_csv("~/Dropbox/2016 WORK/BLOG Baseball R/OTHER/StatcastData/statcast2019.csv")
sc2019_ip <- filter(sc2019, type == "X")
# focus on Statcast 2019 in-play data
bayesball / heat_plot.R
Created Sep 17, 2016
Constructs a heat map of the probability of a hit or home run for a specific player from PITCHf/x data
View heat_plot.R
heat_plot <- function(player, d, HR=FALSE){
  # inputs
  # player - name of player
  # d - pitchRx data frame with variables Batter, Event, and X, Z (location of pitch)
  # will output a ggplot2 object
  # need to use print function to display the plot
  require(dplyr)
  require(ggplot2)
  require(mgcv)
# define the strike zone
bayesball / shiftwork.R
Created Jun 9, 2018
Work on infield alignment data from Baseball Savant
View shiftwork.R
# load in packages
library(tidyverse)
library(ggrepel)
library(baseballr)
# read in Statcast data for 2018 season
sc <- read_csv("../StatcastData/statcast2018new.csv")
bayesball / get_statcast.R
Created Feb 8, 2018
Scrape 2017 Statcast data from Baseball Savant using baseballr package
View get_statcast.R
library(baseballr)
library(readr)
s1 <- scrape_statcast_savant_batter_all("2017-04-02",
                                        "2017-04-08")
s2 <- scrape_statcast_savant_batter_all("2017-04-09",
                                        "2017-04-15")
s3 <- scrape_statcast_savant_batter_all("2017-04-16",
                                        "2017-04-22")
s4 <- scrape_statcast_savant_batter_all("2017-04-23",
                                        "2017-04-29")
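The weekly start and end dates above follow a regular pattern, so they could be generated rather than typed by hand. The sketch below builds the same four windows in base R; the object name `windows` is hypothetical, and the commented-out call at the end is an untested suggestion for how the pairs might be passed to the scraping function.

```r
# Sketch: build the weekly windows instead of typing each pair by hand
starts <- seq(as.Date("2017-04-02"), by = "week", length.out = 4)
ends   <- starts + 6   # each window covers seven days

windows <- data.frame(start = format(starts),
                      end   = format(ends),
                      stringsAsFactors = FALSE)
windows

# Each row could then be passed to the scraper, for example (not run):
# weekly <- Map(scrape_statcast_savant_batter_all, windows$start, windows$end)
```

Raising `length.out` extends the same pattern across the rest of the season.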
bayesball / statcast_gam.R
Created Nov 20, 2017
R code to fit generalized additive model to Statcast data
View statcast_gam.R
# load in packages
library(readr)
library(dplyr)
library(ggplot2)
library(mgcv)
##### read in a theme for the title of my plots
TH <- theme(plot.title = element_text(hjust = 0.5, size = 18))
bayesball / compute.win.probs.R
Last active Apr 22, 2019
Updated win probability functions
View compute.win.probs.R
compute.win.probs <- function(d, S){
  # adds variables P.OLD, P.NEW, and WPA
  # to retrosheet data with run expectancies
  invlogit <- function(x) exp(x) / (1 + exp(x))
  d %>%
    mutate(half.inning.row = 2 * INN_CT + BAT_HOME_ID,
           runs0 = ifelse(BAT_HOME_ID == 1,
                          HOME_SCORE_CT - AWAY_SCORE_CT + RUNS.STATE,
                          HOME_SCORE_CT - AWAY_SCORE_CT - RUNS.STATE)) -> d
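A brief note on the `invlogit` helper defined above: it agrees with base R's `plogis()` for moderate inputs, but the `exp(x) / (1 + exp(x))` form overflows for very large `x`. The sketch below only compares the two. Whether such extreme values ever arise in the win probability calculation is not something the preview shows.

```r
invlogit <- function(x) exp(x) / (1 + exp(x))  # as defined in the gist

x <- c(-2, 0, 2)
all.equal(invlogit(x), plogis(x))  # the two agree on moderate inputs

invlogit(800)  # exp(800) overflows to Inf, so this is Inf / Inf = NaN
plogis(800)    # base R's logistic distribution function returns 1
```

Swapping in `plogis()` would be a drop-in change if numerical stability at the extremes ever matters.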