Jim Albert bayesball

@bayesball
bayesball / attendance.R
Last active August 29, 2015 14:24
R code to look at attendance drops for each team
# load relevant packages
library(dplyr)
library(ggplot2)
# function to download Retrosheet game log data for a particular
# season
load.gamelog <- function(season, headers){
download.file(
bayesball / pitchcount.R
Created July 16, 2015 13:12
pitch count transitions
# loads in the Retrosheet data
load("~/OneDriveBusiness/Retrosheet/pbp.2014.Rdata")
# removes all non-pitches from PITCH_SEQ_TX
pbp.14$pseq <- gsub("[.>123N+*]", "", pbp.14$PITCH_SEQ_TX)
# create a b and s sequence
pbp.14$pseq <- gsub("[BIPV]", "b", pbp.14$pseq)
pbp.14$pseq <- gsub("[CFKLMOQRST]", "s", pbp.14$pseq)
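
As a rough illustration of the three substitutions in this preview, the sketch below applies them to a few pitch sequences. The sequences are made up for the example and do not come from the Retrosheet file, and the names `pitch_seq` and `pseq` are hypothetical stand-ins for the `PITCH_SEQ_TX` and `pseq` columns.

```r
# Hypothetical pitch sequences in Retrosheet notation (invented for illustration)
pitch_seq <- c("BCFX", "B1BS>FX", "CSS", "*B.BBB")

# Step 1: drop characters that are not pitches
# (pickoff throws 1/2/3, runner-going marker >, no-pitch N, and . + * modifiers)
pseq <- gsub("[.>123N+*]", "", pitch_seq)

# Step 2: recode every ball-type pitch as "b"
pseq <- gsub("[BIPV]", "b", pseq)

# Step 3: recode every strike-type pitch as "s"
pseq <- gsub("[CFKLMOQRST]", "s", pseq)

print(pseq)
```

Codes outside the two character classes, such as `X` for a ball put in play, pass through unchanged, so each recoded string still ends with the outcome of the plate appearance.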
bayesball / groundball.plot.R
Last active August 29, 2015 14:27
Plots groundball statistics for all teams in a particular season
groundball.plot <- function(pbp, season){
require(dplyr)
require(ggplot2)
require(car)
inplay <- filter(pbp, BATTEDBALL_CD %in% c("F", "G", "L", "P"))
inplay <- mutate(inplay,
bayesball / model.data.sim.R
Last active September 19, 2015 14:02
Illustrates Model-Data Simulation to Learn About a Player's Batting Ability Based on an "ofer" Slump
# Script to Learn About Ryan Howard's Batting Ability from a "0 for 35" Slump
# Uses a function from the BayesTestStreak package
# install_github("bayesball/BayesTestStreak")
library(MASS)
library(BayesTestStreak)
# Simulate 500 at-bats with a constant hitting probability p = 0.250
bayesball / murphywork.R
Created October 24, 2015 14:55
Daniel Murphy -- Learning about his home run ability and predicting his home run output in the 2015 World Series
# Daniel Murphy Exercise
# Part I -- learning about Murphy's home run ability
# and updating this knowledge after the NLDS and NLCS
library(ggplot2)
library(LearnBayes)
# career home run data for Murphy
bayesball / plot_career_trajectory_rates.R
Last active December 29, 2015 17:11
Plots trajectories of strikeout rates, home run rates, and hit-in-play rates for players with similar batting averages
# requires packages
# dplyr, Lahman, ggplot2
# some preliminary work
library(dplyr)
library(Lahman)
get.birthyear <- function(player.id){
bayesball / server.R
Created January 5, 2014 22:34
Shiny application to fit a beta curve given the median and 90th percentile.
library(shiny)
shinyServer(function(input, output) {
output$distPlot <- renderPlot({
library(LearnBayes)
quantile1 = list(p=.5, x=input$p50)
quantile2 = list(p=.9, x=input$p90)
ab = beta.select(quantile1, quantile2)
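
The preview above uses `beta.select` from LearnBayes to find the beta shape parameters that match a stated median and 90th percentile. The sketch below illustrates the same idea in base R only, by numerically matching the two quantiles with `optim`. It is a swapped-in technique for illustration, not a description of how `beta.select` works internally, and the values of `p50` and `p90` are hypothetical stand-ins for `input$p50` and `input$p90`.

```r
# Hypothetical inputs standing in for input$p50 and input$p90
p50 <- 0.25   # assumed median
p90 <- 0.45   # assumed 90th percentile

# Squared distance between the beta quantiles and the targets.
# Optimizing on the log scale keeps both shape parameters positive.
quantile_loss <- function(log_ab) {
  ab <- exp(log_ab)
  q <- qbeta(c(0.5, 0.9), ab[1], ab[2])
  sum((q - c(p50, p90))^2)
}

# Nelder-Mead search starting from the uniform Beta(1, 1)
fit <- optim(c(0, 0), quantile_loss,
             control = list(maxit = 2000, reltol = 1e-12))
ab_fit <- exp(fit$par)

# Check how closely the fitted curve reproduces the two stated quantiles
fitted_q <- qbeta(c(0.5, 0.9), ab_fit[1], ab_fit[2])
print(ab_fit)
print(fitted_q)
```

Two quantiles give two equations for the two shape parameters, which is why a median and one upper percentile are enough to pin down a single beta curve.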
bayesball / jan.6.2014.R
Created January 5, 2014 23:47
R script to compute improved estimates of a set of batting rates.
# R Script for Jan 6, 2014 blog
# function shrink will compute improved estimates of true batting rates
# for all players with at least Lower.Bound opportunities
# be sure that packages Lahman, LearnBayes, and plyr have been installed
# before running this function
shrink <- function(year, n.var, d.var, Lower.Bound=100){
# load required packages (each should be installed first)
bayesball / server.R
Last active January 2, 2016 10:59
Shiny application to graph the average number of offensive events (H, HR, SO, etc) per game per team over the history of Major League Baseball.
library(shiny)
library(Lahman)
library(ggplot2)
# Define server logic required to plot various variables against year
shinyServer(function(input, output) {
# Compute the formula text in a reactive function since it is
# shared by the output$caption and output$mpgPlot functions
formulaText <- reactive({
bayesball / pacestudy.R
Created January 30, 2016 21:52
Exploring the pitcher pace (time between pitches) for games played in a week of the 2015 season
library(pitchRx)
library(dplyr)
library(ggplot2)
dat <- scrape(start = "2015-09-05", end = "2015-09-11")
pitches <- inner_join(select(dat$atbat,
batter_name, pitcher_name, inning,
gameday_link, num, url),
select(dat$pitch,
start_speed, pitch_type, sv_id, num, url),