Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Files for blog post Times Through the Order Effects
---
title: "TTTO"
author: "Jim Albert"
date: "3/23/2021"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,
warning = FALSE,
message = FALSE)
```
### Times Through The Order (TTTO) Effects
#### Blake Snell Story
- 2021 SABR Analytics meeting -- story included in the opening remarks by Brian Kenny
- Tampa Bay Ray facing the LA Dodgers in 2020 World Series, Game 6
- Snell had pitched great over 5 1/3 innings
- Top of the 6th -- pop up and a single
- About to face the top of the Dodgers order
- Replaced by Kevin Cash (Rays tended to avoid the third time through the order with their starters)
#### Load in packages
```{r}
library(tidyverse)
library(ggrepel)
library(Lahman)
```
Here are some utlity functions I will be using:
```{r}
increasefont <- function(){
theme(text=element_text(size=18))
}
centertitle <- function(){
theme(plot.title = element_text(colour = "blue", size = 18,
hjust = 0.5, vjust = 0.8, angle = 0))
}
bar_plot <- function (y, ...) {
TH <- theme(plot.title = element_text(colour = "blue", size = 18,
hjust = 0.5, vjust = 0.8, angle = 0))
Y <- NULL
p <- ggplot(data.frame(Y = y), aes(Y)) + geom_bar(width = 0.5,
fill = "red") + ylab("Frequency")
if (nargs() == 2)
p <- p + ggtitle(...) + TH
p
}
```
#### Read in Data
Read in 2019 Retrosheet data and do some initial work.
```{r}
load("~/Dropbox/Google Drive/Retrosheet/pbp.2019.Rdata")
```
#### Add wOBA Information
Read in woba weights for all Retrosheet codes.
```{r}
wts <- read_csv("woba_wts.csv")
```
Only consider batting plays and merge woba weights.
```{r}
d2019 %>%
filter(BAT_EVENT_FL == TRUE) %>%
inner_join(select(wts, Code, Wt19),
by = c("EVENT_CD" = "Code")) ->
d2019b
```
#### Focus on 2019 Starters
Collect the number of batters faced for all starters.
First collect for all games all starters.
```{r}
d2019 %>%
filter(BAT_HOME_ID == 0, INN_CT == 1) %>%
group_by(GAME_ID) %>%
summarize(P = first(PIT_ID),
.groups = "drop") -> S1
d2019 %>%
filter(BAT_HOME_ID == 1, INN_CT == 1) %>%
group_by(GAME_ID) %>%
summarize(P = first(PIT_ID),
.groups = "drop") -> S2
S12 <- rbind(S1, S2)
```
Function will output data for a particular starter.
```{r}
d_game_starter <- function(gid, pid){
filter(d2019b, GAME_ID == gid, PIT_ID == pid) %>%
select(GAME_ID, PIT_ID, BAT_LINEUP_ID, Wt19)
}
```
I'll apply this function across all starter/games to get the main dataset.
```{r}
games <- as.character(S12$GAME_ID)
pids <- as.character(S12$P)
out <- map2_df(games, pids, d_game_starter)
```
#### Number of batters faced?
```{r}
out %>%
group_by(GAME_ID, PIT_ID) %>%
summarize(BF = length(BAT_LINEUP_ID),
.groups = "drop") -> S_out
```
```{r}
bar_plot(S_out$BF) +
increasefont() +
ggtitle("Number of Batters Faced: 2019 Starters") +
centertitle() +
xlab("Number of Batters Faced") +
geom_vline(xintercept = c(9, 18, 27))
```
#### Order Effects
For each pitcher/game, find mean woba 1st, 2nd, 3rd time through lineup.
```{r}
out %>%
mutate(Lineup = ifelse(BAT_LINEUP_ID <= 3, "top",
ifelse(BAT_LINEUP_ID <= 6,
"middle", "bottom"))) %>%
group_by(GAME_ID, PIT_ID, BAT_LINEUP_ID) %>%
summarize(Lineup = first(Lineup),
N = sum(is.na(Wt19) == FALSE),
woba1 = nth(Wt19, 1),
woba2 = nth(Wt19, 2),
woba3 = nth(Wt19, 3),
.groups = "drop") -> S_out2
```
Also find average wOBA for each pitcher and merge these summaries with S_out2 dataset.
```{r}
out %>%
group_by(PIT_ID) %>%
summarize(woba = mean(Wt19, na.rm = TRUE)) ->
S_out4
inner_join(S_out2, S_out4,
by = "PIT_ID") -> S_out2
```
Compute differences.
```{r}
S_out2 %>%
group_by(PIT_ID) %>%
summarize(N = sum(N),
woba = first(woba),
d21 = mean(woba2 - woba1, na.rm = TRUE),
d32 = mean(woba3 - woba2, na.rm = TRUE),
d31 = mean(woba3 - woba1, na.rm = TRUE)) ->
S_out3
```
#### Graphs
First add first and last name data from Lahman database.
```{r}
inner_join(S_out3,
select(Master, retroID,
nameLast, nameFirst),
by = c("PIT_ID" = "retroID")) -> S_out3
S_out3$Name <- paste(S_out3$nameFirst, S_out3$nameLast)
```
Graph of d21 against d31.
```{r}
S_out3 %>%
filter(N >= 500) %>%
ggplot(aes(d21, d31,
label = Name)) +
geom_point() +
geom_hline(yintercept = 0, color = "red") +
geom_vline(xintercept = 0, color = "red") +
xlim(-.168, .168) +
ylim(-.168, .168) +
coord_fixed() +
geom_text_repel(data =
filter(S_out3, N >= 500,
d31 < -0.1,
d21 < 0)) +
geom_text_repel(data =
filter(S_out3, N >= 500,
d21 < -0.1)) +
geom_text_repel(data =
filter(S_out3, N >= 500,
d21 > 0.1, d31 > 0.152)) +
increasefont() +
ggtitle("Change in wOBA from 1st TTTO") +
centertitle() +
xlab("wOBA 2 - wOBA 1") +
ylab("wOBA 3 - wOBA 1") +
geom_text_repel(data =
filter(S_out3,
Name == "Blake Snell")) +
geom_point(data =
filter(S_out3, N >= 500,
d31 < -0.1,
d21 < 0),
color = "red") +
geom_point(data =
filter(S_out3, N >= 500,
d21 < -0.1),
color = "red") +
geom_point(data =
filter(S_out3, N >= 500,
d21 > 0.1, d31 > 0.152),
color = "red") +
geom_point(data =
filter(S_out3,
Name == "Blake Snell"),
color = "red")
```
Graph of d21 against d32.
```{r}
S_out3 %>%
filter(N >= 500) %>%
ggplot(aes(d21, d32, label = Name)) +
geom_point() +
geom_hline(yintercept = 0, color = "red") +
geom_vline(xintercept = 0, color = "red") +
coord_fixed() +
xlim(-.168, .168) +
ylim(-.168, .168) +
increasefont() +
ggtitle("Successive Change in wOBA") +
centertitle() +
xlab("wOBA 2 - wOBA 1") +
ylab("wOBA 3 - wOBA 2") +
geom_text_repel(data =
filter(S_out3,
Name == "Blake Snell")) +
geom_point(data =
filter(S_out3,
Name == "Blake Snell"),
color = "red")
```
Graph of difference and overall wOBA.
```{r}
S_out3 %>%
filter(N >= 500) %>%
ggplot(aes(woba, d21, label = Name)) +
geom_point() +
geom_hline(yintercept = 0, color = "red") +
increasefont() +
centertitle() +
ggtitle("Scatterplot of wOBA and TTTO Effect") +
xlab("Overall wOBA") +
ylab("wOBA2 - wOBA1") +
geom_text_repel(data = filter(S_out3,
Name == "Blake Snell")) +
geom_point(data = filter(S_out3,
Name == "Blake Snell"),
color = "red")
```
```{r}
S_out3 %>%
filter(N >= 500) %>%
ggplot(aes(woba, d31, label = Name)) +
geom_point() +
geom_hline(yintercept = 0, color = "red") +
increasefont() +
centertitle() +
ggtitle("Scatterplot of wOBA and TTTO Effect") +
xlab("Overall wOBA") +
ylab("wOBA3 - wOBA1") +
geom_text_repel(data = filter(S_out3,
Name == "Blake Snell")) +
geom_point(data = filter(S_out3,
Name == "Blake Snell"),
color = "red")
```
```{r}
S_out3 %>%
filter(N >= 500) %>%
ggplot(aes(woba, d32, label = Name)) +
geom_point() +
geom_hline(yintercept = 0, color = "red") +
increasefont() +
centertitle() +
ggtitle("Scatterplot of wOBA and TTTO Effect") +
xlab("Overall wOBA") +
ylab("wOBA3 - wOBA2") +
geom_text_repel(data = filter(S_out3,
Name == "Blake Snell")) +
geom_point(data = filter(S_out3,
Name == "Blake Snell"),
color = "red")
```
Code Meaning Wt19
0 Unknown event 0
1 No event 0
2 Generic out 0
3 Strikeout 0
4 Stolen base 0
5 Defensive indifference 0
6 Caught stealing 0
7 Pickoff error 0
8 Pickoff 0
9 Wild pitch 0
10 Passed ball 0
11 Balk 0
12 Other advance 0
13 Foul error 0
14 Walk 0.69
15 Intentional walk 0
16 Hit by pitch 0.719
17 Interference 0
18 Error 0
19 Fielder’s choice 0
20 Single 0.87
21 Double 1.217
22 Triple 1.529
23 Home run 1.94
24 Missing play 0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment