Created
June 21, 2020 21:41
---
title: "ggplot2 intro 2"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,
                      warning = FALSE,
                      message = FALSE)
```
Load the packages and read in the 2017 Statcast data.
```{r}
library(data.table)
library(ggplot2)
sc <- fread("https://raw.githubusercontent.com/bayesball/ABWRdata/master/data/statcast2017.txt")
```
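We compare the 2017 season performances of two Phillies pitchers, Aaron Nola and Nick Pivetta.
Create a data frame with the following variables:

- Pitcher: name of the pitcher
- pitch_type: type of pitch thrown
- plate_x, plate_z: pitch location
- type: pitch outcome (the value "X" marks a ball in play)
- woba_value: wOBA value of the outcome
- release_speed: release speed of the pitch
- pfx_x, pfx_z: horizontal and vertical break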
```{r}
# Keep only pitches thrown by the two pitchers (selected by player ID)
two_pitchers <- sc[pitcher %in% c(601713, 605400)]
# Add a readable name column by reference
two_pitchers[, Pitcher := ifelse(pitcher == 601713,
                                 "Nick Pivetta", "Aaron Nola")]
# Keep only the variables listed above
two_pitchers <- two_pitchers[, .(Pitcher, pitch_type, plate_x,
                                 plate_z, type, woba_value,
                                 release_speed, pfx_x, pfx_z)]
```
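The three data.table steps used here (filter rows, add a column by reference with `:=`, select columns with `.()`) can be tried on a small table first. This is only a sketch: the table `toy`, its IDs, and its values are made up for illustration and are not Statcast records.

```r
library(data.table)

# Toy data: made-up IDs and speeds, not real Statcast records
toy <- data.table(
  pitcher       = c(1, 1, 2, 2, 3),
  pitch_type    = c("FF", "CU", "FF", "CH", "FF"),
  release_speed = c(94.1, 78.3, 92.8, 85.0, 96.2)
)

# 1. Row filter: keep only pitchers 1 and 2 (this returns a new table)
sub <- toy[pitcher %in% c(1, 2)]

# 2. Add a label column by reference with :=
sub[, Pitcher := ifelse(pitcher == 1, "Pitcher A", "Pitcher B")]

# 3. Keep selected columns with .()
sub <- sub[, .(Pitcher, pitch_type, release_speed)]
print(sub)
```

Because the row filter returns a new table, the `:=` step adds `Pitcher` to `sub` only and leaves `toy` unchanged.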
#### What pitches do they throw?
```{r}
(S <- two_pitchers[, .N,
                   by = .(Pitcher, pitch_type)])
```
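Raw counts depend on how many pitches each pitcher threw in total, so within-pitcher proportions may be easier to compare. The sketch below assumes a table shaped like `S` (columns `Pitcher`, `pitch_type`, `N`). The table `S_toy`, its counts, and the `Prop` column are made up for illustration.

```r
library(data.table)

# Made-up counts in the same shape as S (Pitcher, pitch_type, N)
S_toy <- data.table(
  Pitcher    = c("A", "A", "A", "B", "B"),
  pitch_type = c("FF", "CU", "CH", "FF", "SL"),
  N          = c(600, 300, 100, 450, 50)
)

# Share of each pitch type within each pitcher's total
S_toy[, Prop := N / sum(N), by = Pitcher]
print(S_toy)
```

Mapping `Prop` instead of `N` to the `y` aesthetic would then put both pitchers on a common scale.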
Construct a bar plot of the pitch-type frequencies for each pitcher.
```{r}
ggplot(S, aes(x = Pitcher, y = N,
              fill = pitch_type)) +
  geom_bar(stat = "identity", position = "dodge")
```
An alternative display uses a point geom instead of a bar geom.
```{r}
ggplot(S, aes(x = Pitcher, y = N,
              color = pitch_type)) +
  geom_point()
```
#### Compare release speeds of the pitchers
```{r}
ggplot(two_pitchers[pitch_type != "SL"],
       aes(Pitcher, release_speed)) +
  geom_jitter() +
  facet_wrap(~ pitch_type)
```
Instead of jittered points, we can use boxplots to summarize the speeds.
```{r}
ggplot(two_pitchers[pitch_type != "SL"],
       aes(Pitcher, release_speed)) +
  geom_boxplot() +
  facet_wrap(~ pitch_type)
```
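A boxplot is drawn around the quartiles of each group, and the same numbers can be tabulated directly. This is a minimal sketch on made-up speeds. The names `toy` and `speed_summary` are hypothetical, and on the real data the grouping would presumably be `by = .(Pitcher, pitch_type)`.

```r
library(data.table)

# Made-up release speeds for two hypothetical pitchers
toy <- data.table(
  Pitcher       = rep(c("A", "B"), each = 5),
  release_speed = c(90, 91, 92, 93, 94, 84, 86, 88, 90, 92)
)

# Quartiles and median by group; na.rm guards against missing speeds
speed_summary <- toy[, .(
  Q1     = quantile(release_speed, 0.25, na.rm = TRUE),
  Median = median(release_speed, na.rm = TRUE),
  Q3     = quantile(release_speed, 0.75, na.rm = TRUE)
), by = Pitcher]
print(speed_summary)
```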
Or we could try parallel violin plots.
```{r}
ggplot(two_pitchers[pitch_type != "SL"],
       aes(Pitcher, release_speed)) +
  geom_violin() +
  facet_wrap(~ pitch_type)
```
#### Compare horizontal and vertical breaks
```{r}
ggplot(two_pitchers[pitch_type != "SL"],
       aes(pfx_x, pfx_z, color = Pitcher)) +
  geom_point() +
  facet_wrap(~ pitch_type)
```
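When the point clouds overlap heavily, average breaks per pitcher and pitch type can help in reading the scatterplot. The sketch below uses made-up break values. The names `toy`, `break_means`, `mean_pfx_x`, and `mean_pfx_z` are hypothetical.

```r
library(data.table)

# Made-up break values, not real Statcast measurements
toy <- data.table(
  Pitcher    = c("A", "A", "A", "B", "B", "B"),
  pitch_type = c("FF", "FF", "CU", "FF", "FF", "CU"),
  pfx_x      = c(-0.5, -0.7, 0.8, -0.2, -0.4, 1.0),
  pfx_z      = c(1.4, 1.6, -0.9, 1.2, 1.0, -0.5)
)

# Mean horizontal and vertical break for each pitcher and pitch type
break_means <- toy[, .(
  mean_pfx_x = mean(pfx_x, na.rm = TRUE),
  mean_pfx_z = mean(pfx_z, na.rm = TRUE)
), by = .(Pitcher, pitch_type)]
print(break_means)
```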
#### Compare pitch locations
To show pitch locations relative to the strike zone, define a helper function add_zone() that returns a ggplot2 layer drawing the outline of the zone.
```{r}
add_zone <- function(Color = "red") {
  # Boundaries of the zone
  topKzone <- 3.5
  botKzone <- 1.6
  inKzone <- -0.85
  outKzone <- 0.85
  # Corner points traced in order, returning to the starting corner
  kZone <- data.frame(
    x = c(inKzone, inKzone, outKzone, outKzone, inKzone),
    y = c(botKzone, topKzone, topKzone, botKzone, botKzone)
  )
  geom_path(aes(.data$x, .data$y), data = kZone, lwd = 1, col = Color)
}
```
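Because the zone is a plain rectangle, an alternative way to draw it is with `annotate("rect", ...)`. The sketch below is not the author's function: `add_zone_rect()` is a hypothetical name, the toy points are made up, and the boundary values are copied from `add_zone()`.

```r
library(ggplot2)

# Hypothetical alternative to add_zone(): draw the zone as an unfilled rectangle
add_zone_rect <- function(Color = "red") {
  annotate("rect",
           xmin = -0.85, xmax = 0.85,  # same horizontal bounds as add_zone()
           ymin = 1.6, ymax = 3.5,     # same vertical bounds as add_zone()
           fill = NA, colour = Color)
}

# Made-up pitch locations, used only to show the layer being added
toy <- data.frame(plate_x = c(-1.2, 0.1, 0.6, 1.4),
                  plate_z = c(1.0, 2.5, 3.0, 3.9))

zone_layer <- add_zone_rect("black")
p <- ggplot(toy, aes(plate_x, plate_z)) +
  geom_point() +
  zone_layer +
  coord_fixed()  # equal scaling on both axes so the zone is not distorted
```

Like the `geom_path()` version, an annotation layer is drawn in every panel when the plot is faceted.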
```{r}
ggplot(two_pitchers[pitch_type != "SL"],
       aes(plate_x, plate_z)) +
  geom_point(aes(color = Pitcher)) +
  facet_wrap(~ pitch_type) +
  add_zone('black')
```
Focus on the locations of curveballs and four-seamers.
```{r}
ggplot(two_pitchers[pitch_type %in% c("FF", "CU")],
       aes(plate_x, plate_z)) +
  geom_point() +
  facet_grid(pitch_type ~ Pitcher) +
  add_zone('red')
```
#### What happens to balls in play?
```{r}
# type == "X" keeps balls in play; woba_value is converted to
# character so that it is mapped to a discrete color scale
ggplot(two_pitchers[pitch_type %in% c("FF", "CU") & type == "X"],
       aes(plate_x, plate_z)) +
  geom_point(aes(color = as.character(woba_value))) +
  facet_grid(pitch_type ~ Pitcher) +
  add_zone('black')
```
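A numeric companion to this plot is the average `woba_value` on balls in play for each pitcher and pitch type. The sketch below runs on made-up rows. The names `toy`, `in_play`, and `mean_woba` are hypothetical, and the values are not real outcomes.

```r
library(data.table)

# Made-up rows: only type == "X" (ball in play) carries a woba_value here
toy <- data.table(
  Pitcher    = c("A", "A", "A", "B", "B", "B"),
  pitch_type = c("FF", "FF", "CU", "FF", "CU", "CU"),
  type       = c("X", "X", "S", "X", "X", "B"),
  woba_value = c(0, 0.9, NA, 2.0, 0, NA)
)

# Count and mean wOBA value on balls in play, by pitcher and pitch type
in_play <- toy[type == "X",
               .(N = .N, mean_woba = mean(woba_value, na.rm = TRUE)),
               by = .(Pitcher, pitch_type)]
print(in_play)
```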