@cpsievert
Last active October 13, 2015 22:23
Function to get time-sequenced spatial locations of a baseball (optionally summarized over an arbitrary set of variables) using PITCHf/x
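# install any missing package, then attach it (install.packages() alone does not load it)
if (!require("dplyr")) { install.packages("dplyr"); library("dplyr") }
if (!require("tidyr")) { install.packages("tidyr"); library("tidyr") }
if (!require("pitchRx")) { install.packages("pitchRx"); library("pitchRx") }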
getLocations <- function(dat, ..., summarise = TRUE) {
  # select and group by columns specified in ...
  tb <- dat %>%
    select(..., x0:az) %>%
    group_by(...)
  vars <- as.character(attr(tb, "vars"))
  if (summarise) {
    # average the PITCHf/x parameters over variables specified in ...
    labs <- attr(tb, "labels")
    tb <- tb %>% summarise_each(funs(mean))
  } else {
    # another (more complex) way to get the variable names:
    # vars <- as.character(as.list(match.call(expand.dots = FALSE))$...)
    dat$pitch_id <- seq_len(nrow(dat))
    vars <- c(vars, "pitch_id")
    labs <- dat[vars]
  }
  # returns 3D array of locations of pitches over time
  value <- pitchRx::getSnapshots(as.data.frame(tb))
  idx <- labs %>% unite_("id", vars, sep = "@&")
  dimnames(value) <- list(idx = as.data.frame(idx)[, 1],
                          frame = seq_len(dim(value)[2]),
                          coordinate = c("x", "y", "z"))
  # tidy things up in a format that ggplot would expect
  value %>% as.tbl_cube() %>% as.data.frame() %>% rename_(value = ".") %>%
    mutate(idx = as.character(idx)) %>%
    separate(idx, vars, sep = "@&") %>%
    spread(coordinate, value)
}
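
For context on what the "locations of pitches over time" represent: PITCHf/x describes each pitch with nine parameters, an initial position (x0, y0, z0), an initial velocity (vx0, vy0, vz0), and a constant acceleration (ax, ay, az), so the location at time t follows the usual constant-acceleration formula. The base R sketch below illustrates that model for a single pitch. It is not the pitchRx implementation of getSnapshots; the helper name `trajectory` and the parameter values are made up for illustration.

```r
# Sketch only: constant-acceleration trajectory from the nine PITCHf/x
# parameters. `trajectory` and `pitch` are hypothetical names, and this is
# not how pitchRx::getSnapshots is necessarily implemented.
trajectory <- function(p, times) {
  data.frame(
    time = times,
    x = p$x0 + p$vx0 * times + 0.5 * p$ax * times^2,
    y = p$y0 + p$vy0 * times + 0.5 * p$ay * times^2,
    z = p$z0 + p$vz0 * times + 0.5 * p$az * times^2
  )
}

# Made-up values (feet, ft/s, ft/s^2), chosen only to look fastball-like
pitch <- list(x0 = -1.5, y0 = 50, z0 = 6,
              vx0 = 5, vy0 = -130, vz0 = -5,
              ax = -10, ay = 30, az = -20)

# One row per time step, with x, y, z columns like the output of getLocations
locs <- trajectory(pitch, times = seq(0, 0.4, by = 0.01))
head(locs, 3)
```

Each row of `locs` is one frame, which mirrors the frame/x/y/z layout that getLocations returns after `spread()`.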
@danmalter

I think there is a bug related to the group_by(...) call. When I run
dat <- getLocations(pitches, pitcher_name, pitch_type, summarise = TRUE)
I get the error "Error: index out of bounds".

@cpsievert

Thanks! This should work now.

@cjrhp

cjrhp commented Oct 13, 2015

Hi Carson,
I downloaded PITCHf/x data for all of 2014 and joined the atbat and pitch tables for Yu Darvish. But when I run the getLocations function, R keeps showing "Error in 0:(nplots - 1) : NA/NaN argument". Do you have any idea why this function doesn't work?
Thanks so much!
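
The thread does not identify the cause of this error. One guess (an assumption, not something confirmed here) is that some rows of the joined data have missing or non-numeric values in the columns x0 through az, which could turn a derived quantity such as `nplots` into NA. A base R sketch of how one might check for that before calling getLocations follows; `params_ok` and `joined` are hypothetical names, and the toy data frame stands in for a real atbat/pitch join.

```r
# Hypothetical diagnostic, not part of pitchRx: flag rows whose nine
# PITCHf/x parameters are all present and numeric.
params <- c("x0", "y0", "z0", "vx0", "vy0", "vz0", "ax", "ay", "az")

params_ok <- function(dat) {
  # coerce each parameter column to numeric (database columns may be character)
  num <- lapply(dat[params], function(col) {
    suppressWarnings(as.numeric(as.character(col)))
  })
  complete.cases(as.data.frame(num))
}

# Toy stand-in for the joined data: the second pitch is missing x0
joined <- data.frame(
  x0 = c(-1.5, NA), y0 = c(50, 50), z0 = c(6, 6),
  vx0 = c(5, 5), vy0 = c(-130, -130), vz0 = c(-5, -5),
  ax = c(-10, -10), ay = c(30, 30), az = c(-20, -20)
)

ok <- params_ok(joined)
table(ok)              # how many rows have usable parameters
clean <- joined[ok, ]  # keep only rows with all nine parameters present
```

If every row passes this check and the error persists, the cause lies elsewhere.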
