mrcaseb / statsbomb.R
Created November 29, 2023 09:25
How to bring Statsbomb tracking data into a tidy form
## HOW TO USE THIS ##
# Go to https://github.com/statsbomb/amf-open-data/tree/main#getting-started
# and download the zipped JSON files for the individual seasons, which are hosted on AWS S3.
#
# Alternatively, use the links below directly:
# https://statsbomb-amf-open-data.s3.eu-west-2.amazonaws.com/tracking/SB_tracking_TB12DB_2021.zip
# https://statsbomb-amf-open-data.s3.eu-west-2.amazonaws.com/tracking/SB_tracking_TB12DB_2022.zip
#
# Unzip those files to a local directory. That directory will contain deeply nested JSON files
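The download-and-unzip step described above could be scripted roughly as follows. This is only a sketch: `find_tracking_json()` and its `data_dir` argument are hypothetical names that do not appear in the gist, and the download lines are commented out because the archives are large.

```r
# Hypothetical helper (not part of the original gist): list every JSON file
# below a directory, no matter how deeply it is nested.
find_tracking_json <- function(data_dir) {
  list.files(
    path = data_dir,
    pattern = "\\.json$",
    recursive = TRUE,
    full.names = TRUE
  )
}

# Download and unzip one season. The URL is taken from the instructions above;
# the target directory name "statsbomb_tracking" is an assumption.
# zip_url <- "https://statsbomb-amf-open-data.s3.eu-west-2.amazonaws.com/tracking/SB_tracking_TB12DB_2021.zip"
# zip_file <- file.path(tempdir(), basename(zip_url))
# download.file(zip_url, zip_file, mode = "wb")
# unzip(zip_file, exdir = "statsbomb_tracking")
# json_files <- find_tracking_json("statsbomb_tracking")
```

Searching recursively avoids hard-coding the folder layout inside the archives, which the instructions only describe as deeply nested.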
mrcaseb / load_nflmockdraftdatabase_consensus_board.R
Last active September 11, 2023 21:38
Scrape nflmockdraftdatabase consensus board
# Scrape nflmockdraftdatabase consensus board from
# https://www.nflmockdraftdatabase.com
load_nflmockdraftdatabase_consensus_board <- function(year){
cli::cli_progress_step("Loading {.val {year}}. Please be patient, the parser takes a while.")
raw <- glue::glue("https://www.nflmockdraftdatabase.com/big-boards/{year}/consensus-big-board-{year}") |>
rvest::read_html()
mock_list <- raw |>
rvest::html_elements(xpath = "//*[@class='mock-list-item']")
mrcaseb / gt_table_align_decimal_separator.R
Last active November 11, 2022 20:55
A suggestion for how to align a column in a gt table on the decimal separator without monospace fonts
library(magrittr)
df <- tibble::tibble(
var_a = c("a", "b", "c", "d"),
var_b = c(1.234, 12.34, 123.4, 1234)
)
num_to_html <- function(num, separator = ".") {
stringr::str_split_fixed(num, stringr::fixed(separator), n = 2) %>%
as.data.frame() %>%
mrcaseb / load_538_games.R
Created October 4, 2022 18:28
Load historical game data from 538
load_538_games <- function(seasons = "SB_ERA"){
s <- nflreadr::csv_from_url("https://projects.fivethirtyeight.com/nfl-api/nfl_elo.csv") |>
tibble::as_tibble() |>
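# na_if() errors on a whole data frame in dplyr >= 1.1.0, so apply it per character column
dplyr::mutate(dplyr::across(where(is.character), ~ dplyr::na_if(.x, ""))) |>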
dplyr::select(
season,
season_type = playoff,
away_team = team2,
away_score = score2,
home_score = score1,
#' Remove Bookmaker’s Vig from a Vector of Probabilities
#' @description This function implements the iterative power method to adjust
#' implied probabilities suggested by Vovk and Zhdanov (2009) and Clarke (2016),
#' where bookmakers’ implied probabilities are raised to a fixed power, which
#' never produces bookmaker or fair probabilities outside the 0-1 range
#' (upper limit can be changed) and allows for the favorite long-shot bias.
#' @param probs A vector of probabilities from which to remove the bookmaker’s
#' vig. The probabilities will be adjusted to sum up to `sum_to`.
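The power adjustment described in the documentation above can be sketched in a few lines of base R. This is not the gist's implementation: instead of an explicit iteration it uses `stats::uniroot()` to find the exponent, and the function name `remove_vig_power()` is hypothetical. It assumes all probabilities lie strictly between 0 and 1 and that there are more outcomes than `sum_to`.

```r
# Hypothetical sketch, not the original function: find the exponent k for
# which sum(probs^k) equals sum_to, then return the adjusted probabilities.
remove_vig_power <- function(probs, sum_to = 1) {
  stopifnot(
    is.numeric(probs),
    all(probs > 0),
    all(probs < 1),
    length(probs) > sum_to
  )
  # sum(probs^k) decreases monotonically in k, so there is a single root.
  target <- function(k) sum(probs^k) - sum_to
  k <- stats::uniroot(target, interval = c(1e-3, 100), tol = 1e-10)$root
  probs^k
}

# Two implied probabilities that sum to 1.05 (a 5% overround).
fair <- remove_vig_power(c(0.55, 0.50))
sum(fair)
```

Because every probability is raised to the same power, results stay inside the 0-1 range, and small probabilities shrink proportionally more than large ones, which is how the method accounts for the favorite long-shot bias.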
library(tidyverse)
drafts <- nflreadr::load_draft_picks()
draft_order <- rvest::read_html("https://en.wikipedia.org/wiki/2022_NFL_Draft") |>
rvest::html_table() |>
purrr::pluck(5) |>
janitor::clean_names() |>
select(round = rnd, pick = pick_no, team = nfl_team, notes) |>
mutate(across(c(round, pick), as.numeric)) |>
library(dplyr, warn.conflicts = FALSE)
library(ggplot2)
preds <- nflreadr::csv_from_url("https://raw.githubusercontent.com/nflverse/nfldata/master/data/predictions.csv")
g <- nflreadr::load_schedules(2021)
points <- preds |>
filter(prediction != 50) |>
left_join(g |> select(game_id, week, result), by = "game_id") |>
library(tidyverse)
library(nflverse)
#> ── Attaching packages ─────────────────────────────────── nflverse 1.0.1.9000 ──
#> ✓ nflfastR 4.3.0.9008 ✓ nflreadr 1.1.3
#> ✓ nflseedR 1.0.2.9001 ✓ nflplotR 1.0.0.9000
#> ✓ nfl4th 1.0.1.9000
#> ──────────────────────────────────────────────────────────────── Ready to go! ──
options(dplyr.summarise.inform = FALSE)
options(nflreadr.verbose = FALSE)
s <- nflreadr::load_schedules(TRUE)
mrcaseb / special_teams_epa.R
Created November 22, 2021 15:52
How to compute special teams EPA split by kicking and receiving
library(dplyr)
st_plays <- nflreadr::load_pbp(2021) |>
filter(!is.na(epa) & touchback != 1 & special == 1)
epa_kicking <- st_plays |>
group_by(team = posteam) |>
mutate(epa = dplyr::if_else(play_type == "kickoff", -epa, epa)) |>
summarize(epa_kick = mean(epa), plays_kick = n())