Simon P. Couch simonpcouch

Supporting older R versions in broom

In preparing the most recent release of the broom package, I’ve run into some headaches related to the large number of Suggests, which makes it very difficult to support older versions of R. I’m considering revising the package’s approach to managing dependencies such that broom can “support” a package’s output without including it in Suggests.

Setup

Data usage when post-processing workflows

Simon Couch

# with tidymodels/container#12 and tidymodels/workflows#225
library(tidymodels)
── Attaching packages ────────────────────────────────────── tidymodels 1.2.0 ──
simonpcouch / fair_ml_timeline.md
Last active November 29, 2023 19:32
Timeline for machine learning fairness implementations in tidymodels

Fair machine learning with tidymodels: timeline

For context on this document, see the fairness reading group summary.

Releases, round 1: Assessment and critique

The first component of fairness metrics functionality, assessment, is completed and merged into the main dev versions of tidymodels.

simonpcouch / arguments_with_submodels.md
Created October 11, 2023 18:52
tidymodels arguments with submodels
library(tidyverse)
library(parsnip)
x <- lapply(parsnip:::extensions(), require, character.only = TRUE)
#> Loading required package: baguette
#> Loading required package: censored
#> Loading required package: survival
#> Loading required package: discrim
#> Loading required package: multilevelmod
#> Loading required package: plsmod
simonpcouch / chiburbs.R
Created June 22, 2023 15:58
Chicago Suburbs Housing Data
library(tidymodels)
library(tidyverse)
library(stringr)
library(janitor)
library(doMC)
registerDoMC(cores = max(1, parallelly::availableCores() - 1))
# data cleaning --------
# we'd likely just do all this cleaning under the hood and supply
# the `chiburbs` result as the "initial" dataset
# benchmarking the new parsnip release
library(tidymodels)
# with v1.0.2 ------------------------------------------------------------
pak::pkg_install("tidymodels/parsnip@v1.0.2")
num_samples <- 10^(3:7)
num_resamples <- c(5, 10, 20)
nrow <- length(num_samples) * length(num_resamples)
library(tidymodels)
library(cli)

The tune package has machinery to catch and log errors and warnings that occur while evaluating proposed models against resamples.

At the moment, we print those warnings/errors out one-by-one as they appear during evaluation.
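As a rough illustration of what "catch and log" can mean here, the sketch below collects warnings and errors with base R condition handlers instead of printing them as they occur. It is not tune's actual machinery: `catch_and_log()` and the toy `sqrt()` workload are hypothetical names chosen for this example.

```r
# Hypothetical helper (not part of tune): evaluate `expr`, recording any
# warnings and the error message (if any) rather than printing them.
catch_and_log <- function(expr) {
  logged_warnings <- character()
  logged_error <- NULL

  result <- withCallingHandlers(
    tryCatch(
      expr,
      error = function(e) {
        # Record the error and return NULL in place of a result.
        logged_error <<- conditionMessage(e)
        NULL
      }
    ),
    warning = function(w) {
      # Record the warning, then resume evaluation without printing it.
      logged_warnings <<- c(logged_warnings, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )

  list(result = result, warnings = logged_warnings, error = logged_error)
}

# Toy stand-in for "evaluating a model against a resample".
fits <- lapply(c(4, -1, NA), function(x) {
  catch_and_log({
    if (is.na(x)) stop("missing value")
    sqrt(x)
  })
})

# Number of logged warnings per evaluation.
vapply(fits, function(f) length(f$warnings), integer(1))
```

Because the conditions are stored rather than printed, they can be tallied or summarised after evaluation finishes.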

Proposed modifications to the internals of tune:::check_grid.


  tune_tbl <- tune_args(workflow)
  tune_params <- tune_tbl$id

  if (nrow(pset) == 0L) {
    msg <- c("!" = "No tuning parameters have been detected; performance will be
 evaluated using the resamples with no tuning.")
library(tidymodels)
library(stacks)
library(bonsai)

tidymodels_prefer()

# regression ------------------------------------------------------------------
reg_bt <-
  boost_tree(mtry = tune()) %>%

auditing one-to-many join warnings

With dev dplyr, we now see:

# pak::pak("tidyverse/dplyr")
library(parsnip)

mod <- 
 linear_reg(engine = 'glmnet', penalty = tune(), mixture = 1) %>%