Simon P. Couch simonpcouch

An issue was recently filed in stacks about the object size of a stack increasing on save and reload.

Starting out with a quick reprex:

library(tidymodels)
library(modeldata)
library(readr)
#> 

This issue came up in a conversation with @mine-cetinkaya-rundel about teaching introductory stats / modeling courses using tidymodels. I feel that, in some ways, parsnip’s guardrails re: augment make teaching broom’s principles fussier than it ought to be. Fitting a model and passing it to each tidier:

library(tidyverse)
library(tidymodels)
library(palmerpenguins)

penguins <- drop_na(penguins)

penguins_tr <- penguins[1:200,]
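
The preview above cuts off before the model is fit. As a rough illustration of what "passing it to each tidier" means, here is a minimal sketch that assumes a plain `lm()` fit on the built-in `mtcars` data rather than the gist's parsnip model and penguins data; the object names (`fit`, `coefs`, `stats`, `preds`) are made up for this example.

```r
library(broom)

# Assumption: a base-R lm() fit on mtcars stands in for the gist's model.
train <- mtcars[1:20, ]
test  <- mtcars[21:32, ]

fit <- lm(mpg ~ wt + hp, data = train)

# The three broom tidiers, each returning a tibble:
coefs <- tidy(fit)                    # one row per model term
stats <- glance(fit)                  # one row of model-level summaries
preds <- augment(fit, newdata = test) # test rows plus a .fitted column

coefs
stats
preds
```

Note that broom's `augment()` method for `lm` takes `newdata`, while parsnip's `augment()` method for a fitted parsnip model takes `new_data`.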
> devtools::test()
ℹ Loading agua
ℹ Testing agua
✔ | F W S  OK | Context
⠏ |         0 | misc                                                                          openjdk version "11.0.15" 2022-04-19
OpenJDK Runtime Environment Homebrew (build 11.0.15+0)
OpenJDK 64-Bit Server VM Homebrew (build 11.0.15+0, mixed mode)
✔ |         6 | misc [3.3s]                                                                   
  |======================================================================| 100%               

auditing one-to-many join warnings

With dev dplyr, we now see:

# pak::pak("tidyverse/dplyr")
library(parsnip)

mod <- 
 linear_reg(engine = 'glmnet', penalty = tune(), mixture = 1) %>%
library(tidymodels)
library(stacks)
library(bonsai)

tidymodels_prefer()

# regression ------------------------------------------------------------------
reg_bt <-
  boost_tree(mtry = tune()) %>%

Proposed modifications to the internals of tune:::check_grid.


  tune_tbl <- tune_args(workflow)
  tune_params <- tune_tbl$id

  if (nrow(pset) == 0L) {
    msg <- c("!" = "No tuning parameters have been detected; performance will be
 evaluated using the resamples with no tuning.")
library(tidymodels)
library(cli)

The tune package has machinery to catch and log errors and warnings that occur while evaluating proposed models against resamples.

At the moment, we print those warnings/errors out one-by-one as they appear during evaluation.
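
To make the catch-and-log idea concrete, here is a minimal base-R sketch. It is not tune's actual machinery: `catch_and_log()` and `fit_one()` are hypothetical helpers invented for this illustration, and the messages are made up. It collects conditions as they occur and summarizes them afterwards instead of printing them one by one.

```r
# Hypothetical helper (not part of tune): run `fn()`, recording any
# warnings and errors instead of letting them print immediately.
catch_and_log <- function(fn) {
  notes <- character(0)
  result <- withCallingHandlers(
    tryCatch(
      fn(),
      error = function(e) {
        notes <<- c(notes, paste0("error: ", conditionMessage(e)))
        NULL  # a failed evaluation yields no result
      }
    ),
    warning = function(w) {
      notes <<- c(notes, paste0("warning: ", conditionMessage(w)))
      invokeRestart("muffleWarning")  # keep evaluating after logging
    }
  )
  list(result = result, notes = notes)
}

# Hypothetical stand-in for evaluating one model against one resample.
fit_one <- function(i) {
  if (i %% 2 == 0) warning("prediction from a rank-deficient fit")
  if (i == 5) stop("model failed to converge")
  i
}

logs <- lapply(1:6, function(i) catch_and_log(function() fit_one(i)))

# Aggregate the log: one entry per unique message, with its count.
all_notes <- unlist(lapply(logs, `[[`, "notes"))
summary_tbl <- table(all_notes)
summary_tbl
```

Tallying unique messages like this is one possible alternative to one-by-one printing; whether it suits tune's logging is a separate design question.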

# benchmarking the new parsnip release
library(tidymodels)
# with v1.0.2 ------------------------------------------------------------
pak::pkg_install("tidymodels/parsnip@v1.0.2")
num_samples <- 10^(3:7)
num_resamples <- c(5, 10, 20)
nrow <- length(num_samples) * length(num_resamples)
simonpcouch / chiburbs.R
Created June 22, 2023 15:58
Chicago Suburbs Housing Data
library(tidymodels)
library(tidyverse)
library(stringr)
library(janitor)
library(doMC)
registerDoMC(cores = max(1, parallelly::availableCores() - 1))
# data cleaning --------
# we'd likely just do all this cleaning under the hood and supply
# the `chiburbs` result as the "initial" dataset
simonpcouch / arguments_with_submodels.md
Created October 11, 2023 18:52
tidymodels arguments with submodels
library(tidyverse)
library(parsnip)
x <- lapply(parsnip:::extensions(), require, character.only = TRUE)
#> Loading required package: baguette
#> Loading required package: censored
#> Loading required package: survival
#> Loading required package: discrim
#> Loading required package: multilevelmod
#> Loading required package: plsmod