@EmilHvitfeldt
Created May 2, 2024 04:32
sparse step_dummy progress
library(recipes)
library(nycflights13)

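# Dummy-encode four nominal predictors; tailnum has thousands of levels,
# so the encoded data is very wide and mostly zeros.
rec <- recipe(dep_delay ~ carrier + tailnum + dest + origin, data = flights) |>
  step_dummy(all_nominal_predictors())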

# Baseline: sparse output turned off, dummy columns are stored densely
options("recipes.sparse" = FALSE)

system.time({
  print(
    lobstr::obj_size(
      rec |>
        prep() |>
        bake(NULL)
    )
  )
})
#> Warning: ! There are new levels in a factor: `NA`.
#> 11.22 GB
#>    user  system elapsed 
#> 127.027   9.761 142.584

# Same recipe with sparse output turned on
options("recipes.sparse" = TRUE)

system.time({
  print(
    lobstr::obj_size(
      rec |>
        prep() |>
        bake(NULL)
    )
  )
})
#> 20.80 MB
#>    user  system elapsed 
#>   3.787   0.490   4.330
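
# Rough sanity check on the dense size reported above. This is a sketch, not
# part of the original benchmark. Assumptions: step_dummy() keeps its default
# one_hot = FALSE (one column dropped per predictor), NA is not counted as a
# level, and each dense cell is stored as an 8-byte double.
library(nycflights13)

predictors <- c("carrier", "tailnum", "dest", "origin")

# Distinct non-missing values per predictor, minus one reference level each
n_dummy_cols <- sum(vapply(
  flights[predictors],
  function(x) length(unique(x[!is.na(x)])) - 1L,
  integer(1)
))

# Estimated bytes for a dense block of doubles with that many columns
dense_bytes <- as.numeric(nrow(flights)) * n_dummy_cols * 8
dense_bytes / 1e9 # in GB, for comparison with the dense obj_size() above

# Each row has at most one non-zero entry per predictor, so nearly every cell
# in the dense result is a zero; that is why the sparse result is so small.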