Keybase proof

I hereby claim:

  • I am hfrick on github.
  • I am hfcfrick (https://keybase.io/hfcfrick) on keybase.
  • I have a public key ASCjC2DptIzRRquSJbSQWZUu48coNZQf7V2v1igM7RI-Bgo

To claim this, I am signing this object:

library(tidyverse)
library(tidymodels)
otters_raw <- read_csv("seot_morphometricsReproStatus_ak_monson.csv") %>%
  janitor::clean_names()
otters <- otters_raw %>%
  mutate(
    final_age = if_else(final_age == -9, NA_real_, final_age),
    weight = if_else(weight == -9, NA_real_, weight),

More thoughts on data sets for post-processing

@simonpcouch and I have been brainstorming how and where to specify and create the data set used to estimate a post-processor, currently dubbed the "potato set".

What to specify

  • The proportion of the data used for estimation (preprocessor, model, post-processor) that should be held back specifically for estimating the post-processor.
  • The method for how to split that data for estimation. This may need to be a time-based or grouped split rather than a random split. If we are in the context of resampling a workflow, it should most likely be the same method as used to make the resamples.
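To make the two items above concrete, here is a minimal base R sketch of what such a specification might boil down to. The function `split_for_postprocessor()` and its arguments `prop_held_back`, `method`, and `group` are hypothetical names made up for illustration, not an existing or planned tidymodels interface. The time-based branch assumes the rows are already ordered in time.

```r
# Hypothetical helper (not a tidymodels function): split the estimation data
# into a part for the preprocessor + model and a held-back part that is used
# only to estimate the post-processor.
split_for_postprocessor <- function(data,
                                    prop_held_back = 0.25,
                                    method = c("random", "time", "group"),
                                    group = NULL) {
  method <- match.arg(method)
  n <- nrow(data)

  if (method == "random") {
    # simple random split of the rows
    held <- sample.int(n, size = round(n * prop_held_back))
  } else if (method == "time") {
    # assumes rows are ordered in time: hold back the most recent rows
    held <- tail(seq_len(n), round(n * prop_held_back))
  } else {
    # grouped split: whole groups go to one side or the other
    if (is.null(group)) stop("`group` must name a column for a grouped split.")
    groups <- unique(data[[group]])
    n_groups_held <- max(1, round(length(groups) * prop_held_back))
    held_groups <- groups[sample.int(length(groups), n_groups_held)]
    held <- which(data[[group]] %in% held_groups)
  }

  is_held <- seq_len(n) %in% held
  list(
    model_fit      = data[!is_held, , drop = FALSE],
    post_processor = data[is_held, , drop = FALSE]
  )
}

# Toy data: 10 groups with 4 rows each
set.seed(1)
toy <- data.frame(id = rep(1:10, each = 4), x = rnorm(40))

parts <- split_for_postprocessor(toy, prop_held_back = 0.25,
                                 method = "group", group = "id")
time_parts <- split_for_postprocessor(toy, prop_held_back = 0.25,
                                      method = "time")
```

In practice the splitting itself would more likely be delegated to rsample (for example `initial_split()`, `initial_time_split()`, or `group_initial_split()`), so that the same method used to make the resamples can be reused for the inner split.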

Where to specify