@richfitz
Last active May 24, 2021 14:36

Suppose you have a long-running calculation:

f <- function(x) {
  message("Evaluating slow function")
  Sys.sleep(5) # sleep 5 seconds to simulate long running time
  x
}

Which is used like so:

f(10)

However, you only want to rerun f() sometimes (say, when an upstream data source changes). I usually do something like this:

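run.cached <- function(expr, filename, regenerate=FALSE) {
  if ( file.exists(filename) && !regenerate ) {
    # Cache hit: load the saved result; expr is never evaluated
    res <- readRDS(filename)
  } else {
    # Cache miss (or forced rerun): evaluate expr in the caller's
    # environment, then save the result for next time
    res <- eval.parent(substitute(expr))
    saveRDS(res, file=filename)
  }
  res
}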

This is a simple caching function: if the .rds file named by filename exists, it loads and returns that; otherwise it evaluates expr and saves the result to filename. If you specify regenerate=TRUE, it always reruns the expression and overwrites the file. Note that the cache is keyed on filename alone, so changing expr without changing filename (or setting regenerate=TRUE) returns the old result.

So you can do this:

run.cached(f(5), 'mycache.rds') # runs the slow function
run.cached(f(5), 'mycache.rds') # won't run, returns cached result
run.cached(f(10), 'mycache.rds', TRUE) # runs the slow function
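
The regenerate flag can also be computed rather than typed by hand. What follows is a minimal sketch, not part of the original function: is.stale is a hypothetical helper that compares file modification times, on the assumption that the upstream data source is a file on disk. The 'data.csv' name in the usage comment is made up.

```r
# Hypothetical helper (not from the gist): TRUE if the cache file is missing
# or is older than any of the source files it depends on.
is.stale <- function(cache, sources) {
  if (!file.exists(cache)) {
    return(TRUE)
  }
  # file.mtime() returns NA for files that don't exist; treat that as stale
  # so a missing source is noticed rather than silently ignored.
  src.times <- file.mtime(sources)
  if (any(is.na(src.times))) {
    return(TRUE)
  }
  any(src.times > file.mtime(cache))
}

# Possible usage with run.cached() from above:
# run.cached(f(5), 'mycache.rds', is.stale('mycache.rds', 'data.csv'))
```

Modification times are only a heuristic (a file can be touched without its contents changing), so this probably suits quick analyses better than anything that needs strict reproducibility.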

When I want to make sure everything works correctly for the final published version, I delete the .rds files, which forces everything to be recalculated.
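
Deleting the cache files can itself be scripted. A minimal sketch, assuming all the caches live in one directory and end in .rds; clear.cache and its arguments are names made up for this example:

```r
# Hypothetical helper (not from the gist): delete every file in 'dir' whose
# name matches 'pattern' and return, invisibly, the paths that were targeted.
clear.cache <- function(dir = ".", pattern = "\\.rds$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  unlink(files)
  invisible(files)
}

# Possible usage: clear.cache("cache")
```

This removes every matching file, including any .rds files that are not caches, so it is safest to keep the caches in a directory of their own.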

There are a variety of packages on CRAN that do this already, apparently: R.cache, SOAR, and (for Sweave) cacheSweave. These may be more robust!

@davharris

knitr has a caching option as well, which has worked well for me so far. It seems to do some pretty clever wizardry to tell if any recalculating is needed.

@richfitz

@davharris: I think knitr might use cacheSweave behind the scenes. It's very nifty, for sure. I use this approach on things that aren't in knitr files, or where I want to manually control when recalculation happens.
