@andyreagan
Last active March 9, 2021 19:02
How to use R like Python

Here's how to use an R script like a Python script, relying on box. There are good reasons to use R, but two parts of the language are best avoided: (1) never use source and (2) never use library. Both encourage patterns of code that are difficult to test or maintain. Using source also tends to go hand in hand with changing the working directory, which should be avoided as well. I'll quote Jenny Bryan, who, among other things, says:

Have setwd() in your scripts? PLEASE STOP DOING THAT. ... I will come into your lab and SET YOUR COMPUTER ON FIRE.

She has also written a whole post on why setwd() is, and I quote again, "so problematic and often associated with other awkward workflow problems".

There are three files here:

  • src/main/myscript.R: the main script; it has functions that can be used elsewhere and can also be run standalone.
  • src/main/04_unconventional_filename.R: a script loaded by the main one above. It is included for two reasons: (1) as an example of using a weird file name, and (2) because this script also relies on another script.
  • src/main/utils.R: your standard utils.

Using box

Start with why box to understand the rationale behind its use. I'll cover a few more specific questions.

  • I don't want to use box in every one of my scripts, just a few.
    • Just ignore the box lines and source() away, it won't hurt that workflow. You don't have to completely refactor.
  • How can I have a project that uses box internally and is loaded by another project?
    • On the other project side, it's relatively simple: just don't set box.path and continue to load things internally relative to the project root. On this side, again it's simple: set box.path before loading.
  • How can a testthat script test a script that uses box?
    • This is a tad trickier, since testthat resets the working directory to where the test file is. Nonetheless, this will work.
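As a rough sketch of the testthat case: the idea is to set box.path to an absolute project root before any box::use call, so it no longer matters that testthat has moved the working directory. To keep the example runnable on its own, it builds a throwaway project under tempdir(). The add_one function and the temporary layout are made up for illustration; in a real test file you would more likely set box.path from here::here(), as myscript.R does.

```r
# Build a throwaway project so this sketch runs on its own.
# `add_one` and the temp layout are hypothetical; only the
# src/main/utils module path mirrors the files in this gist.
project_root <- file.path(tempdir(), "box_testthat_demo")
module_dir <- file.path(project_root, "src", "main")
dir.create(module_dir, recursive = TRUE, showWarnings = FALSE)
writeLines(
  c(
    "#' @export",
    "add_one <- function(x) x + 1"
  ),
  file.path(module_dir, "utils.R")
)

# An absolute search path makes module lookup independent of the
# working directory (in a real project: options(box.path = here::here())).
options(box.path = project_root)

# Same import style as the scripts in this gist.
box::use(src/main/utils[add_one])

testthat::test_that("add_one increments its input", {
  testthat::expect_equal(add_one(1), 2)
})
```

The only line that really matters here is the options(box.path = ...) call; everything before it is scaffolding for the demo.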
# src/main/04_unconventional_filename.R
box::use(src/main/utils[...])

#' @export
some_function <- function() {
  ...
}

#!/usr/local/bin/Rscript
# src/main/myscript.R
box::use(magrittr[`%>%`, `%<>%`])
box::use(glue[glue])
box::use(dplyr[...])

# ugly, but if we're using as a CLI set the search path directly
# this may / may not be necessary
if (sys.nframe() == 0L) {
  options(box.path = here::here())
}

# load all utils
box::use(src/main/utils[...])
# load one function
box::use(src/main/`04_unconventional_filename`[some_function])

# define local functions
`%not-in%` <- function(x, y) !(x %in% y)

#' @title Do something with data
#' @param d A data frame
#' @param pos_arg Either 'option_1' or 'option_2'
#' @details
#' ...
#' @return
#' A data frame ...
my_pure_function <- function(d, pos_arg) {
  # validate inputs before doing any work
  testthat::expect_true(pos_arg %in% c('option_1', 'option_2'))
  testthat::test_that("d has correct columns", {
    testthat::expect_setequal(colnames(d), c('col_1', 'col_2'))
  })
  testthat::test_that("gender is M/F", {
    testthat::expect_setequal(pull(distinct(mutate(d, gender=as.character(gender)), gender), gender), c('M', 'F'))
  })
  d %<>% ...
  # validate the output as well
  testthat::test_that("output gender is M/F", {
    testthat::expect_setequal(pull(distinct(mutate(d, gender=as.character(gender)), gender), gender), c('M', 'F'))
  })
  # return the data frame explicitly; otherwise the result of the last test_that() call is returned
  d
}

# local function that has side effects (read/write files, db, etc)
main <- function(...) {
  raw <- arrow::read_parquet(...)
  processed <- my_pure_function(raw)
  arrow::write_parquet(processed, ...)
}

cli <- function() {
  parser <- argparse::ArgumentParser(description="Haven Term simulation.")
  parser$add_argument("required_option", type="character", help="...")
  default_some_dir <- file.path(
    Sys.getenv("HOME"),
    Sys.getenv("SOME_DIR", "default/some/dir")
  )
  parser$add_argument(
    "--some-dir",
    type="character",
    default=default_some_dir,
    help="full path of some dir"
  )
  parser$add_argument("--example-flag", action="store_true", help="...")
  parser$add_argument("--example-option-with-default", type="character", default="0,1,2,3,4", help="...")
  args <- parser$parse_args()
  main(
    required_option=args$required_option,
    some_dir=args$some_dir
  )
}

# only run the CLI when this file is executed directly, not when it is loaded as a module
if (sys.nframe() == 0L) {
  cli()
}

# src/main/utils.R
#' @export
my_util_function <- function() {
  ...
}
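
The sys.nframe() == 0L check in myscript.R plays roughly the role of Python's if __name__ == "__main__": it is true when the file is run directly with Rscript and false when the file is loaded through box::use. Here is a small sketch of that behaviour, again using a throwaway project under tempdir(). The guarded module and its greet function are hypothetical and exist only for this demo.

```r
# Throwaway project; the module name `guarded` and its contents are hypothetical.
project_root <- file.path(tempdir(), "box_guard_demo")
module_dir <- file.path(project_root, "src", "main")
dir.create(module_dir, recursive = TRUE, showWarnings = FALSE)
script_path <- file.path(module_dir, "guarded.R")
writeLines(
  c(
    "#' @export",
    "greet <- function() 'hello from greet()'",
    "",
    "# only true when the file is run directly, e.g. via Rscript",
    "if (sys.nframe() == 0L) {",
    "  cat('running as a script\\n')",
    "}"
  ),
  script_path
)

# 1. Load the file as a module: box::use evaluates it inside a function
#    call, so sys.nframe() is greater than zero and the guard is skipped.
options(box.path = project_root)
box::use(src/main/guarded[greet])
module_result <- greet()

# 2. Run the same file as a script: at top level sys.nframe() is zero,
#    so the guarded block executes and prints its message.
rscript <- file.path(R.home("bin"), "Rscript")
script_output <- system2(rscript, c("--vanilla", shQuote(script_path)), stdout = TRUE)
```

After running this, module_result holds the value returned by greet() and script_output holds whatever the script printed when run standalone, so the same file serves as both an importable module and a command-line entry point.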