The Zen of R
# I am a scientist who has been using R for about 2 years. Today I achieved a measure of enlightenment into
# the zen of R, and I want to share it with you.
# I was simulating a set of independent random walks, which are formed by a multiplicative process. Here is
# the code I had built up over a few months of working on it and modifying it on and off as research
# questions changed:
TimeSteps <- 1000
Walks <- 100
ErrorMagnitude <- 0.03
StartingValue <- 15
Iterate <- function (Function, number.of.times, starting.value, accumulate = FALSE, ...) {
  results <- list (Function (starting.value, ...))
  for (iteration in seq_len (number.of.times - 1))
    results[[iteration + 1]] <- Function (results[[iteration]], ...)
  return (if (accumulate) results else results[[number.of.times]])
}
RandomWalk <- function (previous.generation) {
  previous.generation * rnorm (Walks, 1, ErrorMagnitude)
}
PlotResults <- function (results) {
  results <- matrix (unlist (results), ncol = length (results))
  plot (colMeans (results), type = "l")
}
PlotResults (Iterate (RandomWalk, number.of.times = TimeSteps,
                      starting.value = rep (StartingValue, Walks),
                      accumulate = TRUE))
# Hopefully it is self-explanatory. 'RandomWalk' advances every walk by one time step, 'Iterate' feeds
# this back onto itself a bunch of times, and 'PlotResults' plots the mean across walks at each time step.
# Today I thought for a while about the terse style of R code I sometimes see: code that leans heavily on
# built-in library functions, and finds a clever way of representing one's problem within the semantic world
# of those functions. Eventually, I wrote:
plot (colMeans (t (replicate (100, cumprod (rnorm (1000, 1, 0.03)))) * 15), type = "l")
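# You can verify for yourself that this does the same thing (the random draws are consumed in a different
# order, so the curves match in distribution rather than point for point). But I believe it captures an essential simplicity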
# of the simulation in a way that just can't be seen in the first, long-winded version. The use of typical
# programming structures blinded me to the elegant way something can be expressed if your language offers
# tools truly suited to your domain. And note that in this case the programming structures that were getting
# in the way of my mental clarity were structures from functional programming (functions and higher-order
# functions), not the imperative structures that are usually charged with closing programmers' minds.
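# A minimal sketch of one way to check the claimed equivalence, not part of the original code. It assumes
# both versions are fed the same table of multiplicative factors (the hypothetical 'factors' matrix below),
# so that any difference would come from the computation itself rather than from the random draws. The
# lower-case names and the seed are illustrative choices.

```r
set.seed(42)  # arbitrary seed, chosen only for reproducibility
time_steps <- 1000
walks <- 100
error_magnitude <- 0.03
starting_value <- 15

# Shared multiplicative factors: one row per walk, one column per time step.
factors <- matrix(rnorm(walks * time_steps, 1, error_magnitude), nrow = walks)

# Step-by-step version: advance all walks one time step at a time,
# recording the mean across walks after each step.
state <- rep(starting_value, walks)
stepwise_means <- numeric(time_steps)
for (step in seq_len(time_steps)) {
  state <- state * factors[, step]
  stepwise_means[step] <- mean(state)
}

# cumprod version: each walk is the running product of its own factors.
# apply() over rows returns one column per walk, hence the transpose.
cumulative <- t(apply(factors, 1, cumprod)) * starting_value
cumprod_means <- colMeans(cumulative)

# Compare the two mean curves up to floating-point tolerance.
print(all.equal(stepwise_means, cumprod_means))
```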
# I'm interested in your thoughts and comments, if any. Incidentally, the idea of using a Gist as an
# essay-writing platform tickled my fancy - we shall see how it works, if at all.

egnha commented Oct 11, 2018

@lambder, I came across your gist by googling "zen of R". This is lovely.

Here's an additional remark: to bring the compositional nature of the random walk into sharper relief, it helps to re-express replicate() as a higher-order function, one that iterates a function rather than a function call. That is, in place of replicate(), use something like

iterate <- function(.f, .n, ..., .simplify = "array") {
  sapply(integer(.n), function(.) .f(...), simplify = .simplify)
}

Then you can express your random-walk function in a declarative, point-free manner. With the gestalt package (shameless plug—I am the author), you could do it like this:

library(gestalt)
random_walk <- walk: {iterate(rnorm %>>>% cumprod, ...)} %>>>% rowMeans
  • The operator %>>>% composes functions, from left to right
  • The form {<body>} signifies a function of ... whose body is <body>
  • The syntax walk: is an (optional) annotation; it acts as an ordinary name (try names(random_walk) and random_walk$walk)

Here's the plot:

set.seed(123)
plot(random_walk(.n = 100, 1000, 1, 0.03), type = "l")
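For readers without gestalt installed, here is a rough base-R sketch of the same pipeline. The helper `compose2` and the name `random_walk_base` are hypothetical stand-ins introduced only for illustration; they are not part of gestalt and only approximate what `%>>>%` does for two functions. `iterate()` is repeated from above so the block runs on its own.

```r
# Hypothetical helper: compose two functions from left to right,
# so compose2(f, g)(x) is g(f(x)).
compose2 <- function(f, g) function(...) g(f(...))

# iterate() as defined earlier in this comment: call .f(...) .n times.
iterate <- function(.f, .n, ..., .simplify = "array") {
  sapply(integer(.n), function(.) .f(...), simplify = .simplify)
}

# One walk: draw the multiplicative factors, then take their running product.
single_walk <- compose2(rnorm, cumprod)

# Many walks (one per column), then the mean across walks at each time step.
random_walk_base <- function(.n, ...) rowMeans(iterate(single_walk, .n, ...))

set.seed(123)
mean_walk <- random_walk_base(.n = 100, 1000, 1, 0.03)
```

Calling `plot(mean_walk, type = "l")` afterwards draws the mean curve. Note that `iterate()` receives the function `single_walk` itself, whereas `replicate()` would receive the call `cumprod(rnorm(1000, 1, 0.03))`.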
