While reading Hadley's Advanced R book chapter on function operators (see our work on creating a set of answers here), I came across this fact about lazy evaluation:
The function operators we’ve seen so far follow a common pattern:
funop <- function(f, otherargs) {
  function(...) {
    # maybe do something
    res <- f(...)
    # maybe do something else
    res
  }
}
> Unfortunately there’s a problem with this implementation because function arguments are lazily evaluated: f() may have changed between applying the FO and evaluating the function. This is a particular problem if you’re using a for loop or lapply() to apply multiple function operators. In the following example, we take a list of functions and delay each one. But when we try to evaluate the mean, we get the sum instead.
> ```R
> funs <- list(mean = mean, sum = sum)
> funs_m <- lapply(funs, delay_by, delay = 0.1)
> funs_m$mean(1:10)
> #> [1] 55
> ```
We can avoid that problem by explicitly forcing the evaluation of f():
delay_by <- function(delay, f) {
  force(f)
  function(...) {
    Sys.sleep(delay)
    f(...)
  }
}
> ```R
> funs_m <- lapply(funs, delay_by, delay = 0.1)
> funs_m$mean(1:10)
> #> [1] 5.5
> ```
It’s good practice to do that whenever you create a new FO.
The simple answer is that this is because of "lazy evaluation": R does not evaluate a function's arguments when the function is called, but waits until their values are first needed inside the function body. R can do all sorts of cool things because of lazy evaluation.
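To make "lazy" concrete, here is a minimal sketch. The helper names `ignore_arg` and `show_order` are made up for illustration and do not come from the book. The first helper never touches its argument, so the argument is never evaluated; the second shows that the argument is evaluated only when the body first uses it.

```r
# Hypothetical helper: the body never uses `x`.
ignore_arg <- function(x) {
  10
}

# The stop() call is never evaluated, so no error is raised.
ignore_arg(stop("never evaluated"))
#> [1] 10

# Hypothetical helper: the body prints a message before using `x`.
show_order <- function(x) {
  message("inside the function body")
  x
}

# "inside the function body" is printed first; the argument is
# evaluated only afterwards, when the body reaches `x`.
show_order({
  message("evaluating the argument")
  1
})
#> inside the function body
#> evaluating the argument
#> [1] 1
```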
But why is this so? And what magic does `force` do to fix the problem?
For those that have not investigated, `force` has the same definition as `identity`; it simply returns its argument:
> force
function (x)
x
<bytecode: 0x10abc13b8>
<environment: namespace:base>
> identity
function (x)
x
<bytecode: 0x100aeec28>
<environment: namespace:base>
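So `force` has no special powers: it works only because calling it makes R evaluate the argument right away. The sketch below compares two made-up factories, `lazy_factory` and `eager_factory` (hypothetical names, not from the book). It uses a plain `for` loop rather than `lapply`, because as far as I know R 3.2.0 and later make `lapply` force the arguments it passes along, which would hide the effect.

```r
# The two base functions share the same body: the symbol `x`.
identical(body(force), body(identity))
#> [1] TRUE

# Hypothetical factory without force(): `label` stays unevaluated
# until the returned function is called for the first time.
lazy_factory <- function(label) {
  function() label
}

# Hypothetical factory with force(): `label` is evaluated immediately.
eager_factory <- function(label) {
  force(label)
  function() label
}

words <- c("love", "cherry")
lazy <- vector("list", length(words))
eager <- vector("list", length(words))
for (i in seq_along(words)) {
  lazy[[i]] <- lazy_factory(words[[i]])
  eager[[i]] <- eager_factory(words[[i]])
}

# `words[[i]]` is evaluated only now, when i is already 2.
lazy[[1]]()
#> [1] "cherry"

# Here `words[[i]]` was evaluated inside the loop, while i was 1.
eager[[1]]()
#> [1] "love"
```

Writing a bare `label` on its own line instead of `force(label)` should have the same effect; `force` mainly signals the intent to the reader.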
To make what is happening clearer, let's simplify the function operator:
what_is_love <- function(f) {
  function(...) {
    cat('f is', f, '\n')
  }
}
Now we can call it with `lapply` again:
> funs <- lapply(c('love', 'cherry'), what_is_love)
> funs[[1]]()
f is cherry
> funs[[2]]()
f is cherry
But note that this is not the case when you do not use `lapply`:
> f1 <- what_is_love('love')
> f2 <- what_is_love('cherry')
> f1()
f is love
> f2()
f is cherry
What gives?
Well, let's take `funs <- lapply(c('love', 'cherry'), what_is_love)` and write it out more fully:
params <- c('love', 'cherry')
out <- vector('list', length(params))
for (i in seq_along(params)) {
  out[[i]] <- what_is_love(params[[i]])
}
out
And let's insert a `browser()` in the for loop and take a look:
params <- c('love', 'cherry')
out <- vector('list', length(params))
for (i in seq_along(params)) {
  out[[i]] <- what_is_love(params[[i]])
  if (i == 2) browser()
}
out
We can see that both functions have their own environment:
Browse[1]> out[[1]]
function(...) {
cat('f is', f, '\n')
}
<environment: 0x109508478>
Browse[1]> out[[2]]
function(...) {
cat('f is', f, '\n')
}
<environment: 0x1094ff750>
But in each of those environments, `f` is the same...
Browse[1]> environment(out[[1]])$f
[1] "cherry"
Browse[1]> environment(out[[2]])$f
[1] "cherry"
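One plausible reading of this session: each environment holds `f` not as a value but as the still-unevaluated expression `params[[i]]`. Looking `f` up in the browser evaluates that expression for the first time, and by then `i` is 2, so both lookups give "cherry". The sketch below is my own variation on the loop above, not something from the book. It calls the first function while `i` is still 1, which evaluates its `f` early; the result is then kept for later calls.

```r
what_is_love <- function(f) {
  function(...) {
    cat('f is', f, '\n')
  }
}

params <- c('love', 'cherry')
out <- vector('list', length(params))
for (i in seq_along(params)) {
  out[[i]] <- what_is_love(params[[i]])
  # Calling the first function while i == 1 evaluates its `f` now.
  if (i == 1) out[[1]]()
}
#> f is love

# `f` in the first environment was already evaluated, so it stays "love".
out[[1]]()
#> f is love

# `f` in the second environment is evaluated here, with i == 2.
out[[2]]()
#> f is cherry
```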
TODO: Finish. I got confused. Asked a question here: http://stackoverflow.com/questions/29733257/can-you-more-clearly-explain-lazy-evaluation-in-r-function-operators