@wviechtb
Created May 4, 2022 09:23
Benchmark comparison of for-loops versus apply()/sapply() in 3 different versions of R (2.5.0, 3.0.0, 4.2.0)
> ############################################################################
>
> # A comparison of for-loops versus apply() and sapply() for 1) computing the
> # row means of a matrix and 2) computing the mean of each element of a list.
> # For task 1), we also examine the performance of rowMeans() as a specialized,
> # vectorized function; for task 2), we also compare sapply() with vapply()
> # (note: vapply() was added in R 2.12.0, so it is not available in R 2.5.0).
> # For the for-loops, we also examine the impact of pre-allocating the vector
> # that stores the results versus 'growing' the vector in each iteration.
> #
> # Notes: The timings do not use a package like 'microbenchmark', since it
> # might not be available in older versions of R. The Sys.sleep() calls after
> # each timing let the CPU cool a bit, so that thermal throttling does not
> # affect the results.
>
> ############################################################################
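The timing pattern described in the notes (repeat a call with replicate(), time it with system.time(), then pause) can be wrapped in a small helper. This is a minimal sketch and not part of the original session; the name time_it and its arguments reps and pause are hypothetical.

```r
# Hypothetical helper: time 'reps' calls of f(x), pause for 'pause' seconds
# afterwards, and return the elapsed time in seconds.
time_it <- function(f, x, reps = 100, pause = 10) {
  tm <- system.time(replicate(reps, f(x)))
  Sys.sleep(pause)  # idle period after the timing, as in the session below
  tm[["elapsed"]]
}

# Usage on a small matrix, with a short pause so the example runs quickly
x_small <- matrix(runif(100 * 100), nrow = 100, ncol = 100)
time_it(rowMeans, x_small, reps = 10, pause = 1)
```

The session below instead types each system.time() call out in full, which keeps every timing visible in the transcript.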
R version 2.5.0 (2007-04-23)
Copyright (C) 2007 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> ############################################################################
>
> # for-loops versus apply() (versus rowMeans()) for computing row means
>
> n <- 2000
> x <- matrix(runif(n*n), nrow=n, ncol=n)
>
> f1 <- function(x) {
+    nr <- nrow(x)
+    m <- rep(NA_real_, nr)
+    for (i in 1:nr) {
+       m[i] <- mean(x[i,])
+    }
+    return(m)
+ }
> f2 <- function(x) {
+    m <- c()
+    for (i in 1:nrow(x)) {
+       m <- c(m,mean(x[i,]))
+    }
+    return(m)
+ }
>
> f3 <- function(x) apply(x, 1, mean)
> f4 <- function(x) rowMeans(x)
>
> system.time(replicate(100, f1(x))); Sys.sleep(10)
user system elapsed
5.556 0.008 5.564
> system.time(replicate(100, f2(x))); Sys.sleep(10)
user system elapsed
5.835 0.008 5.843
> system.time(replicate(100, f3(x))); Sys.sleep(10)
user system elapsed
9.105 0.008 9.114
> system.time(replicate(100, f4(x))); Sys.sleep(10)
user system elapsed
0.894 0.000 0.894
>
> ############################################################################
>
> # for-loops versus sapply() (and vapply()) for computing list element means
>
> x <- replicate(10000, runif(1000), simplify=FALSE)
>
> f1 <- function(x) {
+    n <- length(x)
+    m <- rep(NA_real_, n)
+    for (i in 1:n) {
+       m[i] <- mean(x[[i]])
+    }
+    return(m)
+ }
> f2 <- function(x) {
+    m <- c()
+    for (i in 1:length(x)) {
+       m <- c(m,mean(x[[i]]))
+    }
+    return(m)
+ }
> f3 <- function(x) sapply(x, mean)
> f4 <- function(x) vapply(x, mean, numeric(1))
>
> system.time(replicate(100, f1(x))); Sys.sleep(10)
user system elapsed
7.955 0.004 7.959
> system.time(replicate(100, f2(x))); Sys.sleep(10)
user system elapsed
10.905 0.003 10.909
> system.time(replicate(100, f3(x))); Sys.sleep(10)
user system elapsed
8.345 0.013 8.358
> system.time(replicate(100, f4(x))); Sys.sleep(10)
Error in f4(x) : could not find function "vapply"
Timing stopped at: 0 0 0 0 0
>
> ############################################################################
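The error above arises because vapply() does not exist in R 2.5.0. A script that has to run on both old and new versions could guard the call. The following is a minimal sketch, not part of the original benchmark; the helper name list_means is hypothetical.

```r
# Hypothetical helper: use vapply() where it exists (R >= 2.12.0) and fall
# back to sapply() otherwise.
list_means <- function(x) {
  if (exists("vapply", mode = "function")) {
    vapply(x, mean, numeric(1))
  } else {
    sapply(x, mean)
  }
}

# Small usage example: the means of three list elements
x_small <- list(c(1, 2, 3), c(10, 20), 5)
list_means(x_small)  # 2 15 5
```

Unlike sapply(), vapply() checks each result against the declared template numeric(1), so the two branches are only interchangeable when every element yields a single number.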
R version 3.0.0 (2013-04-03) -- "Masked Marvel"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> ############################################################################
>
> # for-loops versus apply() (versus rowMeans()) for computing row means
>
> n <- 2000
> x <- matrix(runif(n*n), nrow=n, ncol=n)
>
> f1 <- function(x) {
+    nr <- nrow(x)
+    m <- rep(NA_real_, nr)
+    for (i in 1:nr) {
+       m[i] <- mean(x[i,])
+    }
+    return(m)
+ }
> f2 <- function(x) {
+    m <- c()
+    for (i in 1:nrow(x)) {
+       m <- c(m,mean(x[i,]))
+    }
+    return(m)
+ }
>
> f3 <- function(x) apply(x, 1, mean)
> f4 <- function(x) rowMeans(x)
>
> system.time(replicate(100, f1(x))); Sys.sleep(10)
user system elapsed
5.110 0.011 5.122
> system.time(replicate(100, f2(x))); Sys.sleep(10)
user system elapsed
5.433 0.000 5.433
> system.time(replicate(100, f3(x))); Sys.sleep(10)
user system elapsed
7.546 0.012 7.558
> system.time(replicate(100, f4(x))); Sys.sleep(10)
user system elapsed
0.808 0.000 0.809
>
> ############################################################################
>
> # for-loops versus sapply() (and vapply()) for computing list element means
>
> x <- replicate(10000, runif(1000), simplify=FALSE)
>
> f1 <- function(x) {
+    n <- length(x)
+    m <- rep(NA_real_, n)
+    for (i in 1:n) {
+       m[i] <- mean(x[[i]])
+    }
+    return(m)
+ }
> f2 <- function(x) {
+    m <- c()
+    for (i in 1:length(x)) {
+       m <- c(m,mean(x[[i]]))
+    }
+    return(m)
+ }
> f3 <- function(x) sapply(x, mean)
> f4 <- function(x) vapply(x, mean, numeric(1))
>
> system.time(replicate(100, f1(x))); Sys.sleep(10)
user system elapsed
6.902 0.008 6.910
> system.time(replicate(100, f2(x))); Sys.sleep(10)
user system elapsed
13.252 0.012 13.265
> system.time(replicate(100, f3(x))); Sys.sleep(10)
user system elapsed
6.884 0.000 6.884
> system.time(replicate(100, f4(x))); Sys.sleep(10)
user system elapsed
5.815 0.000 5.814
>
> ############################################################################
R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> ############################################################################
>
> # for-loops versus apply() (versus rowMeans()) for computing row means
>
> n <- 2000
> x <- matrix(runif(n*n), nrow=n, ncol=n)
>
> f1 <- function(x) {
+    nr <- nrow(x)
+    m <- rep(NA_real_, nr)
+    for (i in 1:nr) {
+       m[i] <- mean(x[i,])
+    }
+    return(m)
+ }
> f2 <- function(x) {
+    m <- c()
+    for (i in 1:nrow(x)) {
+       m <- c(m,mean(x[i,]))
+    }
+    return(m)
+ }
>
> f3 <- function(x) apply(x, 1, mean)
> f4 <- function(x) rowMeans(x)
>
> system.time(replicate(100, f1(x))); Sys.sleep(10)
user system elapsed
4.980 0.016 4.996
> system.time(replicate(100, f2(x))); Sys.sleep(10)
user system elapsed
5.383 0.000 5.384
> system.time(replicate(100, f3(x))); Sys.sleep(10)
user system elapsed
5.814 0.203 6.019
> system.time(replicate(100, f4(x))); Sys.sleep(10)
user system elapsed
0.76 0.00 0.76
>
> ############################################################################
>
> # for-loops versus sapply() (and vapply()) for computing list element means
>
> x <- replicate(10000, runif(1000), simplify=FALSE)
>
> f1 <- function(x) {
+    n <- length(x)
+    m <- rep(NA_real_, n)
+    for (i in 1:n) {
+       m[i] <- mean(x[[i]])
+    }
+    return(m)
+ }
> f2 <- function(x) {
+    m <- c()
+    for (i in 1:length(x)) {
+       m <- c(m,mean(x[[i]]))
+    }
+    return(m)
+ }
> f3 <- function(x) sapply(x, mean)
> f4 <- function(x) vapply(x, mean, numeric(1))
>
> system.time(replicate(100, f1(x))); Sys.sleep(10)
user system elapsed
4.960 0.008 4.969
> system.time(replicate(100, f2(x))); Sys.sleep(10)
user system elapsed
12.000 0.012 12.013
> system.time(replicate(100, f3(x))); Sys.sleep(10)
user system elapsed
6.098 0.000 6.098
> system.time(replicate(100, f4(x))); Sys.sleep(10)
user system elapsed
4.931 0.000 4.932
>
> ############################################################################
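The timings compare like with like only if all variants return the same values. A minimal sketch of such a check for the row-means task follows. It is not part of the original session: the function names, the small 50 x 50 input, and the seed are assumptions chosen for illustration.

```r
# Small input so the check runs quickly (the benchmark itself uses n <- 2000)
set.seed(1234)  # arbitrary seed, only for reproducibility
x_chk <- matrix(runif(50 * 50), nrow = 50, ncol = 50)

# Pre-allocated for-loop (same approach as f1 in the transcript)
row_means_prealloc <- function(x) {
  nr <- nrow(x)
  m <- rep(NA_real_, nr)
  for (i in seq_len(nr)) m[i] <- mean(x[i, ])
  m
}

# 'Growing' for-loop (same approach as f2 in the transcript)
row_means_grow <- function(x) {
  m <- c()
  for (i in seq_len(nrow(x))) m <- c(m, mean(x[i, ]))
  m
}

res <- list(row_means_prealloc(x_chk), row_means_grow(x_chk),
            apply(x_chk, 1, mean), rowMeans(x_chk))

# Compare each variant with the first, up to numerical tolerance
agree <- sapply(res[-1], function(r) isTRUE(all.equal(res[[1]], r)))
agree
```

The comparison uses all.equal() rather than identical() because rowMeans() and mean() may accumulate sums differently, so tiny floating-point differences are possible.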