kevinushey/RcppApply.Rmd

## RcppApply.Rmd
---
title: Make your own 'apply' functions for matrices
author: Kevin Ushey
license: GPL (>= 2)
tags: Rcpp apply matrix
summary: Clever use of sourceCpp can allow you to define your own 'apply' functions on the fly.
---

Make your own 'apply' functions
==========

The function `sourceCpp` is, perhaps, more powerful than one might realize at first glance.
You might be familiar with the package `inline` for quickly writing and
prototyping C/C++ code -- `inline` can handle all the background compilation
stuff necessary. With `sourceCpp`, we can consider `inline`-ing code similarily,
to fit code into what I'll call function 'templates'.

For example, when considering the `apply` function in R, all we really care
about is the function definition, and we let `apply` take care of the details
in iterating and applying our function over each row / column of our matrix.
Why not implement a similar interface through C++, whereby we apply a C++
function to each row or column of our matrix?

First, we demonstrate a simple `apply` framework with C++/Rcpp, using
the sugar mean function. It's quite verbose so feel free to skip ahead if you want.

```{r engine='Rcpp'}
#include <Rcpp.h>
using namespace Rcpp;

template <class T>
inline double do_mean( T& x ) {
  return mean(x);
}

NumericVector row_means( NumericMatrix& X ) {

  int nRows = X.nrow();
  NumericVector out = no_init(nRows);

  for( int i=0; i < nRows; i++ ) {
    NumericMatrix::Row tmp = X(i, _);
    out[i] = do_mean( tmp );
  }

  return out;

}

NumericVector col_means( NumericMatrix& X ) {

  int nCols = X.ncol();
  NumericVector out = no_init(nCols);

  for( int j=0; j < nCols; j++ ) {
    NumericMatrix::Column tmp = X(_, j);
    out[j] = do_mean( tmp );
  }

  return out;

}

// [[Rcpp::export]]
NumericVector Mean( NumericMatrix X, int dim ) {

    if( dim == 1 ) {
      return row_means(X);
    } else if( dim == 2 ) {
      return col_means(X);
    }

}
```

And a quick benchmark:
```{r tidy=FALSE}
library(rbenchmark)
x <- matrix( rnorm(1E6), ncol=1E3 )
benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
           apply(x, 1, mean),
           Mean(x, 1),
           rowMeans(x),
           apply(x, 2, mean),
           Mean(x, 2),
           colMeans(x)
  )
```

To sum it up:
* The `Mean` function is exported to R, and simply checks the `dim` argument to
see whether we're applying over rows or columns. We are sent to either
`row_means` or `col_means`,
* The `row_means` and `col_means` do some hand-holding so that we are applying
a function either to each row, or each column, of our matrix,
* The work-horse function is called `do_mean`, and is templated to accept
any type (in this case, any type compatible with the sugar `mean` function),
* We avoid copying as much as possible by passing references around, rather
than copying vectors for each operation. Note the use of `NumericMatrix::Row`
and `NumericMatrix::Column` to generate references to rows and columns of `X`:
we don't actually copy anything,
* We can't do better than R's heavily optimized `rowMeans` and `colMeans`
functions, but we pretty much do just as well for `colMeans`.

Notice that (at least, thanks to Rcpp sugar) that computing the mean is just
a tiny function, while handling the row and column iterations ends up being
the brunt of the code. But! One could easily imagine that any function could
sit in there -- all it has to do is return a `double`.

Can we remove the 'boilerplate' and provide an easy interface to `apply`
functions in Rcpp? We can!

We will make an R function that acts similar to `inline`. It will take some
C++ code (input as a string), fit it into that template, compile it with
`sourceCpp`, then return an R function that can call the compiled code.

```{r tidy=FALSE}
rcpp_apply_generator <- function( fun, name=NULL ) {

  ## generate a C++ source file based on 'fun'
  cpp_source <- paste( sep="", tempfile(), ".cpp" )
  cat("C++ source code will be written to", cpp_source, ".\n")

  ## an environment for the cppSource-exported functions to live and be
  ## called from
  cpp_env <- new.env()

  ## internal name for the function
  if( is.null(name) ) {
    name <- paste( collapse="",
                   sample( c(letters, LETTERS), size=20, replace=TRUE)
                   )
    }

  ## open a connection to the file, and ensure we close it after
  ## function execution
  conn <- file( cpp_source, 'w' )
  on.exit( close(conn) )

  ## wrap the function into a giant 'cat'
  ## statement that sends it to 'conn'
  cat( file=conn, sep="", paste( sep="", collapse="",
"#include <Rcpp.h>
using namespace Rcpp;

template <class T>
inline double do_", name, "( T& x ) {\n", fun, "\n }

NumericVector row_", name, "s( NumericMatrix& X ) {

  int nRows = X.nrow();
  NumericVector out = no_init(nRows);

  for( int i=0; i < nRows; i++ ) {
    NumericMatrix::Row tmp = X(i, _);
    out[i] = do_", name, "( tmp );
  }

  return out;

}

NumericVector col_", name, "s( NumericMatrix& X ) {

  int nCols = X.ncol();
  NumericVector out = no_init(nCols);

  for( int j=0; j < nCols; j++ ) {
    NumericMatrix::Column tmp = X(_, j);
    out[j] = do_", name, "( tmp );
  }

  return out;

}

// [[Rcpp::export]]
NumericVector ", name, "( NumericMatrix X, int dim ) {

    if( dim == 1 ) {
      return row_", name, "s(X);
    } else if( dim == 2 ) {
      return col_", name, "s(X);
    }

}

") )

  ## source the file into the 'cpp_env' environment
  cat("Compiling...\n")
  sourceCpp( cpp_source, env=cpp_env )
  cat("Done!")

  ## return a function that exposes the compiled code to the user
  return( function(X, dim) {

    if( is.data.frame(X) ) {
      tryCatch( X <- as.matrix(X),
                error = function(e) {
                  message("Could not convert data.frame to matrix")
                  return(e)
                } )

    }

    stopifnot( is.matrix(X) || is.data.frame(X) )
    stopifnot( is.numeric(dim) )

    call <- call(name, X, dim)
    return( eval(call, envir=cpp_env) )

  })

}
```

So, we've made a function that:
* Accepts a string `fun` as an argument,
* Wraps it into a C++ `apply` framework,
* Sources it,
* And returns an R function that can be used to call the compiled C++ code.

Let's test it out! We'll try to make a `sdApply` function that computes the
standard deviation either row-wise or column-wise for a matrix. We can cheat
a bit and use the Rcpp sugar function `sd` for the first example.

```{r tidy=FALSE}
library(Rcpp)
sdApply <- rcpp_apply_generator("return sd(x);")
X <- matrix( 1:16, ncol=4 )
sdApply(X, 1)
sdApply(X, 2)
all.equal( sdApply(X, 1), apply(X, 1, sd) )
all.equal( sdApply(X, 2), apply(X, 2, sd) )
```

Neat - we can wrap up all the 'grunt work' into an R function, and simply
supply it with a function (as a character string, similar to `inline`), and
have it exported back neat and clean and easy to use.

Of course, for all this effort, we'd hope there's a speed increase. Benchmark
time!

```{r tidy=FALSE}
X <- matrix( rnorm(1E5), nrow=1E3 )
benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
           apply(X, 1, sd),
           sdApply(X, 1),
           sdApply( t(X), 2 ),
           apply(X, 2, sd),
           sdApply(X, 2)
           )
```
We get a huge speedup over the columns, and a modest speedup over the rows. Neat!
Note that the amount of speedup you get will depend on the 'shape' and 'size'
of the matrix.

A 'wide' matrix:
```{r tidy=FALSE}
## a 'long' matrix
X <- matrix( rnorm(1E5), nrow=1E1 )
benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
           apply(X, 1, sd),
           sdApply(X, 1),
           sdApply( t(X), 2 ),
           apply(X, 2, sd),
           sdApply(X, 2)
           )
```

A 'long' matrix:
```{r tidy=FALSE}

X <- matrix( rnorm(1E5), nrow=1E4 )
benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
           apply(X, 1, sd),
           sdApply(X, 1),
           sdApply( t(X), 2 ),
           apply(X, 2, sd),
           sdApply(X, 2)
           )
```

A 'square' matrix:
```{r tidy=FALSE}

X <- matrix( rnorm(1E5), nrow=5E2 )
benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
           apply(X, 1, sd),
           sdApply(X, 1),
           sdApply( t(X), 2 ),
           apply(X, 2, sd),
           sdApply(X, 2)
           )
```
Note a little piece of wackiness here -- it's actually faster to transpose
`x` and then call a column apply, compared to a row apply.
Of course, your mileage may vary. I imagine this is (at least partially)
because R stores its matrices in column-major order,
thus making it much faster to iterate over columns.

Finally, one last example with a slightly more complex function: let's implement
an apply version of the Wald test statistic against the mean.
```{r tidy=FALSE}

X <- matrix( rnorm(1E5), ncol=1E2 )
waldApply <- rcpp_apply_generator(
  "return sqrt( x.size() ) * mean(x) / sd(x);"
  )

f <- function(x) {
  return( sqrt( length(x) ) * mean(x) / sd(x) )
  }

all.equal( waldApply( X, 2 ), apply( X, 2, f ) )

benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
           apply(X, 1, f),
           waldApply( X, 1 ),
           waldApply( t(X), 2 ),
           waldApply( X, 2),
           apply(X, 2, f)
           )
```

A huge speedup at almost no cost. Sweet! All the user needs to know is that the
`fun` they supply must be in terms of `x`, and it must return a `double`, to fit
into this apply framework. Of course, one could argue that the time lost due
to compilation might offset any speed gains, but if you imagine performing this
operation many times over many large matrices, it could be worth it.

...and, interestingly, if you want a faster Rcpp row apply for really big matrices,
you might be better served to perform a column apply on the transpose!
	---
	title: Make your own 'apply' functions for matrices
	author: Kevin Ushey
	license: GPL (>= 2)
	tags: Rcpp apply matrix
	summary: Clever use of sourceCpp can allow you to define your own 'apply' functions on the fly.
	---

	Make your own 'apply' functions
	==========

	The function `sourceCpp` is, perhaps, more powerful than one might realize at first glance.
	You might be familiar with the package `inline` for quickly writing and
	prototyping C/C++ code -- `inline` can handle all the background compilation
	stuff necessary. With `sourceCpp`, we can consider `inline`-ing code similarily,
	to fit code into what I'll call function 'templates'.

	For example, when considering the `apply` function in R, all we really care
	about is the function definition, and we let `apply` take care of the details
	in iterating and applying our function over each row / column of our matrix.
	Why not implement a similar interface through C++, whereby we apply a C++
	function to each row or column of our matrix?

	First, we demonstrate a simple `apply` framework with C++/Rcpp, using
	the sugar mean function. It's quite verbose so feel free to skip ahead if you want.

	```{r engine='Rcpp'}
	#include <Rcpp.h>
	using namespace Rcpp;

	template <class T>
	inline double do_mean( T& x ) {
	return mean(x);
	}

	NumericVector row_means( NumericMatrix& X ) {

	int nRows = X.nrow();
	NumericVector out = no_init(nRows);

	for( int i=0; i < nRows; i++ ) {
	NumericMatrix::Row tmp = X(i, _);
	out[i] = do_mean( tmp );
	}

	return out;

	}

	NumericVector col_means( NumericMatrix& X ) {

	int nCols = X.ncol();
	NumericVector out = no_init(nCols);

	for( int j=0; j < nCols; j++ ) {
	NumericMatrix::Column tmp = X(_, j);
	out[j] = do_mean( tmp );
	}

	return out;

	}

	// [[Rcpp::export]]
	NumericVector Mean( NumericMatrix X, int dim ) {

	if( dim == 1 ) {
	return row_means(X);
	} else if( dim == 2 ) {
	return col_means(X);
	}

	}
	```

	And a quick benchmark:
	```{r tidy=FALSE}
	library(rbenchmark)
	x <- matrix( rnorm(1E6), ncol=1E3 )
	benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
	apply(x, 1, mean),
	Mean(x, 1),
	rowMeans(x),
	apply(x, 2, mean),
	Mean(x, 2),
	colMeans(x)
	)
	```

	To sum it up:
	* The `Mean` function is exported to R, and simply checks the `dim` argument to
	see whether we're applying over rows or columns. We are sent to either
	`row_means` or `col_means`,
	* The `row_means` and `col_means` do some hand-holding so that we are applying
	a function either to each row, or each column, of our matrix,
	* The work-horse function is called `do_mean`, and is templated to accept
	any type (in this case, any type compatible with the sugar `mean` function),
	* We avoid copying as much as possible by passing references around, rather
	than copying vectors for each operation. Note the use of `NumericMatrix::Row`
	and `NumericMatrix::Column` to generate references to rows and columns of `X`:
	we don't actually copy anything,
	* We can't do better than R's heavily optimized `rowMeans` and `colMeans`
	functions, but we pretty much do just as well for `colMeans`.

	Notice that (at least, thanks to Rcpp sugar) that computing the mean is just
	a tiny function, while handling the row and column iterations ends up being
	the brunt of the code. But! One could easily imagine that any function could
	sit in there -- all it has to do is return a `double`.

	Can we remove the 'boilerplate' and provide an easy interface to `apply`
	functions in Rcpp? We can!

	We will make an R function that acts similar to `inline`. It will take some
	C++ code (input as a string), fit it into that template, compile it with
	`sourceCpp`, then return an R function that can call the compiled code.

	```{r tidy=FALSE}
	rcpp_apply_generator <- function( fun, name=NULL ) {

	## generate a C++ source file based on 'fun'
	cpp_source <- paste( sep="", tempfile(), ".cpp" )
	cat("C++ source code will be written to", cpp_source, ".\n")

	## an environment for the cppSource-exported functions to live and be
	## called from
	cpp_env <- new.env()

	## internal name for the function
	if( is.null(name) ) {
	name <- paste( collapse="",
	sample( c(letters, LETTERS), size=20, replace=TRUE)
	)
	}

	## open a connection to the file, and ensure we close it after
	## function execution
	conn <- file( cpp_source, 'w' )
	on.exit( close(conn) )

	## wrap the function into a giant 'cat'
	## statement that sends it to 'conn'
	cat( file=conn, sep="", paste( sep="", collapse="",
	"#include <Rcpp.h>
	using namespace Rcpp;

	template <class T>
	inline double do_", name, "( T& x ) {\n", fun, "\n }

	NumericVector row_", name, "s( NumericMatrix& X ) {

	int nRows = X.nrow();
	NumericVector out = no_init(nRows);

	for( int i=0; i < nRows; i++ ) {
	NumericMatrix::Row tmp = X(i, _);
	out[i] = do_", name, "( tmp );
	}

	return out;

	}

	NumericVector col_", name, "s( NumericMatrix& X ) {

	int nCols = X.ncol();
	NumericVector out = no_init(nCols);

	for( int j=0; j < nCols; j++ ) {
	NumericMatrix::Column tmp = X(_, j);
	out[j] = do_", name, "( tmp );
	}

	return out;

	}

	// [[Rcpp::export]]
	NumericVector ", name, "( NumericMatrix X, int dim ) {

	if( dim == 1 ) {
	return row_", name, "s(X);
	} else if( dim == 2 ) {
	return col_", name, "s(X);
	}

	}

	") )

	## source the file into the 'cpp_env' environment
	cat("Compiling...\n")
	sourceCpp( cpp_source, env=cpp_env )
	cat("Done!")

	## return a function that exposes the compiled code to the user
	return( function(X, dim) {

	if( is.data.frame(X) ) {
	tryCatch( X <- as.matrix(X),
	error = function(e) {
	message("Could not convert data.frame to matrix")
	return(e)
	} )

	}

	stopifnot( is.matrix(X) \|\| is.data.frame(X) )
	stopifnot( is.numeric(dim) )

	call <- call(name, X, dim)
	return( eval(call, envir=cpp_env) )

	})

	}
	```

	So, we've made a function that:
	* Accepts a string `fun` as an argument,
	* Wraps it into a C++ `apply` framework,
	* Sources it,
	* And returns an R function that can be used to call the compiled C++ code.

	Let's test it out! We'll try to make a `sdApply` function that computes the
	standard deviation either row-wise or column-wise for a matrix. We can cheat
	a bit and use the Rcpp sugar function `sd` for the first example.

	```{r tidy=FALSE}
	library(Rcpp)
	sdApply <- rcpp_apply_generator("return sd(x);")
	X <- matrix( 1:16, ncol=4 )
	sdApply(X, 1)
	sdApply(X, 2)
	all.equal( sdApply(X, 1), apply(X, 1, sd) )
	all.equal( sdApply(X, 2), apply(X, 2, sd) )
	```

	Neat - we can wrap up all the 'grunt work' into an R function, and simply
	supply it with a function (as a character string, similar to `inline`), and
	have it exported back neat and clean and easy to use.

	Of course, for all this effort, we'd hope there's a speed increase. Benchmark
	time!

	```{r tidy=FALSE}
	X <- matrix( rnorm(1E5), nrow=1E3 )
	benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
	apply(X, 1, sd),
	sdApply(X, 1),
	sdApply( t(X), 2 ),
	apply(X, 2, sd),
	sdApply(X, 2)
	)
	```
	We get a huge speedup over the columns, and a modest speedup over the rows. Neat!
	Note that the amount of speedup you get will depend on the 'shape' and 'size'
	of the matrix.

	A 'wide' matrix:
	```{r tidy=FALSE}
	## a 'long' matrix
	X <- matrix( rnorm(1E5), nrow=1E1 )
	benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
	apply(X, 1, sd),
	sdApply(X, 1),
	sdApply( t(X), 2 ),
	apply(X, 2, sd),
	sdApply(X, 2)
	)
	```

	A 'long' matrix:
	```{r tidy=FALSE}

	X <- matrix( rnorm(1E5), nrow=1E4 )
	benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
	apply(X, 1, sd),
	sdApply(X, 1),
	sdApply( t(X), 2 ),
	apply(X, 2, sd),
	sdApply(X, 2)
	)
	```

	A 'square' matrix:
	```{r tidy=FALSE}

	X <- matrix( rnorm(1E5), nrow=5E2 )
	benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
	apply(X, 1, sd),
	sdApply(X, 1),
	sdApply( t(X), 2 ),
	apply(X, 2, sd),
	sdApply(X, 2)
	)
	```
	Note a little piece of wackiness here -- it's actually faster to transpose
	`x` and then call a column apply, compared to a row apply.
	Of course, your mileage may vary. I imagine this is (at least partially)
	because R stores its matrices in column-major order,
	thus making it much faster to iterate over columns.

	Finally, one last example with a slightly more complex function: let's implement
	an apply version of the Wald test statistic against the mean.
	```{r tidy=FALSE}

	X <- matrix( rnorm(1E5), ncol=1E2 )
	waldApply <- rcpp_apply_generator(
	"return sqrt( x.size() ) * mean(x) / sd(x);"
	)

	f <- function(x) {
	return( sqrt( length(x) ) * mean(x) / sd(x) )
	}

	all.equal( waldApply( X, 2 ), apply( X, 2, f ) )

	benchmark( replications=20, columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
	apply(X, 1, f),
	waldApply( X, 1 ),
	waldApply( t(X), 2 ),
	waldApply( X, 2),
	apply(X, 2, f)
	)
	```

	A huge speedup at almost no cost. Sweet! All the user needs to know is that the
	`fun` they supply must be in terms of `x`, and it must return a `double`, to fit
	into this apply framework. Of course, one could argue that the time lost due
	to compilation might offset any speed gains, but if you imagine performing this
	operation many times over many large matrices, it could be worth it.

	...and, interestingly, if you want a faster Rcpp row apply for really big matrices,
	you might be better served to perform a column apply on the transpose!