@hturner
Created January 22, 2020 16:19
Some examples where explicitly using an integer scalar or vector is more efficient in R. The difference in a single operation is often quite small, but in a script that uses integer values repeatedly, explicit typing can be an easy way to gain speed.
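As a small illustration of what "explicit typing" means here, the sketch below checks the storage type of a literal with and without the `L` suffix, and confirms that both forms give the same comparison result on an integer matrix. The names `x_dbl`, `x_int`, `m`, and `same` are made up for this example. The usual explanation for any speed difference is that mixing integer and double operands can involve type coercion; how much that matters may depend on the R version, so treat the benchmarks as the evidence rather than this sketch.

```r
# A literal without the "L" suffix is stored as a double;
# with the suffix it is stored as an integer.
x_dbl <- 1
x_int <- 1L
typeof(x_dbl)
typeof(x_int)

# sample() on an integer range returns integers, so the matrix is integer too.
m <- matrix(sample(0:10, 20, replace = TRUE), nrow = 4)
typeof(m)

# Both comparisons produce the same logical matrix; only the type of the
# right-hand operand differs.
same <- identical(m > 0, m > 0L)
same
```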
library(microbenchmark)
# Examples where explicitly typing with "L" is more efficient
## create a reasonably large matrix
r <- 5000
c <- 100
ints <- sample(0:10, r*c, replace = TRUE)
M <- matrix(ints, nrow = r, ncol = c)
## comparison to integer scalar
microbenchmark(M > 0, M > 0L, unit = "ms")
## comparison to integer vector
microbenchmark(M >= rep.int(1, r), M >= rep.int(1L, r), unit = "ms")
# Example where integer vector is more efficient
## index might originally come from a string, e.g. group label
grp_char <- sample(as.character(1:10), 10000, replace = TRUE)
num <- rnorm(10000)
## rowsum fastest with integer
## (the conversion itself takes time, but if it is done once in a script
## that uses the index many times, there can be an overall gain)
grp_int <- as.integer(grp_char)
grp_num <- as.numeric(grp_char)
grp_fac <- as.factor(grp_char)
microbenchmark(rowsum(num, grp_char),
               rowsum(num, grp_num),
               rowsum(num, grp_fac),
               rowsum(num, grp_int), unit = "ms")
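One thing worth checking before swapping a character index for an integer one is whether the results still line up. The sketch below is a sanity check rather than part of the original benchmark, and it uses its own smaller data and made-up names (`by_char`, `by_int`, `aligned`). With the default `reorder = TRUE`, `rowsum()` sorts the groups, so character labels come out in string order ("1", "10", "2", ...) while integer labels come out in numeric order. The group sums should match once the rows are aligned by name.

```r
set.seed(1)
grp_char <- sample(as.character(1:10), 1000, replace = TRUE)
num <- rnorm(1000)

by_char <- rowsum(num, grp_char)              # rows in string order
by_int  <- rowsum(num, as.integer(grp_char))  # rows in numeric order

# The row order differs between the two results.
rownames(by_char)
rownames(by_int)

# Align the character-indexed result to the integer ordering by row name,
# then compare the group sums.
aligned <- by_char[rownames(by_int), ]
isTRUE(all.equal(aligned, by_int[, 1]))
```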
@SaranjeetKaur
Interesting!