@emanuelhuber
Last active May 8, 2020 08:11
R tips and tricks

Link: https://gist.github.com/emanuelhuber/7688d5b476f0c3751c7be4c99bb2d7bd

Use findInterval() to find the interval (bin) that each value of a vector falls into.
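As a minimal sketch of what findInterval() returns (the break points and values below are made up for illustration):

```r
# Hypothetical data: sorted break points and some values to classify.
breaks <- c(0, 10, 20, 30)
x <- c(-5, 0, 9.9, 10, 25, 30, 99)

# findInterval() returns i such that breaks[i] <= x < breaks[i + 1];
# 0 means "below the first break", length(breaks) means "at or above the last".
bin <- findInterval(x, breaks)
bin  # 0 1 1 2 3 4 4
```

The break points must be sorted in non-decreasing order.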

Figures

plot()

# do not plot the box
bty = "n"

Misc

Class

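Instead of class(x) == "foo", use inherits(x, "foo") or alternatively is(x, "foo") (source). An object can have more than one class, in which case class(x) == "foo" returns a vector of several logical values rather than a single TRUE or FALSE.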
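A small illustration of the difference, assuming a toy object with two hypothetical S3 classes "bar" and "foo":

```r
# An object can carry several classes, so class(x) may have length > 1.
x <- structure(list(), class = c("bar", "foo"))  # hypothetical classes

by_equality <- class(x) == "foo"   # FALSE TRUE: length 2, unsafe inside if()
by_inherits <- inherits(x, "foo")  # TRUE: always a single logical value
not_foo     <- inherits(1:3, "foo")  # FALSE
```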

Display precision

options(digits=10)
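A short sketch of the effect, which also shows how the previous setting can be restored (the variable names are only illustrative):

```r
old <- options(digits = 10)        # options() returns the previous settings
wide <- format(pi)                 # "3.141592654"
options(old)                       # restore the earlier precision

# digits can also be set per call, without touching the global option
short <- format(pi, digits = 4)    # "3.142"
```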

Misc

# not x[ind,] but
x[ind, , drop=FALSE]
# not 1:length(x) but
seq(along.with=x)
# or
seq_along(x)
# not 1:n (which gives c(1, 0) when n is 0) but
seq_len(n)
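The reason behind these idioms shows up in the edge cases. A minimal sketch, using a made-up 2 x 3 matrix and an empty vector:

```r
x <- matrix(1:6, nrow = 2)    # hypothetical 2 x 3 matrix
ind <- 1

d_vec <- dim(x[ind, ])                # NULL: the single row was dropped to a vector
d_mat <- dim(x[ind, , drop = FALSE])  # 1 3: still a matrix

empty <- numeric(0)
bad  <- 1:length(empty)       # 1 0: a for loop over this would run twice
good <- seq_along(empty)      # integer(0): a for loop over this runs zero times
```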

Range with unwanted infinite values

v <- c(10, 12, 13, NA, Inf)

range(v)  # NA NA
range(v, na.rm = TRUE)  # 10 Inf
range(v, finite = TRUE) # 10 13

Algebra

Cross-product

A vectorized way of performing the cross-product operation would be to use t(x) %*% y; however, for large vectors the implementation crossprod(x, y) is faster.

crossprod(X,Y)		# = t(X) %*% Y
tcrossprod(X,Y)  	# = X %*% t(Y)
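The two identities in the comments can be checked on small matrices. The sketch below uses made-up 3 x 2 matrices; note that crossprod() needs matching row counts and tcrossprod() matching column counts:

```r
X <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)    # hypothetical 3 x 2 matrix
Y <- matrix(c(6, 5, 4, 3, 2, 1), nrow = 3)    # hypothetical 3 x 2 matrix

same_cross  <- all.equal(crossprod(X, Y),  t(X) %*% Y)    # 2 x 2 result
same_tcross <- all.equal(tcrossprod(X, Y), X %*% t(Y))    # 3 x 3 result
same_gram   <- all.equal(crossprod(X), t(X) %*% X)        # one-argument form
```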

Subtract a vector from each row of a matrix: X - y

X is a matrix and y is a vector with one element per column of X.

sweep(X, 2, y, '-')
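To make the direction of the operation concrete, here is a small sketch with made-up values. It compares sweep() with two equivalent base R formulations:

```r
X <- matrix(1:6, nrow = 2)    # hypothetical 2 x 3 matrix
y <- c(1, 2, 3)               # one value per column of X

R1 <- sweep(X, 2, y, "-")             # subtract y[j] from column j
R2 <- t(t(X) - y)                     # same result via recycling on the transpose
R3 <- X - rep(y, each = nrow(X))      # same result with an explicit repeat
```

Writing X - y directly would recycle y down the columns instead, which is usually not what is wanted here.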

The Moore-Penrose generalised inverse:

A^+ = (A^T A)^{-1} A^T when A has full column rank, and A^+ = A^T (A A^T)^{-1} when A has full row rank. ginv() handles the general case.

library(MASS)
A <- rbind(c(1,3,2),c(2,8,9))
ginv(A)
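As a sanity check that needs only base R, the sketch below builds the generalised inverse of the same matrix from the closed form for a full row rank matrix, A^T (A A^T)^{-1}, and verifies two of the Penrose conditions. The result should agree with ginv(A) up to rounding.

```r
A <- rbind(c(1, 3, 2), c(2, 8, 9))    # 2 x 3, full row rank

# Closed form for full row rank: t(A) %*% (A %*% t(A))^-1
Ap <- t(A) %*% solve(tcrossprod(A))

# Two of the Penrose conditions that a generalised inverse must satisfy
cond1 <- all.equal(A %*% Ap %*% A, A)      # TRUE
cond2 <- all.equal(Ap %*% A %*% Ap, Ap)    # TRUE
```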

Matrix operations

The order in which matrix operations are performed can have a major impact on computational speed (see 'Generalized Additive Models: An Introduction with R').

n <- 1000
A <- matrix(runif(n * n), n, n)
B <- matrix(runif(n * n), n, n)
y <- runif(n)
system.time((A %*% B) %*% y)    # wrong way: forms the n x n product A %*% B first
# [1] 5.60 0.01 5.61   NA   NA
system.time(A %*% (B %*% y))    # right way: two matrix-vector products
# [1] 0.02 0.00 0.02   NA   NA

tr(AB) = tr(BA)

m <- 500
n <- 5000
A <- matrix(runif(m * n), n, m)
B <- matrix(runif(m * n), m, n)
y <- runif(n)
system.time(sum(diag(A %*% B)))    # wrong way: forms an n x n matrix
# [1] 5.60 0.01 5.61   NA   NA
system.time(sum(diag(B %*% A)))    # right way: forms an m x m matrix
# [1] 0.02 0.00 0.02   NA   NA
system.time(sum(t(B) * A))         # much better way: no matrix product at all
# [1] 0.11 0.00 0.11   NA   NA
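The reorderings above change only the cost, not the result. A quick check on small made-up matrices (the sizes and the seed are arbitrary):

```r
set.seed(1)
A <- matrix(runif(6 * 4), 6, 4)    # hypothetical 6 x 4 matrix
B <- matrix(runif(4 * 6), 4, 6)    # hypothetical 4 x 6 matrix
y <- runif(6)

# Matrix multiplication is associative, so both groupings agree
same_product <- all.equal((A %*% B) %*% y, A %*% (B %*% y))

# Three ways to compute tr(AB)
t1 <- sum(diag(A %*% B))    # via the 6 x 6 product
t2 <- sum(diag(B %*% A))    # via the 4 x 4 product
t3 <- sum(t(B) * A)         # elementwise: sum over i, j of A[i, j] * B[j, i]
```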

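Distance in matrix form

# Xa, Xb: n x 3 matrices (one point per row)
Y <- Xa - Xb
D <- crossprod(Y, Y)   # 3 x 3 matrix t(Y) %*% Y; sum(diag(D)) is the total squared distance
# squared distance between matching rows of Xa and Xb
d2 <- rowSums(Y^2)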

Matrix power

"%^%" <- function(A, n){ 
	if(n == 1) A else A %*% (A %^% (n-1))
}
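The recursive definition above performs n - 1 matrix products. As an alternative sketch (not the author's method), exponentiation by squaring needs only about log2(n) steps and also handles n = 0. The name matpow is hypothetical.

```r
# Hypothetical helper: matrix power by repeated squaring.
matpow <- function(A, n) {
  stopifnot(is.matrix(A), nrow(A) == ncol(A), n >= 0, n == round(n))
  result <- diag(nrow(A))                # identity matrix, i.e. A^0
  while (n > 0) {
    if (n %% 2 == 1) result <- result %*% A   # use the current binary digit of n
    A <- A %*% A                              # square for the next digit
    n <- n %/% 2
  }
  result
}

Fib <- matrix(c(1, 1, 1, 0), 2, 2)
P10 <- matpow(Fib, 10)    # rows: 89 55 / 55 34 (Fibonacci numbers)
```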

Choose the next power of 2 greater than or equal to n

nextpower2 <- function(x){
  return(2^(ceiling(log2(x))))
}

Binary files

#  returns the current position in con
ftell <- function(con){
  return(seek(con))
}
# Move to a specified position in the file
# origin: "current", "start" or "end"
fseek <- function(con, where, origin="current"){
  seek(con, where=where, origin=origin)
}
# number of bytes in connection
flen <- function(con){
  pos0 <- ftell(con)
  seek(con,0,"end")
  pos <- ftell(con)
  seek(con,where=pos0,"start")
  return(pos)
}
# place pointer at the beginning
frewind <- function(con){
  seek(con, where=0, origin="start")	# start position (where must be numeric)
}
# test end of file
feof <- function(con){
  return(ftell(con) == flen(con))
}
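The helpers above are thin wrappers around seek(). A self-contained sketch of the same operations on a temporary file, using seek() directly; the file contents and variable names are made up for illustration:

```r
# Hypothetical demo file: 10 integers written as 4-byte values (40 bytes).
path <- tempfile(fileext = ".bin")
writeBin(1:10, path, size = 4L)

con <- file(path, open = "rb")
seek(con, where = 0, origin = "end")      # jump to the end of the file
n_bytes <- seek(con)                      # current position = file length in bytes
seek(con, where = 0, origin = "start")    # rewind to the beginning
seek(con, where = 4 * 4, origin = "current")   # skip the first four integers
fifth <- readBin(con, what = "integer", n = 1L, size = 4L)
pos <- seek(con)                          # position after reading the fifth value
close(con)
unlink(path)
```

The documentation of seek() warns that it can be unreliable on Windows, so results there are worth double-checking.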

Output

sprintf("%05d", 9548)  # "09548"