Skip to content

Instantly share code, notes, and snippets.

@dmbates
Created December 18, 2012 20:16
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save dmbates/4331559 to your computer and use it in GitHub Desktop.
Save dmbates/4331559 to your computer and use it in GitHub Desktop.
Use of the "regular sparse column-oriented" representation for mixed-effects models

Regular sparse matrices

In previous versions of the lme4 package for R, the random-effects model matrix, Z, or its transpose, Zt, have been stored as compressed sparse column-oriented matrices, which is a general sparse matrix representation. As implemented in the Matrix package for R, an m by n dgCMatrix matrix with nnz non-zeros has three vector slots: p, which is the length n + 1 vector of column pointers, i, which is the length nnz vector of row indices, and x, which is the length nnz vector of non-zero values.

For the types of mixed-effects models fit by the lme4 package, the matrix Z is generated from indicator columns and has a constant number, k, of non-zeros in each row. Thus its transpose, Zt, has k non-zeros in each column. If Zt is stored as a dgCMatrix object, the p slot will be a linear sequence of multiples of k starting at 0 and ending at n*k. We can avoid storing and using the p slot if we store the i and x slots as dense matrices of size k by n. Indexing into these matrices to select a particular column of Zt (i.e. a row of Z) uses dense array indexing which is faster and more convenient than indexing into a column of a dgCMatrix.

Examples

For a model with a single simple scalar random-effects term (i.e. a term of the form (1|grp), such as

lmer(Yield ~ 1 + (1|Batch), Dyestuff)

the RSC representation of Zt is

  ..@ i   : int [1, 1:30] 0 0 0 0 0 1 1 1 1 1 ...
  ..@ x   : num [1, 1:30] 1 1 1 1 1 1 1 1 1 1 ...

That is k=1 and n=30. The x slot is a matrix of 1's and the i slot is

matrix(as.integer(Dyestuff$Batch) - 1L, nrow=1)

That is, the i slot is the 0-based indices into the factor levels of Batch.

For a model with multiple simple scalar random-effects terms such as

lmer(diameter ~ 1 + (1|plate) + (1|sample), Penicillin)

the i slot of the RSC representation of Zt is built from

    > im <- do.call(rbind,lapply(Penicillin[c("plate","sample")], 
    +               function(f) as.integer(f) - 1L))
    > im["sample",] <- im["sample",] + 24L
    > im[,1:7]
           [,1] [,2] [,3] [,4] [,5] [,6] [,7]
    plate     0    0    0    0    0    0    1
    sample   24   25   26   27   28   29   24
    > im[, 135:144]
           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    plate    22   22   22   22   23   23   23   23   23    23
    sample   26   27   28   29   24   25   26   27   28    29
    > str(im)
    int [1:2, 1:144] 0 24 0 25 0 26 0 27 0 28 ...
     - attr(*, "dimnames")=List of 2
      ..$ : chr [1:2] "plate" "sample"
      ..$ : NULL

The x slot is an 2 by 144 matrix of 1's.

For a model with vector valued random effects terms, such as

    lmer(Reaction ~ 1 + Days + (1 + Days|Subject), sleepstudy)

the i slot is a 2 by 180 matrix of the form

> i <- 2L * (as.integer(sleepstudy$Subject) - 1L)  
> im <- rbind(i, i+1L)
> str(im)
 int [1:2, 1:180] 0 1 0 1 0 1 0 1 0 1 ...
   - attr(*, "dimnames")=List of 2
    ..$ : chr [1:2] "i" ""
    ..$ : NULL
> im[, 1:11]
   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
i    0    0    0    0    0    0    0    0    0     0     2
     1    1    1    1    1    1    1    1    1     1     3

and the x slot is

> x <- t(model.matrix(~ 1 + Days, sleepstudy))
> x[,1:20]
            1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
(Intercept) 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1
Days        0 1 2 3 4 5 6 7 8  9  0  1  2  3  4  5  6  7  8  9

These examples should not be considered representative of the actual code used to generate these matrices. I will find cleaner ways of doing that.

Conversion to the dgCMatrix representation only requires generating the p slot. For the Dyestuff model

> sparseMatrix(i=as.vector(i), x=as.vector(x), p=0:30, index1=FALSE, giveCsparse=TRUE)
 6 x 30 sparse Matrix of class "dgCMatrix"
 
[1,] 1 1 1 1 1 . . . . . . . . . . . . . . . . . . . . . . . . .
[2,] . . . . . 1 1 1 1 1 . . . . . . . . . . . . . . . . . . . .
[3,] . . . . . . . . . . 1 1 1 1 1 . . . . . . . . . . . . . . .
[4,] . . . . . . . . . . . . . . . 1 1 1 1 1 . . . . . . . . . .
[5,] . . . . . . . . . . . . . . . . . . . . 1 1 1 1 1 . . . . .
[6,] . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 1 1 1

Appending the rows of Xt

Because the x slot of the RSC representation of Zt is already a dense matrix with n columns, we can append the rows of the transpose of X, the model matrix for the fixed-effects parameters. At present I have appended the appropriate rows to the i slot so that the two arrays define an RSC representation of a (q + p) by n sparse matrix with Zt in the first q rows and Xt in the last p rows. This makes updating the system matrix for the penalized least squares problem, and its sparse Cholesky factor, easier. I'll have to do some timings to see if this is a good idea. I fear this may blow up the size of the i array needlessly. Also, it will use sparse matrix operations for the parts of the update that could be done with dense matrix operations. In general I think the convenience will outweigh possible disadvantages.

Updating for new values of theta

The current strategy is to update the value of the system matrix for the penalized least squares problem in place. The matrix is stored as a dsCMatrix object in the A slot. A list of Cholesky factors (usually only one) is stored in its factors slot and these are updated when A is updated (still need to write that part).

The pre-multiplication of Zt by Lambdat is performed one column of Zt (row of Z) at a time using a rather cute algorithm that requires only theta, the variance-component parameter vector, and lower, the vector of lower bounds on these parameters. It happens that an element of lower is zero if and only if the corresponding element of theta occurs on the diagonal of Lambdat and this, combined with the way that the elements of theta are laid out allows for multiplication of parts of the column in the x slot by upper triangular matrices in place.

The update in a simple example

Returning to the simple example with the Dyestuff data

> fm1 <- lmer(Yield ~ 1 + (1|Batch), Dyestuff)
> pred <- createRSC(fl=as.integer(Dyestuff$Batch), xv=matrix(1, nrow=2L, ncol=nrow(Dyestuff)))
> str(pred)
Formal class 'RSC' [package "lme4"] with 6 slots
  ..@ i    : int [1:2, 1:30] 0 6 0 6 0 6 0 6 0 6 ...
  ..@ x    : num [1:2, 1:30] 1 1 1 1 1 1 1 1 1 1 ...
  ..@ theta: num 1
  ..@ lower: num 0
  ..@ A    :Formal class 'dsCMatrix' [package "Matrix"] with 7 slots
  .. .. ..@ i       : int [1:13] 0 1 2 3 4 5 0 1 2 3 ...
  .. .. ..@ p       : int [1:8] 0 1 2 3 4 5 6 13
  .. .. ..@ Dim     : int [1:2] 7 7
  .. .. ..@ Dimnames:List of 2
  .. .. .. ..$ : NULL
  .. .. .. ..$ : NULL
  .. .. ..@ x       : num [1:13] 6 6 6 6 6 6 5 5 5 5 ...
  .. .. ..@ uplo    : chr "U"
  .. .. ..@ factors :List of 1
  .. .. .. ..$ spDCholesky:Formal class 'dCHMsimpl' [package "Matrix"] with 10 slots
  .. .. .. .. .. ..@ x       : num [1:13] 6 0.833 6 0.833 6 ...
  .. .. .. .. .. ..@ p       : int [1:8] 0 2 4 6 8 10 12 13
  .. .. .. .. .. ..@ i       : int [1:13] 0 6 1 6 2 6 3 6 4 6 ...
  .. .. .. .. .. ..@ nz      : int [1:7] 2 2 2 2 2 2 1
  .. .. .. .. .. ..@ nxt     : int [1:9] 1 2 3 4 5 6 7 -1 0
  .. .. .. .. .. ..@ prv     : int [1:9] 8 0 1 2 3 4 5 6 -1
  .. .. .. .. .. ..@ colcount: int [1:7] 2 2 2 2 2 2 1
  .. .. .. .. .. ..@ perm    : int [1:7] 0 1 2 3 4 5 6
  .. .. .. .. .. ..@ type    : int [1:4] 0 0 0 1
  .. .. .. .. .. ..@ Dim     : int [1:2] 7 7
  ..@ ubeta: num [1:7] 0 0 0 0 0 0 0
> pred@A
7 x 7 sparse Matrix of class "dsCMatrix"
                   
[1,] 6 . . . . .  5
[2,] . 6 . . . .  5
[3,] . . 6 . . .  5
[4,] . . . 6 . .  5
[5,] . . . . 6 .  5
[6,] . . . . . 6  5
[7,] 5 5 5 5 5 5 30
> pred@theta[] <- getME(fm1, "theta")
> RSCupdate(pred, Dyestuff$Yield)
NULL
> pred@A
7 x 7 sparse Matrix of class "dsCMatrix"

[1,] 4.59832 .       .       .       .       .        4.24165
[2,] .       4.59832 .       .       .       .        4.24165
[3,] .       .       4.59832 .       .       .        4.24165
[4,] .       .       .       4.59832 .       .        4.24165
[5,] .       .       .       .       4.59832 .        4.24165
[6,] .       .       .       .       .       4.59832  4.24165
[7,] 4.24165 4.24165 4.24165 4.24165 4.24165 4.24165 30.00000
> as(Cholesky(pred@A, perm=FALSE, LDL=FALSE), "sparseMatrix")
7 x 7 sparse Matrix of class "dtCMatrix"
    
[1,] 2.144369 .        .        .        .        .        .       
[2,] .        2.144369 .        .        .        .        .       
[3,] .        .        2.144369 .        .        .        .       
[4,] .        .        .        2.144369 .        .        .       
[5,] .        .        .        .        2.144369 .        .       
[6,] .        .        .        .        .        2.144369 .       
[7,] 1.978041 1.978041 1.978041 1.978041 1.978041 1.978041 2.554236
> as(getME(fm1, "L"), "sparseMatrix")
6 x 6 sparse Matrix of class "dtCMatrix"
                                                          
[1,] 2.144369 .        .        .        .        .       
[2,] .        2.144369 .        .        .        .       
[3,] .        .        2.144369 .        .        .       
[4,] .        .        .        2.144369 .        .       
[5,] .        .        .        .        2.144369 .       
[6,] .        .        .        .        .        2.144369
> t(getME(fm1, "RZX"))
         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 1.978041 1.978041 1.978041 1.978041 1.978041 1.978041
> getME(fm1, "RX")
         [,1]
[1,] 2.554236
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment