dmbates/Intro.md

## Intro.md

      
    Raw
  

              Intro.md
            
          
    Regular sparse matrices

In previous versions of the lme4 package for R, the random-effects
model matrix, Z, or its transpose, Zt, have been stored as
compressed sparse column-oriented
matrices, which is a general sparse matrix representation.
As implemented in the
Matrix package for R,
an m by n dgCMatrix matrix with nnz non-zeros has three vector slots: p,
which is the length n + 1 vector of column pointers, i, which is
the length nnz vector of row indices, and x, which is the length
nnz vector of non-zero values.
For the types of mixed-effects models fit by the lme4 package, the
matrix Z is generated from indicator columns and has a constant
number, k, of non-zeros in each row.  Thus its transpose, Zt, has
k non-zeros in each column.  If Zt is stored as a dgCMatrix
object, the p slot will be a linear sequence of multiples of k
starting at 0 and ending at n*k.  We can avoid storing and using
the p slot if we store the i and x slots as dense matrices of
size k by n. Indexing into these matrices to select a particular
column of Zt (i.e. a row of Z) uses dense array indexing which is
faster and more convenient than indexing into a column of a
dgCMatrix.
Examples

For a model with a single simple scalar random-effects term (i.e. a
term of the form (1|grp), such as
lmer(Yield ~ 1 + (1|Batch), Dyestuff)

the RSC representation of Zt is
  ..@ i   : int [1, 1:30] 0 0 0 0 0 1 1 1 1 1 ...
  ..@ x   : num [1, 1:30] 1 1 1 1 1 1 1 1 1 1 ...

That is k=1 and n=30.  The x slot is a matrix of 1's and the i
slot is
matrix(as.integer(Dyestuff$Batch) - 1L, nrow=1)

That is, the i slot is the 0-based indices into the factor levels of
Batch.
For a model with multiple simple scalar random-effects terms such as
lmer(diameter ~ 1 + (1|plate) + (1|sample), Penicillin)

the i slot of the RSC representation of Zt is built from
    > im <- do.call(rbind,lapply(Penicillin[c("plate","sample")], 
    +               function(f) as.integer(f) - 1L))
    > im["sample",] <- im["sample",] + 24L
    > im[,1:7]
           [,1] [,2] [,3] [,4] [,5] [,6] [,7]
    plate     0    0    0    0    0    0    1
    sample   24   25   26   27   28   29   24
    > im[, 135:144]
           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    plate    22   22   22   22   23   23   23   23   23    23
    sample   26   27   28   29   24   25   26   27   28    29
    > str(im)
    int [1:2, 1:144] 0 24 0 25 0 26 0 27 0 28 ...
     - attr(*, "dimnames")=List of 2
      ..$ : chr [1:2] "plate" "sample"
      ..$ : NULL

The x slot is an 2 by 144 matrix of 1's.
For a model with vector valued random effects terms, such as
    lmer(Reaction ~ 1 + Days + (1 + Days|Subject), sleepstudy)

the i slot is a 2 by 180 matrix of the form
> i <- 2L * (as.integer(sleepstudy$Subject) - 1L)  
> im <- rbind(i, i+1L)
> str(im)
 int [1:2, 1:180] 0 1 0 1 0 1 0 1 0 1 ...
   - attr(*, "dimnames")=List of 2
    ..$ : chr [1:2] "i" ""
    ..$ : NULL
> im[, 1:11]
   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
i    0    0    0    0    0    0    0    0    0     0     2
     1    1    1    1    1    1    1    1    1     1     3

and the x slot is
> x <- t(model.matrix(~ 1 + Days, sleepstudy))
> x[,1:20]
            1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
(Intercept) 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1
Days        0 1 2 3 4 5 6 7 8  9  0  1  2  3  4  5  6  7  8  9

These examples should not be considered representative of the actual
code used to generate these matrices.  I will find cleaner ways of
doing that.
Conversion to the dgCMatrix representation only requires generating
the p slot.  For the Dyestuff model
> sparseMatrix(i=as.vector(i), x=as.vector(x), p=0:30, index1=FALSE, giveCsparse=TRUE)
 6 x 30 sparse Matrix of class "dgCMatrix"
 
[1,] 1 1 1 1 1 . . . . . . . . . . . . . . . . . . . . . . . . .
[2,] . . . . . 1 1 1 1 1 . . . . . . . . . . . . . . . . . . . .
[3,] . . . . . . . . . . 1 1 1 1 1 . . . . . . . . . . . . . . .
[4,] . . . . . . . . . . . . . . . 1 1 1 1 1 . . . . . . . . . .
[5,] . . . . . . . . . . . . . . . . . . . . 1 1 1 1 1 . . . . .
[6,] . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 1 1 1

Appending the rows of Xt

Because the x slot of the RSC representation of Zt is already a
dense matrix with n columns, we can append the rows of the transpose
of X, the model matrix for the fixed-effects parameters.  At present
I have appended the appropriate rows to the i slot so that the two
arrays define an RSC representation of a (q + p) by n sparse
matrix with Zt in the first q rows and Xt in the last p rows.
This makes updating the system matrix for the penalized least squares
problem, and its sparse Cholesky factor, easier.  I'll have to do some
timings to see if this is a good idea.  I fear this may blow up the
size of the i array needlessly.  Also, it will use sparse matrix
operations for the parts of the update that could be done with dense
matrix operations.  In general I think the convenience will outweigh
possible disadvantages.
Updating for new values of theta

The current strategy is to update the value of the system matrix for
the penalized least squares problem in place.  The matrix is stored as
a dsCMatrix object in the A slot.  A list of Cholesky factors
(usually only one) is stored in its factors slot and these are
updated when A is updated (still need to write that part).
The pre-multiplication of Zt by Lambdat is performed one column of
Zt (row of Z) at a time using a rather cute algorithm that
requires only theta, the variance-component parameter vector, and lower,
the vector of lower bounds on these parameters.  It happens that an
element of lower is zero if and only if the corresponding element of
theta occurs on the diagonal of Lambdat and this, combined with
the way that the elements of theta are laid out allows for
multiplication of parts of the column in the x slot by upper
triangular matrices in place.
The update in a simple example

Returning to the simple example with the Dyestuff data
> fm1 <- lmer(Yield ~ 1 + (1|Batch), Dyestuff)
> pred <- createRSC(fl=as.integer(Dyestuff$Batch), xv=matrix(1, nrow=2L, ncol=nrow(Dyestuff)))
> str(pred)
Formal class 'RSC' [package "lme4"] with 6 slots
  ..@ i    : int [1:2, 1:30] 0 6 0 6 0 6 0 6 0 6 ...
  ..@ x    : num [1:2, 1:30] 1 1 1 1 1 1 1 1 1 1 ...
  ..@ theta: num 1
  ..@ lower: num 0
  ..@ A    :Formal class 'dsCMatrix' [package "Matrix"] with 7 slots
  .. .. ..@ i       : int [1:13] 0 1 2 3 4 5 0 1 2 3 ...
  .. .. ..@ p       : int [1:8] 0 1 2 3 4 5 6 13
  .. .. ..@ Dim     : int [1:2] 7 7
  .. .. ..@ Dimnames:List of 2
  .. .. .. ..$ : NULL
  .. .. .. ..$ : NULL
  .. .. ..@ x       : num [1:13] 6 6 6 6 6 6 5 5 5 5 ...
  .. .. ..@ uplo    : chr "U"
  .. .. ..@ factors :List of 1
  .. .. .. ..$ spDCholesky:Formal class 'dCHMsimpl' [package "Matrix"] with 10 slots
  .. .. .. .. .. ..@ x       : num [1:13] 6 0.833 6 0.833 6 ...
  .. .. .. .. .. ..@ p       : int [1:8] 0 2 4 6 8 10 12 13
  .. .. .. .. .. ..@ i       : int [1:13] 0 6 1 6 2 6 3 6 4 6 ...
  .. .. .. .. .. ..@ nz      : int [1:7] 2 2 2 2 2 2 1
  .. .. .. .. .. ..@ nxt     : int [1:9] 1 2 3 4 5 6 7 -1 0
  .. .. .. .. .. ..@ prv     : int [1:9] 8 0 1 2 3 4 5 6 -1
  .. .. .. .. .. ..@ colcount: int [1:7] 2 2 2 2 2 2 1
  .. .. .. .. .. ..@ perm    : int [1:7] 0 1 2 3 4 5 6
  .. .. .. .. .. ..@ type    : int [1:4] 0 0 0 1
  .. .. .. .. .. ..@ Dim     : int [1:2] 7 7
  ..@ ubeta: num [1:7] 0 0 0 0 0 0 0
> pred@A
7 x 7 sparse Matrix of class "dsCMatrix"
                   
[1,] 6 . . . . .  5
[2,] . 6 . . . .  5
[3,] . . 6 . . .  5
[4,] . . . 6 . .  5
[5,] . . . . 6 .  5
[6,] . . . . . 6  5
[7,] 5 5 5 5 5 5 30
> pred@theta[] <- getME(fm1, "theta")
> RSCupdate(pred, Dyestuff$Yield)
NULL
> pred@A
7 x 7 sparse Matrix of class "dsCMatrix"

[1,] 4.59832 .       .       .       .       .        4.24165
[2,] .       4.59832 .       .       .       .        4.24165
[3,] .       .       4.59832 .       .       .        4.24165
[4,] .       .       .       4.59832 .       .        4.24165
[5,] .       .       .       .       4.59832 .        4.24165
[6,] .       .       .       .       .       4.59832  4.24165
[7,] 4.24165 4.24165 4.24165 4.24165 4.24165 4.24165 30.00000
> as(Cholesky(pred@A, perm=FALSE, LDL=FALSE), "sparseMatrix")
7 x 7 sparse Matrix of class "dtCMatrix"
    
[1,] 2.144369 .        .        .        .        .        .       
[2,] .        2.144369 .        .        .        .        .       
[3,] .        .        2.144369 .        .        .        .       
[4,] .        .        .        2.144369 .        .        .       
[5,] .        .        .        .        2.144369 .        .       
[6,] .        .        .        .        .        2.144369 .       
[7,] 1.978041 1.978041 1.978041 1.978041 1.978041 1.978041 2.554236
> as(getME(fm1, "L"), "sparseMatrix")
6 x 6 sparse Matrix of class "dtCMatrix"
                                                          
[1,] 2.144369 .        .        .        .        .       
[2,] .        2.144369 .        .        .        .       
[3,] .        .        2.144369 .        .        .       
[4,] .        .        .        2.144369 .        .       
[5,] .        .        .        .        2.144369 .       
[6,] .        .        .        .        .        2.144369
> t(getME(fm1, "RZX"))
         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 1.978041 1.978041 1.978041 1.978041 1.978041 1.978041
> getME(fm1, "RX")
         [,1]
[1,] 2.554236