casallas/ng_lmm.md

## ng_lmm.md

      
    Raw
  

              ng_lmm.md
            
          
    lmm

Assumptions

Code based on: http://pages.stat.wisc.edu/~fyang/stat572/disc9.R
Homoscedasticity

# checking that e_i all have the same variance:
gsame_var <- function(flmer){
  plot(resid(flmer) ~ fitted(flmer), main="residual plot")
  abline(h=0)
}

Normality

# checking the normality of residuals e_i:
qqnorm(resid(flmer), main="Q-Q plot for residuals")
hist(resid(flmer))

# checking the normality of field effects alpha_i:
qqnorm(ranef(flmer)$field$"(Intercept)", main="Q-Q plot for the random effect" )

glmm

Goodness of fit

R-squared

Nakagawa and Schielzeth (2013) introduced a way to calculate the marginal and conditional $$R^2$$ for linear mixed-effects models.
A thorough explanation is given, as well as a simulation for the uncertainty is given by mbjoseph
I've modified jonlefchek's function to do this on a variety of [g]lmer/lme models: https://github.com/casallas/rsquared.glmer
Here's Jon's original blog post, along with the Nakagawa and Schielzeth's data: http://jonlefcheck.net/2013/03/13/r2-for-linear-mixed-effects-models/
Alternative

A (more complex) alternative (with R/Bugs code) is given by Gelman & Pardoe. (2006) http://www.stat.columbia.edu/~gelman/research/published/rsquared.pdf
This method is also explained (and coded) in Gelman & Hill 2007, Chapter ?.
nlmm

Usage

Other than the package examples I wasn't able to fit any models, but then I found these two messages from the
"r-sig-mixed-models" mailing list:

question: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/013033.html
answer by bbolker (in nlmer and nlme): https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/015936.html

Apparently nlmer (and perhaps also nlme) "assumes that the objective function has a gradient attribute associated.
If you have a simple objective function (i.e. one that R's built-in deriv() function can handle), this is simple to address".
ANOVA

When should I use type I, II, or III?

Rules of thumb:

If your data is balanced it doesn't matter (aov will work just fine).
If you don't have significant interactions use type II (use car::Anova or ez::ezANOVA).
If you have significant interactions use type III (use car::Anova or ez::ezANOVA, and proceed with care when interpreting your results).

A better explanation here: http://goanna.cs.rmit.edu.au/~fscholer/anova.php
"Effect sizes":


Use model.tables. Details on how this works and why you need to give some thought to order for unbalanced designs: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=8275 (this link also explains how model.tables(x, "means") works)

Example

model.tables(aov(yield ~ block + N * P + K, npk))
Yields:
Tables of effects

 block 
block
     1      2      3      4      5      6 
-0.850  2.575  5.900 -4.750 -4.350  1.475 

 N 
N
      0       1 
-2.8083  2.8083 

 P 
P
      0       1 
 0.5917 -0.5917 

 K 
K
      0       1 
 1.9917 -1.9917 

 N:P 
   P
N   0       1      
  0 -0.9417  0.9417
  1  0.9417 -0.9417

The first line (block) can be calculated as follows:
fit1 <- lm(yield ~ block, npk)
n.block <- length(coef(fit1))
-sum(coef(fit1)[-1]/n.block) + c(0, coef(fit1)[-1])

That is without controlling for N * P + K
The second line (N)
fit2 <- lm(yield ~ block + N, npk)
n.N <- length(coef(fit2)) - n.block + 1
i.N <- (n.block+1):(n.block+n.N-1)
-sum(coef(fit2)[i.N]/n.N) + c(0, coef(fit2)[i.N])

The third line (P)
fit3 <- lm(yield ~ block + N + P, npk)
n.P <- length(coef(fit3)) - length(coef(fit2)) + 1
i.P <- (length(coef(fit2))+1):(length(coef(fit2)) + n.P - 1)
-sum(coef(fit3)[i.P]/n.P) + c(0, coef(fit3)[i.P])

The fourth line (K)
fit4 <- lm(yield ~ block + N + P + K, npk)
n.K <- length(coef(fit4)) - length(coef(fit3)) + 1
i.K <- (length(coef(fit3)) + 1):((length(coef(fit3)) + n.K - 1))
-sum(coef(fit4)[i.K]/n.K) + c(0, coef(fit4)[i.K])

The fifth line (N:P)
fit5 <- lm(yield ~ block + N + P + K + N:P, npk) # The full model
n.NP <- c(n.N, n.P)
i.NP <- (length(coef(fit4)) + 1):((length(coef(fit4)) + prod(n.NP) - n.N - n.P + 1))
-sum(coef(fit5)[i.NP]/prod(n.NP)) + c(coef(fit5)[i.NP], 0)/n.NP # Row 1
-sum(coef(fit5)[i.NP]/prod(n.NP)) + c(0, coef(fit5)[i.NP])/n.NP # Row 2

With centered inputs

This is probably easier
library(arm) # standardize
library(magrittr) # %>% (not really necessary)

fit0 <- lm(yield ~ block + N * P + K, npk)
fit0 %>% aov %>% model.tables("effects")
fit0z <- fit0 %>% standardize

The first line (block)
n.block <- fit0$model$block %>% unique %>% length
i.block <- 2:n.block
-sum(coef(fit0z)[i.block]/n.block) + c(0, coef(fit0z)[i.block])

The second line (N)
n.N <- fit0$model$N %>% unique %>% length
i.N <- n.block + (1:n.N-1))
-sum(coef(fit0z)[i.N]/n.N) + c(0, coef(fit0z)[i.N])

The third line (P)
n.P <- fit0$model$P %>% unique %>% length
i.P <- max(i.N) + (1:(n.P-1))
-sum(coef(fit0z)[i.P]/n.P) + c(0, coef(fit0z)[i.P])

The fourth line (K)
n.K <- fit0$model$K %>% unique %>% length
i.K <- max(i.P) + (1:(n.K-1))
-sum(coef(fit0z)[i.K]/n.K) + c(0, coef(fit0z)[i.K])

The fifth line (N:P)
n.NP <- c(n.N, n.P)
i.NP <- max(i.K) + (1:(prod(n.NP) - n.N - n.P + 1))
-sum(coef(fit0z)[i.NP]/prod(n.NP)) + c(coef(fit0z)[i.NP], 0)/n.NP # Row 1
-sum(coef(fit0z)[i.NP]/prod(n.NP)) + c(0, coef(fit0z)[i.NP])/n.NP # Row 2