Skip to content

Instantly share code, notes, and snippets.

@casallas
Last active January 4, 2016 02:59
Show Gist options
  • Save casallas/8559132 to your computer and use it in GitHub Desktop.
Save casallas/8559132 to your computer and use it in GitHub Desktop.
(non | generalized) linear mixed-effects models (or multi-level [non | generalized] linear models), and ANOVA

lmm

Assumptions

Code based on: http://pages.stat.wisc.edu/~fyang/stat572/disc9.R

Homoscedasticity

# checking that e_i all have the same variance:
gsame_var <- function(flmer){
  plot(resid(flmer) ~ fitted(flmer), main="residual plot")
  abline(h=0)
}

Normality

# checking the normality of residuals e_i:
qqnorm(resid(flmer), main="Q-Q plot for residuals")
hist(resid(flmer))

# checking the normality of field effects alpha_i:
qqnorm(ranef(flmer)$field$"(Intercept)", main="Q-Q plot for the random effect" )

glmm

Goodness of fit

R-squared

Nakagawa and Schielzeth (2013) introduced a way to calculate the marginal and conditional $$R^2$$ for linear mixed-effects models.

A thorough explanation is given, as well as a simulation for the uncertainty is given by mbjoseph

I've modified jonlefchek's function to do this on a variety of [g]lmer/lme models: https://github.com/casallas/rsquared.glmer

Here's Jon's original blog post, along with the Nakagawa and Schielzeth's data: http://jonlefcheck.net/2013/03/13/r2-for-linear-mixed-effects-models/

Alternative

A (more complex) alternative (with R/Bugs code) is given by Gelman & Pardoe. (2006) http://www.stat.columbia.edu/~gelman/research/published/rsquared.pdf

This method is also explained (and coded) in Gelman & Hill 2007, Chapter ?.

nlmm

Usage

Other than the package examples I wasn't able to fit any models, but then I found these two messages from the "r-sig-mixed-models" mailing list:

Apparently nlmer (and perhaps also nlme) "assumes that the objective function has a gradient attribute associated. If you have a simple objective function (i.e. one that R's built-in deriv() function can handle), this is simple to address".

ANOVA

When should I use type I, II, or III?

Rules of thumb:

  • If your data is balanced it doesn't matter (aov will work just fine).
  • If you don't have significant interactions use type II (use car::Anova or ez::ezANOVA).
  • If you have significant interactions use type III (use car::Anova or ez::ezANOVA, and proceed with care when interpreting your results).

A better explanation here: http://goanna.cs.rmit.edu.au/~fscholer/anova.php

"Effect sizes":

Example

model.tables(aov(yield ~ block + N * P + K, npk)) Yields:

Tables of effects

 block 
block
     1      2      3      4      5      6 
-0.850  2.575  5.900 -4.750 -4.350  1.475 

 N 
N
      0       1 
-2.8083  2.8083 

 P 
P
      0       1 
 0.5917 -0.5917 

 K 
K
      0       1 
 1.9917 -1.9917 

 N:P 
   P
N   0       1      
  0 -0.9417  0.9417
  1  0.9417 -0.9417

The first line (block) can be calculated as follows:

fit1 <- lm(yield ~ block, npk)
n.block <- length(coef(fit1))
-sum(coef(fit1)[-1]/n.block) + c(0, coef(fit1)[-1])

That is without controlling for N * P + K

The second line (N)

fit2 <- lm(yield ~ block + N, npk)
n.N <- length(coef(fit2)) - n.block + 1
i.N <- (n.block+1):(n.block+n.N-1)
-sum(coef(fit2)[i.N]/n.N) + c(0, coef(fit2)[i.N])

The third line (P)

fit3 <- lm(yield ~ block + N + P, npk)
n.P <- length(coef(fit3)) - length(coef(fit2)) + 1
i.P <- (length(coef(fit2))+1):(length(coef(fit2)) + n.P - 1)
-sum(coef(fit3)[i.P]/n.P) + c(0, coef(fit3)[i.P])

The fourth line (K)

fit4 <- lm(yield ~ block + N + P + K, npk)
n.K <- length(coef(fit4)) - length(coef(fit3)) + 1
i.K <- (length(coef(fit3)) + 1):((length(coef(fit3)) + n.K - 1))
-sum(coef(fit4)[i.K]/n.K) + c(0, coef(fit4)[i.K])

The fifth line (N:P)

fit5 <- lm(yield ~ block + N + P + K + N:P, npk) # The full model
n.NP <- c(n.N, n.P)
i.NP <- (length(coef(fit4)) + 1):((length(coef(fit4)) + prod(n.NP) - n.N - n.P + 1))
-sum(coef(fit5)[i.NP]/prod(n.NP)) + c(coef(fit5)[i.NP], 0)/n.NP # Row 1
-sum(coef(fit5)[i.NP]/prod(n.NP)) + c(0, coef(fit5)[i.NP])/n.NP # Row 2

With centered inputs

This is probably easier

library(arm) # standardize
library(magrittr) # %>% (not really necessary)
fit0 <- lm(yield ~ block + N * P + K, npk)
fit0 %>% aov %>% model.tables("effects")
fit0z <- fit0 %>% standardize

The first line (block)

n.block <- fit0$model$block %>% unique %>% length
i.block <- 2:n.block
-sum(coef(fit0z)[i.block]/n.block) + c(0, coef(fit0z)[i.block])

The second line (N)

n.N <- fit0$model$N %>% unique %>% length
i.N <- n.block + (1:n.N-1))
-sum(coef(fit0z)[i.N]/n.N) + c(0, coef(fit0z)[i.N])

The third line (P)

n.P <- fit0$model$P %>% unique %>% length
i.P <- max(i.N) + (1:(n.P-1))
-sum(coef(fit0z)[i.P]/n.P) + c(0, coef(fit0z)[i.P])

The fourth line (K)

n.K <- fit0$model$K %>% unique %>% length
i.K <- max(i.P) + (1:(n.K-1))
-sum(coef(fit0z)[i.K]/n.K) + c(0, coef(fit0z)[i.K])

The fifth line (N:P)

n.NP <- c(n.N, n.P)
i.NP <- max(i.K) + (1:(prod(n.NP) - n.N - n.P + 1))
-sum(coef(fit0z)[i.NP]/prod(n.NP)) + c(coef(fit0z)[i.NP], 0)/n.NP # Row 1
-sum(coef(fit0z)[i.NP]/prod(n.NP)) + c(0, coef(fit0z)[i.NP])/n.NP # Row 2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment