Created
October 16, 2019 19:18
-
-
Save tomsing1/0e6c7f764524a04d81b5952ae364f776 to your computer and use it in GitHub Desktop.
ANOVA in R
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--- | |
title: "Anova in R" | |
author: "Thomas Sandmann" | |
date: "10/15/2019" | |
output: html_document | |
editor_options: | |
chunk_output_type: console | |
--- | |
```{r setup, include=FALSE} | |
knitr::opts_chunk$set(echo = TRUE) | |
``` | |
```{r echo=FALSE} | |
packages_used <- c("emmeans") | |
for (package in packages_used) { | |
if (!requireNamespace(package, quietly = TRUE)) { | |
install.packages(package) | |
} | |
} | |
``` | |
```{r} | |
suppressPackageStartupMessages(library(emmeans)) | |
``` | |
# Introduction | |
The R programming language offers many different ways to fit common | |
models and perform statisticalt testing. | |
This document uses updated examples from the | |
[R Companion for the Handbook of Biological Statistics](https://rcompanion.org/rcompanion/h_01.html). | |
Because the | |
[lsmeans](https://cran.r-project.org/web/packages/lsmeans/index.html) | |
R package is no longer maintained and has been replaced with the | |
[emmeans](https://cran.r-project.org/web/packages/emmeans/index.html) | |
R package, the latter will be used throughout. | |
## Example dataset | |
In the following example, the effects of drug treatments `D1` and `D2` were | |
assayed under two different conditions `C1` and `C2`. All four combinations | |
were tested and were recorded in the `treatment` column of a data table: | |
```{r} | |
dataset <- data.frame( | |
drug = rep(c("D1", "D1", "D2", "D2"), each = 3), | |
condition = rep(c("C1", "C2", "C1", "C2"), each = 3), | |
treatment = rep(c("D1:C1", "D1:C2", "D2:C1", "D2:C2"), each = 3), | |
response = c(1, 1.2, 1.3, 2.1, 2.2, 2.3, 1.4, 1.6, 1.7, | |
2.5, 2.6, 2.8), | |
batch = factor(rep(c(1, 2, 3 ), 4)) | |
) | |
head(dataset) | |
``` | |
In this dataset, we have separate `drug` and `treatment` columns. But in | |
addition, we have also combined these two factors into a grouping variable | |
called `treatment`. This will simplify how we can specify comparisons between | |
the groups (see below). | |
We can explicitly specify the order of the `treatment` covariate. (Otherwise R | |
will use alphabetical order.) | |
```{r} | |
dataset$treatment <- factor( | |
x = dataset$treatment, | |
levels = c("D1:C1", "D1:C2", "D2:C1", "D2:C2")) | |
``` | |
```{r} | |
boxplot(response ~ drug + condition, data = dataset, | |
ylab = "Response", xlab = "Treatment") | |
``` | |
A visual inspection of the dataset shows that there seem to be differences | |
between all four treatments (= drug x condition combinations.) | |
## ANOVA | |
In addition to `drug`, `condition` and the combined `treatment` column, | |
the dataset also contains the `batch` column. It is not of biological interest, | |
but corresponds to a nuisance variable. It is important to *adjust for* its | |
influence,, but it will not be part of the comparisons we want to examine | |
(see below). | |
R's `lm()` function can fit many different types of linear models, including | |
those with categorical predictor variables (like an ANOVA). | |
The following command fits a linear model taking into account | |
- the effect of the `drug` treatment (`D2` vs `D1`) | |
- the effect of the `condition` (`C2` vs `C1`) | |
- the statistical interaction between `drug` and `condition`, (denoted with the | |
colon) | |
- the effect of the batches | |
Unless otherwise specified, R will consider the level of each categorical | |
variable in alphabetical order, e.g. it will consider `D1` the baseline for the | |
`drug` covariate, `C1` the baseline for the `condition` covariate and `1` | |
the baseline for the `batch`. | |
```{r} | |
fit1 = lm(response ~ drug + condition + drug:condition + batch, data = dataset) | |
``` | |
The `fit1` object contains the mean and standard errors for each of the | |
treatment groups, but it does not perform statistical testing, e.g. it does | |
*not* return p-values. | |
```{r} | |
fit1 | |
``` | |
To perform a statistical test, we can pass the `fit1` object to the `anova()` | |
function: | |
```{r} | |
anova(fit1) | |
``` | |
The output of the `anova()` function shows that there are indeed differences | |
between the experimental groups: | |
- there is a significant `drug` effect | |
- there is a significant effect of the `condition` | |
- there is a significant `batch` effect | |
- but there is no evidence for a drug:condition interaction, e.g. the effect of | |
the drug treatment (`D1` vs `D2`) is not significantly differenct between | |
conditions (`C1` and `C2`). | |
This gives us a high level overview, but which of the groups are actually | |
significantly different *from each other*? | |
### Contrasts | |
#### Recoding the groups of interest | |
In addition to the `drug` and `condition` columns in our dataset, we also have | |
a combined `treatment` variable, which defines the four drug x condition | |
combinations. | |
Because our biological questions are about differences between these groups, | |
we fit the linear model again, this time using the `treatment` column. | |
```{r} | |
fit2 = lm(response ~ treatment + batch, data = dataset) | |
fit2 | |
``` | |
As before, we can use the `anova()` function to ask whether there is evidence | |
that one or more of the four `treatment` groups are different from each other: | |
```{r} | |
anova(fit2) | |
``` | |
To extract statistics about the *individual treatment groups* (and the | |
differences between them) from the `fit2` object, we make use functions from | |
the `emmeans` R package. | |
```{r} | |
library(emmeans) | |
``` | |
For example, the `emmeans` function can return the estimated `mean`, the | |
standard error (`SE`) and the confidence intervals over all `treatment` | |
conditions from the linear model: | |
```{r} | |
group_means = emmeans(fit2, "treatment") | |
``` | |
**Note:** The model we fit also included the `batch` covariate, but because | |
we only want to *adjust* for it, e.g. it is *not* part of the scientific | |
question we want to ask, it is not specified in the call to `emmeans()`. (The | |
output of the `emmeans()` call reminds us that the results are averaged over the | |
levels of the batch covariate). | |
But what about the *differences* between the groups? In statistics, they are | |
usually referred to as `contrasts`. Depending on your questions of interest, | |
you might be interested in | |
- pairwise differences between all conditions | |
- the differences between each condition and a specific control group | |
- a custom set of differences | |
All of the above can be extracted from the `group_means` object we created | |
above using the `contrast` function. | |
#### Pairs | |
To get statistics on *all* possible pairs, choose the `pairwise` method: | |
```{r} | |
contrast(group_means, method = "pairwise", adjust = "tukey") | |
``` | |
The `estimate` column shows the difference between the estimated means of | |
each pair, and the `SE` provides the standard error of this estimate. As | |
expected, the statistical confirms that all pairwise comparisons are | |
statistically significant. | |
When we conduct multiple tests, we need to adjust for the number of | |
analyses. By default, the `contrast()` function uses Tukey's method for p-value | |
adjustment.[^1] | |
**Note:** The `batch` covariate does not feature in the comparisons, because | |
we did not specify it when we created the `group_means` object with the | |
`emmeans()` function earlier. | |
#### Treatment vs control | |
In the example dataset, we have an explicit `Control` treatment. If you are | |
interested in testing for differences between each of the other groups and this | |
reference, you can use the `trt.vs.ctrl` contrast. | |
```{r} | |
contrast(group_means, method = "trt.vs.ctrl") | |
``` | |
The default multiplicity adjustment for `treatment vs control` analysis | |
method is `dunnettx`, a close approximation to the Dunnett adjustment. | |
**But how does R which of the conditions is the reference?** | |
When we defined the dataset, we carefully specified the `treatment` column as | |
a factor and defined its `levels`. | |
```{r} | |
levels(dataset$treatment) | |
``` | |
By default, *first* level is used as the reference. If you want to compare to | |
a different reference group, let's say `D2:C1`, you can specify it in the | |
`contrast()` call: | |
```{r} | |
contrast(group_means, method = "trt.vs.ctrl", ref = "D2:C1") | |
``` | |
#### Custom contrasts | |
Usually, experiments are designed to answer specific questions. To define your | |
own comparisons, you can specify the comparisons manually. | |
Is there a significant difference conditions `D1:C2` and `D2:C2`? | |
Let's plot the estimated means and their confidence intervals. | |
```{r} | |
plot(group_means, CIs = TRUE) | |
``` | |
This plot shows each treatment group on the y-axis and its estimated mean on the | |
x-axis. Groups `D1:C2` and `D2:C2` are spaced relatively far apart, but is that | |
difference significant? | |
**Note:** You can see more information on plotting the output of the `emmeans()` | |
function by consulting the help page: | |
```{r eval=FALSE} | |
?emmeans:::plot.emmGrid | |
``` | |
Another way of asking whether the `D1:C2` and `D2:C2` conditions are different | |
from each other is to examine whether the absolute *difference* between them | |
is larger than zero. | |
Let's take another look at the estimated means we stored in the | |
`group_means` object: | |
```{r} | |
group_means | |
``` | |
There are five rows, one for each condition. To compare `D1:C2` and `D2:C2`, | |
we need to calculate the difference between the second (`D1:C2`) and the fourth | |
row (`D2:C2`). | |
We can specify the comparison of interest using a *vector* of `0`, `1` and `-1`. | |
For example, the following vector | |
```{r} | |
setNames( | |
c(0, -1, 0, 1), | |
as.data.frame(group_means)$treatment | |
) | |
``` | |
- ignores the first and third rows, because they are set to `0`. | |
- pulls out the second and fourth rows, which we want to compare, because they | |
are set to `1` or `-1`. | |
- the sign (`1` or `-1`) defines how difference should be calculated; here, we | |
subtract the second row (`D1:C2`) from the fourth row (`D2:C2`). | |
We can use this vectors to communicate the comparison of interest to the | |
`contrast()` function which comparison: | |
```{r} | |
contrast(group_means, method = list( | |
"D2:C2 - D1:C2" = c(0, -1, 0, 1) | |
)) | |
``` | |
**Note:** Because we only specified a single comparison, no multiple testing | |
correction is applied. | |
To test multiple custom comparisons, we can include them all in the same list: | |
```{r} | |
contrast(group_means, method = list( | |
"D2:C2 - D1:C2" = c(0, -1, 0, 1), | |
"D2:C1 - D1:C1" = c(-1, 0, 1, 0) | |
), adjust = "none") | |
``` | |
#### Interactions | |
The same vector notation can also be used to test for interactions, e.g. | |
*differences between comparisons*. | |
For example, we might want to ask whether the `drug` treatment (`D1` vs `D2`) | |
has different effects under the two different `conditions` (`C1` and `C2`). | |
First, we specify two comparisons to calculate the drug effects, separately | |
for each condition. Then, we compare these differences by subtracting them | |
from each other: | |
```{r} | |
comparisons <- list( | |
"D2:C1 - D1:C1" = c(-1, 0, 1, 0), # drug 2 vs drug 1 in condition 1 | |
"D2:C2 - D1:C2" = c(0, -1, 0, 1), # drug 2 vs drug 1 in condition 2 | |
"(D2:C2 - D1:C2)-(D2:C2 - D1:C2)" = c(-1, 0, 1, 0) - c(0, -1, 0, 1) | |
) | |
``` | |
Each of these comparisons corresponds to a linear combination of the estimated | |
effects. Let's add the names of the treatment groups to clarify: | |
```{r} | |
lapply(comparisons, function(x) { | |
setNames(x, as.data.frame(group_means)$treatment) | |
}) | |
``` | |
We can now pass the list of comparisons to the `contrast()` function. | |
**Note:** Here, we choose not to adjust for multiple testing to compare the | |
results to those obtained with the `anova()` function earlier. Especially when | |
you define *many* different contrasts, you should specify `adjust = "sidak"` | |
instead. | |
```{r} | |
contrast(group_means, method = comparisons, adjust = "none") | |
``` | |
As expected, the drug has significant effects under both conditions, with an | |
estimated shift in means of `0.4` in condition 1 and `0.4333` in condition 2, | |
e.g. the drug has slightly stronger effect in condition 2 (+ `0.0333`) - but | |
that difference is not statistically significant. In other words, there is | |
no significant evidence that the drug has *different* effects under the two | |
conditions. (The interaction p-value matches that obatined by the | |
`anova(fit1)` call we started off with, see above.) | |
#### Additional information | |
For more information on available types of pre-defined contrasts and their | |
associated methods of multiple testing adjustment, please consult the manual: | |
```{r eval=FALSE} | |
?emmeans::`contrast-methods` | |
``` | |
*** | |
[^1]: A number of multiple testing methods have been defined for different | |
use cases. As a rule of thumb, use the `tukey` method when comparing *every* | |
mean with every other mean, use `dunnettx` to compare *every* mean with a | |
control mean. Finally, use `bonferroni` or `sidak` when you manually defined a | |
subset of means to compare. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment