Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?
This report analyzes the effects of Vitamin C on tooth growth in guinea pigs, by dosage and dosing method. [R] [statistics] [hypothesis-testing]
title author date output
Exploring the Effect of Vitamin C on Tooth Growth in Guinea Pigs
AJ Heller (; [](
May 27, 2016
toc toc_float theme
knitr::opts_chunk$set(echo = TRUE, fig.width = 8, fig.height = 4)


This report analyzes the effects of Vitamin C on tooth growth in guinea pigs, by dosage and dosing method. After exploring the data, hypotheses are formed around how supplement size and dosing treatments affect tooth growth, and the hypotheses are then tested.


This report does not require many libraries. I prefer ggplot2, though lattice and base plotting are fine alternatives. grid gives us ggplot layout control, and datasets provide the ToothGrowth data themselves. dplyr provides handly tools to manipulate data. And finally, the random number generator is seeded so these results can more easily be reproduced.

pacman::p_load(ggplot2, grid, datasets, dplyr)


Exploring the data

Let's start with some summary data.


With only 60 observations split between 2 supplement methods and 3 dosing levels, we'll have to pay close attention to the significance and power of our tests and confidence intervals.

To get a feel for how tooth growth is affected by either factor, we'll look at a violin plot to see if anything jumps out.

ggplot(ToothGrowth, aes(factor(dose), len)) + 
    geom_violin(aes(fill=supp)) +
    ggtitle("Violin Plot of Tooth Length by Dose Size and Supplement Method")


Regardless of supplementation method, there is a clearly observable trend of increased tooth growth with increasing Vitamin C dosage. Let's test to see if we have enough data to detect a significant difference between the 0.5 and 2 mg/day dosing levels.

There is no clear difference between supplementation methods, regardless of dose size. There is possibly a slight trend in orange juice being more effective than Vitamin C, but there likely isn't enough data to prove it. Let's pretend we seek to show OJ to be more effective than VC.

A Confidence Interval for Dosing Levels

First let's examine how much data we have for each dosing level.

ToothGrowth %>% group_by(dose) %>% summarise(count=n())

With only 20 observations each, and without knowing the population standard deviation, we can't use a Z confidence interval even if the data looks approximately normal. We can, however, find a T confidence interval by running a two-sample T Test.

s0.5 <- ToothGrowth[ToothGrowth$dose == 0.5,]
s2.0 <- ToothGrowth[ToothGrowth$dose == 2,]
tt <- t.test(s2.0$len, s0.5$len)

At the 95% confidence level (meaning 19 out of 20 times we run this experiment), we conclude these data were not sampled from the same population! In other words, we're pretty confident that greater doeses of Vitamin C cause teeth to grow more. In fact, 0 is way outside the confidence interval we get at the 95% level, so we're more than pretty confident.

At what confidence level could we run this test and still conclude these data are not from the same population? To answer that, we look at the p-value.

sprintf("%0.13f%%", tt$p.value * 100)

Wow! This indicates that given these data, we can expect only one in twenty trillion times of runing this experiment that we'd fail to reject the null hypothesis -- namely that we'd fail to conclude these data are from different populations.

Let's quickly examine the power of this test, assuming the sample standard deviation and difference in means are representative of the population. Again, we controlled for Type I error ($\alpha = 0.05$, the probability of rejecting the null hypothesis given that it's true), and power is the probability of rejecting the null hypothesis given that it's false.

pooledVar <- (var(s2.0$len) + var(s0.5$len))/2
power.t.test(n = 20, 
             delta = mean(s2.0$len-s0.5$len), 
             sig.level = 0.05, 
             sd = sqrt(pooledVar))$power

A power of 1 indicates 100% probability of rejecting a false $H_0$! But in reality, it's a number very close to but less than 1, rounded up due to the limitations of floating point math.

A Confidence Interval for Supplementation Method

Much as before, we separate the data by supplementation method and t-test at the 95% level to determine if a difference of 0 is unlikely (indicating separate populations).

oj <- filter(ToothGrowth, supp == "OJ")
vc <- filter(ToothGrowth, supp == "VC")
t.test(oj$len, vc$len, alternative = "greater", var.equal = FALSE, paired = FALSE)

As it turns out, my pretend hypothesis is right, and my intuition was wrong! My interpretation of the data, split by dose size, was that there wouldn't be enough data to determine a difference, but indeed there is enough data and difference. The 95% confidence interval for the true difference in means does not include 0, which leads us to reject $H_0$: that the VC and OJ data are from the same population (no difference in effect), with data that support the alternative hypothesis $H_a$: orange juice causes greater tooth growth than vitamin C alone (as ascorbic acid).

To get a better feel for this difference, let's look at a box plot of just the difference in supplementation methods.

ggplot(ToothGrowth, aes(supp, len)) + 
    geom_boxplot() + 
    geom_jitter(width = 0.2) + 
    ggtitle("Boxplot of Tooth Length by Supplementation Method")

You can see a potential difference in variance between the two population, hence the var.equal = FALSE in the t-test code above. It is odd, however, that the upper limit of the confidence interval is $\infty$, so let's check their math by hand and calculate Welch's one-sided t-test with unequal variances.

df <- (var(oj$len)/30 + (var(vc$len)/30))^2/((((var(oj$len)/30)^2)+(var(vc$len)/30)^2)/29)
mean(oj$len) - mean(vc$len) + c(-1,1)*qt(0.95, df)*sqrt((var(oj$len)+var(vc$len))/30)

We get the exact same lower bound, and a firm number for the upper bound, using the degrees of freedom (calculated above) that we need to approximate a T-distribution: r sprintf("%0.2f", df). Having a firm upper limit does not matter much since 0 is still not within the interval, we reject the null just the same, but at least we know the math checks out.


To use the t-test, we had to make a few assumptions.

  1. each observation is independent of every other
  2. the sampled lengths follow an approximately normal distribution

T-tests are fairly robust to violations of normality, but the data should look approximately normal. Let's check that out now.

tgd <- filter(ToothGrowth, dose %in% c(0.5, 2))
p1 <- ggplot(tgd, aes(len)) +
    facet_grid(. ~ dose) +
    geom_histogram(bins = 17) + 
    ggtitle("Histogram of Tooth Length by Dose")

p2 <- ggplot(ToothGrowth, aes(len)) + 
    facet_grid(. ~ supp) + 
    geom_histogram(bins = 17) + 
    ggtitle("Histogram of Tooth Length by Supplementat")

pushViewport(viewport(layout = grid.layout(1, 2)))
print(p1, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
print(p2, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))

These look approximately normal to me, so I'm comfortable with the assumptions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment