@romanegloo
Created January 11, 2019 19:43
STA580 final review

STA580 final summary

ANOVA (analysis of variance)

  • testing whether the means of two or more populations are equal
  • must have a continuous response and at least one categorical factor

one-way ANOVA

  • one fixed factor

model

$$y_{ij}\ (\text{observation}) = \mu\ (\text{grand mean}) + \tau_i\ (\text{treatment effect}) + \epsilon_{ij}\ (\text{residual})$$

statistics

| | degrees of freedom | sum of squares | mean square | F-statistic | p-value |
|---|---|---|---|---|---|
| group | df_g = k - 1 | ssb (between sample means) | msb = ssb / df_g | f = msb / msw | f follows the F_{df_g, df_r} distribution |
| residuals | df_r = k (n - 1) | ssw (within each sample) | msw = ssw / df_r | | |
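
The table entries can be computed by hand. The following is a minimal sketch, assuming a balanced design with k groups of n observations each (which is what df_r = k (n - 1) implies). The function name `one_way_anova` and the data are made up for illustration. The sketch stops at the F-statistic because the Python standard library has no F distribution; the p-value would be the upper-tail probability of f under F_{df_g, df_r}.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the one-way ANOVA table entries for k groups of equal size n."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes the same number of observations per group")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # ssb: variation of the group means around the grand mean
    ssb = sum(n * (m - grand_mean) ** 2 for m in group_means)
    # ssw: variation of the observations around their own group mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_g = k - 1
    df_r = k * (n - 1)
    msb = ssb / df_g
    msw = ssw / df_r
    return {"df_g": df_g, "df_r": df_r, "ssb": ssb, "ssw": ssw,
            "msb": msb, "msw": msw, "f": msb / msw}


# Made-up data: three groups with three observations each
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
table = one_way_anova(groups)
```

A large f means the group means vary more than the within-group noise would explain, which is evidence against equal means.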

assumptions

  • equal variance
  • normal distribution

hypothesis test

  • examine the presence of the treatment effect
  • if the p-value is less than the significance level, reject the null hypothesis of equal means and conclude that at least one of the means is different
  • a follow-up Tukey test identifies which means differ

coefficients

Tukey’s pair-wise comparison

  • a pair of means differs significantly if the confidence interval for their difference does not contain zero

two-way ANOVA

  • examines the effect of two factors on a response; the design is balanced when every combination of factor levels has the same number of observations

model

$$y_{ijk}\ (\text{observation}) = \mu\ (\text{grand mean}) + \alpha_i\ (\text{treatment A effect}) + \beta_j\ (\text{treatment B effect}) + \gamma_{ij}\ (\text{interaction effect}) + \epsilon_{ijk}\ (\text{residual})$$
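
To make the terms of this model concrete, here is a minimal sketch that estimates them from cell means. It assumes a balanced design and the usual sum-to-zero constraints on the effects. The function name `two_way_effects`, the nested-list layout `cells[i][j]`, and the data are made up for illustration.

```python
from statistics import mean


def two_way_effects(cells):
    """Estimate mu, alpha_i, beta_j and gamma_ij from a balanced two-way layout.

    cells[i][j] holds the replicates for level i of factor A and level j of factor B.
    """
    a, b = len(cells), len(cells[0])
    cell_means = [[mean(cells[i][j]) for j in range(b)] for i in range(a)]

    # In a balanced design the grand mean equals the mean of the cell means
    mu = mean(m for row in cell_means for m in row)
    # Main effects: row and column means relative to the grand mean
    alpha = [mean(cell_means[i]) - mu for i in range(a)]
    beta = [mean(cell_means[i][j] for i in range(a)) - mu for j in range(b)]
    # Interaction: what is left of each cell mean after removing the main effects
    gamma = [[cell_means[i][j] - mu - alpha[i] - beta[j] for j in range(b)]
             for i in range(a)]
    return mu, alpha, beta, gamma


# Made-up data: 2 levels of A, 2 levels of B, 2 replicates per cell
cells = [
    [[1.0, 3.0], [4.0, 6.0]],
    [[5.0, 7.0], [12.0, 14.0]],
]
mu, alpha, beta, gamma = two_way_effects(cells)
```

If all the gammas were zero, each cell mean would be exactly mu + alpha_i + beta_j, meaning the two factors act additively.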

hypothesis test

  • test in reverse order of the model terms (interaction first, then main effects):
    1. examine the presence of the interaction effect
      • H_0: all the gammas are zero
    2. examine the presence of the main effects (alpha and beta)
      • H_0: all the betas are zero
      • H_0: all the alphas are zero
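
As a sketch of how these hypotheses are tested in a balanced fixed-effects design, each null hypothesis gets its own F ratio against the residual mean square. The symbols a and b (numbers of levels of factors A and B), n (replicates per cell) and the MS names are notation assumed here, not taken from the notes. Under the corresponding H_0:

$$
F_{AB} = \frac{MS_{AB}}{MS_E} \sim F_{(a-1)(b-1),\ ab(n-1)}, \qquad
F_{A} = \frac{MS_{A}}{MS_E} \sim F_{a-1,\ ab(n-1)}, \qquad
F_{B} = \frac{MS_{B}}{MS_E} \sim F_{b-1,\ ab(n-1)}
$$

Each mean square is the corresponding sum of squares divided by its degrees of freedom, in the same way as in the one-way table.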

Kruskal-Wallis test and Dunn test

Linear Regression

simple linear model:

$$y_i = b_0\ (\text{intercept}) + b_1\ (\text{slope}) \cdot x_i + \epsilon_i\ (\text{random error})$$
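
The intercept and slope of the simple linear model are usually estimated by ordinary least squares. The following is a minimal sketch of the closed-form estimates; the function name `fit_simple_linear` and the data are made up for illustration.

```python
from statistics import mean


def fit_simple_linear(x, y):
    """Return the least-squares estimates (b0, b1) for y = b0 + b1 * x + error."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # intercept: the fitted line passes through (x_bar, y_bar)
    return b0, b1


# Made-up data
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.2, 8.8]
b0, b1 = fit_simple_linear(x, y)
```

The differences between each y_i and the fitted value b0 + b1 * x_i are the residuals, which serve as estimates of the random errors.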

assumptions

  • validity: the data should be appropriate for the research question
  • additivity and linearity: the response is a linear, additive function of the predictors (check the residuals-fitted plot)
  • independence of errors: each error is independent of every other error
  • equal variance of errors: if the variances are not equal, use weighted least squares (check the residuals-fitted and scale-location plots)
  • normality of errors: the errors are normally distributed (check the QQ plot; use the residuals-leverage plot to examine outliers)

diagnostic plots
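
The plots named in the assumptions (residuals-fitted, scale-location, QQ, residuals-leverage) are all drawn from a handful of per-observation quantities. The following is a minimal sketch that computes them for a simple linear fit, without any plotting. The function name `residual_diagnostics` and the data are made up for illustration, and the leverage formula used is the one for a single predictor.

```python
from math import sqrt
from statistics import mean


def residual_diagnostics(x, y):
    """Return fitted values, residuals, leverages and standardized residuals."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar

    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    # Leverage: how far each x_i lies from the centre of the x values
    leverage = [1 / n + (xi - x_bar) ** 2 / s_xx for xi in x]
    # Residual standard error with n - 2 degrees of freedom
    sigma = sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    # Standardized residuals put every residual on a common scale
    standardized = [e / (sigma * sqrt(1 - h)) for e, h in zip(residuals, leverage)]
    return fitted, residuals, leverage, standardized


# Made-up data with one x value far from the others
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [2.0, 4.1, 5.9, 8.2, 19.5]
fitted, residuals, leverage, standardized = residual_diagnostics(x, y)
```

Residuals against fitted values give the residuals-fitted plot, the square root of the absolute standardized residuals against fitted values gives the scale-location plot, sorted standardized residuals against normal quantiles give the QQ plot, and standardized residuals against leverage give the residuals-leverage plot.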
