Lecture 1 - Numerical and graphical summaries of data
- graphical measures and plots
- measure of location
- mean (x_bar)
- median
- measure of dispersion
- how to draw a boxplot
Lecture 2 - Probability, conditional probability, Bayes’ theorem
- frequentist vs Bayesian
- Frequentist inference is a type of statistical inference that draws conclusions from sample data by emphasizing the frequency or proportion of the data. (ex. coin flips)
- Bayesian inference is a method of statistical inference in which Bayes’ theorem is used to update the probability for a hypothesis as more evidence or information becomes available.
Bayes’ Rule: P(H | E) = P(E | H) P(H) / P(E)
- H is any hypothesis
- E is the evidence
- P(H | E) is what Bayes’ rule computes. Posterior Probability
- P(E | H) is the probability of observing E given H. Given the hypothesis H, this is the likelihood: the compatibility of the evidence with the hypothesis. (Ex. if H is that the data come from a normal distribution, this probability indicates how well the observed data fit that distribution.)
- P(H) is the prior probability. The estimated probability of the hypothesis without seeing any evidence
- P(E) is the marginal likelihood. This is the same for all possible hypotheses.
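The pieces above can be put together in a short worked example. The numbers below (a disease-testing scenario) are illustrative, not from the lecture:

```python
# Worked Bayes' rule example (illustrative numbers, not from the lecture):
# H = "patient has the disease", E = "test is positive".
prior = 0.01          # P(H): 1% of the population has the disease
sensitivity = 0.95    # P(E | H): test is positive given disease
false_pos = 0.05      # P(E | not H): test is positive given no disease

# Marginal likelihood P(E) by the law of total probability
marginal = sensitivity * prior + false_pos * (1 - prior)

# Posterior P(H | E) = P(E | H) * P(H) / P(E)
posterior = sensitivity * prior / marginal
print(f"P(H | E) = {posterior:.3f}")  # around 0.161 despite the 95% sensitivity
```

Note how the small prior keeps the posterior low even with a sensitive test: this is exactly the "update the prior with evidence" step of Bayesian inference.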
Probability
- joint probability
- P(A, B, C) = P(A | B, C) P(B | C) P(C)
- naive Bayes: naively assume the features are conditionally independent given the class, so the joint distribution factorizes into a product of simple per-feature conditionals.
- joint / marginal distribution
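A tiny sketch of joint vs. marginal distributions and the chain rule, using a made-up joint distribution P(A, B) over two binary variables (the probabilities are arbitrary illustrative values):

```python
# Made-up joint distribution P(A, B) over binary A and B.
joint = {
    (0, 0): 0.30, (0, 1): 0.20,
    (1, 0): 0.10, (1, 1): 0.40,
}

# Marginals: sum the joint over the other variable.
p_a = {a: sum(p for (x, _), p in joint.items() if x == a) for a in (0, 1)}
p_b = {b: sum(p for (_, y), p in joint.items() if y == b) for b in (0, 1)}

# Chain rule: P(A, B) = P(A | B) * P(B), where P(A | B) = P(A, B) / P(B).
for (a, b), p in joint.items():
    p_a_given_b = joint[(a, b)] / p_b[b]
    assert abs(p - p_a_given_b * p_b[b]) < 1e-12

print(p_a, p_b)
```

The three-variable version P(A, B, C) = P(A | B, C) P(B | C) P(C) works the same way, just with one more conditioning step.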
Lecture 3 - random variables, central limit theorem, population and samples
- random variables, samples space
- probability mass function P(X = x)
- cumulative distribution function P(X <= x)
- normal distribution
- Z-score
- Central Limit Theorem: for a large sample size, the distribution of the sample mean is close to a normal distribution, regardless of the population’s distribution
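The CLT can be sketched with a quick simulation: means of samples drawn from a decidedly non-normal distribution (uniform on [0, 1]) cluster around the population mean with the spread the theorem predicts. The sample size and trial count here are arbitrary:

```python
import random
import statistics

# CLT sketch: means of samples from a (non-normal) uniform distribution
# cluster around the population mean and look increasingly normal.
random.seed(0)
n, trials = 30, 2000
sample_means = [
    statistics.fmean(random.random() for _ in range(n)) for _ in range(trials)
]

# Uniform(0, 1) has mean 0.5 and standard deviation sqrt(1/12) ~= 0.2887;
# the CLT predicts the sample means have sd ~= 0.2887 / sqrt(30) ~= 0.053.
print(statistics.fmean(sample_means), statistics.stdev(sample_means))
```

Plotting `sample_means` as a histogram would show the familiar bell shape even though each individual draw is uniform.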
Lecture 4 - Point and interval estimation for a mean and a proportion
- Confidence level does not imply probability for a single interval
- once an interval is computed, the probability that the true value of the parameter is captured in it is either zero or one; the confidence level describes the long-run proportion of such intervals that capture it. “60% of the time, it works every time”
- confidence level
- t-distribution
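A minimal sketch of a t-based confidence interval for a mean, with made-up data. The critical value t_{0.975, df=9} ≈ 2.262 is hardcoded as an assumption; in practice you would look it up in a t-table or use `scipy.stats.t.ppf`:

```python
import math
import statistics

# t-based 95% confidence interval for a mean (made-up data).
data = [4.2, 3.8, 5.1, 4.6, 4.9, 4.4, 5.0, 4.1, 4.7, 4.3]
n = len(data)
x_bar = statistics.fmean(data)
s = statistics.stdev(data)   # sample standard deviation
t_crit = 2.262               # t_{0.975, df = n - 1 = 9}, from a t-table

margin = t_crit * s / math.sqrt(n)
print(f"95% CI: ({x_bar - margin:.3f}, {x_bar + margin:.3f})")
```

The t-distribution (rather than the normal) is what accounts for the extra uncertainty from estimating the population standard deviation with `s`.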
Lecture 5 - Introduction to hypothesis testing: test concerning a mean, power and sample size
Hypothesis Testing
- state the null (H_0) and alternative (H_a) hypotheses
- calculate Z-score and P-value
- make a decision
- if the P-value is less than the significance level alpha, reject H_0
- if the P-value is greater than or equal to the significance level alpha, fail to reject H_0
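The steps above can be sketched as a one-sample z-test. The numbers (H_0: mu = 100, known sigma = 10, observed mean 103 from n = 40) are made up for illustration:

```python
import math
from statistics import NormalDist

# One-sample, two-sided z-test sketch: H_0: mu = 100 vs H_a: mu != 100,
# with a known population sigma (which is what makes a z-test appropriate).
mu_0, sigma = 100, 10
x_bar, n = 103, 40
alpha = 0.05

z = (x_bar - mu_0) / (sigma / math.sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided P-value

if p_value < alpha:
    print(f"z = {z:.3f}, p = {p_value:.4f}: reject H_0")
else:
    print(f"z = {z:.3f}, p = {p_value:.4f}: fail to reject H_0")
```

Here p is just above 0.05, so the decision is "fail to reject H_0" at the 5% significance level.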
Error
- alpha (significance level) is the probability of rejecting a true null hypothesis = Pr(Type I error), false positive
- beta is the probability of failing to reject a false null hypothesis = Pr(Type II error), false negative
- power = 1 - beta
Lecture 6 - Two means (paired / unpaired)
- unpaired (two-sample) test statistic: t = (x_bar_1 - x_bar_2) / sqrt(s_1^2/n_1 + s_2^2/n_2)
- paired: take the differences d_i = x_i - y_i, then t = d_bar / (s_d / sqrt(n))
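Both test statistics can be computed directly from the formulas. The groups below are made-up data; the unpaired version shown is the Welch (unequal-variance) form:

```python
import math
import statistics

# Unpaired (Welch) two-sample t: t = (m1 - m2) / sqrt(v1/n1 + v2/n2)
group1 = [12.1, 11.5, 13.0, 12.4, 11.8, 12.7]
group2 = [10.9, 11.2, 10.4, 11.0, 10.7, 11.4]
m1, m2 = statistics.fmean(group1), statistics.fmean(group2)
v1, v2 = statistics.variance(group1), statistics.variance(group2)
t_unpaired = (m1 - m2) / math.sqrt(v1 / len(group1) + v2 / len(group2))

# Paired t: reduce to one sample of differences d_i, then t = d_bar / (s_d / sqrt(n))
before = [72, 75, 70, 78, 74]
after = [70, 71, 69, 74, 73]
diffs = [a - b for a, b in zip(after, before)]
t_paired = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

print(t_unpaired, t_paired)
```

The paired design is the more powerful one when measurements come in natural pairs (e.g. before/after on the same subject), because differencing removes the between-subject variation.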