Lecture 1 - Numerical and graphical summaries of data
- graphical measures and plots
- measure of location
- mean (x_bar)
- median
- measure of dispersion
- how to draw a boxplot
Lecture 2 - Probability, conditional probability, Bayes’ theorem
- frequentist vs Bayesian
- Frequentist inference is a type of statistical inference that draws conclusions from sample data by emphasizing the frequency or proportion of the data. (ex. coin flips)
- Bayesian inference is a method of statistical inference in which Bayes’ theorem is used to update the probability for a hypothesis as more evidence or information becomes available.
Bayes’ Rule: P(H | E) = P(E | H) P(H) / P(E)
- H is any hypothesis
- E is the evidence
- P(H | E) is what Bayes’ rule computes. Posterior Probability
- P(E | H) is the probability of observing E given H. Given the hypothesis H, this is the likelihood: the compatibility of the evidence with the hypothesis. (Ex. if H is that the data come from a normal distribution, this probability indicates how well the observed data fit that distribution.)
- P(H) is the prior probability. The estimated probability of the hypothesis without seeing any evidence
- P(E) is the marginal likelihood. This is the same for all possible hypotheses.
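The pieces above can be put together in a short worked example. The numbers below (a disease-testing scenario) are illustrative, not from the lecture:

```python
# Worked Bayes' rule example (illustrative numbers, not from the lecture):
# H = "patient has the disease", E = "test is positive".
prior = 0.01          # P(H): 1% of the population has the disease
sensitivity = 0.95    # P(E | H): test is positive given disease
false_pos = 0.05      # P(E | not H): test is positive given no disease

# Marginal likelihood P(E) by the law of total probability
marginal = sensitivity * prior + false_pos * (1 - prior)

# Posterior P(H | E) = P(E | H) * P(H) / P(E)
posterior = sensitivity * prior / marginal
print(f"P(H | E) = {posterior:.3f}")  # around 0.161 despite the 95% sensitivity
```

Note how the small prior keeps the posterior low even with a sensitive test: this is exactly the "update the prior with evidence" step of Bayesian inference.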
Probability
- joint probability
- P(A, B, C) = P(A | B, C) P(B | C) P(C)
- naive Bayes: naively assume the features are conditionally independent given the class, so the joint distribution factorizes into a product of simple per-feature conditionals.
- joint / marginal distribution
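A tiny sketch of joint vs. marginal distributions and the chain rule, using a made-up joint distribution P(A, B) over two binary variables (the probabilities are arbitrary illustrative values):

```python
# Made-up joint distribution P(A, B) over binary A and B.
joint = {
    (0, 0): 0.30, (0, 1): 0.20,
    (1, 0): 0.10, (1, 1): 0.40,
}

# Marginals: sum the joint over the other variable.
p_a = {a: sum(p for (x, _), p in joint.items() if x == a) for a in (0, 1)}
p_b = {b: sum(p for (_, y), p in joint.items() if y == b) for b in (0, 1)}

# Chain rule: P(A, B) = P(A | B) * P(B), where P(A | B) = P(A, B) / P(B).
for (a, b), p in joint.items():
    p_a_given_b = joint[(a, b)] / p_b[b]
    assert abs(p - p_a_given_b * p_b[b]) < 1e-12

print(p_a, p_b)
```

The three-variable version P(A, B, C) = P(A | B, C) P(B | C) P(C) works the same way, just with one more conditioning step.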
Lecture 3 - random variables, central limit theorem, population and samples
- random variables, samples space
- probability mass function P(X = x)
- cumulative distribution function P(X <= x)
- normal distribution
- Z-score
- Central Limit Theorem: for a large sample size, the distribution of the sample mean is close to a normal distribution, regardless of the population’s distribution
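The CLT can be sketched with a quick simulation: means of samples drawn from a decidedly non-normal distribution (uniform on [0, 1]) cluster around the population mean with the spread the theorem predicts. The sample size and trial count here are arbitrary:

```python
import random
import statistics

# CLT sketch: means of samples from a (non-normal) uniform distribution
# cluster around the population mean and look increasingly normal.
random.seed(0)
n, trials = 30, 2000
sample_means = [
    statistics.fmean(random.random() for _ in range(n)) for _ in range(trials)
]

# Uniform(0, 1) has mean 0.5 and standard deviation sqrt(1/12) ~= 0.2887;
# the CLT predicts the sample means have sd ~= 0.2887 / sqrt(30) ~= 0.053.
print(statistics.fmean(sample_means), statistics.stdev(sample_means))
```

Plotting `sample_means` as a histogram would show the familiar bell shape even though each individual draw is uniform.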
Lecture 4 - Point and interval estimation for a mean and a proportion
- Confidence level does not imply probability for a single interval
- once an interval is computed, the probability that the true value of the parameter is captured in it is either zero or one; the confidence level describes the long-run proportion of such intervals that capture it. “60% of the time, it works every time”
- confidence level
- t-distribution
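A minimal sketch of a t-based confidence interval for a mean, with made-up data. The critical value t_{0.975, df=9} ≈ 2.262 is hardcoded as an assumption; in practice you would look it up in a t-table or use `scipy.stats.t.ppf`:

```python
import math
import statistics

# t-based 95% confidence interval for a mean (made-up data).
data = [4.2, 3.8, 5.1, 4.6, 4.9, 4.4, 5.0, 4.1, 4.7, 4.3]
n = len(data)
x_bar = statistics.fmean(data)
s = statistics.stdev(data)   # sample standard deviation
t_crit = 2.262               # t_{0.975, df = n - 1 = 9}, from a t-table

margin = t_crit * s / math.sqrt(n)
print(f"95% CI: ({x_bar - margin:.3f}, {x_bar + margin:.3f})")
```

The t-distribution (rather than the normal) is what accounts for the extra uncertainty from estimating the population standard deviation with `s`.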
Lecture 5 - Introduction to hypothesis testing: test concerning a mean, power and sample size
Hypothesis Testing
- state the null (H_0) and alternative (H_a) hypotheses
- calculate Z-score and P-value
- make a decision
- if the P-value is less than the significance level alpha, reject H_0
- if the P-value is greater than or equal to the significance level alpha, fail to reject H_0
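The steps above can be sketched as a one-sample z-test. The numbers (H_0: mu = 100, known sigma = 10, observed mean 103 from n = 40) are made up for illustration:

```python
import math
from statistics import NormalDist

# One-sample, two-sided z-test sketch: H_0: mu = 100 vs H_a: mu != 100,
# with a known population sigma (which is what makes a z-test appropriate).
mu_0, sigma = 100, 10
x_bar, n = 103, 40
alpha = 0.05

z = (x_bar - mu_0) / (sigma / math.sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided P-value

if p_value < alpha:
    print(f"z = {z:.3f}, p = {p_value:.4f}: reject H_0")
else:
    print(f"z = {z:.3f}, p = {p_value:.4f}: fail to reject H_0")
```

Here p is just above 0.05, so the decision is "fail to reject H_0" at the 5% significance level.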
Error
- alpha (significance level) is the probability of rejecting a true null hypothesis = Pr(Type I error), false positive
- beta is the probability of failing to reject a false null hypothesis = Pr(Type II error), false negative
- power = 1 - beta
Lecture 6 - Two means (paired / unpaired)
- unpaired (two-sample) test statistic: t = (x_bar_1 - x_bar_2) / sqrt(s_1^2/n_1 + s_2^2/n_2)
- paired: take the differences d_i = x_i - y_i, then t = d_bar / (s_d / sqrt(n))
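Both test statistics can be computed directly from the formulas. The groups below are made-up data; the unpaired version shown is the Welch (unequal-variance) form:

```python
import math
import statistics

# Unpaired (Welch) two-sample t: t = (m1 - m2) / sqrt(v1/n1 + v2/n2)
group1 = [12.1, 11.5, 13.0, 12.4, 11.8, 12.7]
group2 = [10.9, 11.2, 10.4, 11.0, 10.7, 11.4]
m1, m2 = statistics.fmean(group1), statistics.fmean(group2)
v1, v2 = statistics.variance(group1), statistics.variance(group2)
t_unpaired = (m1 - m2) / math.sqrt(v1 / len(group1) + v2 / len(group2))

# Paired t: reduce to one sample of differences d_i, then t = d_bar / (s_d / sqrt(n))
before = [72, 75, 70, 78, 74]
after = [70, 71, 69, 74, 73]
diffs = [a - b for a, b in zip(after, before)]
t_paired = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

print(t_unpaired, t_paired)
```

The paired design is the more powerful one when measurements come in natural pairs (e.g. before/after on the same subject), because differencing removes the between-subject variation.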