@graebnerc
Created June 21, 2024 13:02
# Notes from the lecture on simple linear regression
library(tibble)
library(ggplot2)
library(moderndive)
# 1. Implement linear regression-----------------
# Make a shortcut to the data:
beer_data <- as_tibble(DataScienceExercises::beer)
head(beer_data)
# Conduct the linear regression:
beer_lm <- lm(
  formula = consumption ~ income,
  data = beer_data
)
beer_lm
# To get more information about the regression:
summary(beer_lm)
moderndive::get_regression_table(beer_lm)
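# A minimal sketch of what lm() computes in the simple (one-regressor) case,
# assuming ordinary least squares: slope = cov(x, y) / var(x) and
# intercept = mean(y) - slope * mean(x). The helper name `ols_by_hand` is
# hypothetical, and the built-in `cars` data is only a stand-in so that the
# sketch runs without the beer data.
ols_by_hand <- function(x, y) {
  slope <- cov(x, y) / var(x)
  intercept <- mean(y) - slope * mean(x)
  c(intercept = intercept, slope = slope)
}
manual_coefs <- ols_by_hand(x = cars$speed, y = cars$dist)
lm_coefs <- coef(lm(dist ~ speed, data = cars))
# Both approaches should agree up to floating-point error:
all.equal(unname(manual_coefs), unname(lm_coefs))
# For the beer data: ols_by_hand(beer_data$income, beer_data$consumption)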
# Digression: the estimated effect of income might change drastically once
# you add further regressors, i.e. in a multiple linear regression:
summary(lm(
  formula = consumption ~ income + price,
  data = beer_data
))
# More info on this: "omitted variable bias"
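# A small simulation sketch of omitted variable bias, assuming made-up data
# (all names and parameter values below are hypothetical, not the beer data):
# x2 affects y and is correlated with x1, so leaving it out biases the
# coefficient of x1.
set.seed(123)
n <- 500
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n)            # x2 is correlated with x1
y <- 1 + 2 * x1 + 3 * x2 + rnorm(n)  # true coefficient of x1 is 2
short_lm <- lm(y ~ x1)               # omits x2
long_lm <- lm(y ~ x1 + x2)           # includes x2
aux_lm <- lm(x2 ~ x1)                # how x2 moves with x1
coef(short_lm)[["x1"]]               # expected to lie near 2 + 3 * 0.8 = 4.4
coef(long_lm)[["x1"]]                # expected to lie near the true value 2
# In-sample, the short coefficient equals the long one plus the bias term:
implied_short <- coef(long_lm)[["x1"]] +
  coef(long_lm)[["x2"]] * coef(aux_lm)[["x1"]]
all.equal(coef(short_lm)[["x1"]], implied_short)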
# 2. Compute R2-----------------
# Illustrating what we mean by total variation:
mean_consumption <- mean(beer_data$consumption)
# seq_along() numbers the observations without hard-coding the sample size:
ggplot(data = beer_data, aes(x = seq_along(consumption), y = consumption)) +
  geom_hline(yintercept = mean_consumption) +
  geom_point() +
  theme_linedraw()
# Compute TSS, RSS and ESS manually:
tss <- sum((beer_data$consumption - mean_consumption)^2)
rss <- sum(residuals(beer_lm)^2)
ess <- sum((fitted(beer_lm) - mean_consumption)^2)
# From this we can compute R2 manually:
ess / tss
# Compare to what you get from, e.g., summary():
summary(beer_lm)[["r.squared"]]
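# A quick consistency check, sketched under the assumption that the model
# contains an intercept: then TSS = ESS + RSS, so 1 - RSS/TSS gives the same
# R2 as ESS/TSS. The helper name `variation_parts` is hypothetical, and the
# built-in `cars` data is only a stand-in so that the sketch runs on its own.
variation_parts <- function(model) {
  y <- model.response(model.frame(model))
  tss <- sum((y - mean(y))^2)
  rss <- sum(residuals(model)^2)
  ess <- sum((fitted(model) - mean(y))^2)
  c(tss = tss, rss = rss, ess = ess, r2 = 1 - rss / tss)
}
cars_lm <- lm(dist ~ speed, data = cars)
parts <- variation_parts(cars_lm)
all.equal(parts[["tss"]], parts[["ess"]] + parts[["rss"]])
all.equal(parts[["r2"]], parts[["ess"]] / parts[["tss"]])
# For the model above: variation_parts(beer_lm)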