@primaryobjects
Last active November 13, 2018 02:51
Single variable linear regression, calculating baseline prediction, SSE, SST, R^2 of the model.
---
title: "Single Variable Linear Regression R^2"
output: html_document
---
The following figure shows three data points and the best fit line:
y = 3x + 2.
The x-coordinate, or "x", is our independent variable and the y-coordinate, or "y", is our dependent variable.
Coordinates:
```{r, echo=FALSE}
data <- data.frame(x=c(0, 1, 1), y = c(2, 2, 8))
data
```
```{r}
data <- data.frame(x=c(0, 1, 1), y = c(2, 2, 8))
plot(data, pch=15, xlab='Independent', ylab='Dependent', ylim=c(0,10), xlim=c(-3, 3))
fit <- lm(y ~ x, data = data)
abline(fit)
```
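As a quick check on the line quoted above, the fitted coefficients can be read directly from the model. This is a minimal sketch: `pts` and `check_fit` are names introduced here for illustration so the chunk runs on its own, and are not part of the original analysis.

```{r}
# Hypothetical names (pts, check_fit), used so this chunk is self-contained.
pts <- data.frame(x = c(0, 1, 1), y = c(2, 2, 8))
check_fit <- lm(y ~ x, data = pts)
# coef() returns the intercept first, then the slope for x.
coef(check_fit)
```

The intercept should come out as 2 and the slope as 3, matching y = 3x + 2. With two points sharing x = 1, the least-squares line passes through their mean, y = 5.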
What is the baseline prediction?
```{r}
# The baseline ignores x and predicts the mean of y for every point.
# (2 + 2 + 8) / 3 = 4
baseline <- mean(data$y)
plot(data, pch=15, xlab='Independent', ylab='Dependent', ylim=c(0,10), xlim=c(-3, 3))
abline(a=baseline, b=0, col='blue')
```
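One way to see why the mean serves as the baseline: it is what a regression with no predictor at all would fit. The sketch below assumes that reading; `pts` and `null_fit` are illustrative names, not part of the original code.

```{r}
# Hypothetical names (pts, null_fit); y ~ 1 fits an intercept and no x term.
pts <- data.frame(x = c(0, 1, 1), y = c(2, 2, 8))
null_fit <- lm(y ~ 1, data = pts)
# The single coefficient is the mean of y, i.e. the baseline prediction.
coef(null_fit)
mean(pts$y)
```

Both values should be 4, the height of the blue horizontal line.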
What is the Sum of Squared Errors (SSE)?
```{r}
# The fitted values from y = 3x + 2 are 2, 5, 5.
# (2 - 2)^2 + (2 - 5)^2 + (8 - 5)^2
# 0^2 + (-3)^2 + 3^2
# 0 + 9 + 9
# 18
# predict(fit) returns the fitted value for each row; rounding it would distort the residuals.
yhat <- predict(fit)
SSE <- sum((data$y - yhat)^2)
SSE
```
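The SSE can also be cross-checked against the residuals that `lm` already stores. This is a sketch under the same data; `pts` and `sse_fit` are illustrative names introduced here.

```{r}
# Hypothetical names (pts, sse_fit), used so this chunk is self-contained.
pts <- data.frame(x = c(0, 1, 1), y = c(2, 2, 8))
sse_fit <- lm(y ~ x, data = pts)
# residuals() gives y minus the fitted value for each row.
sum(residuals(sse_fit)^2)
# For an lm fit, deviance() reports the same residual sum of squares.
deviance(sse_fit)
```

Both expressions should agree with the hand calculation of 18.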
What is the Total Sum of Squares (SST)?
```{r}
# SST is the squared error of the baseline (mean) prediction.
# (2 - 4)^2 + (2 - 4)^2 + (8 - 4)^2
# 4 + 4 + 16
# 24
SST <- sum((data$y - baseline)^2)
SST
```
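Because SST is the sum of squared deviations from the mean, it is closely related to the sample variance. The sketch below illustrates that relationship; `y_vals` is an illustrative name holding the same three y values.

```{r}
# Hypothetical name (y_vals): the three y values from the data above.
y_vals <- c(2, 2, 8)
# var() divides by n - 1, so multiplying by n - 1 recovers the sum of squares.
(length(y_vals) - 1) * var(y_vals)
```

The sample variance here is 12, so the product is 24, the same SST as above.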
What is the R² of the model?
```{r}
# 1 - (SSE / SST)
# 1 - (18 / 24)
# 1 - 0.75
# 0.25
1 - (SSE / SST)
```
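The hand-computed R² can be compared with the value R reports for the same model. This is a minimal sketch; `pts` and `r2_fit` are illustrative names so the chunk does not depend on earlier ones.

```{r}
# Hypothetical names (pts, r2_fit), used so this chunk is self-contained.
pts <- data.frame(x = c(0, 1, 1), y = c(2, 2, 8))
r2_fit <- lm(y ~ x, data = pts)
# summary() on an lm fit exposes R^2 as the r.squared element.
summary(r2_fit)$r.squared
```

This should match 1 - (18 / 24) = 0.25: the fitted line removes a quarter of the squared error left by the baseline.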