@naomispence
Last active April 11, 2024 14:14
#Lab: Scatterplots and Regression
#Load the libraries and data first
library(ggplot2)
library(dplyr)
library(lsr)
library(descr)
library(Hmisc)
library('lehmansociology')
data(gss123)
options(scipen = 999)
#Scatterplots and regression are used when both the independent
#and the dependent variable are interval-ratio.
#In this example, age will be our independent variable
#and spouse's highest year of education will be our dependent variable.
#WHAT WOULD THE RESEARCH QUESTION, NULL HYPOTHESIS,
#AND RESEARCH HYPOTHESIS BE FOR THIS PAIR OF VARIABLES?
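#A quick check before graphing (a sketch added for illustration, not
#part of the original lab). It assumes gss123 stores age and speduc
#as numeric columns; summary() then reports the range, the mean,
#and the number of missing values (NA's) for each variable
summary(gss123$age)
summary(gss123$speduc)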
#We will begin by graphing the two variables together in
#a scatterplot. stat_smooth(method="lm") adds the straight line that
#best fits the data, with a shaded 95% confidence band around it.
ggplot(gss123, aes(x = age, y = speduc)) +
  geom_point() +
  stat_smooth(method = "lm") +
  ggtitle("The Relationship between Age and Spouse's Highest\nYear of Education") +
  labs(y = "Spouse's Highest Year of Education", x = "Age")
#INTERPRET THE SCATTERPLOT HERE
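#A minimal sketch of the regression behind the line in the scatterplot
#(added for illustration; the object name "model" is our own choice).
#lm() fits speduc = a + b*age by ordinary least squares, which is the
#same kind of fit that stat_smooth(method="lm") draws
model <- lm(speduc ~ age, data = gss123)
summary(model)
#The slope (the "age" entry) is the predicted change in spouse's
#highest year of education for each additional year of age
coef(model)
#Pearson's r for the same pair, using only cases with both values present
cor(gss123$age, gss123$speduc, use = "complete.obs")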