@toyeiei
Last active July 30, 2018 06:23
Clean missing values (NA) in R in these easy steps
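# install tidyverse (only needed once; skip if it is already installed)
install.packages("tidyverse")
# load tidyverse (attaches dplyr, tidyr, ggplot2 and others)
library(tidyverse)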
# step 1 -- review example dataset 'msleep' (ships with ggplot2, which tidyverse loads)
glimpse(msleep)
summary(msleep)
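# optional check (not in the original gist): count NA per column to see
# which variables need cleaning. A minimal sketch using base R only;
# 'na_counts' is a hypothetical name introduced here for illustration.
na_counts <- colSums(is.na(msleep))
# show only the columns that contain at least one NA
na_counts[na_counts > 0]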
# step 2.1 -- remove all rows with missing values
clean_msleep <- drop_na(msleep)
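# optional check (not in the original gist): how many rows did drop_na() remove?
nrow(msleep) - nrow(clean_msleep)
# a possible alternative, assuming only one column matters for the analysis:
# drop_na() also accepts column names, so rows are removed only when
# sleep_rem is NA. 'clean_rem_only' is a hypothetical name.
clean_rem_only <- drop_na(msleep, sleep_rem)
nrow(clean_rem_only)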
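# step 2.2 -- impute NA with mean or median values (pick one approach)
# mean imputation, stored in a copy so the NA values in msleep are kept
mean_sleep_rem <- mean(msleep$sleep_rem, na.rm = TRUE)
msleep_mean <- msleep
msleep_mean$sleep_rem <- replace_na(msleep_mean$sleep_rem, mean_sleep_rem)
# median imputation, computed from the original column (NA still present)
median_sleep_rem <- median(msleep$sleep_rem, na.rm = TRUE)
msleep_median <- msleep
msleep_median$sleep_rem <- replace_na(msleep_median$sleep_rem, median_sleep_rem)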
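# export the clean data frame (rows with NA removed) to our working directory
# row.names = FALSE avoids writing an extra column of row numbers
write.csv(clean_msleep, "clean_msleep.csv", row.names = FALSE)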
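# optional check (not in the original gist): read the file back and confirm
# the row count matches. A minimal sketch; 'check_msleep' is a hypothetical name.
check_msleep <- read.csv("clean_msleep.csv")
nrow(check_msleep) == nrow(clean_msleep)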
# finished !!