@daltare
Last active December 2, 2020 17:15
Steps to load Electronic Annual Report (EAR) data from the CA State Water Resources Control Board's website into R
# data for multiple years is available at: https://www.waterboards.ca.gov/drinking_water/certlic/drinkingwater/ear.html
# this script downloads data for one year to a temporary file, and loads the data into an R data frame
# load R packages
library(dplyr)
library(readr)
# create link to the dataset
year <- 2018
dataset_link <- paste0('https://www.waterboards.ca.gov/drinking_water/certlic/drinkingwater/documents/ear/earsurveyresults_',
year,
'ry.zip')
# create a temporary file (to store the zip file)
temp_file <- tempfile()
# download the zip file (binary mode, so the archive is not corrupted on Windows)
download.file(dataset_link, temp_file, mode = 'wb')
# get the name of the text file inside the zip archive (this assumes the archive contains a single file)
file_name <- unzip(temp_file, list = TRUE) %>% pull(Name)
# create a connection to the file
con <- unz(temp_file, file_name)
# read the data into a data frame
df_ear_results <- read_tsv(con)
# NOTE: to see problems/errors in importing data from the text file to R, use:
# problems(df_ear_results)
# remove the temp file
unlink(temp_file)
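# --- optional sketch (not part of the original script) ---
# The steps above can be wrapped in a function so the same download/read logic can be reused
# for other reporting years. Assumptions: the helper names `ear_url()` and `load_ear()` are
# hypothetical, and other years are assumed (not verified) to follow the same URL pattern and
# zip layout as the year used above; check the page linked at the top of this script first.
ear_url <- function(year) {
  # build the download link for one reporting year (same pattern as `dataset_link` above)
  paste0('https://www.waterboards.ca.gov/drinking_water/certlic/drinkingwater/documents/ear/earsurveyresults_',
         year,
         'ry.zip')
}
load_ear <- function(year) {
  temp_file <- tempfile(fileext = '.zip')
  # remove the temp file when the function exits, even if a step fails
  on.exit(unlink(temp_file), add = TRUE)
  download.file(ear_url(year), temp_file, mode = 'wb')
  # assumes the first entry in the archive is the tab-delimited data file
  file_name <- unzip(temp_file, list = TRUE)$Name[1]
  readr::read_tsv(unz(temp_file, file_name))
}
# example usage (downloads data, so it is left commented out):
# df_ear_2018 <- load_ear(2018)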