@keberwein
Last active January 9, 2020 18:57
BLS Occupation Code FTP Scrape Appleton, WI
library(blscrapeR)
library(dplyr)
library(stringr)
library(data.table)
# Use fread from data.table here because it is much faster, and the first file is very large.
doc <- data.table::fread("https://download.bls.gov/pub/time.series/oe/oe.data.0.Current")
titles <- data.table::fread("https://download.bls.gov/pub/time.series/oe/oe.occupation")
data_type <- data.table::fread("https://download.bls.gov/pub/time.series/oe/oe.datatype")
# Subset to area 11540 only, which I believe is Appleton, WI.
ids <- subset(doc, substr(doc$series_id, 1, 11) == "OEUM0011540") %>%
  # Extract the occupation code from the series ID
  mutate(occupation_code = str_sub(series_id, -8, -3)) %>%
  mutate(occupation_code = as.integer(occupation_code)) %>%
  # Extract the data type code from the series ID
  mutate(datatype_code = str_sub(series_id, -2, -1)) %>%
  mutate(datatype_code = as.integer(datatype_code))
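# Join on the extracted codes and produce the final data set.
# The join keys are spelled out so no other shared column is matched by accident.
df <- inner_join(ids, titles, by = "occupation_code") %>%
  inner_join(data_type, by = "datatype_code") %>%
  select(year, period, series_id, occupation_name, datatype_name, value)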
# Optional: You can make it look exactly like the webpage table by using the tidyr "spread" function.
pretty_df <- select(df, occupation_name, datatype_name, value) %>%
  tidyr::spread(key = datatype_name, value = value)
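
# A minimal sketch of how the substr()/str_sub() offsets above map onto a
# 25-character OE series ID. The field layout is an assumption inferred from
# the offsets used in this script: "industry_code" is only a guessed label for
# the six characters the script never reads, and parse_oe_series_id() is a
# hypothetical helper, not part of blscrapeR or any BLS tooling.
# The sample ID is assembled by hand for illustration, not taken from the data.
parse_oe_series_id <- function(series_id) {
  data.frame(
    prefix          = substr(series_id, 1, 4),    # "OEUM" in the filter above
    area_code       = substr(series_id, 5, 11),   # "0011540" in the filter above
    industry_code   = substr(series_id, 12, 17),  # assumed label, unused above
    # For a 25-character ID, positions 18-23 match str_sub(series_id, -8, -3)
    occupation_code = as.integer(substr(series_id, 18, 23)),
    # Positions 24-25 match str_sub(series_id, -2, -1)
    datatype_code   = as.integer(substr(series_id, 24, 25)),
    stringsAsFactors = FALSE
  )
}
sample_id <- paste0("OEUM", "0011540", "000000", "110000", "01")
parsed <- parse_oe_series_id(sample_id)
print(parsed)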