@helenaeitel
Created September 23, 2016 16:57
#Helena Eitel
#QSS 30.05
#Lab Assignment 2
#Last Modified: 9/22/16
#load the packages used below
library(dplyr)
library(readr)
library(tidyr)
#make sure working directory is correct
getwd()
#read IPUMS data
a <- read_csv("./Extract1.csv")
#read a file with the character conversions for race
r <- read_csv("./Race.csv")
#exclude Alaska (STATEFIP 2) and Hawaii (STATEFIP 15) for years before 1960
aa <- a %>% filter(!(YEAR < 1960 & STATEFIP %in% c(2,15)))
#add the corresponding race labels (RACEC) to dataframe aa in a new column
b <- right_join(aa, r, by = "RACE")
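#optional sanity check (a sketch, not part of the original assignment):
#right_join() keeps every row of r but drops rows of aa whose RACE code
#has no match in Race.csv, so listing those codes shows whether any
#person-weights would be lost. This assumes RACE is the key column in both
#tables; "unmatched" is a hypothetical name used only for this check.
unmatched <- anti_join(aa, r, by = "RACE") %>% count(RACE)
print(unmatched)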
#within each year and within each race category return the number of people (sum of PERWTs)
c <- b %>% group_by(YEAR, RACEC) %>% summarise(NUMBER = sum(PERWT))
#remove two race categories because there is no information there
d <- c %>% filter(!(RACEC == "Three or more major races" | RACEC == "Two major races"))
#reshape the table so that each year is a column and each race is a row
e <- d %>% spread(YEAR, NUMBER)
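#alternative sketch (assumes tidyr >= 1.0.0, where pivot_wider() supersedes spread()):
#the same reshape written with pivot_wider(). "e_wide" is a hypothetical name
#used only for comparison with e; ungroup() drops the YEAR grouping left over
#from summarise() before reshaping.
e_wide <- d %>% ungroup() %>% pivot_wider(names_from = YEAR, values_from = NUMBER)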
#save the final file to the working directory
write_csv(e,"RaceTable.csv")