@zippeurfou
Created December 12, 2014 05:26
Getting the particulate levels for 1999 and 2013 in R, inspired by the EDA class on Coursera

Air pollution USA

Wednesday, December 10, 2014

## We will be looking at air pollution in the USA

Data are extracted from http://www.epa.gov.

# Load the libraries used below (lazyeval, RCurl and xkcd are loaded but not called in this snippet)
library(dplyr)
library(ggplot2)
library(lazyeval)
library(RCurl)
library(xkcd)
library(maps)
data(state.fips)

##get the data
link2013 <- "http://www.epa.gov/ttn/airs/airsaqs/detaildata/501files/RD_501_88101_2013.zip"
link1999 <- "http://www.epa.gov/ttn/airs/airsaqs/detaildata/501files/Rd_501_88101_1999.Zip"
temp <- tempfile()
# Download each archive, extract the pipe-delimited text file into the working directory,
# and read it without a header (comment.char = "#" skips the lines starting with "#")
download.file(link2013, temp, mode = "wb")
unzip(temp, "RD_501_88101_2013-0.txt")
temp2013 <- read.table("RD_501_88101_2013-0.txt", sep = "|", comment.char = "#", header = FALSE, na.strings = "")
download.file(link1999, temp, mode = "wb")
unzip(temp, "RD_501_88101_1999-0.txt")
temp1999 <- read.table("RD_501_88101_1999-0.txt", sep = "|", comment.char = "#", header = FALSE, na.strings = "")
# Recover the column names from the first line of each file, which read.table skipped
cnames <- readLines("RD_501_88101_2013-0.txt", n = 1)
cnames <- strsplit(cnames, "|", fixed = TRUE)
cnames[[1]][1] <- "RD"  # overwrite the first field with a clean name
names(temp2013) <- make.names(cnames[[1]])
cnames <- readLines("RD_501_88101_1999-0.txt", n = 1)
cnames <- strsplit(cnames, "|", fixed = TRUE)
cnames[[1]][1] <- "RD"
names(temp1999) <- make.names(cnames[[1]])
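
The column-name step can be tried without downloading anything. The sketch below assumes a header line shaped like the one the code above expects (a leading "#" marker and pipe-separated fields); the `header_line` string is a shortened, hypothetical stand-in, not a copy of the real file.

```r
# Hypothetical, shortened header line; the real file has many more fields
header_line <- "# RD|Action Code|State Code|Sample Value"

# fixed = TRUE treats "|" as a literal character rather than a regex alternation
fields <- strsplit(header_line, "|", fixed = TRUE)[[1]]

# The first field still carries the comment marker, so replace it
fields[1] <- "RD"

# make.names() turns the labels into syntactically valid R names
col_names <- make.names(fields)
print(col_names)
```

Spaces become dots, which is why the later code can refer to `State.Code` and `Sample.Value`.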

Now that the data are formatted correctly, let's see how the values changed between 1999 and 2013.

#play with the data
# Keep positive, non-missing samples from states listed in state.fips and tag them with the year
by_state <- function(df, yr) {
  df %>%
    filter(!is.na(State.Code), State.Code %in% state.fips$fips,
           !is.na(Sample.Value), as.numeric(Sample.Value) > 0) %>%
    select(State.Code, Sample.Value) %>%
    arrange(State.Code) %>%
    mutate(year = yr)
}
temp1999.bystate <- by_state(temp1999, "1999")
temp2013.bystate <- by_state(temp2013, "2013")
temp.all <- rbind(temp1999.bystate, temp2013.bystate)
# Replace the numeric FIPS codes with state abbreviations
temp.all$State.Code <- state.fips$abb[match(temp.all$State.Code, state.fips$fips)]
temp.all$State.Code <- as.factor(temp.all$State.Code)
temp.all$year <- as.factor(temp.all$year)
temp.all$Sample.Value <- as.numeric(temp.all$Sample.Value)
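
The FIPS-to-abbreviation step relies on `match()` as a lookup. Here is a minimal sketch of that idiom on a toy table; `lookup` and `codes` are hypothetical stand-ins for `state.fips` and the `State.Code` column, and the repeated row for Massachusetts is there only to show what happens when a code appears more than once.

```r
# Toy lookup table standing in for maps::state.fips (assumption: same two columns)
lookup <- data.frame(
  fips = c(1, 25, 25, 6),
  abb  = c("AL", "MA", "MA", "CA"),
  stringsAsFactors = FALSE
)

# Toy vector of state codes, as they would appear in the measurement data
codes <- c(6, 1, 25, 6)

# match() returns the position of the FIRST matching row for each code,
# so repeated rows in the lookup table do not duplicate any observation
abbs <- lookup$abb[match(codes, lookup$fips)]
print(abbs)
```

A code that is absent from the lookup table would come back as `NA`, which is why the filtering step keeps only codes found in `state.fips$fips` beforehand.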

ggplot(data = temp.all, aes(x = State.Code, y = Sample.Value, fill = year)) +
  geom_boxplot() +
  scale_y_continuous(breaks = seq(0, 30, 1)) +
  coord_cartesian(ylim = c(0, 30)) +
  ylab("Value") +
  xlab("State Code") +
  theme(text = element_text(size = 18), axis.text.x = element_text(angle = 45, hjust = 1))
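
The boxplot compares the two years visually. A numeric version of the same comparison could be sketched as the per-state median for each year and the difference between them. The `toy` data frame below uses made-up values purely for illustration; with the real data, `temp.all` would take its place.

```r
# Made-up sample values, NOT real EPA measurements; columns mirror temp.all
toy <- data.frame(
  State.Code   = c("CA", "CA", "CA", "CA", "NY", "NY", "NY", "NY"),
  year         = c("1999", "1999", "2013", "2013", "1999", "1999", "2013", "2013"),
  Sample.Value = c(20, 24, 10, 12, 14, 16, 8, 10),
  stringsAsFactors = FALSE
)

# Median per state (rows) and year (columns)
medians <- tapply(toy$Sample.Value, list(toy$State.Code, toy$year), median)

# Negative values mean the median went down between 1999 and 2013
change <- medians[, "2013"] - medians[, "1999"]

print(medians)
print(change)
```

Base R `tapply()` is used here to keep the sketch dependency-free; a `group_by()` plus `summarise()` pipeline in dplyr would be an equally reasonable choice.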
