XML package example

advent_XML.R
doInstall <- TRUE
toInstall <- c("XML")
if(doInstall){install.packages(toInstall, repos = "http://cran.us.r-project.org")}
lapply(toInstall, library, character.only = TRUE)
 
myURL <- "http://en.wikipedia.org/wiki/United_States_presidential_election,_2012"
 
allTables <- readHTMLTable(myURL)
str(allTables) # Look at the allTables object to find the specific table we want
stateTable <- allTables[[14]] # The state-by-state results are the 14th table (the 13th is only the summary box above it)
head(stateTable)
 
# Clean up:
stateTable <- stateTable[1:(nrow(stateTable)-2), ] # Drop summary lines
stateTable$State <- do.call(rbind, strsplit(as.character(stateTable$State), "\\["))[, 1]
stateTable$State[stateTable$State == "District of ColumbiaDistrict of Columbia"] <- "District of Columbia"
# Treat a column as numeric if any of its entries contains a thousands separator:
whichAreNumeric <- colMeans(apply(stateTable, 2, function(cc){
  regexpr(",", cc) != -1})) > 0
stateTable[, whichAreNumeric] <- apply(stateTable[, whichAreNumeric], 2, function(cc){
  as.numeric(gsub(",", "", cc))})
 
# Display in order of Obama's proportion of the vote:
stateTable[, c("State", "Obama", "Romney")][with(stateTable, order(Obama/Total)), ]
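The cleanup step can be tried without fetching the Wikipedia page. The sketch below applies the same idea (strip footnote markers, detect comma-formatted columns, convert them to numeric) to a small made-up data frame. The object `toy` and all of its values are invented for illustration, and it uses `sapply`/`lapply` rather than `apply`, which is a substitution and not what the script above does.

```r
# Hypothetical stand-in for the scraped table; all values are invented.
toy <- data.frame(
  State  = c("Alpha[1]", "Beta"),
  Obama  = c("1,234", "567"),
  Romney = c("890", "2,345"),
  stringsAsFactors = FALSE
)

# Strip footnote markers such as "[1]" from the state names.
toy$State <- sub("\\[.*$", "", toy$State)

# Flag columns where at least one entry contains a thousands separator.
hasComma <- sapply(toy, function(cc) any(grepl(",", cc, fixed = TRUE)))

# Remove the commas and convert only the flagged columns to numeric.
toy[hasComma] <- lapply(toy[hasComma], function(cc) {
  as.numeric(gsub(",", "", cc, fixed = TRUE))
})

str(toy)
```

Working column by column with `lapply` avoids passing the data frame through a character matrix, which may matter if the columns are factors.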

Should that be table 14? Table 13 appears to just be the summary "table" right above it.

Using table 13:

stateTable: 2 obs. of 1 variable

                              V1
1  States/districts won by Obama
2 States/districts won by Romney
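One way to avoid guessing the index is to pick the table by its column names instead. This is only a sketch: the `allTables` list here is a made-up stand-in for what `readHTMLTable()` returns, and it assumes the real table's headers include `State`, `Obama`, `Romney` and `Total` (the names the script's last line relies on), which may not match the live page exactly.

```r
# Hypothetical stand-in for the list returned by readHTMLTable(); contents are invented.
allTables <- list(
  data.frame(V1 = c("States/districts won by Obama",
                    "States/districts won by Romney")),
  data.frame(State  = c("Alpha", "Beta"),
             Obama  = c("1,234", "567"),
             Romney = c("890", "2,345"),
             Total  = c("2,124", "2,912"))
)

# Columns the state-level table is expected to have (an assumption about the page).
wanted <- c("State", "Obama", "Romney", "Total")

# Find every list element that is a data frame containing all wanted columns.
matches <- which(sapply(allTables, function(tb) {
  is.data.frame(tb) && all(wanted %in% names(tb))
}))

# Stop early if the match is missing or ambiguous rather than guessing.
stopifnot(length(matches) == 1)
stateTable <- allTables[[matches]]
head(stateTable)
```

If the page layout shifts and tables are added or removed, this fails loudly instead of silently returning the summary box.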

Owner

Thanks! Fixed the Gist.
