# Gist by @bwv988, last active August 29, 2015.
# XML package demo.
# Create a bar plot of the distribution of first letters
# of city names in the Mondial database.
# RS15082015
library(XML)
library(ggplot2)
# Load data.
xml <- xmlTreeParse("~/Temp/mondial-3.0.xml", useInternalNodes = TRUE)
# Find what we are looking for.
root <- xmlRoot(xml)
city.names <- unlist(xpathApply(root, "//country/city/name", xmlValue, trim = TRUE))
# Create temporary data frame for better handling.
counts.df <- data.frame(letter = LETTERS,
                        count = sapply(LETTERS, function(L) {
                          # Number of city names starting with letter L.
                          sum(grepl(paste0("^", L), city.names))
                        }))
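# Optional cross-check (a sketch, not part of the original script): the same
# per-letter counts can be obtained in one vectorized step by tabulating the
# first character of each name. The helper name `first.letter.counts` is
# hypothetical, introduced here only for illustration. Like the grepl()
# approach above, it is case-sensitive and ignores names that do not start
# with one of the 26 upper-case ASCII letters.
first.letter.counts <- function(names) {
  # factor() with levels = LETTERS keeps letters with zero occurrences.
  table(factor(substr(names, 1, 1), levels = LETTERS))
}
# Possible usage, assuming city.names has been built as above:
#   counts.check <- first.letter.counts(city.names)
#   all(as.vector(counts.check) == counts.df$count)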
# Now make a nice looking barplot.
p <- ggplot(counts.df, aes(x = factor(letter), y = count,
                           fill = rep(c("1", "2"), 13))) +
  geom_bar(stat = "identity")
p <- p + labs(title = "Empirical Distribution of First Letters",
              x = "Letters", y = "Count", fill = NULL)
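# Alternate the bar colors and hide the legend
# (guides(fill = FALSE) is deprecated in recent ggplot2 releases).
p <- p + scale_fill_brewer(palette = "Set1") + guides(fill = "none")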
plot(p)