@rjpower
Created March 20, 2012 21:59
Plot the distribution of sleeping habits as a heat map (hour of day vs. year, from sent-mail timestamps)
library("ggplot2")
library("plyr")     # count() comes from plyr
library("reshape")
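# Read every Thunderbird "Sent Mail" mbox file and return all lines as one character vector.
getData = function() {
  sources = Sys.glob('/home/power/.thunderbird/*/*/*/*/Sent Mail')
  # Sys.glob() may match several files; readLines() reads only one at a time.
  return(unlist(lapply(sources, readLines)))
}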
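# Extract the timestamp from each "Date:" header, e.g. "20 Mar 2012 21:59:00 -0400".
# Only negative UTC offsets that start with 0 (such as -0400 or -0800) are matched.
getMatches = function(data) {
  matched_lines = grep('^Date:.*, .*-0\\d+', data, value=TRUE)
  return(gsub('Date:.*, (.*-0\\d+).*', '\\1', matched_lines))
}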
lines = getData()
matches = getMatches(lines)
# strptime() is vectorized, so parse all timestamps in one call.
dates = strptime(matches, format="%d %b %Y %H:%M:%S %z")
# POSIXlt stores the year as years since 1900.
df = data.frame(year = dates$year + 1900, hour = dates$hour)
df_counts = count(df, vars=c("year", "hour"))
# Normalize the counts within each year so that each year's frequencies sum to 1.
for (year in unique(df_counts$year)) {
  m = df_counts$year == year
  df_counts$freq[m] = df_counts$freq[m] / sum(df_counts$freq[m])
}
p = ggplot(df_counts, aes(x=year, y=hour)) +
scale_x_continuous(limits=c(2004, 2012)) +
# hour is a plain number (0-23), not a date-time, so use a continuous scale.
scale_y_continuous() +
geom_tile(aes(fill=freq)) +
scale_fill_gradient()
show(p)