@Su-Shee
Created April 15, 2013 08:56
R sqldf
# now let's see if sqldf works nicely. :)
library(sqldf)
# csv_file instead of file, so base::file() is not shadowed
csv_file <- '../data/cleaned-einwohnermelderegister-2011.csv'
csv <- read.csv(csv_file, header = TRUE, sep = ';')
# rename columns like the postgres schema
colnames(csv) <- c("official_district", "district", "gender", "nationality", "age", "quantity")
# show the three largest districts by inhabitants
# (the alias gives the result column a usable name instead of "sum(quantity)")
sqldf("SELECT district, SUM(quantity) AS inhabitants
       FROM csv
       GROUP BY district
       ORDER BY inhabitants DESC
       LIMIT 3")
# show the three districts with the most women
sqldf("SELECT district, SUM(quantity) AS women
       FROM csv
       WHERE gender = 'f'
       GROUP BY district
       ORDER BY women DESC
       LIMIT 3")
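# A minimal sketch, not part of the original gist: a base R cross-check for the
# "three largest districts" query above, handy for sanity-checking what sqldf
# returns. The data frame `toy` and its values are made up for illustration;
# with the real data, `csv` would presumably take its place.
toy <- data.frame(
  district = c("A", "A", "B", "C", "C", "D"),
  gender   = c("f", "m", "f", "f", "m", "m"),
  quantity = c(10, 12, 30, 5, 7, 1),
  stringsAsFactors = FALSE
)
# sum quantity per district (the GROUP BY step)
totals <- aggregate(quantity ~ district, data = toy, FUN = sum)
# sort descending by the summed quantity and keep three rows (ORDER BY ... LIMIT 3)
top3 <- head(totals[order(-totals$quantity), ], 3)
print(top3)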