@rpietro
Created December 9, 2012 08:48
Test manipulating a large (1.8 GB) dataset with SQLite and sqldf
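# Create a data frame with 4e7 rows; written out as CSV it is about 1.8 GB
setwd("~/Desktop")
bigdf <- data.frame(dim = sample(letters, 4e7, replace = TRUE), fact1 = rnorm(4e7), fact2 = rnorm(4e7, 20, 50))
# row.names keeps its default (TRUE), so the file gains a leading row-number column
write.csv(bigdf, 'bigdf.csv', quote = FALSE)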
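The 1.8 GB figure can be sanity-checked without building the full dataset. The sketch below is not part of the original gist: it writes a much smaller sample with the same columns and extrapolates the file size to 4e7 rows. The names `n_small`, `small`, `tmp`, `bytes_per_row` and `est_gb` are illustrative, and the seed is only there for repeatability. Because row numbers in the sample have fewer digits than in the full file, the estimate should come out slightly low.

```r
# Minimal sketch (assumption: a 1e5-row sample is representative of the full file)
set.seed(1)                       # hypothetical seed, only for repeatability
n_small <- 1e5
small <- data.frame(dim   = sample(letters, n_small, replace = TRUE),
                    fact1 = rnorm(n_small),
                    fact2 = rnorm(n_small, 20, 50))

tmp <- tempfile(fileext = ".csv")
write.csv(small, tmp, quote = FALSE)   # same options as the full-size write

# Average bytes per row, scaled up to the 4e7 rows used in the gist
bytes_per_row <- file.size(tmp) / n_small
est_gb <- bytes_per_row * 4e7 / 1e9
est_gb
```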
# Read the CSV back through a temporary SQLite database and time the import
setwd("~/Desktop")
library(sqldf)
f <- file("bigdf.csv")
system.time(bigdf <- sqldf("select * from f", dbname = tempfile(), file.format = list(header = TRUE, row.names = FALSE)))
# Timing reported by the author, in seconds:
#    user  system elapsed
# 246.994  16.165 281.253
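The import above pulls every row back into R. When only a summary is needed, the SQL can do the aggregation inside SQLite so that just the result reaches R. The sketch below is not from the original gist: it uses `read.csv.sql()` from sqldf on a small temporary file, and the names `small`, `tmp` and `agg` are illustrative. Row names are omitted here (an assumption that differs from the gist's file) to keep the header simple.

```r
# Minimal sketch: aggregate in SQLite while reading, on a small stand-in file
library(sqldf)

set.seed(1)                       # hypothetical seed, only for repeatability
small <- data.frame(dim   = sample(letters, 1000, replace = TRUE),
                    fact1 = rnorm(1000),
                    fact2 = rnorm(1000, 20, 50))
tmp <- tempfile(fileext = ".csv")
write.csv(small, tmp, quote = FALSE, row.names = FALSE)

# Inside read.csv.sql the table is referred to as "file";
# only one row per distinct value of dim is returned to R
agg <- read.csv.sql(tmp,
                    sql = "select dim, count(*) as n, avg(fact1) as mean_fact1
                           from file group by dim",
                    header = TRUE, sep = ",")
head(agg)
```

The same query string could be pointed at the full `bigdf.csv`; the import into SQLite would still take time, but R would only have to hold the per-group summary.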