@wrathematics
Created June 15, 2021 17:14
out of core example with sqlite
library(DBI)
library(RSQLite)
# create fake data
set.seed(1234)
n = 100
big_tbl = data.frame(
  ind = 1:n,
  x = runif(n),
  y = rnorm(n)
)
# write table to disk
db = dbConnect(RSQLite::SQLite(), "/tmp/db.sqlite")
# overwrite = TRUE so re-running the script does not fail on an existing table
dbWriteTable(db, "big_tbl", big_tbl, overwrite = TRUE)
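# process the table one chunk at a time; the chunks are independent,
# so in principle this can be done in parallel
rows_per_chunk = 7
chunks = ceiling(n / rows_per_chunk)
for (chunk in seq_len(chunks)){
  # each chunk covers the half-open index range [ind_low, ind_high)
  ind_low = (chunk-1) * rows_per_chunk + 1
  ind_high = ind_low + rows_per_chunk
  query = paste("SELECT * FROM big_tbl WHERE ind >=", ind_low, "AND ind <", ind_high)
  # only this chunk's rows are read into memory
  sub_tbl = dbGetQuery(db, query)
  print(sub_tbl)
}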
# close connection
dbDisconnect(db)
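
The script above notes that the chunks could, in principle, be processed in parallel. Below is a minimal sketch of one way that might look, not part of the original gist. It assumes a Unix-alike system where `parallel::mclapply` can fork (on Windows it falls back to a single core), a scratch database in a temporary file rather than `/tmp/db.sqlite`, and a hypothetical helper `process_chunk` that reduces each chunk to a small summary instead of printing it. Each task opens its own connection, since a connection should not be shared across forked processes.

```r
library(DBI)
library(RSQLite)
library(parallel)

# assumption: a scratch database in a temp file, separate from the one above
path = tempfile(fileext = ".sqlite")
n = 100
rows_per_chunk = 7

# write the fake data to disk, then close the parent's connection
db = dbConnect(RSQLite::SQLite(), path)
dbWriteTable(db, "big_tbl", data.frame(ind = 1:n, x = runif(n), y = rnorm(n)))
dbDisconnect(db)

# hypothetical helper: read one chunk and reduce it to a small summary
process_chunk = function(chunk){
  ind_low = (chunk-1) * rows_per_chunk + 1
  ind_high = ind_low + rows_per_chunk
  # one connection per task, closed when the function exits
  con = dbConnect(RSQLite::SQLite(), path)
  on.exit(dbDisconnect(con))
  sub_tbl = dbGetQuery(
    con,
    "SELECT * FROM big_tbl WHERE ind >= ? AND ind < ?",
    params = list(ind_low, ind_high)
  )
  c(rows = nrow(sub_tbl), sum_x = sum(sub_tbl$x))
}

chunks = ceiling(n / rows_per_chunk)
# forking is not available on Windows, so use one core there
cores = if (.Platform$OS.type == "windows") 1L else 2L
results = mclapply(seq_len(chunks), process_chunk, mc.cores = cores)

# combine the per-chunk summaries; only these small vectors are held in memory
summary_tbl = do.call(rbind, results)
total_rows = sum(summary_tbl[, "rows"])
```

Returning a small summary per chunk, rather than the chunk itself, keeps the combined result small no matter how large the table on disk is. Because all tasks here only read, concurrent access to the same SQLite file should not run into write locking.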