@izahn
Created July 18, 2019 18:53
A quick comparison between fst and rds IO characteristics
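library(fst)

## Test data: 1e6 rows, 50 columns of sampled letters and 50 numeric columns
n <- 1000000
d1 <- as.data.frame(c(replicate(50, sample(letters[1:10], n, replace = TRUE), simplify = FALSE),
                      replicate(50, runif(n), simplify = FALSE)))
tmpf <- tempfile()

## Write d1 to `file` with the given compression setting, then read it back.
## Returns one row: elapsed write time (s), file size on disk, elapsed read time (s).
rw_time <- function(compression, file, write.fun, read.fun) {
  data.frame(compression = compression,
             writeTime = system.time(write.fun(d1, file, compress = compression))[[3]],
             fileSize = paste(round(file.size(file)/1024^2, 2), "Mb"),
             readTime = system.time(read.fun(file))[[3]])
}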

## fst
do.call(
  rbind,
  lapply(seq(50, 100, 10),
         rw_time,
         write.fun = write_fst,
         read.fun = read_fst,
         file = tmpf))
#   compression writeTime  fileSize readTime
# 1          50     0.248 430.42 Mb    0.119
# 2          60     0.319  397.7 Mb    0.176
# 3          70     0.614  364.1 Mb    0.147
# 4          80     1.217 329.85 Mb    0.138
# 5          90     3.980 294.82 Mb    0.149
# 6         100     5.985 260.24 Mb    0.226

## rds
do.call(
  rbind,
  lapply(c(FALSE, TRUE),
         rw_time,
         write.fun = saveRDS,
         read.fun = readRDS,
         file = tmpf))
#   compression writeTime  fileSize readTime
# 1       FALSE     0.551 572.23 Mb     0.52
# 2        TRUE    48.262 284.44 Mb     2.28