# Gist by @jlopezper, last active August 26, 2020 17:29
# Benchmark of a grouped mean in base R, dplyr, data.table, and dtplyr
library(data.table)
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)
library(microbenchmark)
library(ggplot2)
# Reproducible test data: 3 million rows in three unbalanced groups
set.seed(123)
df <- data.frame(
  letters = sample(c('A', 'B', 'C'), size = 3e6, replace = TRUE, prob = c(.5, .4, .1)),
  number = sample(1:5, size = 3e6, replace = TRUE)
)
# The same data as a lazy dtplyr table and as a data.table
dpt <- lazy_dt(df)
dt <- data.table(df)
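# Time the mean of `number` by `letters` with each approach
# (microbenchmark evaluates each expression 100 times by default).
# as.data.table() forces the lazy dtplyr query to execute.
mm <- microbenchmark(
  base = aggregate(df["number"], df["letters"], mean),
  dplyr = df %>% group_by(letters) %>% summarise(mean(number)),
  DT = dt[, mean(number), by = letters],
  dtplyr = dpt %>% group_by(letters) %>% summarise(mean(number)) %>% as.data.table()
)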
# Plot the distribution of timings for each approach
autoplot(mm)