@AlbanSagouis
Created January 31, 2022 15:04
microbenchmarking data.table::uniqueN(x) vs. length(unique(x))
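library(microbenchmark)

# Number of rows in the test table
nrows <- 10^5

# One character column with four distinct values, sampled with replacement
tst <- data.table::data.table(
  char = sample(c("bb", "ds", "ok", "pb"), nrows, replace = TRUE)
)

# Compare counting distinct values with base R vs. data.table
microbenchmark(
  times = 1000L,
  "base"       = tst[, length(unique(char))],
  "data.table" = tst[, data.table::uniqueN(char)]
)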
# data.table::uniqueN() is faster (by roughly a factor of 2) only above 10^6 rows
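
The note above puts the crossover at around 10^6 rows, while the benchmark itself runs at 10^5. One way to check where the crossover falls on a given machine is to repeat the same comparison over several table sizes. The sketch below is not part of the original gist: the helper `compare_unique()`, the `sizes` range, and `times = 10L` are illustrative assumptions, and the medians will differ from machine to machine.

```r
library(microbenchmark)

# Hypothetical helper (not in the original gist): benchmark both approaches
# for a table of `nrows` rows and return the median timing of each in ms.
compare_unique <- function(nrows, times = 10L) {
  tst <- data.table::data.table(
    char = sample(c("bb", "ds", "ok", "pb"), nrows, replace = TRUE)
  )
  # Sanity check: both expressions must count the same distinct values.
  stopifnot(tst[, length(unique(char))] == tst[, data.table::uniqueN(char)])
  res <- microbenchmark(
    times = times,
    "base"       = tst[, length(unique(char))],
    "data.table" = tst[, data.table::uniqueN(char)]
  )
  # A fixed unit keeps the medians comparable across table sizes.
  s <- summary(res, unit = "ms")
  data.frame(nrows = nrows, expr = as.character(s$expr), median_ms = s$median)
}

# Assumed range of sizes; drop 10^7 if memory or run time is a concern.
sizes <- 10^(4:7)
results <- do.call(rbind, lapply(sizes, compare_unique))
print(results)
```

Comparing the two `median_ms` values at each `nrows` shows at which size, if any, `uniqueN()` overtakes `length(unique())`. With only four distinct values in the column, the result may not carry over to high-cardinality data.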
@AlbanSagouis
Author

Unit: milliseconds
       expr    min     lq     mean  median      uq     max neval
       base 1.3939 2.0471 2.904923 2.42835 2.99170 21.9820  1000
 data.table 1.5846 2.5092 2.936799 2.87310 3.15655 16.8523  1000
