@pbaylis
Created May 27, 2019 18:31
Apply multiple functions to multiple columns (with data.table)
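library(data.table)
# Each summary returns a named list, so unlist() yields names like a.mean, a.median, b.mean, b.median.
my.summary = function(x) list(mean = mean(x), median = median(x))
DT[, as.list(unlist(lapply(.SD, my.summary))), .SDcols = c('a', 'b')]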
@pbaylis
Author

pbaylis commented May 27, 2019

N.B.: This can be slow for very large datasets; the as.list(unlist()) formulation is the culprit. An alternative that takes a bit more code but performs better is to melt the relevant columns to long format and compute the summaries by variable on the melted data.table.
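
A minimal sketch of the melt-based formulation, assuming a small made-up `DT` with numeric columns `a` and `b` (the sample values and the names `long` and `res` are illustrative, not from the original gist). It returns one row per column rather than a single wide row, so it is a different output shape from the one-liner above:

```r
library(data.table)

# Hypothetical sample data; only the column names 'a' and 'b' come from the gist.
DT <- data.table(a = c(1, 2, 3, 10), b = c(4, 5, 6, 7))

# Reshape the columns of interest to long format: one row per (variable, value) pair.
long <- melt(DT, measure.vars = c("a", "b"),
             variable.name = "variable", value.name = "value")

# Compute each summary once per original column.
res <- long[, .(mean = mean(value), median = median(value)), by = variable]
print(res)
```

If the single-row wide layout is needed, the long result can be reshaped afterwards, but for many columns the long form is often easier to work with anyway.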
