Last active
December 9, 2016 21:00
arvi1000/498626a221e2ccdca8580e95ec68d2d2
A setcols function for data.table
# I love data.table, but I find the syntax for "mutating" columns a little clunky for such a common task.
# I wonder if it would be useful to have a setcols() function?
# It takes advantage of data.table's modify-by-reference semantics: set() updates columns in place, without copying.
library(data.table)

setcols <- function(my_dt, cols, my_fun) {
  # validate inputs
  stopifnot(is.data.table(my_dt),
            is.character(cols),
            is.function(my_fun))
  # apply my_fun to each of cols in my_dt, modifying my_dt by reference
  for (j in cols) set(my_dt, j = j, value = my_fun(my_dt[[j]]))
  # return my_dt invisibly, as setnames() and set() do
  invisible(my_dt)
}

# Now you can do things like this.
# ...given a data.table of mixed types
dat <- data.table(letters = c('a', 'b', 'c', 'd', 'e'),
                  fruits = c('apple', 'banana', 'carrot', 'durian', 'elderberry'),
                  num1 = rnorm(5),
                  num2 = seq(10, 50, 10))

# ...change some data types
setcols(dat, c('letters', 'fruits'), as.factor)

# ...transform some numbers
setcols(dat, c('num1', 'num2'), function(x) x*2 + 5)

# This simple wrapper seems to stay idiomatic to data.table, which already has
# functions such as setDT and setnames that modify the data.table passed to them.
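
For comparison, here is a minimal sketch of what the same kind of update looks like without the wrapper. It assumes the "clunky" syntax mentioned at the top refers to the usual `:=` with `lapply(.SD, ...)` and `.SDcols` idiom; the gist does not say so explicitly. The names `dat2` and `num_cols` are hypothetical and used only for illustration.

```r
library(data.table)

# Hypothetical example data, similar in shape to `dat` above
dat2 <- data.table(letters = c('a', 'b', 'c'),
                   num1 = c(1, 2, 3),
                   num2 = c(10, 20, 30))

num_cols <- c('num1', 'num2')

# Native idiom: assign by reference with `:=`, applying the function to
# each column selected through .SDcols
dat2[, (num_cols) := lapply(.SD, function(x) x*2 + 5), .SDcols = num_cols]

# The same kind of in-place update written as a set() loop,
# which is the pattern setcols() wraps
for (j in 'letters') set(dat2, j = j, value = as.factor(dat2[[j]]))
```

Both forms modify `dat2` in place. The wrapper mainly saves repeating the column vector on both sides of `:=`.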
Not sure `setcols` is the right name. Maybe `mutate_cols`, but that's dplyr talk. Seems like data.table needs something in the `set[...]` family.