programmatically filter data.table
target <- "col1"
value <- 15L
library(data.table)
set.seed(123)
n <- 1e7
dt <- data.table(
  col1 = sample(1L:30L, n, TRUE),
  col2 = sample(letters, n, TRUE),
  col3 = sample(letters, n, TRUE),
  col4 = sample(letters, n, TRUE)
)
cl <- substitute(
x <= y,
list(x = as.name(target), y = value)
)
cl2 <- call("<=", as.name(target), value)
system.time(dt[col1 <= 15L])
# user system elapsed
# 0.389 0.088 0.477
system.time(dt[dt[[target]] <= value])
# user system elapsed
# 0.389 0.084 0.474
system.time(dt[get(target) <= value])
# user system elapsed
# 0.433 0.072 0.511
system.time(dt[eval(cl)])
# user system elapsed
# 0.372 0.096 0.469
system.time(dt[eval(cl2)])
# user system elapsed
# 0.387 0.080 0.469
system.time(dt[eval(as.name(target)) <= value])
# user system elapsed
# 0.385 0.083 0.470
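
The timings above are only comparable if all six forms select the same rows. A minimal sketch of that check follows, assuming a much smaller `n` than the benchmark and only two columns so it runs quickly; the `results` list and the `all_same` flag are names introduced here for illustration and are not part of the original gist.

```r
library(data.table)

target <- "col1"
value <- 15L
set.seed(123)
n <- 1e4  # assumption: far smaller than the benchmark's 1e7, for a quick check
dt <- data.table(col1 = sample(1L:30L, n, TRUE), col2 = sample(letters, n, TRUE))

# the two ways of building the unevaluated call `col1 <= 15L`
cl <- substitute(x <= y, list(x = as.name(target), y = value))
cl2 <- call("<=", as.name(target), value)

# run every filtering variant from the benchmark once
results <- list(
  literal   = dt[col1 <= 15L],
  vector    = dt[dt[[target]] <= value],
  get       = dt[get(target) <= value],
  eval_cl   = dt[eval(cl)],
  eval_cl2  = dt[eval(cl2)],
  eval_name = dt[eval(as.name(target)) <= value]
)

# compare each programmatic variant against the hard-coded filter
all_same <- all(vapply(
  results[-1],
  function(r) isTRUE(all.equal(r, results[[1]], check.attributes = FALSE)),
  logical(1)
))
print(all_same)
```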

DavidArenburg commented May 25, 2015

Tested with n <- 1e8; still getting pretty much the same results:

> system.time(dt[col1 <= 15L])
   user  system elapsed 
   1.90    0.28    2.22 
> system.time(dt[dt[[target]] <= value])
   user  system elapsed 
   1.78    0.37    2.15 
> system.time(dt[get(target) <= value])
   user  system elapsed 
   1.94    0.36    2.36 
> system.time(dt[eval(cl)])
   user  system elapsed 
   1.84    0.35    2.18 
> system.time(dt[eval(cl2)])
   user  system elapsed 
   1.84    0.35    2.19 
> system.time(dt[eval(as.name(target)) <= value])
   user  system elapsed 
   1.82    0.36    2.19 

jangorecki (Author) commented May 25, 2015

Thanks for testing with 1e8. I will also leave here a solution for building the final quoted expression, which can be evaluated later, with the filter expression correctly captured inside a parent function:

substitute(dt[cl], list(cl = evalq(cl)))
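
As a rough illustration of how that one-liner might be used from inside a function, here is a minimal sketch. The helper name `build_filter` and the small example table are hypothetical and introduced only for this example; the table is assumed to be bound to the name `dt` in the environment where the call is later evaluated.

```r
library(data.table)

# Hypothetical helper (not from the gist): returns the unevaluated call
# `dt[<target> <= <value>]` so the caller can evaluate it later.
build_filter <- function(target, value) {
  cl <- call("<=", as.name(target), value)
  # same pattern as the one-liner above: splice the filter call into dt[...]
  substitute(dt[cl], list(cl = evalq(cl)))
}

# small example table, assumed to be named `dt` to match the quoted call
dt <- data.table(col1 = 1:30, col2 = rep(letters[1:3], 10))

qcall <- build_filter("col1", 15L)
print(qcall)        # shows the quoted call, nothing is filtered yet
res <- eval(qcall)  # the filter runs only here
print(nrow(res))
```

Because the result is an ordinary call object, it can be stored, inspected, or modified before it is evaluated.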
