@alekrutkowski
Last active January 12, 2024 11:37
Concise/terse filtering of R data.table rows by a condition within groups (inside a magrittr pipe)
library(data.table)
library(magrittr)
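## Pattern:
# DT[DT[, row_filtering_expression, by=grouping_column][[2]]]
## or, more generally:
# DT[DT[, row_filtering_expression, by=.(grouping_col1, grouping_col2, etc.)][[number_of_grouping_cols + 1]]]
## The inner call returns the grouping column(s) followed by one logical column;
## `[[...]]` extracts that logical vector, which then subsets the rows of DT.
## Note: the logical vector comes back in group order (groups in order of first
## appearance), so the rows of DT must already be contiguous by group.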
## Example:
data.table(a=c(1,1,1,2,2),
           b=c(1,0,3,-1,5),
           v=1:5) %>%
    print() %>%
    #    a  b v
    # 1: 1  1 1
    # 2: 1  0 2
    # 3: 1  3 3
    # 4: 2 -1 4
    # 5: 2  5 5
    .[.[,b==min(b), by=a][[2]]] %>%
    print()
#    a  b v
# 1: 1  0 2
# 2: 2 -1 4
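## A sketch of an order-independent variant (not part of the pattern above):
## the logical vector is returned in group order, so it lines up with the rows
## only when each group's rows are contiguous. Collecting row numbers with
## data.table's `.I` avoids that assumption. `unsortedDT` is a made-up example.
unsortedDT <- data.table(a=c(1,2,1),
                         b=c(0,1,3),
                         v=1:3)
unsortedDT[unsortedDT[, .I[b==min(b)], by=a]$V1] %>%
    print()
# Keeps rows 1 and 2: (a=1, b=0, v=1) and (a=2, b=1, v=2),
# whereas the `[[2]]` pattern would select rows 1 and 3 here.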
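## A helper function:
# bquote() splices the unevaluated `row_filtering_expression` and `by` arguments
# into the pattern above, and eval() then runs the resulting call.
# The index of the logical column is the number of grouping columns + 1:
# a `by` given as `.(...)` or `list(...)` is a list of columns, a bare column name is not.
# Limitation: a multi-column character `by` (e.g. c("a","b")) is not detected as a list.
filterDTrowsWithinGroups <-
    function(DT, row_filtering_expression, by, dot_is_list=TRUE)
        eval(bquote({
            if (dot_is_list)
                . <- list # data.table's alias
            DT[DT[, .(substitute(row_filtering_expression))
                  , by = .(substitute(by))]
               [[if (is.list(.(substitute(by))))
                   length(.(substitute(by))) + 1
               else
                   2]]]
        }))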
## Usage example (the same as before):
data.table(a=c(1,1,1,2,2),
           b=c(1,0,3,-1,5),
           v=1:5) %>%
    filterDTrowsWithinGroups(b==min(b), by=a) %>%
    print()
## Returns again:
#    a  b v
# 1: 1  0 2
# 2: 2 -1 4
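## A further usage sketch with two grouping columns, passed as by=.(...).
## The column `g` and the data are made up for illustration; the rows are
## already contiguous by (a, g) group, as the pattern appears to require.
data.table(a=c(1,1,1,2,2),
           g=c("x","x","y","y","y"),
           b=c(1,0,3,-1,5),
           v=1:5) %>%
    filterDTrowsWithinGroups(b==min(b), by=.(a,g)) %>%
    print()
## In this data the minimum-b rows per (a, g) group are rows 2, 3 and 4
## (v = 2, 3, 4), so those are the rows that should be kept.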