@jimhester
Created October 5, 2015 16:43
Case-insensitive joins for dplyr
# Example data: `a` has lower-case keys, `b` has upper-case keys.
a <- data.frame(letter = letters[1:10], num = 1:10)
b <- data.frame(letter = LETTERS[1:5], num2 = c(1:3, 9:10), val = 11:15)
library(dplyr)
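# Wrap a dplyr join function so that key columns are matched case-insensitively.
# Note: dplyr:::common_by() is not part of dplyr's public API, so this may
# break across dplyr versions.
insensitive <- function(fun = inner_join) {
  new_fun <- fun
  # substitute() replaces `fun` in the new body with the join that was passed in.
  body(new_fun) <- substitute({
    # Resolve `by` into a list with elements `x` and `y`.
    by <- dplyr:::common_by(by, x, y)
    # Names for the temporary key columns, e.g. "_letter_".
    tmp_by_x <- paste0("_", by$x, "_")
    tmp_by_y <- paste0("_", by$y, "_")
    for (i in seq_along(by$x)) {
      # Lower-cased copies of each key (tolower() also coerces to character).
      x[[tmp_by_x[[i]]]] <- tolower(x[[by$x[[i]]]])
      y[[tmp_by_y[[i]]]] <- tolower(y[[by$y[[i]]]])
      # Drop the original key from y so only x's version appears in the result.
      y[[by$y[[i]]]] <- NULL
    }
    # Join on the temporary keys, then remove them from the result.
    res <- fun(x, y, list(x = tmp_by_x, y = tmp_by_y))
    res[tmp_by_x] <- list(NULL)
    res
  })
  new_fun
}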
insensitive(inner_join)(a, b)
#> Joining by: "letter"
#>   letter num num2 val
#> 1      a   1    1  11
#> 2      b   2    2  12
#> 3      c   3    3  13
#> 4      d   4    9  14
#> 5      e   5   10  15
insensitive(inner_join)(a, b,
  by = list(x = c('letter', 'num'), y = c('letter', 'num2')))
#>   letter num val
#> 1      a   1  11
#> 2      b   2  12
#> 3      c   3  13
insensitive(left_join)(a, b,
  by = list(x = c('letter', 'num'), y = c('letter', 'num2')))
#>    letter num val
#> 1       a   1  11
#> 2       b   2  12
#> 3       c   3  13
#> 4       d   4  NA
#> 5       e   5  NA
#> 6       f   6  NA
#> 7       g   7  NA
#> 8       h   8  NA
#> 9       i   9  NA
#> 10      j  10  NA