@jonocarroll
Created October 12, 2022 00:20
string sorting in R
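# Split each string on "-", then order by prefix and by the numeric suffix.
splt <- function(s) {
  x <- strsplit(s, "-")
  s[order(sapply(x, `[[`, 1), as.integer(sapply(x, `[[`, 2)))]
}

# Same as splt(), but with radix ordering.
splt_radix <- function(s) {
  x <- strsplit(s, "-")
  s[order(sapply(x, `[[`, 1), as.integer(sapply(x, `[[`, 2)), method = "radix")]
}

# Avoid strsplit(): locate the first "-" and take the substrings on either side.
subs <- function(s) {
  splpos <- regexpr("-", s, fixed = TRUE)
  pt1 <- substr(s, 1, splpos - 1)
  # The stop value only needs to exceed the longest string; substr() stops at the end.
  pt2 <- substr(s, splpos + 1, 100000)
  o <- order(pt1, as.integer(pt2), method = "radix")
  s[o]
}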
set.seed(1)

# Larger input: 7 prefixes x 200 numbers = 1400 strings, shuffled.
vec <- expand.grid(
  c("aa", "ab", "ba", "bb", "ac", "ca", "cc"),
  1:200
) |>
  tidyr::unite(x, sep = "-") |>
  dplyr::pull() |>
  sample()

# Small input.
tst <- c("aa-2", "ab-100", "aa-10", "ba-25", "ab-3")

bench::mark(
  splt(tst),
  splt_radix(tst),
  subs(tst)
)
#> # A tibble: 3 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 splt(tst)        23.53µs   25.1µs    38233.        0B     53.6
#> 2 splt_radix(tst)     23µs   24.2µs    40097.        0B     52.2
#> 3 subs(tst)         9.72µs   10.8µs    88700.        0B     53.3
bench::mark(
  splt(vec),
  splt_radix(vec),
  subs(vec)
)
#> # A tibble: 3 × 6
#>   expression           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 splt(vec)         1.56ms   1.59ms      623.     131KB     23.7
#> 2 splt_radix(vec)   1.08ms   1.12ms      876.     131KB     35.5
#> 3 subs(vec)       165.52µs 173.51µs     5706.      88KB     8.44
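
The functions above all convert the part after the "-" to an integer before ordering. A minimal sketch of why that matters, using only base R: a plain character ordering compares digits one by one, so "aa-10" sorts ahead of "aa-2". The helper name `sort_by_parts` is hypothetical and not part of the gist; it follows the same idea as `subs()` but uses `nchar(s)` as the substring end, which is an assumption about a tidier stop value rather than the author's choice.

```r
# Hypothetical helper (not from the gist): sort "prefix-number" strings
# by prefix first, then numerically by the trailing number.
sort_by_parts <- function(s) {
  pos <- regexpr("-", s, fixed = TRUE)                 # position of the first "-"
  prefix <- substr(s, 1, pos - 1)                      # text before the "-"
  number <- as.integer(substr(s, pos + 1, nchar(s)))   # digits after the "-"
  s[order(prefix, number, method = "radix")]
}

tst <- c("aa-2", "ab-100", "aa-10", "ba-25", "ab-3")

# Plain character ordering: "1" < "2", so "aa-10" lands before "aa-2".
plain <- tst[order(tst, method = "radix")]
print(plain)

# Prefix, then numeric suffix: "aa-2" comes before "aa-10".
by_parts <- sort_by_parts(tst)
print(by_parts)
```

Both calls pass `method = "radix"`, so the character comparison is done in the C locale and does not depend on the session's collation settings.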