@jmbarbone
Last active September 28, 2021 01:33
Playing around with percentile ranks
library(dplyr)
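# Percentile rank (0-100) of each element of x: the percentage of values
# below it plus half the percentage of values equal to it.
percentile_rank <- function(x, na.rm = TRUE) {
  # Share of observations at each distinct value; split() drops NAs
  p <- lengths(split(x, x)) / length(if (na.rm) na.omit(x) else x)
  # Cumulative share minus half the value's own share, mapped back to x
  (cumsum(p) - p * 0.5)[match(x, sort.int(unique(x)))] * 100
}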
x <- runif(11)
# rank() gives each value's rank; order() would give the sorting permutation
tibble(x, rank(x), percent_rank(x), percentile_rank(x))
# Example table from wiki
# https://en.wikipedia.org/wiki/Percentile_rank
x <- rep(7:1, c(1, 0, 2, 2, 3, 1, 1))
ns <- sapply(7:1, function(i) sum(x == i))
# ns runs from 7 down to 1, so accumulate from the lowest score upward
cs <- rev(cumsum(rev(ns)))
pr <- percentile_rank(x)
# The result is named by value; look up one entry per score from 7 to 1
pr <- pr[as.character(7:1)]
tibble(x = 7:1, ns, cs, pr)
# One issue: 6 never occurs in x, so it has no entry in pr and its row is NA
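# A possible workaround, sketched under assumptions rather than a definitive
# fix: score a caller-supplied set of values so that unobserved scores such
# as 6 still get a rank. `percentile_rank_at` and its `values` argument are
# hypothetical names, not part of the function above. It uses the same
# definition, (c - 0.5 * f) / N * 100, where c counts scores <= the value,
# f counts scores equal to it, and N is the number of non-missing scores.
percentile_rank_at <- function(x, values = sort(unique(x))) {
  x <- x[!is.na(x)]
  f <- vapply(values, function(v) sum(x == v), integer(1))
  cf <- vapply(values, function(v) sum(x <= v), integer(1))
  # Name the result by value, matching the lookup style used above
  stats::setNames((cf - 0.5 * f) / length(x) * 100, values)
}
percentile_rank_at(x, 7:1)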