Skip to content

Instantly share code, notes, and snippets.

@briatte
Created October 19, 2013 09:36
Show Gist options
  • Save briatte/7053664 to your computer and use it in GitHub Desktop.
Save briatte/7053664 to your computer and use it in GitHub Desktop.
weighted quantile functions that actually do NOT return exactly the same output: reldist often returns the same thing for any value of its quantile argument
> Hmisc::wtd.quantile
function (x, weights = NULL, probs = c(0, 0.25, 0.5, 0.75, 1),
type = c("quantile", "(i-1)/(n-1)", "i/(n+1)", "i/n"), normwt = FALSE,
na.rm = TRUE)
{
if (!length(weights))
return(quantile(x, probs = probs, na.rm = na.rm))
type <- match.arg(type)
if (any(probs < 0 | probs > 1))
stop("Probabilities must be between 0 and 1 inclusive")
nams <- paste(format(round(probs * 100, if (length(probs) >
1) 2 - log10(diff(range(probs))) else 2)), "%", sep = "")
if (type == "quantile") {
w <- wtd.table(x, weights, na.rm = na.rm, normwt = normwt,
type = "list")
x <- w$x
wts <- w$sum.of.weights
n <- sum(wts)
order <- 1 + (n - 1) * probs
low <- pmax(floor(order), 1)
high <- pmin(low + 1, n)
order <- order%%1
allq <- approx(cumsum(wts), x, xout = c(low, high), method = "constant",
f = 1, rule = 2)$y
k <- length(probs)
quantiles <- (1 - order) * allq[1:k] + order * allq[-(1:k)]
names(quantiles) <- nams
return(quantiles)
}
w <- wtd.Ecdf(x, weights, na.rm = na.rm, type = type, normwt = normwt)
structure(approx(w$ecdf, w$x, xout = probs, rule = 2)$y,
names = nams)
}
<environment: namespace:Hmisc>
> reldist::wtd.quantile
function (x, q = 0.5, na.rm = FALSE, weight = FALSE)
{
if (mode(x) != "numeric")
stop("need numeric data")
x <- as.vector(x)
wnas <- is.na(x)
if (sum(wnas) > 0) {
if (na.rm)
x <- x[!wnas]
if (!missing(weight)) {
weight <- weight[!wnas]
}
else return(NA)
}
n <- length(x)
half <- (n + 1)/2
if (n%%2 == 1) {
if (!missing(weight)) {
weight <- weight/sum(weight)
sx <- sort.list(x)
sweight <- cumsum(weight[sx])
min(x[sx][sweight >= q])
}
else {
x[order(x)[half]]
}
}
else {
if (!missing(weight)) {
weight <- weight/sum(weight)
sx <- sort.list(x)
sweight <- cumsum(weight[sx])
min(x[sx][sweight >= 0.5])
}
else {
half <- floor(half) + 0:1
sum(x[order(x)[half]])/2
}
}
}
<environment: namespace:reldist>
@xrobin
Copy link

xrobin commented Nov 12, 2014

Thanks for checking this. It looks like reldist ignores the q argument when passing an even number of elements in x. See the line

min(x[sx][sweight >= 0.5])

I would trust Hmisc more here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment