@phipsgabler
Last active February 27, 2017 17:02
A new stat for ggplot2 creating the comparison line in a qq-plot.
library(ggplot2)
qq.line <- function(data, qf, na.rm) {
  # from stackoverflow.com/a/4357932/1346276
  # Line through the first and third quartiles of the sample,
  # plotted against the same quartiles of the theoretical distribution.
  q.sample <- quantile(data, c(0.25, 0.75), na.rm = na.rm)
  q.theory <- qf(c(0.25, 0.75))
  slope <- diff(q.sample) / diff(q.theory)
  intercept <- q.sample[1] - slope * q.theory[1]
  list(slope = slope, intercept = intercept)
}
StatQQLine <- ggproto("StatQQLine", Stat,
  # http://docs.ggplot2.org/current/vignettes/extending-ggplot2.html
  # https://github.com/hadley/ggplot2/blob/master/R/stat-qq.r
  required_aes = c('sample'),
  compute_group = function(data, scales,
                           distribution = stats::qnorm,
                           dparams = list(),
                           na.rm = FALSE) {
    # Quantile function of the reference distribution, with dparams applied.
    qf <- function(p) do.call(distribution, c(list(p = p), dparams))
    n <- length(data$sample)
    theoretical <- qf(stats::ppoints(n))
    qq <- qq.line(data$sample, qf = qf, na.rm = na.rm)
    # Evaluate the reference line at the theoretical quantiles.
    line <- qq$intercept + theoretical * qq$slope
    data.frame(x = theoretical, y = line)
  }
)
stat_qqline <- function(mapping = NULL, data = NULL, geom = "line",
                        position = "identity", ...,
                        distribution = stats::qnorm,
                        dparams = list(),
                        na.rm = FALSE,
                        show.legend = NA,
                        inherit.aes = TRUE) {
  layer(stat = StatQQLine, data = data, mapping = mapping, geom = geom,
        position = position, show.legend = show.legend, inherit.aes = inherit.aes,
        params = list(distribution = distribution,
                      dparams = dparams,
                      na.rm = na.rm, ...))
}
test.data <- data.frame(sample=rnorm(100, 10, 2))
test.data.2 <- data.frame(sample=rt(100, df=2))
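
As a quick sanity check on what qq.line computes, the following sketch repeats its arithmetic in base R on simulated data. For a normal sample compared against qnorm, the slope should land near the sample's standard deviation and the intercept near its mean. The seed, the sample size, and the variable name x are assumptions chosen for illustration only.

```r
# Assumed illustration data: a large normal sample with mean 10 and sd 2.
set.seed(42)
x <- rnorm(10000, mean = 10, sd = 2)

# Same steps as qq.line: match sample quartiles to theoretical quartiles.
q.sample <- quantile(x, c(0.25, 0.75), names = FALSE)
q.theory <- qnorm(c(0.25, 0.75))
slope <- diff(q.sample) / diff(q.theory)
intercept <- q.sample[1] - slope * q.theory[1]

# Roughly the sd and the mean of the simulated sample.
print(c(slope = slope, intercept = intercept))
```

Because only the quartiles enter the calculation, the line is insensitive to outliers in the tails, which is why it is a common reference line for Q-Q plots.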

ghost commented Apr 28, 2016

Great work!

Is it possible to add a geom_ribbon confidence band, as in http://stackoverflow.com/a/27191036? Based on that code, I came up with this:

    compute_group = function(data, scales,
                             distribution = stats::qnorm,
                             ddistribution = stats::dnorm,
                             dparams = list(),
                             na.rm = FALSE,
                             conf = 0.95) {
        qf <- function(p) do.call(distribution, c(list(p = p), dparams))
        # distribution is a function, so "d" cannot be pasted onto its name;
        # the matching density is passed in separately as ddistribution
        df <- function(x) do.call(ddistribution, c(list(x = x), dparams))
        n <- length(data$sample)
        theoretical <- qf(stats::ppoints(n))
        qq <- qq.line(data$sample, qf = qf, na.rm = na.rm)
        line <- qq$intercept + theoretical * qq$slope
        # critical value of the normal approximation behind the SE formula
        zz <- stats::qnorm(1 - (1 - conf)/2)
        SE <- (qq$slope/df(theoretical)) * sqrt(ppoints(n) * (1 - ppoints(n))/n)
        ymax <- line + zz * SE
        ymin <- line - zz * SE
        data.frame(x = theoretical, y = line, ymax = ymax, ymin = ymin)
    }

But I don't know how to proceed from here. The main advantage of your variant is that it can be faceted, so I can check normality by group and decide whether an ANOVA is appropriate. For that purpose, a ribbon (like the envelope in car::qqPlot) would be very useful.

@phipsgabler
Author

@UlvHare Hi, sorry for the late reply, but apparently one doesn't get notified about gist comments...

Anyway, maybe I can find time to work on this next month. One would probably have to copy some code from geom_ribbon into the functions, since geom_line, which I used, doesn't support an additional ribbon by default. If you want to try it yourself, this is actually the only resource I've used.
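
One possible route that avoids copying geom_ribbon code is a second stat whose default geom is "ribbon" and which returns ymin and ymax instead of y. The following is a minimal sketch under that assumption, not the author's implementation. The names qq_line_coef, StatQQBand, stat_qqband, and the ddistribution parameter are hypothetical, and the band uses the pointwise standard error formula from the comment above with a normal critical value.

```r
library(ggplot2)

# Hypothetical helper: line through the first and third quartiles.
qq_line_coef <- function(x, qf, na.rm = FALSE) {
  q.sample <- quantile(x, c(0.25, 0.75), na.rm = na.rm, names = FALSE)
  q.theory <- qf(c(0.25, 0.75))
  slope <- diff(q.sample) / diff(q.theory)
  list(slope = slope, intercept = q.sample[1] - slope * q.theory[1])
}

# Hypothetical stat: pointwise band around the Q-Q reference line.
StatQQBand <- ggproto("StatQQBand", Stat,
  required_aes = "sample",
  # Declares that 'sample' is consumed; recent ggplot2 versions warn otherwise.
  dropped_aes = "sample",
  compute_group = function(data, scales,
                           distribution = stats::qnorm,
                           ddistribution = stats::dnorm,  # assumed: density matching 'distribution'
                           dparams = list(),
                           conf = 0.95,
                           na.rm = FALSE) {
    qf <- function(p) do.call(distribution, c(list(p = p), dparams))
    dens <- function(x) do.call(ddistribution, c(list(x = x), dparams))
    n <- length(data$sample)
    p <- stats::ppoints(n)
    theoretical <- qf(p)
    qq <- qq_line_coef(data$sample, qf = qf, na.rm = na.rm)
    line <- qq$intercept + theoretical * qq$slope
    # Normal critical value and standard error of each order statistic.
    zz <- stats::qnorm(1 - (1 - conf) / 2)
    se <- (qq$slope / dens(theoretical)) * sqrt(p * (1 - p) / n)
    data.frame(x = theoretical, ymin = line - zz * se, ymax = line + zz * se)
  }
)

stat_qqband <- function(mapping = NULL, data = NULL, geom = "ribbon",
                        position = "identity", ...,
                        distribution = stats::qnorm,
                        ddistribution = stats::dnorm,
                        dparams = list(),
                        conf = 0.95,
                        na.rm = FALSE,
                        show.legend = NA,
                        inherit.aes = TRUE) {
  layer(stat = StatQQBand, data = data, mapping = mapping, geom = geom,
        position = position, show.legend = show.legend, inherit.aes = inherit.aes,
        params = list(distribution = distribution, ddistribution = ddistribution,
                      dparams = dparams, conf = conf, na.rm = na.rm, ...))
}

# Assumed demo data with a grouping column, to show that faceting works.
set.seed(1)
demo <- data.frame(sample = rnorm(100, 10, 2),
                   grp = rep(c("a", "b"), each = 50))
p <- ggplot(demo, aes(sample = sample)) +
  stat_qqband(alpha = 0.3) +
  stat_qq() +
  facet_wrap(~ grp)
print(p)
```

Keeping the band in its own stat means the line and the ribbon are separate layers, so each can be styled or omitted independently, and both are computed per group and per facet.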
