@simonwmeng
Last active February 16, 2017 21:46
KDE scatterplot
library(data.table)
library(ggplot2)
library(ks)
# This assumes that you have a data.table, stored in `dt.data`, with columns `x` and `y`.
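# Illustrative input only (an assumption, not part of the original gist): if no
# `dt.data` exists yet, a small synthetic table with correlated `x` and `y`
# columns is enough to try the code below. The sample size, seed, and the name
# `x.sim` are arbitrary choices; this is skipped when `dt.data` is already defined.
if (!exists('dt.data')) {
  set.seed(42)
  x.sim <- rnorm(500)
  dt.data <- data.table::data.table(x = x.sim, y = 0.5 * x.sim + rnorm(500, sd = 0.5))
}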
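get.densities <- function(x, y) {
  # You can also define your own bandwidth matrix or generate one using Hpi().
  # Note: ks treats H as a variance-scale matrix, so these diagonal entries act
  # as squared bandwidths. Here each entry is one tenth of the data range.
  bw.x <- diff(range(x)) / 10
  bw.y <- diff(range(y)) / 10
  H <- diag(c(bw.x, bw.y), nrow = 2, ncol = 2)
  # Evaluate the density estimate at the data points themselves.
  kde.res <- kde(cbind(x, y), H = H, eval.points = cbind(x, y))
  # Rescale so that the densest point has a value of 1.
  kde.res$estimate / max(kde.res$estimate)
}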
dt.data[, density := get.densities(x, y)]
# Ensure that higher densities are plotted last (i.e., on top of everything else).
setorder(dt.data, density)
# Drop duplicate x and y values -- plotting them is pointless.
dt.data <- unique(dt.data, by = c('x', 'y'))
# Plot away. Remember to add a colour scale that you like!
ggplot(dt.data, aes(x = x, y = y)) +
  geom_point(aes(colour = density)) +
  scale_colour_gradientn(colours = c(grey(0.85), 'black'), breaks = pretty)