@gokceneraslan
Created June 16, 2014 14:32
R code snippet for visualizing clustering results with a parallel coordinates plot
# labels is an N x L data.frame, where N = number of observations being clustered
# and L = number of clustering results (number of vertical axes in the plot).
# groupColumn selects the column used to color the lines (default: the last one).
plot.label.distribution <- function(labels,
                                    groupColumn,
                                    alphaLines = 1/10,
                                    useSplines = TRUE,  # currently unused, see commented lines below
                                    showPoints = TRUE,
                                    scale = 'uniminmax', ...) {
  library(GGally)
  library(ggplot2)

  N <- nrow(labels)
  L <- ncol(labels)
  if (missing(groupColumn)) groupColumn <- L

  # Append jittered numeric copies of the labels as columns L+1..2L; the noise
  # keeps lines of observations with identical labels from overlapping exactly
  labels[, (L+1):(2*L)] <- lapply(labels, function(l) jitter(as.numeric(l)))
  colnames(labels)[(L+1):(2*L)] <- as.character(seq_len(L))

  ggp <- ggparcoord(labels, L + seq_len(L), scale = scale, groupColumn = groupColumn,
                    alphaLines = alphaLines,
                    showPoints = showPoints,
                    # useSplines = useSplines,
                    ...) +
    xlab('Number of clusters') +
    ylab('Observation memberships') +
    guides(colour = 'none') +  # hide the colour legend
    theme_minimal() +
    theme(axis.ticks = element_blank(), axis.text.y = element_blank())

  # if(useSplines) ggp <- ggp + scale_x_continuous(breaks=seq(L))
  ggp
}
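
The header comment describes the expected input as an N x L data.frame of cluster labels. A minimal sketch of how such an input could be built and passed to `plot.label.distribution` follows; the use of `iris`, k-means, the range `k = 2..5`, and the column names `k2`..`k5` are all assumptions chosen for illustration and are not part of the original snippet.

```r
# Hypothetical example data: k-means on the scaled numeric columns of iris,
# run once for each k in 2..5 (one clustering result per column).
set.seed(42)                     # make the k-means runs reproducible
ks <- 2:5
x <- scale(iris[, 1:4])

cluster_list <- lapply(ks, function(k) {
  factor(kmeans(x, centers = k, nstart = 10)$cluster)
})
names(cluster_list) <- paste0("k", ks)   # illustrative column names
labels <- data.frame(cluster_list)

# labels is now N x L: one row per observation, one factor column per clustering
str(labels)

# Draw the plot only if the function above has already been sourced
# (GGally and ggplot2 must be installed).
if (exists("plot.label.distribution")) {
  print(plot.label.distribution(labels))
}
```

Because `groupColumn` defaults to the last column, the lines in this sketch would be colored by the `k = 5` clustering; passing `groupColumn = 1` would color them by the `k = 2` result instead.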