coppelia machine learning and analytics coppeliaMLA

@coppeliaMLA
coppeliaMLA / busterExample2.R
Created July 8, 2014 16:07
Buster Example 2
#Another simple test
x.1<-rnorm(50, 10, 3)
y.1<-rnorm(50, 10, 3)
x.2<-rnorm(50, 20, 3)
y.2<-rnorm(50, 10, 3)
x.3<-rnorm(50, 13, 3)
y.3<-rnorm(50, 20, 3)
test.data<-data.frame(group=rep(1:3, each=50), x=c(x.1, x.2, x.3), y=c(y.1, y.2, y.3))
coppeliaMLA / busterExample1.R
Created July 8, 2014 16:02
Buster Example 1
#Testing on the iris data set
iris.dist<-dist(iris[,1:4])
bhc<-buster(iris.dist, n=250, k=3, size=0.66, method='ward', pct.exc=0.1)
plot(bhc)
#We see the unstable observations in pink.
cluster<-bhc$obs.eval$cluster[order(bhc$obs.eval$obs.ind)]
plot(iris[,1:4], col=6-cluster, pch = rep(15:17, each=50))
coppeliaMLA / visGAExamples.R
Created June 27, 2014 10:52
Examples for visualising the path of a genetic algorithm
#Maximize a mixture of multivariate normal distributions
library(mvtnorm)
mnMix<-function(args){
mean.vec.d1<-rep(0.3,5)
std.vec.d1<-diag(rep(1,5))
mean.vec.d2<-rep(1,5)
std.vec.d2<-diag(rep(1.5,5))
mean.vec.d3<-c(1, 5, 2, 1, 0)
std.vec.d3<-diag(rep(0.5, 5))
if (args[1]<0){
coppeliaMLA / visGAPath.R
Last active August 29, 2015 14:03
Visualising the path of a genetic algorithm
# *--------------------------------------------------------------------
# | FUNCTION: visGAPath
# | Function for visualising the path of a genetic algorithm using
# | principal components analysis
# *--------------------------------------------------------------------
# | Version |Date |Programmer |Details of Change
# | 01 |18/04/2012|Simon Raper |first version.
# *--------------------------------------------------------------------
# | INPUTS: func The function to be optimised
# | npar The number of parameters to optimise over
coppeliaMLA / bagHclust.R
Created June 26, 2014 16:28
Bagging algorithm for hclust
library(reshape2)
#Bagging hierarchical clustering
bagHClust<-function(data, n, k, size, outlier.th) {
clus.bs<-NULL
for (i in 1:n) {
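The preview stops at the start of the resampling loop. As a rough illustration of what bagging a hierarchical clustering can look like (not necessarily what `bagHClust` does), the sketch below clusters repeated random subsamples and records how often each pair of observations ends up in the same cluster. The helper name `coClusterFreq` and its arguments are hypothetical, and the `outlier.th` argument of the original is not modelled.

```r
# Hypothetical helper: co-clustering frequency over n subsampled hclust runs.
# Illustrative sketch only; this is not the body of bagHClust.
coClusterFreq <- function(data, n, k, size) {
  n.obs <- nrow(data)
  together <- matrix(0, n.obs, n.obs)  # times a pair shared a cluster
  sampled  <- matrix(0, n.obs, n.obs)  # times a pair was drawn in the same subsample
  for (i in seq_len(n)) {
    # Draw a subsample without replacement and cluster it
    idx <- sort(sample(n.obs, floor(size * n.obs)))
    cl  <- cutree(hclust(dist(data[idx, , drop = FALSE])), k = k)
    # TRUE where two sampled observations received the same cluster label
    same <- outer(cl, cl, "==")
    together[idx, idx] <- together[idx, idx] + same
    sampled[idx, idx]  <- sampled[idx, idx] + 1
  }
  # Proportion of shared subsamples in which the pair co-clustered;
  # pairs never drawn together are left at 0
  together / pmax(sampled, 1)
}

set.seed(1)
freq <- coClusterFreq(iris[, 1:4], n = 20, k = 3, size = 0.66)
```

Pairs with a frequency well away from both 0 and 1 are the ones whose assignment changes from one subsample to the next, which is one possible way to flag unstable observations.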
coppeliaMLA / SankeyClusComp
Last active August 29, 2015 14:03
Generates the data for comparing two clusters using a Sankey diagram
clusComp<-function(cl1, cl2, num.clus){
#Set up object for recording clusters
clus.change<-NULL
ct1<-cutree(cl1, k=num.clus)
add.1<-data.frame(size=rep(1, length(ct1)), ind=names(ct1), cluster=paste0(1, ".", ct1))
ct2<-cutree(cl2, k=num.clus)
add.2<-data.frame(size=rep(2, length(ct2)), ind=names(ct2), cluster=paste0(2, ".", ct2))
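The preview ends before the two sets of cluster labels are combined. One simple way to obtain Sankey-style flows from two trees is to cross-tabulate their `cutree` labels, as sketched below. The variable `flows`, the example trees, and the column names `source`, `target` and `value` are assumptions for illustration and may differ from what `clusComp` returns.

```r
# Hypothetical sketch: count observations moving between the clusters of two trees.
cl1 <- hclust(dist(USArrests), "average")
cl2 <- hclust(dist(USArrests), "complete")
num.clus <- 3

ct1 <- cutree(cl1, k = num.clus)
ct2 <- cutree(cl2, k = num.clus)

# One row per (source, target) pair; value is the number of shared observations
flows <- as.data.frame(table(source = paste0(1, ".", ct1),
                             target = paste0(2, ".", ct2)),
                       responseName = "value")

# Drop pairs with no observations in common
flows <- flows[flows$value > 0, ]
```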
coppeliaMLA / compCorrMI.R
Created June 25, 2014 16:00
Look at the relationship between MI and correlation for binary vars (since it's quicker than doing the maths)
#Check the relationship between correlation and mutual information for binary vars
library(entropy) #provides mi.empirical
store<-NULL
for (i in 1:1000){
prob.1<-runif(1)
prob.2<-runif(1)
x<-rbinom(10000, 1, prob.1)
y<-rbinom(10000, 1, prob.2)
c<-cor(x,y)
m<-mi.empirical(table(x,y))
store<-rbind(store, data.frame(c=c, m=m))
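For readers without the `entropy` package, the plug-in mutual information of two binary variables can be computed directly from their contingency table. The sketch below uses base R only; the helper `miFromTable` is hypothetical, and the dependent pair `x`, `y` is an invented example rather than the simulation in the gist.

```r
# Hypothetical helper: plug-in mutual information (natural-log units)
# computed from a two-way contingency table.
miFromTable <- function(tab) {
  p  <- tab / sum(tab)   # joint probabilities
  px <- rowSums(p)       # marginal distribution of the row variable
  py <- colSums(p)       # marginal distribution of the column variable
  nz <- p > 0            # skip empty cells, treating 0 * log(0) as 0
  sum(p[nz] * log(p[nz] / outer(px, py)[nz]))
}

set.seed(1)
x <- rbinom(10000, 1, 0.4)
# y copies x about 80% of the time and flips it otherwise
y <- ifelse(runif(10000) < 0.8, x, 1 - x)

mi.xy  <- miFromTable(table(x, y))
cor.xy <- cor(x, y)
```

For binary variables the mutual information cannot exceed `log(2)`, and it is zero when the table factorises exactly into its margins.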
coppeliaMLA / confusion.htm
Created June 24, 2014 07:52
Exploration of a confusion matrix using tangle.js
<!DOCTYPE html>
<html>
<head>
<title>Tangle: a JavaScript library for reactive documents</title>
<link rel="stylesheet" href="http://worrydream.com/Tangle/TangleKit/TangleKit.css" type="text/css">
<script type="text/javascript" src="http://worrydream.com/Tangle/TangleKit/mootools.js"></script>
<script type="text/javascript" src="http://worrydream.com/Tangle/TangleKit/sprintf.js"></script>
<script type="text/javascript" src="http://worrydream.com/Tangle/TangleKit/BVTouchable.js"></script>
coppeliaMLA / DendToForce.R
Created June 20, 2014 16:30
Converts a hclust dendrogram into a graph in JSON for input into D3
#Run hclust
hc <- hclust(dist(USArrests[1:40,]), "ave")
#Function for extracting nodes and links
extractGraph<-function(hc){
n<-length(hc$order)
m<-hc$merge
links<-data.frame(source=as.numeric(), target=as.numeric(), value=as.numeric())
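The preview cuts off inside `extractGraph`. As a sketch of the general idea, and not necessarily the author's implementation, the merge matrix of an `hclust` object can be unrolled into a node list and a link list. In `hc$merge`, a negative entry `-j` refers to observation `j` and a positive entry `s` refers to the cluster formed at step `s`. The id scheme below (leaves `1..n`, merges `n + step`) and the helper `child.id` are assumptions made for illustration.

```r
# Hypothetical sketch: turn hclust's merge matrix into nodes and links.
# Leaves get ids 1..n; the cluster formed at merge step i gets id n + i.
hc <- hclust(dist(USArrests[1:40, ]), "average")
n <- length(hc$order)
m <- hc$merge

# Map a merge-matrix entry to a node id:
# negative -> original observation, positive -> earlier merge step
child.id <- function(v) ifelse(v < 0, -v, n + v)

# Each merge step contributes two links, parent -> each child,
# weighted by the height at which the merge happened
links <- data.frame(source = rep(n + seq_len(nrow(m)), each = 2),
                    target = child.id(as.vector(t(m))),
                    value  = rep(hc$height, each = 2))

nodes <- data.frame(id   = seq_len(2 * n - 1),
                    name = c(hc$labels, paste0("merge", seq_len(n - 1))))
```

Serialising `nodes` and `links` to JSON (for example with a package such as jsonlite) and shifting the ids to zero-based indices, which D3 force layouts commonly expect, are left out of this sketch.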
coppeliaMLA / clusterSankey.R
Last active August 29, 2015 14:02
Visualising cluster stability using a Sankey diagram
#Sequence for adding new data
s<-seq(20,50, by=5)
#Set up object for recording clusters
clus.change<-NULL
#Cycle through the clustering solutions
for (i in s){
hc <- hclust(dist(USArrests[1:i,]), "ave")
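The loop is truncated in the preview. A minimal sketch of how such a loop might record cluster membership at each data size is given below; the number of clusters `num.clus` and the column layout of `clus.change` are assumptions, chosen to resemble the `size`, `ind` and `cluster` columns used in the SankeyClusComp gist.

```r
# Hypothetical sketch: record cluster membership as more rows are added.
s <- seq(20, 50, by = 5)
num.clus <- 3        # assumed number of clusters; not taken from the gist
clus.change <- NULL

for (i in s) {
  # Re-cluster the first i observations and cut the tree into num.clus groups
  hc <- hclust(dist(USArrests[1:i, ]), "average")
  ct <- cutree(hc, k = num.clus)
  # Label each cluster with the data size so stages can be told apart
  clus.change <- rbind(clus.change,
                       data.frame(size = i, ind = names(ct),
                                  cluster = paste0(i, ".", ct)))
}
```

Joining consecutive stages on `ind` would then give the flows between clusterings that a Sankey diagram needs.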