@josefslerka
Created April 8, 2012 07:32
How to cluster Twitter users by their similarity in the network and how to plot them
library(igraph)
g <- read.graph("listalumia.txt", format="ncol", directed=TRUE)
# the similarity function is based on the study "Friends and Neighbors on the Web" (http://www.hpl.hp.com/research/idl/papers/web10/fnn2.pdf)
m <- similarity.dice(g) # pairwise Dice similarity between all vertices
colnames(m) <- V(g)$name
rownames(m) <- colnames(m)
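# Sketch (not part of the original script): what a Dice similarity between two
# vertices means, worked by hand in base R on a made-up neighbor list.
# Dice(a, b) = 2 * (number of shared neighbors) / (neighbors of a + neighbors of b).
# The names toy_neighbors and dice_coefficient are hypothetical, and igraph's own
# handling of edge direction and self-loops is not reproduced here.
toy_neighbors <- list(
  alice = c("carol", "dave", "erin"),
  bob   = c("carol", "dave"),
  frank = c("gina")
)
dice_coefficient <- function(a, b) {
  shared <- length(intersect(a, b)) # neighbors the two vertices have in common
  2 * shared / (length(a) + length(b))
}
toy_dice_ab <- dice_coefficient(toy_neighbors$alice, toy_neighbors$bob)   # 2 * 2 / (3 + 2) = 0.8
toy_dice_af <- dice_coefficient(toy_neighbors$alice, toy_neighbors$frank) # no shared neighbors, so 0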
d <- dist(m, method = "euclidean") # distance matrix
fit <- hclust(d, method="ward.D") # "ward" was renamed to "ward.D" in R 3.1.0
plot(fit) # display dendrogram
groups <- cutree(fit, k=5) # cut tree into 5 clusters
rect.hclust(fit, k=5, border="red")
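# Sketch (not part of the original script): listing which labels end up in each
# cluster after cutree(). The data are a made-up toy example and the toy_* names
# are hypothetical; the same table() and split() calls applied to `groups` above
# should list the Twitter accounts per cluster.
toy_points <- matrix(c( 0,  0,
                        0,  1,
                       10, 10,
                       10, 11),
                     ncol = 2, byrow = TRUE,
                     dimnames = list(c("a", "b", "c", "d"), NULL))
toy_fit <- hclust(dist(toy_points), method = "ward.D")
toy_groups <- cutree(toy_fit, k = 2)                # named vector: label -> cluster id
table(toy_groups)                                   # cluster sizes
toy_members <- split(names(toy_groups), toy_groups) # labels grouped by cluster id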
# another option: classical MDS
# http://www.statmethods.net/advstats/mds.html
# N rows (objects) x p columns (variables)
# each row identified by a unique row name
d <- dist(m) # euclidean distances between the rows
fit <- cmdscale(d,eig=TRUE, k=2) # k is the number of dim
fit # view results
# plot solution
x <- fit$points[,1]
y <- fit$points[,2]
plot(x, y, xlab="Coordinate 1", ylab="Coordinate 2",
     main="Metric MDS", type="n")
text(x, y, labels = row.names(m), cex=.7)
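# Sketch (not part of the original script): a sanity check of classical MDS on
# made-up data. The four corners of a rectangle already lie in two dimensions,
# so cmdscale() with k = 2 should reproduce their pairwise distances (the
# coordinates themselves may come out rotated or reflected). The toy_* names
# are hypothetical.
toy_xy <- matrix(c(0, 0,
                   3, 0,
                   3, 4,
                   0, 4),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("p1", "p2", "p3", "p4"), NULL))
toy_d <- dist(toy_xy)
toy_mds <- cmdscale(toy_d, eig = TRUE, k = 2)
toy_mds$GOF                                       # goodness of fit, expected close to 1 here
toy_gap <- max(abs(dist(toy_mds$points) - toy_d)) # largest error in a reproduced distance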
# another option: nonmetric MDS
# http://www.statmethods.net/advstats/mds.html
# N rows (objects) x p columns (variables)
# each row identified by a unique row name
library(MASS)
d <- dist(m) # euclidean distances between the rows
fit <- isoMDS(d, k=2) # k is the number of dim
fit # view results
# plot solution
x <- fit$points[,1]
y <- fit$points[,2]
plot(x, y, xlab="Coordinate 1", ylab="Coordinate 2",
     main="Nonmetric MDS", type="n")
text(x, y, labels = row.names(m), cex=.7)
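# Sketch (not part of the original script): isoMDS() expects distances between
# distinct objects to be positive, so users with identical similarity rows
# (distance 0) may make it fail. One possible guard, shown on made-up data, is
# to detect zero distances and keep a single copy of each duplicated row.
# The toy_* names are hypothetical.
toy_m <- rbind(u1 = c(1, 0, 0),
               u2 = c(1, 0, 0), # identical to u1, so their distance is 0
               u3 = c(0, 1, 0),
               u4 = c(0, 0, 1))
toy_has_zero <- any(dist(toy_m) == 0)                   # TRUE for this toy matrix
toy_unique <- toy_m[!duplicated(toy_m), , drop = FALSE] # drop repeated rows
toy_ok <- !any(dist(toy_unique) == 0)                   # no zero distances remain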