mahito-sugiyama / k-meansp2.R
Last active January 2, 2021 10:20
k-means++ (written in R; with the Euclidean distance; distance computation is vectorized)
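The "vectorized" distance computation mentioned in the description relies on R's recycling rule: the gist's expression `colSums((t(x) - x[centers[i], ])^2)` subtracts one row of `x` from every column of `t(x)` in a single step. The following is a minimal sketch of that idea, not part of the gist itself; the toy matrix `x` and the names `center`, `d_vec`, and `d_loop` are made up for illustration.

```r
# Hypothetical toy data: 4 points in 2 dimensions (not from the gist).
x <- matrix(c(0, 0,
              3, 4,
              6, 8,
              1, 1), ncol = 2, byrow = TRUE)
center <- 1  # row index of the point used as the center

# Vectorized form, as in the gist: t(x) is d x n, and the length-d vector
# x[center, ] is recycled down each column, i.e. once per data point.
d_vec <- colSums((t(x) - x[center, ])^2)

# Explicit loop over the points, for comparison.
d_loop <- numeric(nrow(x))
for (i in seq_len(nrow(x))) {
  d_loop[i] <- sum((x[i, ] - x[center, ])^2)
}

# Both give the squared Euclidean distances 0, 25, 100, 2.
stopifnot(isTRUE(all.equal(d_vec, d_loop)))
```

Note that the values are squared distances; no square root is taken, which matches the squared-distance weighting that k-means++ uses when sampling the next center.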
kmeansp2 <- function(x, k, iter.max = 10, nstart = 1, ...) {
  n <- nrow(x) # number of data points
  centers <- numeric(k) # row indices of the chosen centers
  distances <- matrix(numeric(n * (k - 1)), ncol = k - 1) # distances[i, j]: squared distance between x[i, ] and x[centers[j], ]
  res.best <- list(tot.withinss = Inf) # the best result among the <nstart> runs
  for (rep in 1:nstart) {
    pr <- rep(1, n) # sampling weights for the centers (uniform for the first one)
    for (i in 1:(k - 1)) {
      centers[i] <- sample.int(n, 1, prob = pr) # pick the i-th center
      distances[, i] <- colSums((t(x) - x[centers[i], ])^2) # squared Euclidean distances from every point to that center