@rreas
Created June 2, 2012 17:56
Euclidean Distance K-Means Clustering
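function [U,V,idx] = distkmeans(X,k,tol,imax)
% DISTKMEANS  K-means clustering with squared Euclidean distance.
%   X    : d-by-n data matrix, one point per column.
%   k    : number of clusters.
%   tol  : stop when the cost decreases by less than tol.
%   imax : maximum number of iterations.
%   U    : d-by-k matrix of cluster centroids.
%   V    : k-by-n one-hot assignment matrix.
%   idx  : 1-by-n vector of cluster indices.
[d,n] = size(X);
U = zeros(d,k);
V = zeros(k,n);
% random clusters.
for j = 1:n
    V(randi(k),j) = 1;
end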
% monitor convergence.
olderr = 0;
for iter = 0:imax
    % recompute assignments.
    if iter > 0
        for i = 1:n
            dists = sum(bsxfun(@minus, U, X(:,i)).^2, 1);
            [~, ix] = min(dists);
            col = zeros(k,1);
            col(ix) = 1;
            V(:,i) = col;
        end
    end
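    % monitor cost.
    newerr = 0;
    % update or initialize cluster vectors.
    for i = 1:k
        ix = find(V(i,:) > 0);
        if isempty(ix)
            % empty cluster: re-seed with a random data point to avoid a NaN centroid.
            U(:,i) = X(:,randi(n));
            continue;
        end
        c = mean(X(:,ix), 2);
        U(:,i) = c;
        % add the squared distances of this cluster's points to its centroid.
        newerr = newerr + sum(sum(bsxfun(@minus, X(:,ix), c).^2));
    end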
    newerr = sqrt(newerr);
    fprintf('Iteration %d\tCost function: %f\n', iter, newerr);
    % stop once the cost changes by less than tol (never on the initialization pass).
    if iter > 0 && abs(olderr - newerr) < tol
        break;
    end
    olderr = newerr;
end
% which clusters? (max along dimension 1, so k = 1 still gives a 1-by-n vector.)
[~,idx] = max(V, [], 1);
end
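
% Usage sketch (not part of the original gist). The helper name
% distkmeans_demo, the synthetic data, and the parameter values are
% illustrative assumptions. To run it, save this function in its own file
% distkmeans_demo.m next to distkmeans.m.
function distkmeans_demo()
% two well-separated 2-D blobs, 50 points each; columns are points.
X = [randn(2,50), bsxfun(@plus, randn(2,50), [8; 8])];
% k = 2 clusters, stop when the cost improves by less than 1e-6,
% with imax = 100.
[U, V, idx] = distkmeans(X, 2, 1e-6, 100);
% U is 2-by-2 (one centroid per column), V is 2-by-100 (one-hot
% assignments), idx is 1-by-100 (cluster label per point).
disp(U);
disp(sum(V, 2)');   % number of points assigned to each cluster
disp(idx(1:5));     % labels of the first five points
end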