Navigation Menu

Skip to content

Instantly share code, notes, and snippets.

@jamesweakley
Last active August 21, 2019 10:31
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jamesweakley/edd9edfa3d9cffe3134e8e0f83670844 to your computer and use it in GitHub Desktop.
Save jamesweakley/edd9edfa3d9cffe3134e8e0f83670844 to your computer and use it in GitHub Desktop.
-- The purpose of WHICH_CLUSTER is to assign each input record to the closest cluster out of a given list of clusters.
-- This forms part of the k-means clustering algorithm.
-- For each pair of values, the Euclidian Distance formula is used to determine the closest cluster,
-- and the index of that cluster is returned.
create or replace function WHICH_CLUSTER(X float, Y float, CLUSTER_CENTROIDS variant)
returns float
language javascript
AS '
function euclidianDistance(x1,x2,y1,y2){
return Math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2);
}
var clusterIds=Object.keys(CLUSTER_CENTROIDS);
var winningClusterIndex=clusterIds[0];
let cluster=CLUSTER_CENTROIDS[winningClusterIndex];
let distance;
let clusterId;
var winningClusterDistance=euclidianDistance(cluster.x,X,cluster.y,Y);
// compare all clusters, starting from the second cluster id
for (var clusterIdIndex=1; clusterIdIndex<clusterIds.length;clusterIdIndex++){
clusterId=clusterIds[clusterIdIndex];
cluster_centroid=CLUSTER_CENTROIDS[clusterId];
distance=euclidianDistance(cluster_centroid.x,X,cluster_centroid.y,Y);
if (distance<winningClusterDistance){
winningClusterIndex=clusterId;
winningClusterDistance=distance;
}
}
return winningClusterIndex;
';
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment