@rrblogdatascience
Last active August 29, 2015 14:05

Revisions

  1. rrblogdatascience revised this gist Sep 21, 2014. No changes.
  2. rrblogdatascience created this gist Aug 24, 2014.
    10 changes: 10 additions & 0 deletions gistfile1.sql
    @@ -0,0 +1,10 @@
    SELECT data.*, (madlib.closest_column(km.centroids, data.points)).column_id AS cluster_id
    FROM public.iris_data AS data,
         (SELECT centroids
          FROM madlib.kmeanspp('iris_data', 'points',
                               <Parameters.K>,
                               <Parameters.distance function>,
                               <Parameters.aggregation method>,
                               <Parameters.max number of iterations>,
                               <Parameters.min frac reassigned >)) AS km
    ORDER BY data.pid