Srinjoy Ghosh GhoshSrinjoy

Algorithm,Type,Cluster shape bias,Outliers,Scalability,Key parameters,Best for
K-Means,Centroid,"Spherical/convex",Sensitive,Excellent,"k","Large datasets with roughly spherical clusters"
K-Means++,Centroid,"Spherical/convex",Sensitive,Excellent,"k","Same as K-Means, with better and more stable initialization"
Mini-Batch K-Means,Centroid,"Spherical/convex",Sensitive,"Excellent (streaming)","k, batch_size","Very large/streaming data; fast approximate K-Means"
K-Medoids (PAM),"Centroid (medoids)","Spherical/convex",Robust,Poor,"k","Outliers; non-Euclidean distances; representative exemplars"
Agglomerative (generic),Hierarchical,"Flexible (linkage-dependent)",Depends,"Poor–Moderate","linkage, distance","Hierarchical structure; small–medium data"
Ward linkage,Hierarchical,"Spherical/compact",Moderate,Moderate,"k or cut height (Euclidean distance only)","Default hierarchical choice for compact clusters"
BIRCH,"Hierarchical (CF-tree)","Spherical/compact",Moderate,Excellent,"threshold, branching factor","Very large datasets; incremental/streaming"
DBSCAN,Density