@makispl
Created February 22, 2021 07:46
from kneed import KneeLocator
from sklearn.cluster import KMeans

# `df` is assumed to be a pandas DataFrame loaded earlier.
# Switch to a new dataframe, reduced to the rows with no nulls
df_no_nuls = df.dropna().copy()
# Subset to the numerical columns we are about to feed to the ML algorithm
data = df_no_nuls[['rating', 'alcohol', 'age']].copy()
# Calculate the WCSS (within-cluster sum of squares) for k = 1..max_clusters - 1
max_clusters = 11
wcss = list()
for k in range(1, max_clusters):
    kmeans = KMeans(n_clusters=k, init='k-means++', random_state=1)
    kmeans.fit(data)
    # inertia_ is the sum of squared distances of samples to their closest centroid
    wcss.append(kmeans.inertia_)
# Locate the elbow of the convex, decreasing WCSS curve
n_clusters = KneeLocator(list(range(1, max_clusters)), wcss, curve='convex', direction='decreasing').knee
print("Optimal # of clusters:", n_clusters)
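To make the "locate the elbow" step less of a black box, here is a minimal sketch of one common heuristic: scale both axes to [0, 1] and pick the k whose point lies farthest below the straight line joining the first and last points. This is a simplified stand-in and not necessarily what `KneeLocator` does internally. The function name `elbow_by_chord_distance` and the sample WCSS values are hypothetical, chosen only for illustration. It assumes at least two k values and a strictly decreasing WCSS list.

```python
def elbow_by_chord_distance(ks, wcss):
    """Return the k whose (k, wcss) point lies farthest below the chord
    joining the first and last points, after scaling both axes to [0, 1].

    Assumes len(ks) >= 2 and that wcss is strictly decreasing.
    """
    k_min, k_max = ks[0], ks[-1]
    w_min, w_max = min(wcss), max(wcss)
    xs = [(k - k_min) / (k_max - k_min) for k in ks]
    ys = [(w - w_min) / (w_max - w_min) for w in wcss]
    # For a decreasing curve the scaled chord runs from (0, 1) to (1, 0),
    # i.e. y = 1 - x; the vertical gap below it is (1 - x) - y.
    gaps = [(1 - x) - y for x, y in zip(xs, ys)]
    best = max(range(len(ks)), key=gaps.__getitem__)
    return ks[best]


# Made-up WCSS values for illustration only (not taken from the gist's data).
example_ks = list(range(1, 11))
example_wcss = [1000, 520, 260, 180, 150, 130, 115, 105, 98, 92]
print("Elbow at k =", elbow_by_chord_distance(example_ks, example_wcss))
```

On real data the two approaches may disagree by a cluster or so, so it is worth plotting `wcss` against k and checking the elbow by eye as well.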