@georgerichardson
Created June 20, 2023 06:22
import pandas as pd
import seaborn as sns


def order_by_correlation_cluster(data: pd.DataFrame, x: str, y: str, values: str) -> pd.DataFrame:
    """Takes a tidy dataframe, pivots it, and creates a clustermap based on
    correlations between the y categories. Returns the original data (modified
    in place) with a new 'order' column that contains the ordinal position of
    each y value along the clustermap axis.
    Made with plotting in altair in mind.
    """
    ordered_corr = (
        sns.clustermap(
            data
            # One column per y category, so that .corr() compares y categories
            # and the clustered index holds y labels (not x labels).
            .pivot(index=x, columns=y, values=values)
            .corr()
        )
        .data2d
        .index
        .values
    )
    # Map each y label to its position in the clustered order.
    order_map = dict(zip(ordered_corr, range(ordered_corr.shape[0])))
    data["order"] = data[y].map(order_map)
    return data
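
The snippet above draws a seaborn figure only to read the row order back out of it. As a rough sketch of the same idea without the figure, the variant below calls SciPy's hierarchical clustering directly. The function name `order_by_correlation_linkage` and the toy data are hypothetical and not part of the original gist. The average linkage and Euclidean metric are chosen to mirror seaborn's documented `clustermap` defaults, and the resulting order should be comparable, though it is not guaranteed to be identical.

```python
import pandas as pd
from scipy.cluster.hierarchy import leaves_list, linkage


def order_by_correlation_linkage(data: pd.DataFrame, x: str, y: str, values: str) -> pd.DataFrame:
    """Hypothetical figure-free variant: cluster y categories by correlation."""
    # One column per y category, so .corr() gives a y-by-y correlation matrix.
    corr = data.pivot(index=x, columns=y, values=values).corr()
    # Average linkage on Euclidean distances between rows of the correlation
    # matrix (assumed here to mirror seaborn's clustermap defaults).
    leaves = leaves_list(linkage(corr.values, method="average", metric="euclidean"))
    # Map each y label to its position among the dendrogram leaves.
    order_map = {label: position for position, label in enumerate(corr.index[leaves])}
    # assign() returns a copy, so the caller's dataframe is left untouched.
    return data.assign(order=data[y].map(order_map))


# Toy tidy data: "a" and "b" rise together, "c" falls as they rise.
tidy = pd.DataFrame({
    "x": [1, 2, 3] * 3,
    "y": ["a"] * 3 + ["b"] * 3 + ["c"] * 3,
    "value": [1.0, 2.0, 3.0, 2.0, 4.0, 6.5, 3.0, 2.0, 1.0],
})
ordered = order_by_correlation_linkage(tidy, x="x", y="y", values="value")
print(ordered.drop_duplicates("y")[["y", "order"]])
```

In Altair, the `order` column could then be used as a sort field for the y encoding, for example `alt.Y("y:N", sort=alt.EncodingSortField(field="order"))`, so that correlated categories sit next to each other on the axis.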