
@ahwillia
Created June 22, 2020 23:26
Sort data points by hierarchical clustering
from sklearn.datasets import make_biclusters
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
def resort_rows_hclust(U):
    """Sorts the rows of a matrix by hierarchical clustering

    Parameters:
        U (ndarray) : matrix of data

    Returns:
        prm (ndarray) : permutation of the rows
    """
    from scipy.cluster import hierarchy
    Z = hierarchy.ward(U)
    return hierarchy.leaves_list(hierarchy.optimal_leaf_ordering(Z, U))
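# Generate a shuffled 300 x 300 matrix containing 5 biclusters.
data = make_biclusters(shape=(300, 300), n_clusters=5, noise=5, shuffle=True, random_state=0)[0]
plt.imshow(data, aspect='auto')
# Compute the row ordering, then the column ordering via the transpose.
ii = resort_rows_hclust(data)
jj = resort_rows_hclust(data.T)
# Integer-array indexing returns a copy, so `data` itself is left unchanged.
new_data = data[ii][:, jj]
plt.imshow(new_data, aspect='auto')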
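
A quick way to sanity-check the reordering, sketched here on hypothetical toy data that is not part of the original gist: build a small matrix whose rows come from two well-separated groups, interleave them, and confirm that the returned permutation places rows of the same group next to each other. The names `labels`, `sorted_labels`, and `n_switches` are illustrative assumptions; the function body is copied from the gist.

```python
import numpy as np
from scipy.cluster import hierarchy


def resort_rows_hclust(U):
    """Return a row permutation from Ward clustering with optimal leaf ordering."""
    Z = hierarchy.ward(U)
    return hierarchy.leaves_list(hierarchy.optimal_leaf_ordering(Z, U))


# Hypothetical toy data: two well-separated groups of rows, interleaved.
rng = np.random.default_rng(0)
labels = np.array([0, 1, 0, 1, 0, 1, 0, 1])
U = rng.normal(scale=0.1, size=(8, 4)) + 10.0 * labels[:, None]

prm = resort_rows_hclust(U)
sorted_labels = labels[prm]

# Count how often the group label changes between adjacent sorted rows.
# If each group ends up contiguous, there is exactly one change.
n_switches = int(np.sum(np.diff(sorted_labels) != 0))

print("permutation:  ", prm)
print("sorted labels:", sorted_labels)
print("label changes:", n_switches)
```

Because a dendrogram's leaf order keeps every cluster contiguous, the two groups should each form one unbroken run; optimal leaf ordering only decides how the leaves are arranged within and between those runs.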
@ahwillia

Expected output:

[Image: plot of the expected output]
