@mortonjt
Created July 10, 2020 18:38
import numpy as np
import matplotlib.pyplot as plt
from skbio.stats.ordination import pcoa
from skbio import DistanceMatrix
from scipy.spatial.distance import pdist, squareform
# embedding = <your language model embedding>  # shape (L, D): L = sequence length, D = embedding dimension
# Pairwise Euclidean distances between residue embeddings, shape (L, L)
rr_dist = squareform(pdist(embedding))
dm = DistanceMatrix(rr_dist)
ord_res = pcoa(dm)
cm = plt.get_cmap('viridis')
# Color each residue by its position along the sequence
sc = plt.scatter(ord_res.samples['PC1'], ord_res.samples['PC2'],
                 c=np.arange(ord_res.samples.shape[0]), cmap=cm)
plt.colorbar(sc)
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.title('Residue PCoA')