@pszemraj
Created February 1, 2024 20:56
download and run this periodically during setup so Colab doesn't whine about you not using the GPU
# pip install sentence-transformers -q
# source: https://www.sbert.net/docs/usage/semantic_textual_similarity.html
from sentence_transformers import SentenceTransformer, util
model = SentenceTransformer("all-MiniLM-L6-v2")
# Two lists of sentences
sentences1 = [
    "The cat sits outside",
    "A man is playing guitar",
    "The new movie is awesome",
]
sentences2 = [
    "The dog plays in the garden",
    "A woman watches TV",
    "The new movie is so great",
]
# Compute embedding for both lists
embeddings1 = model.encode(sentences1, convert_to_tensor=True)
embeddings2 = model.encode(sentences2, convert_to_tensor=True)
# Compute cosine-similarities
cosine_scores = util.cos_sim(embeddings1, embeddings2)
# Output the pairs with their score
for i in range(len(sentences1)):
    print("{} \t\t {} \t\t Score: {:.4f}".format(
        sentences1[i], sentences2[i], cosine_scores[i][i]
    ))
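To use this the way the description intends (keeping the GPU busy during long setup steps so Colab's idle-GPU nagging stays quiet), one option is to wrap the similarity computation in a timed loop. Below is a minimal sketch; the 30-second interval, the iteration count, and the explicit device selection are illustrative assumptions, not part of the original gist.
# Periodic-runner sketch: re-encode a small batch every 30 s to keep the GPU active.
# The interval, loop count, and explicit device choice are assumptions for illustration.
import time

import torch
from sentence_transformers import SentenceTransformer, util

device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer("all-MiniLM-L6-v2", device=device)

sentences = [
    "The cat sits outside",
    "A man is playing guitar",
    "The new movie is awesome",
]

for _ in range(30):  # roughly 15 minutes of light GPU activity
    embeddings = model.encode(sentences, convert_to_tensor=True)
    _ = util.cos_sim(embeddings, embeddings)  # small pairwise-similarity matrix
    time.sleep(30)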