@btahir
Last active April 25, 2024 04:58
MVP For Semantic Search using Sentence Transformers + FAISS
# install packages
# pip install faiss-cpu sentence-transformers
import numpy as np
import faiss
import time
from sentence_transformers import SentenceTransformer
# https://www.sbert.net/docs/pretrained_models.html#multi-qa-models
embedder = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')
# Corpus with example sentences
corpus = ['A man is eating food.',
'A man is eating a piece of bread.',
'The girl is carrying a baby.',
'A man is riding a horse.',
'A woman is playing violin.',
'Two men pushed carts through the woods.',
'A man is riding a white horse on an enclosed ground.',
'A monkey is playing drums.',
'A cheetah is running behind its prey.'
]
# encode to a float32 NumPy array, which is the input type FAISS expects
corpus_embeddings = embedder.encode(corpus, convert_to_numpy=True)
# get embedding dimension
embed_dim = embedder.get_sentence_embedding_dimension()
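# build an exact inner-product index, wrapped in an ID map so each vector keeps its corpus position
index = faiss.IndexIDMap(faiss.IndexFlatIP(embed_dim))
# FAISS ids must be int64
index.add_with_ids(corpus_embeddings, np.arange(len(corpus), dtype=np.int64))
# save the index to disk and load it back for later use
faiss.write_index(index, 'my_index')
index = faiss.read_index('my_index')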
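def search(query, k=5):
    """Return the k corpus sentences most similar to the query."""
    t = time.time()
    query_vector = embedder.encode([query])
    # search returns (scores, ids), each of shape (1, k)
    scores, ids = index.search(query_vector, k)
    print('total time: {}'.format(time.time() - t))
    # FAISS pads the ids with -1 when fewer than k results exist
    return [corpus[_id] for _id in ids[0].tolist() if _id != -1]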
# example query
query = 'music instrument'
results = search(query)
print('results:')
for result in results:
    print('\t', result)
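
For intuition about what the flat inner-product index returns: `IndexFlatIP` does exact, brute-force scoring of the query against every stored vector and keeps the highest scores. The sketch below reproduces that idea in plain NumPy. The helper name `brute_force_ip_search`, the 2-D vectors, and the ids are made up for illustration and are not part of the gist. According to its model card, the `-cos-v1` model produces normalized embeddings, so inner product on its output should behave like cosine similarity.

```python
import numpy as np

def brute_force_ip_search(vectors, ids, query, k):
    """Return (scores, ids) of the k stored vectors with the largest inner product."""
    scores = vectors @ query                          # one inner product per stored vector
    order = np.argsort(-scores, kind="stable")[:k]    # highest score first
    return scores[order], ids[order]

# Toy, hand-made unit vectors standing in for sentence embeddings (assumption).
vectors = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [0.6, 0.8]], dtype=np.float32)
ids = np.array([10, 11, 12], dtype=np.int64)
query = np.array([0.8, 0.6], dtype=np.float32)

scores, top_ids = brute_force_ip_search(vectors, ids, query, k=2)
print(top_ids.tolist())   # ids of the two best matches, best first
```

On a corpus this small the brute-force approach is perfectly adequate; approximate index types only start to matter at much larger scales.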
@acalatrava

I'm pretty new to ML. I tried this gist on Colab and got this error on line 31:

ValueError: input not a numpy array

any hints?
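
One plausible reading of this error (an assumption; the thread does not confirm the cause): when `encode` is called with `convert_to_tensor=True` it returns a PyTorch tensor, while the FAISS Python wrappers expect a C-contiguous `float32` NumPy array for vectors and `int64` for ids. A small hypothetical helper, here called `as_faiss_input`, that coerces the input before it is handed to `add_with_ids` might look like this:

```python
import numpy as np

def as_faiss_input(embeddings):
    """Coerce embeddings to the C-contiguous float32 2-D array FAISS expects."""
    # PyTorch tensors expose detach()/cpu()/numpy(); duck-typing avoids importing torch here.
    if hasattr(embeddings, "detach"):
        embeddings = embeddings.detach().cpu().numpy()
    array = np.ascontiguousarray(embeddings, dtype=np.float32)
    if array.ndim == 1:              # a single vector becomes a batch of one
        array = array.reshape(1, -1)
    return array

vectors = as_faiss_input([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])  # float64 nested list in
ids = np.arange(len(vectors), dtype=np.int64)                 # FAISS ids are int64
print(vectors.dtype, vectors.shape, ids.dtype)
```

Calling `embedder.encode` with `convert_to_numpy=True` (its default) sidesteps the tensor entirely, which is probably the simpler route.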

@btahir
Author

btahir commented Mar 2, 2023

Works for me. 🤷‍♂️

@acalatrava

I tried it on Google Colab and it shows that error. I installed faiss-gpu; maybe it's a different package?

@acalatrava

OK, I saw you mention faiss-cpu in your gist, and it works with that package. Is there any way to make it work on a GPU?

@btahir
Author

btahir commented Mar 2, 2023

I'm sure there is. You can google around for faiss-gpu examples.
