Last active
August 25, 2020 02:59
! mkdir encoder
! curl -Lo encoder/infersent2.pkl https://dl.fbaipublicfiles.com/infersent/infersent2.pkl
! mkdir GloVe
! curl -Lo GloVe/glove.840B.300d.zip http://nlp.stanford.edu/data/glove.840B.300d.zip
! unzip GloVe/glove.840B.300d.zip -d GloVe/
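from models import InferSent  # models.py from the facebookresearch/InferSent repository
import torch

V = 2  # InferSent version; matches the infersent2.pkl checkpoint
MODEL_PATH = 'encoder/infersent%s.pkl' % V
params_model = {'bsize': 64, 'word_emb_dim': 300, 'enc_lstm_dim': 2048,
                'pool_type': 'max', 'dpout_model': 0.0, 'version': V}
model = InferSent(params_model)
model.load_state_dict(torch.load(MODEL_PATH))

# Point the model at the unzipped GloVe vectors (absolute path as used on Colab).
# Note: the InferSent README pairs version 1 with GloVe and version 2 with fastText.
W2V_PATH = '/content/GloVe/glove.840B.300d.txt'
model.set_w2v_path(W2V_PATH)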
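The vocabulary step that follows reads a variable named `sentences`, which is expected to be a list of strings. A minimal sketch of preparing such a list is given here; the sample strings and the helper `clean_sentences` are hypothetical stand-ins, not part of the original gist.

```python
# Hypothetical stand-in corpus; the original `sentences` list is not shown.
raw_sentences = [
    "I ate dinner.",
    "  We had a three-course meal.  ",
    "",
    "I ate dinner.",
]


def clean_sentences(raw):
    """Strip whitespace, drop empty strings, and remove exact duplicates.

    The order of first occurrence is preserved.
    """
    seen = set()
    cleaned = []
    for s in raw:
        s = s.strip()
        if s and s not in seen:
            seen.add(s)
            cleaned.append(s)
    return cleaned


sentences = clean_sentences(raw_sentences)
print(sentences)
```

Dropping empty strings and duplicates up front keeps the later similarity printout free of blank or repeated rows.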
# Build the vocabulary from `sentences`, a list of strings
model.build_vocab(sentences, tokenize=True)
query = "I had pizza and pasta"
# encode() iterates over a list of sentences, so wrap the single query in a list
query_vec = model.encode([query])[0]
query_vec
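import numpy as np
def cosine(u, v):  # assumed helper (not shown in the gist): cosine similarity
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
similarity = []
for sent in sentences:
    sim = cosine(query_vec, model.encode([sent])[0])
    similarity.append(sim)
    print("Sentence = ", sent, "; similarity = ", sim)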
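The loop above encodes and compares one sentence per iteration. As a rough sketch, the same comparison can be done in one pass and sorted, assuming the sentence embeddings are already stacked in a 2-D NumPy array with one row per sentence (which is what a single `model.encode(sentences)` call is expected to return). The helper names `cosine_to_all` and `rank_sentences` and the toy 2-D vectors are illustrative assumptions, not part of the gist.

```python
import numpy as np


def cosine_to_all(query_vec, sentence_vecs):
    """Cosine similarity between one vector and each row of a 2-D array."""
    query_vec = np.asarray(query_vec, dtype=float)
    sentence_vecs = np.asarray(sentence_vecs, dtype=float)
    dots = sentence_vecs @ query_vec
    norms = np.linalg.norm(sentence_vecs, axis=1) * np.linalg.norm(query_vec)
    return dots / norms


def rank_sentences(query_vec, sentence_vecs, sentences):
    """Return (sentence, similarity) pairs, most similar first."""
    sims = cosine_to_all(query_vec, sentence_vecs)
    order = np.argsort(-sims)  # indices sorted by descending similarity
    return [(sentences[i], float(sims[i])) for i in order]


# Toy stand-ins for real embeddings (real InferSent vectors are much larger).
demo_sentences = ["a", "b", "c"]
demo_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
demo_query = np.array([1.0, 0.0])

ranked = rank_sentences(demo_query, demo_vecs, demo_sentences)
for sent, sim in ranked:
    print("Sentence = ", sent, "; similarity = ", round(sim, 3))
```

With real embeddings, `demo_vecs` would be replaced by the encoded sentence matrix and `demo_query` by `query_vec`.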