Last active
August 25, 2020 02:59
! mkdir encoder
! curl -Lo encoder/infersent2.pkl https://dl.fbaipublicfiles.com/infersent/infersent2.pkl
! mkdir GloVe
! curl -Lo GloVe/glove.840B.300d.zip http://nlp.stanford.edu/data/glove.840B.300d.zip
! unzip GloVe/glove.840B.300d.zip -d GloVe/
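# `models` is models.py from the facebookresearch/InferSent repository; it must be importable.
from models import InferSent
import torch

V = 2  # InferSent version; selects encoder/infersent2.pkl
MODEL_PATH = 'encoder/infersent%s.pkl' % V
params_model = {'bsize': 64, 'word_emb_dim': 300, 'enc_lstm_dim': 2048,
                'pool_type': 'max', 'dpout_model': 0.0, 'version': V}
model = InferSent(params_model)
model.load_state_dict(torch.load(MODEL_PATH))

# Note: the InferSent README pairs version 2 with fastText vectors and version 1 with GloVe;
# this gist points the version 2 model at GloVe, kept here as written.
W2V_PATH = '/content/GloVe/glove.840B.300d.txt'
model.set_w2v_path(W2V_PATH)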
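The vocabulary-building call that follows expects a `sentences` variable, which the gist never defines. A minimal sketch of one way to prepare it is shown here; the example sentences and the `raw_sentences` name are hypothetical placeholders, and dropping blank entries is a cautious choice rather than something taken from the gist.

```python
# Hypothetical example corpus; replace with your own data.
raw_sentences = [
    "I ate dinner.",
    "We had a three-course meal.",
    "   ",
    "Brad came to dinner with us.",
    "He loves fish tacos.",
]

# Strip surrounding whitespace and drop entries that end up empty.
sentences = [s.strip() for s in raw_sentences if s.strip()]
```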
# `sentences` must be a list of strings; the gist does not define it.
model.build_vocab(sentences, tokenize=True)
query = "I had pizza and pasta"
# encode() takes a list of sentences, so wrap the single query in a list.
query_vec = model.encode([query])[0]
query_vec
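import numpy as np

def cosine(u, v):  # cosine similarity of two 1-D vectors
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

similarity = []
for sent in sentences:
    sim = cosine(query_vec, model.encode([sent])[0])
    similarity.append(sim)
    print("Sentence = ", sent, "; similarity = ", sim)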
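The loop above encodes one sentence per iteration. As a rough sketch of an alternative, the similarities can be computed in one vectorized step once all embeddings are available, then sorted. The helper name `rank_by_cosine` and the toy vectors are assumptions made for illustration; they are not part of the gist or of InferSent.

```python
import numpy as np

def rank_by_cosine(query_vec, sentence_vecs, sentences):
    """Return (sentence, similarity) pairs sorted from most to least similar."""
    query_vec = np.asarray(query_vec, dtype=float)
    sentence_vecs = np.asarray(sentence_vecs, dtype=float)
    # Divide each dot product by the product of the two vector norms.
    norms = np.linalg.norm(sentence_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = sentence_vecs @ query_vec / norms
    order = np.argsort(-sims)  # indices in descending order of similarity
    return [(sentences[i], float(sims[i])) for i in order]

# Toy 2-D vectors stand in for real sentence embeddings (hypothetical data).
demo_sentences = ["a", "b", "c"]
demo_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
demo_query = np.array([1.0, 0.0])
ranked = rank_by_cosine(demo_query, demo_vecs, demo_sentences)
```

With the objects from this gist, the call would look like `rank_by_cosine(query_vec, model.encode(sentences), sentences)`, assuming `model.encode(sentences)` returns one embedding row per sentence in input order.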