@oborchers
Last active June 7, 2019 16:54
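import numpy as np
from numpy import float32 as REAL


def sif_embeddings(sentences, model):
    """Compute SIF sentence embeddings from the precomputed sif_vectors matrix.

    Each sentence is an iterable of tokens; out-of-vocabulary tokens are skipped.
    """
    vlookup = model.wv.vocab
    vectors = model.wv.sif_vectors
    # The sif_vectors must be pre-computed and attached to model.wv as:
    # sif_vectors = (model.wv.vectors * model.wv.sif[:, None])
    output = []
    for s in sentences:
        # Row indices of all in-vocabulary words of the sentence
        idx = [vlookup[w].index for w in s if w in vlookup]
        # Sum of the weighted word vectors (a zero vector if idx is empty)
        v = np.sum(vectors[idx], axis=0)
        if len(idx) > 0:
            # Turn the sum into an average
            v *= 1 / len(idx)
        output.append(v)
    return np.vstack(output).astype(REAL)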
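
The function above assumes that `model.wv.sif` and `model.wv.sif_vectors` already exist. A minimal sketch of how they might be precomputed and how the function could be called is given below. The helper `precompute_sif_vectors`, the weighting `alpha / (alpha + p(w))` with `alpha=1e-3`, and the toy `SimpleNamespace` stand-in for a trained model are assumptions for illustration, not part of the original gist.

```python
from types import SimpleNamespace

import numpy as np
from numpy import float32 as REAL


def precompute_sif_vectors(vectors, counts, alpha=1e-3):
    """Hypothetical helper: weight each word vector by alpha / (alpha + p(w))."""
    counts = np.asarray(counts, dtype=REAL)
    freqs = counts / counts.sum()            # relative word frequency p(w)
    sif = (alpha / (alpha + freqs)).astype(REAL)
    return sif, (vectors * sif[:, None]).astype(REAL)


def sif_embeddings(sentences, model):
    """Same logic as the function above, repeated to keep the sketch runnable."""
    vlookup = model.wv.vocab
    vectors = model.wv.sif_vectors
    output = []
    for s in sentences:
        idx = [vlookup[w].index for w in s if w in vlookup]
        v = np.sum(vectors[idx], axis=0)
        if len(idx) > 0:
            v *= 1 / len(idx)
        output.append(v)
    return np.vstack(output).astype(REAL)


# Toy stand-in for a trained model (assumption: vocab entries expose .index)
words = ["the", "cat", "sat"]
counts = [100, 10, 5]
vectors = np.arange(6, dtype=REAL).reshape(3, 2)
sif, sif_vectors = precompute_sif_vectors(vectors, counts)
vocab = {w: SimpleNamespace(index=i, count=c)
         for i, (w, c) in enumerate(zip(words, counts))}
model = SimpleNamespace(
    wv=SimpleNamespace(vocab=vocab, vectors=vectors, sif=sif, sif_vectors=sif_vectors)
)

# One row per sentence; a sentence without known words yields a zero row
emb = sif_embeddings([["the", "cat"], ["dog"]], model)
```

Precomputing the weighted matrix once keeps the per-sentence loop down to an index lookup, a sum, and a scalar multiplication.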