@allanj
Created March 12, 2019 05:51
Convert the word2vec bin file to txt
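#
# @author: Allan
#
from gensim.models.keyedvectors import KeyedVectors


def convert(input_path, output_path):
    """Convert a binary word2vec file to text: one word per line, then its vector."""
    embedding = KeyedVectors.load_word2vec_format(input_path, binary=True)
    # gensim >= 4.0 exposes the vocabulary as `key_to_index`; older releases use `vocab`.
    vocab = embedding.key_to_index if hasattr(embedding, 'key_to_index') else embedding.vocab
    with open(output_path, 'w', encoding='utf-8') as f:
        for word in vocab:
            emb = embedding[word].tolist()
            emb_str = word + ' ' + ' '.join(map(str, emb))
            f.write(emb_str + '\n')


## modify these two paths
path = "F:/data/embedding/PubMed-w2v.bin"
file_path = "F:/data/embedding/pubmed-w2v.txt"
convert(path, file_path)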
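
The text file written by convert has one line per word: the word, then its vector components separated by single spaces, with no header line. A minimal sketch of reading that file back in plain Python, assuming words contain no spaces; load_txt_embeddings is a hypothetical helper name, not part of the gist or of gensim:

```python
def load_txt_embeddings(path):
    """Read 'word v1 v2 ... vn' lines into a dict of word -> list of floats."""
    embeddings = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip('\n').split(' ')
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            # First field is the word, the rest are the vector components.
            embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings
```

For large files such as the PubMed vectors this holds everything in memory as Python lists, so it is only meant as a quick check of the converted output.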