@kanekomasahiro
Last active April 17, 2021 01:53
Save gensim word embeddings stored in text format as a binary file.
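import sys
import linecache

from gensim.models import KeyedVectors


def save_word_embedding_text_to_binary(input_path, output_path):
    # A word2vec text file normally starts with a "<vocab_size> <vector_size>" header line.
    # If the first line does not have exactly two fields, treat the header as missing.
    no_header = len(linecache.getline(input_path, 1).split()) != 2
    # The no_header argument requires gensim 4.0 or later.
    embedding = KeyedVectors.load_word2vec_format(input_path, binary=False, no_header=no_header)
    embedding.save_word2vec_format(output_path, binary=True)


def main(args):
    # Usage: python <script> <input_text_file> <output_binary_file>
    input_path = args[1]
    output_path = args[2]
    save_word_embedding_text_to_binary(input_path, output_path)


if __name__ == "__main__":
    main(sys.argv)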
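
The header check in the script only counts the fields on the first line. A slightly stricter variant, sketched below with the standard library only, also requires both fields to be integers. The names `has_word2vec_header` and `write_temp` are hypothetical helpers introduced for illustration and are not part of gensim. The check is still a heuristic: a headerless file of one-dimensional vectors whose first word is numeric would be misread as having a header.

```python
import os
import tempfile


def has_word2vec_header(path):
    """Return True if the first line looks like '<vocab_size> <vector_size>'."""
    with open(path, encoding="utf-8") as f:
        tokens = f.readline().split()
    # A header has exactly two fields, and both are non-negative integers.
    return len(tokens) == 2 and all(t.isdigit() for t in tokens)


def write_temp(text):
    """Write text to a temporary file and return its path (helper for the demo)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(text)
    return path


# Two tiny embedding files: one with a header line, one without.
with_header = write_temp("2 3\ncat 0.1 0.2 0.3\ndog 0.4 0.5 0.6\n")
without_header = write_temp("cat 0.1 0.2 0.3\ndog 0.4 0.5 0.6\n")

print(has_word2vec_header(with_header))     # True
print(has_word2vec_header(without_header))  # False

os.remove(with_header)
os.remove(without_header)
```

The negated result can be passed as `no_header` to `KeyedVectors.load_word2vec_format`, in the same way the script does.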