Last active
January 15, 2020 12:41
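import os
import shutil

# Languages to keep. The trailing comma matters: ('en') is just the string 'en',
# so `dir_ not in KEEP` would be a substring test rather than a membership test.
KEEP = ('en',)


def remove_dirs():
    """Delete every directory under ./lang whose name is not listed in KEEP."""
    lang_root = 'lang'
    for dir_ in os.listdir(lang_root):
        path = os.path.join(lang_root, dir_)
        # Only directories are removed; plain files such as __init__.py are left alone.
        if os.path.isdir(path) and dir_ not in KEEP:
            shutil.rmtree(path)


if __name__ == "__main__":
    remove_dirs()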
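This is intended to remove unneeded languages from spacy (e.g. for slimming down to run in AWS Lambda). To use:
1. Find spacy/lang/ in your installed copy of spacy. The script looks for a directory named lang under the current working directory, so run it from the directory that contains lang.
2. Edit KEEP to include any language models you require (anything not in here will be removed).
3. Run python prune_langs.py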
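Because anything not in KEEP is deleted permanently, it can help to preview the result first. The following is a minimal dry-run sketch, not part of the original script: `plan_removals` and its `lang_root` and `keep` parameters are hypothetical names introduced here for illustration. It applies the same rule as the script (directories only, name not in the keep list) but returns the names instead of deleting anything.

```python
import os


def plan_removals(lang_root, keep=("en",)):
    """Return the sorted names of subdirectories of lang_root that are not in keep.

    Hypothetical helper for illustration: nothing is deleted here.
    """
    doomed = []
    for name in os.listdir(lang_root):
        path = os.path.join(lang_root, name)
        # Same rule as the pruning script: directories only, name not kept.
        if os.path.isdir(path) and name not in keep:
            doomed.append(name)
    return sorted(doomed)
```

For example, running `print(plan_removals("lang"))` from the directory that contains lang lists what the pruning script would remove with the default keep list.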