@seozed
Created October 12, 2022 08:11
lemmatize.py
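import os

import nltk
from nltk.stem import LancasterStemmer, PorterStemmer, WordNetLemmatizer

# WordNetLemmatizer reads the WordNet corpus; some NLTK releases also need omw-1.4.
nltk.download('wordnet')
nltk.download('omw-1.4')
# punkt is only needed if the words come from nltk.word_tokenize.
nltk.download('punkt')

wordnet_lemmatizer = WordNetLemmatizer()
# The stemmers are instantiated for comparison only; the output below uses the lemmatizer.
porter = PorterStemmer()
lancaster = LancasterStemmer()

# Fill in the words to lemmatize; the list is left empty here.
words = []

# pos="v" treats every word as a verb, e.g. "running" -> "run".
new_words = [wordnet_lemmatizer.lemmatize(word, pos="v") for word in words]
new_words.sort()

filename = 'words.txt'
# Write one lemma per line to words.txt in the current working directory.
with open(os.path.join(os.getcwd(), filename), 'w') as file:
    file.writelines([line + '\n' for line in new_words])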
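lemmatize.py instantiates PorterStemmer and LancasterStemmer but never calls them. As a minimal sketch of how the two could be compared side by side, the snippet below stems a small word list with both. The helper name `compare_stems` and the sample words are assumptions for illustration, not part of the gist. Neither stemmer needs a corpus download.

```python
from nltk.stem import LancasterStemmer, PorterStemmer

porter = PorterStemmer()
lancaster = LancasterStemmer()


def compare_stems(words):
    """Return {word: (porter_stem, lancaster_stem)} for each input word.

    Hypothetical helper, not part of the original script.
    """
    return {w: (porter.stem(w), lancaster.stem(w)) for w in words}


if __name__ == "__main__":
    # Illustrative sample words (an assumption, not from the gist).
    sample = ["running", "cats", "happily"]
    for word, (p_stem, l_stem) in compare_stems(sample).items():
        print(f"{word:10} porter={p_stem:10} lancaster={l_stem}")
```

Unlike the lemmatizer, the stemmers apply suffix-stripping rules and can return strings that are not dictionary words, so the choice depends on whether the output in words.txt has to be readable.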