@EnkrateiaLucca
Created July 24, 2020 17:26
Stemming
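from nltk.stem import PorterStemmer, LancasterStemmer
from nltk.tokenize import word_tokenize  # needs NLTK's Punkt tokenizer data to be downloaded

# Instantiate the stemmer classes from nltk
porter = PorterStemmer()
lancaster = LancasterStemmer()

# Select a sentence; `text` is assumed to be a list of sentence strings
# defined elsewhere (it is not shown in this gist)
sentence = text[3]

# Print the sentence, then each token's stem from both stemmers in aligned columns
print("Sentence:")
print("'" + sentence + "'")
print("{0:20} {1:20}".format("Porter Stemmer", "Lancaster Stemmer"))
for word in word_tokenize(sentence):
    print("{0:20} {1:20}".format(porter.stem(word), lancaster.stem(word)))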
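The snippet above depends on a `text` list that is not included here. As a rough, self-contained sketch of the same comparison, the version below uses a made-up sentence (`sample_sentence`) and a hypothetical helper (`compare_stemmers`), and splits on whitespace instead of calling `word_tokenize` so that no tokenizer data has to be downloaded. None of these names come from the original gist.

```python
from nltk.stem import PorterStemmer, LancasterStemmer

porter = PorterStemmer()
lancaster = LancasterStemmer()

# Hypothetical sample input; the original gist reads its sentence from `text[3]`
sample_sentence = "The runners were running quickly through the crowded streets"


def compare_stemmers(sentence):
    """Return (word, porter_stem, lancaster_stem) for each whitespace-separated word."""
    return [(word, porter.stem(word), lancaster.stem(word)) for word in sentence.split()]


rows = compare_stemmers(sample_sentence)

# Print the three columns side by side, padded to 20 characters each
print("{0:20} {1:20} {2:20}".format("Word", "Porter Stemmer", "Lancaster Stemmer"))
for word, porter_stem, lancaster_stem in rows:
    print("{0:20} {1:20} {2:20}".format(word, porter_stem, lancaster_stem))
```

Whitespace splitting leaves punctuation attached to words, so for real text `word_tokenize` is likely the better choice; it is swapped out here only to keep the sketch free of extra downloads.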