@shreybatra
Created March 4, 2019 04:57
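import nltk
from nltk.corpus import stopwords

# One-time setup: nltk.download('punkt') and nltk.download('stopwords')
# (newer NLTK releases may also need 'punkt_tab').

text = "Hi, Laptops from Hewlett-Packard aren't running MacOS. We would love an Apple Mac for our work."

# Split the text into word-level tokens (punctuation becomes its own token).
print('Word tokenization - ')
word_token = nltk.word_tokenize(text)
print(word_token)

# Split the text into sentences.
print('\nSentence tokenization - ')
print(nltk.sent_tokenize(text))

# NLTK's English stopwords are lowercase, so compare the lowercased token;
# otherwise capitalized stopwords such as "We" are not removed.
print('\nWord tokens after removing stopwords - ')
stop = set(stopwords.words('english'))
ans = [token for token in word_token if token.lower() not in stop]
print(ans)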
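
The stopword filter above leaves punctuation tokens such as ',' and '.' in the result, because they are not in the stopword list. A minimal sketch of one way to drop them as well, assuming a hand-written token list and a tiny stand-in stopword set (both hypothetical, chosen so the example runs without NLTK data; `filter_tokens` is an illustrative helper, not part of NLTK):

```python
import string

# Hypothetical token list, hand-written for illustration (not actual NLTK output).
tokens = ["Hi", ",", "We", "would", "love", "an", "Apple", "Mac", "."]

# Tiny stand-in stopword set; the full list would come from
# stopwords.words('english').
stop = {"we", "would", "an"}


def filter_tokens(tokens, stop):
    """Drop stopwords (case-insensitively) and punctuation-only tokens."""
    return [
        t for t in tokens
        if t.lower() not in stop
        and not all(ch in string.punctuation for ch in t)
    ]


print(filter_tokens(tokens, stop))
# prints ['Hi', 'love', 'Apple', 'Mac']
```

Tokens that only contain some punctuation, such as "Hewlett-Packard", are kept, since the check requires every character to be punctuation.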