@benglewis
Created July 22, 2014 20:03
import nltk
string = '''
I hate waiting in hospitals
Car parking is very expensive
I keep hearing your voice
Someone bought chocolate and didn't give me any
The sound of Rosie's voice makes me want to cry
Kevin has stopped laughing because he lost his voice
The water is too wet
My app isn't working
'''
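# word_tokenize needs NLTK's tokenizer data, and stopwords needs the
# 'stopwords' corpus (both are fetched with nltk.download).
word_list = nltk.word_tokenize(string)
from nltk.corpus import stopwords
# Use a set so each membership test is fast
stopset = set(stopwords.words('english'))
# Keep only tokens whose lowercase form is not an English stop word
filtered_words = [w for w in word_list if w.lower() not in stopset]
print(filtered_words)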
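
The filter above only drops stop words, so punctuation tokens and contraction fragments produced by the tokenizer may remain in the result. A minimal sketch of one way to drop those as well, assuming alphabetic-only tokens are what is wanted; `filter_tokens`, `demo_stopset`, and `demo_tokens` are hypothetical names introduced here for illustration, and the tiny stop list stands in for the NLTK one so the snippet runs without any corpus data:

```python
def filter_tokens(tokens, stopset):
    """Keep tokens that are purely alphabetic and not in stopset.

    The stop-word comparison is case-insensitive, matching the
    w.lower() check used in the script above.
    """
    return [w for w in tokens if w.isalpha() and w.lower() not in stopset]


# Hypothetical miniature stop list and token list, for illustration only
demo_stopset = {"the", "is", "too"}
demo_tokens = ["The", "water", "is", "too", "wet", "."]

print(filter_tokens(demo_tokens, demo_stopset))  # ['water', 'wet']
```

With the real data, the same function could be called as `filter_tokens(word_list, stopset)`. Note that `isalpha()` also discards tokens containing digits or apostrophes, which may or may not be desirable.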