@oaguy1
Last active June 2, 2020 22:31
Removing stop words from raw natural language text
import spacy

nlp = spacy.load('en_core_web_sm')

# comments is an array of strings we generated earlier
parsed_bodies = [nlp(comm) for comm in comments]

cleaned = []
for doc in parsed_bodies:
    current = []
    for token in doc:
        if not token.is_stop:
            current.append(token.text)
    cleaned.append(current)
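
The filtering step above can also be tried without a spaCy model installed. The following is a minimal sketch of the same idea in plain Python. The `STOP_WORDS` set, the `remove_stop_words` helper, and the `sample_comments` list are hypothetical names introduced here for illustration. The whitespace tokenization is an assumption and is much cruder than spaCy's tokenizer.

```python
# Hypothetical, deliberately tiny stop list for illustration only;
# spaCy's English stop list is far larger.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "it"}


def remove_stop_words(texts, stop_words=STOP_WORDS):
    """Return one list of non-stop-word tokens per input string."""
    cleaned = []
    for text in texts:
        # Naive whitespace tokenization (an assumption; spaCy's tokenizer
        # also separates punctuation and handles contractions).
        tokens = text.split()
        # Compare case-insensitively so "The" is dropped like "the".
        cleaned.append([tok for tok in tokens if tok.lower() not in stop_words])
    return cleaned


# Stand-in for the comments array used in the snippet above.
sample_comments = ["The cat is in the garden", "It rains a lot"]
cleaned = remove_stop_words(sample_comments)
print(cleaned)
```

As in the spaCy version, the result is a list of token lists, one per input string, so later steps can consume either output in the same way.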