@varunchitale
Created October 11, 2018 08:10
I ❥ Spacy!
import spacy

# Load the (small) English model
_nlp = spacy.load('en_core_web_sm')
# Raise the maximum document length (in characters) if the text is very long
# _nlp.max_length = 1000000

# text contains the data that we want to extract phrases from, as a str or UTF-8 bytes
if isinstance(text, bytes):
    text = text.decode('utf-8')
doc = _nlp(text)

def certain_conditions(prompt):
    # Insert your checks here
    return True

all_prompts = []
for chunk in doc.noun_chunks:
    prompt = chunk.text
    # prompt now contains the noun phrase
    if certain_conditions(prompt=prompt):
        all_prompts.append(prompt)
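
The certain_conditions placeholder above always returns True. As a rough sketch of what such a check might look like, the version below keeps a phrase only if it is reasonably short, contains at least one letter, and is not a bare pronoun (pronouns can show up among noun chunks). The PRONOUNS set and the min_chars / max_words thresholds are illustrative assumptions, not part of the original gist, and it runs on plain strings so no spaCy model is needed to try it.

```python
# Hypothetical filter: the word list and thresholds are illustrative assumptions.
PRONOUNS = {
    "i", "you", "he", "she", "it", "we", "they",
    "this", "that", "these", "those", "who", "what", "which",
}

def certain_conditions(prompt, min_chars=3, max_words=5):
    """Return True if the noun phrase looks worth keeping."""
    stripped = prompt.strip()
    words = stripped.split()
    # Reject empty phrases and overly long ones
    if not words or len(words) > max_words:
        return False
    # Reject very short strings such as "a" or "it"
    if len(stripped) < min_chars:
        return False
    # Reject single-word phrases that are just a pronoun
    if len(words) == 1 and words[0].lower() in PRONOUNS:
        return False
    # Require at least one alphabetic character (drops pure numbers)
    return any(ch.isalpha() for ch in stripped)

candidates = ["the quick brown fox", "it", "they", "12345", "New York"]
kept = [p for p in candidates if certain_conditions(prompt=p)]
print(kept)
```

Keeping the check in its own function means the extraction loop stays unchanged while the filtering rules are tuned for a particular dataset.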