@conormm
Created May 18, 2021 20:35
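import spacy


def extract_tokens_plus_meta(doc: spacy.tokens.doc.Doc):
    """Extract tokens and metadata from individual spaCy doc."""
    # One tuple per token: text, index, lemma, entity type, fine-grained tag,
    # dependency label, coarse POS tag, then four boolean lexical flags.
    return [
        (i.text, i.i, i.lemma_, i.ent_type_, i.tag_,
         i.dep_, i.pos_, i.is_stop, i.is_alpha,
         i.is_digit, i.is_punct) for i in doc
    ]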
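
A minimal usage sketch, not part of the original gist: it runs the function on one short sentence and pairs each tuple with column names. The `COLUMNS` list, its names, and the sample sentence are assumptions chosen for illustration. `spacy.blank("en")` is used so that no trained model download is needed. A blank pipeline only tokenizes, so the lemma, entity, tag, dependency, and POS fields come back as empty strings; a trained pipeline would be needed to populate them.

```python
import spacy


def extract_tokens_plus_meta(doc: spacy.tokens.doc.Doc):
    """Extract tokens and metadata from individual spaCy doc."""
    return [
        (i.text, i.i, i.lemma_, i.ent_type_, i.tag_,
         i.dep_, i.pos_, i.is_stop, i.is_alpha,
         i.is_digit, i.is_punct) for i in doc
    ]


# Hypothetical column names, listed in the same order as the tuple fields.
COLUMNS = [
    "text", "index", "lemma", "ent_type", "tag",
    "dep", "pos", "is_stop", "is_alpha", "is_digit", "is_punct",
]

# Tokenizer-only English pipeline (assumption: no trained model installed).
nlp = spacy.blank("en")
doc = nlp("I bought 3 apples.")

# Turn each tuple into a dict keyed by column name.
rows = [dict(zip(COLUMNS, row)) for row in extract_tokens_plus_meta(doc)]

for row in rows:
    print(row["index"], row["text"], row["is_alpha"], row["is_digit"], row["is_punct"])
```

Keeping the column names in a single list next to the function makes it harder for the tuple order and the labels to drift apart if fields are added later.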