@chiefastro
Created November 30, 2020 04:27
Custom entity tagging with spaCy's EntityRuler
import spacy
from spacy.pipeline import EntityRuler
from spacy import displacy
# load pre-trained model pipeline
nlp = spacy.load('en_core_web_sm')
# sample sentence: the rules below should match "Beyond Burgers" but not plain "meat"
text = """He does not eat meat, but he loves Beyond Burgers."""
# rules for a custom named entity
# overwrite_ents=True makes these rules take precedence when
# a span was already tagged as an entity by the statistical NER
ruler = EntityRuler(nlp, overwrite_ents=True)
ruler.add_patterns([
    {"label": "BEYONDPRODUCT", "pattern": [
        {"LOWER": "beyond"}, {"LOWER": "meat"}
    ]},
    {"label": "BEYONDPRODUCT", "pattern": [
        {"LOWER": "beyond"}, {"LOWER": "burgers"}
    ]},
    {"label": "BEYONDPRODUCT", "pattern": [
        {"LOWER": "beyond"}, {"LOWER": "sausage"}
    ]},
    {"label": "BEYONDPRODUCT", "pattern": [
        {"TEXT": "Beyond"}
    ]},
])
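# add the ruler at the end of the pipeline, after the statistical NER
# (spaCy v2 call style; v3 adds it by name: nlp.add_pipe("entity_ruler"))
nlp.add_pipe(ruler)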
# apply pipeline
doc = nlp(text)
# display entities inline (jupyter=True assumes a notebook)
displacy.render(doc, style='ent', jupyter=True)
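The script above uses the spaCy v2 call style, where the ruler is built by hand and passed to `nlp.add_pipe`. As a rough sketch of the same idea under spaCy v3, the ruler can be added by its registered name and the entities printed instead of rendered. Two things here are assumptions and not part of the original: a blank English pipeline (`spacy.blank("en")`) is used so no model download is needed, and the pattern list is trimmed to two entries.

```python
import spacy

# Assumption: spaCy v3+, where built-in components are added by string name.
# A blank pipeline keeps this sketch self-contained; the original script
# loads en_core_web_sm instead.
nlp = spacy.blank("en")

# overwrite_ents=True mirrors the original: rule matches replace overlapping
# entities set by earlier components (there are none in a blank pipeline).
ruler = nlp.add_pipe("entity_ruler", config={"overwrite_ents": True})
ruler.add_patterns([
    {"label": "BEYONDPRODUCT", "pattern": [{"LOWER": "beyond"}, {"LOWER": "burgers"}]},
    {"label": "BEYONDPRODUCT", "pattern": [{"TEXT": "Beyond"}]},
])

doc = nlp("He does not eat meat, but he loves Beyond Burgers.")

# Collect (text, label) pairs for every entity the ruler found.
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

Both patterns match at "Beyond", and the ruler keeps the longer span, so the two-token product name is tagged as one entity and the standalone "Beyond" pattern only applies where no longer rule fits.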