@AlexMikhalev
Created February 11, 2021 12:05
UMLS scispacy
import scispacy
import spacy
from scispacy.abbreviation import AbbreviationDetector
from scispacy.umls_linking import UmlsEntityLinker
nlp = spacy.load("en_core_sci_sm")
# ^^^ at this import you are looking at 9 GB of RAM consumption,
# even before processing starts
# Add the abbreviation pipe to the spacy pipeline.
abbreviation_pipe = AbbreviationDetector(nlp)
nlp.add_pipe(abbreviation_pipe)
# Adding the AbbreviationDetector pipe and setting resolve_abbreviations to True
# means that linking will only be performed on the long form of abbreviations.
linker = UmlsEntityLinker(resolve_abbreviations=True)
nlp.add_pipe(linker)
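
The RAM figure in the comment above can be checked on your own machine. Below is a minimal sketch, not part of the original gist, that reports how much the process's peak resident memory grows around a loading step. It assumes a Unix system, since it relies on the standard-library `resource` module. The names `peak_rss_mb`, `measure_peak_growth`, and `fake_loader` are hypothetical; `fake_loader` is only a stand-in for the expensive step, such as loading the model or building the linker.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure_peak_growth(load):
    """Call load() and return (result, growth of peak RSS in MiB)."""
    before = peak_rss_mb()
    result = load()
    after = peak_rss_mb()
    return result, after - before


def fake_loader():
    # Hypothetical stand-in for the expensive step; allocates about 50 MiB.
    return b"x" * (50 * 1024 * 1024)


if __name__ == "__main__":
    _, growth = measure_peak_growth(fake_loader)
    print(f"peak RSS grew by about {growth:.0f} MiB")
```

To apply it to the snippet above, one could pass a function that wraps the step of interest in place of `fake_loader`. Peak RSS never decreases, so the reported growth is a lower bound on what the step allocated if the process had already peaked higher earlier on.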