@tsh-code
Created March 4, 2024 08:26
spacy spanmarker example
import spacy

# Requires the span_marker package to be installed; it provides the
# "span_marker" spaCy pipeline component used below.
nlp = spacy.load("en_core_web_trf")
nlp.add_pipe("span_marker", config={"model": "lxyuan/span-marker-bert-base-multilingual-cased-multinerd"})


def extract_people(text: str):
    """Return the unique person names in `text` that have at least two words."""
    doc = nlp(text)
    full_names = set()
    for entity in doc.ents:
        if entity.label_ in ['PER', 'PERSON']:
            # Keep only names with two or more whitespace-separated words,
            # as a rough proxy for "first name plus last name"
            if len(entity.text.split()) >= 2:
                full_names.add(entity.text)
    return list(full_names)


content = 'text goes here'
entities = extract_people(content)
print(entities)
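The filtering step can be tried without loading any model. The following is only a sketch under assumptions: `filter_full_names`, `PERSON_LABELS` and the sample data are hypothetical names that are not part of the gist, and the function takes plain `(text, label)` pairs instead of spaCy entities. Unlike the set-based version above, it returns names in first-seen order, which is a design choice of this sketch and not something the gist does.

```python
from typing import Iterable, List, Tuple

# Labels treated as "person"; mirrors the ['PER', 'PERSON'] check in the gist.
PERSON_LABELS = {"PER", "PERSON"}


def filter_full_names(entities: Iterable[Tuple[str, str]]) -> List[str]:
    """Return unique person names with two or more words, in first-seen order."""
    seen = set()
    names = []
    for text, label in entities:
        # Collapse runs of whitespace so "Ada  Lovelace" and "Ada Lovelace" match.
        name = " ".join(text.split())
        if label in PERSON_LABELS and len(name.split()) >= 2 and name not in seen:
            seen.add(name)
            names.append(name)
    return names


# Hypothetical (text, label) pairs standing in for model output.
sample = [
    ("Ada Lovelace", "PER"),
    ("London", "LOC"),
    ("Ada", "PERSON"),
    ("Ada Lovelace", "PERSON"),
    ("Alan Turing", "PERSON"),
]
print(filter_full_names(sample))  # ['Ada Lovelace', 'Alan Turing']
```

With the pipeline from the gist, the same function could be fed real entities via `filter_full_names((ent.text, ent.label_) for ent in nlp(content).ents)`.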