Created August 18, 2020 08:09
NLP with spaCy
# Install spaCy
pip3 install -U spacy
# Download the large English model for spaCy
python3 -m spacy download en_core_web_lg
# Install textacy which will also be useful
pip3 install -U textacy
# Insert into file.py
import spacy
# Load the large English NLP model
nlp = spacy.load('en_core_web_lg')
# The text we want to examine
text = """London is the capital and most populous city of England and
the United Kingdom. Standing on the River Thames in the south east
of the island of Great Britain, London has been a major settlement
for two millennia. It was founded by the Romans, who named it Londinium.
"""
# Parse the text with spaCy. This runs the entire pipeline.
doc = nlp(text)
# 'doc' now contains a parsed version of text. We can use it to do anything we want!
# For example, this will print out all the named entities that were detected:
for entity in doc.ents:
    print(f"{entity.text} ({entity.label_})")
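The entity loop above prints one line per mention, so a name that occurs several times is printed several times. As a minimal sketch, the mentions could instead be grouped by label with duplicates dropped. The helper `group_entities` and the `sample_pairs` list are hypothetical names introduced here for illustration. The sample data is hand-written, not real model output, so the labels the model actually assigns may differ.

```python
from collections import defaultdict


def group_entities(pairs):
    """Group (text, label) pairs into {label: [texts]}.

    Keeps first-seen order and skips texts already recorded for a label.
    """
    grouped = defaultdict(list)
    for text, label in pairs:
        if text not in grouped[label]:
            grouped[label].append(text)
    return dict(grouped)


# With the `doc` object from the snippet above, the pairs would be built as:
#   pairs = [(entity.text, entity.label_) for entity in doc.ents]
# The list below is a hand-written stand-in (an assumption, not model output).
sample_pairs = [
    ("London", "GPE"),
    ("England", "GPE"),
    ("London", "GPE"),
    ("Romans", "NORP"),
]

for label, texts in group_entities(sample_pairs).items():
    print(f"{label}: {', '.join(texts)}")
```

Because the helper only takes plain tuples, it can be tried out without loading the large model. Swapping in the commented `pairs` line connects it to the parsed document.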