@gavinmh
Last active Dec 5, 2018
Named Entity Extraction with NLTK in Python
# -*- coding: utf-8 -*-
'''
Extract named-entity chunks from text with NLTK.
'''
from nltk import sent_tokenize, word_tokenize, pos_tag, ne_chunk


def extract_entities(text):
    entities = []
    for sentence in sent_tokenize(text):
        # Tokenize and POS-tag the sentence, then chunk named entities.
        chunks = ne_chunk(pos_tag(word_tokenize(sentence)))
        # Named entities are subtrees; other tokens are plain (word, tag) tuples.
        # NLTK 3 replaced Tree.node with Tree.label(), so test for 'label';
        # checking for 'node' fails there with NotImplementedError.
        entities.extend([chunk for chunk in chunks if hasattr(chunk, 'label')])
    return entities


if __name__ == '__main__':
    text = """
A multi-agency manhunt is under way across several states and Mexico after
police say the former Los Angeles police officer suspected in the murders of a
college basketball coach and her fiancé last weekend is following through on
his vow to kill police officers after he opened fire Wednesday night on three
police officers, killing one.
"In this case, we're his target," Sgt. Rudy Lopez from the Corona Police
Department said at a press conference.
The suspect has been identified as Christopher Jordan Dorner, 33, and he is
considered extremely dangerous and armed with multiple weapons, authorities
say. The killings appear to be retribution for his 2009 termination from the
Los Angeles Police Department for making false statements, authorities say.
Dorner posted an online manifesto that warned, "I will bring unconventional
and asymmetrical warfare to those in LAPD uniform whether on or off duty."
"""
    # Parenthesized so the call works on both Python 2 and Python 3.
    print(extract_entities(text))

@PandaWhoCodes PandaWhoCodes commented Sep 25, 2017

    entities.extend([chunk for chunk in chunks if hasattr(chunk, 'node')])
  File "C:\ProgramData\Anaconda3\lib\site-packages\nltk\tree.py", line 202, in _get_node
    raise NotImplementedError("Use label() to access a node label.")
NotImplementedError: Use label() to access a node label.


@aditiprakash aditiprakash commented Oct 6, 2017

@reach2ashish Replace 'node' with 'label' on Line 12 and it will work :)
If you're using Python 3, you will also have to put parentheses around the argument to print, since print is a function there.
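
The change suggested above can be sketched as a small helper that reads entity labels through `label()`. This is an illustration, not part of the gist: the name `entity_strings` is hypothetical, and it assumes each named-entity chunk is a flat subtree whose children are `(word, tag)` pairs, which is how `ne_chunk` output usually looks.

```python
def entity_strings(chunks):
    """Return (label, text) pairs for the labelled subtrees in `chunks`.

    `chunks` is expected to be an iterable like the result of ne_chunk():
    plain (word, tag) tuples mixed with subtrees that expose label().
    """
    pairs = []
    for chunk in chunks:
        # Plain tuples have no label attribute; NLTK 3 subtrees have a method.
        label = getattr(chunk, 'label', None)
        if not callable(label):
            continue
        # Assumption: the subtree is flat, so each child is a (word, tag) pair.
        words = [word for word, tag in chunk]
        pairs.append((label(), ' '.join(words)))
    return pairs
```

With NLTK 3 installed, calling `entity_strings(ne_chunk(pos_tag(word_tokenize(sentence))))` should give pairs such as `('PERSON', 'Rudy Lopez')`, though the exact labels depend on the pretrained model.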


@EdenBelouadah EdenBelouadah commented Oct 15, 2017

This code doesn't recognize dates?
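
Whether the pretrained chunker returns date entities depends on the model, and the comment above suggests it did not here. One possible workaround, separate from NLTK and not part of the gist, is a regular-expression pass for simple date expressions. The sketch below is deliberately narrow: the name `extract_dates` and the pattern (weekday names and four-digit years from 1900 to 2099) are assumptions chosen for illustration.

```python
import re

# Assumed, illustrative pattern: full weekday names or years 1900-2099.
DATE_PATTERN = re.compile(
    r'\b(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day\b'
    r'|\b(?:19|20)\d{2}\b'
)


def extract_dates(text):
    """Return the date-like substrings of `text`, in order of appearance."""
    return DATE_PATTERN.findall(text)
```

The results could be appended to the list returned by `extract_entities`, but they are plain strings rather than NLTK subtrees, so any code that consumes both would need to handle the two types.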
