@SingAvi
Created April 24, 2020 07:05
Extracting facts from a dataset of information
import wikipedia
import spacy
import textacy.extract
# f= open("facts.txt","a+")
# Load the small English NLP model (en_core_web_sm)
nlp = spacy.load('en_core_web_sm')
# Fetch the full text of the Wikipedia page for the search term
content = wikipedia.page("Apple Inc").content
doc = nlp(content)
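# Extract semi-structured statements with "Apple" as the head word.
# Note: this positional call matches textacy releases from around the time
# this gist was written; newer releases may require keyword arguments
# (entity=..., cue=...) instead.
statements = textacy.extract.semistructured_statements(doc, "Apple")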
# Print the results
print("Here are the things I know about Apple:")
for statement in statements:
    subject, verb, fact = statement
    print(f"-{fact}")
# facts = f" - {fact}"
# print(facts)
# f.write(facts)
# f.close()
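
The commented-out lines above hint at saving the extracted facts to facts.txt. A minimal sketch of that step follows, assuming the facts have already been converted to plain strings. The helper name save_facts is hypothetical and not part of the original script, and the trailing newline after each fact is an addition so that every fact lands on its own line.

```python
from typing import Iterable


def save_facts(facts: Iterable[str], path: str = "facts.txt") -> int:
    """Append each fact to `path` as a ' - fact' line and return the count written.

    Hypothetical helper: it mirrors the commented-out open/write/close calls,
    but uses a with-block so the file is closed even if a write fails.
    """
    count = 0
    # "a" appends to the file (creating it if needed), like "a+" in the original.
    with open(path, "a", encoding="utf-8") as f:
        for fact in facts:
            f.write(f" - {fact}\n")
            count += 1
    return count
```

In the script above, this could be called as save_facts(str(fact) for _, _, fact in statements) in place of the print loop, assuming each statement unpacks into three parts as it does there.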