@macloo
Last active April 7, 2019 14:08
Use Python Wikipedia-API to get text summary for any subject in a list of subjects
"""
Requires Wikipedia-API 0.5.1 or greater - and Python 3
https://pypi.org/project/Wikipedia-API/
"""
import wikipediaapi
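# written against Wikipedia-API 0.5.x; later releases also expect a
# user_agent argument (see the PyPI page above)
w = wikipediaapi.Wikipedia('en')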
p = w.page('N._K._Jemisin')
# print exactly 2 sentences from summary
print(w.extracts(p, exsentences=2))
# print exactly 6 sentences from summary
# note - not all extracts have that many sentences
print(w.extracts(p, exsentences=6))
"""
Requires Wikipedia-API and Python 3
https://pypi.org/project/Wikipedia-API/
"""
import wikipediaapi
w = wikipediaapi.Wikipedia(language='en', extract_format=wikipediaapi.ExtractFormat.WIKI)
subjects = ['N. K. Jemisin', 'Cixin Liu', 'Ann Leckie', 'John Scalzi', 'Red G. Bloo', 'Jo Walton']
for subject in subjects:
    p = w.page(subject)
    if p.exists():
        print(p.summary, '\n')
        print(p.fullurl, '\n')
    else:
        print(subject + ": No information available.\n")

macloo commented Apr 6, 2019

The only way to control the length of the summary is with a slice: p.summary[0:60]
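A plain character slice can cut a word in half. Here is a minimal sketch of a slightly friendlier truncation, assuming the summary is an ordinary Python string. The helper name `truncate_summary` and the sample text are hypothetical and are not part of Wikipedia-API.

```python
def truncate_summary(text, max_chars=60):
    """Return at most max_chars characters of text, cut at a word boundary.

    Hypothetical helper for illustration; not part of Wikipedia-API.
    """
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # Back up to the last space so the final word is not split in half.
    last_space = cut.rfind(' ')
    if last_space > 0:
        cut = cut[:last_space]
    # Mark the cut so readers can see the text was shortened.
    return cut.rstrip() + '...'


# Stand-in text; in the gist this would be p.summary.
sample = ("This is a stand-in summary sentence used for illustration only. "
          "A second sentence follows it.")
print(truncate_summary(sample, 60))
```

With a real page, the call would be `truncate_summary(p.summary, 60)` in place of `p.summary[0:60]`.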

The regular MediaWiki API can return an extract, which is different from the summary (shorter), but I haven't found a way to get this with Wikipedia-API.

Example:
https://en.wikipedia.org/w/api.php?action=query&prop=extracts&exintro&explaintext&exsentences=3&format=json&titles=Arundhati_Roy
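The same request can be built from Python with only the standard library. This is a rough sketch, not a tested client: the function names `build_extract_url` and `fetch_extract` and the User-Agent string are made up for illustration. The query parameters are the ones in the example URL above, with the flag parameters `exintro` and `explaintext` sent as `1`.

```python
import json
import urllib.parse
import urllib.request

API_URL = 'https://en.wikipedia.org/w/api.php'


def build_extract_url(title, sentences=3):
    """Build a query URL like the example above (hypothetical helper)."""
    params = {
        'action': 'query',
        'prop': 'extracts',
        'exintro': 1,        # flag parameter: intro section only
        'explaintext': 1,    # flag parameter: plain text, not HTML
        'exsentences': sentences,
        'format': 'json',
        'titles': title,
    }
    return API_URL + '?' + urllib.parse.urlencode(params)


def fetch_extract(title, sentences=3):
    """Fetch the extract text for one title (needs network access)."""
    request = urllib.request.Request(
        build_extract_url(title, sentences),
        # Placeholder User-Agent; replace with one that identifies your script.
        headers={'User-Agent': 'extract-demo/0.1 (example script)'},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    # The pages are keyed by page ID, so take the first (only) entry.
    page = next(iter(data['query']['pages'].values()))
    return page.get('extract', '')


url = build_extract_url('Arundhati_Roy', sentences=3)
print(url)
# To actually call the API:
# print(fetch_extract('Arundhati_Roy', sentences=3))
```

The network call is left commented out so the snippet runs offline; uncomment the last line to print the extract.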


macloo commented Apr 7, 2019

The package was updated April 7 to allow selection of extract length by number of sentences. @martin-majlis is amazing!

martin-majlis/Wikipedia-API#21
