@swinton
Created July 8, 2014 16:25
Grab some text from a random, featured Wikipedia page. Useful for testing.
#!/usr/bin/env python
import requests
from pyquery import PyQuery as pq

# Tool URL that leads to a random featured article on English Wikipedia.
wiki_link = 'http://tools.wikimedia.de/~dapete/random/enwiki-featured.php'


def retrieve_article():
    # Loop instead of recursing, so a run of empty paragraphs cannot exhaust the stack.
    while True:
        article = requests.get(wiki_link)
        text = pq(article.text)
        title = text('#firstHeading').find('span').text()
        first_paragraph = text('#mw-content-text').find('p').eq(1).text()
        # Retry with another random article if no paragraph text was found.
        if first_paragraph:
            # print() with one argument works on both Python 2 and Python 3.
            print(title + '\n' + first_paragraph + '\n' + article.url)
            return


if __name__ == '__main__':
    retrieve_article()
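
The script keeps requesting new articles until it finds a non-empty first paragraph. If the page markup changes and the selectors stop matching, that retry never ends. Below is a minimal sketch of a bounded retry, not part of the original gist: the names `retry_until_non_empty`, `fetch`, and `max_attempts` are hypothetical, and a stand-in fetcher replaces the network call so the example runs offline.

```python
def retry_until_non_empty(fetch, max_attempts=5):
    """Call fetch() until it returns a non-empty value, at most max_attempts times."""
    for _ in range(max_attempts):
        result = fetch()
        if result:
            return result
    return None  # every attempt came back empty


# Stand-in fetcher (assumption for illustration): two empty results, then text.
responses = iter(['', None, 'First paragraph of some article.'])
paragraph = retry_until_non_empty(lambda: next(responses))
print(paragraph)
```

In the script, `fetch` could be a function that downloads one article and returns its first paragraph text, with the caller deciding what to do when `None` comes back.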