@s3thi
Created October 26, 2011 18:33
script to scrape reddit's favorite books
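try:
    from urllib2 import urlopen  # Python 2
except ImportError:
    from urllib.request import urlopen  # Python 3
import json

# grab data
raw_data = urlopen('http://www.reddit.com/r/books/comments/lol8l/time_for_a_new_reddits_favorite_books_thread/.json').read()
# the response is a two-element list: [0] is the submission, [1] is the comment listing
thread_data = json.loads(raw_data.decode('utf-8'))[1]
toplevel_comments = thread_data['data']['children']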
# extract books, upvotes and downvotes
votes = dict()
for book in toplevel_comments:
    try:
        book_name = book['data']['body']
        book_upvotes = book['data']['ups']
        book_downvotes = book['data']['downs']
        votes[book_name] = (book_upvotes, book_downvotes)
    except KeyError:
        # entries without a body (such as the trailing "more" stub) are skipped
        continue
# create a list of (book, (upvotes, downvotes)) pairs sorted by upvotes, highest first
votes_by_up = sorted(votes.items(), key=lambda t: t[1][0], reverse=True)
# print
for item in votes_by_up:
    book_name, book_votes = item
    book_upvotes, book_downvotes = book_votes
    print(book_name + ' -- ' + str(book_upvotes) + ' upvotes, ' +
          str(book_downvotes) + ' downvotes')
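
The extraction and sorting steps can be tried without hitting the network. The sketch below is not part of the original script: `rank_books` and `sample_children` are hypothetical names, and the sample data is made up to mimic the shape of the `children` list (comment entries carrying `body`, `ups` and `downs`, followed by a "more" stub with no `body`). Unlike the script above, it collects rows in a list rather than a dict, so two comments with identical text would both be kept.

```python
def rank_books(children):
    """Return (body, ups, downs) tuples sorted by upvotes, highest first."""
    rows = []
    for child in children:
        data = child.get('data', {})
        if 'body' not in data:
            # not a comment (for example a "more" stub), so skip it
            continue
        rows.append((data['body'], data.get('ups', 0), data.get('downs', 0)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up sample in the assumed shape of thread_data['data']['children']
sample_children = [
    {'kind': 't1', 'data': {'body': 'Dune', 'ups': 120, 'downs': 30}},
    {'kind': 't1', 'data': {'body': 'Catch-22', 'ups': 200, 'downs': 45}},
    {'kind': 'more', 'data': {'children': ['abc123']}},
]

for body, ups, downs in rank_books(sample_children):
    print('%s -- %d upvotes, %d downvotes' % (body, ups, downs))
```

Passing `toplevel_comments` from the script in place of `sample_children` should give the same ranking as the dict-based version, apart from duplicate comment bodies.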