@interstar
Created July 22, 2013 04:15
Quora RSS scraper.
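import hashlib
import json

import feedparser
from bs4 import BeautifulSoup

# Replace YOUR-QUORA-NAME with your own Quora profile name.
d = feedparser.parse("http://www.quora.com/YOUR-QUORA-NAME/answers/rss")

for e in d["entries"]:
    title = e["title"]
    summary = e["summary"]

    # Name each output file after a hash of the question title.
    # hashlib needs bytes, so encode the title first.
    h = hashlib.sha224(title.encode("utf-8")).hexdigest()

    # Swap <br /> for a placeholder so line breaks survive tag stripping.
    summary = summary.replace("<br />", "__LINEBREAK__")
    soup = BeautifulSoup(summary, "html.parser")

    # Drop the first 11 and last 21 characters of the extracted text
    # (presumably fixed wrapper text in the feed summary).
    answer = soup.get_text()[11:]
    answer = answer[:-21]
    answer = answer.replace("__LINEBREAK__", "\n")
    answer = answer.strip()

    print(title)
    print(answer)

    # Save each question/answer pair as JSON in its own file.
    j = {"question": title, "answer": answer, "link": e["link"]}
    with open(h + ".quora.txt", "w") as f:
        f.write(json.dumps(j))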
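
The script writes one JSON file per answer, named `<hash>.quora.txt`. As a rough sketch of how those files might be read back, the helper below collects them into a list. `load_answers` and its `directory` parameter are hypothetical names introduced here for illustration and are not part of the original gist. It assumes each file holds a single JSON object with the `question`, `answer`, and `link` keys the script writes.

```python
import glob
import json
import os


def load_answers(directory="."):
    """Load every *.quora.txt file in `directory` into a list of dicts.

    Hypothetical helper: assumes each file contains one JSON object
    as written by the scraper (keys: question, answer, link).
    """
    answers = []
    pattern = os.path.join(directory, "*.quora.txt")
    # Sort the paths so the result order is stable between runs.
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            answers.append(json.load(f))
    return answers
```

Calling `load_answers()` in the directory where the scraper ran should return one dict per saved answer, which can then be filtered or re-exported as needed.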