@scturtle
Created March 18, 2014 03:59
163 reader
import base64

import requests
from bs4 import BeautifulSoup

# Cookie header value, read from a local file.
with open('cookie163.txt') as f:
    cookie = f.read().strip()

bookid = '23c816b98303491ab82e898d349f6154_4&tradeId='
bookinfo = requests.get('http://yuedu.163.com/getBook.do?id=' + bookid).json()

# Fetch every portion of the book and concatenate the decoded HTML.
content = b''
for p in bookinfo['portions']:
    c = requests.get(
        'http://yuedu.163.com/getArticleContent.do?'
        'sourceUuid={}&articleUuid={}&bigContentId={}'.format(
            bookid, p['id'], p['bigContentId']),
        headers={'Cookie': cookie}).json()
    cnt = base64.b64decode(c['content'])  # the article body is Base64-encoded
    print(p['title'], len(cnt))
    content += b'\n' + cnt

# Strip the markup, drop blank lines, and separate the rest with one empty line.
soup = BeautifulSoup(content, 'html.parser')
content = '\n'.join(soup.find_all(string=True))
content = '\n\n'.join(l for l in content.split('\n') if l.strip())

with open(bookinfo['title'] + '.txt', 'w', encoding='utf8') as f:
    f.write(content)
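
The decode-and-clean step of the script above can be tried offline with a minimal sketch like the one below. It is not part of the original gist: `TextCollector`, `portion_to_text`, and the sample payload are hypothetical names and data, the payload is assumed to be UTF-8 HTML, and the standard library's `html.parser` stands in for BeautifulSoup so that nothing needs to be installed.

```python
import base64
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def portion_to_text(encoded):
    """Decode one Base64 payload and return its non-blank text lines."""
    html = base64.b64decode(encoded).decode('utf8')  # assumption: UTF-8 HTML
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    # Same cleanup as the script: one text node per line, blank lines dropped,
    # remaining lines separated by a single empty line.
    lines = '\n'.join(collector.chunks).split('\n')
    return '\n\n'.join(line for line in lines if line.strip())


# Hypothetical sample standing in for c['content'] returned by the service.
sample = base64.b64encode('<p>第一章</p>\n<p> </p>\n<p>Hello</p>'.encode('utf8'))
print(portion_to_text(sample))
```

As in the script, every text node ends up on its own line, so text inside inline tags such as `<b>` is split from the surrounding sentence.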