@AnalogJ
Created August 29, 2017 01:54
Download all books in an OPDS catalog.
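# Python 2 script (urllib2 and urlparse are Python 2 modules).
import urllib2
import os
import urllib, urlparse
import xml.etree.cElementTree as et

# Fetch the OPDS feed and parse it into an element tree.
e = et.ElementTree(file=urllib2.urlopen('https://standardebooks.org/opds/all')).getroot()
print(e)
print("parsing")
# Walk every Atom <link> element and keep those that point at an .epub file.
for atype in e.iter('{http://www.w3.org/2005/Atom}link'):
    if atype.get('href') and atype.get('type') == "application/epub+zip" and atype.get('href').split('.')[-1] == 'epub':
        # The script assumes hrefs are site-relative, so it prefixes the host.
        dl_url = "https://standardebooks.org{0}".format(atype.get('href'))
        print(dl_url)
        # Save under the last path component, in the current directory.
        split = urlparse.urlsplit(dl_url)
        filename = "./" + split.path.split("/")[-1]
        urllib.urlretrieve(dl_url, filename)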
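The script above relies on `urllib2` and `urlparse`, which exist only in Python 2. A rough Python 3 sketch of the same idea might look like the following. The function names `extract_epub_links` and `download_all` and the `dest_dir` parameter are hypothetical, introduced here for illustration. The sketch also swaps the hard-coded host prefix for `urljoin`, so absolute hrefs would be handled too. It has not been checked that the feed is still served at the original URL.

```python
import os
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urljoin, urlsplit

# Namespaced tag for Atom <link> elements, same as in the original script.
ATOM_LINK = "{http://www.w3.org/2005/Atom}link"


def extract_epub_links(feed_xml, base_url):
    """Return absolute URLs of all EPUB links in an OPDS (Atom) feed.

    Hypothetical helper: feed_xml is the feed document (str or bytes),
    base_url is used to resolve relative hrefs.
    """
    root = ET.fromstring(feed_xml)
    urls = []
    for link in root.iter(ATOM_LINK):
        href = link.get("href")
        # Keep only links declared as EPUB, as the original script does.
        if not href or link.get("type") != "application/epub+zip":
            continue
        url = urljoin(base_url, href)
        if urlsplit(url).path.endswith(".epub"):
            urls.append(url)
    return urls


def download_all(feed_url, dest_dir="."):
    """Fetch the feed and save every EPUB it lists into dest_dir."""
    with urllib.request.urlopen(feed_url) as resp:
        feed_xml = resp.read()
    for url in extract_epub_links(feed_xml, feed_url):
        # Name each file after the last component of the URL path.
        filename = os.path.basename(urlsplit(url).path)
        print(url)
        urllib.request.urlretrieve(url, os.path.join(dest_dir, filename))


# Example call (performs network access):
# download_all("https://standardebooks.org/opds/all")
```

Keeping the parsing step separate from the download step means the link extraction can be tried on a saved copy of the feed without touching the network.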