Created December 29, 2015 14:14
Download math books from Springer
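import urllib2, re

# Python 2 script: prints wget/mv shell commands instead of downloading directly.
pmax = 13  # the range end is exclusive, so search pages 1..12 are fetched
links = []
# Collect the book links from each search results page of series 136.
for i in xrange(1, pmax):
    index = 'http://link.springer.com/search/page/%d?facet-series="136"&facet-content-type="Book"&showAll=false' % i
    page = urllib2.urlopen(index).read()
    links.extend(re.findall('<a class="title" href="(.*?)"', page))
# Visit each book page and read its title and PDF download path.
for link in links:
    url = 'http://link.springer.com' + link
    page = urllib2.urlopen(url).read()
    name = re.search('<h1 id="title">(.*?)<', page).group(1)
    pdf = re.search('<a id="toc-download-book-pdf.*?href="(.*?)"', page).group(1)
    # Emit a download command, then a rename from the URL's last path segment to the title.
    print "wget http://link.springer.com" + pdf
    print 'mv %s "%s.pdf"' % (
        re.search("([^/]*)$", pdf).group(1),
        name)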
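
The script above calls `.group(1)` on every `re.search` result and wraps the title in plain double quotes, so a page without a PDF link raises `AttributeError`, and a title containing a quote or `$` could yield a broken `mv` command. Below is a minimal Python 3 sketch of the same two steps with those cases handled. The helper names `extract_book` and `shell_commands` and the sample HTML fragment are assumptions made for illustration; they are not part of the gist, and the real Springer markup may differ.

```python
import posixpath
import re
import shlex

BASE_URL = "http://link.springer.com"  # same host the script above uses


def extract_book(page):
    """Return (title, pdf_path) from a book page, or None if either is missing."""
    title = re.search(r'<h1 id="title">(.*?)<', page)
    pdf = re.search(r'<a id="toc-download-book-pdf.*?href="(.*?)"', page)
    if title is None or pdf is None:
        return None  # e.g. a book with no full-PDF download link
    return title.group(1).strip(), pdf.group(1)


def shell_commands(title, pdf_path):
    """Build the wget and mv command lines with shell-safe quoting."""
    filename = posixpath.basename(pdf_path)   # last segment of the URL path
    target = title.replace("/", "-") + ".pdf"  # "/" is not allowed in a file name
    return [
        "wget " + shlex.quote(BASE_URL + pdf_path),
        "mv " + shlex.quote(filename) + " " + shlex.quote(target),
    ]


# Hypothetical page fragment for illustration only.
sample = ('<h1 id="title">Linear Algebra</h1>'
          '<a id="toc-download-book-pdf" href="/content/pdf/example.pdf">')

book = extract_book(sample)
if book is not None:
    for command in shell_commands(*book):
        print(command)
```

Returning `None` lets the caller skip a book that has no download link instead of aborting the whole run, and `shlex.quote` only adds quotes when the string contains characters the shell would interpret.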