@jboynyc
Last active June 8, 2016 10:23
Easily download all linked PDFs from a web page using the IPython/Jupyter console
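from urllib.parse import urljoin
from bs4 import BeautifulSoup as bs

base = 'https://example.net'
r = !curl -s "$base"  # IPython shell capture: a list of output lines
s = bs('\n'.join(r), 'html.parser')
# href is None for anchors without the attribute, so guard before endswith
for link in s.find_all('a', href=lambda x: x and x.lower().endswith('.pdf')):
    loc = urljoin(base, link['href'])  # resolve relative links against the page URL
    !curl -O "$loc"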
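
The snippet above relies on IPython's `!` shell escapes, so it only runs in the IPython/Jupyter console. For use in an ordinary script, the same idea might be sketched with the standard library alone. This is an illustrative alternative, not the gist author's method: it swaps BeautifulSoup for `html.parser` and `curl` for `urllib.request`, and the names `PDFLinkParser`, `find_pdf_links`, and `download_pdfs` are hypothetical. It also checks the URL path rather than the raw `href`, so links with a query string still match.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen, urlretrieve


class PDFLinkParser(HTMLParser):
    """Collect absolute URLs of <a href> targets whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if not href:
            return  # anchor without an href attribute
        url = urljoin(self.base_url, href)  # resolve relative links
        if urlparse(url).path.lower().endswith('.pdf'):
            self.links.append(url)


def find_pdf_links(html, base_url):
    """Return the PDF links found in an HTML string, in document order."""
    parser = PDFLinkParser(base_url)
    parser.feed(html)
    return parser.links


def download_pdfs(page_url):
    """Fetch page_url and save every linked PDF into the current directory."""
    with urlopen(page_url) as resp:
        html = resp.read().decode('utf-8', errors='replace')
    for url in find_pdf_links(html, page_url):
        filename = os.path.basename(urlparse(url).path)
        urlretrieve(url, filename)
```

Calling `download_pdfs('https://example.net')` would then mirror the loop above. Assuming UTF-8 for the page encoding is a simplification.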