@tecknoh19
Created September 20, 2018 00:26
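# Downloads all PDF files linked from the given URL
# USAGE: pdf_dl.py <URL>
# Requires Python 2 (urllib2, print statement) and beautifulsoup4
from bs4 import BeautifulSoup
import os
import sys
import urllib2
import urlparse

base_url = sys.argv[1]
resp = urllib2.urlopen(base_url)
soup = BeautifulSoup(resp, "html.parser", from_encoding=resp.info().getparam('charset'))
for link in soup.find_all('a', href=True):
    if ".pdf" in link['href']:
        # Resolve relative hrefs against the page URL instead of concatenating strings
        pdf_url = urlparse.urljoin(base_url, link['href'])
        print "Downloading " + pdf_url
        dl = urllib2.urlopen(pdf_url)
        # Save under the file name only, in binary mode so the PDF is not corrupted
        fh = open(os.path.basename(urlparse.urlparse(pdf_url).path), "wb")
        fh.write(dl.read())
        fh.close()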
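
The script above targets Python 2. As a rough Python 3 sketch of the same idea, the block below separates link discovery from downloading so the discovery step can be checked without network access. It is not the gist author's code: the names `_AnchorCollector`, `find_pdf_links`, and `download` are hypothetical, the standard-library `html.parser` is swapped in for BeautifulSoup to avoid a third-party dependency, and links are matched by a path ending in `.pdf` (case-insensitive) rather than by the substring test used in the script.

```python
import shutil
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class _AnchorCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_pdf_links(html, base_url):
    """Return (absolute_url, local_filename) pairs for links whose path ends in .pdf."""
    collector = _AnchorCollector()
    collector.feed(html)
    results = []
    for href in collector.hrefs:
        # urljoin handles relative, root-relative, and absolute hrefs
        absolute = urljoin(base_url, href)
        path = urlparse(absolute).path
        if path.lower().endswith(".pdf"):
            # Keep only the last path segment as the local file name
            results.append((absolute, PurePosixPath(path).name))
    return results


def download(url, filename):
    """Stream the response body to disk in binary mode."""
    with urlopen(url) as resp, open(filename, "wb") as fh:
        shutil.copyfileobj(resp, fh)
```

A caller would fetch the page, decode it, pass the text and the page URL to `find_pdf_links`, and then call `download` on each returned pair. Matching on the parsed path means a link such as `report.pdf?v=2` is still saved as `report.pdf`.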