@tobocop2
Created October 22, 2014 00:45
OS pdfs
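import os

import requests
import wget
from bs4 import BeautifulSoup

# Base URL of the course site; relative PDF links are resolved against it.
url = 'http://www.cs.rutgers.edu/~vinodg/teaching/fall-2014-cs416/'

# Create the download directory, tolerating the case where it already exists.
os_dir = 'OS'
try:
    os.makedirs(os_dir)  # os.mkdirs does not exist; makedirs is the real call
except OSError:
    if not os.path.isdir(os_dir):
        raise
os.chdir(os_dir)

# Fetch the schedule page and download every link that mentions 'pdf'.
res = requests.get(url + 'schedule.html')
soup = BeautifulSoup(res.text, 'html.parser')
for a in soup.select('a[href]'):  # skip anchors that have no href attribute
    href = a['href']
    if 'pdf' in href:
        if 'http' not in href:
            # Relative link: prefix it with the course base URL.
            wget.download(url + href)
            print(url + href)
        else:
            wget.download(href)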
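The link-filtering step above can also be sketched without third-party packages, which makes it easy to check offline. This is only an illustration, not part of the original script: `PdfLinkParser` and `pdf_links` are hypothetical names, and it swaps the string concatenation for `urllib.parse.urljoin`, which leaves absolute links alone and resolves relative ones against the base URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs for every <a href> whose target mentions 'pdf'."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        # Same loose test as the script: any href containing 'pdf' counts.
        if href and 'pdf' in href:
            self.links.append(urljoin(self.base_url, href))


def pdf_links(html, base_url):
    """Return the PDF URLs found in html, resolved against base_url."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

Passing `res.text` and `url` from the script to `pdf_links` should yield the list of URLs to hand to `wget.download`, assuming the schedule page uses ordinary `<a href>` links.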
@tobocop2
All of the OS PDFs that we need.