@vidhan13j07
Last active December 5, 2016 17:48
Downloads the PDF files linked from an NPTEL video page.
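# Requires Python 3, requests, and beautifulsoup4.
from bs4 import BeautifulSoup
import requests
import os

# Fetch the page and parse its HTML.
r = requests.get(input('Enter the url: '))
soup = BeautifulSoup(r.text, 'html.parser')

# Collect every link inside the 'pdf-label' div.
pdfs = soup.find('div', {'class': 'pdf-label'}).find_all('a')

# Ask where the files should be stored and switch to that directory.
os.chdir(input('Enter the location where the files are to be stored: '))

for label in pdfs:
    link = label.get('href')
    filename = link.split('/')[-1]
    # Skip files that have already been downloaded.
    if os.path.exists(filename):
        continue
    # Stream the download and write it to disk in 1 KiB chunks.
    rr = requests.get(link, stream=True)
    with open(filename, 'wb') as f:
        for chunk in rr.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
    print('Done with ' + filename)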
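
The script takes each `href` as-is and derives the file name with `link.split('/')[-1]`. That may misbehave if the page uses relative links or appends a query string to the PDF URL. A minimal sketch of a more defensive helper follows, assuming Python 3. The name `resolve_pdf_link` and the fallback name `download.pdf` are hypothetical and not part of the original gist.

```python
from urllib.parse import urljoin, urlsplit, unquote
import posixpath


def resolve_pdf_link(page_url, href):
    """Return (absolute_url, filename) for an href found on page_url.

    Hypothetical helper: not part of the original script.
    """
    # urljoin leaves absolute hrefs untouched and resolves relative ones
    # against the page they were found on.
    absolute_url = urljoin(page_url, href)
    # Use only the path component, so '?query' and '#fragment' are dropped.
    path = urlsplit(absolute_url).path
    # Decode percent-escapes such as '%20' in the last path segment.
    filename = unquote(posixpath.basename(path))
    # Fall back to a fixed name (an assumption) when the path ends in '/'.
    return absolute_url, filename or 'download.pdf'


if __name__ == '__main__':
    print(resolve_pdf_link('http://example.com/courses/106/',
                           'notes/lec%201.pdf?dl=1'))
```

Inside the download loop, `link, filename = resolve_pdf_link(r.url, label.get('href'))` could replace the two lines that compute `link` and `filename`.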