@anbublacky
Created May 19, 2011 13:41
python-webscraping
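# Python 2 script: download every .ogv video linked from the page at `url`.
import urllib2
import subprocess
from BeautifulSoup import BeautifulSoup

url = 'http://media.railscasts.com/assets/episodes/videos/'
soup = BeautifulSoup(urllib2.urlopen(url).read())

# Visit every <a> tag that has an href attribute.
for tag in soup.findAll('a', href=True):
    href = tag['href']
    if href.endswith('ogv'):
        video_url = url + href
        print video_url
        # wget -c resumes a partially downloaded file instead of restarting it.
        subprocess.call(["wget", "-c", video_url])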
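
The link-filtering step of the script above can also be sketched in Python 3 using only the standard library. This is a minimal sketch and not the gist author's code: `LinkCollector`, `find_video_links`, and the sample file names are hypothetical, introduced only for illustration, and the sketch assumes the listing page uses plain `<a href="...">` links relative to the base URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper, not in the gist)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_video_links(html, base_url, suffix=".ogv"):
    """Return absolute URLs for every link in html whose href ends with suffix."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, h) for h in parser.hrefs if h.endswith(suffix)]


# Usage on a small inline listing (made-up file names, no network access needed).
sample = '<a href="001-intro.ogv">a</a><a href="001-intro.mp4">b</a><a>no href</a>'
links = find_video_links(sample, "http://media.railscasts.com/assets/episodes/videos/")
```

Each URL in `links` could then be passed to `subprocess.call(["wget", "-c", link])`, as the original script does.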