@jorgeavaldez
Created May 24, 2015 16:07
cids - image downloader
from bs4 import BeautifulSoup
import os
import re
import requests
from urllib.request import urlopen


def getSource(url):
    # Fetch the raw HTML of the thread page
    return urlopen(url).read()


def getMatches(source):
    # This selector changes according to the source site.
    # '.fileThumb' should work for any thread.
    return BeautifulSoup(source, 'html.parser').select('.fileThumb')


def downloadImage(imageURL, fileName):
    response = requests.get(imageURL)
    # Make sure it's a successful download; skip the file otherwise
    if response.status_code != 200:
        print('File could not be downloaded :(')
        return
    print('Downloading %s...' % fileName)
    if not os.path.exists('Images'):
        os.makedirs('Images')
    # Write to file in 4 KB chunks
    with open(os.path.join('Images', fileName), 'wb') as fo:
        for chunk in response.iter_content(4096):
            fo.write(chunk)


def run():
    url = input("Thread URL? ")
    for match in getMatches(getSource(url)):
        href = match['href']
        # href is protocol-relative (//host/board/file.ext);
        # the local name becomes board_file.ext
        localFileNameMatches = re.search(r"//(.*)/(.*)/(.*\..*)", href)
        if localFileNameMatches is None:
            continue
        localFileName = localFileNameMatches.group(2) + "_" + localFileNameMatches.group(3)
        downloadImage('http:' + href, localFileName)


if __name__ == '__main__':
    run()
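
The file-naming step in run() is the least obvious part of the script, so here is a minimal standalone sketch of it. It reuses the script's regular expression as-is; the helper name `local_file_name` and the example link are hypothetical and not part of the original gist. It assumes each thumbnail link is protocol-relative and shaped like `//host/board/file.ext`.

```python
import re

# Same pattern the script uses to split a protocol-relative link
PATTERN = re.compile(r"//(.*)/(.*)/(.*\..*)")


def local_file_name(href):
    """Return '<board>_<file>' for a link shaped like //host/board/file.ext.

    Returns None when the link does not match, instead of raising.
    (Hypothetical helper, written for illustration only.)
    """
    match = PATTERN.search(href)
    if match is None:
        return None
    # group(2) is the last directory segment, group(3) the file name
    return match.group(2) + "_" + match.group(3)


# Made-up link, not taken from a real thread
print(local_file_name("//i.example.org/g/1432483620000.jpg"))  # g_1432483620000.jpg
print(local_file_name("not-a-thread-link"))                     # None
```

Because the first group is greedy, a deeper path such as `//host/a/b/c.jpg` would yield `b_c.jpg`: only the last directory segment ends up in the local name.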
@jorgeavaldez
Author

Little image downloader I wrote back when I was first learning Python. It was originally meant only for 4chan thread images. It consists of a single file, so I didn't think it deserved its own repository.
