@abhinavgupta
Created October 16, 2012 13:44
A simple script for faster (parallel) fetching with urllib2, with exception handling.
import threading, urllib2
import httplib
import Queue
import logging

checksLogger = logging.getLogger('checks')

def read_url(url, queue):
    # Fetch a single URL; on any error, log it and put nothing on the queue.
    try:
        data = urllib2.urlopen(url).read()
    except urllib2.HTTPError, e:
        checksLogger.error('HTTPError = ' + str(e.code))
        return
    except urllib2.URLError, e:
        checksLogger.error('URLError = ' + str(e.reason))
        return
    except httplib.HTTPException, e:
        checksLogger.error('HTTPException')
        return
    except Exception:
        import traceback
        checksLogger.error('generic exception: ' + traceback.format_exc())
        return
    print('Fetched %s bytes from %s' % (len(data), url))
    queue.put(data)

def fetch_parallel(list_of_urls):
    # Start one thread per URL, wait for all of them, and return the
    # queue of successfully fetched response bodies.
    result = Queue.Queue()
    threads = [threading.Thread(target=read_url, args=(url, result))
               for url in list_of_urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
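The script above is Python 2 only (`urllib2`, `Queue`, and the `except E, e` syntax). A minimal sketch of the same thread-per-URL pattern ported to Python 3 — module names differ, logic unchanged; the `checksLogger` name is carried over from the original but would be whatever logger your project uses:

```python
import logging
import queue
import threading
import urllib.error
import urllib.request

checksLogger = logging.getLogger('checks')

def read_url(url, result_queue):
    # Fetch a single URL; on any error, log it and put nothing on the queue.
    try:
        data = urllib.request.urlopen(url).read()
    except urllib.error.HTTPError as e:      # subclass of URLError: check first
        checksLogger.error('HTTPError = %s', e.code)
        return
    except urllib.error.URLError as e:
        checksLogger.error('URLError = %s', e.reason)
        return
    except Exception:
        checksLogger.exception('generic exception')
        return
    result_queue.put(data)

def fetch_parallel(list_of_urls):
    # One thread per URL; join them all, then hand back the result queue.
    result = queue.Queue()
    threads = [threading.Thread(target=read_url, args=(url, result))
               for url in list_of_urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

As in the original, results come back in completion order, not request order, and failed fetches are silently absent from the queue; if you need per-URL results, put `(url, data)` tuples on the queue instead.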