@greglinch
Last active March 10, 2017 22:26
Fill in a list of Bioguide IDs (from the Biographical Directory of the U.S. Congress) to download members' photos. I used wget instead of requests because of a TLS handshake issue. For getting the IDs, see https://gist.github.com/greglinch/5197267b6ff8fcb19192ba5443f1f71d
import os
import subprocess

# dimensions = '225x275'
dimensions = 'original'

## add a list of IDs here based on http://bioguide.congress.gov/biosearch/biosearch.asp
id_list = []

images_downloaded = 0

# expand '~' explicitly, because wget is called below without a shell
file_path = os.path.expanduser('~/Downloads/images/')
if not os.path.isdir(file_path):
    os.makedirs(file_path)

for bio_id in id_list:
    img_url = 'https://theunitedstates.io/images/congress/%s/%s.jpg' % (dimensions, bio_id)
    file_name = os.path.join(file_path, '%s.jpg' % bio_id)
    # a failed wget does not raise a Python exception; it returns a non-zero
    # exit status, so check that instead of wrapping the call in try/except
    status = subprocess.call(['wget', '-O', file_name, img_url])
    if status == 0:
        images_downloaded += 1
    else:
        print('Error:\t\t' + bio_id)

print('Images downloaded:\t\t' + str(images_downloaded))
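If the TLS handshake issue does not occur in your environment, the same loop can be sketched with only the Python 3 standard library instead of wget. This is a rough alternative and not the approach used above: `build_image_url`, `download_photos`, and the `opener` parameter are hypothetical names introduced here for illustration, and the URL pattern is copied from the script.

```python
import os
import shutil
import urllib.request

# URL pattern taken from the script above
BASE_URL = 'https://theunitedstates.io/images/congress/%s/%s.jpg'


def build_image_url(bio_id, dimensions='original'):
    """Return the photo URL for one Bioguide ID (hypothetical helper)."""
    return BASE_URL % (dimensions, bio_id)


def download_photos(id_list, dest_dir, dimensions='original',
                    opener=urllib.request.urlopen):
    """Download one photo per ID and return (saved_ids, failed_ids).

    `opener` is an assumption of this sketch: it defaults to urlopen but can
    be swapped for a stand-in so the loop can be tried without network access.
    """
    dest_dir = os.path.expanduser(dest_dir)
    os.makedirs(dest_dir, exist_ok=True)
    saved, failed = [], []
    for bio_id in id_list:
        target = os.path.join(dest_dir, '%s.jpg' % bio_id)
        try:
            # the response is opened first, so a failed request
            # does not leave an empty file behind
            with opener(build_image_url(bio_id, dimensions)) as response, \
                    open(target, 'wb') as out:
                shutil.copyfileobj(response, out)
            saved.append(bio_id)
        except OSError:
            # urllib.error.URLError and HTTPError are OSError subclasses
            failed.append(bio_id)
    return saved, failed
```

Calling `download_photos(id_list, '~/Downloads/images/')` would then return the IDs that were saved and the IDs that failed, which replaces the single counter with something that can be retried later.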