import os
import json
import argparse
import requests
import tinys3
'''
Modified version of nickjevershed's code
import os
# dimensions = '225x275'
dimensions = 'original'
## add a list of IDs here based on http://bioguide.congress.gov/biosearch/biosearch.asp
id_list = []
images_downloaded = 0
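The settings above suggest a loop that fetches one image per ID at the chosen size and counts the successes. The following is only a rough sketch of that idea. The base URL, the URL layout, and the helper names `build_image_url` and `download_images` are hypothetical, since the excerpt shows neither the image host nor the download code; the `fetch` callable is passed in so that any HTTP client can be used.

```python
import os
from typing import Callable, Iterable, Optional

# Hypothetical base URL: the excerpt does not show the real image host.
BASE_URL = "https://example.org/congress-images"


def build_image_url(bioguide_id: str, dimensions: str = "original") -> str:
    """Return the assumed URL for one member's photo at the given size."""
    return f"{BASE_URL}/{dimensions}/{bioguide_id}.jpg"


def download_images(
    id_list: Iterable[str],
    dimensions: str,
    fetch: Callable[[str], Optional[bytes]],
    out_dir: str,
) -> int:
    """Save each image that `fetch` returns and report how many were saved."""
    os.makedirs(out_dir, exist_ok=True)
    images_downloaded = 0
    for bioguide_id in id_list:
        data = fetch(build_image_url(bioguide_id, dimensions))
        if data is None:  # no image available for this ID, so skip it
            continue
        with open(os.path.join(out_dir, f"{bioguide_id}.jpg"), "wb") as fh:
            fh.write(data)
        images_downloaded += 1
    return images_downloaded
```

In practice `fetch` could wrap `requests.get` and return `response.content` on a 200 status and `None` otherwise; that wiring is left out here because the real endpoint is not shown.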
from bs4 import BeautifulSoup
'''
Prereqs:
- Go to the congressional bio directory http://bioguide.congress.gov/biosearch/biosearch.asp
- Search with the parameters you want
- Inspect element and copy the HTML
- Paste into a file and (optional?) wrap with <html></html> tags
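The steps above leave a saved HTML file of search results. A minimal sketch of pulling the member IDs out of such a file might look like the following. It uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra dependencies. It also assumes that each result link carries the ID in an `index=` query parameter, which the excerpt does not confirm. The names `BioguideLinkParser` and `extract_ids` are illustrative only.

```python
import re
from html.parser import HTMLParser

# Bioguide IDs are one capital letter followed by six digits, e.g. "A000001".
# Assumption: each result link carries the ID in an "index=" query parameter.
ID_PATTERN = re.compile(r"index=([A-Z]\d{6})")


class BioguideLinkParser(HTMLParser):
    """Collect Bioguide IDs from the href of every <a> tag, in page order."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = ID_PATTERN.search(href)
        # Keep the first occurrence of each ID and drop later duplicates.
        if match and match.group(1) not in self.ids:
            self.ids.append(match.group(1))


def extract_ids(html_text):
    """Return the de-duplicated list of IDs found in the pasted HTML."""
    parser = BioguideLinkParser()
    parser.feed(html_text)
    return parser.ids
```

The returned list could then be pasted into, or assigned to, an `id_list` variable. If the saved page uses a different link format, only `ID_PATTERN` should need to change.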
<html>
<head>
<title>This is my test page</title>
</head>
<body>
<h1>My article headline</h1>
<p>This is <em>my</em> article.</p>
<p>It's the <strong>greatest</strong> article ever written.</p>
</body>
</html>