pip install beautifulsoup4
<!-- saved from url=(0053)http://bioguide.congress.gov/biosearch/biosearch1.asp -->
<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Congressional Biographical Directory</title></head>
<body background="./43rd-congress_files/paper1.gif" text="#000000">
<table border="1" cellpadding="0" cellspacing="0" width="100%">
<tbody><tr>
<td width="100%" valign="TOP" bgcolor="#990000"><center><img src="./43rd-congress_files/topbanner.jpg" border="0"></center></td>
</tr></tbody></table>
"ADAMS, George Madison",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000035
"ALBERT, William Julian",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000074
"ALBRIGHT, Charles",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000077
"ALCORN, James Lusk",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000079
"ALLISON, William Boyd",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000160
"AMES, Adelbert",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000172
"ANTHONY, Henry Bowen",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000262
"ARCHER, Stevenson",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000274
"ARMSTRONG, Moses Kimball",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000283
"ARTHUR, William Evans",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000304
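from bs4 import BeautifulSoup
# Open in binary mode so Beautiful Soup can detect the page's declared charset.
with open("43rd-congress.html", "rb") as page:
    soup = BeautifulSoup(page, "html.parser")
# Remove the first link inside the first <p> before collecting the rest.
final_link = soup.p.a
final_link.decompose()
clean_list = []
links = soup.find_all('a')
for link in links:
    # The loop body is cut off in the source; appending the link text is an assumed completion.
    clean_list.append(link.get_text())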
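from bs4 import BeautifulSoup
# Open in binary mode so Beautiful Soup can detect the page's declared charset.
with open("43rd-congress.html", "rb") as page:
    soup = BeautifulSoup(page, "html.parser")
print(soup.prettify())  # Indented view of the parsed tree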
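from bs4 import BeautifulSoup
# Open in binary mode so Beautiful Soup can detect the page's declared charset.
with open("43rd-congress.html", "rb") as page:
    soup = BeautifulSoup(page, "html.parser")
# Remove the first link inside the first <p> before collecting the rest.
final_link = soup.p.a
final_link.decompose()
links = soup.find_all('a')
for link in links:
    # The loop body is cut off in the source; printing each link's text and href is an assumed completion.
    print(link.get_text(), link.get('href'))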
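from bs4 import BeautifulSoup
# Open in binary mode so Beautiful Soup can detect the page's declared charset.
with open("43rd-congress.html", "rb") as page:
    soup = BeautifulSoup(page, "html.parser")
# Remove the first link inside the first <p> before collecting the rest.
final_link = soup.p.a
final_link.decompose()
people = soup.find_all('a')  # Every remaining <a> tag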
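from bs4 import BeautifulSoup
# Open in binary mode so Beautiful Soup can detect the page's declared charset.
with open("43rd-congress.html", "rb") as page:
    soup = BeautifulSoup(page, "html.parser")
print(soup.get_text())  # Page text with all markup stripped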
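from bs4 import BeautifulSoup
# Open in binary mode so Beautiful Soup can detect the page's declared charset.
with open("43rd-congress.html", "rb") as page:
    soup = BeautifulSoup(page, "html.parser")
links = soup.find_all('a')
for link in links:
    print(link)  # Prints the whole <a> tag, markup included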
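from bs4 import BeautifulSoup
import csv
import sys
# Open in binary mode so Beautiful Soup can detect the page's declared charset.
with open("43rd-congress.html", "rb") as page:
    soup = BeautifulSoup(page, "html.parser")
# Remove the first link inside the first <p> before collecting the rest.
final_link = soup.p.a
final_link.decompose()
# The source cuts off after the `for` line; writing name/URL rows to stdout is an assumed completion.
writer = csv.writer(sys.stdout)
links = soup.find_all('a')
for link in links:
    writer.writerow([link.get_text(), link.get('href')])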
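The steps in these scripts can be sketched end to end on a small inline sample. This is only an illustration under stated assumptions: `SAMPLE_HTML` is a hypothetical stand-in for `43rd-congress.html` (its navigation link and table layout are invented), and `links_to_csv` is a helper name introduced here, not something from the original files. It parses the markup, removes the first link in the first paragraph, and writes rows in the same `name,URL` shape as the CSV listing earlier.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical stand-in for 43rd-congress.html: the first link inside the
# first <p> is an invented navigation link, the rest are member links.
SAMPLE_HTML = """
<html><body>
<p><a href="#">Navigation link</a></p>
<table>
<tr><td><a href="http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000035">ADAMS, George Madison</a></td></tr>
<tr><td><a href="http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000074">ALBERT, William Julian</a></td></tr>
</table>
</body></html>
"""


def links_to_csv(html, out):
    """Write one (name, URL) row per remaining <a> tag to the file-like `out`."""
    soup = BeautifulSoup(html, "html.parser")

    # Drop the first link in the first <p>, mirroring soup.p.a.decompose().
    final_link = soup.p.a
    final_link.decompose()

    # csv.writer quotes a field only when needed, e.g. names containing a comma.
    writer = csv.writer(out, lineterminator="\n")
    for link in soup.find_all("a"):
        writer.writerow([link.get_text(strip=True), link.get("href")])


buffer = io.StringIO()
links_to_csv(SAMPLE_HTML, buffer)
print(buffer.getvalue(), end="")
```

Writing to an in-memory `io.StringIO` keeps the sketch free of side effects; passing an opened file (with `newline=""`) in place of `buffer` would produce a CSV file instead.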