<section class="block">
  <h2>some header</h2>
  <p>Some text</p>
  <a href="#">Read More</a>
</section>
<section class="block">
  <h2>another header</h2>
  <p>Some text</p>
  <a href="#">Read More</a>

.block {
  width: 30%;
  float: left;
  background-color: gray;
  margin: 10px;
}
.navigation {
  background-color: green;
  width: 33%;

import nltk

# Read the raw text to analyze
with open('sample.txt', 'r') as f:
    sample = f.read()

# Split into sentences, tokenize each sentence, then tag parts of speech
sentences = nltk.sent_tokenize(sample)
tokenized_sentences = [nltk.word_tokenize(sentence) for sentence in sentences]
tagged_sentences = [nltk.pos_tag(sentence) for sentence in tokenized_sentences]
# batch_ne_chunk was renamed to ne_chunk_sents in NLTK 3
chunked_sentences = nltk.ne_chunk_sents(tagged_sentences, binary=True)

// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console

<?php echo search_form(array('show_advanced' => true, 'submit_value' => 'Lucky', 'form_attributes' => array('role' => 'search', 'class' => 'form'))); ?>

pip install beautifulsoup4

<!-- saved from url=(0053)http://bioguide.congress.gov/biosearch/biosearch1.asp -->
<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Congressional Biographical Directory</title></head>
<body background="./43rd-congress_files/paper1.gif" text="#000000">
<table border="1" cellpadding="0" cellspacing="0" width="100%">
<tbody><tr>
<td width="100%" valign="TOP" bgcolor="#990000"><center><img src="./43rd-congress_files/topbanner.jpg" border="0"></center></td>
</tr></tbody></table>

"ADAMS, George Madison",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000035
"ALBERT, William Julian",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000074
"ALBRIGHT, Charles",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000077
"ALCORN, James Lusk",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000079
"ALLISON, William Boyd",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000160
"AMES, Adelbert",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000172
"ANTHONY, Henry Bowen",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000262
"ARCHER, Stevenson",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000274
"ARMSTRONG, Moses Kimball",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000283
"ARTHUR, William Evans",http://bioguide.congress.gov/scripts/biodisplay.pl?index=A000304

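from bs4 import BeautifulSoup

# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(open("43rd-congress.html"), "html.parser")
# Drop the first link inside the first <p>, which is not part of the list
final_link = soup.p.a
final_link.decompose()
clean_list = []
# Collect every remaining anchor tag
links = soup.find_all('a')
for link in links: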