@JeffersGlass
Created November 22, 2020 17:53
import requests
from bs4 import BeautifulSoup
url = "http://shakespeare.mit.edu/henryv/full.html"
def isNamedLine(tag):
    # Filter for find_all: keep only tags that carry a 'name' attribute.
    return tag.has_attr('name')
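response = requests.get(url, timeout=30)
response.raise_for_status()  # fail on an HTTP error instead of parsing an error page
websiteText = response.text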
soup = BeautifulSoup(websiteText, 'lxml')
lines = soup.find_all(isNamedLine)
print(lines)
# Write the text of each matched tag on its own line.
with open('henryV.txt', "w") as outfile:
    for line in lines:
        outfile.write(line.text + "\n")
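
The script above selects every tag that has a `name` attribute and writes out its text. The sketch below tries the same idea offline with only the standard library's `html.parser`, so it runs without `requests`, `bs4`, or `lxml`. It is a swapped-in technique, not the gist's method. `SAMPLE_HTML`, `NamedTextCollector`, and `named_lines` are hypothetical names, and the fragment only imitates the kind of markup the script expects. Unlike `find_all`, this simplified collector does not separately report a named element nested inside another named element.

```python
from html.parser import HTMLParser

# Hypothetical fragment (an assumption): named anchors wrap the speaker and each line.
SAMPLE_HTML = """
<h3>ACT I</h3>
<a name="speech1"><b>Chorus</b></a>
<blockquote>
<a name="1.0.1">O for a Muse of fire, that would ascend</a><br>
<a name="1.0.2">The brightest heaven of invention,</a><br>
</blockquote>
<a href="index.html">Back</a>
"""


class NamedTextCollector(HTMLParser):
    """Collect the text inside each outermost element that has a 'name' attribute."""

    def __init__(self):
        super().__init__()
        self.lines = []        # text of each finished named element
        self._tag = None       # tag name of the element being captured, if any
        self._depth = 0        # nesting depth of same-named tags while capturing
        self._buffer = []      # text pieces seen inside the current element

    def handle_starttag(self, tag, attrs):
        if self._tag is None:
            # Start capturing when a tag carries a 'name' attribute.
            if any(key == "name" for key, _ in attrs):
                self._tag, self._depth, self._buffer = tag, 1, []
        elif tag == self._tag:
            self._depth += 1

    def handle_endtag(self, tag):
        if self._tag is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                self.lines.append("".join(self._buffer))
                self._tag = None

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)


def named_lines(html):
    """Return the text of every named element found in the given HTML string."""
    parser = NamedTextCollector()
    parser.feed(html)
    parser.close()
    return parser.lines


if __name__ == "__main__":
    for text in named_lines(SAMPLE_HTML):
        print(text)
```

Only the three named anchors contribute text here. The heading and the `href`-only link are skipped, which is the same effect `isNamedLine` has when passed to `find_all`.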