@DewofyourYouth
Created February 15, 2021 22:20
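import urllib.request

from bs4 import BeautifulSoup

# Page numbers to fetch from mechon-mamre.org. range() excludes `end`,
# so the loop covers pt2801.htm through pt2831.htm.
start = 2801
end = 2832

for i in range(start, end):
    # Download one page.
    with urllib.request.urlopen(f'https://www.mechon-mamre.org/p/pt/pt{i}.htm') as response:
        html = response.read()
    soup = BeautifulSoup(html, 'html.parser')
    # The rows of interest are in the second table on the page.
    tb = soup.find_all('table')
    rows = tb[1].find_all('tr')
    for r in rows:
        # The second cell of each row holds the text to keep.
        p = r.find_all('td')
        line = p[1].extract()
        # Append the third child node of that cell to the output file.
        with open("proverbs.txt", "a") as myfile:
            myfile.write(line.contents[2].strip() + '\n')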
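
The loop bounds in the script above are easy to misread, because `range` excludes its end value. A minimal sketch that lists the pages the script would request, without touching the network. The names `BASE_URL` and `chapter_urls` are hypothetical and not part of the original; only the URL pattern and the 2801/2832 bounds come from the script. That these pages are the 31 chapters of Proverbs is an assumption based on the output filename.

```python
# Hypothetical helper names; the URL pattern and bounds are taken from the script.
BASE_URL = "https://www.mechon-mamre.org/p/pt/pt{page}.htm"
START, END = 2801, 2832  # END is exclusive, as in range(start, end)


def chapter_urls(start=START, end=END):
    """Return the page URLs the scraper would visit, in order."""
    return [BASE_URL.format(page=i) for i in range(start, end)]


urls = chapter_urls()
print(len(urls))   # number of pages requested
print(urls[0])     # first page
print(urls[-1])    # last page
```

Printing the list before running the scraper is a cheap way to confirm that the last page requested is pt2831.htm rather than pt2832.htm.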