@yanniskatsaros
Created May 7, 2020 19:32
Simple requests and beautiful soup example
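import sys

import requests
from bs4 import BeautifulSoup


def download_html(url: str, filename: str) -> None:
    """Download the page at `url` and save its prettified HTML to `filename`."""
    r = requests.get(url)
    r.raise_for_status()  # raise requests.HTTPError on a 4xx/5xx response

    # Parse the markup and re-indent it with one tag or string per line
    soup = BeautifulSoup(r.text, 'html.parser')
    html = soup.prettify()

    # Write as UTF-8 explicitly so the result does not depend on the platform default
    with open(filename, 'w', encoding='utf-8') as f:
        f.write(html)


if __name__ == '__main__':
    n_args = len(sys.argv) - 1
    # Expect a URL plus an optional output filename
    if (n_args < 1) or (n_args > 2):
        raise ValueError('Invalid arguments, use: python scraper.py url output.html')

    url = sys.argv[1]
    if n_args == 1:
        download_html(url, 'website.html')
    else:
        filename = sys.argv[2]
        download_html(url, filename)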
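
The argument handling in the script counts `sys.argv` entries by hand. As a possible alternative, here is a minimal sketch that uses the standard-library `argparse` module instead. The helper name `parse_args` and the help strings are assumptions made for illustration and are not part of the original gist; the default filename `website.html` is the one the script already uses.

```python
import argparse


def parse_args(argv=None):
    """Parse the URL and optional output filename (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog='scraper.py',
        description='Download a page and save its prettified HTML.',
    )
    parser.add_argument('url', help='address of the page to download')
    # nargs='?' makes this positional optional; the default mirrors the script
    parser.add_argument('filename', nargs='?', default='website.html',
                        help='output file (default: website.html)')
    # argv=None makes argparse read sys.argv[1:]
    return parser.parse_args(argv)


if __name__ == '__main__':
    # Passing an explicit list keeps the demo independent of the real command line
    args = parse_args(['https://example.com'])
    print(args.url, args.filename)
```

With this approach, a missing URL or an extra argument produces a usage message and a non-zero exit status rather than a `ValueError` traceback, and `python scraper.py --help` works without extra code.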