@lmeulen
Created November 14, 2021 09:58
fia_doc_parser
import requests
from bs4 import BeautifulSoup
from urllib.parse import quote

BASE_LINK = "https://www.fia.com"
DOCS_URL = BASE_LINK + "/documents/season/season-2021-1108/championships/fia-formula-one-world-championship-14"

# Fetch the documents page and fail early on an HTTP error
content = requests.get(DOCS_URL)
content.raise_for_status()
soup = BeautifulSoup(content.text, features='lxml')

# Each <li> in the document list is treated as one document entry
docs = soup.find('ul', {'class': 'document-row-wrapper'}).find_all('li')
for doc in docs:
    # Percent-encode the href (it may contain spaces) and make it absolute
    link = BASE_LINK + quote(doc.find('a')['href'].strip())
    desc = doc.find('div', {'class': 'title'}).text.strip()
    print(desc, link)
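
The `quote` call in the loop is what turns a raw `href` into a usable URL. A minimal sketch of that step in isolation follows; `build_link` is a hypothetical helper name and the example path is made up for illustration, not taken from the FIA page.

```python
from urllib.parse import quote

BASE_LINK = "https://www.fia.com"


def build_link(href: str) -> str:
    """Strip surrounding whitespace, percent-encode the path, prefix the site root."""
    # quote() leaves "/" untouched by default and encodes spaces as %20
    return BASE_LINK + quote(href.strip())


# Hypothetical href containing spaces (illustrative only)
example = build_link(" /sites/default/files/decision document - example.pdf ")
print(example)
```

Note that `quote` also encodes `%`, so an `href` that is already percent-encoded would be encoded a second time; this sketch assumes the page serves unencoded paths.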