@aleenprd
Last active October 24, 2022 18:11
scraper_base_class
from typing import Optional

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager


class ScraperException(Exception):
    """Starting point for Scraper exceptions."""


class ImdbScraperException(ScraperException):
    """Starting point for IMDb Scraper exceptions."""


class Scraper:
    """Class meant to be the parent of various other scrapers.

    Attributes:
        chromedriver (webdriver.Chrome): a Chrome webdriver for Selenium.

    Methods:
        make_soup_with_selenium
        @staticmethod fetch_el_if_available
    """

    def __init__(self):
        # Download (or reuse a cached) chromedriver binary and start Chrome with it.
        driver_service = Service(ChromeDriverManager().install())
        self.chromedriver = webdriver.Chrome(service=driver_service)

    @staticmethod
    def fetch_el_if_available(
        soup: BeautifulSoup, element_type: str, class_type: str
    ) -> Optional[str]:
        """Returns element text if found, otherwise returns None.

        Args:
            soup (BeautifulSoup): a bs4 soup.
            element_type (str): HTML type e.g. 'div'.
            class_type (str): the class of the desired element.

        Returns:
            element (str | None): text inside the element, or None if not found.
        """
        element = soup.find(element_type, class_type)
        if element is not None:
            element = element.text
        return element
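
The `fetch_el_if_available` helper can be tried without launching Chrome. The sketch below is a standalone copy of the method (an assumption made so the example runs on its own, not part of the gist), and the HTML snippet is made up for illustration. It parses the markup with Python's built-in `html.parser` and shows both outcomes: text for a matching element, `None` for a missing one.

```python
from typing import Optional

from bs4 import BeautifulSoup


def fetch_el_if_available(
    soup: BeautifulSoup, element_type: str, class_type: str
) -> Optional[str]:
    """Standalone copy of Scraper.fetch_el_if_available (no webdriver needed)."""
    # A string passed as the second argument to find() is matched against the CSS class.
    element = soup.find(element_type, class_type)
    return element.text if element is not None else None


# Hypothetical markup, invented for this example.
html = '<div class="title">Alien</div><span class="year">1979</span>'
soup = BeautifulSoup(html, "html.parser")

title = fetch_el_if_available(soup, "div", "title")      # element exists
missing = fetch_el_if_available(soup, "div", "rating")   # no such element

print(title)
print(missing)
```

Returning `None` instead of raising lets a scraper record a missing field and keep going, which is useful when pages vary in which elements they contain.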