Skip to content

Instantly share code, notes, and snippets.

@yogendratamang48
Last active September 7, 2023 16:00
Show Gist options
  • Save yogendratamang48/9d198673a29beee5412524937de3c87a to your computer and use it in GitHub Desktop.
Save yogendratamang48/9d198673a29beee5412524937de3c87a to your computer and use it in GitHub Desktop.
Example Script
import requests
from lxml import html
LINKS = [
"https://livingcost.org/cost/united-states/ny",
"https://livingcost.org/cost/netherlands/amsterdam"
]
EXTRACTOR_XPATH = '//th[contains(text(), "Total with rent")]/following-sibling::td[1]/div/span/text()'
def parse_page(url):
resp = requests.get(url)
page = html.fromstring(resp.text)
cost_of_living = page.xpath(EXTRACTOR_XPATH)
if cost_of_living:
print(f"Cost of living: {cost_of_living[0]}")
else:
print(f"Parse error: {url}")
for link in LINKS:
parse_page(link)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment