@jkokatjuhha
Last active September 2, 2019 09:36
from bs4 import BeautifulSoup
import requests
page_link = 'https://www.website_to_crawl.com'
# fetch the page content from the URL (give up after 5 seconds)
page_response = requests.get(page_link, timeout=5)
# parse the HTML
page_content = BeautifulSoup(page_response.content, "html.parser")
# extract all HTML elements whose class is 'main_price'
prices = page_content.find_all(class_='main_price')
# prices is a list of the matching elements, for example:
# [<div class="main_price">Price: $66.68</div>,
#  <div class="main_price">Price: $56.68</div>]
# you can also restrict the search to a specific tag that carries the class
prices = page_content.find_all('div', attrs={'class': 'main_price'})
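
The elements returned above still wrap the price in text such as "Price: $66.68". A minimal sketch of turning that text into numbers follows, assuming every price is a dollar amount in the format shown in the comments. The names `PRICE_PATTERN`, `parse_price`, and `sample_texts` are hypothetical helpers introduced for illustration and are not part of the gist.

```python
import re
from decimal import Decimal

# Matches a dollar amount such as "$66.68" or "$1,234.50" (assumed format).
PRICE_PATTERN = re.compile(r"\$\s*(\d[\d,]*(?:\.\d+)?)")


def parse_price(text):
    """Return the first dollar amount in text as a Decimal, or None if absent."""
    match = PRICE_PATTERN.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return Decimal(match.group(1).replace(",", ""))


# Sample strings mirroring the element text shown in the comments above.
sample_texts = ["Price: $66.68", "Price: $56.68"]
amounts = [parse_price(text) for text in sample_texts]
print(amounts)  # [Decimal('66.68'), Decimal('56.68')]
```

With the `prices` list from the snippet, the same helper could be applied as `[parse_price(p.get_text()) for p in prices]`. `Decimal` is used here instead of `float` so that currency values are not subject to binary rounding.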
@fALKENdk

Thanks @jkokatjuhha :)

@hmarkopcuoglu

Thanks

@thioseck

thioseck commented Sep 2, 2019

Thanks a lot!
