Last active May 13, 2021 10:46
Gist: dimitryzub/688d4d6990d8a8f6e50764efb66e69c5
import requests, lxml, urllib.parse
from bs4 import BeautifulSoup

# Add a User-Agent header (the default from the requests library is
# 'python-requests', which Google is likely to block)
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3538.102 Safari/537.36 Edge/18.19582"
}

# Search query
params = {'q': 'coffee buy'}

# Get the HTML response (the original gist left the URL blank; Google Search is assumed here)
html = requests.get('https://www.google.com/search', headers=headers, params=params).text

# Parse the HTML with BeautifulSoup
soup = BeautifulSoup(html, 'lxml')

# Look for the container that holds all the necessary data;
# find_all() is the idiomatic name (findAll() is a legacy alias)
for container in soup.find_all('div', class_='RnJeZd top pla-unit-title'):
    # Scrape the title
    title = container.text

    # Base of the link to join afterwards (assumed: Google's domain)
    start_of_link = 'https://www.google.com'

    # Scrape the relative end of the link to join afterwards
    end_of_link = container.find('a')['href']

    # Combine (join) the relative and absolute URLs
    ad_link = urllib.parse.urljoin(start_of_link, end_of_link)

    # Print each title and link on a new line
    print(title)
    print(ad_link)
# Output
Jot Ultra Coffee Triple | Ultra Concentrated
MUD\WTR | A Healthier Coffee Alternative, 30 servings
Jot Ultra Coffee Double | 2 bottles = 28 cups
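The URL-joining step in the script can be seen in isolation: `urljoin` resolves a relative `href` scraped from an ad container against a base URL, producing an absolute link. The `href` value below is a shortened, hypothetical example, not one taken from live Google results.

```python
from urllib.parse import urljoin

# A relative ad href like the ones scraped above (hypothetical, shortened example)
relative_href = '/aclk?sa=l&ai=abc123'

# urljoin resolves the relative path against the base URL's scheme and host
full_link = urljoin('https://www.google.com', relative_href)
print(full_link)  # https://www.google.com/aclk?sa=l&ai=abc123
```

If the `href` were already absolute (e.g. `https://example.com/...`), `urljoin` would return it unchanged, so the join is safe either way.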