@kunalrustagi08
Created May 6, 2020 12:22
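import random
import time

import requests
from bs4 import BeautifulSoup

# books.toscrape.com paginates its catalogue as page-1.html .. page-50.html
pages = range(1, 51)

book_title = []
star_rating = []
product_price = []

start = time.time()

for page in pages:
    # Wait a random 1-10 seconds between requests to avoid hammering the site
    time.sleep(random.randint(1, 10))

    url = 'http://books.toscrape.com/catalogue/page-' + str(page) + '.html'
    results = requests.get(url)
    results.raise_for_status()  # stop early on an HTTP error instead of parsing an error page
    soup = BeautifulSoup(results.text, 'html.parser')

    # Each book on a listing page sits in one <li> with this exact class string
    book_div = soup.find_all('li', class_='col-xs-6 col-sm-4 col-md-3 col-lg-3')

    for container in book_div:
        # The full title is stored in the link's title attribute
        title = container.article.h3.a['title']
        book_title.append(title)

        # The first <p> inside the product_price div holds the price text
        price = container.article.find('div', class_='product_price').p.text
        product_price.append(price)

        # The first <p> in the article has classes like ['star-rating', 'Three'];
        # the last class is the rating word
        rating = container.article.p['class'][-1]
        star_rating.append(rating)

end = time.time()
print('It took', (end - start), 'seconds')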
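
The script collects prices and ratings as raw strings (for example a price with a currency symbol and a rating word such as 'Three'). The following is a minimal sketch of how those strings could be converted to numbers for analysis. The helper names `parse_price` and `parse_rating`, the `RATING_WORDS` mapping, and the sample values are assumptions for illustration and are not part of the original gist. Depending on how the response is decoded, the currency symbol may arrive with a stray leading character, so the sketch extracts the number with a regular expression rather than stripping a fixed prefix.

```python
import re

# Assumed mapping: ratings appear as the words One..Five in the class attribute.
RATING_WORDS = {'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}


def parse_price(text):
    """Return the first number found in a price string as a float."""
    match = re.search(r'\d+(?:\.\d+)?', text)
    if match is None:
        raise ValueError(f'no number found in {text!r}')
    return float(match.group())


def parse_rating(word):
    """Map a rating word to an integer, or None if the word is not recognised."""
    return RATING_WORDS.get(word)


# Hypothetical sample values standing in for product_price and star_rating.
sample_prices = ['£51.77', 'Â£53.74']
sample_ratings = ['Three', 'One']

prices = [parse_price(p) for p in sample_prices]
ratings = [parse_rating(r) for r in sample_ratings]
print(prices, ratings)
```

In the script itself, the same two comprehensions could be run over `product_price` and `star_rating` once the loop has finished.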