Created June 3, 2020 18:28
# Scrape the title, rating, price, and review count from the first 10 URLs.
# Assumes final_list (the list of URLs to visit) was built in an earlier step.
from random import randint
from time import sleep

from bs4 import BeautifulSoup
from selenium import webdriver

data = []
for url in final_list[:10]:
    driver2 = webdriver.Chrome()
    driver2.get(url)
    sleep(randint(10, 20))  # random wait so the page can finish loading

    soup = BeautifulSoup(driver2.page_source, 'html.parser')
    my_table2 = soup.find_all(class_=['title-2', 'rating-score body-3'])
    review = soup.find_all(class_='reviews')[-1]

    # The last 'price' span is the one we want; fall back to the full
    # result list if no price is found on the page.
    try:
        price = soup.find_all('span', attrs={'class': 'price'})[-1]
    except IndexError:
        price = soup.find_all('span', attrs={'class': 'price'})

    for tag in my_table2:
        data.append(tag.text.strip())
    for p in price:
        data.append(p)
    for r in review:
        data.append(r)

    driver2.quit()  # close the browser before moving on to the next URL
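The parse-and-append pattern above can be tried without launching a browser by feeding BeautifulSoup a static HTML string. This is a minimal sketch: the markup and class names below are hypothetical stand-ins for the real page, and simple single-word classes are used in place of the gist's exact selectors.

from bs4 import BeautifulSoup

html = """
<div>
  <h1 class="title-2">Hotel Example</h1>
  <span class="rating-score">8.7</span>
  <span class="price">$120</span>
  <span class="price">$99</span>
  <div class="reviews">1,234 reviews</div>
</div>
"""

data = []
soup = BeautifulSoup(html, "html.parser")

# Title and rating, matched by class (same idea as my_table2 above)
for tag in soup.find_all(class_=["title-2", "rating-score"]):
    data.append(tag.text.strip())

# Take the last listed price; fall back to None if no price is present
try:
    price = soup.find_all("span", class_="price")[-1]
    data.append(price.text.strip())
except IndexError:
    data.append(None)

# Review count
review = soup.find_all(class_="reviews")[-1]
data.append(review.text.strip())

print(data)  # ['Hotel Example', '8.7', '$99', '1,234 reviews']

Once the pattern works on a static snippet like this, the same extraction code can be dropped into the Selenium loop, with driver2.page_source as the input.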
data starts out as an empty list; inside the for loop it gets filled by the three data.append(...) calls.
All the gists are run separately, one after the other, since they each showcase a different way of scraping the data. The instructions are described here, in case you need them: https://towardsdatascience.com/scraping-multiple-urls-with-python-tutorial-2b74432d085f
I hope this helped!