Skip to content

Instantly share code, notes, and snippets.

@lowweihong
Last active December 24, 2019 21:55
Show Gist options
  • Save lowweihong/d0f277bff4f2ba99eb99dc1c909a3047 to your computer and use it in GitHub Desktop.
Save lowweihong/d0f277bff4f2ba99eb99dc1c909a3047 to your computer and use it in GitHub Desktop.
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import pandas as pd
import requests
from bs4 import BeautifulSoup
# Access credentials for google sheet and access the google sheet
scope = ['https://spreadsheets.google.com/feeds',
'https://www.googleapis.com/auth/drive']
# Copy your path to your credential JSON file.
PATH_TO_CREDENTIAL = '/Users/bla/credential/data science-4936da0d6b1f.json'
# Initiate your credential
credentials = ServiceAccountCredentials.from_json_keyfile_name(PATH_TO_CREDENTIAL, scope)
# Authorize your connection to your google sheet
gc = gspread.authorize(credentials)
# Access google sheet
sht1 = gc.open_by_key('11DU3qbPoXlbUXJ-o_zerndJdqQ8pNBMsAZpeBfbwyc0')
# Get the data from the google sheet
df = pd.DataFrame(sht1.worksheet("Sheet1").get_all_values()[1:])
# Rename column name
df.columns =df.iloc[0]
df.drop(df.index[0], inplace=True)
# Make a request to the competitor's link
header = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36",
"upgrade-insecure-requests": "1",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
"accept-encoding": "gzip, deflate, br",
"accept-language": "en-GB,en-US;q=0.9,en;q=0.8",
"cache-control": "max-age=0"}
res = requests.get(df['Competitor Link'].iloc[0], headers=header)
# Get the price from the response
soup = BeautifulSoup(res.text)
price = soup.find("span", {"id": "priceblock_ourprice"}).text.replace("$","")
# Update the crawled price to the google sheet with:
# sheet name: Sheet1
# cell name: D3
sht1.worksheet("Sheet1").update_acell('D3', price)
@thomas2907
Copy link

Hello,
Nice script and it works, thank you.
But how do I use it if I have more prices to watch, so more items in column A and B
I tried something like this but didn't work properly (it only copied the resuult of D3 into D4 and D5):

`# Select range
cell_list = sht1.worksheet("Sheet1").range('D3:D5')

for cell in cell_list:
cell.value = price

Update in batch

sht1.worksheet("Sheet1").update_cells(cell_list)`

Any ideas?

@lowweihong
Copy link
Author

Hi there,

Great to hear it. :)

May I ask what is the data type of your price variable? Cause I am suspecting in the original for loop, the price is a constant, and it just copies the same value to each cell, resulting in the case you describe.

Maybe you want to change the for loop from

for cell in cell_list:
    cell.value = price

To something like:

for i, cell in enumerate(cell_list):
    cell.value = price[i]

And make sure price is an array having same length as cell_list.

Happy coding:)

@thomas2907
Copy link

Hello,
Thank you for your quick reply.
Your suggestion works a little bit better :)
I'm not sure what the data type is , nor if the price array is having the same length as the cell list.
How would you expand your own script for more items?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment