@natzir
Last active November 24, 2023 14:47
jcchouinard commented Nov 7, 2019

Awesome notebook!

For those who want to automate the process every day, you can also use this bit of code to log in with a client_secrets.json file and cached credentials instead of pasting an OAuth code on each run.

To download your JSON file, go to console.developers.google.com > Credentials and click the download arrow. Rename the file client_secrets.json and run the code below.


import pandas as pd
import datetime
import httplib2
from googleapiclient.discovery import build  # 'apiclient' is the deprecated alias
import qgrid
from collections import defaultdict
from dateutil import relativedelta
import requests
from bs4 import BeautifulSoup
import argparse
from oauth2client import client
from oauth2client import file
from oauth2client import tools

site = 'https://www.example.com'

SCOPES = ['https://www.googleapis.com/auth/webmasters.readonly']
DISCOVERY_URI = ('https://www.googleapis.com/discovery/v1/apis/webmasters/v3/rest')  # Search Console, not Custom Search

CLIENT_SECRETS_PATH = r'C:\Users\YOUR_PATH\client_secrets.json' # Path to client_secrets.json file.

parser = argparse.ArgumentParser(
    formatter_class=argparse.RawDescriptionHelpFormatter,
    parents=[tools.argparser])
flags = parser.parse_args([])

flow = client.flow_from_clientsecrets(
    CLIENT_SECRETS_PATH, scope=SCOPES,
    message=tools.message_if_missing(CLIENT_SECRETS_PATH))

storage = file.Storage(r'C:\Users\YOUR_PATH\searchconsolereporting.dat')
credentials = storage.get()

if credentials is None or credentials.invalid:
  credentials = tools.run_flow(flow, storage, flags)
http = credentials.authorize(http=httplib2.Http())

webmasters_service = build('webmasters', 'v3', http=http)

end_date = datetime.date.today()
start_date = end_date - relativedelta.relativedelta(months=3)

def execute_request(service, property_uri, request):
    return service.searchanalytics().query(siteUrl=property_uri, body=request).execute()

"""
Complete with the code above
"""

To automate, export the results to MySQL or CSV and use Windows Task Scheduler to run the script every day.
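The CSV export can be sketched like this, assuming a `response` in the shape `searchanalytics().query()` returns (the sample row and the file name `daily_gsc_export.csv` are hypothetical):

```python
import pandas as pd

# Hypothetical response in the shape returned by searchanalytics().query():
# each row carries 'keys' (one entry per requested dimension) plus metrics.
response = {
    'rows': [
        {'keys': ['https://www.example.com/', 'example query'],
         'clicks': 12, 'impressions': 340, 'ctr': 0.035, 'position': 8.2},
    ]
}

rows = []
for r in response.get('rows', []):
    page, query = r['keys']  # order matches the 'dimensions' list in the request
    rows.append({'page': page, 'query': query,
                 'clicks': r['clicks'], 'impressions': r['impressions'],
                 'ctr': r['ctr'], 'position': r['position']})

df = pd.DataFrame(rows)
df.to_csv('daily_gsc_export.csv', index=False)  # file name is arbitrary
```

On Windows the daily run can then be registered with something like `schtasks /Create /SC DAILY /TN GSCExport /TR "python gsc_export.py"` (script name hypothetical).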

@anasshabrah commented
@natzir thanks for the script, but I have an issue: the last step runs endlessly.

natzir commented Jan 26, 2022

@anasshabrah you have to wait; the script is crawling all the URLs and doing the extraction.
