@mjdargen
Last active September 16, 2023 13:10
Pull Google Sheet Data into Python

Shows how to pull data from a Google Sheet into Python without using an API.

Publishing Google Sheet

  1. Go to "File -> Publish to the web"
  2. This will bring up the window shown below. Select the "Link" tab.
  3. Select the specific page that you want to publish.
  4. Then choose to publish it as "Comma-separated values (.csv)".
  5. Once you are ready, click "Publish". A pop-up will appear asking if you are sure; click "OK".
  6. The window will now show the link to the .csv file.
  7. You can open this link in a browser and it will download the .csv file so you can take a look.
  8. Here's a link to my .csv file: https://docs.google.com/spreadsheets/d/e/2PACX-1vTs5kcL7JagsiyfeL7HvqYy-dWQv12Kl-IXvfjzeKw1YA1Luj0v8jg_P_NnSHGgvb4v8z1skOccVK4z/pub?gid=1787645985&single=true&output=csv
  9. Take a look at what a .csv file looks like. It is laid out as a table, just as it was in the Google Sheet: commas separate the columns and newlines separate the rows.
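To get a feel for the format before writing the full script, here is a minimal sketch of parsing .csv text with Python's built-in csv module. The sample rows below are hypothetical stand-ins for your sheet's contents; csv.reader is used instead of splitting on commas by hand, since it correctly handles quoted fields that contain commas.

```python
import csv
import io

# Hypothetical stand-in for the downloaded .csv text.
csv_text = "Timestamp,Name,Color\n9/16/2023,Ada,blue\n9/16/2023,Grace,green\n"

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)  # first row holds the column names
rows = [dict(zip(header, row)) for row in reader]
print(rows[0]["Name"])  # -> Ada
```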

(Screenshot: the "Publish to the web" dialog)

Retrieving Sheet Data

Use the script below, changing the URL to the link to your own .csv file, to pull in the data.

import filecmp
import os
import shutil

import requests

DIR_PATH = os.path.dirname(os.path.realpath(__file__))
URL = "https://docs.google.com/spreadsheets/d/e/2PACX-1vTs5kcL7JagsiyfeL7HvqYy-dWQv12Kl-IXvfjzeKw1YA1Luj0v8jg_P_NnSHGgvb4v8z1skOccVK4z/pub?gid=1787645985&single=true&output=csv"

def main():
    print("Fetching form data...")
    if fetch_form_data(URL):
        print("New form submitted! Proceeding... ")
    else:
        print("No new forms submitted. Exiting...")
    print("Done!")


# retrieve Google Form data as CSV
def fetch_form_data(url):
    # request and download csv
    r = requests.get(url)
    r.raise_for_status()  # stop early on a bad HTTP response
    decoded_content = r.content.decode('utf-8').splitlines()

    # write the downloaded rows to a new file, dropping the header row
    decoded_content = decoded_content[1:]  # remove header
    with open(f"{DIR_PATH}/new.csv", "w") as f:
        for c in decoded_content:
            f.write(c + '\n')

    # first run: create an empty old.csv so the comparison below works
    if not os.path.exists(f"{DIR_PATH}/old.csv"):
        open(f"{DIR_PATH}/old.csv", "w").close()

    # shallow=False compares actual contents, not just os.stat() signatures
    if filecmp.cmp(f"{DIR_PATH}/old.csv", f"{DIR_PATH}/new.csv", shallow=False):
        # files match, no new forms submitted
        return False
    # files differ: remember this download so the next run compares against it
    shutil.copyfile(f"{DIR_PATH}/new.csv", f"{DIR_PATH}/old.csv")
    return True


if __name__ == '__main__':
    main()
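Once fetch_form_data() reports new data, you still need to figure out which rows are new. One way, sketched below under the assumption that rows are only ever appended to the sheet, is to diff the two files line by line; new_rows() is a hypothetical helper, not part of the script above.

```python
# Hypothetical helper: return lines present in the new download
# but absent from the previous one, preserving their order.
def new_rows(old_lines, new_lines):
    seen = set(old_lines)
    return [line for line in new_lines if line not in seen]

old = ["9/16/2023,Ada,blue"]
new = ["9/16/2023,Ada,blue", "9/17/2023,Grace,green"]
print(new_rows(old, new))  # -> ['9/17/2023,Grace,green']
```

In practice you would read old_lines and new_lines from old.csv and new.csv with readlines() before calling the helper. Note this approach misses edits to existing rows; if submissions can be modified, compare full row sets instead.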