@Bench-amblee
Last active November 13, 2021 15:35
import pandas as pd
import json
import requests
import pandas_gbq  # provides to_gbq() for writing DataFrames to BigQuery
import gcsfs
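
'''
Function 1: the HTTP entry point. It accepts any incoming HTTP request and runs the data pull,
which is what lets you schedule an automatic refresh or test the function locally.
'''
def validate_http(request):
    # silent=True returns None instead of raising when the body is not JSON.
    request_json = request.get_json(silent=True)
    # Every branch runs the same pull; they are kept separate so you can
    # handle query arguments or a JSON body differently if you need to.
    if request.args:
        get_api_data()
        return 'Data pull complete'
    elif request_json:
        get_api_data()
        return 'Data pull complete'
    else:
        get_api_data()
        return 'Data pull complete'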

'''
Function 2: this is where your own code goes. You can write it however you want,
as long as the output is a pandas DataFrame. Here's an example:
'''
def get_api_data():
    url = 'https://www.apidata.com'  # placeholder: replace with your API endpoint
    r = requests.get(url)
    data = r.json()
    df = pd.DataFrame.from_dict(data)
    # This is the only extra line your code needs: pass a table name and your DataFrame.
    bq_load('TABLE NAME', df)

'''
Function 3: this function writes your pandas DataFrame to a BigQuery table.
Set the project and dataset names in the variables below; the table name
comes from the `key` argument. An existing table with the same name is replaced.
'''
def bq_load(key, value):
    project_name = 'YOUR PROJECT NAME'
    dataset_name = 'YOUR DATASET NAME'
    table_name = key
    # pandas_gbq.to_gbq is the supported replacement for the deprecated DataFrame.to_gbq.
    pandas_gbq.to_gbq(
        value,
        destination_table='{}.{}'.format(dataset_name, table_name),
        project_id=project_name,
        if_exists='replace',
    )
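
Function 1 mentions testing locally. Here is a minimal sketch of one way to do that without touching the API or BigQuery. It assumes the entry point receives a Flask-style request object exposing `args` and `get_json()`. `FakeRequest`, the `calls` list, and the stubbed `get_api_data` are hypothetical stand-ins for illustration, not part of the script above.

```python
class FakeRequest:
    """Hypothetical stand-in for the request object passed to validate_http."""

    def __init__(self, args=None, json_body=None):
        self.args = args or {}
        self._json_body = json_body

    def get_json(self, silent=False):
        # Mirrors the one method validate_http relies on.
        return self._json_body


calls = []


def get_api_data():
    # Stub: record the call instead of hitting the API and BigQuery.
    calls.append('pulled')


def validate_http(request):
    # Simplified copy of the entry point: any request triggers the pull.
    request_json = request.get_json(silent=True)
    get_api_data()
    return 'Data pull complete'


result = validate_http(FakeRequest(args={'refresh': '1'}))
print(result)
```

Once the stub behaves as expected, swap the real `get_api_data` back in to exercise the full pipeline against a test dataset.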