SELECT
FROM my_table
@Geremie
Geremie / define_address_schema.py
Created May 19, 2022 14:54
Do Yourself a Favor and Stay Away from BigQuery Schema Auto-detection
import json

schema = [
    {
        "description": "Name of the house",
        "name": "name",
        "type": "STRING",
        "mode": "REQUIRED"
    },
    {
@Geremie
Geremie / load_to_bq.sh
Created May 19, 2022 14:52
Do Yourself a Favor and Stay Away from BigQuery Schema Auto-detection
!bq load --autodetect \
--source_format NEWLINE_DELIMITED_JSON \
tmp.test_sample_2 gs://<path_to_folder_containing_the_file>test_sample_2.jsonl address_schema.json
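The load command above passes address_schema.json as an explicit schema. Below is a minimal sketch of how a script like define_address_schema.py might write that file, assuming it simply dumps the field list as JSON. Only the "name" field comes from the original snippet; the "city" field and the write_schema helper are hypothetical and included for illustration only.

```python
import json

# Only the "name" field is taken from the original snippet.
# The "city" field is a hypothetical second entry for illustration.
schema = [
    {
        "description": "Name of the house",
        "name": "name",
        "type": "STRING",
        "mode": "REQUIRED",
    },
    {
        "description": "City of the address (hypothetical field)",
        "name": "city",
        "type": "STRING",
        "mode": "NULLABLE",
    },
]


def write_schema(path, fields):
    """Write a list of field definitions to a JSON schema file."""
    with open(path, "w") as f:
        json.dump(fields, f, indent=2)


write_schema("address_schema.json", schema)
```

Keeping the schema in a versioned file like this makes the column types explicit instead of leaving them to whatever the loader infers from the sample.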
@Geremie
Geremie / load_to_bq.sh
Last active May 19, 2022 14:49
Do Yourself a Favor and Stay Away from BigQuery Schema Auto-detection
!bq load --autodetect \
--source_format NEWLINE_DELIMITED_JSON \
tmp.test_sample_2 gs://<path_to_folder_containing_the_file>test_sample_2.csv code:STRING
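The inline schema code:STRING above pins the column type by hand. A toy sketch of why that can matter follows, assuming the column holds digit-only codes with leading zeros. This is not BigQuery's actual detection logic; infer_type and the sample values are hypothetical and only illustrate the general risk of inferring types from data.

```python
def infer_type(values):
    """Toy inference: INTEGER if every value parses as an int, else STRING."""
    try:
        for v in values:
            int(v)
        return "INTEGER"
    except ValueError:
        return "STRING"


# Hypothetical sample values: codes that happen to contain only digits.
codes = ["00123", "04567"]

inferred = infer_type(codes)

# Casting to integers silently drops the leading zeros.
as_int = [int(v) for v in codes]

# Keeping the values as strings preserves them exactly.
as_str = [str(v) for v in codes]
```

With a naive rule like this, digit-only codes are treated as integers and lose their leading zeros, which is the kind of surprise an explicit STRING type avoids.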
@Geremie
Geremie / load_to_bq.sh
Last active May 19, 2022 14:50
Do Yourself a Favor and Stay Away from BigQuery Schema Auto-detection
!bq load --autodetect \
--source_format NEWLINE_DELIMITED_JSON \
tmp.test_sample_2 gs://<path_to_folder_containing_the_file>test_sample_2.jsonl
@Geremie
Geremie / load_to_bq.sh
Last active May 18, 2022 15:35
Do Yourself a Favor and Stay Away from BigQuery Schema Auto-detection
bq load --autodetect \
--skip_leading_rows=1 \
tmp.test_sample_1 gs://<path_to_folder_containing_the_file>/test_sample_1.csv
@Geremie
Geremie / dag.py
Created April 18, 2022 21:00
If you are using Python and Google Cloud Platform, this will Simplify Life for you (Part 2)
def print_haversine():
    from mypythonlib import myfunctions
    x1, y1, x2, y2 = 1, 2, 3, 4
    haversine_distance = myfunctions.haversine(x1, y1, x2, y2)
    print(f'The haversine distance between ({x1}, {y1}) and ({x2}, {y2}) is {haversine_distance}')

dag = DAG('haversine-distance-dag',
          default_args=DEFAULT_ARGS,
          schedule_interval=None,
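The task above calls myfunctions.haversine from the private package, whose implementation is not shown here. Below is a minimal sketch of what such a function could look like, assuming the arguments are (longitude, latitude) pairs in degrees and the result is in kilometres. The argument order, the units, and the Earth radius constant are all assumptions, not the package's confirmed behavior.

```python
import math

# Mean Earth radius in kilometres (assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0


def haversine(x1, y1, x2, y2):
    """Great-circle distance in km between two points.

    Assumes x is longitude and y is latitude, both in degrees.
    """
    lat1, lat2 = math.radians(y1), math.radians(y2)
    dlat = math.radians(y2 - y1)
    dlon = math.radians(x2 - x1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


distance = haversine(1, 2, 3, 4)
```

The call mirrors the inputs used in print_haversine, so the same values can be compared against the packaged function if its conventions match these assumptions.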
@Geremie
Geremie / deploy_resources.sh
Created April 18, 2022 20:43
If you are using Python and Google Cloud Platform, this will Simplify Life for you (Part 2)
# Deploy the DAG
gsutil cp dag.py $dags_folder
@Geremie
Geremie / deploy_resources.sh
Last active April 18, 2022 19:51
If you are using Python and Google Cloud Platform, this will Simplify Life for you (Part 2)
# Add artifact registry reader role for service account
gcloud projects add-iam-policy-binding <project_id> \
--member=serviceAccount:<your_service_account_id>@<project_id>.iam.gserviceaccount.com \
--role=roles/artifactregistry.reader
# Create key for service account
gcloud iam service-accounts keys create <path_to_key_file> \
--iam-account=<your_service_account_id>@<project_id>.iam.gserviceaccount.com
# Get the python repository url and authentication credentials
gcloud artifacts print-settings python --project <your_gcp_project> \
--repository <your_repository_name> --location <your_repository_location> --json-key <path_to_key_file>
@Geremie
Geremie / deploy_resources.sh
Created April 17, 2022 21:11
If you are using Python and Google Cloud Platform, this will Simplify Life for you (Part 2)
# Install the private python package
gcloud composer environments update <your_environment_name> \
--update-pypi-package mypythonlib --location <your_location_name>