Eric E. Intelrunner

@Intelrunner
Intelrunner / bq_slot_usage.sql
Last active March 29, 2022 18:43
This query estimates slot usage per second at the organization level, broken down by project_id. Use: find the average number of slot-seconds consumed per second to decide whether purchasing Google Cloud Platform BigQuery slots is worthwhile for an organization. Requires: the bigquery.jobs.listAll permission.
SELECT
  TIMESTAMP_TRUNC(jobs.start_time, SECOND) AS sec,
  -- average slots held by each job over its runtime, summed per start second and project
  SUM(SAFE_DIVIDE(jobs.total_slot_ms, TIMESTAMP_DIFF(jobs.end_time, jobs.start_time, MILLISECOND))) AS Slot_Count,
  jobs.project_id AS project
FROM
  `region-us`.INFORMATION_SCHEMA.JOBS_BY_ORGANIZATION AS jobs
WHERE jobs.start_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 3 DAY) AND CURRENT_TIMESTAMP()
GROUP BY project_id, sec
ORDER BY sec DESC
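The arithmetic in the query above can be sketched in Python to make the estimate easier to reason about. This is only an illustration, not part of the gist: the function names `average_slots` and `slots_per_second` and the dictionary-shaped job rows are assumptions, with keys borrowed from the columns the query reads.

```python
from collections import defaultdict
from datetime import timedelta


def average_slots(total_slot_ms, start_time, end_time):
    """Average slots held over a job's runtime.

    Mirrors SAFE_DIVIDE(total_slot_ms, TIMESTAMP_DIFF(end, start, MILLISECOND)):
    returns None instead of raising when the duration is zero or a value is missing.
    """
    if total_slot_ms is None or start_time is None or end_time is None:
        return None
    duration_ms = (end_time - start_time) // timedelta(milliseconds=1)
    if duration_ms == 0:
        return None
    return total_slot_ms / duration_ms


def slots_per_second(jobs):
    """Sum each job's average slots, keyed by (project_id, start second)."""
    totals = defaultdict(float)
    for job in jobs:
        slots = average_slots(
            job["total_slot_ms"], job["start_time"], job["end_time"]
        )
        if slots is None:
            continue  # SUM() in SQL skips NULLs the same way
        second = job["start_time"].replace(microsecond=0)
        totals[(job["project_id"], second)] += slots
    return dict(totals)
```

Note that, like the query, this attributes a job's whole average to the second in which it started, so long-running jobs are not spread across the seconds they actually occupied.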
Intelrunner / count_objects_in_bucket.sh
Created February 25, 2022 14:58
List Google Cloud Storage Buckets, and Count of Objects in Each
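# Print "<object count> <bucket>" for every bucket gsutil can list.
for BUCKET in $(gsutil ls)
do
  # Lines ending in "/" are prefix (folder) summaries, not objects, so drop them before counting.
  echo $(gsutil du "$BUCKET" | grep -v '/$' | wc -l) "$BUCKET"
done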
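The counting step of the loop above can be sketched in Python as well. This is a minimal sketch under assumptions: the helper names `count_objects` and `count_objects_in_bucket` are hypothetical, and the second one simply shells out to the same `gsutil du` command the script uses.

```python
import subprocess


def count_objects(du_output):
    """Count lines of `gsutil du` output that do not end in "/".

    Same idea as `grep -v '/$' | wc -l`, except blank lines are ignored.
    """
    return sum(
        1
        for line in du_output.splitlines()
        if line.strip() and not line.rstrip().endswith("/")
    )


def count_objects_in_bucket(bucket_url):
    """Run `gsutil du` for one bucket and count its objects (requires gsutil on PATH)."""
    completed = subprocess.run(
        ["gsutil", "du", bucket_url],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_objects(completed.stdout)
```

Keeping the parsing separate from the subprocess call makes the counting logic easy to check against captured output without touching a real bucket.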
Intelrunner / resume.json
Created February 25, 2022 16:23 — forked from ycaillaud/resume.json
CV 2
{
"$schema": "https://raw.githubusercontent.com/jsonresume/resume-schema/v1.0.0/schema.json",
"basics": {
"name": "Yannick Caillaud",
"label": "Data Scientist @Betclic-Group",
"image": "",
"email": "yannick.caillaud@hotmail.fr",
"phone": "",
"url": "",
"summary": "Passionate about Machine Learning, Data Mining and Data Science in general.\nEnjoy create value based on not enough exploited data: from POC to MVP to Production, from quick-wins to R&D challenge.\n[Kaggle](https://www.kaggle.com/kinnay) competition contributor.",
Intelrunner / parse_pricing_api.py
Created February 28, 2022 05:04 — forked from wnojopra/parse_pricing_api.py
Script that parses pricing data from Google's API.
"""Script that parses pricing data from Google's API.
Intended to be run periodically for the terra-ui's estimated price UI. The
output is something you could copy and paste into a javascript file.
See https://cloud.google.com/billing/v1/how-tos/catalog-api for more detail
on Google Cloud pricing information.
Usage:
1) Follow the instructions at the above URL to create an API key
2) Run `python3 parse_pricing_api.py ${API_KEY}`
Intelrunner / url_check.py
Created February 28, 2022 05:14 — forked from seenimohamed/url_check.py
Checks whether a given URL is up or not
from six.moves import urllib
import requests


def url_is_alive(url):
    """
    Checks that a given URL is reachable.
    :param url: A URL
    :rtype: bool
    """
    request = urllib.request.Request(url)
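The listing cuts the function off after the request is built. For illustration only, one way such a check could be completed is sketched below. It is not the gist's actual body: it swaps `six.moves` and `requests` for the Python 3 standard library, and the HEAD method, the `timeout` parameter, and the "status below 400 counts as alive" rule are all assumptions.

```python
import urllib.error
import urllib.request


def url_is_alive(url, timeout=5.0):
    """Return True if a HEAD request to `url` gets a non-error response."""
    try:
        # Building the Request can itself raise ValueError for malformed URLs.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # URLError covers DNS failures, refused connections and HTTP error
        # statuses (HTTPError is a subclass); OSError covers timeouts.
        return False
```

Some servers reject HEAD requests, so a caller that sees unexpected `False` results might retry with GET before treating the URL as down.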
Intelrunner / gcp-iam-restrict-user-bucket.sh
Created March 3, 2022 20:10 — forked from mikesparr/gcp-iam-restrict-user-bucket.sh
Google Cloud Platform example to add IAM role restricting user to specific storage buckets with conditions
#!/usr/bin/env bash
export PROJECT_ID=$(gcloud config get-value project)
export PROJECT_USER=$(gcloud config get-value core/account) # set current user
export PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")
export IDNS=${PROJECT_ID}.svc.id.goog # workload identity domain
export GCP_REGION="us-central1"
export GCP_ZONE="us-central1-a"
Intelrunner / asset_inventory_to_bq.sh
Created March 11, 2022 20:28
This is a simple set of shell commands that exports an existing GCP Asset Inventory to BigQuery
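#!/bin/sh
# more information: https://cloud.google.com/asset-inventory/docs/exporting-to-bigquery
# must have gcloud cli installed
# comments are kept above the command: a comment line placed between
# backslash-continued lines would end the command early
# --bigquery-dataset: must provide a dataset
# --content-type resource: this is for a resource based output
gcloud asset export \
--bigquery-dataset "$DATASET" \
--content-type resource \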
Intelrunner / Dockerfile
Created April 6, 2022 20:31
Dockerfile Cheat Sheet
FROM — specifies the base (parent) image.
LABEL — provides metadata. Good place to include maintainer info.
ENV — sets a persistent environment variable.
RUN — runs a command and creates an image layer. Used to install packages into containers.
COPY — copies files and directories to the container.
ADD — copies files and directories to the container. Can unpack local .tar files.
CMD — provides a command and arguments for an executing container. Parameters can be overridden. There can be only one CMD.
WORKDIR — sets the working directory for the instructions that follow.
ARG — defines a variable to pass to Docker at build-time.
ENTRYPOINT — provides command and arguments for an executing container. Arguments persist.
Intelrunner / gcloud_cmd_list.md
Created April 13, 2022 02:37
gcloud Command List

List only instances with no labels

gcloud compute instances list --filter=-labels:'*'
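The same selection can be done client-side on JSON output, which may help when further processing is needed. The sketch below assumes the input is the list produced by `gcloud compute instances list --format=json`, where each instance has a `name` and an optional `labels` map. The function name `unlabeled_instances` is hypothetical.

```python
import json


def unlabeled_instances(instances):
    """Return the names of instances whose `labels` map is missing or empty."""
    return [inst["name"] for inst in instances if not inst.get("labels")]


def unlabeled_from_json(text):
    """Parse a JSON array of instance resources and filter it."""
    return unlabeled_instances(json.loads(text))
```

A possible invocation is to pipe the command's JSON output into a small script that calls `unlabeled_from_json(sys.stdin.read())`.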

Intelrunner / gcp_scan_all_tables_dlp.py
Created May 4, 2022 17:06
Starts a DLP inspect job for ALL TABLES in all datasets in a user's BigQuery.
""" Warning - this script automatically submits GCP DLP job creation requests
to scan every table in every dataset available to the user. It will fail unless the APIs / permissions
listed below are enabled.
This script can, and may, cost you actual $$. Outcomes can be seen in the DLP console.
DLP - jobs.create, jobs.get, jobs.list
BQ - bigquery.user (role)