Eric E. Intelrunner

@Intelrunner
Intelrunner / resume.json
Created February 25, 2022 16:23 — forked from ycaillaud/resume.json
CV 2
{
"$schema": "https://raw.githubusercontent.com/jsonresume/resume-schema/v1.0.0/schema.json",
"basics": {
"name": "Yannick Caillaud",
"label": "Data Scientist @Betclic-Group",
"image": "",
"email": "yannick.caillaud@hotmail.fr",
"phone": "",
"url": "",
"summary": "Passionate about Machine Learning, Data Mining and Data Science in general.\nEnjoys creating value from under-exploited data: from POC to MVP to production, from quick wins to R&D challenges.\n[Kaggle](https://www.kaggle.com/kinnay) competition contributor.",
@Intelrunner
Intelrunner / count_objects_in_bucket.sh
Created February 25, 2022 14:58
List Google Cloud Storage buckets and the count of objects in each
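# For each bucket, count the lines of `gsutil du` output that do not end in "/"
# (lines ending in "/" are folder or bucket summaries, not objects).
for bucket in $(gsutil ls)
do
  echo "$(gsutil du "$bucket" | grep -v '/$' | wc -l) $bucket"
done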
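The filtering step in the loop above can be tried offline, without access to any bucket. The sketch below assumes `gsutil du` prints one `size  gs://...` line per object plus summary lines ending in `/`; the `count_objects` helper and the sample listing are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: counts object lines in `gsutil du`-style output read from stdin.
count_objects() {
  # Drop lines ending in "/" (folder or bucket summaries), then count what remains.
  # `tr -d ' '` strips the padding some `wc` implementations add.
  grep -v '/$' | wc -l | tr -d ' '
}

# Made-up sample output, for illustration only (not from a real bucket).
sample='10  gs://example-bucket/a.txt
20  gs://example-bucket/dir/b.txt
20  gs://example-bucket/dir/
30  gs://example-bucket/'

printf '%s\n' "$sample" | count_objects   # prints 2
```

Only the two lines that name actual objects are counted; the two summary lines ending in `/` are skipped.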
@Intelrunner
Intelrunner / bq_slot_usage.sql
Last active March 29, 2022 18:43
This query estimates slot usage per second at the organization level, broken down by project_id. Use: finding the average number of slot-seconds per second to determine whether purchasing Google Cloud Platform BigQuery slots is worthwhile for an organization. Requires: bigquery.jobs.listAll
SELECT
  -- Bucket jobs by the second in which they started.
  TIMESTAMP_TRUNC(jobs.start_time, SECOND) AS sec,
  -- Average slots held by a job = total slot-milliseconds / job duration in milliseconds.
  SUM(SAFE_DIVIDE(jobs.total_slot_ms, TIMESTAMP_DIFF(jobs.end_time, jobs.start_time, MILLISECOND))) AS Slot_Count,
  jobs.project_id AS project
FROM
  `region-us`.INFORMATION_SCHEMA.JOBS_BY_ORGANIZATION AS jobs
WHERE jobs.start_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 3 DAY) AND CURRENT_TIMESTAMP()
GROUP BY project_id, sec
ORDER BY sec DESC
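The per-job term inside `SUM(...)` is plain arithmetic: slot-milliseconds consumed divided by the job's wall-clock duration in milliseconds. A minimal sketch of that calculation follows, assuming made-up input values; the `avg_slots` helper is hypothetical and only mirrors what `SAFE_DIVIDE` does for a single row.

```shell
# Hypothetical helper: average slots for one job = total_slot_ms / duration_ms.
avg_slots() {
  awk -v slot_ms="$1" -v dur_ms="$2" 'BEGIN {
    # SAFE_DIVIDE returns NULL instead of failing on a zero divisor; mimic that here.
    if (dur_ms == 0) { print "NULL"; exit }
    printf "%.2f\n", slot_ms / dur_ms
  }'
}

# Illustrative values only: a job that used 120000 slot-ms over 60000 ms of runtime.
avg_slots 120000 60000   # prints 2.00
avg_slots 5000 0         # prints NULL
```

A job that consumed 120000 slot-milliseconds in 60 seconds held two slots on average, and the query adds such values up for all jobs that started in the same second within a project.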