@4sushi
4sushi / BigQuery - encryption exemples (with KMS).md
Last active October 13, 2022 10:08
BigQuery - encryption examples (with KMS) - AEAD and DETERMINISTIC encryption
from google.cloud import bigquery
from datetime import date
# You have to create a dataset in your BQ project before running this script, here the name is "archive"
ARCHIVE_DATASET = 'archive'
def archive_table(bq, dataset, table):
    d = date.today()
    table_src_id = '{}.{}.{}'.format(bq.project, dataset, table)
    table_dest_id = '{}.{}.{}-{}-{:%Y-%m-%d}'.format(bq.project, ARCHIVE_DATASET, dataset, table, d)
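The naming scheme used for the destination table above can be tried in isolation. The following is a minimal sketch, assuming only the format string shown in the snippet; the helper name `build_archive_table_id` and the sample project, dataset, and table names are hypothetical and not part of the gist.

```python
from datetime import date

# Same dataset name as in the snippet above.
ARCHIVE_DATASET = 'archive'


def build_archive_table_id(project, dataset, table, day):
    """Build '<project>.<archive dataset>.<dataset>-<table>-<YYYY-MM-DD>'.

    Hypothetical helper: it only reproduces the format string from the
    snippet and does not call BigQuery.
    """
    return '{}.{}.{}-{}-{:%Y-%m-%d}'.format(project, ARCHIVE_DATASET, dataset, table, day)


if __name__ == '__main__':
    # Sample values are made up for illustration.
    print(build_archive_table_id('my-project', 'sales', 'orders', date(2021, 6, 30)))
    # prints: my-project.archive.sales-orders-2021-06-30
```

Passing the date as an argument, instead of calling `date.today()` inside the function, keeps the helper deterministic and easy to check.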
import requests
from multiprocessing import Pool
import time
def get_url(url):
    result = requests.get(url, params={})
    return url
@4sushi
4sushi / airflow_mutiple_schedule_interval.py
Created July 20, 2021 13:55
Airflow use multiple schedule interval for one DAG. Condition based on date execution. Ex: run daily + monthly
from datetime import timedelta
from airflow import DAG
from airflow.utils.dates import days_ago
from airflow.operators.python_operator import BranchPythonOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.utils.trigger_rule import TriggerRule
from croniter import croniter
import dateutil.parser
default_args = {
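The idea in the description (one DAG that runs daily, with extra work on a monthly condition) comes down to a branch callable that inspects the execution date. Here is a minimal sketch of that decision using only the standard library. The gist itself imports `croniter`, which suggests it matches dates against cron expressions; this sketch swaps in a plain day-of-month check instead. The task ids and the function name `choose_branches` are hypothetical.

```python
from datetime import date

# Hypothetical task ids, not taken from the gist.
DAILY_TASK = 'daily_task'
MONTHLY_TASK = 'monthly_task'


def choose_branches(execution_date):
    """Return the task ids that should run for a given execution date.

    The daily task always runs; the monthly task is added only on the
    first day of the month (assumed condition, for illustration).
    """
    branches = [DAILY_TASK]
    if execution_date.day == 1:
        branches.append(MONTHLY_TASK)
    return branches


if __name__ == '__main__':
    print(choose_branches(date(2021, 7, 1)))   # first of the month
    print(choose_branches(date(2021, 7, 20)))  # any other day
```

A function of this shape could serve as the callable of a `BranchPythonOperator`, which accepts either a single task id or a list of task ids as the return value.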
"""
Example script that copies tables from a Cloud SQL database to BigQuery (using the Cloud SQL proxy) with Airflow
Note: part of the code inside function get_proxy_connection_engine is from airflow.contrib.operators
Author: 4sushi
Date: 2021-06-30
"""
import pandas as pd
from sqlalchemy import inspect
from airflow.hooks.base_hook import BaseHook
"""
Python script to generate a CREATE SQL statement based on JSON data
Author: 4sushi
Creation date: 2021-06-30
"""
from google.cloud import bigquery
import json
# Load json data from string or file...
@4sushi
4sushi / bypass_distil_security_selnium.py
Created July 1, 2019 13:27
Bypass Distil security with Selenium (Python)
"""
Example to bypass Distil security (https://www.distilnetworks.com/) with Selenium.
Distil uses the JavaScript property navigator.webdriver to ban Selenium.
The solution is to inject JavaScript code before the webpage loads, to set webdriver to false.
Works only with the Chromium driver.
"""
from datetime import datetime
import os
import sys
@4sushi
4sushi / scrapely_set_header.py
Created July 1, 2019 13:16
Set headers with the Scrapely library
# Example of setting headers with Scrapely
# Tested with Python 3.6
# pip install requests scrapely
from scrapely import Scraper
from scrapely.htmlpage import HtmlPage
import requests
def get_html_page(url, headers=None, encoding='utf-8'):
    r = requests.get(url, headers=headers)
@4sushi
4sushi / firewall.sh
Created July 3, 2018 17:01
Linux iptables firewall example for a Hadoop cluster
# Simple example of a firewall for a Hadoop cluster (on a public network) with iptables
# Run the script with sudo
# Example :
# master : 10.0.0.1 (public IP)
# slave1 : 10.0.0.2 (public IP)
# slave2 : 10.0.0.3 (public IP)
# company : 5.0.0.1 (public IP)
@4sushi
4sushi / benchmarkihm.js
Last active September 7, 2017 13:33
BenchmarkIHM.js: understand Benchmark.js results with a simple UI
var BenchmarkIHM = (function() {
    'use strict';
    function Test(benchmarkTest) {
        this.name = this.getName(benchmarkTest);
        this.pctMOE = this.getPctMOE(benchmarkTest);
        this.opsSec = this.getOpsSec(benchmarkTest);
        this.opsSecMin = ~~(this.opsSec * (1 - this.pctMOE / 100));
        this.opsSecMax = ~~(this.opsSec * (1 + this.pctMOE / 100));
        this.pctMin = null;
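The `opsSecMin` and `opsSecMax` fields above widen the measured operations per second by the percentage margin of error. Here is a minimal Python sketch of that arithmetic, assuming `pctMOE` is a percentage (for example `25` for ±25%); the function name `ops_sec_bounds` is hypothetical. In the JavaScript, `~~` truncates toward zero, which `int()` reproduces for values in the 32-bit integer range.

```python
def ops_sec_bounds(ops_sec, pct_moe):
    """Return (min, max) operations per second for a +/- pct_moe percent margin.

    Mirrors the two expressions in the constructor above; int() truncates
    toward zero, like the JavaScript double-tilde on small positive values.
    """
    ops_min = int(ops_sec * (1 - pct_moe / 100))
    ops_max = int(ops_sec * (1 + pct_moe / 100))
    return ops_min, ops_max


if __name__ == '__main__':
    # 1000 ops/sec with a 25% margin of error
    print(ops_sec_bounds(1000, 25))  # prints: (750, 1250)
```

Two benchmark results whose `(min, max)` intervals overlap cannot be reliably ranked against each other, which is presumably why the object keeps both bounds.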