Skip to content

Instantly share code, notes, and snippets.

View QuantVI's full-sized avatar

QuantVI

View GitHub Profile
@QuantVI
QuantVI / SQL_Window_Functions.SQL
Last active September 4, 2022 10:50
postgre sql wind function queries
SELECT
*,
row_number () over () AS Row_N
FROM Summer_Medals
ORDER BY Row_N ASC;
--------------------
SELECT
Year,
row_number() over() AS Row_N
@QuantVI
QuantVI / prob_calculator.py
Created February 26, 2022 14:21
Simulate Sampling without replacement from an Urn, to get a probability
import copy
import random
# Consider using the modules imported above.
class Hat:
def __init__(self, **kwargs):
self.d = kwargs
self.contents = [
key for key, val in kwargs.items() for num in range(val)
]
@QuantVI
QuantVI / assorted_python_blurbs.py
Created May 5, 2020 10:19
Python commads that either came in handy or I really like
# Within a Jupyter Notebook
import sys
sys.path.append("C:/somewhere/else/not/in/the/tree/of/this/jpfile")
# Now we can import Windows directories as packages, and refer to Python files within them
# the folder jpyfile need not have an __init__.py file in it
# folders that we import from within jpfile should
from folder_w_in_jpfile import my_modular_code
# folder_w_in_jpfile contains an __init__.py file whihc can be empty
# my_modular_code is a file my_modular_code.py with any other Python code you've written.
# This snippet is for when there are issues with
@QuantVI
QuantVI / ss_simple_segments_full_1_hardcode_pid1731.sql
Created July 24, 2019 00:08
A HIVE/Hadoop HQL query over a large distributed cluster. For a single client, it scans 45+ days worth of re-targeting campaign data related to clicks, click revenue, sales and order value. It returns an aggregated result based on site-visitor segments, such as "Abandoned Cart" visitors.
-- Query to breakout events into legacy segmentation
CREATE TEMPORARY FUNCTION rank AS 'ABCxyz.QWERTY.hadoop.hive.udf.Rank';
SET mapred.map.output.compression.codec = ABCxyz.hadoop.compression.lzo.LzopCodec;
SET hive.cli.print.header = TRUE;
SET mapred.job.priority = HIGH;
SET partnerid = 1731;
SET golivedate = "2014-03-01";
-- abandondate is 15 days before golivedate
@QuantVI
QuantVI / clicks_per_day_w_7d_csum_advsql2.sql
Created July 23, 2019 23:48
HP Vertica SQL query using the OVER window function combined with a relative date range.
SELECT
day,
clicks,
SUM(clicks) OVER (order by day range between '7 days' preceding and current row) accumulated_clicks
FROM
datamart.fact_network_stats_daily
WHERE
network_id = 72
AND affiliate_country_code = 'FR'
AND banner_currency_id = 1
@QuantVI
QuantVI / revenue_by_client_advsql2.sql
Created July 23, 2019 23:43
HP Vertica SQL query using a common table expression (CTE) and window function available in this variant of SQL
WITH
revenue_by_client
AS
(
SELECT
f.day,
c.client_country_name,
c.client_name,
SUM(f.revenue * er.rate) revenue_euro
FROM
@QuantVI
QuantVI / quick_split_batch.py
Created July 23, 2019 23:20
A text file-splitter. Probably the second file-splitter I wrote on the job.
# Editing to be able to split all files in a directory
import sys
import os # for getting the list of all files in the directory, when we use batch mode
import re # to find file matching a particular handle/pattern
a = sys.argv[0]
print '\n', sys.argv
@QuantVI
QuantVI / output_status_distrib_w_conf_status_min_max.json
Created July 23, 2019 23:12
ElasticSearch output: Example output to and early version of the "external confirmation status, versus internal status code" query (status_distrib_w_conf_status_min_max.json)
{
"hits": {
"hits": [],
"total": 123928,
"max_score": 0.0
},
"_shards": {
"successful": 20,
"failed": 0,
"total": 20
@QuantVI
QuantVI / total_room_Rate_ranges.json
Created July 23, 2019 23:07
ElasticSearch query: range of rates paid per night for a room. Example of an aggregation query returning results in JSON format. These result were passed to "Excel Printer" via "Scribe Report", two other gists you'll find here.
{ "size": 0,
"query": {
"term": {"propertyId":"{{property_id}}"}
},
"aggs" : {
"zero_or_not": {
"range" : {
"field" : "totalRoomRateInPropertyCurrency",
"ranges" : [
{"from": 0, "to" : 0.1 },
@QuantVI
QuantVI / status_distrib_w_conf_status_min_max.json
Created July 23, 2019 23:05
ElasticSearch query: external confirmation status, versus internal status code. Example of an aggregation query returning results in JSON format. These result were passed to "Excel Printer" via "Scribe Report", two other gists you'll find here.
{ "size": 0,
"query": {
"term": {"propertyId":"{{property_id}}"}
},
"aggs": {
"pms_status_code": {
"terms" : { "field" : "confirmationStatus"},
"aggs": {
"conf_status": {
"terms" : { "field" : "statusCode"},