@canimus
canimus / forward_fill.py
Created December 21, 2021 22:42
PySpark FFill Implementation
import pyspark.sql.functions as F
from pyspark.sql import DataFrame
from pyspark.sql import Window as W
from pyspark.sql.window import WindowSpec
__all__ = ["forward_fill"]
def _window_all_previous_rows(partition, order) -> WindowSpec:
    """Select the window on which values are filled in a forward manner."""
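The preview above stops before the body of the function, so the gist's actual Spark logic is not shown here. To make the forward-fill semantics concrete, the following is a minimal plain-Python sketch of the idea on a list. The name `forward_fill_list` is hypothetical and is not part of the gist. In Spark, one common way to get the same effect is `F.last(col, ignorenulls=True)` over a window running from `W.unboundedPreceding` to `W.currentRow`, though whether the gist does exactly that cannot be confirmed from the truncated preview.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def forward_fill_list(values: List[Optional[T]]) -> List[Optional[T]]:
    """Replace each None with the most recent non-None value before it.

    Illustrative stand-in only: a hypothetical helper, not the gist's code.
    Leading None values stay None because nothing precedes them.
    """
    filled: List[Optional[T]] = []
    last_seen: Optional[T] = None
    for value in values:
        if value is not None:
            last_seen = value  # remember the latest known value
        filled.append(last_seen)
    return filled
```

In an ordered, partitioned DataFrame the same rule applies per partition: each row takes the last non-null value seen among all previous rows up to and including itself.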
canimus / TreasuryPricing.py
Created December 17, 2021 01:00 — forked from RamonWill/TreasuryPricing.py
How to convert a Treasury price to decimal in python and vice versa
# Convert Treasury price to decimal in python
# calculating US treasury pricing in python
def treasury_to_decimal(price):
    """
    Converts a Treasury price quoted in 32nds into a decimal. This works for
    Treasuries priced up to 3dp, e.g. "99-126"
    """
    price_split = price.split("-")
    integer_part = int(price_split[0])
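The preview is cut off after the integer part is parsed. As a rough sketch of how the rest of such a conversion can work, the block below assumes the usual quoting convention: the two digits after the dash are 32nds, and an optional third character is eighths of a 32nd, with `+` standing for 4/8. The function name `treasury_to_decimal_sketch` is hypothetical and this is not the remainder of the original gist.

```python
def treasury_to_decimal_sketch(price: str) -> float:
    """Convert a quote such as "99-126" into a decimal price.

    Assumed format (not confirmed by the truncated source):
    "<handle>-<two digits of 32nds>[<eighths of a 32nd> or "+"]".
    """
    handle, fraction = price.split("-")
    thirty_seconds = int(fraction[:2])

    # Optional third character: eighths of a 32nd; "+" is taken to mean 4/8.
    eighths = 0
    if len(fraction) > 2:
        eighths = 4 if fraction[2] == "+" else int(fraction[2])

    return int(handle) + (thirty_seconds + eighths / 8) / 32
```

Under these assumptions `"99-126"` is 99 plus 12.75 thirty-seconds, and `"99-16"` is exactly 99.5.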
# Pull the necessary images:
docker pull nathanleclaire/curl:latest
docker pull openjdk:8u111-jre-alpine

# Start the controller container, note that it has RW access to the Docker API socket:
docker run \
  -ti \
  --rm \
canimus / pyspark_timestamp_cast.py
Last active May 11, 2021 12:29
Pyspark Cast 100 columns
from functools import reduce
from operator import methodcaller
import pyspark.sql.functions as F
_ts = lambda dataFrame, col: methodcaller('withColumn', f'{col}TimeStamp', F.to_timestamp(F.col(col)/1000))(dataFrame)
reduce(lambda a,b: _ts(a,b), ['start','stop'], df).select('startTimeStamp', 'stopTimeStamp').show(truncate=False)
canimus / AWS Swarm cluster.md
Created March 29, 2021 23:35 — forked from ghoranyi/AWS Swarm cluster.md
Create a Docker 1.12 Swarm cluster on AWS

This gist walks you through creating a Docker 1.12 Swarm cluster (with Swarm mode) on AWS infrastructure.

Prerequisites

You need a few things in place before you start. First, you need Docker 1.12 or later. I used the stable version of Docker for Mac to prepare this guide.

$ docker --version
Docker version 1.12.0, build 8eab29e

You also need Docker Machine installed.

canimus / parse.py
Created March 16, 2021 22:23
IBM Web Server Log Parse
from hashlib import md5 as xx
from collections import namedtuple
import os
import re
# Environment parameters
# os.getenv returns a string when the variable is set, so convert to int before chr()
COLUMN_SEPARATOR = chr(int(os.getenv('SEPARATOR', 449)))
FILE_NAME = os.getenv('FILE', 'hashed.csv')
# Regular expression to capture JSESSIONID
canimus / pie.py
Created February 15, 2021 22:44
Pie Chart Matplotlib
import matplotlib.pyplot as plt
# create data
names='Failed: 20', 'Passed: 80',
size=[11.8, 98.2]
# Create a circle for the center of the plot
my_circle=plt.Circle( (0,0), 0.7, color='white')
plt.pie(size, labels=names, colors=['#f56262', '#23d993'])
canimus / git-author.sh
Created February 14, 2021 12:44
Git Authors
find . -type f -exec git log --reverse --format="{} %cn" -1 {} \; | cut -d" " -f2- | sort | uniq -c | sort -n -k1
canimus / wait_seconds.sh
Created February 14, 2021 12:34
Feature waste
rg ".*I wait for (\d+) secon.*$" -N -r '$1' | rg "\d+" | awk -F: '{s[$1]+=$2}END{for (i in s) print i,s[i]}' | sort -n -k2 | awk '{s+=$2}END{print s}'
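The one-liner above extracts the number from every "I wait for N seconds" step, sums it per file, and then prints the grand total. A minimal Python sketch of the same aggregation follows. It works on in-memory lines rather than searching the file system, and the function name `total_wait_seconds` and the input shape (a mapping of file name to lines) are assumptions made for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Same pattern as the rg call: capture the digits in "I wait for <N> secon..."
WAIT_STEP = re.compile(r"I wait for (\d+) secon")


def total_wait_seconds(
    lines_by_file: Dict[str, Iterable[str]],
) -> Tuple[Dict[str, int], int]:
    """Return (seconds waited per file, grand total) for the given lines.

    Hypothetical helper: files without a matching step are omitted,
    mirroring the fact that rg prints nothing for them.
    """
    per_file: Dict[str, int] = defaultdict(int)
    for name, lines in lines_by_file.items():
        for line in lines:
            match = WAIT_STEP.search(line)
            if match:
                per_file[name] += int(match.group(1))
    return dict(per_file), sum(per_file.values())
```

Returning the per-file totals alongside the grand total keeps the intermediate result that the shell pipeline computes in its first `awk` stage and discards in the last one.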
canimus / tag_count.js
Created February 1, 2021 18:03
Count tags in web page with webdriverio
browser.execute(
  () => {
    return Array.from(document.querySelectorAll("*")).map(e=>e.tagName.toLowerCase())
  })
  .reduce((a,b) => {a[b]=(a[b] || 0)+1; return a}, {})