@shashisingh
shashisingh / EventTimeSessionWindowImplementationViaFlatMapGroupsWithState.scala
Implementation of session window with event time and watermark via flatMapGroupsWithState, and SPARK-10816
case class SessionInfo(sessionStartTimestampMs: Long,
                       sessionEndTimestampMs: Long,
                       numEvents: Int) {
  /** Duration of the session, between the first and last events + session gap */
  def durationMs: Long = sessionEndTimestampMs - sessionStartTimestampMs
}
case class SessionUpdate(id: String,
                         sessionStartTimestampSecs: Long,
shashisingh / kafka-cheat-sheet.md
Created December 1, 2020 10:25 — forked from ursuad/kafka-cheat-sheet.md
Quick command reference for Apache Kafka

Kafka Topics

List existing topics

bin/kafka-topics.sh --zookeeper localhost:2181 --list

Describe a topic

bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic mytopic

Purge a topic

bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic mytopic --config retention.ms=1000

... wait a minute ...
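
The topic commands above can also be assembled from a script. This is a minimal Python sketch, assuming the same `bin/kafka-topics.sh` path and ZooKeeper address shown above. The helper names are hypothetical, and the functions only build argument lists; they do not run anything.

```python
from typing import List

KAFKA_TOPICS = "bin/kafka-topics.sh"  # script path as used in the commands above
ZOOKEEPER = "localhost:2181"          # ZooKeeper address as used above


def _base() -> List[str]:
    """Arguments shared by every kafka-topics.sh call in this cheat sheet."""
    return [KAFKA_TOPICS, "--zookeeper", ZOOKEEPER]


def list_topics_cmd() -> List[str]:
    """Build the 'list existing topics' command."""
    return _base() + ["--list"]


def describe_topic_cmd(topic: str) -> List[str]:
    """Build the 'describe a topic' command."""
    return _base() + ["--describe", "--topic", topic]


def purge_topic_cmd(topic: str, retention_ms: int = 1000) -> List[str]:
    """Build the command that lowers retention.ms so old messages expire."""
    return _base() + [
        "--alter", "--topic", topic,
        "--config", f"retention.ms={retention_ms}",
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the Kafka installation directory.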

from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
import tensorflow as tf
model_name = 'bert-base-cased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
texts = ["I'm a positive example!", "I'm a negative example!"]
labels = [1, 0]
shashisingh / README.md
Created July 29, 2021 11:01 — forked from sandys/Fastapi-sqlalchemy-pydantic-dataclasses-reloadable-logging.md
fastapi with python 3.7 dataclasses - used to create both sqlalchemy and pydantic models simultaneously

cmdline

poetry run gunicorn testpg:app -p 8080 --preload --reload --reload-engine inotify -w 10 -k uvicorn.workers.UvicornWorker --log-level debug --access-logfile - --error-logfile - --access-logformat "SSSS - %(h)s %(l)s %(u)s %(t)s \"%(r)s\" %(s)s %(b)s \"%(f)s\" \"%(a)s\""

How to quickly run postgres (using docker)

docker run --network="host" -it --rm --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -e PGDATA=/var/lib/postgresql/data/pgdata -v /tmp/pgdata2:/var/lib/postgresql/data -e POSTGRES_USER=test postgres

This command starts postgres on port 5432 and creates a database named test, with user test and password mysecretpassword.
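
To connect an application to this container, the credentials above can be combined into a connection URL. This is a minimal sketch, assuming the server is reachable on `localhost` (the container uses host networking) and that the database name defaults to the user name because `POSTGRES_DB` is not set. The function name `postgres_url` is hypothetical.

```python
from typing import Optional
from urllib.parse import quote


def postgres_url(user: str = "test",
                 password: str = "mysecretpassword",
                 host: str = "localhost",
                 port: int = 5432,
                 database: Optional[str] = None) -> str:
    """Build a postgresql:// URL from the values used in the docker command."""
    # Assumption: with POSTGRES_DB unset, the database is named after the user.
    database = database or user
    # Percent-encode credentials so special characters do not break the URL.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )
```

With the defaults this yields `postgresql://test:mysecretpassword@localhost:5432/test`, which can be handed to a client such as SQLAlchemy's `create_engine`.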

shashisingh / 1-scm-background.txt
Created October 25, 2021 13:13 — forked from rvanbruggen/1-scm-background.txt
Supply Chain Management Example in Neo4j
// Found article on https://or.stackexchange.com/questions/529/supply-chain-public-data-repository
// https://pubsonline.informs.org/doi/suppl/10.1287/msom.1070.0176
// https://pubsonline.informs.org/doi/suppl/10.1287/msom.1070.0176/suppl_file/msom.1070.0176-sm-datainexcel.zip
//Article by Sean P. Willems: https://pdfs.semanticscholar.org/232c/451fcf58dbcc1527de6d02cd6e76aea9e871.pdf?_ga=2.33151675.429569592.1581427039-1552162479.1581427039
Table 2: Classifications Used to Label Every Stage in the Chains
Classification label    Activity
Dist_                   A stage that distributes an item
Manuf_                  A stage that manufactures or assembles an item
Part_                   A stage that procures an item
shashisingh / anonymize.sql
Created December 9, 2021 11:49 — forked from t0mpere/anonymize.sql
anonymization dbt macro
{% macro anonymize(column_name, function, add_alias=True) %}
{% if execute %}
{% call statement('salt', fetch_result=True) %}
select salt from sources.private.salt limit 1;
{% endcall %}
{% set salt = load_result('salt')['data'][0][0] %}
{% if function == 'full_hash' %}
SHA256(concat({{ column_name }},'{{ salt }}'))
{% endif %}
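
The `full_hash` branch above appends a salt to the column value and applies SHA-256. A rough Python equivalent is sketched below for checking values outside the warehouse. It assumes UTF-8 encoding and hex output, which may differ from what a given SQL engine's `SHA256` returns; the function name mirrors the macro argument but is otherwise hypothetical.

```python
import hashlib


def full_hash(value: str, salt: str) -> str:
    """Salted SHA-256 of a value, mirroring SHA256(concat(column, salt))."""
    # The salt is appended after the value, in the same order as the macro.
    salted = (value + salt).encode("utf-8")  # assumption: UTF-8 encoding
    return hashlib.sha256(salted).hexdigest()  # assumption: hex digest output
```

The same input and salt always give the same digest, so anonymized columns can still be joined, while a different salt gives unrelated digests.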
shashisingh / uWSGI.sh
Created March 10, 2022 16:30 — forked from omedhabib/uWSGI.sh
uWSGI.sh
#!/usr/bin/env bash
PROCESSOR_COUNT=$(nproc)
THREAD_COUNT=2
uwsgi --http :9808 --plugin python2 --wsgi-file app.py --processes "$PROCESSOR_COUNT" --threads "$THREAD_COUNT" --disable-logging

How we incorporate next and cloudfront (2018-04-21)

Feel free to contact me at robert.balicki@gmail.com or tweet at me @statisticsftw

This is a rough outline of how we use next.js with S3 and CloudFront. Hope it helps!

It assumes some knowledge of AWS.

Goals

shashisingh / aws-cfn-self-referencing-sg.json
Created July 29, 2022 06:30 — forked from alanwill/aws-cfn-self-referencing-sg.json
AWS CloudFormation example that allows a security group rule to reference the same security group as the source.
{
"Description": "Create a VPC with a SG which references itself",
"AWSTemplateFormatVersion": "2010-09-09",
"Resources": {
"vpctester": {
"Type": "AWS::EC2::VPC",
"Properties": {
"CidrBlock": "172.16.0.0/23",
"EnableDnsSupport": false,
"EnableDnsHostnames": false,
shashisingh / nginx.conf
Created May 14, 2023 03:06 — forked from shortjared/nginx.conf
AWS API Gateway Nginx Reverse Proxy
# NOTE
#
#
# Use sed when the instance comes up to replace INSTANCE_ID and DNS_RESOLVER with the following commands
#
####################################################################################################
# Fetch the private IP for resolving DNS dynamically in nginx
# We also need to escape the `.` from it for usage in later sed
#
# DNS_RESOLVER=`grep nameserver /etc/resolv.conf | cut -d " " -f2 | sed 's/\./\\./g'`
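
The resolver lookup in the comment above can be mirrored in Python for testing. This is a minimal sketch, assuming the text has the usual `/etc/resolv.conf` layout. Unlike the `grep` pipeline, it returns only the first `nameserver` entry; the function name is hypothetical.

```python
from typing import Optional


def first_nameserver_escaped(resolv_conf_text: str) -> Optional[str]:
    """Return the first nameserver address with dots escaped for use in sed."""
    for line in resolv_conf_text.splitlines():
        parts = line.split()
        # Only lines of the form "nameserver <address>" are considered.
        if len(parts) >= 2 and parts[0] == "nameserver":
            return parts[1].replace(".", r"\.")
    return None  # no nameserver entry found
```

On an instance, the input would come from `open("/etc/resolv.conf").read()`.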