@panasenco
panasenco / data-modeling.md
Last active October 20, 2022 16:31
Data modeling resources

ROI Analysis

  • A/B testing
    • Ron Kohavi's Trustworthy Online Controlled Experiments:

      Features are built because teams believe they are useful, yet in many domains most ideas fail to improve key metrics. Only one third of the ideas tested at Microsoft improved the metric(s) they were designed to improve (Kohavi, Crook and Longbotham 2009). Success is even harder to find in well-optimized domains like Bing and Google, whereby some measures’ success rate is about 10–20% (Manzi 2012).

@panasenco
panasenco / ConvertTo-PlantUML.ps1
Created June 7, 2021 23:45
Converts dbt manifest.json to PlantUML ERD diagram. It's not pretty but it's readable.
#!/usr/bin/env pwsh
<#
.Synopsis
Creates a PlantUML ERD diagram from a manifest.json file.
.Parameter Path
Path to the manifest.json file. Defaults to .\target\manifest.json
#>
[CmdletBinding()]
param (
@panasenco
panasenco / ksqldb-multistage-request.md
Last active June 18, 2020 20:55
Processing a multi-stage request in ksqlDB

Suppose there's a stream of multi-stage requests for documents with given IDs to be retrieved from source, contrast-adjusted, OCRed, word-counted, placed in a destination directory, etc. (the exact stages can vary from request to request). The request stages have to be done in order, and intermediate results in the cache should be reused (we don't want to keep OCRing the same document over and over). I'd like to create a stream of requests for each individual application (document retrieval app, OCR app, etc.). How can I do that with ksqlDB?


Let's represent both requests and products as arrays of the steps it takes to create them. In the worst case, assume each step can contain custom information, so the representation should be of the type ARRAY<MAP<STRING,STRING>>. Using an array as a key is not currently supported in ksqlDB, but we can 'cheat' by casting the array to a string and using that string as the key.
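To make the representation concrete, here is a minimal Python sketch of the same idea outside ksqlDB. It is an illustration only: the field names `app` and `doc_id`, the helper names, and the JSON serialization are all assumptions, and ksqlDB's own array-to-string cast will produce a different string format. The sketch also assumes that the prerequisites of a request are its proper prefixes, since stages have to be done in order.

```python
import json


def request_key(steps):
    """Serialize a list of step maps into a deterministic string key.

    Stands in for casting ARRAY<MAP<STRING,STRING>> to a string; the
    JSON format here is an assumption, not what ksqlDB emits.
    """
    return json.dumps(steps, sort_keys=True, separators=(",", ":"))


def prerequisites(steps):
    """Return every proper prefix of a request, shortest first."""
    return [steps[:n] for n in range(1, len(steps))]


# Hypothetical three-stage request: retrieve, contrast-adjust, OCR.
request = [
    {"app": "retrieve", "doc_id": "42"},
    {"app": "contrast"},
    {"app": "ocr"},
]

# A cache keyed by the serialized steps, so a product that already
# exists for a given prefix is reused instead of recomputed.
cache = {}
for prefix in prerequisites(request) + [request]:
    cache.setdefault(request_key(prefix), "product of " + prefix[-1]["app"])
```

Because the key is built from the full list of steps, two requests that share their first stages map to the same cache entries for those stages, which is what allows intermediate results to be reused.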

Let's break down the logic first:

  • Any new request should trigger the requests of all its prerequisites.
  • Any
@panasenco
panasenco / ksqldb-autoincrement-example.md
Last active June 16, 2020 20:54
Creating an auto-incrementing column in ksqlDB

Suppose you want to insert values from one ksqlDB stream into another while auto-incrementing some integer value in the destination stream.

First, create the two streams:

CREATE STREAM dest (ROWKEY INT KEY, i INT, x INT) WITH (kafka_topic='test_dest', value_format='json', partitions=1);
CREATE STREAM src (x INT) WITH (kafka_topic='test_src', value_format='json', partitions=1);
@panasenco
panasenco / tmux_local_install.sh
Last active June 20, 2017 23:05 — forked from mkhalooei/tmux_local_install.sh
bash script for installing tmux without root access
#!/bin/bash
# Script for installing tmux on systems where you don't have root access.
# tmux will be installed in $HOME/local/bin.
# It's assumed that wget and a C/C++ compiler are installed.
# exit on error
set -e
TMUX_VERSION=2.5