Jim Snavely ludflu

@ludflu
ludflu / mwaa-var
Last active February 5, 2024 19:37
script for setting airflow variables in AWS MWAA
#!/bin/bash
# as described here: https://blog.beachgeek.co.uk/working-with-parameters-and-variables-in-amazon-managed-workflows-for-apache-airflow/
[ $# -eq 0 ] && echo "Usage: $0 MWAA environment name " && exit
if [[ -z "$2" ]]; then
  dag="variables list"
elif [ "$2" == "get" ] || [ "$2" == "delete" ] || [ "$2" == "set" ]; then
@ludflu
ludflu / llamaindex.py
Created January 8, 2024 16:40
Llamaindex chat with docs
from llama_index.llms import Ollama
from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    set_global_service_context,
@ludflu
ludflu / gist:96cad4f277e034e1f384befcdff1cbf7
Last active January 22, 2022 19:07
demo load checklist
  • alerts_by_actions
  • alerts_by_all
  • alerts_by_date
  • alerts_by_departments
  • alerts_by_employees
  • alerts_by_encounter
  • alerts_by_lgl
  • alerts_by_provider_types
  • alerts_by_triggers
  • alerts_with_dis
@ludflu
ludflu / music.hs
Created October 11, 2021 18:46
trying to use constraint programming to write music
{-# OPTIONS_GHC -Wno-missing-methods #-}
{-# LANGUAGE BlockArguments #-}
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE RankNTypes #-}
module Main where
@ludflu
ludflu / find_peaks.py
Last active March 11, 2020 19:10
Find peaks in time series data
# I wanted to see if my naive find_peaks code would be faster than: https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks.html
# So far the answer is that it's about the same. Will run more benchmarks.
def remove_repeats(data):
    acc = []
    head, *tail = data
    acc.append(head)
    for (v, i) in tail:
        (lastv, lasti) = acc[-1]
        if v != lastv:  # only append if element is not the same as last
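The preview of find_peaks.py is cut off here. As a rough illustration of the idea only (not the gist's actual code), the sketch below assumes the input is a list of `(value, index)` pairs, collapses runs of repeated values, and then reports strict local maxima. The name `find_peaks_naive` and the sample signal are hypothetical.

```python
def remove_repeats(data):
    """Collapse runs of equal values, keeping the first (value, index) of each run."""
    acc = []
    for v, i in data:
        # Only append if the value differs from the last one kept.
        if not acc or v != acc[-1][0]:
            acc.append((v, i))
    return acc


def find_peaks_naive(values):
    """Return indices of strict local maxima (hypothetical helper, not from the gist)."""
    points = remove_repeats([(v, i) for i, v in enumerate(values)])
    peaks = []
    # Slide a window of three de-duplicated points over the series.
    for (prev_v, _), (v, i), (next_v, _) in zip(points, points[1:], points[2:]):
        if v > prev_v and v > next_v:
            peaks.append(i)
    return peaks


signal = [0, 1, 3, 3, 2, 5, 1, 1, 4]
print(find_peaks_naive(signal))  # [2, 5]
```

In this sketch a plateau is reported at its left edge, and the first and last samples are never counted as peaks, since they lack a neighbor on one side.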
@ludflu
ludflu / datamgr.txt
Last active May 20, 2019 16:20
interview questions for a data engineering manager
Technical questions:
1. Describe a data pipeline or data warehouse you've built
2. How do you go about gathering requirements for a data pipeline or warehouse?
3. How do you unit test ETL systems?
4. Explain CI/CD for data systems
5. How do you track data provenance?
6. What makes a software architecture good or bad? What makes a code module good or bad? A function?
7. If someone gives you a process that is too slow, how do you improve its performance?
8. Explain normalized vs denormalized data schemas. Why would you pick one over the other?
@ludflu
ludflu / orders.sql
Last active September 10, 2020 15:02
--- for sqlfiddle go to http://sqlfiddle.com/#!17
create table customer(
    id serial primary key,
    name varchar(256)
);
create table ServiceOrder(
    id serial primary key,
    description varchar(256),
@ludflu
ludflu / blenderkeys.md
Last active October 24, 2023 00:53
Blender Keyboard Shortcuts

Blender Keyboard Shortcuts

| Key | Description |
| --- | --- |
| n | properties |
| g | translate (x,y,z) |
| r | rotate (x,y,z) |
| s | scale (x,y,z) |
| e | extrude |
| i | inset |
@ludflu
ludflu / privacy.md
Last active November 1, 2018 18:45
evolution of data privacy techniques

The evolution of privacy technology

  1. Data sanitizing - suppressing identifiers
  2. k-Anonymity (Sweeney & Samarati, 1998) - each individual contained in the dataset is indistinguishable from at least k-1 other individuals

In practice, it works by a combination of suppressing identifiers and bucketing values (https://en.wikipedia.org/wiki/K-anonymity). The k-Optimize algorithm by Bayardo and Agrawal (2005) approximates k-Anonymity. It aims to perform the "lowest cost" anonymization, meaning it suppresses and aggregates data as little as possible in order to achieve the required "k". The approach is not great for high-dimensional datasets.
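The combination of suppression and bucketing can be sketched in a few lines. This is a minimal illustration, not k-Optimize: the records, the field names, the ten-year age buckets, and the three-digit ZIP prefix are all made-up assumptions chosen to show how "k" is measured before and after generalization.

```python
from collections import Counter


def bucket_age(age, width=10):
    """Generalize an exact age into a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymize(records):
    """Suppress the direct identifier (name) and bucket the quasi-identifiers."""
    return [
        {
            "age": bucket_age(r["age"]),
            "zip": r["zip"][:3] + "**",  # keep only the ZIP prefix
            "diagnosis": r["diagnosis"],
        }
        for r in records
    ]


def k_of(records, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Hypothetical sample data, for illustration only.
records = [
    {"name": "Ann", "age": 34, "zip": "19104", "diagnosis": "flu"},
    {"name": "Bob", "age": 37, "zip": "19103", "diagnosis": "asthma"},
    {"name": "Cat", "age": 52, "zip": "19147", "diagnosis": "flu"},
    {"name": "Dan", "age": 58, "zip": "19146", "diagnosis": "diabetes"},
]

released = anonymize(records)
print(k_of(records, ["age", "zip"]))   # 1: every raw record is unique
print(k_of(released, ["age", "zip"]))  # 2: each person matches one other
```

Each added quasi-identifier column splits the groups further, which is one way to see why this style of anonymization struggles with high-dimensional data.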

@ludflu
ludflu / journal.scala
Last active January 2, 2018 16:21
free monad language for a key-value store
package Journal
import cats._
import cats.data.State
import cats.implicits._
import com.github.nscala_time.time.Imports._
import cats.free.Free
import cats.free.Free.liftF
import cats.arrow.FunctionK