David Volquartz Lebech dlebech

dlebech / binomial_prob.sql
Last active Nov 17, 2020
Binomial probability calculation function in SQL (BigQuery)
-- Public Domain CC0 license.
-- Calculate the probability of k successes in n trials with probability of success p,
-- using the binomial distribution.
-- Calculate the binomial coefficient using the "multiplicative formula"
CREATE OR REPLACE FUNCTION functions.binomial_coef(n INT64, k INT64) AS ((
-- n!/(k!*(n-k)!)
-- We're going to have a hard time doing factorials here,
-- but based on the "multiplicative formula" in Wiki, it should be possible:
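The multiplicative formula mentioned above builds C(n, k) as a running product of (n - k + i) / i for i = 1..k, so no large factorial is ever formed. The following is a minimal Python sketch of that idea, not the gist's SQL; the function names mirror the SQL function, but the Python signatures and the `binomial_prob` helper are assumptions made for illustration.

```python
def binomial_coef(n: int, k: int) -> int:
    """Binomial coefficient C(n, k) via the multiplicative formula."""
    if k < 0 or k > n:
        return 0
    # Use the symmetry C(n, k) == C(n, n - k) to shorten the loop.
    k = min(k, n - k)
    result = 1
    for i in range(1, k + 1):
        # After step i, result equals C(n - k + i, i), so the integer
        # division below is always exact.
        result = result * (n - k + i) // i
    return result


def binomial_prob(n: int, k: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success probability p."""
    return binomial_coef(n, k) * p**k * (1 - p) ** (n - k)
```

For example, `binomial_prob(10, 3, 0.5)` evaluates C(10, 3) = 120 and multiplies it by 0.5 to the tenth power.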
dlebech /
Created Jul 26, 2020
Python Script for downloading and organizing images from The Painting Dataset:
# Public Domain CC0 license.
# Download images from The Painting Dataset:
# The image urls are outdated in the Excel sheet but the painting urls are not,
# so this script re-crawls those images and downloads them locally.
# It works as of July 2020.
# Run this first with:
# $ scrapy runspider -o paintings.json
# Images are stored in 'out/raw'
dlebech /
Last active Jun 8, 2019
Extract photos and names of members of Danish parliament
# Public Domain CC0 license.
# Run this file first, e.g.:
# $ scrapy runspider -o members.json
# It will probably stop working if they change the urls for the contact list.
# It worked as of spring 2019.
import scrapy
import re
from urllib.parse import urlparse, urlunparse
dlebech / tokenizer.js
Last active Mar 4, 2021
Keras text tokenizer in JavaScript with minimal functionality
// Public Domain CC0 license.
class Tokenizer {
constructor(config = {}) {
this.filters = config.filters || /[\\.,/#!$%^&*;:{}=\-_`~()]/g;
this.lower = typeof config.lower === 'undefined' ? true : config.lower;
// Primary indexing methods. Word to index and index to word.
this.wordIndex = {};
this.indexWord = {};
dlebech /
Last active Mar 13, 2020
Matplotlib useful one liners that I always forget
# Matplotlib
# Creating a list of colors (e.g. for a bar chart)
# "Blues" is the colormap. It can be any colormap
colors = [matplotlib.colors.to_hex(c) for c in matplotlib.cm.Blues(np.linspace(0, 1, len(some_dataframe.index)))]
# Globally adjusting DPI and figure size
matplotlib.rcParams['figure.dpi'] = 100
matplotlib.rcParams['figure.figsize'] = [6.0, 4.0]
dlebech /
Last active Jun 16, 2018
Minimal Keras examples for various purposes
# Public Domain CC0 license.
# Create a Keras embedding layer with an initial one-hot encoding by using identity initializer
import tensorflow as tf
import numpy as np
# Input sequence consisting of four features (e.g. words)
# Let's pretend this is "hello world hello everyone else"
# where hello is mapped to 1, world = 0, everyone = 2, else = 3.
a = np.array([[1, 0, 1, 2, 3]])
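The idea of an identity-initialized embedding can be illustrated without Keras: if the embedding matrix is the identity, looking up index i returns row i of that matrix, which is exactly the one-hot vector for i. The sketch below uses NumPy only and assumes a vocabulary of four words; it shows the lookup, not the gist's Keras layer.

```python
import numpy as np

# Same input as above: "hello world hello everyone else" as word indices.
a = np.array([[1, 0, 1, 2, 3]])

vocab_size = 4  # assumed: four distinct words
embedding_matrix = np.eye(vocab_size)  # identity "weights"

# Indexing the matrix with the integer sequence performs the embedding lookup.
one_hot = embedding_matrix[a]  # shape: (batch, sequence length, vocab_size)
```

Each position in the sequence becomes a length-4 vector with a single 1 at the word's index.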
dlebech /
Created May 4, 2017
Command-line notes
# Convert a unix timestamp in milliseconds in the first column of a CSV to a date
cat thefile.csv | perl -MPOSIX -pe 's/(^\d+),/strftime("%F,", localtime($1\/1000))/ge'
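For cases where Perl is not at hand, roughly the same conversion can be sketched in Python. The function name `convert_first_column` is hypothetical. Unlike the Perl one-liner, which uses local time, this sketch formats the date in UTC so that the result does not depend on the machine's time zone.

```python
import csv
import io
from datetime import datetime, timezone


def convert_first_column(text: str) -> str:
    """Replace a millisecond unix timestamp in column one with a YYYY-MM-DD date."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        # Leave header rows and non-numeric values untouched.
        if row and row[0].isdecimal():
            seconds = int(row[0]) / 1000
            # Assumption: UTC instead of the local time used by the Perl version.
            row[0] = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d")
        writer.writerow(row)
    return out.getvalue()
```

Reading the file contents and passing them to this function plays the role of `cat thefile.csv | perl ...`.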
dlebech /
Last active Sep 18, 2020
Super-simple MongoDB Apache Beam transform for Python
# Public Domain CC0 license.
"""MongoDB Apache Beam IO utilities.
Tested with google-cloud-dataflow package version 2.0.0
__all__ = ['ReadFromMongo']
import datetime
dlebech / redis.go
Created May 19, 2016
Connecting to Redis in Golang
package services
import (
log ""
dlebech /
Created Mar 20, 2016
Python LRU cache that works with coroutines (asyncio)
"""Global LRU caching utility. For that little bit of extra speed.
The caching utility provides a single wrapper function that can be used to
speed up frequently called functions. The cache is an LRU cache with a
per-key timeout.
import cache
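The standard `functools.lru_cache` does not help with coroutine functions, because it would cache the coroutine object itself, and a coroutine object can only be awaited once. A cache for asyncio code therefore has to await the call and store the result. The following is a minimal sketch of that approach, not the gist's actual code; the decorator name `async_lru_cache` and its `maxsize` and `timeout` parameters are assumptions.

```python
import asyncio
import functools
import time
from collections import OrderedDict


def async_lru_cache(maxsize=128, timeout=60.0):
    """Cache awaited results of a coroutine function, LRU-evicted, with a per-key timeout."""

    def decorator(func):
        cache = OrderedDict()  # key -> (expiry time, value)

        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            now = time.monotonic()
            hit = cache.get(key)
            if hit is not None and hit[0] > now:
                cache.move_to_end(key)  # mark as most recently used
                return hit[1]
            # Miss or expired entry: await the coroutine and store its result.
            value = await func(*args, **kwargs)
            cache[key] = (now + timeout, value)
            cache.move_to_end(key)
            while len(cache) > maxsize:
                cache.popitem(last=False)  # evict the least recently used entry
            return value

        return wrapper

    return decorator
```

One design limitation of this sketch: concurrent calls with the same key that arrive before the first one finishes will each run the wrapped coroutine, since nothing is stored until the result is available.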