tmux cheatsheet

As configured in my dotfiles.

start new:


start new with session name:

ashrithr /
Last active March 14, 2024 21:16
kafka introduction

Introduction to Kafka

Kafka acts as a kind of write-ahead log (WAL): it records messages to a persistent store (disk) and lets subscribers read those changes and apply them to their own stores at whatever pace suits each system.


  • Producers send messages to brokers
  • Consumers read messages from brokers
  • Messages are sent to a topic
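The producer, consumer, and topic roles listed above can be sketched with a toy in-memory log. This is only an illustration of the idea, not Kafka's API: `ToyBroker` and `ToyConsumer` are hypothetical names, and the sketch leaves out partitions, persistence to disk, and networking.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker: one append-only log per topic."""

    def __init__(self):
        self._logs = defaultdict(list)

    def send(self, topic, message):
        # Producers append to the end of the topic's log and get back the offset.
        self._logs[topic].append(message)
        return len(self._logs[topic]) - 1

    def read(self, topic, offset):
        # Consumers read everything from a given offset onward.
        return self._logs[topic][offset:]


class ToyConsumer:
    """Hypothetical consumer that tracks its own read position."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self):
        messages = self.broker.read(self.topic, self.offset)
        self.offset += len(messages)
        return messages


broker = ToyBroker()
broker.send("orders", "created:1")
broker.send("orders", "created:2")

fast = ToyConsumer(broker, "orders")
slow = ToyConsumer(broker, "orders")

first = fast.poll()            # fast consumer reads right away
broker.send("orders", "paid:1")
second = fast.poll()           # and then picks up only the new message
catch_up = slow.poll()         # slow consumer reads the whole log later
```

Because each consumer keeps its own offset, a slow subscriber can catch up later without holding back a fast one.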
timothyandrew /
Last active December 16, 2023 17:05
Set up a seedbox (on DigitalOcean – Ubuntu) really quick


  • This script lets you set up and use a temporary DigitalOcean droplet to download torrent files.
  • Once downloaded, they can be streamed down to your local machine.
  • This uses transmission-cli for the torrent client, and nginx to serve files.

Setup on Local Machine

  • This assumes that you have a DigitalOcean account and tugboat set up, as well as present in the current directory.
fulmicoton / Kaggle Yandex parse code
Created December 11, 2013 12:29
The parse function returns a generator of Session objects. It takes a generator of tuples as input.
import itertools
from collections import defaultdict, OrderedDict
from math import log
def dcg(scores):
    return sum( (2**score - 1) / log(i+2) for (i, score) in enumerate(scores) )
class Session(object):
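As a rough illustration of how the `dcg` helper in the preview above behaves, the sketch below restates it and adds a normalized variant. The `ndcg` function and the sample relevance lists are assumptions made for this example; they do not appear in the preview.

```python
from math import log


def dcg(scores):
    # Same formula as the preview: gain (2**score - 1) discounted by the
    # natural log of (position + 2), with positions starting at 0.
    return sum((2 ** score - 1) / log(i + 2) for i, score in enumerate(scores))


def ndcg(scores):
    # Hypothetical helper: divide by the DCG of the best possible ordering.
    ideal = dcg(sorted(scores, reverse=True))
    return dcg(scores) / ideal if ideal > 0 else 0.0


best_order = ndcg([2, 1, 0])   # most relevant result ranked first
worst_order = ndcg([0, 1, 2])  # most relevant result ranked last
```

The choice of logarithm base only scales `dcg` by a constant, so it cancels out in the normalized ratio.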
debasishg / gist:8172796
Last active May 10, 2024 13:37
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits some of the historical background and how it all began with Flajolet.
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](
lelandbatey /
Last active June 2, 2024 11:23
Whiteboard Picture Cleaner - Shell one-liner/script to clean up and beautify photos of whiteboards!


This simple script will take a picture of a whiteboard and use parts of the ImageMagick library with sane defaults to clean it up tremendously.

The script is here:

convert "$1" -morphology Convolve DoG:15,100,0 -negate -normalize -blur 0x1 -channel RBG -level 60%,91%,0.1 "$2"


bqm / findMeetupWaitlistPosition.js
Last active April 6, 2023 15:40
Find your position in a waitlist
// Open the developer console of your favorite browser
// on the page of your favorite meetup where you're on the waitlist (grrr)
// and paste this by putting your name
// This will actually compute the position in the waitlist div which seems to correspond to the waitlist position
var findPosition = function(username) { return $("#rsvp-list-waitlist h5").map(function(i, el) {return {"pos": i, "name": $(el).text()}}).filter(function(i, el) {return el["name"].indexOf(username) >= 0;})}
findPosition("my displayed name") // where my displayed name is the name that is displayed for you on the meetup
bsweger /
Last active April 19, 2024 18:04
Useful Pandas Snippets


A personal diary of DataFrame munging over the years.

Data Types and Conversion

Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
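A minimal sketch of the conversion described above, using `pd.to_numeric`. The DataFrame and its `price` column are made up for illustration, and the `errors="coerce"` variant is an addition here for comparison rather than part of the original note.

```python
import pandas as pd

# Hypothetical data: one value cannot be parsed as a number.
df = pd.DataFrame({"price": ["1", "2.5", "oops"]})

# Default behaviour (errors="raise"): a non-numeric value raises ValueError.
try:
    pd.to_numeric(df["price"])
    raised = False
except ValueError:
    raised = True

# With errors="coerce", unparseable values become NaN instead of raising.
coerced = pd.to_numeric(df["price"], errors="coerce")
```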

jmoiron / Makefile
Last active December 23, 2015 19:09
Monte Carlo pi estimation in different languages
all: monte-c monte-go monte-rs monte-gccgo
go build montepi.go && mv montepi monte-go
rustc -O -o monte-rs
gcc -std=c99 -O2 -o monte-c montepi.c -lm
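
For readers unfamiliar with the technique these builds benchmark, here is a minimal Python sketch of Monte Carlo pi estimation. It is not taken from the gist's `montepi` sources; the function name, sample count, and seed are assumptions for illustration.

```python
import random


def estimate_pi(samples, seed=0):
    # Draw points uniformly in the unit square and count how many fall
    # inside the quarter circle of radius 1; that fraction approximates pi/4.
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


estimate = estimate_pi(100_000)
```

The error shrinks roughly with the square root of the sample count, which is why this makes a convenient tight-loop benchmark across languages.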

A Few Useful Things to Know about Machine Learning

The paper presents some key lessons and "folk wisdom" that machine learning researchers and practitioners have learnt from experience and which are hard to find in textbooks.

1. Learning = Representation + Evaluation + Optimization

All machine learning algorithms have three components:

  • Representation: the set of classifiers/functions that the learner can possibly learn. This set is called the hypothesis space. If a function is not in the hypothesis space, it cannot be learnt.
  • Evaluation: a function that scores how good a candidate model is.
  • Optimization: the method used to search the hypothesis space for the highest-scoring model.
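
The three components above can be made concrete with a deliberately tiny learner. This is a sketch under simple assumptions, not an example from the paper: the hypothesis space is a handful of threshold classifiers, evaluation is training accuracy, and optimization is exhaustive search. All names and data are hypothetical.

```python
def make_threshold_classifier(t):
    # Representation: each hypothesis predicts 1 when x >= t, else 0.
    return lambda x: 1 if x >= t else 0


def accuracy(classifier, data):
    # Evaluation: fraction of (x, label) pairs the classifier gets right.
    return sum(classifier(x) == y for x, y in data) / len(data)


def fit(data, thresholds):
    # Optimization: try every threshold and keep the best-scoring one.
    return max(thresholds, key=lambda t: accuracy(make_threshold_classifier(t), data))


data = [(0.5, 0), (1.0, 0), (2.5, 1), (3.0, 1)]
thresholds = [0.0, 1.0, 2.0, 3.0]

best = fit(data, thresholds)
best_accuracy = accuracy(make_threshold_classifier(best), data)
```

A threshold that is not in `thresholds` can never be returned by `fit`, which mirrors the point that a function outside the hypothesis space cannot be learnt.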