Matthew Jacobs mjacobs

miquels / pam.c
Created December 7, 2018 10:22
PAM authentication boilerplate.
/* cc -DTEST -Wall -o pamtest pam.c -lpam */
#include <security/pam_appl.h>
#include <sys/resource.h>
#include <string.h>
#include <stdlib.h>
struct creds {
	char *user;
	char *password;
trusktr / DefaultKeyBinding.dict
Last active October 18, 2025 09:55
My DefaultKeyBinding.dict for Mac OS X
/* ~/Library/KeyBindings/DefaultKeyBinding.Dict
This file remaps the key bindings of a single user on Mac OS X 10.5 to more
closely match default behavior on Windows systems. This makes the Command key
behave like Windows Control key. To use Control instead of Command, either swap
Control and Command in Apple->System Preferences->Keyboard->Modifier Keys...
or replace @ with ^ in this file.
Here is a rough cheatsheet for syntax.
Key Modifiers

Recent versions of Cloudera's Impala added NDV, a "number of distinct values" aggregate function that uses the HyperLogLog algorithm to estimate this number, in parallel, in a fixed amount of space.

This can make a really, really big difference: in a large table I tested this on, which had roughly 100M unique values of mycolumn, using NDV(mycolumn) got me an approximate answer in 27 seconds, whereas the exact answer using count(distinct mycolumn) took ... well, I don't know how long, because I got tired of waiting for it after 45 minutes.

It's fun to note, though, that because of another recent addition to Impala's dialect of SQL, the fnv_hash function, you don't actually need to use NDV; instead, you can build HyperLogLog yourself from mathematical primitives.

HyperLogLog hashes each value it sees and then assigns it to a bucket based on the low-order bits of the hash. It's common to use 1024 buckets, so we can get the bucket number with a bitwise & against 1023 (1024 - 1, which keeps the low 10 bits):

select
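
To make the bucket step concrete outside of SQL, here is a minimal Python sketch. It assumes a 64-bit FNV-1a hash as a stand-in for Impala's fnv_hash, which may use a different FNV variant, so the bucket numbers need not match what Impala produces. The names fnv1a_64 and bucket_for are made up for this illustration.

```python
# Sketch of HyperLogLog's bucket assignment: hash the value, then keep
# the low-order bits of the hash as the bucket number.
# Assumption: 64-bit FNV-1a stands in for Impala's fnv_hash.

FNV_OFFSET_BASIS = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV_PRIME = 0x100000001B3              # standard 64-bit FNV prime
MASK_64 = 0xFFFFFFFFFFFFFFFF           # keep the arithmetic within 64 bits
NUM_BUCKETS = 1024                     # must be a power of two for the & trick


def fnv1a_64(data: bytes) -> int:
    """Return the 64-bit FNV-1a hash of data (hypothetical helper)."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK_64
    return h


def bucket_for(value: str) -> int:
    """Return the bucket (0..1023) for value, using the low 10 hash bits."""
    return fnv1a_64(value.encode("utf-8")) & (NUM_BUCKETS - 1)


if __name__ == "__main__":
    for v in ("alice", "bob", "carol"):
        print(v, bucket_for(v))
```

Because 1024 is a power of two, masking with 1023 gives the same result as taking the hash modulo 1024, just more cheaply.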
#!/usr/bin/env python
import random
from subprocess import Popen, PIPE
COW_MOODS = ('b', 'd', 'g', 'p', 's', 't', 'y')
COW_COMMANDS = ('cowsay', 'cowthink')
COW_FILES = ''.join(Popen('cowsay -l',
shell=True,
jboner / latency.txt
Last active October 28, 2025 10:58
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD