Skip to content

Instantly share code, notes, and snippets.

View ast_tagged_index.c
// We have five kinds of nodes: literal, negate, not, add, xor.
// In this case we only need 3 bits for the tag but you can use as many as you need.
enum Tag {
View self_validating_handles.c
struct Object {
Tag tag; // The tag is any piece of data that uniquely identifies the object.
// ...
struct Handle {
Tag tag;
Index index; // This caches a speculative table index for an object with the corresponding tag.
View graph_separators.txt
A grid graph is a directed graph where the vertices are arranged in k rows each
containing k vertices, where k is a positive integer. Let v[i,j] denote the jth
vertex in the ith row. Let s = v[1,1]. A grid graph has the following vertices and edges:
V = {v[i,j] | 1 <= i <= k and 1 <= j <= k}
E = {(v[i,j], v[i,j+1]) | 1 <= i <= k and 1 <= j < k} union
{(v[i,j], v[i,j-1]) | 1 <= i <= k and 1 < j <= k} union
{(v[i,j], v[i+1,j]) | 1 <= i < k and 1 <= j <= k}
View add_tail_call.c
ValueRef add(ValueRef left, ValueRef right) {
// Canonical commutation: x + y => y + x (constants have negative pos and move to the left)
if (left.pos > right.pos) SWAP(left, right);
switch (left.op) {
case NEG:
// Strength reduction: (-x) + y => y - x
return sub(right, getleft(left));
case NUM:
// Strength reduction: 0 + x => x
if (getnum(left) == 0) return right;
View perfold.c
// Heavily based on ideas from
// The most fundamental deviation is that I eschew the big hash table and the lj_opt_fold()
// trampoline for direct tail calls. The biggest problem with a trampoline is that you lose
// the control flow context. Another problem is that there's too much short-term round-tripping
// of data through memory. It's also easier to do ad-hoc sharing between rules with my approach.
// From what I can tell, it also isn't possible to do general reassociation with LJ's fold engine
// since that requires non-tail recursion, so LJ does cases like (x + n1) + n2 => x + (n1 + n2)
// but not (x + n1) + (y + n2) => x + (y + (n1 + n2)) which is common in address generation. The
// code below has some not-so-obvious micro-optimizations for register passing and calling conventions,
// e.g. the unary_cse/binary_cse parameter order, the use of long fields in ValueRef.
# Hash, displace, and compress:
# This is expected linear time for any seeded hash function that acts like a random hash function (universality isn't enough).
# (Actually, the code as written is O(n log n) when targeting 100% load. It's O(n) when targeting any smaller load factor.)
# You can make keys_per_bucket higher than the default of 4 but construction time will start to increase dramatically.
# The paper this is based on compresses the seeds (so the fact that the algorithm tries seeds in increasing order is important)
# which brings the representation size close to the information-theoretical minimum. I don't do any of that here, but it could
# be done as a postprocess.
def make_perfect_hash(keys, load_factor=1.0, keys_per_bucket=4, rhash=murmurhash, max_seed=1000000):
m = int(len(keys) / load_factor)
r = int(len(keys) / keys_per_bucket)
View clang_arm64_repro.c
// The buffer is terminated with "\n\0"
int count(const char *p) {
int n = 0;
while (*p) {
while (*p != '\n') p++;
while (*p == '\n') p++;
return n;
View perfect_hash_search.c
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
const char *keywords[] = {
# Here's a probably-not-new data structure I discovered after implementing weight-balanced trees with
# amortized subtree rebuilds (e.g.
# and realizing it was silly to turn a subtree into a sorted array with an in-order traversal only as an
# aid in constructing a balanced subtree in linear time. Why not replace the subtree by the sorted array and
# use binary search when hitting that leaf array? Then you'd defer any splitting of the array until inserts and
# deletes hit that leaf array. Only in hindsight did I realize this is similar to the rope data structure for strings.
# Unlike ropes, it's a key-value search tree for arbitrary ordered keys.
# The main attraction of this data structure is its simplicity (on par with a standard weight-balanced tree) and that it
# coalesces subtrees into contiguous arrays, which reduces memory overhead and boosts the performance of in-order traversals
View nodeiter.c
typedef struct {
// These fields are internal state and not considered part of the public interface.
Node *parent;
int next_index;
// This field is public and valid to read after a call to next that returns true.
Node *child;
} Iter;
Iter iter_children(Node *node) {