


Ben Hoyt benhoyt

@benhoyt
benhoyt / birthday_probability.py
Created August 5, 2016 18:27
"Birthday problem" calculator in Python
"""Calculate the probability of generating a duplicate random number after
generating "n" random numbers in the range "d".
Usage: python birthday_probability.py n [d=365]
Each value can be either a plain integer or an expression of the form "2**x",
where x is the number of bits in the value.
For example, to calculate the probability that two people will have the same
birthday in a room with 23 people:
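The preview cuts off before the calculation itself, so here is a minimal sketch of one way to compute the probability the docstring describes. The function names `duplicate_probability` and `parse_value` are hypothetical, and the approach (summing logarithms of the "no duplicate" factors) is an assumption, not necessarily what the gist does.

```python
import math


def duplicate_probability(n, d=365):
    """Probability of at least one duplicate among n uniform draws from d values.

    Hypothetical helper: the gist's own implementation is not shown in the preview.
    """
    if n < 2:
        return 0.0
    if n > d:
        return 1.0  # pigeonhole: more draws than distinct values
    # P(all unique) = product over i in [0, n) of (d - i) / d.
    # Summing logs keeps the product stable when d is very large.
    log_unique = sum(math.log1p(-i / d) for i in range(n))
    # 1 - exp(x), written with expm1 to keep precision when x is tiny.
    return -math.expm1(log_unique)


def parse_value(text):
    """Accept a plain integer or the "2**x" form from the usage text."""
    if text.startswith("2**"):
        return 2 ** int(text[3:])
    return int(text)


if __name__ == "__main__":
    print(f"{duplicate_probability(23):.4f}")
```

The loop is linear in n, so this sketch suits values like 23 people or a few million IDs; for very large n an approximation such as 1 - exp(-n(n-1)/(2d)) would likely be the more practical choice.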
benhoyt / glob.go
Created June 8, 2022 01:25
Simple glob matcher in Go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	if len(os.Args) != 2 {
benhoyt / gist:66e33fcce9094cc11ba7e2d10bfd7657
Created May 1, 2022 22:04
Go switch jump tables - before and after performance using GoAWK
# NOTES
"before" is Go version go1.19-dd97871282, before the switch jump tables commits
"after" is Go version go1.19-78bea702cd, after the switch jump tables commits
# GOAWK'S GO MICROBENCHMARKS
$ benchstat -sort=delta -geomean benchmarks_before.txt benchmarks_after.txt
name            old time/op  new time/op  delta
IncrDecr-8      141ns ± 1%   160ns ± 2%   +13.59%  (p=0.008 n=5+5)
IfStatement-8   147ns ± 1%   157ns ± 2%    +6.59%  (p=0.008 n=5+5)
benhoyt / ngrams.py
Created May 12, 2016 15:34
Print most frequent N-grams in given file
"""Print most frequent N-grams in given file.
Usage: python ngrams.py filename
Problem description: Build a tool which receives a corpus of text,
analyses it and reports the top 10 most frequent bigrams, trigrams,
four-grams (i.e. most frequently occurring two, three and four word
consecutive combinations).
NOTES
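The notes themselves are cut off in this preview. As a rough illustration of the problem statement above, the following sketch counts the most frequent N-grams in a string. The name `top_ngrams`, the tokenizing regex, and the lowercase normalization are all assumptions made for this example; the gist may tokenize differently.

```python
import re
from collections import Counter


def top_ngrams(text, n, limit=10):
    """Return the `limit` most common n-word sequences in `text`.

    Hypothetical helper; tokenization here is a simple lowercase word split.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    # zip over n staggered views of the word list to form consecutive n-grams
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(grams).most_common(limit)


if __name__ == "__main__":
    sample = "the cat sat on the mat the cat ran"
    for size in (2, 3, 4):
        print(size, top_ngrams(sample, size, limit=3))
```

Slicing the word list n times copies it, which is fine for small corpora; a sliding window over an iterator would use less memory on large files.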
benhoyt / repeat-while.diff
Last active December 16, 2021 02:22
See how fast we can make direct-threaded code in C (using computed goto)
static void* prog[] = {
- // loop:
&&i_pushvar0, // pushvar i
&&i_pushnum, (void*)100000000, // pushnum 100000000
&&i_jge, (void*)5, // jge end
+ // loop:
&&i_pushvar0, // push i
&&i_addvar1, // addvar s
&&i_incvar0, // incvar i
- &&i_jmp, (void*)((long long)-10), // jmp loop
benhoyt / client.go
Created September 23, 2021 00:12
Benchmark of three ways to do optional fields in Go structs
package client

func IntPtr(n int) *int { return &n }

type FooArgsPtr struct {
	UserID  *int
	User    string
	GroupID *int
	Group   string
}
benhoyt / mt.py
Created September 21, 2021 03:33
Quick performance test of Python 3.10's "match" vs "if...elif"
"""Quick performance tests comparing "match" with "if...elif".
See:
https://benhoyt.com/writings/python-pattern-matching/
https://news.ycombinator.com/item?id=28601616
# Enum switch with match:
$ python3.10 -m timeit -s 'import mt' -c 'mt.enum_match(mt.FileType.FILE)'
1000000 loops, best of 5: 356 nsec per loop
$ python3.10 -m timeit -s 'import mt' -c 'mt.enum_match(mt.FileType.SYMLINK)'
benhoyt / python-stdlib.md
Created October 17, 2019 00:08
Overview of (parts of) the Python standard library

I'm going to demo a bunch of Python built-in and stdlib functions. There's a lot to get through, so I'll go fast, but please stop me and ask questions as we go. The goal is to give you a taste of Python's power and expressiveness if you're not a Python person, or to teach you a few new tricks if you already are.

Built-in functions

# enumerate: iterate with index *and* item
>>> strings = ['123', '0', 'x']
>>> for i, s in enumerate(strings):
...     print(f'{i} - {s}')  # f-strings!
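The REPL session is cut off here, so as a standalone version of the same idea: `enumerate` yields `(index, item)` pairs, and its optional `start` argument changes the first index. The variable names `lines` and `numbered` are just for this sketch.

```python
strings = ['123', '0', 'x']

# Same loop as the demo above, collected into a list instead of printed.
lines = [f'{i} - {s}' for i, s in enumerate(strings)]
print(lines)

# start=1 gives human-friendly numbering without writing i + 1 everywhere.
numbered = [f'{i}. {s}' for i, s in enumerate(strings, start=1)]
print(numbered)
```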
benhoyt / countwords.fs
Created March 12, 2021 07:33
Forth: print frequencies of unique words in stdin, most frequent first
200 constant max-line
create line max-line allot \ Buffer for read-line
wordlist constant counts \ Hash table of words to count
variable num-uniques 0 num-uniques !
\ Allocate space for new string and copy bytes, return new string.
: copy-string ( addr u -- addr' u )
  dup >r dup allocate throw
  dup >r swap move r> r> ;
benhoyt / markdown.diff
Created October 7, 2020 19:28
Diff to override goldmark's code block output
diff --git a/internal/markdown/markdown.go b/internal/markdown/markdown.go
index a729b9f..94d29c1 100644
--- a/internal/markdown/markdown.go
+++ b/internal/markdown/markdown.go
@@ -13,8 +13,11 @@ import (
"bytes"
"github.com/yuin/goldmark"
+ "github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/parser"