Skip to content

Instantly share code, notes, and snippets.

View ChayimFriedman2's full-sized avatar

Chayim Refael Friedman ChayimFriedman2

View GitHub Profile
@ChayimFriedman2
ChayimFriedman2 / google-benchmark-lex-keywords.md
Last active October 25, 2020 18:47
Benchmark of various ways to lex (scan) keywords

In the comparison: simple linear lookup, trie (prefix tree) and perfect hash (generated using gperf).

I compared the keywords of the Wren programming language, using the code of the core (at https://github.com/wren-lang/wren/blob/ad4e039187bda492709786762d4a36ec0b00d91c/src/vm/wren_core.wren). First I isolated the identifiers (including keywords) to std::array<std::string> (total 855 identifiers), then benchmarked the various methods to identify keywords using Google Benchmark with https://quick-bench.com (I don't know how to share links to benchmarks in https://quick-bench.com, sorry). I did not reviewed the generated assembly code, although I hope to do that in the near future.

The results are quite surprising: of course, linear lookup is the slowest. But the others depend on the compiler: with Clang perfect hash is the fastest, while with GCC trie takes. Overall, Clang is better at linear search (the difference in GCC is so much bigger), but lose to GCC with trie and perfect hash