This is based on code in my current program (https://gitlab.com/nyanpasu64/exotracker-cpp).
In practice, synth_run_clocks contains a very tight loop whose body runs up to 1.79 million times per second (once per NES clock cycle). I thought it might be faster if the innermost method call were statically dispatched. (But if the loop is complex, would that cause instruction cache bloat?)
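
To make the idea concrete, here is a minimal sketch of one way to get a virtual outer call with a statically dispatched inner call, using CRTP. It is not the code from the linked repository: only the name synth_run_clocks comes from the description above, and BaseSynth, SynthLoop, PulseSynth, run_one_clock, and demo_run are hypothetical names made up for illustration.

```cpp
#include <cstdint>

// Hypothetical interface: callers pay for one virtual call per batch of clocks.
class BaseSynth {
public:
    virtual ~BaseSynth() = default;
    virtual void synth_run_clocks(int nclk) = 0;
};

// CRTP helper (assumed design): the loop is instantiated once per concrete
// synth type, so the call to run_one_clock() is resolved at compile time
// and is a candidate for inlining.
template <typename Derived>
class SynthLoop : public BaseSynth {
public:
    void synth_run_clocks(int nclk) final {
        auto& self = static_cast<Derived&>(*this);
        for (int i = 0; i < nclk; ++i) {
            self.run_one_clock();  // non-virtual call
        }
    }
};

// Hypothetical concrete synth; the per-clock work is just a counter here.
class PulseSynth final : public SynthLoop<PulseSynth> {
public:
    void run_one_clock() { ++clocks_run; }
    std::int64_t clocks_run = 0;
};

// Drives the synth through the base interface, as an audio callback might.
std::int64_t demo_run(int nclk) {
    PulseSynth synth;
    BaseSynth& base = synth;
    base.synth_run_clocks(nclk);  // one virtual call, nclk static calls
    return synth.clocks_run;
}
```

The trade-off is that every concrete synth type gets its own copy of the loop, which is exactly where the code-size concern comes from. Whether that costs more than the virtual calls it removes would need to be measured on the real loop.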