Skip to content

Instantly share code, notes, and snippets.

@gcolvin
Created December 30, 2016 21:00
Show Gist options
  • Save gcolvin/517e754d474c4bcfd15bd9ed55a21020 to your computer and use it in GitHub Desktop.
Save gcolvin/517e754d474c4bcfd15bd9ed55a21020 to your computer and use it in GitHub Desktop.
Effective, DoS-resistant code caching

I've been thinking on an effective cache of compiled bytecode that is difficult to attack. LRU, FIFO, and related eviction/retention policies are easily attacked with round robin access. Currently there are about 4 GB worth of compiled contract code on tth blockchain. Two policies seem safe. I'd like to discuss whether they are in fact effective and safe.

  1. Random eviction:

The number of transactions per contract is Zipf distributed with alpha=1; I estimate that if you cached 1/4 of the contracts you'd get a hit ratio of about 80-90%. Probably worth it, and it seems expensive to DoS. To degrade the cache you have to pollute it with spam, and the more spam you put in the higher the probability of it getting kicked out. If you managed to get the cache half-full of spam then you would have to keep paying for half of all transactions to keep it that way.

  1. Expensive retention:

The retention formula is F = geo/m, where the factors are moving averages of

  • g = gas use
  • r = runtime
  • e = effectiveness = r(post-compile) / r(pre-compile)
  • o = optimization time
  • m = memory size

Intuitively, the preferences of this formula are that

  • programs that spend a lot of gas are cached (and are paying for the memory to cache them)
  • programs that are effectively compiled are cached (otherwise why compile them?)
  • programs that are expensive to compile are cached (so they don't need to be compiled again)
  • programs that are large are expensive to cache (as memory to cache them in is a cost)

The main attack vector would be small contracts that don't take much gas that are effectively but expensively compiled. The defense is that to some extent expensive compilation (especially of validated code) requires larger programs, and that it still takes some gas to stay cached

To maintain a cache of compiled code using this policy we maintain two heaps, U and V. U is a heap of unoptimized contracts, with just the stats to compute F. It is sorted so that popping the heap yields the contract with the best F. V is a heap that includes optimized contracts, and is sorted so that popping the heap yields the contract with the worst F.

As each contract is processed we

  • push new code if any on heap U
  • update and re-heapify heap U and heap V
  • if top element in U is greater than top element in C then
  • _ discard top of V
  • _ pop V
  • _ let u = top of U
  • _ let v = optimized u
  • _ push v on V
  • _ pop U The result is that we have stats on all known contracts, and the cache now contains contracts with the highest F value.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment