@holiman
Created October 4, 2019 17:58
Part one: Caching in consensus

The magic of caching

It's sometimes said, within software engineering, that there are no 'silver bullets' -- magical things that automatically fix all of your problems. But actually, there is something that comes pretty close in many scenarios. When everything has been stretched to the limit and the software still falls short of the demands placed on it, there's one thing that engineers will reach for: The Cache.

A cache layer can scale a system by orders of magnitude, and it can and should be used wherever appropriate. When we put in a caching layer, we suddenly have a shelf where we can put "stuff" that might be useful later. It's great.

Caching, and perhaps more specifically cache invalidation, is generally considered a hard problem, and a cache can of course be tuned and experimented with at length before it's fully matured. Do we want least-recently-used eviction? Least-frequently-used? Do we want strict eviction guarantees, or is fuzzy FIFO-ish eviction OK? Do we want to prioritize eviction based on the cost (e.g. size) of elements? Do we want the cache to handle multiple concurrent writers, at the expense of a slight delay between `Set` and the internal commit?
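
To make the eviction discussion concrete, here is a minimal sketch of a least-recently-used cache in Go. All the names here are illustrative, and a production cache would add locking, metrics, and possibly cost-based eviction:

```go
package lrucache

import "container/list"

// lru is a minimal least-recently-used cache sketch.
type lru struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in order
}

type entry struct {
	key   string
	value []byte
}

func newLRU(capacity int) *lru {
	return &lru{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the cached value and refreshes its recency.
func (c *lru) Get(key string) ([]byte, bool) {
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).value, true
	}
	return nil, false
}

// Set inserts or updates a value, evicting the least recently
// used entry once the cache is over capacity.
func (c *lru) Set(key string, value []byte) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}
```

Swapping the eviction policy here -- LFU, FIFO, cost-weighted -- is a purely local decision: no one else needs to know or care.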

These are all luxury problems that a software engineer can fiddle around with in order to get the maximum boost out of their chosen cache.

Later on, at runtime, we can set the cache size depending on the overall resources of the machine the program is running on -- using a larger cache on a big fat AWS m2 instance, and a smaller one when running on a Raspberry Pi.
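
As a sketch of that kind of tuning, assuming a hypothetical `-cache` flag similar in spirit to the memory knobs real clients expose (the default and the entry-size estimate are invented for illustration), the capacity can be derived from a memory budget at startup:

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// Hypothetical flag; every node operator picks their own budget.
	budgetMB := flag.Int("cache", 1024, "memory budget for caches, in MiB")
	flag.Parse()

	const avgEntryBytes = 256 // assumed average size of a cached item
	capacity := *budgetMB * 1024 * 1024 / avgEntryBytes
	fmt.Printf("cache capacity: %d entries\n", capacity)
}
```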

Yes, caches are great!

Ruining the magic

In EVM discussions, an idea sometimes pops up, roughly along these lines:

  • If an account 'X' has already been accessed in this (transaction / block / last chunk of 100 blocks), the node would already have it in the cache. So calling it should be cheaper.
  • If a storage slot 'Y' has already been accessed in this (context / transaction / block), the node would already have it cached, so accessing it should be cheaper.

So let's follow this train of thought a bit. Let's say we institute a rule that a CALL to any account accessed within the last 100 blocks is subsidized. At that point, our "cache" has lost all configurability, because it's now part of the consensus engine.

  • If geth has 12341023 items in the cache and parity has 12341022, that is an exploitable consensus issue.
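
To make this concrete, here is a hedged sketch of what such a rule would look like as code. The constants and names are invented for illustration (the real numbers would be set by an EIP); the point is that the predicate deciding "warm" versus "cold" is now consensus logic:

```go
package gasrule

// Address stands in for a 20-byte account address.
type Address [20]byte

// Hypothetical cost constants, invented for illustration.
const (
	coldCallGas uint64 = 700 // account not accessed within the window
	warmCallGas uint64 = 100 // subsidized: accessed within the window
)

// callGas prices a CALL under the proposed rule. The crucial detail is
// that seenInWindow must return exactly the same answer on every node,
// for every block: the "cache" contents are now consensus state.
func callGas(addr Address, seenInWindow func(Address) bool) uint64 {
	if seenInWindow(addr) {
		return warmCallGas
	}
	return coldCallGas
}
```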

This means:

  • We can't pick and choose an eviction policy; the eviction policy becomes a consensus rule.
  • If a transaction reverts, we can't choose to just "keep" the stuff it accessed in the cache, even if it might be useful later! We must explicitly remove all those elements.
    • Which also means that we need a journalling layer, to be able to revert the "cache" to a previous checkpoint (see the sketch after this list).
  • We lose any ability to scale the "cache" allowance depending on hardware: all nodes are now forced to cache exactly the same amount of data.
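
As a sketch of the journalling point above, here is what a consensus-critical "cache" with checkpoint/revert support might look like. The names are illustrative, and a fuller version would also journal overwrites of existing keys:

```go
package journalcache

// journaledCache sketches what the "cache" must become once it is
// consensus-critical: every insertion is journalled so that a
// reverted transaction can be rolled back.
type journaledCache struct {
	items   map[string][]byte
	journal []string // keys inserted, in order
	marks   []int    // journal lengths at each open checkpoint
}

func newJournaledCache() *journaledCache {
	return &journaledCache{items: make(map[string][]byte)}
}

// Snapshot opens a checkpoint, e.g. at the start of a transaction.
func (c *journaledCache) Snapshot() {
	c.marks = append(c.marks, len(c.journal))
}

// Set inserts a value and records new keys in the journal.
func (c *journaledCache) Set(key string, value []byte) {
	if _, ok := c.items[key]; !ok {
		c.journal = append(c.journal, key)
	}
	c.items[key] = value
}

// Revert drops everything inserted since the most recent checkpoint.
// Unlike with a real cache, we are *forced* to throw these entries away.
func (c *journaledCache) Revert() {
	mark := c.marks[len(c.marks)-1]
	c.marks = c.marks[:len(c.marks)-1]
	for _, key := range c.journal[mark:] {
		delete(c.items, key)
	}
	c.journal = c.journal[:mark]
}
```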

Furthermore, it also opens the door to cache attacks. Let's say we determine that, over 100 blocks, we'd normally access 20K accounts. So a consensus node needs a fast lookup (RAM) of 20K accounts in order to play ball. However, if some attacker fills 100 blocks with transactions that do nothing but access other accounts, what would the theoretical maximum be? I'd guess somewhere between two and three orders of magnitude more, meaning that suddenly we'd have 2M to 20M accounts in our fast lookup. Mandatory!
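
A back-of-envelope calculation lands in the same range. All three numbers below are assumptions for illustration: a ~10M block gas limit (roughly the 2019 mainnet value), a few hundred gas per account touch, and the 100-block window:

```go
package main

import "fmt"

func main() {
	const (
		blockGasLimit = 10_000_000 // assumed per-block gas limit
		gasPerAccount = 400        // assumed cost per account touch
		window        = 100        // blocks in the subsidy window
	)
	perBlock := blockGasLimit / gasPerAccount
	fmt.Printf("~%d accounts per block, ~%d over the window\n",
		perBlock, perBlock*window)
	// Output: ~25000 accounts per block, ~2500000 over the window --
	// two orders of magnitude above the "normal" 20K estimate.
}
```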

At this point, it's no longer a cache. It's just another complicated consensus data structure. In order to handle it, we'll probably need to place an actual cache layer beneath it, so we can retain all the good things that the consensus rules otherwise force us to remove, and to make sure we utilize all that juicy RAM to the fullest extent.

TL;DR

Caches are great, but they should not be enshrined in consensus rules, since that strips away the very things that made them great in the first place.
