As the ethereum trie becomes more and more saturated, the number of disk lookups that a node is required to do in order to access a piece of state increases too. This means that checking the EXTCODEHASH
of an account at block 5
was inherently a cheaper operation that it is at, say 8.5M
.
So far, the solution to these problems has been in raising the gas cost. From a node perspective, a node can (and does) use various caching mechanisms to cope with the problem, but there's an inherent problem with caches: when the yield a 'hit', they're great, but when they 'miss', they're useless.
This is attackable. By forcing a node to lookup non-existant keys, an attacker can maximize the number of disk lookups.
Sidenote: even if the 'non-existence' is cached, it's trivial to use a new non-existent key the next time, and never hit the same non-existent key again. Thus: caching 'non-existence' might be dangerous, since it will evict 'good' entries.
And there we have a problem: should we price these ops according to:
- The happy-path, where items are cached?
- No, that would obviously be extremely dangerous
- The 'normal' usage, based on benchmarks of actual usage?
- This is basically what we do now,
- The 'paranoid' case: price everything as if caching did not exist?
- This would severely harm basically every contract
From an engineering point of view, we're left with few options:
- Implement bloom filters for existence. This is difficult, not least because of the eternal problems of reorgs, and the fact that it's difficult to undo bloom filter modifications.
- Implement flattened account databases. This is also difficult, both because of reorgs and also because it needs to be an additional data structure aside from the
trie
-- we need thetrie
for consensus. So it's an extra data structure of around15G
that needs to be kept in check. Nevertheless, this is something we're actively working on within the geth team, specifically @karalabe.
So, is there some way that we can be nice to those who play nice, but penalize those who do not? Yes, there is.
For a trie-dependent opcode, such as EXTCODEHASH
, let
base
be the base cost (currently700
)penalty
be the penalty cost (e.g.2000
).
Now, if extcodehash(addr)
is performed, and addr
does not exist, then penalty
gas is deducted from the gas
.
The addition of a penalty
, depending on the result of the execution of an opcode, is something completely new. Currently, all ops are either fully static in cost, or their cost depends on the stack arguments, but no current opcode has a post-execution cost. While that would be a totally new thing, I don't believe it would be very difficult to implement.
The upside of this scheme is that we could continue to price these operations based on the 'normal' usage, but gain protection from inexistence-attacks that try to maximize disk lookups.
Code Is The Road on the Board
vitalik
means We Talk or Walkie-talkie.Let's Code the Universal Virtuality Of Everything.
EVM=MVA
Ethereum Virtual Machine= Make Virtual Era.