Medalla non-finality mid October
What is happening with the Medalla eth2 testnet currently, Oct 17th 2020? Here's an ELI5. (Not exactly 5. Maybe 10. You get the idea.)
- A bunch of validators turned off (maybe because of Zinken, maybe boredom, who knows)
- We entered non-finality. This means we don't have enough validators online to agree on what the state of the network is. We need 2/3rds.
- Some sync bugs reared their heads in Prysm and Nimbus. We lost more validators to the bugs, and not everyone has updated since.
- Non-finality increases memory and CPU requirements, so we likely lost more validators because their nodes couldn't handle it
- Either people come back in and we regain finality, or
- they don't, and offline validators lose eth faster and faster until we regain finality. Some may be ejected if their balance falls too low.
- You can stare at beaconcha.in to see current participation rate.
- This would not go on this long on mainnet because people aren't just going to say "meh" while real eth burns
- The network is working as designed for a major disruption scenario in which it needs to self-heal
- I'm not sure we have a solid handle on when the network is back up if offline validators don't come back and the chain has to heal itself, but lower bound Oct 25th (ish), upper bound 5 days later. Edit: This was too optimistic; expect us to regain finality by Nov 5th or a few days earlier.
- Anyone who is offline and doesn't want to come back on can help by doing an orderly exit. Here's a tool to make it easy: https://github.com/eth2-educators/medalla-exit
= End of ELI10 =
In some more technical detail:
We entered non-finality on the morning of October 12th, US Eastern time. This happened after 4 consecutive epochs without consensus.
At this point, "quadratic leaking" kicks in. Validators that are both active and offline are penalized in increasing amounts as non-finality continues. The formula for this is Penalty = EffectiveBalance * EpochsSinceFinality / 2^24. For the math geeks: "The penalty per epoch is linear with finality delay, which means the total penalty (integral of it) is quadratic" (thank you torfbolt)
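To make the "linear per epoch, quadratic in total" point concrete, here is a minimal sketch of the penalty formula. It assumes a quotient of 2^24 (the value that reproduces the ~60.6% after 18 days figure quoted later in this post) and holds the effective balance constant, which the real chain does not do. The function names are made up for illustration and are not taken from any client.

```python
# Sketch of the inactivity-leak penalty for an offline validator (not client code).
GWEI_PER_ETH = 10**9
# Assumption: quotient of 2**24; pass a different value to explore others.
INACTIVITY_PENALTY_QUOTIENT = 2**24


def inactivity_penalty(effective_balance_gwei: int,
                       epochs_since_finality: int,
                       quotient: int = INACTIVITY_PENALTY_QUOTIENT) -> int:
    """Penalty in Gwei for one offline epoch; grows linearly with the delay."""
    return effective_balance_gwei * epochs_since_finality // quotient


def cumulative_penalty(effective_balance_gwei: int,
                       epochs: int,
                       quotient: int = INACTIVITY_PENALTY_QUOTIENT) -> int:
    """Total penalty over `epochs` consecutive offline epochs.

    Simplification: the effective balance is held constant, whereas on the
    real chain it shrinks as the actual balance drains.
    """
    return sum(inactivity_penalty(effective_balance_gwei, n, quotient)
               for n in range(1, epochs + 1))


if __name__ == "__main__":
    balance = 32 * GWEI_PER_ETH
    for epochs in (1000, 2000, 4000):
        per_epoch = inactivity_penalty(balance, epochs)
        total = cumulative_penalty(balance, epochs)
        print(f"{epochs} epochs: {per_epoch / GWEI_PER_ETH:.6f} eth/epoch, "
              f"{total / GWEI_PER_ETH:.3f} eth total")
```

Doubling the number of epochs roughly doubles the per-epoch penalty and roughly quadruples the running total, which is where the name comes from.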
Inactive validators are not punished. A validator might be inactive because it is in activation queue, or its deposit hasn't even been processed yet - these will become active as we regain finality. A validator might also be inactive because it sent a voluntary exit, and it will not be punished.
Active and online validators stay at exactly ±0 if their inclusion distance is a perfect 1. That's impossible in practice, so they are penalized slightly. djrtwo of the Ethereum Foundation has stated that they are looking at ways to safely reduce this penalty for validators "doing their part" because the penalty doesn't, in a nutshell, feel good. Edit: Attestations can also be lost through no fault of the operator (blocks full, peers have issues), and a validator will be punished as "offline" for that epoch.
A validator is marked offline for an epoch if it does not attest in that epoch. This can happen to otherwise running validators if their beacon node gets out of sync, or if they are unable to attest. Possible causes to look for are client bugs and RAM/CPU resource usage. Now is the time to learn how to build clients from source, and to check the sizing of your node.
Estimating the time when the chain regains finality is difficult because validator participation fluctuates, influenced by client bugs.
Validators are kicked out when their "effective balance" reaches 16 eth, which happens at 16.75 actual eth.
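Why 16.75 and not 16? Effective balance moves in whole-eth steps with some hysteresis. The sketch below is my reading of the phase 0 spec, so treat the thresholds (0.25 eth downward, 1.25 eth upward) as assumptions to verify against the spec version your client implements. The function names are illustrative only.

```python
# Sketch of effective-balance hysteresis and the ejection check (not client code).
GWEI_PER_ETH = 10**9
EFFECTIVE_BALANCE_INCREMENT = 1 * GWEI_PER_ETH
MAX_EFFECTIVE_BALANCE = 32 * GWEI_PER_ETH
EJECTION_BALANCE = 16 * GWEI_PER_ETH
# Assumed hysteresis thresholds: 0.25 eth downward, 1.25 eth upward.
DOWNWARD_THRESHOLD = EFFECTIVE_BALANCE_INCREMENT // 4
UPWARD_THRESHOLD = 5 * EFFECTIVE_BALANCE_INCREMENT // 4


def updated_effective_balance(balance: int, effective_balance: int) -> int:
    """Return the effective balance after one epoch's update (all values in Gwei)."""
    if (balance + DOWNWARD_THRESHOLD < effective_balance
            or effective_balance + UPWARD_THRESHOLD < balance):
        # Snap to the whole-eth step at or below the actual balance, capped at 32.
        return min(balance - balance % EFFECTIVE_BALANCE_INCREMENT,
                   MAX_EFFECTIVE_BALANCE)
    return effective_balance


def is_ejectable(effective_balance: int) -> bool:
    """An active validator is queued for exit at or below the ejection balance."""
    return effective_balance <= EJECTION_BALANCE


if __name__ == "__main__":
    effective = 17 * GWEI_PER_ETH
    for balance in (16_900_000_000, 16_750_000_000, 16_749_999_999):
        effective = updated_effective_balance(balance, effective)
        print(balance / GWEI_PER_ETH, effective / GWEI_PER_ETH,
              is_ejectable(effective))
```

Under these assumptions a validator with an effective balance of 17 eth keeps it until the actual balance dips just below 16.75 eth, at which point the effective balance steps down to 16 eth and the validator becomes ejectable.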
Even before validators are kicked out, their decreasing balance means they lose weight in the consensus. I have seen an estimate that we might regain consensus after ~13 days. This gives us our lower bound: Oct 25th, or thereabouts. Some validators became active in August and never submitted a single attestation; those would accelerate this process.
It takes 18 days for a validator to be at ~60.6% of its balance, and almost 22 days to be at ~50%. Once a validator is in the exit queue, the queue processes exits at 4 validators per epoch (1), or 900 validators a day. As of Oct 30th, roughly 3k to 4k validators are "missing". The first validators should hit the exit queue early Nov 1st, maybe late Oct 31st. At 900 validators moving through the exit queue per day, Nov 5th should see finality.
This is highly dependent on how currently online validators behave. If all of them stay online, finality may be earlier; if there's a run for the hills, it'll be a little later. The exit queue will continue growing as it is processed, and it may grow to ~25k validators if finality is not regained early.
See also this graph of validator balances over time during non-finality, and the effective balances of active validators on Medalla.
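The balance figures can be approximated with a few lines of math. This is a rough sketch, not a simulation of the chain: it treats the leak as continuous, ignores the small regular penalties and the whole-eth steps of effective balance, and assumes a penalty quotient of 2^24 and 225 epochs per day (6.4 minutes per epoch). The function names are made up for illustration.

```python
# Continuous approximation of an offline validator's balance during the leak.
import math

EPOCHS_PER_DAY = 225  # 32 slots * 12 s = 6.4 minutes per epoch
# Assumption: quotient of 2**24; pass a different value to explore others.
INACTIVITY_PENALTY_QUOTIENT = 2**24


def remaining_fraction(days_offline: float,
                       quotient: int = INACTIVITY_PENALTY_QUOTIENT) -> float:
    """Approximate fraction of the starting balance left after `days_offline`.

    Solving dB/dn = -B * n / quotient gives B(n) = B(0) * exp(-n^2 / (2 * quotient)).
    """
    epochs = days_offline * EPOCHS_PER_DAY
    return math.exp(-epochs ** 2 / (2 * quotient))


def days_until_fraction(fraction: float,
                        quotient: int = INACTIVITY_PENALTY_QUOTIENT) -> float:
    """Approximate days offline until only `fraction` of the balance is left."""
    epochs = math.sqrt(-2 * quotient * math.log(fraction))
    return epochs / EPOCHS_PER_DAY


if __name__ == "__main__":
    print(f"left after 18 days: {remaining_fraction(18):.1%}")
    print(f"days until 50%: {days_until_fraction(0.5):.1f}")
    print(f"days from 32 eth to 16.75 eth: {days_until_fraction(16.75 / 32):.1f}")
```

With these assumptions, a validator that started the leak at a full 32 eth reaches the 16.75 eth mark after roughly three weeks offline. Validators that started with less get there sooner.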
After 2 consecutive epochs with consensus, or 3 out of 4, finality will be restored. Offline penalties revert to their regular, less-punishing "during finality" defaults.
This resource is a great intro to Ethereum 2.0 and the beacon chain: https://ethos.dev/beacon-chain/
Lastly, if you are looking for a project that automates the task of compiling a client from source, have a look at eth2-docker.
(1) The exit queue processing speed is determined by the "validator churn limit", which is "4" or "active validators divided by 65536", whichever is greater. See https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/beacon-chain.md#get_validator_churn_limit.
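The churn limit and the "900 validators a day" figure can be reproduced with a short sketch. The constants are the ones named in the footnote. The function names and the validator counts in the example are hypothetical and chosen only for illustration.

```python
# Sketch of the exit-queue throughput described in footnote (1).
MIN_PER_EPOCH_CHURN_LIMIT = 4
CHURN_LIMIT_QUOTIENT = 65536
EPOCHS_PER_DAY = 225  # 32 slots * 12 s = 6.4 minutes per epoch


def validator_churn_limit(active_validators: int) -> int:
    """Exits processed per epoch: 4, or active / 65536, whichever is greater."""
    return max(MIN_PER_EPOCH_CHURN_LIMIT,
               active_validators // CHURN_LIMIT_QUOTIENT)


def exits_per_day(active_validators: int) -> int:
    """Validators that can leave through the exit queue in one day."""
    return validator_churn_limit(active_validators) * EPOCHS_PER_DAY


def days_to_drain(queue_length: int, active_validators: int) -> float:
    """Days to process a queue of fixed length (ignores new arrivals)."""
    return queue_length / exits_per_day(active_validators)


if __name__ == "__main__":
    active = 70_000  # hypothetical active validator count
    for queued in (3_000, 4_000):
        print(f"{queued} queued: {days_to_drain(queued, active):.1f} days "
              f"at {exits_per_day(active)} exits per day")
```

Note that the limit only rises above 4 per epoch once there are more than 327,680 active validators, so on a testnet of this size the queue drains at the minimum rate.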