2020-08-20 Eth2 Call 46
https://github.com/ethereum/eth2.0-pm/issues/173
Nimbus
Medalla:
1. Losing peers
2. Syncing performance
3. (Very) high memory usage from 2 caches:
- the epoch cache, which in particular caches validator public keys, became far too aggressive after a couple of speed optimizations (gigabytes!)
- the fork choice/votes cache, which is only pruned at finality every 256 epochs and keeps growing during periods of non-finality
Reasons:
1. Losing peers and syncing: at startup we subscribe to gossip, but while syncing we can't verify, and therefore can't propagate, the latest attestations and blocks, so we get ejected.
2. Similarly, subscribing too early pollutes our quarantine system (which stores blocks from the network that can't be attached to the chain yet) with blocks thousands of epochs in the future.
PR "don't subscribe to gossip too early" is under review.
3. Cache memory usage:
- Epoch cache: we had duplicated caches created at each epoch.
- Fork choice: we wait a long time before pruning the cache to amortize the pruning cost. A change in data structure allows pruning every epoch (protoarray refinement).
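The gossip and quarantine issues above both come down to acting on network data before the node is close to the head. A minimal sketch of such a gate follows; it is not Nimbus code, and the names `SyncStatus`, `should_subscribe_to_gossip`, `accept_into_quarantine` and both thresholds are hypothetical choices for illustration. Only `SLOTS_PER_EPOCH = 32` is taken from the Eth2 spec.

```python
from dataclasses import dataclass

SLOTS_PER_EPOCH = 32  # Eth2 phase 0 preset value


@dataclass
class SyncStatus:
    head_slot: int        # slot of our current chain head
    wall_clock_slot: int  # slot the network should be at right now


def should_subscribe_to_gossip(status: SyncStatus, max_lag_epochs: int = 1) -> bool:
    """Subscribe only once our head is within `max_lag_epochs` of wall-clock time.

    The threshold is an assumption; a real client would tune it.
    """
    lag_slots = status.wall_clock_slot - status.head_slot
    return lag_slots <= max_lag_epochs * SLOTS_PER_EPOCH


def accept_into_quarantine(block_slot: int, status: SyncStatus,
                           max_future_epochs: int = 2) -> bool:
    """Refuse to quarantine blocks that are far ahead of our own head.

    This keeps a syncing node from filling its quarantine with blocks
    it cannot attach to the chain for a long time.
    """
    return block_slot <= status.head_slot + max_future_epochs * SLOTS_PER_EPOCH


if __name__ == "__main__":
    syncing = SyncStatus(head_slot=1_000, wall_clock_slot=120_000)
    print(should_subscribe_to_gossip(syncing))       # far behind: stay unsubscribed
    print(accept_into_quarantine(119_990, syncing))  # too far ahead of our head
```

A single distance check like this is the simplest option; a client could equally gate on its sync state machine instead of on slot arithmetic.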
Multinet scripts maintenance: small Lighthouse-Nimbus testnet to debug specific issues
Fallback:
- In case we hit a critical Nimbus bug, we are preparing to fall back to another client to keep our validators up
---
Medalla
Testing and release updates:
- networking testing in progress
- fork choice: updates have started in the spec test repo
Fuzzing update
- Community fuzzing
- bug report
- Differential fuzzing
- consensus bug in Prysm: not checking for the empty case when aggregating signatures for verification.
- https://blog.sigmaprime.io/beacon-fuzz-07.html
---
Client updates
Lighthouse:
Fix syncing from a finalized slot far in the past: more stability.
With lots of blocks being processed, tasks in a core executor were blocking or deadlocking Lighthouse -> switched to a queuing system.
Avoid switching chain head when syncing
Import Prysm keystores
Internal improvements
Teku
State regeneration, by replaying blocks on top of the cache
-> no control over when and where to regenerate
-> lots of regeneration, and even duplicate regeneration on multiple cores in parallel when there are multiple requests
-> queueing system with deduplication support
Extra logic to pull hot state regularly instead of doing this at startup, which delayed startup a lot.
Deadlock in block import logic after finalisation, as the finalised block was dropped from the cache
https://github.com/PegaSysEng/teku/issues/2596
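The "queueing system with deduplication support" mentioned for state regeneration can be pictured as follows. This is a rough sketch and not Teku's implementation (Teku is written in Java); the class `DedupRegenQueue`, its methods, and the `regenerate` callback are hypothetical names. The idea is that concurrent requests for the same block root share one in-flight task instead of each replaying blocks on its own core.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class DedupRegenQueue:
    """Runs state regeneration on a bounded pool, one task per block root."""

    def __init__(self, regenerate: Callable[[str], object], workers: int = 2):
        self._regenerate = regenerate              # hypothetical replay function
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        self._in_flight: Dict[str, Future] = {}    # block root -> pending task

    def request(self, block_root: str) -> Future:
        """Return a future for the regenerated state, reusing any pending task."""
        with self._lock:
            existing = self._in_flight.get(block_root)
            if existing is not None:
                return existing                    # duplicate request: share the work
            future = self._pool.submit(self._regenerate, block_root)
            self._in_flight[block_root] = future
        # Registered outside the lock: the callback may run immediately
        # in this thread if the task has already finished.
        future.add_done_callback(lambda _f, root=block_root: self._forget(root))
        return future

    def _forget(self, block_root: str) -> None:
        with self._lock:
            self._in_flight.pop(block_root, None)

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)
```

The bounded pool gives control over how much regeneration runs at once, and the in-flight map removes the duplicate work; caching of finished states is deliberately left out of the sketch.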
Prysm
More improvement in the last 3 days than in the last 6 months
Refactoring of cache updates, peer scoring improvements to avoid junk peers.
Race condition: saving in cache at finality, but the finalised root was never saved to disk due to a timeout -> required everyone to restart
The cleanup operation at finality timed out
-> 47% of validators dropped off the network
Using “roughtime” with 6 time servers, but one had a 24h offset, which led to chaos as they were using the mean (instead of the median)
-> now using system time
-> Dankrad: if there is a big difference to local time, don’t adjust the clock.
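To see why the mean was the problem in the roughtime incident, here is a small worked example. The individual offsets are made up for illustration; only "6 servers, one off by 24h" comes from the notes. The guard in `clock_adjustment` is one possible reading of Dankrad's suggestion, with a hypothetical threshold, and is not Prysm's code.

```python
from statistics import mean, median
from typing import Sequence

# Offsets reported by 6 time servers, in seconds (illustrative values).
# Five agree with local time to within a fraction of a second; one is 24h off.
SERVER_OFFSETS_S = [0.12, -0.08, 0.05, 0.20, -0.15, 24 * 3600.0]


def clock_adjustment(offsets_s: Sequence[float], max_adjust_s: float = 5.0) -> float:
    """Return the offset to apply to the local clock, in seconds.

    Uses the median so that a single bad server cannot move the result,
    and refuses to adjust at all if the result is implausibly far from
    local time (`max_adjust_s` is an assumed threshold).
    """
    candidate = median(offsets_s)
    if abs(candidate) > max_adjust_s:
        return 0.0
    return candidate


if __name__ == "__main__":
    print(f"mean offset:   {mean(SERVER_OFFSETS_S):.1f} s")    # about 4 hours
    print(f"median offset: {median(SERVER_OFFSETS_S):.3f} s")  # under a second
    print(f"applied:       {clock_adjustment(SERVER_OFFSETS_S):.3f} s")
```

With one server off by 86,400 s, the mean of six readings is pulled to roughly 14,400 s (4 hours), which is far more than a 12-second slot, while the median stays below a second.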
Trinity
Can restore fork choice context from database
Network: getting noise to work
Sync performance
Moving Eth2 Trinity to a separate repo
Lodestar
Cannot sync to head during the finality incident
-> needs a deeper look, probably a large refactor
Gossipsub 1.1 in development, needs to be added to js-libp2p
BLST: looking for a way to switch between pure C and WASM code
Nethermind
1 more senior dev full-time on Eth2
Focus at the moment: Eth1 deposits
---
Hsiao-Wei: Must-have and nice-to-have list for mainnet
has been shared privately with client teams
will likely be on GitHub.
Afri: How to move forward from here, and the lessons from Medalla.
-> clients should work on better release tracks.
-> clients need a strategy to stabilize the codebase and avoid breaking things that used to work.
Thinking about a “mainnet candidate” (if something goes wrong, we reuse the same deposit contract with a different genesis time)
Dankrad: we don’t want this to lead to complacency. Raise the stakes?
Launchpad decentralization?
Carl: tracked a bug that caused people to make double deposits
Danny: note scam launchpads and deposit contracts
Danny: standard proposal for slashing protection DB
Danny: how to go from client A to client B is a priority
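For context on what a slashing protection DB has to remember, here is a deliberately conservative sketch. It is not the proposal discussed on the call, whose format is not described in these notes; the class and method names are hypothetical. It relies only on the phase 0 slashing conditions: two different blocks for the same slot, two attestations with the same target epoch, or one attestation surrounding another.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class SlashingProtection:
    """In-memory stand-in for a slashing protection DB (illustrative only)."""

    # validator pubkey -> highest slot for which a block was signed
    last_block_slot: Dict[str, int] = field(default_factory=dict)
    # validator pubkey -> (source epoch, target epoch) of the last attestation
    last_attestation: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def may_sign_block(self, pubkey: str, slot: int) -> bool:
        """Allow a proposal only for a slot strictly above anything signed before."""
        last = self.last_block_slot.get(pubkey)
        if last is not None and slot <= last:
            return False
        self.last_block_slot[pubkey] = slot
        return True

    def may_sign_attestation(self, pubkey: str, source_epoch: int,
                             target_epoch: int) -> bool:
        """Allow an attestation only if the target strictly increases and the
        source never decreases. This is stricter than needed, but it rules
        out both double votes and surround votes.
        """
        if source_epoch > target_epoch:
            return False
        last = self.last_attestation.get(pubkey)
        if last is not None:
            last_source, last_target = last
            if target_epoch <= last_target or source_epoch < last_source:
                return False
        self.last_attestation[pubkey] = (source_epoch, target_epoch)
        return True
```

Under this conservative rule, the two maps are the state that would have to travel with the keys when a validator moves from client A to client B, which is presumably why a shared format matters; a real DB would also need to be persisted before each signature is released.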
---
Lighthouse subscribes to topics while syncing
Prysm subscribes to topics while syncing but ignores them
- but doesn’t subscribe to subnets until fully synced
---
Second release of the deposit contract
backported to eth2 spec repo
only diff is metadata
Use https://github.com/ethereum/eth2.0-specs/tree/dev/solidity_deposit_contract
for local testnets
Probably the final version