Created
August 20, 2020 15:00
2020-08-20 Eth2 Call 46
https://github.com/ethereum/eth2.0-pm/issues/173
Nimbus
Medalla:
1. Losing peers
2. Syncing performance
3. (Very) high memory usage in 2 caches:
- epoch cache, which in particular caches validator public keys; it
was way too aggressive after a couple of speed optimizations (gigabytes!)
- fork choice/votes cache, which is only pruned at finality every 256 epochs
and keeps growing during periods of non-finality
Reasons:
1. Losing peers and syncing: At start, we subscribe to gossip, but while syncing we can't verify, and therefore can't propagate, the latest attestations and blocks, so we get ejected by peers.
2. Similarly, subscribing too early pollutes our quarantine system, which stores blocks from the network that can't be attached to the chain yet, with blocks thousands of epochs in the future.
PR under review: don't subscribe to gossip too early.
3. Cache memory usage:
- Epoch cache: we had duplicate caches created at each epoch.
- Fork choice: we wait a long time before pruning the cache to amortize the pruning cost. A change in data structure allows pruning every epoch (protoarray refinement).
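A minimal sketch of the "don't subscribe to gossip too early" idea from reasons 1 and 2, assuming a node that knows its own head slot and the current wall-clock slot. The function names and both thresholds are hypothetical and chosen for illustration; this is not the Nimbus implementation or the PR under review.

```python
# Illustrative sketch only: names and thresholds are hypothetical, not Nimbus code.
SLOTS_PER_EPOCH = 32  # Eth2 preset value

# Hypothetical: how far behind the wall clock we may be and still count as synced.
SYNC_TOLERANCE_SLOTS = 2 * SLOTS_PER_EPOCH
# Hypothetical: how far ahead of our head a block may be and still be quarantined.
QUARANTINE_LOOKAHEAD_SLOTS = 4 * SLOTS_PER_EPOCH


def should_subscribe_to_gossip(head_slot, wall_slot, tolerance=SYNC_TOLERANCE_SLOTS):
    """Subscribe only once the head is close enough to the wall clock
    that incoming attestations and blocks can be verified and relayed."""
    return wall_slot - head_slot <= tolerance


def should_quarantine(block_slot, head_slot, lookahead=QUARANTINE_LOOKAHEAD_SLOTS):
    """Keep an unattachable block only if it is not too far ahead of our head,
    so that blocks far in the future do not fill the quarantine."""
    return block_slot - head_slot <= lookahead
```

With these assumed thresholds, a node at slot 100 while the network is at slot 10000 would stay unsubscribed and would not quarantine a block from slot 10000.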
Multinet scripts maintenance: small Lighthouse-Nimbus testnet to debug specific issues
Fallback:
- In case we hit a critical Nimbus bug, we are preparing to fall back to another client to keep our validators up
---
Medalla
Testing and release updates:
- networking testing in progress
- fork choice has started to be updated in the spec test repo
Fuzzing update
- Community fuzzing
- bug report
- Differential fuzzing
- consensus bug in Prysm: not checking for the empty case when aggregating signatures for verification.
- https://blog.sigmaprime.io/beacon-fuzz-07.html
---
Client updates
Lighthouse:
Fixed syncing from a finalized slot far in the past: more stability.
With lots of blocks to process, tasks in a core executor were blocking or deadlocking Lighthouse -> switched to a queuing system.
Avoid switching chain head when syncing
Import Prysm keystores
Internal improvements
Teku
State regeneration, by replaying blocks on top of the cache
-> no control over when and where to regenerate
-> lots of regeneration, and even duplicate regeneration on multiple cores in parallel
when there are multiple requests
-> queueing system with deduplication support
Extra logic to pull hot state regularly instead of doing this at startup, which delayed startup a lot.
Deadlock in block import logic after finalisation, as the finalised block was dropped from the cache
https://github.com/PegaSysEng/teku/issues/2596
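A rough sketch of what a "queueing system with deduplication support" for state regeneration could look like, as described in the Teku notes above. Teku is written in Java and its actual design is not given in the notes; the class name, the `regenerate` callback and the worker count below are all hypothetical. Concurrent requests for the same block root share one in-flight task, and a bounded worker pool limits how many regenerations run at once.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class RegenerationQueue:
    """Hypothetical sketch: deduplicated, bounded state regeneration."""

    def __init__(self, regenerate, max_workers=2):
        self._regenerate = regenerate  # callable: block_root -> state (assumed)
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self._in_flight = {}  # block_root -> Future of the running/queued task

    def request(self, block_root) -> Future:
        """Return a future for the state at block_root, reusing any in-flight task."""
        with self._lock:
            future = self._in_flight.get(block_root)
            if future is None:
                future = self._pool.submit(self._run, block_root)
                self._in_flight[block_root] = future
            return future

    def _run(self, block_root):
        try:
            return self._regenerate(block_root)
        finally:
            # Forget the task once done so a later request regenerates afresh.
            with self._lock:
                self._in_flight.pop(block_root, None)
```

Two callers asking for the same root while the first regeneration is still running receive the same future, so the blocks are replayed once rather than once per caller.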
Prysm
More improvement in the last 3 days than in the last 6 months
Refactoring cache updates, peer scoring improvements to avoid junk peers.
Race condition: saving to cache at finality, but the finalised root was never saved to disk due to a timeout -> required everyone to restart
The cleanup operation at finality timed out
-> 47% of validators dropped off the network
Using “roughtime” with 6 time servers, but one had a 24h offset, which led to chaos as they were using the mean (instead of the median)
-> now using system time
-> Dankrad: if there is a big difference from local time, don’t adjust the clock.
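To illustrate why the mean was a problem in the roughtime incident above, here is a small sketch comparing mean and median over six reported clock offsets, one of them off by 24 hours. The five small offsets are made up for illustration (the notes do not give them), and `MAX_ADJUSTMENT_SECONDS` and `guarded_offset` are hypothetical names for the kind of guard Dankrad suggested; none of this is Prysm code.

```python
from statistics import mean, median

# Offsets in seconds reported by six time servers.
# Five values are illustrative; the last models the server that was 24h off.
SAMPLE_OFFSETS = [0.1, -0.2, 0.0, 0.3, -0.1, 24 * 60 * 60.0]

# Hypothetical limit: never shift the local clock by more than this.
MAX_ADJUSTMENT_SECONDS = 5.0


def offset_by_mean(offsets):
    """One bad server drags the result far from the truth."""
    return mean(offsets)


def offset_by_median(offsets):
    """A single outlier among six servers barely moves the result."""
    return median(offsets)


def guarded_offset(offsets, max_adjustment=MAX_ADJUSTMENT_SECONDS):
    """Return the median offset, or 0.0 (no adjustment) if it is implausibly large."""
    candidate = median(offsets)
    return candidate if abs(candidate) <= max_adjustment else 0.0
```

With these sample values the mean is about 14400 seconds (roughly 4 hours, since 86400 / 6 = 14400), while the median stays at a fraction of a second.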
Trinity
Can restore fork choice context from database
Network: getting Noise to work
Sync performance
Move Eth2 Trinity into a separate repo
Lodestar
Cannot sync to head during the finality incident
-> needs a deeper look, probably a large refactor
Gossipsub 1.1 in development, needs to be added to js-libp2p
BLST: looking for a switch between pure C and WASM code
Nethermind
1 more senior dev full-time on Eth2
Focus at the moment: Eth1 deposits
---
Hsiao-Wei: Must-have and nice-to-have list for mainnet
has been shared privately with client teams
will likely be on GitHub.
Afri: How to move forward from here and the lessons from Medalla.
-> clients should work on better release tracks.
-> clients need a strategy to stabilize the codebase and avoid breaking things that used to work.
Thinking about a “Mainnet candidate” (if something goes wrong, we reuse the same deposit contract with a different genesis time)
Dankrad: we don’t want this to lead to complacency. Raise the stakes?
Launchpad decentralization?
Carl: tracked a bug that caused people to make double deposits
Danny: note scammer launchpads and deposit contracts
Danny: standard proposal for slashing protection DB
Danny: how to go from client A to client B is a priority
---
Lighthouse subscribes to topics while syncing
Prysm subscribes to topics while syncing but ignores them
- but doesn’t subscribe to subnets until fully synced
---
Second release of deposit contracts
backported to eth2 spec repo
only diff is metadata
Use https://github.com/ethereum/eth2.0-specs/tree/dev/solidity_deposit_contract
for local testnets
Probably final version