Batching opportunities overview

Recap

aggregate != batching

An aggregate is spec-defined. For a receiver:

  • the signature is received already aggregated from the network
  • the attesters' public keys must be aggregated (on BLS G1)
  • only attestations are aggregated

Batching is a client optimization: independent (public key, message, signature) triplets share work across the pairing check:

  • MillerLoop(PublicKey, Message, Signature): one per triplet
  • FinalExponentiation(partial Miller loops): one for the whole batch
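Concretely, for n (public key, message, signature) triplets, batched verification draws random non-zero scalars rᵢ (so that invalid signatures cannot cancel each other out across the batch) and shares a single final exponentiation. A sketch of the standard check, using Eth2's convention of public keys on G1 and signatures on G2 (not necessarily BLST's exact formulation):

$$
\mathrm{FinalExp}\Big(\mathrm{ML}\big({-g_1},\ \textstyle\sum_{i=1}^{n} r_i\,\sigma_i\big)\cdot\prod_{i=1}^{n}\mathrm{ML}\big(r_i\,pk_i,\ H(m_i)\big)\Big) \stackrel{?}{=} 1_{G_T}
$$

This costs n+1 Miller loops and 1 final exponentiation, instead of 2n Miller loops and n final exponentiations when each triplet is verified separately.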

According to Justin Drake, there is an incoming BLST optimization that will make adding public keys 50% faster, and public key addition is supposedly the bottleneck with a large number of signatures. In my benchmarks, however, it accounts for only 5% of aggregate verification, and it is unused for batching. Is he talking about something else?

validation != verification

Validation follows gossipsub rebroadcasting rules, for:

  • block
  • attestations
  • voluntary exits, attester slashing, block proposer slashing

Validation requires verifying 1 signature per object

Verification follows consensus rules:

  • for blocks:
    • the state transition can be applied using the block
    • crypto verification, including all nested crypto objects (attestations, exits, slashings, randao)
      • crypto verification for blocks is batched today.
  • for attestations:
    • 1 signature to verify
    • small consistency checks which seem redundant with gossip checks
  • voluntary exits, attester slashing, block proposer slashing:
    • 1 signature to verify
    • small consistency checks which seem redundant with gossip checks

Flow

For attestations, exits, slashings (during steady state):

  • Gossip -> eth2_processor -> validation + rebroadcast -> a "pool"

For blocks during sync:

  • Eth2 RPC -> sync_manager -> SharedBlockQueue -> clearance -> verification -> Candidate ChainDAG

For blocks during steady state:

  • Gossip -> eth2_processor -> validation + rebroadcast -> SharedBlockQueue -> clearance -> verification -> Candidate ChainDAG

During sync or steady state, for missing ancestors:

  • Eth2 RPC -> request_manager -> SharedBlockQueue -> clearance -> verification -> Candidate ChainDAG

Bottlenecks

During sync:

  • block processing speed
  • no mesh or latency expectations from connected peers

During steady state:

  • attestation processing speed (attestations may be unaggregated!)
  • latency expectations from connected peers

Batching opportunities

For blocks:

  • we can batch during steady state, for validation
  • we can batch during steady state or sync, for verification
  • we can batch missing ancestors during steady state or sync, for verification

Analysis

In a well-functioning chain we rarely receive more than 1 block per slot during steady state. Our only opportunity to receive many verifiable blocks at once is during sync, or when we are missing a fork. So it only makes sense to batch the "SharedBlockQueue -> clearance -> verification" stage, for example by changing the SharedBlockQueue from AsyncQueue[BlockEntry] to AsyncQueue[seq[BlockEntry]] so that inputs from the same sync request or ancestors request are grouped together.
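A minimal sketch of that queue change (BlockEntry's fields and the processBatch helper below are stand-ins, not NBC's real declarations):

```nim
import chronos

type
  # stand-in for NBC's real BlockEntry
  # (signed block + a future to report the verification result)
  BlockEntry = object

proc processBatch(batch: seq[BlockEntry]) =
  # hypothetical helper: clearance + one batched crypto
  # verification over every block of the group
  discard

proc clearanceLoop(queue: AsyncQueue[seq[BlockEntry]]) {.async.} =
  # one queue item = all blocks from the same sync request
  # or ancestors request, grouped at enqueue time
  while true:
    let batch = await queue.popFirst()
    processBatch(batch)
```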

One issue is that verifying blocks is expensive and we may not have enough time to verify 5 missing blocks at once during periods of non-finality without impacting validator duties or gossipsub connectivity requirements.

For attestations:

  • we can batch 'eth2_processor -> validation + rebroadcast -> a "pool"' during steady state
  • no attestations during sync

Analysis

We more or less already process multiple attestations at once (without returning control to the event loop), because validation is somewhat cheap (1 signature only, no state transition). So the only thing missing is using a crypto batchVerify.
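A sketch of the missing piece, assuming a batchVerify primitive (one Miller loop per triplet, one shared final exponentiation); SignatureSet and batchVerify below are stand-ins, not the confirmed nim-blscurve API:

```nim
type
  # stand-in for the crypto library's (public key, message, signature) triplet
  SignatureSet = object

proc batchVerify(sets: seq[SignatureSet]): bool =
  # stand-in for the library primitive:
  # n Miller loops + 1 shared final exponentiation
  true

var pendingSets: seq[SignatureSet]

proc queueAttestation(sigSet: SignatureSet) =
  # instead of verifying each attestation signature on arrival,
  # accumulate the triplet...
  pendingSets.add sigSet

proc flushAttestations(): bool =
  # ...and verify everything accumulated so far in one batch,
  # before returning control to the event loop
  result = batchVerify(pendingSets)
  pendingSets.setLen(0)
```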

Architecture exploration

If we want to batch blocks, we need to avoid blocking networking and validator duties (and so the DAG and fork choice as well).

  • solution 1 requires a threadpool and a way to asyncify waiting on a Flowvar
  • solution 2 would split NBC into a networking thread and a verification thread, with a way to asyncify waiting on a Channel or AsyncChannel
  • solution 3 would split NBC into a networking process and a verification process, communicating via IPC with 2 event loops
  • solution 4 would split NBC into producer <-> consumer services, with at least one networking service and one consensus service

Whatever solution we use, we need a way to tie the communication primitives (Channels, Flowvar) into Chronos. AsyncChannel seems stalled because it transports GC-ed types allocated on a thread-local heap. ARC/ORC is out of the question for the moment.

I've outlined a way to make both Flowvar (including ones from custom threadpools) and Channels (including Nim channels) work with Chronos, by creating an "async condition variable" (AsyncCV): nim-lang/RFCs#304 (comment). An AsyncChannel or an AsyncFlowvar is then just "Channel + AsyncCV" or "Flowvar + AsyncCV".
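A minimal sketch of that composition; AsyncCV's interface (notify from any thread, await from the Chronos loop) is assumed from the RFC comment, and the stubs below stand in for a real implementation (compile with --threads:on for Nim's built-in Channel):

```nim
import chronos

type
  # assumed primitive (see the RFC comment): notify() may be called
  # from any thread, wait() is awaited from the Chronos event loop
  AsyncCV = object

proc notify(cv: var AsyncCV) =
  # stub: the real version wakes the consumer's event loop
  discard

proc wait(cv: var AsyncCV): Future[void] =
  # stub: the real version completes this future on notify()
  result = newFuture[void]("AsyncCV.wait")
  result.complete()

type
  AsyncChannel[T] = object
    # must live in shared memory (e.g. createShared), since
    # thread-local GC heaps cannot be shared across threads
    chan: Channel[T]   # Nim's built-in thread-safe channel
    cv: AsyncCV

proc send[T](ac: ptr AsyncChannel[T], msg: T) =
  # producer thread side
  ac.chan.send(msg)
  ac.cv.notify()

proc recv[T](ac: ptr AsyncChannel[T]): Future[T] {.async.} =
  # consumer side: never blocks the Chronos event loop
  while true:
    let (ok, msg) = ac.chan.tryRecv()
    if ok:
      return msg
    await ac.cv.wait()
```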
