Skip to content

Instantly share code, notes, and snippets.

@JekaMas
Last active April 17, 2018 16:32
Show Gist options
  • Save JekaMas/060542ef9435799230aab4cd61bd96dd to your computer and use it in GitHub Desktop.
Save JekaMas/060542ef9435799230aab4cd61bd96dd to your computer and use it in GitHub Desktop.
Start point
github.com/tendermint/tendermint/consensus/state.go:101
WAL - write ahead log
WAL relates to State
There are two general failure scenarios: failure during consensus, and failure while applying the block.
The former is handled by the WAL, the latter by the proxyApp Handshake on restart, which ultimately hands off the work to the WAL.
State is
ConsensusState handles execution of the consensus algorithm.
It processes votes and proposals, and upon reaching agreement,
commits blocks to the chain and executes them against the application.
The internal state machine receives input from peers, the internal validator, and from a timer.
WAL has 2 reply modes
normal -
light - in light mode we only write new steps, timeouts, and our own votes (no proposals, block parts)
WAS has doWALCatchup bool
determines if we even try to do the catchup
WAL types
nilWAL - is used at the start
baseWAL - the main WAL
Write ahead logger writes msgs to disk before they are processed.
Can be used for crash-recovery and deterministic replay
TODO: currently the wal is overwritten during replay catchup
give it a mode so it's either reading or appending - must read to end to start appending again
byteBufferWAL - is used for positive testing
byteBufferWAL is a WAL which writes all msgs to a byte buffer. Writing stops
when the heightToStop is reached. Client will be notified via
signalWhenStopsTo channel.
crashingWAL - is used for negative testing
crashingWAL is a WAL which crashes or rather simulates a crash during Save
(before and after). It remembers a message for which we last panicked
(lastPanicedForMsgIndex), so we don't panic for it in subsequent iterations.
WAL is used in steps:
1. setup State WAL=nilWAL and doWALCatchup=true
updateToState(state sm.State) <- finalizeCommit(height int64)
2. State.OnStart(), restore WAL messages with SearchForEndHeight(currentHight)
2.1
2.1.1 Receiving and processing new messages should NOT work
2.1.2 Tries to restore a voting state. If it's not a fast sync
// we may have lost some votes if the process crashed
// reload from consensus log to catchup
if cs.doWALCatchup {
if err := cs.catchupReplay(cs.Height); err != nil {
cs.Logger.Error("Error on catchup replay. Proceeding to start ConsensusState anyway", "err", err.Error())
// NOTE: if we ever do return an error here,
// make sure to stop the timeoutTicker
}
}
2.1.3
// Ensure that ENDHEIGHT for this height doesn't exist.
// NOTE: This is just a sanity check. As far as we know things work fine
// without it, and Handshake could reuse ConsensusState if it weren't for
// this check (since we can crash after writing ENDHEIGHT).
// Ignore data corruption errors since this is a sanity check.
gr, found, err := cs.wal.SearchForEndHeight(csHeight, &WALSearchOptions{IgnoreDataCorruptionErrors: true})
2.2
// Search for last height marker
// Ignore data corruption errors in previous heights because we only care about last height
gr, found, err = cs.wal.SearchForEndHeight(csHeight-1, &WALSearchOptions{IgnoreDataCorruptionErrors: true})
2.3 Read messages from reply one by one
2.4 For each message check IsDataCorruptionError(err) and panic if true
2.5
// NOTE: since the priv key is set when the msgs are received
// it will attempt to eg double sign but we can just ignore it
// since the votes will be replayed and we'll get to the next step
err := cs.readReplayMessage(msg, nil)
2.6 consensus/replay.go
2.7
// Functionality to replay blocks and messages on recovery from a crash.
// There are two general failure scenarios: failure during consensus, and failure while applying the block.
// The former is handled by the WAL, the latter by the proxyApp Handshake on restart,
// which ultimately hands off the work to the WAL.
//-----------------------------------------
// recover from failure during consensus
// by replaying messages from the WAL
// Unmarshal and apply a single message to the consensus state
// as if it were received in receiveRoutine
// Lines that start with "#" are ignored.
// NOTE: receiveRoutine should not be running
2.8 Exit on err.EOF
3. setup baseWAL with a given config (normal or light mode), baseWAL.Start()
4. wal.Save(State)
4.1 finalizeCommit(height int64)
cs.Votes.Precommits(cs.CommitRound).TwoThirdsMajority()
...
// Save to blockStore.
if cs.blockStore.Height() < block.Height {
// NOTE: the seenCommit is local justification to commit this block,
// but may differ from the LastCommit included in the next block
precommits := cs.Votes.Precommits(cs.CommitRound)
seenCommit := precommits.MakeCommit()
cs.blockStore.SaveBlock(block, blockParts, seenCommit)
}
// Finish writing to the WAL for this height.
// NOTE: If we fail before writing this, we'll never write it,
// and just recover by running ApplyBlock in the Handshake.
// If we moved it before persisting the block, we'd have to allow
// WAL replay for blocks with an #ENDHEIGHT
// As is, ConsensusState should not be started again
// until we successfully call ApplyBlock (ie. here or in Handshake after restart)
cs.wal.Save(EndHeightMessage{height})
SAVES EndHeightMessage
4.2 receiveRoutine(maxSteps int)
// receiveRoutine handles messages which may cause state transitions.
// it's argument (n) is the number of messages to process before exiting - use 0 to run forever
// It keeps the RoundState and is the only thing that updates it.
// Updates (state transitions) happen on timeouts, complete proposals, and 2/3 majorities.
// ConsensusState must be locked before any internal state is updated.
select {
case height := <-cs.mempool.TxsAvailable():
cs.handleTxsAvailable(height)
case mi = <-cs.peerMsgQueue:
cs.wal.Save(mi)
// handles proposals, block parts, votes
// may generate internal events (votes, complete proposals, 2/3 majorities)
cs.handleMsg(mi)
case mi = <-cs.internalMsgQueue:
cs.wal.Save(mi)
// handles proposals, block parts, votes
cs.handleMsg(mi)
case ti := <-cs.timeoutTicker.Chan(): // tockChan:
cs.wal.Save(ti)
// if the timeout is relevant to the rs
// go to the next step
cs.handleTimeout(ti, rs)
case <-cs.Quit():
// NOTE: the internalMsgQueue may have signed messages from our
// priv_val that haven't hit the WAL, but its ok because
// priv_val tracks LastSig
// close wal now that we're done writing to it
cs.wal.Stop()
close(cs.done)
return
}
SAVES timeoutInfo if timeout
//NewRoundStepMessage is sent for every step taken in the ConsensusState. For every height/round/step transition
NewRoundStepMessage
// CommitStepMessage is sent when a block is committed.
CommitStepMessage
// ProposalMessage is sent when a new block is proposed.
ProposalMessage
// ProposalPOLMessage is sent when a previous proposal is re-proposed.
ProposalPOLMessage
// BlockPartMessage is sent when gossipping a piece of the proposed block.
BlockPartMessage
// VoteMessage is sent when voting for a proposal (or lack thereof).
VoteMessage
// HasVoteMessage is sent to indicate that a particular vote has been received.
HasVoteMessage
// VoteSetMaj23Message is sent to indicate that a given BlockID has seen +2/3 votes.
VoteSetMaj23Message
// VoteSetBitsMessage is sent to communicate the bit-array of votes seen for the BlockID.
VoteSetBitsMessage
// ProposalHeartbeatMessage is sent to signal that a node is alive and waiting for transactions for a proposal.
ProposalHeartbeatMessage
Voting messages could be internal and external (from peers):
VoteMessage
ProposalMessage
BlockPartMessage
4.3 cs.handleMsg(mi) OR
enterCommit(height int64, commitRound int) <- addVote(vote *types.Vote, peerID p2p.ID) (added bool, err error) <- tryAddVote(vote *types.Vote, peerID p2p.ID) <- cs.handleMsg(mi)
enterPrecommit(height int64, round int) <- addVote(vote *types.Vote, peerID p2p.ID) (added bool, err error) <- tryAddVote(vote *types.Vote, peerID p2p.ID) <- cs.handleMsg(mi)
enterPrevote(height int64, round int) <- addVote(vote *types.Vote, peerID p2p.ID) (added bool, err error) <- tryAddVote(vote *types.Vote, peerID p2p.ID) <- cs.handleMsg(mi)
enterPrevoteWait(height int64, round int) <- addVote(vote *types.Vote, peerID p2p.ID) (added bool, err error) <- tryAddVote(vote *types.Vote, peerID p2p.ID) <- cs.handleMsg(mi)
enterPropose(height int64, round int)
<- handleTimeout(ti timeoutInfo, rs cstypes.RoundState)
<- handleTxsAvailable(height int64)
<- enterNewRound(height int64, round int) <- handleTimeout(ti timeoutInfo, rs cstypes.RoundState)
<- enterNewRound(height int64, round int) <- addVote(vote *types.Vote, peerID p2p.ID) (added bool, err error) ...
updateToState(state sm.State) <- finalizeCommit(height int64)
updateToState(state sm.State) <- SwitchToConsensus(state sm.State, blocksSynced int)
updateToState(state sm.State) - SAVES EventDataRoundState
5. Flush all WAT data to have empty WAT for next cyrcle and block
6. State.OnStop() -> baseWAL.Wait()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment