Build akash 0.26.2 with rollback function to recover from the AppHash / error on replay: wrong Block.Header.LastResultsHash errors
This enables the rollback function with --delete-pending-block (Delete the pending block in tendermint block store if exists) in akash 0.26.2 (cosmos-sdk v0.45.16, tendermint v0.34.27).
Useful for recovering from the AppHash / error on replay: wrong Block.Header.LastResultsHash errors without having to restore the backup again. Especially useful for heavy archival nodes.
Previous version, for akash 0.16.4 (cosmos-sdk v0.45.4, tendermint v0.34.19): https://gist.github.com/andy108369/44f5a676935286e0115431015ef66e1c
If you don't want to compile it yourself, you can get the pre-built binary for akash v0.26.2 here: https://transfer.sh/lgfqoMrg4o/akash-0.26.2-rollback2.tar.gz
- sha256sum:
d21afcff0bdfed1f333dae599bb52a2d37736263ab6a9041e74ab29fa43101be akash-0.26.2-rollback2
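The download can be checked against the checksum above before use. The following is a minimal sketch, assuming the archive unpacks to a file named akash-0.26.2-rollback2 in the current directory (adjust the path if the layout differs). The helper name verify_sha256 is made up for this example.

```shell
# verify_sha256 FILE EXPECTED_HASH
# Succeeds when the SHA-256 digest of FILE equals EXPECTED_HASH, fails otherwise
# (including when FILE cannot be read, since the digest is then empty).
verify_sha256() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" 2>/dev/null | awk '{print $1}')"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage (file name is an assumption about what the tarball extracts to):
#   tar xzf akash-0.26.2-rollback2.tar.gz
#   verify_sha256 akash-0.26.2-rollback2 \
#     d21afcff0bdfed1f333dae599bb52a2d37736263ab6a9041e74ab29fa43101be \
#     && echo "checksum OK" || echo "checksum MISMATCH"
```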
Based on https://github.com/tendermint/tendermint/pull/8574/commits/81344ac9464db02421fa115041f162ab9ace9372
git clone https://github.com/akash-network/cometbft.git
cd cometbft
git checkout tags/v0.34.27-akash -b v0.34.27-akash-w-upstream-pr-8574
git remote add upstream https://github.com/tendermint/tendermint.git
git fetch upstream pull/8574/head:pr8574
git cherry-pick 81344ac9464db02421fa115041f162ab9ace9372
git rm CHANGELOG_PENDING.md
git cherry-pick --continue --no-edit
git tag v0.34.27-akash-w-upstream-pr-8574
COMETBFT_PATH="$(realpath .)"
cd ..
Based on https://github.com/cosmos/cosmos-sdk/pull/11982/commits/614906d69177db9555ea16ebb3e895b2f4617ff3
git clone https://github.com/cosmos/cosmos-sdk.git
cd cosmos-sdk
git checkout tags/v0.45.16 -b v0.45.16-w-upstream-pr-11982
git fetch origin pull/11982/head:pr11982
git cherry-pick --strategy=recursive -X theirs 614906d69177db9555ea16ebb3e895b2f4617ff3
git show HEAD~1:go.sum > go.sum
git show HEAD~1:go.mod > go.mod
go mod edit -replace github.com/tendermint/tendermint="$COMETBFT_PATH"
git add go.mod go.sum
for i in baseapp/baseapp.go server/types/app.go store/iavl/tree.go store/rootmulti/rollback_test.go store/types/store.go; do
git show HEAD~1:"$i" > "$i"
git add "$i"
done
git apply << 'EOF'
diff --git a/server/rollback.go b/server/rollback.go
index db865661a..db9f2fb1f 100644
--- a/server/rollback.go
+++ b/server/rollback.go
@@ -5,11 +5,11 @@ import (
"fmt"
"github.com/spf13/cobra"
- dbm "github.com/tendermint/tm-db"
+ dbm "github.com/cometbft/cometbft-db"
"github.com/cosmos/cosmos-sdk/client/flags"
"github.com/cosmos/cosmos-sdk/server/types"
- tmcmd "github.com/tendermint/tendermint/cmd/tendermint/commands"
+ tmcmd "github.com/tendermint/tendermint/cmd/cometbft/commands"
cfg "github.com/tendermint/tendermint/config"
"github.com/tendermint/tendermint/state"
"github.com/tendermint/tendermint/store"
@@ -86,7 +86,9 @@ func loadStateAndBlockStore(config *cfg.Config) (*store.BlockStore, state.Store,
if err != nil {
return nil, nil, err
}
- stateStore := state.NewStore(stateDB)
+ stateStore := state.NewStore(stateDB, state.StoreOptions{
+ DiscardABCIResponses: config.Storage.DiscardABCIResponses,
+ })
return blockStore, stateStore, nil
}
EOF
git diff
git add -u .
git commit --amend --no-edit
git tag v0.45.16-w-upstream-pr-11982
COSMOS_SDK_PATH="$(realpath .)"
cd ..
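COMETBFT_PATH and COSMOS_SDK_PATH only exist in the current shell session, so the later go mod edit -replace steps would receive an empty path if they are run from a new terminal. A small sanity check, as a sketch (check_patched_module is a hypothetical helper name, not part of any tool used here):

```shell
# check_patched_module DIR
# Succeeds only if DIR is non-empty and points at a directory with a go.mod.
check_patched_module() {
  local dir="$1"
  if [ -z "$dir" ]; then
    echo "path variable is empty" >&2
    return 1
  fi
  if [ ! -f "$dir/go.mod" ]; then
    echo "no go.mod found in: $dir" >&2
    return 1
  fi
  echo "ok: $dir"
}

# Usage:
#   check_patched_module "$COMETBFT_PATH" && check_patched_module "$COSMOS_SDK_PATH"
```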
Build akash node v0.26.2 with the previously patched cometbft & cosmos-sdk.
git clone https://github.com/akash-network/node.git
cd node
git checkout -b tag-v0.26.2 v0.26.2
go mod edit -replace github.com/tendermint/tendermint="$COMETBFT_PATH"
go mod edit -replace github.com/cosmos/cosmos-sdk="$COSMOS_SDK_PATH"
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash
curl -sfL https://direnv.net/install.sh | bash
source ~/.bashrc
nvm install v20
nvm use v20
eval "$(direnv hook bash)"
direnv allow
go install 'golang.org/dl/go1.21.0@latest'
go1.21.0 download
export GOTOOLCHAIN=go1.21.0
export GOTOOLCHAIN_SEMVER=v1.21.0
$ go version
go version go1.21.0 linux/amd64
GO_LINKMODE="internal" CGO_ENABLED=0 make
.cache/bin/akash version
.cache/bin/akash version --long | grep -E 'go version|cosmos-sdk@|tendermint@'
# Example outputs
$ .cache/bin/akash version
0.26.2
$ .cache/bin/akash version --long | grep -E 'go version|cosmos-sdk@|tendermint@'
go: go version go1.21.0 linux/amd64
- github.com/cosmos/cosmos-sdk@v0.45.16 => /tmp/1/cosmos-sdk@(devel)
- github.com/tendermint/tendermint@v0.34.27 => /tmp/1/cometbft@(devel)
$ .cache/bin/akash rollback --help | grep pending
--delete-pending-block Delete the pending block in tendermint block store if exists
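To confirm that the binary was really built against the locally patched modules, the version --long output can be checked for the => replacement markers shown in the example above. A rough sketch follows (uses_patched_modules is a name invented for this example):

```shell
# uses_patched_modules
# Reads `akash version --long` output on stdin and succeeds only if both the
# cosmos-sdk and the tendermint dependency lines carry a `=>` replacement.
uses_patched_modules() {
  local out
  out="$(cat)"
  printf '%s\n' "$out" | grep -Eq 'github\.com/cosmos/cosmos-sdk@[^ ]+ => ' &&
    printf '%s\n' "$out" | grep -Eq 'github\.com/tendermint/tendermint@[^ ]+ => '
}

# Usage:
#   .cache/bin/akash version --long | uses_patched_modules && echo "patched build"
```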
Note: this section has been copied from the akash v0.16.4 guide; it is valid for v0.26.2 as well.
Note: I hit the AppHash error again when I rolled back only once. It worked once I had rolled back twice, down to height 6955212.
IMPORTANT: If you run the akash tool in a different context (i.e. outside the container), make sure the pruning setting in your ~/.akash/config/app.toml file is the same as the one the node has been running with. A mismatch will corrupt the IAVL DB, since changing the pruning strategy is not supported.
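One way to compare the pruning settings of two app.toml files before running the rollback, as a minimal sketch. It assumes you have both files at hand (for example one copied out of the container and one from the host). The helper names and the example paths are hypothetical.

```shell
# pruning_settings FILE
# Prints the top-level pruning* keys of an app.toml with spaces, tabs and
# quotes stripped, sorted so two files can be compared.
pruning_settings() {
  grep -E '^pruning(-[a-z-]+)?[[:space:]]*=' "$1" | tr -d ' \t"' | sort
}

# same_pruning FILE_A FILE_B
# Succeeds when both files carry identical pruning settings.
same_pruning() {
  [ "$(pruning_settings "$1")" = "$(pruning_settings "$2")" ]
}

# Usage (paths are examples only):
#   same_pruning /path/to/container/app.toml ~/.akash/config/app.toml \
#     && echo "pruning settings match" || echo "pruning settings DIFFER"
```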
A single rollback takes about 30-40 minutes for the full chain (~670 GiB). No extra disk space is required.
# ./akash rollback --delete-pending-block
rollback pending block 6955215
Rolled back state to height 6955213 and hash D7BF7C12B7212B3E9FEBB538A839330711C81189CB72EF460216F6FF12985424
# ./akash rollback --delete-pending-block
rollback pending block 6955214
Rolled back state to height 6955212 and hash 6A815F05C7521E2ABB7DB6F85F11D185EF5DD1311C01D31D19B04596D40C089E
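Since more than one rollback was needed here, the repeated invocation can be wrapped in a small loop. This is only a sketch: rollback_n is a hypothetical helper, and it does nothing beyond calling the same rollback --delete-pending-block command shown above, stopping at the first failure.

```shell
# rollback_n AKASH_BIN COUNT
# Runs `AKASH_BIN rollback --delete-pending-block` COUNT times and stops at
# the first failing invocation instead of repeating it blindly.
rollback_n() {
  local bin="$1" count="$2" i=1
  while [ "$i" -le "$count" ]; do
    echo "rollback $i of $count"
    "$bin" rollback --delete-pending-block || return 1
    i=$((i + 1))
  done
}

# Usage:
#   rollback_n ./akash 2
```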
I think these patches eventually need to land upstream in cosmos-sdk (already done), cometbft and akash node.
Validators and node operators would then no longer need to restore from snapshots after an AppHash error, which may occur during chain upgrades (mostly non-governance software upgrades, i.e. binary swaps).
PR for cometbft => cometbft/cometbft#1610
I have tested this procedure against Polkachu's snapshot at height 13632948, made with akash 0.26.2.