@andy108369
Last active January 23, 2024 01:37

How-to sync Akash Node from height=0 in akashnet-2 network

Important notes before you start

Make sure you run your archival node with pruning = nothing from height=0 onward, so that it keeps all historical states.

With akash 0.18.0 (aka mainnet4) you HAVE TO start the chain with AKASH_PRUNING=nothing set. (This is fixed in akash 0.20.0)

Do NOT change the pruning setting between restarts, since this can corrupt the chain data (IAVL); see cosmos/cosmos-sdk#6370 (comment)
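
As a rough illustration of the notes above, a small pre-start check can refuse to launch the node when pruning is not set to nothing. This is only a sketch: the function name check_archival_pruning is hypothetical, and it assumes pruning is configured through the AKASH_PRUNING environment variable rather than through the node's config files, which it does not inspect.

```shell
#!/bin/sh
# Hypothetical helper (not part of akash): verify that AKASH_PRUNING is set
# to "nothing" before starting an archival node.
# Assumption: pruning is configured via the environment, not via config files.
check_archival_pruning() {
  if [ "${AKASH_PRUNING:-}" != "nothing" ]; then
    echo "refusing to start: AKASH_PRUNING must be 'nothing' (got '${AKASH_PRUNING:-unset}')" >&2
    return 1
  fi
  echo "AKASH_PRUNING=nothing, ok"
}
```

A possible use is `check_archival_pruning && akash start`, so that the node only starts when the check passes.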

Sync from height=0

You can synchronize your akash node (RPC/validator) from height=0 on akashnet-2 following these steps:

Legend:

  • auto - the chain halts automatically at the height set by the governance (Govt) software (SW) upgrade proposal;
  • manual - you have to set AKASH_HALT_HEIGHT to make the chain halt at a specific height, because the upgrade was done by binary swap, i.e. without a Govt SW upgrade proposal.

Process:

  1. akash 0.10.1 (akashnet-2 is born, aka mainnet2):
    Start at height=0 until height=455200 (auto).
    Govt SW upgrade proposal #5 will halt akash at that height, expecting 0.14.1.
    Blockchain size: 63 GiB.

  2. akash 0.14.1:
    Continue from height=455200 until height=5629650 (auto).
    Govt SW upgrade proposal #19 will halt akash at that height, expecting 0.16.1.
    Blockchain size: 1222 GiB.
    I recommend backing up the chain data at this step if space allows (a simple rsync to a separate directory will do).
    You can remove this backup at step 4, once akash 0.16.3 has been running for about 10 minutes, which confirms that nothing went wrong during step 3.
    You might also want to preserve this chain data together with akash 0.14.1, as this is the only way to query account balances before height=5629650 due to https://github.com/ovrclk/akash/issues/1666

  3. akash 0.16.2 (aka mainnet3):
    Continue from height=5629650 until height=5629801 (manual).
    You HAVE TO use AKASH_HALT_HEIGHT=5629801, otherwise an AppHash error at height 5629803 will corrupt the chain data.

  4. akash 0.16.3:
    Continue from height=5629801 until height=8526250 (auto).
    Govt SW upgrade proposal #27 will halt akash at that height, expecting 0.18.0.

  5. akash 0.18.0 (aka mainnet4):
    Continue from height=8526250 until height=8998907 (auto).
    Govt SW upgrade proposal #29 will halt akash at that height, expecting v0.20.0.
    IMPORTANT: Make sure you have exported the following environment variables before starting akash 0.18.0:

export AKASH_PRUNING=nothing
export AKASH_IAVL_DISABLE_FASTNODE=false
export AKASH_STATESYNC_SNAPSHOT_INTERVAL=0

Notes: state-sync and pruning are broken in 0.18.0.

  6. akash 0.20.0 (aka mainnet5; also compatible with 0.22.0 as there was no code change - https://github.com/akash-network/node/compare/v0.20.0..v0.22.0 ):
    Continue from height=8998907 until height=12606074 (auto).
    Govt SW upgrade proposal #224 will halt akash at that height, expecting v0.24.0.
    Notes: state-sync and pruning are fixed in 0.20.0
    You can set the AKASH_STATESYNC_SNAPSHOT_INTERVAL, AKASH_PRUNING, AKASH_IAVL_DISABLE_FASTNODE back to either the default values or the values you had before you upgraded to 0.18.0. It is safe to change these 3 parameters after they were set with akash 0.18.0.

  7. akash 0.24.0 (aka mainnet6):
    Continue from height=12606074 until height=12992204 (auto).
    Govt SW upgrade proposal #231 will halt akash at that height, expecting v0.26.0.
    NOTE: balances at heights before 12606074 (SW upgrade proposal #224, v0.24.0) cannot be queried after this or one of the following (0.26.1, 0.26.2, 0.28.0, 0.30.0) network upgrades; see akash-network/support#134.

  8. akash 0.26.1 (aka mainnet7):
    Continue from height=12992204 until height=13602142 (manual).
    You HAVE TO use AKASH_HALT_HEIGHT=13602142, otherwise you will hit the AppHash / "error on replay: wrong Block.Header.LastResultsHash" error at height 13602144. This is because height 13602143 is supposed to be processed by akash 0.26.2.

  9. akash 0.26.2:
    Continue from height=13602142 until height=13759618 (auto).
    Govt SW upgrade proposal #237 will halt akash at that height, expecting v0.28.0.
    Historical info:
    The chain was halted for 7h59m4s between blocks 13602144 and 13602145.
    This happened because not all validators swapped the binaries at the same time, and it appears that two 13602142 blocks were committed by different validators (one running 0.26.1 and the other 0.26.2), leading to the wrong block hash.
    Most validators restored from Polkachu's snapshot https://snapshots.polkachu.com/snapshots/akash/akash_13601931.tar.lz4 in order to reach consensus at height 13602145 (ERR prevote step: ProposalBlock is invalid err="wrong Block.Header.LastResultsHash. Expected D6923F1734C42C9261F736192281D71A0C8DEE534508CAD73AE5B6BD64CEBA9A, got 4CFF83ED877897DA71CB36AB85A0277F2801F26D242663C2A27FE1A1CA99F5C6" height=13602144 module=consensus)
    Consensus was reached when +2/3 (66.66%+) of the voting power came online with the right binary version.
    To explain this situation better: during non-governance network upgrades (via binary swap), there is a notable risk that a "bad" pending block is proposed by validators (holding +2/3 of the voting power, VP) still running the older binary during the brief transition period. Validators that have already switched to the new binary may still hold a "pending" block proposed under the old binary; the new binary does not recognize it, which creates a hash mismatch and produces the AppHash / Block.Header.LastResultsHash errors.
    To mitigate this issue, validators often resort to using the same snapshot, ensuring they start the new network binary from the same block height.
    The validators could instead have recovered without resorting to the snapshot, by deleting the wrong pending block.

  10. akash 0.28.0 (aka mainnet8):
    Continue from height=13759618
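
The process above can be condensed into a small lookup, sketched here in shell. Both function names are hypothetical and not part of akash. The sketch treats each "Continue from height=X" value as the first height belonging to that step, which is a simplification: the exact block at which one binary stops and the next one takes over may differ by a block or two, so use it as a reminder of the sequence rather than as an authoritative boundary check.

```shell
#!/bin/sh
# Hypothetical helper: print the akash version for the step that covers the
# given height, using the "Continue from height=..." values listed above.
akash_version_for_height() {
  h="$1"
  if   [ "$h" -lt 455200 ];   then echo "0.10.1"
  elif [ "$h" -lt 5629650 ];  then echo "0.14.1"
  elif [ "$h" -lt 5629801 ];  then echo "0.16.2"
  elif [ "$h" -lt 8526250 ];  then echo "0.16.3"
  elif [ "$h" -lt 8998907 ];  then echo "0.18.0"
  elif [ "$h" -lt 12606074 ]; then echo "0.20.0"
  elif [ "$h" -lt 12992204 ]; then echo "0.24.0"
  elif [ "$h" -lt 13602142 ]; then echo "0.26.1"
  elif [ "$h" -lt 13759618 ]; then echo "0.26.2"
  else                             echo "0.28.0"
  fi
}

# Hypothetical helper: print the AKASH_HALT_HEIGHT a version needs (the
# "manual" steps), or nothing for the "auto" steps where the upgrade
# proposal halts the chain.
manual_halt_height_for_version() {
  case "$1" in
    0.16.2) echo 5629801 ;;
    0.26.1) echo 13602142 ;;
    *) ;;
  esac
}
```

For a manual step, the printed value would be exported as AKASH_HALT_HEIGHT before starting that version; for an auto step the variable is left unset.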


Additional information

Recovering from AppHash / Block.Header.LastResultsHash errors

Here is a doc on how to build akash 0.26.2 with a rollback function to recover from the AppHash / "error on replay: wrong Block.Header.LastResultsHash" errors.
You can try leveraging it for different versions too.


cosmos-omnibus procedure

NOTE: I haven't updated it since the 0.20.0 / 0.22.0 (aka mainnet5), but it shouldn't be difficult to update the steps based on the above process.

If you are using the cosmos-omnibus project, you can use the following environment variables.

Start with step 1 at the bottom and move up one step each time the container stops: comment out the current step's variables and uncomment those of the next step.

    image: ghcr.io/ovrclk/cosmos-omnibus:v0.3.1-generic
...
    environment:
      ## 6) akash 0.20.0 start height=8998907 (also compatible with akash 0.22.0)
      #- BINARY_URL=https://github.com/ovrclk/akash/releases/download/v0.20.0/akash_0.20.0_linux_amd64.zip
      #- BINARY_ZIP_PATH=akash_0.20.0_linux_amd64/akash
      #
      ## 5) akash 0.18.0 start height=8526250 until height=8998907 (SW upgrade to v0.20.0)
      #- BINARY_URL=https://github.com/ovrclk/akash/releases/download/v0.18.0/akash_0.18.0_linux_amd64.zip
      #- BINARY_ZIP_PATH=akash_0.18.0_linux_amd64/akash
      # prevents "panic: cannot delete latest saved version" error
      #- AKASH_PRUNING=nothing
      # prevents "panic: runtime error: invalid memory address or nil pointer dereference" error
      #- AKASH_IAVL_DISABLE_FASTNODE=false
      # prevents "panic: runtime error: invalid memory address or nil pointer dereference" error on cosmos-sdk's `createSnapshot ... incrVersionReaders`
      #- AKASH_STATESYNC_SNAPSHOT_INTERVAL=0
      #
      ## 4) akash 0.16.3 start height=5629801 until height=8526250 (SW upgrade to 0.18.0);
      #- BINARY_URL=https://github.com/ovrclk/akash/releases/download/v0.16.3/akash_0.16.3_linux_amd64.zip
      #- BINARY_ZIP_PATH=akash_0.16.3_linux_amd64/akash
      #
      ## 3) akash 0.16.2 start height=5629650 until height=5629801
      #- BINARY_URL=https://github.com/ovrclk/akash/releases/download/v0.16.2/akash_0.16.2_linux_amd64.zip
      #- BINARY_ZIP_PATH=akash_0.16.2_linux_amd64/akash
      #- AKASH_HALT_HEIGHT=5629801
      #
      ## 2) akash 0.14.1 start height=455200 until height=5629650 (SW upgrade to 0.16.1); blockchain size 1222 GB
      #- BINARY_URL=https://github.com/ovrclk/akash/releases/download/v0.14.1/akash_0.14.1_linux_amd64.zip
      #- BINARY_ZIP_PATH=akash_0.14.1_linux_amd64/akash
      #
      ## 1) akash 0.10.1 start height=0 until height=455200 (SW upgrade to 0.14.1); blockchain size 63 GB
      - BINARY_URL=https://github.com/ovrclk/akash/releases/download/v0.10.1/akash_0.10.1_linux_amd64.zip
      - BINARY_ZIP_PATH=akash_0.10.1_linux_amd64/akash
      ## pre-fork, akashnet-1, akash 0.8.3 << DO NOT USE THIS!
      #- BINARY_URL=https://github.com/ovrclk/akash/releases/download/v0.8.3/akash_0.8.3_linux_amd64.zip
      #- BINARY_ZIP_PATH=akash_0.8.3_linux_amd64/akashd
      #- PROJECT_BIN=akashd
      #- PROJECT_DIR=akashd
      ## archive mode
      # do not prune anything on the archiving node.
      - PRUNING=nothing