@jleskovar
Last active January 12, 2019 19:15
btcd watchdog
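#!/bin/bash
# Watchdog for btcd: evicts the sync peer, or restarts btcd, when block sync stalls.

POST_INIT_SYNC_DELAY=60   # seconds to wait after first starting btcd
POLL_DELAY=60             # seconds between block-height samples
STALL_THRESHOLD=5         # peer evictions tolerated before btcd itself is restarted

# Stop any running btcd, start a fresh instance, and reset the stall counter.
restart_btcd() {
    local pids
    pids=$(pidof btcd)
    if [ -n "$pids" ]; then
        # Intentionally unquoted: pidof may return several PIDs.
        kill $pids
    fi
    sleep 10
    nohup btcd &
    stalls=0
}

if [ -z "$(pidof btcd)" ]; then
    echo "Starting btcd"
    nohup btcd &
    sleep "$POST_INIT_SYNC_DELAY"
fi

stalls=0
while true; do
    # Sample the block height twice, POLL_DELAY seconds apart.
    start=$(btcctl --notls getinfo | jq -r .blocks)
    sleep "$POLL_DELAY"
    end=$(btcctl --notls getinfo | jq -r .blocks)
    echo "Processed $((end - start)) blocks in the last $POLL_DELAY seconds"

    if [[ "$start" == "$end" ]]; then
        if (( stalls > STALL_THRESHOLD )); then
            echo "Too many stalls detected. Restarting btcd..."
            restart_btcd
        else
            # Address of the current sync peer with the port cut off.
            # Splitting on ':' only works for IPv4 addresses.
            syncnode=$(btcctl --notls getpeerinfo | jq -r '.[] | select(.syncnode == true) | .addr' | cut -f1 -d:)
            if [ -z "$syncnode" ]; then
                echo "Stall detected, but no syncnode found. Restarting btcd..."
                restart_btcd
            else
                echo "Stall detected! Evicting potentially bad node $syncnode"
                btcctl --notls node disconnect "$syncnode"
                stalls=$(( stalls + 1 ))
            fi
        fi
    fi
done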
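
One edge case in the loop above: if btcctl cannot reach btcd, both height reads come back empty (or as the string "null" when the JSON has no .blocks field), so the two values compare equal and the interval is counted as a stall. That happens to end in a restart, but it can be made explicit. The following is only a sketch; is_height and classify_interval are hypothetical helper names, not part of the gist.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the argument is a non-negative integer.
is_height() {
    [[ $1 =~ ^[0-9]+$ ]]
}

# Hypothetical helper: classify one polling interval from two height samples.
# Prints "error" if either sample is not a number, "stalled" if the height
# did not change, and "progress" otherwise.
classify_interval() {
    local start=$1 end=$2
    if ! is_height "$start" || ! is_height "$end"; then
        echo "error"
    elif [ "$start" -eq "$end" ]; then
        echo "stalled"
    else
        echo "progress"
    fi
}
```

Inside the loop, `case "$(classify_interval "$start" "$end")" in ... esac` could then route "error" straight to a restart and only count "stalled" towards STALL_THRESHOLD.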

adiack commented Apr 20, 2018

Works like a charm, thank you. In my case I only had to remove --notls.
./watchdog_btcd.sh

+ POST_INIT_SYNC_DELAY=60
+ POLL_DELAY=60
+ STALL_THRESHOLD=5
++ pidof btcd
+ '[' -z 5465 ']'
+ stalls=0
+ true
++ jq -r .blocks
++ btcctl getinfo
+ start=384672
+ sleep 60
++ btcctl getinfo
++ jq -r .blocks
+ end=384672
+ echo 'Processed 0 blocks in the last 60 seconds'
Processed 0 blocks in the last 60 seconds
+ [[ 384672 == \3\8\4\6\7\2 ]]
+ ((  stalls > STALL_THRESHOLD  ))
++ btcctl getpeerinfo
++ jq -r '.[] | select(.syncnode == true) | .addr'
++ cut -f1 -d:
+ syncnode=217.23.8.80
+ '[' -z 217.23.8.80 ']'
+ echo 'Stall detected! Evicting potentially bad node 217.23.8.80'
Stall detected! Evicting potentially bad node 217.23.8.80
+ btcctl node disconnect 217.23.8.80
2018-04-20 09:28:00.697 [INF] SYNC: Lost peer 217.23.8.80:8333 (outbound)
2018-04-20 09:28:00.697 [INF] SYNC: Syncing to block height 519094 from peer 83.248.113.248:8333
+ stalls=1
+ true
++ jq -r .blocks
++ btcctl getinfo
+ start=384672
+ sleep 60
2018-04-20 09:28:00.977 [INF] SYNC: New valid peer 5.15.98.67:8333 (outbound) (/Satoshi:0.16.0/)
2018-04-20 09:28:01.391 [INF] SYNC: Processed 1 block in the last 7m29.19s (2 transactions, height 384673, 2015-11-21 19:38:21 +0000 UTC)
2018-04-20 09:28:11.851 [INF] SYNC: Processed 3 blocks in the last 10.46s (1207 transactions, height 384676, 2015-11-21 19:47:05 +0000 UTC)
2018-04-20 09:28:24.364 [INF] SYNC: Processed 6 blocks in the last 12.51s (3072 transactions, height 384682, 2015-11-21 20:19:26 +0000 UTC)
2018-04-20 09:28:36.536 [INF] SYNC: Processed 2 blocks in the last 12.17s (3743 transactions, height 384684, 2015-11-21 20:55:52 +0000 UTC)
2018-04-20 09:28:52.387 [INF] SYNC: Processed 4 blocks in the last 15.85s (2171 transactions, height 384688, 2015-11-21 21:24:00 +0000 UTC)
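
Since adiack's setup needed --notls removed, the flag could be made switchable rather than hard-coded. A minimal sketch, assuming a hypothetical BTCCTL_NOTLS environment variable and a hypothetical btcctl_cmd wrapper (neither exists in the gist or in btcctl itself):

```shell
#!/bin/bash
# BTCCTL_NOTLS is a hypothetical variable, not something btcctl reads.
# The default of 1 keeps the gist's behaviour; set it to 0 to use TLS.
build_btcctl_args() {
    BTCCTL_ARGS=()
    if [ "${BTCCTL_NOTLS:-1}" = "1" ]; then
        BTCCTL_ARGS+=(--notls)
    fi
}

# Hypothetical wrapper to call instead of the bare "btcctl --notls ..." form.
btcctl_cmd() {
    build_btcctl_args
    btcctl "${BTCCTL_ARGS[@]}" "$@"
}
```

With this in place, `btcctl_cmd getinfo | jq -r .blocks` would replace the direct calls, and `BTCCTL_NOTLS=0 ./watchdog_btcd.sh` would run the watchdog against a TLS-enabled btcd.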

@githorray

I had trouble getting the script to ban stalled IPv6 hosts. It is easier to ban by node id than by IP:

syncnode=$(btcctl --notls getpeerinfo | jq -r '.[] | select(.syncnode == true) | .id')
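
The IPv6 trouble most likely comes from `cut -f1 -d:`, which keeps only the text before the first colon. Selecting .id as above avoids parsing the address at all. If the host part is still wanted (for logging, say), a port-stripping helper is one option. This is a sketch only: strip_port is a hypothetical name, and it assumes IPv6 peers are reported in the bracketed "[addr]:port" form.

```shell
#!/bin/bash
# Hypothetical helper (not in the original gist): remove the trailing ":port"
# from a peer address. Assumes IPv6 addresses arrive as "[addr]:port".
strip_port() {
    local addr=$1
    local host=${addr%:*}   # drop the shortest trailing ":..." (the port)
    host=${host#\[}         # drop a leading "[" if present
    host=${host%\]}         # drop a trailing "]" if present
    printf '%s\n' "$host"
}
```

For example, `strip_port "217.23.8.80:8333"` yields the bare IPv4 address, while a bracketed IPv6 address keeps all of its inner colons.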

neogeno commented Aug 9, 2018

This helped a lot

ccdle12 commented Oct 3, 2018

Thank you, very helpful
