@michaeljfazio
Last active February 23, 2020 20:52
Jormungandr Node Monitor
#!/bin/bash
#
# Author: Michael Fazio (sandstone.io)
#
# This script monitors a Jormungandr node for "liveness" and executes a shutdown if the node is determined
# to be "stuck". A node is "stuck" if the time elapsed since last block exceeds the sync tolerance
# threshold. The script does NOT perform a restart on the Jormungandr node. Instead we rely on process
# managers such as systemd to perform restarts.
POLLING_INTERVAL_SECONDS=30
SYNC_TOLERANCE_SECONDS=240
REST_API="http://127.0.0.1:8443/api"

while true; do
    # Fetch node stats; if the node is unreachable, LAST_BLOCK is empty.
    LAST_BLOCK=$(jcli rest v0 node stats get --output-format json --host "$REST_API" 2> /dev/null)
    LAST_BLOCK_HEIGHT=$(echo "$LAST_BLOCK" | jq -r .lastBlockHeight)
    LAST_BLOCK_DATE=$(echo "$LAST_BLOCK" | jq -r .lastBlockTime)
    # Convert the block timestamp to epoch seconds (requires GNU date).
    # The result is empty when the timestamp is missing or cannot be parsed.
    LAST_BLOCK_TIME=$(date -d"$LAST_BLOCK_DATE" +%s 2> /dev/null)
    CURRENT_TIME=$(date +%s)
    DIFF_SECONDS=$((CURRENT_TIME - LAST_BLOCK_TIME))

    # An empty LAST_BLOCK_TIME evaluates to 0 in arithmetic context.
    if ((LAST_BLOCK_TIME > 0)); then
        if ((DIFF_SECONDS > SYNC_TOLERANCE_SECONDS)); then
            echo "Jormungandr out-of-sync. Time difference of $DIFF_SECONDS seconds. Shutting down node..."
            jcli rest v0 shutdown get --host "$REST_API"
        else
            echo "Jormungandr synchronized. Time difference of $DIFF_SECONDS seconds. Last block height $LAST_BLOCK_HEIGHT."
        fi
    else
        echo "Jormungandr node is offline or bootstrapping..."
    fi

    sleep "$POLLING_INTERVAL_SECONDS"
done
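
The "stuck" decision in the loop above can be isolated as a small function, which makes the threshold logic easy to check without a running node. This is only a sketch: `is_stuck` is a hypothetical helper name that does not appear in the original script, and the timestamps in the example are made up.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script).
# Succeeds (exit status 0) when the last block is older than the tolerance.
# Arguments: current epoch seconds, last block epoch seconds, tolerance in seconds.
is_stuck() {
    local now="$1" last_block="${2:-0}" tolerance="$3"
    # A missing or zero block time means "offline or bootstrapping", not "stuck".
    [ "$last_block" -gt 0 ] || return 1
    [ $((now - last_block)) -gt "$tolerance" ]
}

# Example: a block that is 300 seconds old, checked against a 240-second tolerance.
if is_stuck 1000300 1000000 240; then
    echo "stuck"
else
    echo "ok"
fi
```

Keeping the "offline or bootstrapping" case separate from the "stuck" case mirrors the original script, which only shuts the node down once a valid block time has been observed.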
@aenomis

aenomis commented Jan 5, 2020

Hi Michael,
Great job on the script. I tested it on my node and it works fine when I run the script.
I connect to my server over SSH and can't keep the connection open all the time, so I created a service that starts the script. But when I check the status of my service, I always see this message: "Jormungandr node is offline or bootstrapping..." What should I change in order to run your script as a Linux service?
Thank you.

@michaeljfazio
Author

Hey. There really isn’t anything special required to run this as a service. If you can run it manually then it should run just fine as a service also.

@aenomis

aenomis commented Jan 6, 2020

Hi, I can now run the script as a Linux service as well. I solved it by adding the path to jcli in the script (my changes are in yellow):

[image: example]

@gacallea

gacallea commented Jan 22, 2020

> Hi, I can now run the script as a Linux service as well. I solved it by adding the path to jcli in the script (my changes are in yellow):
>
> [image: example]

You'd be better off with this:

JCLI="$(which jcli)"
[ -z "${JCLI}" ] && [ -f jcli ] && JCLI="./jcli"

It looks for jcli in your $PATH and, if that fails, it looks in the current directory.
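
A likely reason the script behaved differently as a service is that process managers such as systemd often start services with a minimal PATH, which may not include the directory where jcli is installed. The lookup above could also be written as a function that fails explicitly when jcli is found nowhere. A minimal sketch, assuming the same two locations; `resolve_jcli` is a hypothetical name, and `command -v` is used in place of `which` because it is a shell builtin.

```shell
#!/bin/bash
# Hypothetical helper: print the path of the jcli binary, or fail if none is found.
resolve_jcli() {
    local path
    if path=$(command -v jcli); then
        # Found in PATH.
        echo "$path"
    elif [ -x ./jcli ]; then
        # Fall back to an executable in the current directory.
        echo "./jcli"
    else
        return 1
    fi
}

if JCLI=$(resolve_jcli); then
    echo "Using jcli at $JCLI"
else
    echo "jcli not found in PATH or in the current directory" >&2
fi
```

The monitor script could then call "$JCLI" wherever it currently calls jcli. Note that under a service manager the "current directory" is whatever working directory the service is configured with, so an absolute path may still be the more predictable choice.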
