@jshufro

jshufro/how.md Secret

Last active Nov 30, 2022

Preventing Node Reboots During Duties

Disclaimer

It's generally a Bad Idea to run random scripts from the internet on your node unless you know what they're doing. To that end, please read and try to understand this before installing it, and only install it if you feel comfortable.

Installation

When you configured your Rocket Pool node, you probably set up Automatic Updates pursuant to the guides. This enabled automatic reboots, which is great for the security of your node. However, it is not ideal to reboot during an active validator duty such as a sync committee, or when you have upcoming duties like proposals or sync committees.

Note that this script only works if the node has advance notice of the committees/proposals. Proposals in particular only come with 1 epoch of advance notice, so if you restart and have Doppelgänger Protection enabled, you can still miss a proposal during the 2-3 epochs after rebooting.
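To put those epoch counts in wall-clock terms, here is a quick calculation. It assumes the standard Ethereum mainnet parameters of 12 seconds per slot and 32 slots per epoch, which are not stated in this guide; the helper name epoch_seconds is made up for the example.

```shell
# Assumed mainnet consensus parameters (not taken from this guide).
SECONDS_PER_SLOT=12
SLOTS_PER_EPOCH=32

# Print the duration of N epochs in whole seconds.
epoch_seconds () {
        echo $(( $1 * SLOTS_PER_EPOCH * SECONDS_PER_SLOT ))
}

for n in 1 2 3; do
        secs=$(epoch_seconds "$n")
        # Integer math: whole minutes plus leftover seconds.
        echo "$n epoch(s) = ${secs}s ($((secs / 60))m $((secs % 60))s)"
done
```

Under those assumptions, a proposal is known a little over 6 minutes ahead, while the 2-3 epochs after a reboot span roughly 13 to 19 minutes.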

This guide will help you prevent the system from rebooting (and prevent you from accidentally rebooting) during those times.

This guide only works if you are running the Rocket Pool monitoring software (Grafana and Prometheus), because the script reads its metrics from the Prometheus container.
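A quick way to confirm the monitoring stack is up before going further is to look for the Prometheus container. This is only a sketch: it assumes the container is named rocketpool_prometheus, matching the hostname the script in this guide queries, and the helper has_container is hypothetical.

```shell
# Succeeds if the container name given as $1 appears on stdin
# (one name per line, as printed by `docker ps --format '{{.Names}}'`).
has_container () {
        grep -qxF -- "$1"
}

# Assumed container name, taken from the Prometheus hostname used in this guide.
NAME=rocketpool_prometheus

if command -v docker >/dev/null 2>&1 &&
   docker ps --format '{{.Names}}' 2>/dev/null | has_container "$NAME"; then
        echo "$NAME is running"
else
        echo "$NAME not found; the duty check would have no metrics to read"
fi
```

Run it as a user that is allowed to talk to Docker (or with sudo), otherwise the container will be reported as missing even when it is running.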

Install dependencies

sudo apt install jq molly-guard
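After installing, a small check like the following can confirm that the commands the script calls (jq and docker) are on the PATH. The helper name missing_commands is hypothetical and only used for illustration.

```shell
# Print each command from the argument list that is not found on the PATH.
missing_commands () {
        for cmd in "$@"; do
                command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
        done
}

missing=$(missing_commands jq docker)
if [ -z "$missing" ]; then
        echo "All required commands found."
else
        # Left unquoted on purpose so the names print on one line.
        echo "Missing commands:" $missing
fi
```

To check the molly-guard package itself, dpkg -s molly-guard reports whether it is installed.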

Turn off the SSH guard

By default, molly-guard prevents reboots when there's an active SSH session. I find this really annoying. To disable it, move its check out of run.d (moving it back re-enables it):

sudo mv /etc/molly-guard/run.d/30-query-hostname /etc/molly-guard/30-query-hostname

Create the new molly-guard script

Finally, we simply need to provide molly-guard with a script to check validator duties.

Run

sudo nano /etc/molly-guard/run.d/99-rocketpool

and paste in

#!/bin/bash
JQ=/usr/bin/jq
TIMEOUT_SEC=3
DOCKER_RUN="docker run --network rocketpool_monitor-net --rm curlimages/curl -m $TIMEOUT_SEC -s"
PROM="http://rocketpool_prometheus:9091/api/v1/query?query="
RVAL=0

# Queries prometheus for the upcoming_proposals metric
get_proposals () {
        echo $($DOCKER_RUN ${PROM}rocketpool_beacon_upcoming_proposals)
}

# Queries prometheus for the active_sync_committee metric
get_current_sync () {
        echo $($DOCKER_RUN ${PROM}rocketpool_beacon_active_sync_committee)
}

# Queries prometheus for the upcoming_sync_committee metric
get_upcoming_sync () {
        echo $($DOCKER_RUN ${PROM}rocketpool_beacon_upcoming_sync_committee)
}

# Check if curl is already pulled. If it isn't, pull it.
if [ "$(docker images -q curlimages/curl)" == "" ]; then
        echo "Pulling curl..."
        docker pull -q curlimages/curl
fi

# In parallel, query for all 3 metrics
exec 7< <(get_proposals)
exec 8< <(get_current_sync)
exec 9< <(get_upcoming_sync)

# Store the results in variables (-r keeps any backslashes in the JSON intact)
read -r PROPOSALS <&7
read -r CURRENT_SYNC <&8
read -r UPCOMING_SYNC <&9

# If jq isn't installed, we can't do anything, so notify the user but allow the shutdown.
if ! command -v "$JQ" > /dev/null; then
        echo "$JQ not found. Allowing shutdown."
        exit 0
fi

# Stall reboot if there are upcoming proposals
if [ -n "$PROPOSALS" ] && [ "$($JQ .status <<< "$PROPOSALS")" == "\"success\"" ]; then
        if [ "$($JQ '.data.result[0].value[1]' <<< "$PROPOSALS")" == "\"0\"" ]; then
                echo "0 upcoming proposals, allowing shutdown to continue."
        else
                echo "Upcoming proposals found, blocking reboot."
                ((RVAL|=1)) # Set the first bit of the exit code
        fi
fi

# Stall reboot if there is an active sync committee
if [ -n "$CURRENT_SYNC" ] && [ "$($JQ .status <<< "$CURRENT_SYNC")" == "\"success\"" ]; then
        if [ "$($JQ '.data.result[0].value[1]' <<< "$CURRENT_SYNC")" == "\"0\"" ]; then
                echo "0 active sync committees, allowing shutdown to continue."
        else
                echo "Active sync committees found, blocking reboot."
                ((RVAL|=2)) # Set the second bit of the exit code
        fi
fi

# Stall reboot if there is an upcoming sync committee
if [ -n "$UPCOMING_SYNC" ] && [ "$($JQ .status <<< "$UPCOMING_SYNC")" == "\"success\"" ]; then
        if [ "$($JQ '.data.result[0].value[1]' <<< "$UPCOMING_SYNC")" == "\"0\"" ]; then
                echo "0 upcoming sync committees, allowing shutdown to continue."
        else
                echo "Upcoming sync committees found, blocking reboot."
                ((RVAL|=4)) # Set the 3rd bit of the exit code
        fi
fi

exit $RVAL # Return the exit code. If any of the bits are set, the shutdown will be canceled.

Save and exit with Ctrl-O, Enter, Ctrl-X.

Then mark the file as executable:

sudo chmod u+x /etc/molly-guard/run.d/99-rocketpool
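To see what the script's jq filters actually extract, you can feed them a sample response by hand. The JSON below is a made-up example in the shape of a Prometheus instant-query response; the timestamp and value are illustrative, not real output from a node.

```shell
# Hypothetical sample in the shape of a Prometheus instant-query response.
SAMPLE='{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1669800000,"0"]}]}}'

if command -v jq >/dev/null 2>&1; then
        # Without -r, jq prints JSON strings with their quotes, which is why
        # the script compares against "\"success\"" and "\"0\"".
        status=$(printf '%s' "$SAMPLE" | jq '.status')
        count=$(printf '%s' "$SAMPLE" | jq '.data.result[0].value[1]')
        echo "status=$status count=$count"
else
        echo "jq not found; install it first."
fi
```

The second element of value is the metric's current value as a string, so anything other than "0" there is what makes the script block the reboot.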

Check that the script is working

$ sudo reboot --molly-guard-do-nothing
I: demo mode; molly-guard will not do anything due to --molly-guard-do-nothing.
0 upcoming proposals, allowing shutdown to continue.
0 active sync committees, allowing shutdown to continue.
0 upcoming sync committees, allowing shutdown to continue.
molly-guard: would run: /lib/molly-guard/reboot
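When the script does block a reboot, its exit code is a bitmask of the reasons: 1 for upcoming proposals, 2 for an active sync committee, and 4 for an upcoming sync committee, as set in the script above. A small helper along these lines can turn that number back into readable reasons; decode_rval is a hypothetical name, not part of molly-guard.

```shell
# Print one line per reason encoded in the exit code passed as $1.
decode_rval () {
        rval=$1
        [ $(( rval & 1 )) -ne 0 ] && echo "upcoming proposal"
        [ $(( rval & 2 )) -ne 0 ] && echo "active sync committee"
        [ $(( rval & 4 )) -ne 0 ] && echo "upcoming sync committee"
        [ "$rval" -eq 0 ] && echo "no duties, reboot allowed"
        return 0
}

# Example: 5 = 1 + 4, i.e. an upcoming proposal and an upcoming sync committee.
decode_rval 5
```

One way to use it is to run the check directly and decode the result, for example sudo /etc/molly-guard/run.d/99-rocketpool; decode_rval $?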

Note that this solution only prevents new shutdowns from being scheduled; it will not abort an already pending shutdown. To cancel a pending one, call sudo shutdown -c.
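If you are unsure whether a shutdown is already pending, something like the following may help. It rests on an assumption that is not part of this guide: that systemd records a scheduled shutdown in /run/systemd/shutdown/scheduled as KEY=VALUE lines including MODE. Verify that on your own system before relying on it; pending_shutdown is a hypothetical helper.

```shell
# Print the scheduled shutdown mode (e.g. "reboot") if the state file exists;
# return non-zero if nothing is scheduled. The default path is an assumption.
pending_shutdown () {
        file=${1:-/run/systemd/shutdown/scheduled}
        [ -f "$file" ] || return 1
        # The file is assumed to hold KEY=VALUE lines; MODE names the action.
        sed -n 's/^MODE=//p' "$file"
}

if mode=$(pending_shutdown); then
        echo "Pending shutdown (mode: $mode); cancel it with: sudo shutdown -c"
else
        echo "No shutdown is scheduled."
fi
```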

Overriding molly-guard

If you ever want to reboot or shut down despite molly-guard, you can bypass it by calling sudo systemctl reboot or sudo systemctl poweroff respectively.
