@slackpad
Last active March 23, 2016 17:53
A bare-bones dead node notification system for Consul

Here's a basic watch script that will look through the output for serfHealth checks in the critical state. The serfHealth check is a built-in check added by Consul that keeps track of the health of a node. When this watch handler fires, it will get the JSON body of the health endpoint passed to it over stdin.

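#!/bin/sh
# Read the health-check JSON from stdin and report the node behind every
# serfHealth check that is in the critical state.
for node in $(jq '.[] | select(.CheckID=="serfHealth" and .Status=="critical") | .Node' -); do
    echo "$node is dead"
done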
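
To try the filter without a running cluster, you can pipe a hand-written payload through it. This is only a sketch: the payload is made up (the node names `node-a`, `node-b`, `node-c` and the `service:web` check are hypothetical), and `critical_serf_nodes` is a hypothetical helper that wraps the same jq filter the handler uses.

```sh
#!/bin/sh
# Hypothetical helper: the same jq filter the handler uses, reading the
# check list from stdin.
critical_serf_nodes() {
    jq '.[] | select(.CheckID=="serfHealth" and .Status=="critical") | .Node'
}

# Made-up payload in the shape of the checks output; only node-a is a
# serfHealth check in the critical state, so only node-a should be printed.
cat <<'EOF' | critical_serf_nodes
[
  {"Node": "node-a", "CheckID": "serfHealth",  "Status": "critical"},
  {"Node": "node-b", "CheckID": "serfHealth",  "Status": "passing"},
  {"Node": "node-c", "CheckID": "service:web", "Status": "critical"}
]
EOF
```

jq prints the node name as a JSON string, so it comes out with its double quotes intact; that is why the sample run further down shows the node name in quotes.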

You can then run this on the command line via consul watch. We add filtering so the handler only gets called when there are checks in the critical state, but the script also works without the filter:

consul watch -type=checks -state=critical ./dead_node.sh

The watch can also be registered in an agent's configuration if you don't want to run a separate command-line process on the side.
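
As a rough sketch of the agent-side setup, the script below writes a watch definition into a config directory. The directory `./consul.d`, the file name `dead_node_watch.json`, and the handler path `/usr/local/bin/dead_node.sh` are all assumptions for illustration; use whatever your agent actually reads.

```sh
#!/bin/sh
# Hypothetical locations: adjust to wherever your agent reads its config
# and wherever dead_node.sh is installed.
CONF_DIR="${CONF_DIR:-./consul.d}"
HANDLER="${HANDLER:-/usr/local/bin/dead_node.sh}"

mkdir -p "$CONF_DIR"

# The same watch as the command line above, expressed as agent configuration.
cat > "$CONF_DIR/dead_node_watch.json" <<EOF
{
  "watches": [
    {
      "type": "checks",
      "state": "critical",
      "args": ["$HANDLER"]
    }
  ]
}
EOF
```

The agent picks this up when it is started with `-config-dir` pointing at that directory. As far as I know, releases from around the time this was written took a single `handler` string in place of `args`, so check the watch documentation for the version you run.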

The main limitation of this script is that it sees every failed node each time any node fails. If you are sending a summary of failed nodes as a notification then this works fine; otherwise you'll need to keep a little state somewhere so you don't re-trigger notifications for nodes you have already reported (or apply some rate limit).
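
One way to keep that state is a small file holding the nodes seen on the previous run. This is a minimal sketch, not part of the original script: the state file path and the `report_new_dead_nodes` function are hypothetical, and it assumes node names arrive one per line on stdin.

```sh
#!/bin/sh
# Hypothetical state file holding the nodes that were dead on the previous run.
STATE_FILE="${STATE_FILE:-/tmp/dead_nodes.state}"

# Read node names from stdin (one per line), print only the ones that were
# not in the state file, then replace the state file with the current set.
report_new_dead_nodes() {
    current=$(mktemp)
    sort -u > "$current"
    touch "$STATE_FILE"
    # comm -13 keeps lines that appear only in the second file: nodes that
    # are dead now but were not dead on the previous run.
    comm -13 "$STATE_FILE" "$current" | while read -r node; do
        echo "$node is dead"
    done
    mv "$current" "$STATE_FILE"
}

# In a watch handler you would feed it the names extracted by jq, e.g.:
#   jq -r '.[] | select(.CheckID=="serfHealth" and .Status=="critical") | .Node' \
#       | report_new_dead_nodes
```

Because the state file is overwritten with the current set on every run, a node that recovers and later fails again is reported again. The `-r` flag makes jq print raw strings, so the names are stored without quotes.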

Here's a sample run:

workpad:consul-demo-tf james$ consul watch -type=checks -state=critical ./dead_node.sh
"consul-client-nyc3-2" is dead
# watch blocks waiting for more updates...