Bash Script for Nagios to Check Status of Docker Container
#!/bin/bash
# Author: Erik Kristensen
# Email: erik@erikkristensen.com
# License: MIT
# Nagios Usage: check_nrpe!check_docker_container!_container_id_
# Usage: ./check_docker_container.sh _container_id_
#
# Depending on your docker configuration, root might be required. If your nrpe user has rights
# to talk to the docker daemon, then root is not required. This is why root privileges are not
# checked.
#
# The script checks if a container is running.
# OK - running
# WARNING - restarting
# CRITICAL - stopped
# UNKNOWN - does not exist
#
# CHANGELOG - March 20, 2017
# - Removes Ghost State Check, Checks for Restarting State, Properly finds the Networking IP addresses
# - Returns unknown (exit code 3) if docker binary is missing, unable to talk to the daemon, or if container id is missing
CONTAINER=$1
if [ -z "$CONTAINER" ]; then
    echo "UNKNOWN - Container ID or Friendly Name Required"
    exit 3
fi
# command -v is a shell builtin, so the check does not depend on `which` being installed
if ! command -v docker > /dev/null 2>&1; then
    echo "UNKNOWN - Missing docker binary"
    exit 3
fi
if ! docker info > /dev/null 2>&1; then
    echo "UNKNOWN - Unable to talk to the docker daemon"
    exit 3
fi
# Treat any non-zero exit status from docker inspect as "container not found"
if ! RUNNING=$(docker inspect --format="{{.State.Running}}" "$CONTAINER" 2> /dev/null); then
    echo "UNKNOWN - $CONTAINER does not exist."
    exit 3
fi
if [ "$RUNNING" = "false" ]; then
    echo "CRITICAL - $CONTAINER is not running."
    exit 2
fi
RESTARTING=$(docker inspect --format="{{.State.Restarting}}" "$CONTAINER")
if [ "$RESTARTING" = "true" ]; then
    echo "WARNING - $CONTAINER state is restarting."
    exit 1
fi
STARTED=$(docker inspect --format="{{.State.StartedAt}}" "$CONTAINER")
NETWORK=$(docker inspect --format="{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" "$CONTAINER")
echo "OK - $CONTAINER is running. IP: $NETWORK, StartedAt: $STARTED"
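
The script above runs docker inspect once per field. As a rough sketch of an alternative, a single .State.Status value could be mapped to the Nagios exit codes listed in the header. The function name status_to_nagios is hypothetical, and treating a paused container as WARNING is an assumption (the script above would most likely report it as OK, since Running stays true while paused).

```bash
# Hypothetical helper: map a Docker .State.Status string to a Nagios result.
# Prints the Nagios label and returns the matching plugin exit code.
status_to_nagios() {
    case "$1" in
        running)
            echo "OK"; return 0 ;;
        restarting|paused)
            # paused -> WARNING is an assumption, not taken from the original script
            echo "WARNING"; return 1 ;;
        created|removing|exited|dead)
            echo "CRITICAL"; return 2 ;;
        *)
            # empty or unrecognised status, e.g. the container does not exist
            echo "UNKNOWN"; return 3 ;;
    esac
}

# Example use (needs access to the docker daemon):
#   status=$(docker inspect --format='{{.State.Status}}' "$CONTAINER" 2> /dev/null) || status=""
#   label=$(status_to_nagios "$status"); code=$?
#   echo "$label - $CONTAINER status: ${status:-not found}"
#   exit "$code"
```

The mapping lives in one place, so changing how a state is reported only touches the case statement.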

UnquietCode Jun 1, 2015

Pro, this is exactly what I was looking for. Thanks!

B-Galati Jun 11, 2015

Thanks a lot ;)

wrenzi Jun 16, 2015

Nice gist, thanks.

mohammadhamzehloui Jul 3, 2015

Thanks a lot, great job.

Alino Jul 18, 2015

cool, thank you

l8nite Jul 19, 2015

Are ghosted containers still an issue? This thread seems to indicate they aren't

benishor Jul 31, 2015

Thanks, just what I needed!

SpheMakh commented Nov 8, 2015

💯

lightxu Nov 19, 2015

Awesome script, thank you :) docker inspect is a powerful command

hermanjunge commented Dec 31, 2015

💯 💯

ktanriverdi Jan 5, 2016

How can I run this script without sudo? When I run it as the nagios user, it says the container does not exist, but sudo ./check_docker_container container_id works fine. Any suggestions? Thanks.

michabbb Feb 2, 2016

When running the Ghost check, I get:

Template parsing error: template: :1:9: executing "" at <.State.Ghost>: Ghost is not a field of struct type *types.ContainerState

Is this an error or a normal message? Thanks!

BartJol Feb 23, 2016

I get the same error as michabbb (Docker version 1.9.1, build a34a1d5, on Ubuntu Trusty Tahr 14.04.4 LTS). When I run docker inspect on my container, there is no State.Ghost field in the output. So I expect Ghost is either:
a. a State property that this particular container just doesn't have, or
b. a State property that this Docker version doesn't include.
Neither worries me, but it would be nice to know which it is :)

I thought of an option c: the property is only added in specific circumstances. That would be strange though, imho.

The State properties I do have are:
"State": {
    "Status": "running",
    "Running": true,
    "Paused": false,
    "Restarting": false,
    "OOMKilled": false,
    "Dead": false,
    "Pid": 5667,
    "ExitCode": 0,
    "Error": "",
    "StartedAt": "2016-02-23T11:44:17.338327578Z",
    "FinishedAt": "0001-01-01T00:00:00Z"
},

I'm a bit of a newbie in Docker, but maybe Dead is a replacement for Ghost. I'll have to read more changelogs and manual pages.
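
One way to make a check tolerate State fields that differ between Docker versions is to fall back to a default when the template cannot be evaluated. This is only a sketch: state_field is a hypothetical helper, and it assumes docker inspect exits non-zero when the template names a missing field, which is consistent with the error quoted above. A missing container fails the same way, so existence should be checked first.

```bash
# Hypothetical helper: print a .State field of a container, or a fallback
# value when docker inspect fails (for example, the field is unknown to this
# Docker version).
#   $1 = container, $2 = field name under .State, $3 = fallback value
state_field() {
    value=$(docker inspect --format="{{.State.$2}}" "$1" 2> /dev/null) || value=$3
    echo "$value"
}

# Example use:
#   GHOST=$(state_field "$CONTAINER" Ghost false)
#   RUNNING=$(state_field "$CONTAINER" Running unknown)
```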

jedrekdomanski Apr 27, 2016

Awesome, it helped me to write my own script. Good job.

JasonRitchie Aug 5, 2016

Since this comment, I've adopted this script with two modifications:

  1. It returns status UNKNOWN if the docker command is missing (instead of OK)
  2. I removed all the GHOST check code

hash docker 2>/dev/null || { echo "UNKNOWN - docker command not found"; exit 3; }

andrusstrockiy Aug 19, 2016

First off, thanks for sharing.
A small comment:
When you run the above script from NRPE, i.e. as the unprivileged nagios user, make sure that user has sudo rights for the

docker inspect

command.
I lost about an hour discovering why the script didn't work through NRPE.

I added something like this to the top:
# permissions
if [ "$(whoami)" != "root" ]; then
    echo "Root privileges are required to run this, try running with sudo..."
    exit 2
fi
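
The script header notes that root is deliberately not checked, because a user with rights to talk to the daemon is enough. A narrower test, sketched below, is whether the current user can read and write the Docker socket. The helper name can_reach_docker_socket is hypothetical, and the default path assumes the daemon listens on /var/run/docker.sock.

```bash
# Hypothetical helper: succeed only if the current user can read and write
# the Docker socket. The default path is an assumption about the host setup.
can_reach_docker_socket() {
    sock=${1:-/var/run/docker.sock}
    [ -e "$sock" ] && [ -r "$sock" ] && [ -w "$sock" ]
}

# Example use near the top of the plugin. Exit code 3 (UNKNOWN) is used
# because the container state could not be determined at all:
#   if ! can_reach_docker_socket; then
#       echo "UNKNOWN - $(id -un) cannot access the docker socket"
#       exit 3
#   fi
```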

rumeshbandara Nov 21, 2016

Thanks man, after 3 years it's still not outdated! 👍

ventz Nov 26, 2016

If you are using Docker 1.12+ with the IP driver (direct, macvlan, etc.), the way to get the IP address is:

NETWORK=$(docker inspect --format="{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" $CONTAINER)

That also works on the default networked containers, so it's probably a better way.
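
One caveat with the range template: it prints the addresses back to back, so a container attached to more than one network would show its IPs run together. A small sketch that separates them with spaces follows; container_ips is a hypothetical name.

```bash
# Hypothetical helper: print every network IP of a container, space-separated.
# The space inside the range block separates the addresses; sed trims the
# trailing one. The function's exit status is that of sed, not docker.
container_ips() {
    docker inspect \
        --format='{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' \
        "$1" 2> /dev/null | sed 's/ *$//'
}

# Example use:
#   NETWORK=$(container_ips "$CONTAINER")
```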

noxxer Dec 15, 2016

A slight improvement:

docker ps -q --filter "name=nginx" | awk '{print $1}' | xargs docker inspect --format="{{ .State.Status }}"

other ps filter options:
(https://docs.docker.com/v1.11/engine/reference/commandline/ps/)
and inspect additional features :
(https://docs.docker.com/v1.11/engine/reference/commandline/inspect/)
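
Two things to watch for with that pipeline: docker ps without -a lists only running containers, so a stopped container yields no ID at all, and the name filter matches substrings, so more than one ID may come back. The sketch below handles both cases; status_by_name is a hypothetical helper and returning 3 for "no match" is an assumption in line with the UNKNOWN code used by the script.

```bash
# Hypothetical helper: print "name status" for every container whose name
# matches $1, including stopped ones. Returns 3 when nothing matches.
status_by_name() {
    ids=$(docker ps -aq --filter "name=$1")
    if [ -z "$ids" ]; then
        echo "missing"
        return 3
    fi
    # $ids is left unquoted on purpose: one argument per container ID
    docker inspect --format='{{.Name}} {{.State.Status}}' $ids
}

# Example use:
#   status_by_name nginx
```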

lalitprasanth12 Feb 8, 2017

Thanks a lot!!

esteinborn Mar 17, 2017

2> /dev/null is what I needed after hours of searching. 🙇
Thank you!

ekristen (Owner) Mar 20, 2017

I wasn't getting notifications on this! My apologies.

I've updated the script with most of the suggestions in the comments.

Please note, I'm not using this script anymore, but if needed I'll move this to a git repo so pull requests can be accepted.

wirtoo Mar 26, 2018

Hello guys.
I added the nagios user to the docker group, so it has permission to talk to the docker daemon.
When I execute the script as nagios:

nagios@nrpe-client-host:~$ sh /usr/lib/nagios/plugins/check_docker_container.sh redis
/usr/lib/nagios/plugins/check_docker_container.sh: 25: [: xredis: unexpected operator
/usr/lib/nagios/plugins/check_docker_container.sh: 30: [: x/usr/bin/docker: unexpected operator
/usr/lib/nagios/plugins/check_docker_container.sh: 48: [: true: unexpected operator
/usr/lib/nagios/plugins/check_docker_container.sh: 55: [: false: unexpected operator
OK - redis is running. IP: 172.17.0.2, StartedAt: 2018-03-01T08:07:42.857992735Z

So, it works.
But when I'm trying to execute it from nagios host:

user@nagios-host:~$ /usr/local/nagios/libexec/check_nrpe -H 123.45.67.89 -c check_docker_container redis
NRPE: Unable to read output

Nagios displays the same "UNKNOWN - NRPE: Unable to read output"

Here is nrpe.cfg
command[check_docker_container]=/usr/lib/nagios/plugins/check_docker_container.sh
and service definition

define service {
        use                             generic-service
        host_name                       host-example
        service_description             Redis Docker Container
        check_command                   check_nrpe!check_docker_container!redis
}

Am I missing something? What's wrong?
Thank you!
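
The "unexpected operator" lines in the local run are consistent with the script being started through sh, which bypasses the #!/bin/bash line. On Debian and Ubuntu, sh is usually dash, whose [ builtin does not accept ==. Each failed test is then skipped, so the script falls through to the OK line whatever the container state is. A minimal sketch of the argument check written with the POSIX = operator follows; require_container_arg is a hypothetical name.

```bash
# Hypothetical helper: "=" is the string comparison defined by POSIX test,
# so this behaves the same under bash and dash.
require_container_arg() {
    if [ "x$1" = "x" ]; then
        echo "UNKNOWN - Container ID or Friendly Name Required"
        return 3
    fi
    return 0
}

# Example use:
#   require_container_arg "$1" || exit $?
```

Running the plugin directly (./check_docker_container.sh redis) or through bash avoids the problem as well.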

wirtoo Mar 27, 2018

Just solved my issue.
It was about permissions, sorry.
Great script, btw. Thanks.

jjbursik Jul 25, 2018

@wirtoo can you share the permission issue fix you used?

r3lik Aug 6, 2018

I'm running NRPE in a container. Do I need to add the nagios user to /etc/sudoers in the container itself?
From my Nagios host:

./check_nrpe -H 10.99.125.131 -c check_docker_container1
NRPE: Unable to read output

vanokg Sep 18, 2018

What am I doing wrong?

Remote
/usr/local/nagios/libexec/check_nrpe -H hostip -c check_docker -a asterisk
UNKNOWN - Missing docker binary

Local
/usr/lib64/nagios/plugins/check_docker asterisk
OK - asterisk is running. IP: 172.19.0.2, StartedAt: 2018-09-14T06:44:09.174409454Z
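
A check that works locally but reports "Missing docker binary" through NRPE may be running with a narrower PATH than an interactive shell. The thread does not confirm this, so treat it as one possible cause. The sketch below looks in PATH first and then in a list of extra directories; find_binary is a hypothetical helper and the directories in the usage line are assumptions.

```bash
# Hypothetical helper: print the path of an executable, looking in PATH first
# and then in the extra directories given as arguments. Returns 1 if not found.
#   $1 = binary name, remaining arguments = extra directories to try
find_binary() {
    name=$1
    shift
    if command -v "$name" > /dev/null 2>&1; then
        command -v "$name"
        return 0
    fi
    for dir in "$@"; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Example use:
#   DOCKER=$(find_binary docker /usr/bin /usr/local/bin) || {
#       echo "UNKNOWN - Missing docker binary"; exit 3; }
#   "$DOCKER" info > /dev/null 2>&1
```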
