Bash script sending RabbitMQ metrics to AWS CloudWatch
#! /bin/bash
# This script reads RabbitMQ statistics and reports them as CloudWatch metrics.
# Author: Mirek Svoboda
# Version: 0.3
# TABSTOP=4
#
# Changelog:
# 0.3
# -- increasing the number of AWS CLI calls from 3 to 5. We added more RabbitMQ queues and then hit the limit of 20 datapoints per call.
#
# 0.2
# -- minimizing the number of AWS CLI calls; these calls are CPU expensive: 42 calls/minute depleted all CPU credits (average CPU utilization of 20% on a t2.small)
# -- the introduced changes require a newer AWS CLI version than the one available in the Ubuntu repository
#
# This script is known to work with AWS CLI v1.8.6+. It does not work with AWS CLI v1.2.9.
# This script is known to work with rabbitmq-server 3.2.4 and rabbitmq-server 3.5.1
# The script reads the value of each queue-related statistic listed in the STATS2READ variable.
# Each statistic is read for every queue on the local RabbitMQ broker.
# The statistics are then pushed to AWS CloudWatch as custom metrics.
# The metric namespace is RabbitMQ.
# Dimensions come in two combinations: Queue, and Queue + InstanceId.
# For each queue and statistic a datapoint is submitted twice:
#   first datapoint for the cluster-wide metric, with Queue as the only dimension
#   second datapoint for the instance-specific metric, with Queue and InstanceId as dimensions
# For queue statistics the metric name is the statistic name; the queue name is carried in the Queue dimension.
#
# Another custom CloudWatch metric is "partitioned". It is submitted twice as well: once without dimensions (cluster-wide) and once with the InstanceId dimension (instance-specific).
# This metric reports partitions (split brain): if partitioning occurs, a datapoint with value 1 is submitted, otherwise value 0 is recorded.
#
STATS2READ=( messages consumers )
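# Note: other columns supported by 'rabbitmqctl list_queues' (for example messages_ready or
# messages_unacknowledged) could likely be added to the list above as well; each additional
# statistic adds two put-metric-data calls per run.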
#
# Examples:
# CW Namespace | CW Dimensions                    | CW MetricName | RabbitMQ statistic
# ------------ | -------------------------------- | ------------- | ------------------
# RabbitMQ     | Queue=Orders                     | messages      | Number of messages in the Orders queue, datapoints aggregated from all instances
# RabbitMQ     | Queue=Orders                     | consumers     | Number of consumers of the Orders queue, datapoints aggregated from all instances
# RabbitMQ     | Queue=Orders,InstanceId=i-a5b6c7 | messages      | Number of messages in the Orders queue, datapoints from instance i-a5b6c7
# RabbitMQ     | -                                | partitioned   | Partitioned status, datapoints aggregated from all instances
# RabbitMQ     | InstanceId=i-a5b6c7              | partitioned   | Partitioned status, datapoints from instance i-a5b6c7
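#
# For illustration only (queue names, values, and number of queues are made up), the cluster-wide
# submission for the "messages" statistic built further below corresponds roughly to this shorthand call:
#   aws cloudwatch put-metric-data --namespace RabbitMQ --region us-east-1 --metric-data \
#       "MetricName=messages,Value=3,Unit=Count,Dimensions=[{Name=Queue,Value=Orders}]" \
#       "MetricName=messages,Value=0,Unit=Count,Dimensions=[{Name=Queue,Value=Invoices}]"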
# Namespace
NS=RabbitMQ
# Instance ID
EC2ID=$(ec2metadata --instance-id)
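# If the ec2metadata tool is not available, the instance id could alternatively be read from the
# EC2 instance metadata endpoint, e.g.:
# EC2ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)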
# Endpoint (using HTTP instead of HTTPS to save CPU credits)
ENDPOINT='http://monitoring.us-east-1.amazonaws.com'
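# Note: the monitoring endpoint is region-specific and should match the --region passed to the AWS CLI calls below.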
# Set DEBUG=1 to enable debug output
DEBUG=0
# Debug file
DBGFILE=/tmp/rabbitmq2cloudwatch.debug
if [[ $DEBUG == 0 ]]; then DBGFILE=/dev/null; fi
echo $(date -uIns) >> $DBGFILE
# Collecting statistics for RabbitMQ queues
UNIT=Count
for STATISTIC in "${STATS2READ[@]}"; do
    # FL is a flag indicating whether the first line has already been processed (1 = not processed yet, 0 = processed)
    # The first line must be 'Listing queues ...', otherwise an error occurred during execution of rabbitmqctl
    FL=1
    # Metric data is accumulated in MDATA1 and MDATA2 while parsing the output of rabbitmqctl
    MDATA1=''
    MDATA2=''
    # Parsing output of rabbitmqctl
    while read -r line; do
echo "Processing line: $line" >> $DBGFILE
if [[ $FL == 1 ]]; then
if [[ $line == 'Listing queues ...' ]]; then
FL=0
continue
else
break
fi
fi
if [[ $line == '...done.' ]]; then break; fi
read VALUE QUEUE <<< $line
# Adding datapoint to aggregated metric
# MetricName=$STATISTIC,Value=$VALUE,Unit=Count,Dimensions=[{Name=Queue,Value=$QUEUE}]
MDATA1+="MetricName=$STATISTIC,Value=$VALUE,Unit=Count,Dimensions=[{Name=Queue,Value=$QUEUE}] "
# Adding datapoint to instance-specific metric
# MetricName=$STATISTIC,Value=$VALUE,Unit=Count,Dimensions=[{Name=Queue,Value=$QUEUE},{Name=InstanceId,Value=$EC2ID}]
MDATA2+="MetricName=$STATISTIC,Value=$VALUE,Unit=Count,Dimensions=[{Name=Queue,Value=$QUEUE},{Name=InstanceId,Value=$EC2ID}] "
    done < <(rabbitmqctl list_queues $STATISTIC name)
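    # The loop above expects output in the following shape (the trailing '...done.' line is only
    # printed by some rabbitmq-server versions; queue names and values here are illustrative):
    #   Listing queues ...
    #   3       Orders
    #   0       Invoices
    #   ...done.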
    # Submitting metric data 1 (cluster-wide metrics)
    # It is not possible to submit more than 20 datapoints at a time while using the shorthand syntax
    echo "Submitting metric data: $MDATA1" >> $DBGFILE
    aws cloudwatch put-metric-data --endpoint-url $ENDPOINT --namespace $NS --region us-east-1 --metric-data $MDATA1 >>$DBGFILE 2>&1
    # Submitting metric data 2 (instance-specific metrics)
    # It is not possible to submit more than 20 datapoints at a time while using the shorthand syntax
    echo "Submitting metric data: $MDATA2" >> $DBGFILE
    aws cloudwatch put-metric-data --endpoint-url $ENDPOINT --namespace $NS --region us-east-1 --metric-data $MDATA2 >>$DBGFILE 2>&1
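    # A possible extension (sketch only, not used by this script): if the number of queues grows so
    # that one submission would exceed 20 datapoints, the datapoint strings could be split into
    # batches of at most 20 before calling the AWS CLI, e.g.:
    #   read -ra POINTS <<< "$MDATA1"
    #   for ((i = 0; i < ${#POINTS[@]}; i += 20)); do
    #       aws cloudwatch put-metric-data --endpoint-url $ENDPOINT --namespace $NS --region us-east-1 \
    #           --metric-data "${POINTS[@]:i:20}" >>$DBGFILE 2>&1
    #   done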
done
# Looking for partitioning of rabbitmq cluster (split brain)
STATISTIC=partitioned
clusterOK=$(rabbitmqctl cluster_status | grep "{partitions,\[\]}" | wc -l)
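# When the cluster is healthy, the 'rabbitmqctl cluster_status' output contains a line matching the
# pattern above:
#   {partitions,[]}
# When a partition exists the list is non-empty, the grep finds no match, and clusterOK is not 1.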
if [[ $clusterOK != "1" ]]; then
    echo "RabbitMQ cluster is partitioned (split brain)!" >> $DBGFILE
    MDATA="MetricName=$STATISTIC,Value=1 "
    MDATA+="MetricName=$STATISTIC,Value=1,Dimensions=[{Name=InstanceId,Value=$EC2ID}] "
else
    echo "RabbitMQ cluster is OK (not partitioned)" >> $DBGFILE
    MDATA="MetricName=$STATISTIC,Value=0 "
    MDATA+="MetricName=$STATISTIC,Value=0,Dimensions=[{Name=InstanceId,Value=$EC2ID}] "
fi
# Submitting metric data
echo "Submitting metric data: $MDATA" >> $DBGFILE
aws cloudwatch put-metric-data --endpoint-url $ENDPOINT --namespace $NS --region us-east-1 --metric-data $MDATA >>$DBGFILE 2>&1
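#
# The script produces one set of datapoints per run, so it is presumably executed periodically,
# e.g. every minute from cron (the path below is hypothetical):
# * * * * * root /usr/local/bin/rabbitmq2cloudwatch.sh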