@jbfavre
Last active September 12, 2015 12:44
Problem: the trigger fires and an alert is sent when the threshold is reached, but it recovers within seconds despite the hysteresis.
Since data is collected every minute using trapper items, we get an alert and a recovery every minute.
Zabbix version: 2.4.6 for both server and agent.
The item is discovered through LLD. Here's the actual LLD payload:
{
"{#RMQQUEUENAME}": "queue_name",
"{#RMQRATIOTHRES}": 5,
"{#RMQVHOSTNAME}": "queue_vhost",
"{#RMQMSGTHRESH}": 3500
}
The trigger prototype is as follows:
(
{TRIGGER.VALUE}=0 and {BBC_TPL_RABBITMQ:rabbitmq.queue[{#RMQVHOSTNAME},{#RMQQUEUENAME},master].last()}=1 and {BBC_TPL_RABBITMQ:rabbitmq.queue[{#RMQVHOSTNAME},{#RMQQUEUENAME},count,message].min(5m)}>({#RMQMSGTHRESH}*10)
) or (
{TRIGGER.VALUE}=1 and {BBC_TPL_RABBITMQ:rabbitmq.queue[{#RMQVHOSTNAME},{#RMQQUEUENAME},count,message].max(10m)}<{#RMQMSGTHRESH}
)
The actual trigger on the host is:
(
{TRIGGER.VALUE}=0 and {real_hostname:rabbitmq.queue[queue_vhost,queue_name,master].last()}=1 and {real_hostname:rabbitmq.queue[queue_vhost,queue_name,count,message].min(5m)}>(3500*10)
) or (
{TRIGGER.VALUE}=1 and {real_hostname:rabbitmq.queue[queue_vhost,queue_name,count,message].max(10m)}<3500
)
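To make the mapping from prototype to actual trigger explicit, here is a minimal Python sketch that mimics the substitution with plain string replacement. It is only an illustration, not how Zabbix implements LLD; the `expand` helper is hypothetical, and only the recovery branch of the prototype is used to keep it short.

```python
# LLD entry copied from the payload above.
LLD_ENTRY = {
    "{#RMQQUEUENAME}": "queue_name",
    "{#RMQRATIOTHRES}": 5,
    "{#RMQVHOSTNAME}": "queue_vhost",
    "{#RMQMSGTHRESH}": 3500,
}

# Second branch of the trigger prototype, copied verbatim.
PROTOTYPE = (
    "{TRIGGER.VALUE}=1 and "
    "{BBC_TPL_RABBITMQ:rabbitmq.queue[{#RMQVHOSTNAME},{#RMQQUEUENAME},count,message].max(10m)}"
    "<{#RMQMSGTHRESH}"
)


def expand(prototype, entry, template, host):
    """Hypothetical helper: swap the template name for the host, then
    replace every LLD macro with its value from the discovery entry."""
    text = prototype.replace("{" + template + ":", "{" + host + ":")
    for macro, value in entry.items():
        text = text.replace(macro, str(value))
    return text


expanded = expand(PROTOTYPE, LLD_ENTRY, "BBC_TPL_RABBITMQ", "real_hostname")
print(expanded)
```

The result matches the second branch of the actual trigger, so the thresholds in play are the literal values 3500 and 3500*10.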
Raw item values are as follows. As you can see, we should never see the flapping effect :-/
|-----------|------------------|----------|----|
| itemid | clock | value | ns |
|-----------|------------------|----------|----|
| 127424 | 1441992304 | 10138 | 0 |
| 127424 | 1441992365 | 12576 | 0 |
| 127424 | 1441992426 | 17052 | 0 |
| 127424 | 1441992487 | 21800 | 0 |
| 127424 | 1441992548 | 26441 | 0 |
| 127424 | 1441992609 | 30759 | 0 |
| 127424 | 1441992670 | 35482 | 0 |
| 127424 | 1441992731 | 39735 | 0 |
| 127424 | 1441992792 | 44115 | 0 |
| 127424 | 1441992853 | 48654 | 0 |
| 127424 | 1441992914 | 52902 | 0 |
| 127424 | 1441992975 | 57206 | 0 |
| 127424 | 1441993036 | 61052 | 0 |
| 127424 | 1441993097 | 65225 | 0 |
| 127424 | 1441993158 | 69401 | 0 |
| 127424 | 1441993219 | 73779 | 0 |
| 127424 | 1441993280 | 78420 | 0 |
| 127424 | 1441993341 | 82797 | 0 |
| 127424 | 1441993402 | 86617 | 0 |
| 127424 | 1441993463 | 91222 | 0 |
| 127424 | 1441993524 | 95373 | 0 |
| 127424 | 1441993585 | 99768 | 0 |
| 127424 | 1441993646 | 104122 | 0 |
| 127424 | 1441993707 | 108785 | 0 |
| 127424 | 1441993768 | 112611 | 0 |
| 127424 | 1441993829 | 116940 | 0 |
| 127424 | 1441993890 | 121295 | 0 |
| 127424 | 1441993951 | 125623 | 0 |
| 127424 | 1441994012 | 130242 | 0 |
| 127424 | 1441994073 | 134281 | 0 |
| 127424 | 1441994135 | 138360 | 0 |
| 127424 | 1441994196 | 142716 | 0 |
| 127424 | 1441994257 | 119110 | 0 |
| 127424 | 1441994319 | 88993 | 0 |
| 127424 | 1441994380 | 58073 | 0 |
| 127424 | 1441994441 | 25217 | 0 |
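One way to sanity-check the expression against these values is to replay them offline. The Python sketch below is not Zabbix code. It assumes the `master` item stays at 1, that the trigger is evaluated once per sample, and that a window covers clocks in (now - N, now]. `as_written` transcribes the expression literally; `keep_until_recovered` is a hypothetical variant whose second branch stays true until max(10m) drops below the threshold.

```python
# Samples copied from the table above as (clock, value) pairs.
SAMPLES = [
    (1441992304, 10138), (1441992365, 12576), (1441992426, 17052),
    (1441992487, 21800), (1441992548, 26441), (1441992609, 30759),
    (1441992670, 35482), (1441992731, 39735), (1441992792, 44115),
    (1441992853, 48654), (1441992914, 52902), (1441992975, 57206),
    (1441993036, 61052), (1441993097, 65225), (1441993158, 69401),
    (1441993219, 73779), (1441993280, 78420), (1441993341, 82797),
    (1441993402, 86617), (1441993463, 91222), (1441993524, 95373),
    (1441993585, 99768), (1441993646, 104122), (1441993707, 108785),
    (1441993768, 112611), (1441993829, 116940), (1441993890, 121295),
    (1441993951, 125623), (1441994012, 130242), (1441994073, 134281),
    (1441994135, 138360), (1441994196, 142716), (1441994257, 119110),
    (1441994319, 88993), (1441994380, 58073), (1441994441, 25217),
]
THRESHOLD = 3500  # {#RMQMSGTHRESH} in the LLD payload


def window(samples, now, seconds):
    # Values with a clock in (now - seconds, now]; boundary handling is assumed.
    return [value for clock, value in samples if now - seconds < clock <= now]


def as_written(trigger_value, master, last_5m, last_10m, threshold):
    # Literal transcription of the trigger expression (1 = PROBLEM, 0 = OK).
    fire = trigger_value == 0 and master == 1 and min(last_5m) > threshold * 10
    second = trigger_value == 1 and max(last_10m) < threshold
    return int(fire or second)


def keep_until_recovered(trigger_value, master, last_5m, last_10m, threshold):
    # Hypothetical variant: hold PROBLEM until max(10m) falls below threshold.
    fire = trigger_value == 0 and master == 1 and min(last_5m) > threshold * 10
    hold = trigger_value == 1 and max(last_10m) >= threshold
    return int(fire or hold)


def replay(expression, samples, threshold, master=1):
    # Evaluate once per sample, feeding the result back as {TRIGGER.VALUE}.
    state, changes = 0, []
    for now, _ in samples:
        new_state = expression(state, master, window(samples, now, 300),
                               window(samples, now, 600), threshold)
        if new_state != state:
            changes.append((now, new_state))
        state = new_state
    return changes


print(len(replay(as_written, SAMPLES, THRESHOLD)))
print(replay(keep_until_recovered, SAMPLES, THRESHOLD))
```

Under these assumptions the literal transcription changes state on every sample from clock 1441992914 onward, because its second branch is false whenever max(10m) is not below 3500, while the variant raises a single PROBLEM. Whether the server evaluates the expression this way is not confirmed here.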
jbfavre commented Sep 12, 2015

Here are some examples of events:

2015-09-11 22:55:50 real_hostname   We have too many messages in queue_vhost:queue_name OK  Average 15h 44m 50s No  Ok
2015-09-11 22:55:50 real_hostname   We have too many messages in queue_vhost:queue_name PROBLEM Average 0   No  12  5
2015-09-11 22:54:48 real_hostname   We have too many messages in queue_vhost:queue_name OK  Average 1m 2s   No  Ok
2015-09-11 22:54:48 real_hostname   We have too many messages in queue_vhost:queue_name PROBLEM Average 0   No  12  5
2015-09-11 22:53:47 real_hostname   We have too many messages in queue_vhost:queue_name OK  Average 1m 1s   No  Ok
2015-09-11 22:53:47 real_hostname   We have too many messages in queue_vhost:queue_name PROBLEM Average 0   No  12  5
2015-09-11 22:52:46 real_hostname   We have too many messages in queue_vhost:queue_name OK  Average 1m 1s   No  Ok
2015-09-11 22:52:46 real_hostname   We have too many messages in queue_vhost:queue_name PROBLEM Average 0   No  12  5
2015-09-11 19:59:40 real_hostname   We have too many messages in queue_vhost:queue_name OK  Average 2h 53m 6s   No  Ok
2015-09-11 19:59:40 real_hostname   We have too many messages in queue_vhost:queue_name PROBLEM Average 0   No  12  5
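The spacing of these events can be checked with a few lines of Python. This is only a sketch: the timestamps are copied by hand from the four consecutive PROBLEM/OK pairs above, and `gaps_in_seconds` is a hypothetical helper name.

```python
from datetime import datetime

# Distinct timestamps of the consecutive PROBLEM/OK pairs above, oldest first.
FLAP_TIMES = [
    "2015-09-11 22:52:46",
    "2015-09-11 22:53:47",
    "2015-09-11 22:54:48",
    "2015-09-11 22:55:50",
]


def gaps_in_seconds(stamps):
    """Return the number of seconds between each pair of consecutive timestamps."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


gaps = gaps_in_seconds(FLAP_TIMES)
print(gaps)  # [61, 61, 62]
```

Gaps of 61 to 62 seconds match the spacing of the clock column in the raw values, which fits one alert and one recovery per collected batch.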
