@psachin
Last active March 7, 2019 19:46
ceilo-gnocchi-aodh-notes
Gnocchi Archive policies
=========================
1. Capture data every 1 minute for 1 day
[heat-admin@overcloud-controller-2 ~]$ gnocchi archive-policy create 1min1day -d granularity:1m,timespan:1d
+---------------------+----------------------------------------------------------------+
| Field | Value |
+---------------------+----------------------------------------------------------------+
| aggregation_methods | std, count, 95pct, min, max, sum, median, mean |
| back_window | 0 |
| definition | - points: 1440, granularity: 0:01:00, timespan: 1 day, 0:00:00 |
| name | 1min1day |
+---------------------+----------------------------------------------------------------+
2. Capture data every 1 minute for 30 days
[heat-admin@overcloud-controller-2 ~]$ gnocchi archive-policy create 1min30day -d granularity:1m,timespan:30d
+---------------------+-------------------------------------------------------------------+
| Field | Value |
+---------------------+-------------------------------------------------------------------+
| aggregation_methods | std, count, 95pct, min, max, sum, median, mean |
| back_window | 0 |
| definition | - points: 43200, granularity: 0:01:00, timespan: 30 days, 0:00:00 |
| name | 1min30day |
+---------------------+-------------------------------------------------------------------+
3. Same as above, but using 'points'. 30 days is 30d x 24hrs x 60min = 43200 minutes, i.e. 43200 points at 1-minute granularity
[heat-admin@overcloud-controller-2 ~]$ gnocchi archive-policy create 1min43200points -d granularity:1m,points:43200
+---------------------+-------------------------------------------------------------------+
| Field | Value |
+---------------------+-------------------------------------------------------------------+
| aggregation_methods | std, count, 95pct, min, max, sum, median, mean |
| back_window | 0 |
| definition | - points: 43200, granularity: 0:01:00, timespan: 30 days, 0:00:00 |
| name | 1min43200points |
+---------------------+-------------------------------------------------------------------+
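The 'timespan' and 'points' forms give the same definition because points = timespan / granularity. A minimal sketch of that arithmetic in Python; the helper name `points_for` is made up for illustration and is not part of Gnocchi:

```python
from datetime import timedelta


def points_for(granularity: timedelta, timespan: timedelta) -> int:
    """Number of points kept when sampling every `granularity` over `timespan`."""
    # Floor-dividing one timedelta by another yields a plain int.
    return timespan // granularity


# 1-minute granularity kept for 1 day, then for 30 days
print(points_for(timedelta(minutes=1), timedelta(days=1)))
print(points_for(timedelta(minutes=1), timedelta(days=30)))
```

The two results should line up with the 'points' values in the definitions of policies 1 and 2/3 above.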
4. One point every 30 days for about 2 years. Note that 24 points = 2yrs x 12 months; at 30-day granularity that is a timespan of 720 days
[heat-admin@overcloud-controller-2 ~]$ gnocchi archive-policy create 1mnth2yrs -d granularity:30d,points:24
+---------------------+--------------------------------------------------------------------------+
| Field | Value |
+---------------------+--------------------------------------------------------------------------+
| aggregation_methods | std, count, 95pct, min, max, sum, median, mean |
| back_window | 0 |
| definition | - points: 24, granularity: 30 days, 0:00:00, timespan: 720 days, 0:00:00 |
| name | 1mnth2yrs |
+---------------------+--------------------------------------------------------------------------+
OR
(overcloud) [root@overcloud-controller-0 ~]# gnocchi archive-policy create verylow -d granularity:30d,timespan:720d
+---------------------+--------------------------------------------------------------------------+
| Field | Value |
+---------------------+--------------------------------------------------------------------------+
| aggregation_methods | std, count, min, max, sum, mean |
| back_window | 0 |
| definition | - points: 24, granularity: 30 days, 0:00:00, timespan: 720 days, 0:00:00 |
| name | verylow |
+---------------------+--------------------------------------------------------------------------+
5. Include multiple granularities
[heat-admin@overcloud-controller-2 ~]$ gnocchi archive-policy create verylow -d granularity:30d,points:24 -d granularity:1d,timespan:30d
+---------------------+--------------------------------------------------------------------------+
| Field | Value |
+---------------------+--------------------------------------------------------------------------+
| aggregation_methods | std, count, 95pct, min, max, sum, median, mean |
| back_window | 0 |
| definition | - points: 24, granularity: 30 days, 0:00:00, timespan: 720 days, 0:00:00 |
| | - points: 30, granularity: 1 day, 0:00:00, timespan: 30 days, 0:00:00 |
| name | verylow |
+---------------------+--------------------------------------------------------------------------+
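Each `-d` argument gives two of the three values (granularity, points, timespan) and Gnocchi fills in the third, as the 'definition' rows show. A rough Python sketch of that completion step follows. The `UNITS` table and the helpers `parse_span` and `complete` are assumptions made for this example; they only cover the suffixes used in these notes and are not the gnocchi client's own parser.

```python
from datetime import timedelta

# Unit suffixes used in these notes; an assumption for the sketch,
# not the full set the gnocchi client accepts.
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_span(text: str) -> timedelta:
    """Turn a string such as '30d' or '1m' into a timedelta."""
    return timedelta(**{UNITS[text[-1]]: int(text[:-1])})


def complete(definition: str) -> dict:
    """Fill in whichever of points/timespan the definition leaves out."""
    fields = dict(item.split(":", 1) for item in definition.split(","))
    granularity = parse_span(fields["granularity"])
    if "points" in fields:
        points = int(fields["points"])
        timespan = granularity * points
    else:
        timespan = parse_span(fields["timespan"])
        points = timespan // granularity
    return {"points": points, "granularity": granularity, "timespan": timespan}


# The two definitions passed to 'verylow' above
for d in ("granularity:30d,points:24", "granularity:1d,timespan:30d"):
    print(complete(d))
```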
6. Create archive-policy rule
$ gnocchi archive-policy create 1min1day -d granularity:1m,timespan:1d
+---------------------+----------------------------------------------------------------+
| Field | Value |
+---------------------+----------------------------------------------------------------+
| aggregation_methods | std, count, min, max, sum, mean |
| back_window | 0 |
| definition | - points: 1440, granularity: 0:01:00, timespan: 1 day, 0:00:00 |
| name | 1min1day |
+---------------------+----------------------------------------------------------------+
(overcloud) [root@overcloud-controller-0 heat-admin]# gnocchi archive-policy-rule create cpu -a 1min1day -m "cpu*"
+---------------------+----------+
| Field | Value |
+---------------------+----------+
| archive_policy_name | 1min1day |
| metric_pattern | cpu* |
| name | cpu |
+---------------------+----------+
(overcloud) [root@overcloud-controller-0 heat-admin]# gnocchi archive-policy-rule list
+---------+---------------------+----------------+
| name | archive_policy_name | metric_pattern |
+---------+---------------------+----------------+
| cpu | 1min1day | cpu* |
| default | low | * |
+---------+---------------------+----------------+
(Note for archive-policy-rule): Also make the following changes in /etc/gnocchi/gnocchi.conf and /etc/ceilometer/ceilometer.conf if you want the archive policy rule to apply.
~~~/etc/gnocchi/gnocchi.conf
[statsd]
archive_policy_name=1min1day
[oslo_policy]
policy_default_rule = cpu
~~~
(TBD)
~~~/etc/ceilometer/ceilometer.conf
[dispatcher_gnocchi]
archive_policy = 1min1day
[oslo_policy]
policy_default_rule = cpu
~~~
Handy commands:
----------------
1) Service
~~~
# Replace 'status' with 'stop' or 'start' as needed
systemctl status openstack-gnocchi-statsd openstack-gnocchi-metricd openstack-gnocchi-api
~~~
2) Delete all metrics
~~~
for m in $(gnocchi metric list -c id -f value); do gnocchi metric delete "$m" --debug; done
~~~
Results
-------
(overcloud) [root@overcloud-controller-0 heat-admin]# gnocchi metric list
+--------------------------------------+---------------------+-------------------------------+------+--------------------------------------+
| id | archive_policy/name | name | unit | resource_id |
+--------------------------------------+---------------------+-------------------------------+------+--------------------------------------+
| 0985d386-b9ed-408b-bcfb-499472f6bcf2 | low | disk.write.bytes | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
| 0cc2f096-47a7-47ed-afd0-0e4726bf1d61 | low | network.outgoing.packets.rate | None | 01658c1f-b768-5670-85b1-0f129ce11e33 |
| 144e7a8a-c564-4a44-8de1-29495288adb8 | low | network.outgoing.packets | None | 01658c1f-b768-5670-85b1-0f129ce11e33 |
| 1caf1c25-9726-42dd-ba8d-abf14ebf8374 | low | network.outgoing.bytes | None | 01658c1f-b768-5670-85b1-0f129ce11e33 |
| 2d51fbd6-d4ce-4286-a805-6fd2400df5e2 | low | network.incoming.packets.rate | None | 01658c1f-b768-5670-85b1-0f129ce11e33 |
| 4a860629-6960-4abb-a727-0bbbc1927881 | low | disk.read.requests | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
| 558fb6e8-3654-47aa-a2b4-b35234992d7c | low | network.incoming.packets | None | 01658c1f-b768-5670-85b1-0f129ce11e33 |
| 73125e76-7e19-4bf4-aa6c-ebc9bf0345ec | low | disk.write.requests.rate | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
| 762f6317-7633-4698-9ff8-150693ed121f | low | network.incoming.bytes.rate | None | 01658c1f-b768-5670-85b1-0f129ce11e33 |
| 92b27854-89dd-49b2-a49b-d8bbf1cf7c27 | low | disk.read.bytes.rate | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
| 99cf5b2a-a9d6-4970-9bb9-87aa9006a151 | 1min1day | cpu | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
| 9f448953-3370-4690-a52a-67c3275631c2 | low | disk.write.bytes.rate | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
| a69609c1-5a60-442c-be4a-25be7059f3a6 | low | disk.write.requests | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
| ab825519-6590-4b7d-88e0-f6278001832d | low | disk.read.bytes | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
| bd278541-ad9e-4ea2-bf0a-cc6d67acf2ab | low | network.outgoing.bytes.rate | None | 01658c1f-b768-5670-85b1-0f129ce11e33 |
| c4ca91c3-e494-4eea-93c4-039bda36a4f2 | low | network.incoming.bytes | None | 01658c1f-b768-5670-85b1-0f129ce11e33 |
| df903ff5-95c5-40af-9600-c42b2887bd51 | 1min1day | cpu_util | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
| f43ed98d-9f6f-4aa3-98cb-ca7ddf10eaa1 | low | disk.read.requests.rate | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
| fd891bef-5313-4e8f-9c06-d9ded66ac706 | 1min1day | cpu.delta | None | 3a0cfe8a-4428-4482-bd2b-26c6401cf34f |
+--------------------------------------+---------------------+-------------------------------+------+--------------------------------------+
(Don't know why gnocchi archives metrics with 'low' even if the default policy is set to '1min1day'. But the rule 'cpu' is applied to metrics matching the pattern "cpu*".)
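The listing above is consistent with glob-style matching of metric names against each rule's metric_pattern. A small Python sketch of that selection, using the two rules from `gnocchi archive-policy-rule list`, is given below. It assumes shell-style wildcards (as in Python's `fnmatch`) and assumes the longest pattern is tried first; `policy_for` is a made-up helper, not Gnocchi code.

```python
from fnmatch import fnmatch

# (rule name, archive policy, metric pattern), as listed by
# `gnocchi archive-policy-rule list` in these notes.
RULES = [("cpu", "1min1day", "cpu*"), ("default", "low", "*")]


def policy_for(metric_name: str) -> str:
    """Pick the archive policy of the first rule whose pattern matches."""
    # Assumption: the longest (most specific) pattern is tried first.
    ordered = sorted(RULES, key=lambda rule: len(rule[2]), reverse=True)
    for _name, policy, pattern in ordered:
        if fnmatch(metric_name, pattern):
            return policy
    raise LookupError(f"no rule matches {metric_name!r}")


for metric in ("cpu", "cpu_util", "cpu.delta", "disk.read.bytes"):
    print(metric, "->", policy_for(metric))
```

Under these assumptions the three cpu metrics map to '1min1day' and everything else falls through to 'low', which is what the metric list shows.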
(overcloud) [root@overcloud-controller-0 heat-admin]# openstack server list -c ID -f value
3a0cfe8a-4428-4482-bd2b-26c6401cf34f
(overcloud) [root@overcloud-controller-0 heat-admin]# gnocchi measures show cpu_util -r 3a0cfe8a-4428-4482-bd2b-26c6401cf34f
+---------------------------+-------------+--------------+
| timestamp | granularity | value |
+---------------------------+-------------+--------------+
| 2017-11-27T19:30:00+00:00 | 60.0 | 3.1599404951 |
+---------------------------+-------------+--------------+
Aodh
====
1. Create Aodh alarm when instance is powered off
~~~
# aodh --debug alarm create \
--type event \
--name instance_off \
--description 'event_instance_power_off' \
--event-type "compute.instance.power_off.*" \
--enable True \
--query "traits.instance_id=string::bb912729-fa51-443b-bac6-bf4c795f081d" \
--alarm-action 'log://' \
--ok-action 'log://' \
--insufficient-data-action 'log://' \
--resource-type instance
POST call to alarming for http://10.74.129.11:8042/v2/alarms used request id req-abaa0152-e67c-428b-a86b-150d9d42867d
+---------------------------+-----------------------------------------------------------+
| Field | Value |
+---------------------------+-----------------------------------------------------------+
| alarm_actions | [u'log://'] |
| alarm_id | 85a2942f-a2ec-4310-baea-d58f9db98654 |
| description | event_instance_power_off |
| enabled | True |
| event_type | compute.instance.power_off.* |
| insufficient_data_actions | [u'log://'] |
| name | instance_off |
| ok_actions | [u'log://'] |
| project_id | 9ee200732f4c4d10a6530bac746f1b6e |
| query | traits.instance_id = bb912729-fa51-443b-bac6-bf4c795f081d |
| repeat_actions | False |
| severity | low |
| state | insufficient data |
| state_timestamp | 2017-07-15T01:31:14.516188 |
| time_constraints | [] |
| timestamp | 2017-07-15T01:31:14.516188 |
| type | event |
| user_id | 89b4e48bcbdb4816add7800502bd5122 |
+---------------------------+-----------------------------------------------------------+
# openstack server list
+--------------------------------------+----------------+---------+-----------------------+------------+
| ID | Name | Status | Networks | Image Name |
+--------------------------------------+----------------+---------+-----------------------+------------+
| 0004b07f-028a-45b7-9753-b22795b00808 | tachoi-testvm2 | SHUTOFF | External=10.74.154.63 | cirros |
| bb912729-fa51-443b-bac6-bf4c795f081d | tachoi-testvm1 | ACTIVE | External=10.74.154.65 | cirros |
+--------------------------------------+----------------+---------+-----------------------+------------+
# openstack alarm-history show 85a2942f-a2ec-4310-baea-d58f9db98654
+----------------------------+----------+----------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
| timestamp | type | detail | event_id |
+----------------------------+----------+----------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
| 2017-07-15T01:31:14.516188 | creation | {"alarm_actions": ["log://"], "user_id": "89b4e48bcbdb4816add7800502bd5122", "name": "instance_off", "state": "insufficient | fb31f4c2-e357-44c3-9b6a-bd2aaaa4ae68 |
| | | data", "timestamp": "2017-07-15T01:31:14.516188", "description": "event_instance_power_off", "enabled": true, "state_timestamp": | |
| | | "2017-07-15T01:31:14.516188", "rule": {"query": [{"field": "traits.instance_id", "type": "string", "value": "bb912729-fa51-443b- | |
| | | bac6-bf4c795f081d", "op": "eq"}], "event_type": "compute.instance.power_off.*"}, "alarm_id": "85a2942f-a2ec-4310-baea- | |
| | | d58f9db98654", "time_constraints": [], "insufficient_data_actions": ["log://"], "repeat_actions": false, "ok_actions": | |
| | | ["log://"], "project_id": "9ee200732f4c4d10a6530bac746f1b6e", "type": "event", "severity": "low"} | |
+----------------------------+----------+----------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
# openstack server stop tachoi-testvm1
# openstack alarm-history show 85a2942f-a2ec-4310-baea-d58f9db98654
+----------------------------+------------------+--------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
| timestamp | type | detail | event_id |
+----------------------------+------------------+--------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
| 2017-07-15T01:33:20.390623 | state transition | {"transition_reason": "Event <id=abe437a3-b75b-40b4-a3cb-26022a919f5e,event_type=compute.instance.power_off.start> hits | c5ca92ae-584b-4da6-a12c-b7a00dd39fef |
| | | the query <query=[{\"field\": \"traits.instance_id\", \"op\": \"eq\", \"type\": \"string\", \"value\": \"bb912729-fa51 | |
| | | -443b-bac6-bf4c795f081d\"}]>.", "state": "alarm"} | |
| 2017-07-15T01:31:14.516188 | creation | {"alarm_actions": ["log://"], "user_id": "89b4e48bcbdb4816add7800502bd5122", "name": "instance_off", "state": | fb31f4c2-e357-44c3-9b6a-bd2aaaa4ae68 |
| | | "insufficient data", "timestamp": "2017-07-15T01:31:14.516188", "description": "event_instance_power_off", "enabled": | |
| | | true, "state_timestamp": "2017-07-15T01:31:14.516188", "rule": {"query": [{"field": "traits.instance_id", "type": | |
| | | "string", "value": "bb912729-fa51-443b-bac6-bf4c795f081d", "op": "eq"}], "event_type": "compute.instance.power_off.*"}, | |
| | | "alarm_id": "85a2942f-a2ec-4310-baea-d58f9db98654", "time_constraints": [], "insufficient_data_actions": ["log://"], | |
| | | "repeat_actions": false, "ok_actions": ["log://"], "project_id": "9ee200732f4c4d10a6530bac746f1b6e", "type": "event", | |
| | | "severity": "low"} | |
+----------------------------+------------------+--------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
~~~
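The `--query` string on the command line ("field=type::value") shows up in the alarm history as a structured rule entry. A minimal Python sketch of that mapping; `parse_query` is a hypothetical helper for illustration, handles only the '=' operator, and is not how the aodh client is implemented:

```python
def parse_query(text: str) -> dict:
    """Split 'field=type::value' into the dict form shown in the alarm history."""
    field, _, rest = text.partition("=")
    type_, _, value = rest.partition("::")
    # Only the '=' operator (stored as "eq") is handled in this sketch.
    return {"field": field, "op": "eq", "type": type_, "value": value}


print(parse_query("traits.instance_id=string::bb912729-fa51-443b-bac6-bf4c795f081d"))
```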
Note:
1. `bb912729-fa51-443b-bac6-bf4c795f081d` is instance/resource-id
2. The event types are actually nova compute events. Full list here: https://wiki.openstack.org/wiki/SystemUsageData
3. When an event is triggered (in this case the instance is powered on/off), the relevant event is seen in /var/log/aodh/listener.log
~~~
# tail /var/log/aodh/listener.log
2017-07-15 07:02:22.023 2866 WARNING aodh.evaluator.event [-] psachin-event: [{u'event_type': u'objectstore.http.request', u'traits': [[u'typeURI', 1, u'http://schemas.dmtf.org/cloud/audit/1.0/event'], [u'eventTime', 1, u'2017-07-15T01:32:21.886359'], [u'outcome', 1, u'success'], [u'user_id', 1, u'341ad30b35a44cb0b7707128252d4056'], [u'initiator_typeURI', 1, u'service/security/account/user'], [u'service', 1, u'ceilometermiddleware'], [u'target_id', 1, u'9ee200732f4c4d10a6530bac746f1b6e'], [u'observer_id', 1, u'target'], [u'initiator_id', 1, u'341ad30b35a44cb0b7707128252d4056'], [u'eventType', 1, u'activity'], [u'target_typeURI', 1, u'service/storage/object'], [u'action', 1, u'read'], [u'project_id', 1, u'b46a6e45359845fa8124e16b3b560e11'], [u'id', 1, u'5852de0a-45fd-5d8f-81c1-6ded3e7197d1']], u'message_signature': u'd74bc06118c98a7c100aa605b47944ea1760dcc36e754c241e06b4ef81e22ed6', u'raw': {}, u'generated': u'2017-07-15T01:32:21.887249', u'message_id': u'80a2e0e4-0eeb-416d-90fc-c67abd8749e7'}]
2017-07-15 07:02:22.023 2866 WARNING aodh.evaluator.event [-] psachin-event: 1
2017-07-15 07:02:22.266 2866 WARNING aodh.evaluator.event [-] psachin-event: [{u'event_type': u'objectstore.http.request', u'traits': [[u'typeURI', 1, u'http://schemas.dmtf.org/cloud/audit/1.0/event'], [u'eventTime', 1, u'2017-07-15T01:32:21.993104'], [u'outcome', 1, u'success'], [u'user_id', 1, u'341ad30b35a44cb0b7707128252d4056'], [u'initiator_typeURI', 1, u'service/security/account/user'], [u'service', 1, u'ceilometermiddleware'], [u'target_id', 1, u'b46a6e45359845fa8124e16b3b560e11'], [u'observer_id', 1, u'target'], [u'initiator_id', 1, u'341ad30b35a44cb0b7707128252d4056'], [u'eventType', 1, u'activity'], [u'target_typeURI', 1, u'service/storage/object'], [u'action', 1, u'read'], [u'project_id', 1, u'b46a6e45359845fa8124e16b3b560e11'], [u'id', 1, u'19b572d7-3533-599b-adb8-8fd346d43857']], u'message_signature': u'f9a874824a4322c549b5bf6ecc9b9760aefd29c8ccdc59b70257c4caac80a1d4', u'raw': {}, u'generated': u'2017-07-15T01:32:21.994005', u'message_id': u'ca64f811-137d-46ee-aa71-720a557804ff'}]
2017-07-15 07:02:22.266 2866 WARNING aodh.evaluator.event [-] psachin-event: 1
2017-07-15 07:03:19.613 2866 WARNING aodh.evaluator.event [-] psachin-event: [{u'event_type': u'compute.instance.update', u'traits': [[u'resource_id', 1, u'bb912729-fa51-443b-bac6-bf4c795f081d'], [u'ephemeral_gb', 2, 0], [u'instance_type_id', 2, 1], [u'user_id', 1, u'89b4e48bcbdb4816add7800502bd5122'], [u'service', 1, u'compute'], [u'state', 1, u'active'], [u'old_state', 1, u'active'], [u'project_id', 1, u'9ee200732f4c4d10a6530bac746f1b6e'], [u'launched_at', 4, u'2017-06-20T00:57:41'], [u'disk_gb', 2, 1], [u'instance_id', 1, u'bb912729-fa51-443b-bac6-bf4c795f081d'], [u'host', 1, u'osp11-2.gsslab.pnq2.redhat.com'], [u'root_gb', 2, 1], [u'tenant_id', 1, u'9ee200732f4c4d10a6530bac746f1b6e'], [u'memory_mb', 2, 512], [u'instance_type', 1, u'm1.tiny'], [u'vcpus', 2, 1], [u'request_id', 1, u'req-c58971fe-5c61-4a2b-b3c8-f277c3f589b0']], u'message_signature': u'018e39c59ae08ee496a896d3e22ea4141bd8419f9e60e6493de60ba6d9334188', u'raw': {}, u'generated': u'2017-07-15T01:33:19.508133', u'message_id': u'e381ba7f-4cc9-499a-a7b5-7ee0a3930a4c'}]
2017-07-15 07:03:19.613 2866 WARNING aodh.evaluator.event [-] psachin-event: 1
2017-07-15 07:03:19.614 2866 WARNING oslo_db.sqlalchemy.utils [-] Unique keys not in sort_keys. The sorting order may be unstable.
2017-07-15 07:03:20.148 2866 WARNING aodh.evaluator.event [-] psachin-event: [{u'event_type': u'compute.instance.power_off.start', u'traits': [[u'user_id', 1, u'89b4e48bcbdb4816add7800502bd5122'], [u'service', 1, u'compute'], [u'disk_gb', 2, 1], [u'resource_id', 1, u'bb912729-fa51-443b-bac6-bf4c795f081d'], [u'tenant_id', 1, u'9ee200732f4c4d10a6530bac746f1b6e'], [u'root_gb', 2, 1], [u'ephemeral_gb', 2, 0], [u'instance_type_id', 2, 1], [u'state', 1, u'active'], [u'memory_mb', 2, 512], [u'launched_at', 4, u'2017-06-20T00:57:41'], [u'instance_id', 1, u'bb912729-fa51-443b-bac6-bf4c795f081d'], [u'host', 1, u'osp11-2.gsslab.pnq2.redhat.com'], [u'request_id', 1, u'req-c58971fe-5c61-4a2b-b3c8-f277c3f589b0'], [u'instance_type', 1, u'm1.tiny'], [u'project_id', 1, u'9ee200732f4c4d10a6530bac746f1b6e'], [u'vcpus', 2, 1]], u'message_signature': u'96513b247cb3273f3802df4b860e5b4335da82aa5de644966348a9919b5b50ec', u'raw': {}, u'generated': u'2017-07-15T01:33:20.000005', u'message_id': u'abe437a3-b75b-40b4-a3cb-26022a919f5e'}]
2017-07-15 07:03:20.148 2866 WARNING aodh.evaluator.event [-] psachin-event: 1
2017-07-15 07:03:20.149 2866 INFO aodh.evaluator [-] alarm 85a2942f-a2ec-4310-baea-d58f9db98654 transitioning to alarm because Event <id=abe437a3-b75b-40b4-a3cb-26022a919f5e,event_type=compute.instance.power_off.start> hits the query <query=[{"field": "traits.instance_id", "op": "eq", "type": "string", "value": "bb912729-fa51-443b-bac6-bf4c795f081d"}]>.
~~~
4. For alarms of type event, the following is needed on the ceilometer side, in /etc/ceilometer/event_pipeline.yaml[1]
~~~
---
sources:
    - name: event_source
      events:
          - "*"
      sinks:
          - event_sink
sinks:
    - name: event_sink
      transformers:
      publishers:
          - notifier://
          - notifier://?topic=alarm.all
~~~
[1] https://github.com/openstack/aodh/blob/master/doc/source/event-alarm.rst
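To make the listener.log entries easier to read, here is a rough Python sketch of the check the log implies: the event's type must match the alarm's event_type wildcard, and every query condition must hold on the event's traits. This is an illustration under those assumptions, not Aodh's evaluator code; `event_hits_alarm` is a made-up name and only the "eq" operator is covered. The sample events are trimmed from the log above.

```python
from fnmatch import fnmatch


def event_hits_alarm(event: dict, rule: dict) -> bool:
    """Return True when the event matches the alarm's event_type and query."""
    if not fnmatch(event["event_type"], rule["event_type"]):
        return False
    # Traits arrive as [name, dtype, value] triples, as in listener.log.
    traits = {name: value for name, _dtype, value in event["traits"]}
    for cond in rule["query"]:
        trait_name = cond["field"].split(".", 1)[1]  # drop the 'traits.' prefix
        if cond["op"] != "eq" or traits.get(trait_name) != cond["value"]:
            return False
    return True


INSTANCE = "bb912729-fa51-443b-bac6-bf4c795f081d"
RULE = {
    "event_type": "compute.instance.power_off.*",
    "query": [{"field": "traits.instance_id", "op": "eq",
               "type": "string", "value": INSTANCE}],
}
power_off = {"event_type": "compute.instance.power_off.start",
             "traits": [["instance_id", 1, INSTANCE], ["state", 1, "active"]]}
update = {"event_type": "compute.instance.update",
          "traits": [["instance_id", 1, INSTANCE], ["state", 1, "active"]]}

print(event_hits_alarm(power_off, RULE))
print(event_hits_alarm(update, RULE))
```

In the log, only the compute.instance.power_off.start event moves the alarm to the 'alarm' state; the compute.instance.update event for the same instance does not.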