@h0hmj
Forked from timfeirg/ceph_alerts.yml
Last active March 18, 2024 05:29
ceph monitoring
apiVersion: 1
groups:
  - folder: ceph-alerts
    interval: 10s
    name: cluster health
    orgId: 1
    rules:
      # ceph_health_status values: 0 = HEALTH_OK, 1 = HEALTH_WARN, 2 = HEALTH_ERR
      - annotations:
          description: The cluster state has been HEALTH_ERR for more than 5 minutes.
            Please check 'ceph health detail' for more information.
          summary: Ceph is in the ERROR state
        condition: A
        data:
          # datasourceUid must match the uid of the data source that serves the Ceph metrics
          - datasourceUid: ceph
            model:
              disableTextWrap: false
              editorMode: code
              expr: ceph_health_status == 2
              fullMetaSearch: false
              includeNullMetadata: true
              instant: true
              intervalMs: 10000
              legendFormat: __auto
              maxDataPoints: 43200
              range: false
              refId: A
              useBackend: false
            refId: A
            relativeTimeRange:
              from: 300
              to: 0
        execErrState: Error
        # Fire only after the condition has held continuously for 5 minutes
        for: 5m
        isPaused: false
        labels:
          oid: 1.3.6.1.4.1.50495.1.2.1.2.1
          severity: critical
          type: ceph_default
        noDataState: OK
        title: CephHealthError
        uid: ceph-ceph_health_error
      - annotations:
          description: The cluster state has been HEALTH_WARN for more than 15 minutes.
            Please check 'ceph health detail' for more information.
          summary: Ceph is in the WARNING state
        condition: A
        data:
          - datasourceUid: ceph
            model:
              disableTextWrap: false
              editorMode: code
              expr: ceph_health_status == 1
              fullMetaSearch: false
              includeNullMetadata: true
              instant: true
              intervalMs: 10000
              legendFormat: __auto
              maxDataPoints: 43200
              range: false
              refId: A
              useBackend: false
            refId: A
            relativeTimeRange:
              from: 300
              to: 0
        execErrState: Error
        # Fire only after the condition has held continuously for 15 minutes
        for: 15m
        isPaused: false
        labels:
          severity: warning
          type: ceph_default
        noDataState: OK
        title: CephHealthWarning
        uid: ceph-ceph_health_warning