@ties
Last active February 22, 2021 14:22
Prometheus rules for basic machine health of my NAS
groups:
  - name: host.health
    rules:
      # Fires when a disk reports an unhealthy SMART status.
      - alert: SmartFailure
        expr: smartmon_device_smart_healthy != 1
        labels:
          severity: "critical"
        annotations:
          summary: "Disk {{ $labels.disk }} on {{ $labels.instance }} has a SMART failure."
      # Filesystems smaller than 256 GiB: less than 10% of space available.
      # The comparison sits inside the left-hand parentheses because PromQL
      # comparison operators bind tighter than `and`.
      - alert: DiskFullPercentage
        expr: (node_filesystem_avail_bytes{fstype!="tmpfs"} / node_filesystem_size_bytes{fstype!="tmpfs"} < 0.1) and (node_filesystem_size_bytes{fstype!="tmpfs"} < 256*1024*1024*1024)
        labels:
          severity: "critical"
        annotations:
          summary: "Disk {{ $labels.device }} ({{ $labels.fstype }}) on {{ $labels.instance }} has {{ humanizePercentage $value }} space available."
      # Filesystems larger than 256 GiB: less than 50 GiB available.
      - alert: LargeDiskFullSpace
        expr: (node_filesystem_avail_bytes{mountpoint!~"(\\/run|\\/boot\\/efi).*"} < 50*1024*1024*1024) and (node_filesystem_size_bytes{fstype!="tmpfs"} > 256*1024*1024*1024)
        labels:
          severity: "critical"
        annotations:
          summary: "Disk {{ $labels.device }} ({{ $labels.fstype }}) on {{ $labels.instance }} has {{ humanize1024 $value }} of space available."
      # Any filesystem outside /run and /boot/efi: less than 1 GiB available.
      - alert: DiskFullSpace
        expr: node_filesystem_avail_bytes{mountpoint!~"(\\/run|\\/boot\\/efi).*"} < 1024*1024*1024
        labels:
          severity: "critical"
        annotations:
          summary: "Disk {{ $labels.device }} ({{ $labels.fstype }}) on {{ $labels.instance }} has {{ humanize1024 $value }} of space available."
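
These rules only produce alerts once Prometheus loads the file and scrapes the exporter that emits the metrics. A minimal sketch of the relevant `prometheus.yml` fragment might look like the following. The file path, job name, and target host are assumptions for illustration and are not part of the gist. The `node_filesystem_*` series come from node_exporter, and `smartmon_device_smart_healthy` is typically written by a SMART textfile-collector script, so node_exporter would also need its textfile collector directory configured for the SMART alert to have data.

```yaml
# prometheus.yml (fragment); path, job name, and target are hypothetical
rule_files:
  - "rules/host_health.yml"    # the rules file above, saved under an assumed name

scrape_configs:
  - job_name: "node"           # assumed job name
    static_configs:
      # assumed host name; 9100 is node_exporter's default port
      - targets: ["nas.example.lan:9100"]
```

Running `promtool check rules rules/host_health.yml` (with the same assumed path) before reloading Prometheus should catch YAML and expression syntax errors early.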