@bzon
Created May 12, 2020 18:24
Prometheus Flux alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app: prometheus-operator
    release: prometheus
  name: alertmanager-flux
  namespace: monitoring
spec:
  groups:
  - name: alerts.flux
    rules:
    - alert: FluxMissing
      annotations:
        message: Flux pod is missing for the last 5 minutes. Deployments may not work!
      for: 5m
      expr: absent(up{job="flux"})
      labels:
        severity: warning
    - alert: FluxHelmOperatorErrors
      annotations:
        message: >
          There is an issue deploying `{{ $labels.release_name }}` release helm chart.
          Errors count `{{ $value }}`.
      for: 5m
      expr: sum(increase(flux_helm_operator_release_duration_seconds_count{success="false"}[5m])) by (release_name) > 0
      labels:
        severity: warning
    - alert: FluxSyncErrors
      annotations:
        message: Errors count `{{ $value }}`. Please check the flux logs.
      for: 5m
      expr: sum(increase(flux_daemon_sync_duration_seconds_count{success="false"}[5m])) > 0
      labels:
        severity: warning
    - alert: FluxRegistryFetchErrors
      annotations:
        message: Errors count `{{ $value }}`. New images may not be fetched from remote registries.
      for: 10m
      expr: sum(increase(flux_registry_fetch_duration_seconds_count{success="false"}[5m])) > 0
      labels:
        severity: warning