@auxesis
Last active August 29, 2015 14:04

Leveling up your Flapjack stack

Too many alerts. Too many dashboards. Too much noise - and the alert fatigue isn't receding.

If you're frequently on the end of a pager (or pager-like device) and working with systems running in the cloud, you've probably noticed an increase in the volume of alerts over the last few years.

This problem is not going away. Monitoring tools are proliferating thanks to a renaissance in Open Source monitoring, and the systems that make up modern businesses on the web keep sprawling, so the problem is only getting worse.

Flapjack is an alert umbrella for people on-call that intelligently routes and rolls up alerts, integrates with check execution engines like Sensu and Nagios, and ships a well-documented API for restart-less configuration.

You may have heard of Flapjack in the last year. Maybe you've even played with it a bit. In this talk, we're going to take your Flapjack stack to the next level by learning:

  • How to make Flapjack alert via PagerDuty, and take advantage of bi-directional alert acknowledgement and recovery synchronisation.
  • How to use Flapjack's HTTP Broker to get free heartbeating (with TTLs!) from within your app or infrastructure.
  • How to pull in alerts from CloudWatch and other webhook-based alerting platforms.
  • How to connect Flapjack with Sensu and run it in parallel to your existing Nagios infrastructure.

…as well as a bucketload of other tips and tricks along the way.

Let's start making your on-call experience human.
