Created September 23, 2014 21:11
Sensu
-----
monitoring is not only an ops problem. devs need to be on board and looped in
poor monitoring coverage
rather than making one large change, integrate sensu iteratively, gradually replacing nagios
exploring what success looks like; observations on whether or not it was better than the status quo
standalone checks?
checks have the runbook baked in
sensu::check puppet type, wrapped with in-house code
handlers for irc/issue tracker; arbitrary api/tools (awsprune)
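
A rough sketch of what an irc/issue-tracker handler might look like, assuming a Sensu pipe handler that receives the event as JSON on stdin. The `client`/`check` field names follow the classic Sensu event layout as far as I know; `format_event`, `handle`, and the message format are made up for illustration and are not from the talk.

```python
import json

# Assumed mapping from Nagios-style exit statuses to labels.
STATUS_LABELS = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def format_event(event):
    """Turn one event dict into a one-line message (IRC line or ticket title)."""
    check = event.get("check", {})
    client = event.get("client", {})
    label = STATUS_LABELS.get(check.get("status"), "UNKNOWN")
    output = str(check.get("output", "")).strip()
    return "[{}] {}/{}: {}".format(
        label, client.get("name", "?"), check.get("name", "?"), output
    )


def handle(stream):
    """Read one JSON event from a file-like object and return the message.

    Hypothetical delivery step: a real handler would post the message to
    IRC or an issue tracker instead of just returning it.
    """
    return format_event(json.load(stream))
```

As a handler script this would be called as `print(handle(sys.stdin))`; the delivery part is deliberately left out.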
sensu plugins worth looking at for use with nagios
upon login, alerts are indicated, along with machine stats and error conditions
dns is the canonical source of truth in the yelp deploy
stale cron, cron spam
- solved by writing a "staleness" file from the cron job and monitoring its age
- stale crons fail a check and open a ticket
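
The staleness-file idea above can be sketched roughly like this, assuming the cron job touches a file as its last step and a check compares the file's mtime against a limit. The function names, messages, and thresholds are hypothetical; only the 0/2 exit-status convention is the usual Nagios/Sensu one.

```python
import os
import time

OK, CRITICAL = 0, 2  # Nagios/Sensu-style exit statuses


def touch(path):
    """What the cron job would do as its final step on success."""
    with open(path, "a"):
        os.utime(path, None)  # set atime/mtime to the current time


def check_staleness(path, max_age_seconds, now=None):
    """Return (status, message) based on the age of the staleness file."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return CRITICAL, "CRITICAL: {} is missing".format(path)
    age = now - mtime
    if age > max_age_seconds:
        return CRITICAL, "CRITICAL: {} is {:.0f}s old (limit {}s)".format(
            path, age, max_age_seconds
        )
    return OK, "OK: {} is {:.0f}s old".format(path, age)
```

A check script would print the message and exit with the status; a failing check is what would then open the ticket through a handler.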
python sensu client; open sourced. pumps test results into the sensu client socket
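
A minimal sketch of pushing a result into the client socket. This is not the open-sourced client itself. It assumes the local sensu client listens on 127.0.0.1:3030 and accepts a JSON document with `name`, `status` and `output`; the `runbook` key in the usage note is a hypothetical custom attribute.

```python
import json
import socket


def build_result(name, status, output, **extra):
    """Serialize one check result; extra keys ride along as custom attributes."""
    result = {"name": name, "status": int(status), "output": output}
    result.update(extra)
    return json.dumps(result)


def send_result(payload, host="127.0.0.1", port=3030, timeout=5):
    """Write the JSON payload to the (assumed) local client socket over TCP."""
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        sock.sendall(payload.encode("utf-8"))
    finally:
        sock.close()
```

Usage would be something like `send_result(build_result("nightly_cron", 2, "cron is stale", runbook="y/rb-cron"))` on a host where the client is running.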
services contain a yaml file that explains how to monitor them; devs can deploy it with their app
parsed yaml configures sensu checks. no operators required
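
One way the yaml-to-checks step could look, as a sketch only. The layout of the monitoring file (`checks`, `command`, `interval`, `runbook`) is invented here, and the input is the already-parsed mapping, so no YAML library is needed. `command`, `interval`, `standalone` and `handlers` are regular Sensu check attributes; `runbook` is assumed to be a custom one.

```python
import json


def checks_from_service(service_name, monitoring):
    """Build a Sensu-style {"checks": {...}} document from one service's
    parsed monitoring file (a plain dict, hypothetical layout)."""
    checks = {}
    for check_name, spec in sorted(monitoring.get("checks", {}).items()):
        checks["{}_{}".format(service_name, check_name)] = {
            "command": spec["command"],
            "interval": spec.get("interval", 60),  # assumed default
            "standalone": True,
            "handlers": spec.get("handlers", ["default"]),
            "runbook": spec.get("runbook", ""),  # custom attribute
        }
    return {"checks": checks}


def render(document):
    """Serialize the document the way it might be dropped into a conf.d file."""
    return json.dumps(document, indent=2, sort_keys=True)
```

The point of the design is that the service owner edits only the file that ships with the app; the generated check definitions are never written by hand.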
cases
- alert misbehaving devs that they are killing a machine, with example troubleshooting commands
slideshare/bobtfish