@cwjohnston
Last active March 30, 2018 13:09
Sensu Scheduled Downtime POC

Sensu Scheduled Downtime Event Annotation

As part of helping a customer develop their proof of concept monitoring system with Sensu Enterprise, I worked up a mutator which uses stash data to determine if an event occurred within a pre-defined maintenance window.

The idea here is that event data needs to be annotated to indicate whether an event occurred during a scheduled maintenance window, for SLA reporting purposes. With this added downtime context, events logged to an external source (e.g. Graylog, Elasticsearch) via the Sensu Enterprise event bridge should provide enough information to determine whether or not a client's check result falls within a scheduled downtime window.

Please note that I have done very little testing, so this plugin is unlikely to be robust. Because this mutator probably needs to be applied to every event, and a pipe mutator spawns a new process for each event it handles, it should be reimplemented as an extension before being put into a production system.

Assumptions

The mutator assumes the following:

  1. Relative to the Sensu event processor, the Sensu API is running on 127.0.0.1:4567. This will be true of any Sensu Enterprise server.

  2. Sensu Clients are configured with a custom attribute, services, whose value is an array containing zero or more strings defining service names which will be compared to named stashes under the downtime path.

  3. Stashes are created via the Sensu API under the downtime path. Each stash is named after a service defined on clients and has start and end attributes whose values are Unix epoch timestamps.

Example client definition. Note "arbitrary_service_id" as a value in the services array:

{
  "client":{
    "name":"datboi",
    "address":"192.168.2.227",
    "subscriptions":[
      "client:datboi"
    ],
    "environment":"staging",
    "tags":[],
    "services":[
      "arbitrary_service_id"
    ]
  }
}

Example curl command to create a "scheduled downtime" stash under the downtime path, matching the arbitrary_service_id service defined on the client above:

curl -X POST -H 'Content-Type: application/json' -d '{"path":"downtime/arbitrary_service_id","content":{"start":1493158003,"end":1493168003,"creator":"Your Name Here","description":"this is a test"}}' http://127.0.0.1:4567/stashes
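
When window creation is scripted, the epoch timestamps can be computed rather than typed by hand. The following is a minimal Ruby sketch, assuming the same local API endpoint as the curl example; build_downtime_stash_request is a hypothetical helper name, not part of Sensu, and it only builds the request without sending it:

```ruby
require 'json'
require 'net/http'

# Hypothetical helper: build (but do not send) a POST /stashes request
# that creates a downtime stash for the given service.
def build_downtime_stash_request(service, start_time, duration_seconds, creator, description)
  start_epoch = start_time.to_i
  payload = {
    'path' => "downtime/#{service}",
    'content' => {
      'start' => start_epoch,
      'end' => start_epoch + duration_seconds,
      'creator' => creator,
      'description' => description
    }
  }
  request = Net::HTTP::Post.new('/stashes', 'Content-Type' => 'application/json')
  request.body = JSON.generate(payload)
  request
end

# Usage (assumes the Sensu API is listening on 127.0.0.1:4567):
#   request = build_downtime_stash_request('arbitrary_service_id', Time.now, 3600,
#                                          'Your Name Here', 'this is a test')
#   Net::HTTP.new('127.0.0.1', 4567).start { |http| http.request(request) }
```

Keeping request construction separate from sending makes the payload easy to inspect before anything is written to the API.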

With a client configured and a stash created, the mutator can be defined in configuration and applied to a handler. Here's the combined handler and mutator configuration I used in my testing:

{
  "handlers": {
    "downtime_test": {
      "type": "pipe",
      "command": "tee /tmp/downtime_test",
      "mutator": "scheduled_downtime"
    }
  },
  "mutators": {
    "scheduled_downtime": {
      "command": "/usr/local/bin/scheduled-downtime.rb"
    }
  }
}

After restarting Sensu services to apply configuration, I tested the mutator using nc (netcat) to send a check result to the local client socket:

echo '{"name":"test","status":2,"output":"test output","handler":"downtime_test"}' | nc 127.0.0.1 3030
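
The same check result can be sent from Ruby instead of nc. This is a minimal sketch, assuming the client socket listens on 127.0.0.1:3030 as in the command above; check_result_payload and send_check_result are hypothetical helper names:

```ruby
require 'json'
require 'socket'

# Hypothetical helper: serialize a check result for the client socket.
def check_result_payload(name, status, output, handler)
  JSON.generate({
    'name' => name,
    'status' => status,
    'output' => output,
    'handler' => handler
  })
end

# Hypothetical helper: write a payload to the Sensu client socket.
# The host and port defaults are assumptions matching the nc example.
def send_check_result(payload, host = '127.0.0.1', port = 3030)
  TCPSocket.open(host, port) { |socket| socket.write(payload) }
end

# Usage:
#   send_check_result(check_result_payload('test', 2, 'test output', 'downtime_test'))
```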

And I see the data written to disk by tee, with a copy of the downtime stash incorporated in the event data under the downtime array, as I expect:

$ cat /tmp/downtime_test | jq .
{
  "client": {
    "name": "datboi",
    "address": "192.168.2.227",
    "subscriptions": [
      "client:datboi"
    ],
    "environment": "staging",
    "tags": [],
    "services": [
      "arbitrary_service_id"
    ],
    "version": "0.29.0",
    "timestamp": 1493160058
  },
  "check": {
    "name": "test",
    "status": 2,
    "output": "test output",
    "handler": "downtime_test",
    "executed": 1493160075,
    "issued": 1493160075,
    "type": "standard",
    "history": [
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2",
      "2"
    ],
    "total_state_change": 0
  },
  "occurrences": 21,
  "occurrences_watermark": 21,
  "action": "create",
  "timestamp": 1493160075,
  "id": "fc081db1-961a-4f64-8412-d5a56a152ed4",
  "last_state_change": 1491797301,
  "last_ok": 1491797301,
  "silenced": false,
  "silenced_by": [],
  "downtime": [
    {
      "start": 1493159338,
      "end": 1493169338,
      "creator": "Your Name Here",
      "description": "this is a test"
    }
  ]
}
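
For SLA reporting, a downstream consumer only needs to check whether the downtime array is present and non-empty. Judging from the mutator code, the array is absent when the client defines no services, and can be empty when a stash exists but its window does not cover the check execution time. A minimal sketch, assuming events arrive as JSON strings shaped like the output above; in_scheduled_downtime? is a hypothetical name:

```ruby
require 'json'

# Hypothetical helper for a downstream consumer: true when the mutator
# attached at least one matching downtime window to the event.
def in_scheduled_downtime?(event_json)
  event = JSON.parse(event_json)
  downtime = event['downtime']
  downtime.is_a?(Array) && !downtime.empty?
end
```
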
#!/opt/sensu/embedded/bin/ruby
# Copyright (c) 2016 Sensu Inc.
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
require 'sensu-mutator'
require 'json'
require 'net/http'

class ScheduledDowntime < Sensu::Mutator
  # Make an HTTP GET request to the Sensu API, using the URI
  # path provided. Assumes API is running locally on port 4567.
  #
  # @param path [String]
  def sensu_api_get_request(path)
    request = Net::HTTP::Get.new(path)
    Net::HTTP.new("127.0.0.1", 4567).start do |http|
      http.request(request)
    end
  end

  # Determine whether a check result was executed in between
  # start and end times specified in a downtime stash.
  #
  # @param event [Hash] Sensu Event Data
  # @param stash [Hash] Scheduled Downtime Stash
  def downtime_matches?(event, stash)
    event["check"]["executed"].between?(stash["start"], stash["end"])
  end

  # Annotate event data to indicate whether any services defined under
  # client.services custom attribute match scheduled downtime windows
  # as defined by stashes under '/stashes/downtime' at the time the
  # event is processed.
  #
  # This method is invoked automatically by Sensu::Mutator @@autorun
  def mutate
    services = @event["client"].fetch("services", [])
    if services.empty?
      # No services to look up: emit the event unchanged.
      puts JSON.dump(@event)
      exit 0
    else
      begin
        services.each do |service|
          path = ["/stashes/downtime", service].join("/")
          response = sensu_api_get_request(path)
          # Skip services that have no downtime stash.
          next if response.code.to_i != 200
          stash = JSON.load(response.body)
          @event["downtime"] ||= []
          @event["downtime"] << stash if downtime_matches?(@event, stash)
        end
        puts JSON.dump(@event)
        exit 0
      rescue => error
        puts "scheduled downtime mutator error: #{error}"
        exit 2
      end
    end
  end
end
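
The window comparison in downtime_matches? relies on between?, which is inclusive at both ends. A stash missing start or end would make that comparison raise, which the rescue block turns into exit status 2. A more defensive variant might look like the sketch below; downtime_window_matches? is a hypothetical standalone function, not the method the mutator uses:

```ruby
# Hypothetical standalone variant of the window check: returns false
# instead of raising when the event or stash lacks integer timestamps.
def downtime_window_matches?(event, stash)
  executed = event.dig('check', 'executed')
  window_start = stash['start']
  window_end = stash['end']
  timestamps = [executed, window_start, window_end]
  return false unless timestamps.all? { |value| value.is_a?(Integer) }

  # between? is inclusive: a check executed exactly at start or end matches.
  executed.between?(window_start, window_end)
end
```

Whether a malformed stash should be skipped silently or should fail the mutator is a design choice; skipping keeps one bad stash from blocking annotation for the other services on the client.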