Executive summary
- brief paragraph that anyone can understand, with an apology to those affected.
The rest of the report should be purely factual, with no emotion or blame.
Issue Summary
- short summary (5 sentences)
- list the duration along with start and end times (include timezone)
- state the impact (e.g. most user requests resulted in 500 errors; at peak, 100% did)
- close with root cause
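The duration line in the issue summary is easy to get wrong when start and end times are recorded in different timezones. A minimal sketch, assuming the timestamps are available as timezone-aware `datetime` objects and that the report standardises on UTC (the function name `summarize_outage` and the sample times are hypothetical, for illustration only):

```python
from datetime import datetime, timezone


def summarize_outage(start: datetime, end: datetime) -> str:
    """Return a one-line duration summary with start and end times in UTC."""
    # Naive timestamps are rejected, since the report must state a timezone.
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    if end < start:
        raise ValueError("end time precedes start time")

    # Normalise both timestamps to UTC so the summary uses a single timezone.
    start_utc = start.astimezone(timezone.utc)
    end_utc = end.astimezone(timezone.utc)

    # Whole minutes of outage, split into hours and minutes for readability.
    total_minutes = int((end_utc - start_utc).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)

    return (
        f"Duration: {hours}h {minutes:02d}m "
        f"({start_utc:%Y-%m-%d %H:%M} to {end_utc:%Y-%m-%d %H:%M} UTC)"
    )


# Hypothetical sample values, not taken from any real incident.
example = summarize_outage(
    datetime(2024, 1, 15, 14, 5, tzinfo=timezone.utc),
    datetime(2024, 1, 15, 15, 47, tzinfo=timezone.utc),
)
print(example)
```

Normalising to one timezone before subtracting keeps the stated duration consistent with the start and end times printed next to it.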
Timeline
- list the timezone
- cover the full duration of the outage
- when the outage began
- when staff were notified
- actions, events, etc.
- when service was restored
Root Cause
- give a detailed explanation of the event
- do not sugarcoat
Resolution and recovery
- give a detailed explanation of the actions taken (include times)
Corrective and Preventative Measures
- itemised list of ways to prevent it from happening again
- what can we do better next time?