@hanokhaloni
Created December 23, 2018 07:08
Resiliency checklist
1. Define your availability requirements, based on business needs.
2. Design the application for resiliency. Start with an architecture that follows proven practices, and then identify the possible failure points in that architecture.
3. Implement strategies to detect and recover from failures.
4. Test the implementation by simulating faults and triggering forced failovers.
5. Deploy the application into production using a reliable, repeatable process.
6. Monitor the application to detect failures. By monitoring the system, you can gauge the health of the application and respond to incidents if necessary.
7. Respond to any failures that require manual intervention.
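
As a rough illustration of steps 1, 3, and 4, the sketch below turns an availability target into a downtime budget, retries a failing call with exponential backoff, and exercises that retry logic against a simulated fault. Everything named here (`TransientError`, `allowed_downtime_minutes`, `call_with_retry`, `FlakyService`) is hypothetical and not part of the original checklist; the retry counts and delays are placeholder assumptions, not recommendations.

```python
import time


class TransientError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def allowed_downtime_minutes(availability_pct, days=30):
    """Step 1: convert an availability target into a downtime budget."""
    return (1 - availability_pct / 100) * days * 24 * 60


def call_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Step 3: retry a failing operation with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                # Out of retries: surface the failure so that monitoring
                # (step 6) and operators (step 7) can pick it up.
                raise
            sleep(base_delay * 2 ** attempt)


class FlakyService:
    """Step 4: a simulated dependency that fails a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise TransientError("simulated fault")
        return "ok"


if __name__ == "__main__":
    # A 99.9% target over 30 days leaves a budget of roughly 43 minutes.
    print(round(allowed_downtime_minutes(99.9), 1))

    # Inject two faults; the third attempt succeeds.
    service = FlakyService(failures=2)
    print(call_with_retry(service), "after", service.calls, "calls")
```

Passing `sleep` as a parameter is a design choice that lets a test replace the real delay with a recorder or a no-op, so fault simulations run quickly and the backoff schedule can be checked directly.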