@justinhennessy
Last active August 29, 2015 14:06

The importance of the inspect and adaption cycle

Here at Everyday Hero we pride ourselves on our ability as an engineering team to continually improve not only our engineering practices but also our planning, testing, and communication.

Recently we performed an operation on one of our primary databases. This involved creating a new slave database for our backups and other read-only services, promoting the current slave to be the primary because it had more resources available to it, and then decommissioning the old master.

We were given a two-hour window to complete this process. The team involved performed the work smoothly in just over an hour.

After the work, we met as a team for a short retrospective, a common meeting in the Agile/Scrum process that gives the team the chance to review what was done, what worked, and what could be improved next time.

Here is the outcome of our retrospective:

Things we did well

  • Planning and documentation processes have improved immensely over the last couple of maintenance windows

  • Having the team onsite yet again proved to be superior to working remotely: communication and responsiveness to issues were great

Things we can improve

  • Obtaining a maintenance window was problematic. Given Everyday Hero is a global business, finding a convenient and appropriate time for a maintenance window has always proved challenging

  • Generate and document, in advance, the success test cases that will be used to verify the system is OK. This will help prevent us from tripping over application bugs on the day, which could hamper the successful completion of the maintenance
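As an illustration of what documented success test cases might look like, here is a minimal sketch in Python. It is not the checklist we used: the `SuccessCheck` type, the `run_checks` helper, and the two placeholder checks are all hypothetical names invented for this example, and real checks would query the application and the newly promoted database instead of returning `True`.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SuccessCheck:
    """One documented success test case for a maintenance window."""
    name: str                  # short label that also appears in the written plan
    check: Callable[[], bool]  # returns True when the system looks healthy


def run_checks(checks: List[SuccessCheck]) -> Tuple[bool, List[str]]:
    """Run every check and return (all_passed, names_of_failed_checks)."""
    failed = []
    for item in checks:
        try:
            ok = item.check()
        except Exception:
            # A check that raises is recorded as a failure, not a crash.
            ok = False
        if not ok:
            failed.append(item.name)
    return (not failed, failed)


# Hypothetical placeholder checks (assumptions for illustration only).
def can_read_from_backup_database() -> bool:
    return True


def can_write_to_primary() -> bool:
    return True


CHECKS = [
    SuccessCheck("read-only database answers queries", can_read_from_backup_database),
    SuccessCheck("primary accepts writes", can_write_to_primary),
]

if __name__ == "__main__":
    all_passed, failed = run_checks(CHECKS)
    print("all checks passed" if all_passed else f"failed: {failed}")
```

Keeping each check as a named entry means the same list can serve as the written test plan and as the script that is run on the day.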

Actions

  • Develop and maintain a maintenance window calendar that the business has access to, so that if a window is needed, a call can be made without involving engineering

  • Every maintenance operation should have a dress rehearsal before a window is requested. This ensures the team is across the process that needs to be done and that more accurate timings are defined for the outage window
