DevOps Transformation Notes

DevOps Transformation

A successful DevOps transformation should not only institute sound development, deployment, and operational practices but also instill a spirit of 'continuous transformation' in the team.

What it isn't:

  • Easy
  • Cure-all
  • A solution for technical debt
  • Free
  • Fast

What it is:

  • Cost-efficient
  • Cheaper than the alternative
  • Iterative (small victories, repeated)

Must-Haves to achieve fearless push-button releases

  • Clear Documentation
    • Easy access to up-to-date architectural diagrams and infrastructure maps to assist during high-stress incidents.
  • Version Control
    • All aspects of application/infrastructure/DevOps must be expressed as code and kept under version control.
      • Source Code
      • Infrastructure Code
      • Pipeline/Build Code
      • Property Files / Configuration Files
      • Test Scripts
      • Install Scripts
      • Etc.
    • Electronic signing of all changes to validate source.
    • Disallow direct-commit to release branch.
  • Branching Strategy
    • Trunk-Based development (release branch is ALWAYS in a known-good state) with short-lived feature/bugfix/POC branches.
  • Code Coverage
    • Establish a percentage of application code that must be covered by unit tests.
    • Coverage above roughly 80% typically yields diminishing returns.
    • Failure to meet this threshold should fail the build.
  • Code Analysis (all analysis and scanning should be performed as early in the process as is reasonable to find and fix problems when they are cheap to address)
    • Static Analysis
      • Useful for finding obscure bugs or unforeseen nested loops. It can seem tedious but can save a lot of heartburn down the road.
    • Security Analysis
      • Self-explanatory
    • Open Source Scan
      • To protect the business (legally and financially), it is critical to ensure that no open source software with licensing restrictions that conflict with our business model makes it to production.
  • Artifact Management
    • Must use an artifact management system that keeps track of versions and artifact origin.
  • Automated Provisioning (service and/or infrastructure)
    • Using scripts/templates/cookbooks/playbooks (pick your poison)
  • Immutable Servers
    • Non-prod and prod servers are never directly modified but are replaced using the above automated provisioning (this excludes emergencies).
  • Automated Build/Test/Deploy (zero manual intervention in the phases below; they should all exist as part of a deployment pipeline)
    • Automated Unit-Testing
    • Automated Integration Testing
    • Automated Performance Testing
    • Automated Critical Business Transaction testing (identify the most critical business use-cases for the product, then design and implement robust regression tests around this functionality as the final quality check before any deployment)
    • Builds:
      • All build jobs must be reviewed any time code is changed.
      • Build server should only have access to approved libraries, to keep untrusted binaries from being inadvertently imported into production and introducing possible vulnerabilities.
      • No person should have direct, unaudited access to the binary artifacts or be able to alter a build product post-build.
      • Robust application versioning utilizing semantic versioning (or have a good reason for using an alternate versioning method). Consider separate internal and external versioning to support marketing efforts that might center around "new version" announcements.
      • Any failed integration tests or unit tests should automatically fail the build.
    • Deploys:
      • Non-prod env must have zero connectivity to prod and must be as close as possible to the real production environment (including close-to-prod data sources).
      • Prod env must have zero connectivity to non-prod. All artifacts are pulled from the artifact repository, not from non-prod through a DMZ. All prod deploys must be performed by an appropriate pipelining tool.
  • Automated Rollbacks (single button 'undo' for those rare times your checks and balances don't catch a problem prior to release)
    • Chaos Engineering (simian army, chaos monkey, etc -- see Defensive development below)
      • This assumes a robust active-active or active-passive fail-over configuration or something analogous. Place zero trust in any one region or availability zone.
    • Valid health checks/monitoring
    • Modern Deployment Strategy (Blue/Green or Rolling, + Canary)
      • This also implies zero-downtime releases
  • Development Methodology
    • Agile/scrum/kanban - Something that isn't waterfall
    • Feature toggles
    • Defensive development (all engineers should anticipate the outage of any service at any time and should program accordingly)
    • Separation of duties
      • 'Trust' for any release decisions should not lie in the hands of one person. (You should be able to trust your engineers, but your customers should not have to. All code must be reviewed by at least a second pair of eyes before being deployed.)
      • Standard Production console access should be read-only.
    • Everything-As-Code
      • At no time, save for extreme emergencies, should production changes be made by hand in the console. All changes should be expressed as code and reviewed. This prevents fat-finger mistakes and improves auditability of changes to production.
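
The Code Coverage must-have (a coverage threshold that fails the build when missed) could be wired into a pipeline as a small gate. The following is a minimal sketch in Python, not a real tool's API: `coverage_percent` and `build_exit_code` are hypothetical names, the line counts are assumed to come from whatever coverage report the build already produces, and the 80% default only mirrors the figure mentioned in the notes.

```python
def coverage_percent(covered_lines: int, total_lines: int) -> float:
    """Line coverage as a percentage; 0.0 when there is nothing to measure."""
    if total_lines <= 0:
        return 0.0
    return 100.0 * covered_lines / total_lines


def build_exit_code(covered_lines: int, total_lines: int,
                    threshold_percent: float = 80.0) -> int:
    """Return 0 if coverage meets the threshold, 1 otherwise.

    A CI step can hand this value to sys.exit() so that a shortfall
    fails the build instead of merely printing a warning.
    """
    actual = coverage_percent(covered_lines, total_lines)
    return 0 if actual >= threshold_percent else 1
```

Treating an empty report as 0% is a deliberate choice in this sketch: a missing or empty coverage report then fails the gate rather than passing silently.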
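
The Open Source Scan must-have can be approximated early in the pipeline with an allowlist check. This is a sketch under stated assumptions: the approved set below is a made-up policy (the identifiers follow SPDX naming, but which licenses a business accepts is its own decision), `license_violations` is a hypothetical helper, and the dependency-to-license mapping is assumed to come from an existing scanner or manifest.

```python
from typing import AbstractSet, List, Mapping

# Assumed policy for illustration only: licenses this hypothetical
# business has approved, written as SPDX identifiers.
APPROVED_LICENSES = frozenset({"MIT", "Apache-2.0", "BSD-3-Clause"})


def license_violations(dependencies: Mapping[str, str],
                       approved: AbstractSet[str] = APPROVED_LICENSES) -> List[str]:
    """Return the sorted names of dependencies whose license is not approved.

    Anything outside the allowlist is flagged, including unknown or
    missing license strings, so new licenses need an explicit decision.
    """
    return sorted(name for name, license_id in dependencies.items()
                  if license_id not in approved)
```

A build step could fail whenever the returned list is non-empty. An allowlist is used here rather than a denylist so that an unrecognized license blocks the build by default.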
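
For the semantic versioning point under Builds, a rough sketch of parsing, ordering, and bumping MAJOR.MINOR.PATCH versions follows. The `Version` class is hypothetical and deliberately incomplete: it ignores the pre-release and build-metadata suffixes that the Semantic Versioning specification also allows.

```python
import re
from dataclasses import dataclass

# Three dot-separated numeric parts without leading zeros.
_CORE_VERSION = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


@dataclass(frozen=True, order=True)
class Version:
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        """Parse 'MAJOR.MINOR.PATCH'; raise ValueError on anything else."""
        match = _CORE_VERSION.fullmatch(text)
        if match is None:
            raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
        return cls(*(int(part) for part in match.groups()))

    def bump(self, change: str) -> "Version":
        """Return the next version for a 'major', 'minor', or 'patch' change."""
        if change == "major":
            return Version(self.major + 1, 0, 0)
        if change == "minor":
            return Version(self.major, self.minor + 1, 0)
        if change == "patch":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change!r}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"
```

Because the fields are compared numerically in order, `Version.parse("1.10.0")` sorts after `Version.parse("1.9.9")`, which plain string comparison would get wrong.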
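
The Automated Rollbacks item, combined with valid health checks, suggests a release step that undoes itself when the new version does not come up healthy. The sketch below assumes the pipeline already exposes three callables; `deploy`, `is_healthy`, and `rollback` are hypothetical stand-ins for whatever the real tooling provides, and the number of checks is arbitrary.

```python
from typing import Callable


def release(deploy: Callable[[], None],
            is_healthy: Callable[[], bool],
            rollback: Callable[[], None],
            checks: int = 3) -> bool:
    """Deploy, then roll back automatically if any health check fails.

    Returns True when the release stays in place and False when it was
    rolled back.
    """
    deploy()
    for _ in range(checks):
        if not is_healthy():
            # First failed check triggers the undo; no human in the loop.
            rollback()
            return False
    return True
```

A real pipeline would likely wait between checks and watch more than one signal; the point of the sketch is only that the rollback decision is made by code, not by someone in a console.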