This is designed to be a quick primer on Continuous Integration and Continuous Delivery. We will also use the term "Continuous Deployment" to mean a subset of Continuous Delivery in which you are delivering with every commit.
For the sake of this discussion, we should have a term for "The opposite of CI/CD" and we'll call that "Intermittent Integration" and "Intermittent Delivery" so that we have a shared vocabulary.
Continuous Integration (CI): You can read the textbook definition on Wikipedia. For the purposes of this discussion, we will assume it is the practice of integrating your code into a shared master branch, preferably with as many of the "best practices" as possible. I will outline an example below.
Continuous Delivery (CD): You can read the textbook definition on Wikipedia. For the purposes of this discussion, let's broadly say it means you can release at any time (even if you do not).
Continuous Deployment: For the purposes of this discussion, let's broadly say it means you release with every commit.
The core motivation is faster feedback loops. You want everything to "shift left" (happen earlier): code review, basic QA testing, security, infrastructure, performance testing, etc. should all happen early and often in the Software Development Life Cycle (SDLC).
It is useful to think of CI vs. non-CI as a codified version of agile vs. waterfall. Let's imagine a "Golden Architecture" where everything is agile and CI/CD vs. a "Not-So-Golden Architecture" where everything is waterfall and II/ID.
You have a basic website that shows pictures of dogs (like Instagram). In the Golden Architecture, a feature goes like this:
- You get a requirement to show cat pictures in addition to dogs
- You implement it in a feature branch
- You make a PR to merge the code into master and it runs your CI suite (linting, tests, etc.)
- Everything passes and a teammate approves it
- The code goes into master and gets deployed into prod
- Repeat this for every feature
In the Not-So-Golden Architecture:
- You get a requirement to show cat pictures in addition to dogs
- You implement it in a branch shared with 10 other developers
- You try to merge the code into master but you have a lot of merge conflicts from everyone else's code
- You think you solved the conflicts and merge it into master
- Your QA team tells you everything is broken
- A teammate tells you they had some unstable code in that branch, so you have to figure out how to revert their change
- You end up branching off an earlier release because the feature is urgent and you copy/paste your code there
- You realize your code is dependent on some recent changes not in the release, so you need to rewrite your code
- Repeat this for every feature
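The merge gate in the Golden workflow above can be sketched as a tiny script that runs every check and blocks the merge if any fail. This is a hypothetical sketch; the `CHECKS` commands (flake8, pytest) are stand-ins for whatever linter and test runner your project actually uses:

```python
# Minimal CI merge gate: run each check in order, fail fast on the first error.
import subprocess
import sys

# Hypothetical checks; substitute your own linter, test runner, etc.
CHECKS = [
    ["python", "-m", "flake8", "."],
    ["python", "-m", "pytest", "-q"],
]

def run_checks(checks=CHECKS):
    """Return True only if every check command exits with status 0."""
    for cmd in checks:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return False  # a failing check blocks the merge
    return True

if __name__ == "__main__":
    sys.exit(0 if run_checks() else 1)
```

Real CI servers (Jenkins, CodePipeline, etc.) are essentially this loop plus triggers, reporting, and artifact handling.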
Going from 0 to 100 takes a while, so there are a number of steps you can take (in no particular order) to improve your iteration speed.
- Use feature branches (git is preferable, but not mandatory)
- Keep the scope of features small
- Review code and run CI suite before it goes into the main branch
- If you don't have a CI suite, make one
- Ensure each developer's work environment is the same (e.g. use Docker Compose) to avoid framework-mismatch blunders; this environment should also be as similar to prod as possible
Let's go through an example of some "good things". Each of these could warrant its own hour-long talk, but I'll touch on each so you have some general guidance.
- A CI server or setup of some sort. There are a number of frameworks (or combinations) you could use:
- Jenkins
- AWS CodePipeline
- Atlassian Bamboo/Bitbucket & Octopus Deploy
- Unit tests in your application, and preferably "integration" tests for web apps:
- End-to-End tests
- WebDriver (Selenium)
- Cypress
- Configuration Management / Infrastructure-as-code. Ideally, you are using some form of CM for your Jenkins, too!
- Docker
- Ansible / Chef / Puppet
- CloudFormation
- You should also test this when possible (e.g. ChefSpec, InSpec, etc.)
- Database Management (and any other "stateful" pieces of your architecture)
- Frameworks/libraries:
- Ruby: Rails ActiveRecord
- Java: Liquibase / Flyway
- You should also typically treat your cache (Redis/Memcached) much like a DB
- Other long-lived items like an S3 bucket or EFS may also require careful management to ensure the correct convergence
- Performance tests and Application Performance Monitoring
- Performance needs to be benchmarked, or you don't really know how well your app is performing. Tools like New Relic, AppDynamics, Dynatrace, and Raygun are useful for monitoring your apps.
- Tools like JMeter (BlazeMeter) / LoadRunner / Locust for running tests
- Your site's performance is probably worse than you think (neat blog: https://bravenewgeek.com/everything-you-know-about-latency-is-wrong/)
- Scaling / Availability
- This is a minor aside to CI/CD, but if done well, scaling and availability should be considered "early" in the life cycle so that deployments don't incur outages
- Also see blue/green deploys, canary deploys, and failovers
- Neat Route 53 article on failovers: https://medium.com/dazn-tech/how-to-implement-the-perfect-failover-strategy-using-amazon-route53-1cc4b19fa9c7
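The core idea behind migration tools like Flyway and Liquibase can be sketched in a few lines: apply versioned schema changes exactly once, in order, and record what has already run. This is an illustrative sketch, not any tool's actual API; the table names and SQL in `MIGRATIONS` are hypothetical:

```python
# Minimal versioned schema-migration runner (the idea behind Flyway/Liquibase).
import sqlite3

# Hypothetical migrations for the dog/cat picture site, applied in order.
MIGRATIONS = [
    ("001_create_dogs", "CREATE TABLE dogs (id INTEGER PRIMARY KEY, url TEXT)"),
    ("002_create_cats", "CREATE TABLE cats (id INTEGER PRIMARY KEY, url TEXT)"),
]

def migrate(conn, migrations=MIGRATIONS):
    """Apply any migrations not yet recorded in schema_history; safe to re-run."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_history (version TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_history")}
    for version, sql in migrations:
        if version not in applied:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_history (version) VALUES (?)", (version,))
    conn.commit()
```

Because the runner is idempotent, every deploy can call it unconditionally, which is exactly what you want in a CD pipeline.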
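On the performance point above: the linked latency post's argument is that averages hide the tail, so benchmarks should report percentiles. A minimal sketch of that summarization (the nearest-rank percentile method here is one simple choice among several):

```python
# Summarize latency samples by percentile rather than mean,
# since p95/p99 reflect what the slowest users actually experience.
def percentile(samples, p):
    """Nearest-rank percentile of a list of numbers (0 <= p <= 100)."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(round(p / 100 * (len(ordered) - 1))))
    return ordered[index]

def summarize(latencies_ms):
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
    }
```

Load-testing tools like JMeter and Locust report these percentiles for you; the point is to gate releases on the tail, not the average.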
There are a few must-have pre-reqs you should have in place before you go all-in on continuous deployment. Everything else is "nice to have" or can be built incrementally, but without these in place, you're probably going to fail.
- Test suites - You need to have an automated test suite so you can trust your deploys. I recommend starting with a decent e2e suite since it will be the most "realistic" simulation of production.
- Feature flags / "Toggles" - You need a way to push features into master without "accidentally" showing them to users before the code is ready.
- Rollback / Roll forward - You need a strategy for quickly rolling back (reverting to the last good version) or rolling forward (shipping a quick fix).
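The feature-flag pre-req above can be sketched very simply: code for the new feature merges to master, but the path is dark until the flag is turned on (globally, or for specific users such as QA). The flag name and user names here are hypothetical:

```python
# Minimal feature-flag check: merged code stays invisible until enabled.
# Hypothetical flag config; real systems use LaunchDarkly, Unleash, a DB table, etc.
FLAGS = {
    "show_cat_pictures": {"enabled": False, "allowed_users": {"qa_tester"}},
}

def is_enabled(flag, user=None, flags=FLAGS):
    """True if the flag is globally on, or this user is in the allow-list."""
    config = flags.get(flag)
    if config is None:
        return False  # unknown flags default to off
    if config["enabled"]:
        return True
    return user in config["allowed_users"]
```

This is what lets you merge the cat-pictures branch into master on day one and flip it on only when QA has signed off.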
Here are a few other good practices for moving towards safe, predictable, quick iterations into production:
- Application Monitoring: AWS X-Ray, New Relic, synthetics/canaries, etc.
- Blue/Green (immutable) deployments: Deployments roll out into a new group to ensure no downtime (and can also go through an additional test suite)
- Immutable infrastructure: You can make a Docker image (or AMI), which ensures you don't have drift on a particular instance and enables easy rollbacks if a newer deploy goes wrong.
- Cattle vs. pets: Generally you want cattle (Clusters/Autoscaling groups, destroyed at any time) instead of a pet (long-lived Jenkins, cannot destroy without problems) so you can easily terminate any particular instance without worry. It also ensures you don't end up with drift.
- Everything-as-code: If it can be code, it should be code (or something code-like). This makes development and testing much easier. Examples:
- Docker / docker-compose - This makes your image "code" and easy to replicate on any OS
- Config. Management (chef) - This makes any modifications to the image (if not done via a Dockerfile) easy to replicate (and test)
- Pipeline - No one wants to manually configure Jenkins and individual pipelines, especially after drift occurs
- Infrastructure - With Cloudformation (or Terraform), creation and teardown are fairly simple and predictable. (OPINION: Prefer declarative over imperative, e.g. CloudFormation vs. Boto)
- Database - Even if you want to have native SQL stored procedures, functions, etc., you should at least put them in version control and use some tool (ORM or otherwise) to correctly apply changes to them
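The blue/green idea from the list above reduces to a small amount of logic: deploy to the idle color, smoke-test it while live traffic is untouched, then flip. A hypothetical sketch (a real setup would flip a load balancer target group or Route 53 record, not an in-memory field):

```python
# Minimal blue/green deployment switch.
class BlueGreenRouter:
    def __init__(self):
        self.live = "blue"                          # color currently serving traffic
        self.versions = {"blue": "v1", "green": None}

    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version, smoke_test):
        """Deploy to the idle color; flip traffic only if the smoke test passes."""
        target = self.idle()
        self.versions[target] = version
        if not smoke_test(target):
            return False        # bad build never receives live traffic
        self.live = target      # atomic cutover; rollback is flipping back
        return True
```

Note that rollback becomes trivial: the previous version is still running on the other color, so you just flip `live` back.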
- Book on CI/CD concepts: https://www.amazon.com/Accelerate-Software-Performing-Technology-Organizations/dp/1942788339