Nobody (Almost) Upgrades Servers Weekly in 2019
No one wants to run old software. We all appreciate using the latest stable version of a given piece of code, be it the OS, a service like Postgres, or an app on your phone. Yet actually staying current remains difficult. Let's explore why, and how to actually fix it.
Outdated Software is Easy to Find
When a company decides to "do security," they often do it to enable the sales team to "sell security." Let us not fool ourselves. No organization does security to make the world a better place. They do it because they realize they can sell to financial services, government, and medical industry customers, or because the company has suffered a security breach. After 40+ years of security product marketing, the common wisdom of "doing security" is to buy an anti-virus, a firewall, a vulnerability manager, and a static analyzer. The first two have their own limitations, so I'll focus on what happens after a vulnerability manager finds something, and how to actually fix it.
Nobody owns it
The hardest part of IT and security is people. Very few people want to invest the time and money into doing IT security. Most organizations want to create new features in their flagship products and sell them, which is great. Security is seen as the "department of no," the "people who make you check a box just before a release," and the "people who freak out over low-probability risks." However, now that DevSecOps is on the rise, security can actually help create new features and enable an organization to sell its product as genuinely secure to more customers and industries, by enabling other IT teams to fully automate upgrades. The first step is to establish: "Who owns this, and who will maintain it?" I have found that it is completely reasonable for a security team or department to own and maintain the base OS layer for an organization, such as the AWS AMIs and Docker images. Infosec teams can be the example to follow, by fully automating upgrades and builds on a weekly and nightly basis.
Should be a solved problem by now
IT is always changing. None of us used cloud provider services before Amazon AWS in 2006. None of us had touchscreen smartphones before the iPhone in 2007. None of us used Linux containers and Docker before 2013. Things change, and hopefully over time the problems of the past get automated away. In my experience, some problems do get automated away, as long as you keep your use cases generic and standardized, while other problems just get moved around. As an example, Docker does not have built-in configuration management, even though infrastructure as code has been widely embraced. Almost everyone agrees we should be using a configuration management tool such as Ansible, Puppet, or Chef. The next assumption is: "Surely Ansible / Puppet / Chef has full support for XYZ service?" Not always. Very popular services such as Apache, PHP, and MySQL do have full OS support and full configuration parameter support, with robust test suites. However, other popular services such as Redis, Elasticsearch, and Kafka do not have full OS support, full configuration parameter support, or robust test suites. You can see my analysis of the support provided by Ansible, Puppet, and Chef, here. I would also suggest using the DevSec Hardening Framework, a collection of Ansible / Puppet / Chef playbooks to help with hardening for the CIS Benchmark.
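To make that concrete, here is a minimal sketch of applying the DevSec Hardening Framework's OS hardening with Ansible. The `dev-sec.os-hardening` role name is what the project publishes on Ansible Galaxy; check the project's documentation for current role names and tunable variables:

```yaml
# harden-base.yml -- a minimal sketch, assuming the role was installed first with:
#   ansible-galaxy install dev-sec.os-hardening
- hosts: all
  become: yes
  roles:
    - dev-sec.os-hardening
```

Running this against a fresh base image gives you a reproducible, version-controlled hardening step instead of a hand-maintained checklist.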
Let's build it!
I like your enthusiasm! If you've read my analysis, you'll notice that overall Ansible has the most robust support for multiple OSes, parameters, and testing suites. Within multiple organizations and teams, I've seen the anti-pattern of "Let's copy someone else's open-source Ansible / Puppet / Chef code into our private repos." This kind of works, but the IT organization loses the benefits of collaborating with a larger development team. These internal private configuration repos tend to go stale very quickly, even as everyone else within the organization relies on them. Please contribute to the open-source configuration management projects instead.
How do we build?
The next problem to tackle is how to actually use Ansible / Puppet / Chef to build your AWS AMIs and Docker images. I highly suggest using HashiCorp's Packer. It is open-source and supports AWS, Azure, GCP, Docker, and many other providers. I maintain a repo of multiple Docker images. We should put this into a build service. I suggest Jenkins, and have found Travis CI works well for smaller open-source projects. I would also suggest not using Bash for the glue, and instead having Jenkins call Gradle, which in turn calls Packer. This type of build pipeline uses task-oriented tools, and keeps the pipeline simple, easy to understand, and easy to maintain.
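As a sketch of how Packer ties this together (using Packer's 2019-era JSON template format; the base image, playbook path, and repository name are hypothetical placeholders):

```json
{
  "builders": [
    { "type": "docker", "image": "ubuntu:18.04", "commit": true }
  ],
  "provisioners": [
    { "type": "ansible", "playbook_file": "./playbooks/base.yml" }
  ],
  "post-processors": [
    { "type": "docker-tag", "repository": "myorg/base", "tag": "nightly" }
  ]
}
```

The same provisioning step works across builders, so swapping the `docker` builder for an `amazon-ebs` builder produces a hardened AMI from the same playbook.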
Does it actually work?
All of the automation I've mentioned so far only builds, but doesn't test. There are four types of image build pipeline testing: dependency testing, configuration testing, end-to-end (e2e) testing, and security testing. For test runners: Ansible Molecule and Chef Test Kitchen. For dependency testing: OWASP Dependency-Check. For configuration testing: Serverspec, Testinfra, and Goss. For e2e testing: Infrataster. For security testing: Gauntlt, Continuum Security's BDD-Security, F-Secure's mittn, and Chef InSpec. I would also suggest finishing off the build pipeline with an image security scanner, and integrating a final check from Nessus / Qualys.
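For configuration testing, Goss is a lightweight option: you declare the expected state of the image in YAML and run `goss validate` inside it as a pipeline step. A minimal sketch (the specific checks here are illustrative, not a complete baseline):

```yaml
# goss.yaml -- run with `goss validate` inside the built image
service:
  sshd:
    enabled: true
    running: true
port:
  tcp:22:
    listening: true
package:
  telnet:
    installed: false
```

If any declared state doesn't match, `goss validate` exits non-zero, which fails the build before a bad image ever ships.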
I help my consulting clients build, develop, and mature their build pipelines using the above tools. However, that doesn't prevent ShadowOps and untracked changes made in production. For those situations, I suggest Augeas, Facebook's osquery, the osquery Augeas plugin, and osquery FIM, fed into your logging and monitoring / SIEM of choice. I hope that over time the popular open-source projects will develop Ansible / Puppet / Chef roles for their services, to help us all install and configure them in a reliable fashion. It is debatable whether that should be their responsibility, or whether it should fall to Ansible / Puppet / Chef. Until that day comes, let's all do our part to help contribute.
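As a sketch of osquery-based drift detection (the query name and monitored paths are illustrative; `file_events` is osquery's file integrity monitoring table, populated for paths listed under `file_paths`):

```json
{
  "schedule": {
    "etc_file_changes": {
      "query": "SELECT target_path, action, time FROM file_events;",
      "interval": 300
    }
  },
  "file_paths": {
    "configuration": ["/etc/%%"]
  }
}
```

Scheduled query results land in osquery's results log, which your log shipper can forward into the SIEM, so any untracked change to `/etc` surfaces as an event rather than a surprise.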