
OpenStack Ansible Manifesto

date: 2015-06-23
tags: lxc, openstack, cloud, ansible, deployment, manifesto

Problems

  • There needs to be a single reference way to deploy OpenStack that just works and can be put into production.
  • Developers should have the ability to develop OpenStack in a way that matches what deployers are using in production.
  • Current deployment tools lack scale and stability. OpenStack deployment frameworks generally lack the ability to maintain an OpenStack cloud throughout its life cycle, which includes the expansion of cloud infrastructure.

Proposal

The basis for all deployed OpenStack software will be source. This means that OpenStack services and their Python dependencies will be built and installed from upstream source code as found within the OpenStack Git repositories. The project will provide a system which can be patched and updated as needed from OpenStack upstream code, allowing deployers to deploy OpenStack in various releases and forms without having to change much in the deployment framework.

The services will be installed within LXC containers whenever possible, with the constraint that the services running in containers must make sense from the standpoint of scalability and stability. The system will be micro-service-like and strive to deliver micro-services where they make sense. The deployment framework will be built using Ansible and will adhere to Ansible best practices as much as possible. This will be a "batteries included" project, which means deployers can expect that deploying from any of the named feature branches or tags will provide an OpenStack cloud built for production, available at the successful completion of the deployment.
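As a sketch of that "batteries included" intent, a single top-level Ansible playbook could chain the complete deployment together. The playbook names below are illustrative assumptions, not a statement of the project's actual file layout:

    ---
    # setup-everything.yml -- hypothetical top-level entry point.
    # Each included playbook is a complete, idempotent stage of the deployment.
    - include: setup-hosts.yml            # prepare physical hosts and LXC containers
    - include: setup-infrastructure.yml   # shared services such as galera and rabbitmq
    - include: setup-openstack.yml        # the OpenStack services themselves

Running the top-level playbook once, against a prepared inventory, is what "available at the successful completion of the deployment" means in practice.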

Why LXC

This project introduces containers as a means to abstract services from one another. The container technology chosen for this project is LXC. This will provide the project the ability to deploy OpenStack ONE way, using resource abstraction to make things more scalable. The use of LXC containers will also allow entire application stacks to be abstracted and run within the same physical host machines. The "containerized" applications may be grouped within a single LXC container, where it makes sense, or distributed across multiple containers based on application and/or architectural needs. The default container architecture has been built in such a way as to allow for scalability and highly available deployments. Because of the flexibility and stability LXC provides, it is the most sensible foundation for a system delivering micro-service-like infrastructure.

Example of running multiple services in one container: grouped services can be seen within the neutron agents containers, where running the DHCP, Linux Bridge, and L3 agents within a single container makes the most sense. This reduces resource consumption, even compared with containers as thin as micro-services define them, and allows Neutron's multiple services to communicate over the same system sockets in the same root namespace, which ensures the deployment system is not creating workarounds just to be able to use containers.

Highlights:
  • Active development community
  • Built-in bridges
  • Easy networking setup and/or integration with existing networks
  • Simple container configuration
  • Diverse storage backends: LVM, local disk, BTRFS, ZFS, etc.
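As an illustration of how simple the container configuration can be, Ansible ships an lxc_container module that covers the highlights above. This is a minimal sketch; the bridge name br-mgmt, the host group, and the LVM volume group are assumptions about the deployer's environment:

    ---
    # Sketch: build an LXC container on an LVM backing store and attach it to
    # an existing host bridge. All names here are illustrative assumptions.
    - name: Create a service container
      hosts: network_hosts
      tasks:
        - name: Build the container and attach it to the host bridge
          lxc_container:
            name: neutron_agents_container
            template: ubuntu
            backing_store: lvm          # dir, btrfs, zfs, etc. also work
            vg_name: lxc
            state: started
            container_config:
              - "lxc.network.type = veth"
              - "lxc.network.link = br-mgmt"
              - "lxc.network.flags = up"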

The simple nature of LXC allows the deployer to treat containers as physical machines. Programmatically there is NO difference between an LXC container and a physical machine as it pertains to the OpenStack Ansible Deployment project. This allows deployers to use existing operational tool sets to troubleshoot issues within a deployment, and to revert an application or service within inventory to a known working state without having to re-kick a physical host.
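Because a container is reached over SSH exactly like a physical machine, the same play runs unchanged against either; the inventory group names here are hypothetical:

    ---
    # Metal or container, the play neither knows nor cares.
    - name: Touch every managed machine identically
      hosts: physical_hosts:all_containers
      tasks:
        - name: Confirm connectivity the same way as for any physical host
          ping: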

Not all services can be containerized, and for others a container simply doesn't make sense. Logic needs to be applied to how services are containerized; if a service's requirements can't be met due to system limitations (kernel, application maturity, etc.), then it should not run within a container under normal circumstances. A sketch of how such a service might be flagged follows the list below.

Examples of un-containerized services:
  • Nova compute can't run within a container due to current kernel limitations around attaching iSCSI volumes to a running instance.
  • Swift storage nodes don't make sense to run within a container due to direct drive access requirements.
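One way to express this in the environment layout is a per-component switch that pins a service to the host rather than a container. The snippet below is purely illustrative; the is_metal property name and the surrounding structure are assumptions about how such a flag could be modeled:

    ---
    # Hypothetical environment entry: run nova-compute directly on the host
    # ("on metal") because kernel-level iSCSI requirements rule out a container.
    container_skel:
      nova_compute_container:
        belongs_to:
          - compute_containers
        properties:
          is_metal: true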

Why Source Based OpenStack

A source-based deployment of the Python-built parts of OpenStack makes sense when dealing with scale and when consistency is wanted over long periods of time. A deployer should have the ability to deploy the same OpenStack release on every node throughout the life cycle of the cloud, and to upgrade at desired times without fear of having nodes in different states using different Python package versions. The consistency in Python deliverable packages will be provided by the OpenStack Ansible Deployment project and will be an integral part of the release.

When installing a service application and its corresponding bits, the operating system package manager will only be used for infrastructure and/or unresolvable OpenStack dependencies. This means that there will never be a time where OpenStack-specific packages, as provided by the distributions, are used for OpenStack services. Third-party repositories like Ubuntu Cloud Archive and/or RDO may still be required within a given deployment, but only as a means to meet application dependencies.
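A source install pinned to a named tag keeps every node on identical code. A minimal sketch using Ansible's pip module; the service, group name, tag, and virtualenv path are all illustrative:

    ---
    # Sketch: install one OpenStack service from upstream git at a fixed tag.
    - name: Install the service from source
      hosts: glance_all
      tasks:
        - name: Pip install straight from the OpenStack git tree
          pip:
            name: "git+https://git.openstack.org/openstack/glance@2015.1.0#egg=glance"
            virtualenv: /openstack/venvs/glance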

Why Ansible

Ansible is a simple yet powerful orchestration tool that is ideally equipped for deploying OpenStack-powered clouds. The declarative nature of Ansible allows the deployer to turn an entire deployment into a rather simple set of instructions.
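A short example of that declarative style; the group and package names are assumptions for illustration:

    ---
    # Declare the end state; Ansible converges the node toward it on every run.
    - name: Ensure the message queue is present and running
      hosts: rabbitmq_all
      tasks:
        - name: Package is installed
          apt:
            name: rabbitmq-server
            state: present
        - name: Service is enabled and started
          service:
            name: rabbitmq-server
            state: started
            enabled: yes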

Roles within the OpenStack Ansible Deployment repository will be built using Ansible best practices and will contain namespaced variables that are human-understandable. All roles will be built as Galaxy-compatible roles, even when a given role is not intended for stand-alone use. While the project will offer many built-in roles, the deployer will be able to pull down or override roles with external ones using the built-in Ansible Galaxy capabilities.
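A sketch of what namespacing and Galaxy compatibility look like inside a hypothetical galera_server role; the variable names, metadata, and dependency are illustrative:

    ---
    # roles/galera_server/defaults/main.yml -- every variable carries the role
    # name as a prefix so nothing collides across roles.
    galera_server_package_state: latest
    galera_server_bind_address: 0.0.0.0

    ---
    # roles/galera_server/meta/main.yml -- Galaxy metadata plus the only
    # sanctioned way for one role to pull in another.
    galaxy_info:
      author: openstack
      description: Install and configure a Galera cluster
      license: Apache2
      min_ansible_version: 1.9
    dependencies:
      - pip_install    # hypothetical dependent role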

Design

Overview

The OpenStack Ansible Deployment project has been built as a single monolithic repository with the intent to provide simplicity in structure and operation. While the design goals set out to be simple, the system is by no means inflexible: it is capable of being as complex as the deployer wants it to be. While the monolithic setup is simple to operate and update, it does create a learning curve for users who want to participate in the development of the project, as there is more code in a single repository, which may be confusing to some as they begin working with the system. That said, everything is namespaced, so the learning curve should be generally low; it is really a matter of individuals learning the structure.

Roles

The roles used by this project have been built to allow for running them as individual components outside of the main repository. While the roles will exist within the single repository the separation of concerns within the role design should promote individual contributions as people adopt the project as a means to deploy OpenStack.

Points from within the repository:
  • All roles and variables are namespaced to avoid confusion.
  • No role interacts with any other role, except where listed as a dependency within the meta/main.yml file.
  • All roles are Galaxy compatible.
  • All playbooks exist at the top of the playbooks/ directory.
  • Use of group_vars should be very limited in favor of host_vars and user-defined variable files (see the sketch after this list).
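On that last point, a user-defined variable file keeps overrides out of the repository entirely. The path and keys below are illustrative assumptions:

    ---
    # /etc/openstack_deploy/user_variables.yml -- hypothetical user-defined
    # variable file; namespaced overrides win over role defaults without
    # patching the repository itself.
    galera_server_package_state: present
    glance_default_store: swift

Such a file would then be fed to any playbook run, for example with ansible-playbook -e @/etc/openstack_deploy/user_variables.yml setup-everything.yml.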

Notes

  • This project does not PXE boot hosts; host setup is left to the deployer. However, this project does require that bridges be set up within the host to allow the containers to attach to a local bridge for system network access. This assumes that you have at least two bridges which your containers/hosts are going to use to talk to the rest of the infrastructure. The LXC bridge is created by this project upon host LXC setup and is only used by LXC containers. A pre-flight check for these bridges is sketched after this list.
  • LXC is not a means of securing a system, and it should be explicitly stated that LXC was not chosen for its security safeguards. LXC was chosen because of its practicality with regard to providing a more uniform OpenStack deployment. Even if the abstractions that LXC provides do improve overall deployment security, these potential benefits are not the intention of the containerization of services.
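A deployer can verify the bridge assumption up front. A minimal pre-flight sketch; the bridge names and host group are assumptions about the deployer's network layout:

    ---
    # Fail early if an expected bridge is missing on any host.
    - name: Verify required bridges exist before deploying containers
      hosts: all_physical_hosts
      tasks:
        - name: Check each bridge device
          stat:
            path: "/sys/class/net/{{ item }}/bridge"
          register: bridge_check
          with_items:
            - br-mgmt
            - br-vxlan
        - name: Abort when a bridge is absent
          fail:
            msg: "Bridge {{ item.item }} is missing on {{ inventory_hostname }}"
          when: not item.stat.exists
          with_items: "{{ bridge_check.results }}"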