- no upfront installation/agents on remote/slave machines - ssh should be enough
- application components should use third-party software, e.g. HDFS, Spark's cluster, deployed separately
- configuration templating
- environment requires/asserts, i.e. we need a JVM in a given version before doing deployment
- deployment process run from Jenkins
Ansible (perhaps with Docker + Vagrant)
- no upfront installation/agents on remote/slave machines - ssh should be enough
  - Ansible uses OpenSSH and needs only Python on the remote machines (the control machine needs Python 2.6 or 2.7), a requirement most sensible operating systems meet out of the box.
- application components should use third-party software, e.g. HDFS, Spark's cluster, deployed separately
  - Inventory file(s) use variables, groups, and all sorts of substitution
  - See http://docs.ansible.com/intro_inventory.html
- configuration templating
  - See above - inventory file(s)
  - Playbooks = Ansible’s configuration, deployment, and orchestration language.
    - They describe a policy you want your remote systems to enforce, or a set of steps in a general IT process.
  - playbooks are written in YAML; Ansible itself and most of its modules are written in Python
  - templates use Jinja2
- environment requires/asserts, i.e. we need a JVM in a given version before doing deployment
  - playbooks language = If Ansible modules are the tools in your workshop, playbooks are your design plans.
  - http://docs.ansible.com/playbooks.html
  - See "configuration templating" requirement
- deployment process run from Jenkins
  - Using Ansible requires a central machine with the ansible command to manage the remotes
  - The central machine could be the Jenkins one
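The "environment requires/asserts" requirement above can be sketched independently of any tool. This is a minimal, hypothetical Python sketch, not how Ansible implements such checks. The names `java_major_version`, `require_java` and `SAMPLE_OUTPUT` are made up for illustration. It parses text in the format printed by `java -version` and fails fast when the JVM is too old.

```python
import re


def java_major_version(version_output):
    """Return the major Java version found in `java -version` style output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    first = int(match.group(1))
    # Legacy numbering: "1.7.0_79" is Java 7, "1.8.0_45" is Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def require_java(version_output, minimum_major):
    """Raise RuntimeError unless the reported JVM is at least minimum_major."""
    found = java_major_version(version_output)
    if found < minimum_major:
        raise RuntimeError(
            "Java %d or newer required, found Java %d" % (minimum_major, found)
        )
    return found


# Sample text in the format printed by `java -version` (hypothetical build).
SAMPLE_OUTPUT = 'java version "1.7.0_79"\nJava(TM) SE Runtime Environment'

if __name__ == "__main__":
    print(require_java(SAMPLE_OUTPUT, 7))
```

In a real deployment the text would come from running `java -version` on the target host (it writes to stderr), and the deployment would stop as soon as `require_java` raises.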
Ansible, Chef, Puppet, SaltStack, CFEngine
- config management tool
- describe infrastructure as code
- main difference between them: the language
- Puppet and Chef both provide languages that let you write scripts to quickly provision servers (including instances of Vagrant and/or Docker). You don't need Puppet or Chef to set up these services, but sometimes they can be a quick way to do so.
Docker and Vagrant are mentioned, too, for similar requirements.
- approach - Containers vs Hypervisors (VMs)
- V for Virtualization
- C for Containerization
- With Vagrant you might spin up a few CentOS boxes and install nginx on them; with Docker you might just spin up a few nginx instances without the overhead of an entire VM.
- Vagrant gives a new VM to work within - all components inside are the same, regardless of developer's local machine setup
- Docker is a two-part shell/management layer for building and running virtual Linux containers, based on LXC.
- I always say that in Development/Test and Staging, you always have to test your application against the current production environment (including configuration) and whatever potential alternate production environment. Are you planning a security update? Do you want to update or switch your Java stack?
- This is where Vagrant and Docker shine. I would expect Docker to help you speed up testing against multiple operating environments.
- Vagrant abstracts the machine, while Docker abstracts the application.
- When you need a throwaway machine, use Vagrant. If you want a throwaway application, use Docker.
- Why would you need a throwaway machine for an application? When you want to do anything sophisticated with networking or hardware.
- If you're not sure, use Vagrant.
- In addition to the factors mentioned above, one significant factor is the host operating system. If you are on a Windows host, that rules out running Docker natively.
- Vagrant on Docker = Docker provider in Vagrant
- https://docs.vagrantup.com/v2/docker/basics.html
- Beyond the 'use both' solutions already given, which I have used on projects before and agree are viable, there's an argument for Docker in that 'everyone's doing it', so you'd have a lot of support from a really great community. I've found the Vagrant community sparse at times.
- Vagrant with a VM based backend like vSphere or VirtualBox vs. Fig with Docker.
Vocabulary
- the tools = ansible, chef, puppet and salt
- provisioning
- workstation = the laptop or desktop machine you are sitting at.
- Guest = the virtual machine running on your workstation (the one Vagrant or Docker uses). It is a guest on your workstation and can be asked to leave at any time.
- hypervisor = Oracle’s VirtualBox provides the hypervisor in which to run one or more guests on your workstation.
What is difference between docker, puppet, chef and vagrant?:
- the tools turn the configuration of an environment into source code
- when the environment configuration becomes code it can then be managed from within a VCS such as git or SVN so that changes are attempted, shared, rolled forward and rolled back in a much more frictionless way than the traditional written specification documents or word-of-mouth configuration sharing (e.g. do this .... now try that ... no, OK then try this)
- Gene Kim's book The Phoenix Project recommended
- CFEngine mentioned - review, too?
- configuration management tools
- Puppet: This solution seems to appeal mostly to operations teams with little to no development background
- a Puppet Master = a state server to track your infrastructure
- While Puppet can be extended using the Ruby language, it is not terribly easy to do so
- Puppet is difficult to pick up.
- Chef: This solution resonates best with teams that, while not developers, are familiar with unit and integration testing, use of source control and other developer tooling.
- highly mature and works at massive scale due to its adoption by Facebook which also has contributed
- extensible using the Ruby language
- Chef is very difficult to learn, though the exceptionally verbose output of a convergence run eases the identification and rectification of problems.
- Ansible: This solution is by far the simplest of the systems and appeals greatly to front-line developers who often moonlight as their company's operations folks.
- written in Python, so has a certain attraction to the Python community
- If you are considering configuration management for the first time ever and need an easy win, Ansible is a good place to start.
- not familiar with Salt Stack and CF Engine.
- Docker as a way to package code into consistent units of work.
- These units of work can then be deployed to testing, QA and production environments with far greater ease.
- Because Docker only needs to express the configuration for a single process, the problem becomes far easier. The Dockerfile is, thus, a bit more than a simple bash script for configuring a process with its dependencies.
- Docker also brings artifact management along with it via the public Docker Hub. This can be thought of as a correlate to public GitHub repositories. When a developer writes code and packages it with Docker, they can push it to the Docker Hub to be shared as a binary artifact. The ramifications of this are quite extensive. This artifact can be tested for function, performance and security as a "black box" without needing to know what the contents of the "box" (container really) are.
- Docker is revolutionary in its scope because of the combined container, packaging and artifact management it seeks to employ.
- still very weak in production due to difficulties in networking, identity management and data persistence = consider Docker, but be sure to use it in the real world with your eyes wide open.
- The Configuration Management solutions all share a complexity that reflects the difficulty of configuring bare metal and virtual machines.
- not only launching them, but also modifying the state of their configuration -- often with the release of new software.
- Vagrant is a way to use Oracle's VirtualBox or VMware Fusion on a developer workstation for the purpose of creating disposable and shareable development environments.
- Vagrant does not compare to Puppet, Chef or Docker; it is meant to be used with them.
- Vagrant takes the entire description of your development environment and couches it in a Ruby file. This means your development environment configuration is code and can be shared, rolled back and rolled forward with ease.
- You, the developer, are free to try new and innovative things such as that new, awesome Java package or the latest version of PHP without worrying that a failure might take you days to set up or unravel.
- Vagrant is a killer development application, and should be considered by nearly every development team.
- Vagrant + Chef == Docker
- First consider your situation in terms of the paradigm of production deployment (Containers vs. VMs). If you choose containers, then Docker/Boot2Docker is a good choice as it has a great deal of momentum right now and its ecosystem is beginning to grow.
- Puppet and Chef are configuration management tools = keep all of your server configuration in a central place.
Current released version: 1.9.2
- Ansible is an open-source IT configuration and automation platform
- configure systems (configuration management), deploy software (application deployment), cloud provisioning (ec2, openstack, digital ocean, rackspace), ad-hoc task execution (one-off playbooks), and finally orchestrate more advanced IT tasks such as continuous deployments or zero downtime rolling updates (multi-tier orchestration).
- Uses Python 2.6 or 2.7
- Python 3 is a slightly different language than Python 2 and most Python programs (including Ansible) are not switching over yet.
- No support for Windows as the control machine
- No root permissions required
- Ansible communicates with remote machines over SSH
- use native OpenSSH for remote communication when possible
- enables ControlPersist (a performance feature), Kerberos, and options in ~/.ssh/config such as Jump Host setup
- OpenSSH for transport (with an accelerated socket mode and pull modes as alternatives)
- In releases up to and including Ansible 1.2, the default was strictly paramiko. Native SSH had to be explicitly selected with the -c ssh option or set in the configuration file.
- Ansible 1.3 tries to use native OpenSSH by default
- when the version of OpenSSH is too old to support ControlPersist, Ansible falls back into using a high-quality Python implementation of OpenSSH called ‘paramiko’
- features like Kerberized SSH and more? Consider using Fedora, OS X, or Ubuntu as your control machine
- SSH keys are used by default. Password authentication can also be used.
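The fallback between native OpenSSH and paramiko described above can be sketched as a small decision function. This is a simplified, hypothetical illustration, not Ansible's actual detection code. It assumes the decision can be made from the text printed by `ssh -V`, and that ControlPersist first appeared in OpenSSH 5.6. The function names and the sample version strings are made up for the example. The returned values mirror the `-c ssh` and `-c paramiko` connection options.

```python
import re

# Assumption: ControlPersist was introduced in OpenSSH 5.6.
CONTROL_PERSIST_MIN = (5, 6)


def openssh_version(ssh_v_output):
    """Extract (major, minor) from `ssh -V` style output, or None if absent."""
    match = re.search(r"OpenSSH_(\d+)\.(\d+)", ssh_v_output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def choose_transport(ssh_v_output):
    """Pick "ssh" when ControlPersist should be available, else "paramiko"."""
    version = openssh_version(ssh_v_output)
    if version is not None and version >= CONTROL_PERSIST_MIN:
        return "ssh"
    return "paramiko"


if __name__ == "__main__":
    print(choose_transport("OpenSSH_6.6.1p1 Ubuntu-2ubuntu2, OpenSSL 1.0.1f"))
    print(choose_transport("OpenSSH_5.3p1, OpenSSL 1.0.1e-fips"))
```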
- Ansible manages machines in an agentless manner.
- No daemons or database setup to use Ansible
- Ansible is decentralized–it relies on your existing OS credentials to control access to remote machines.
- support for Kerberos, LDAP, and other centralized authentication management systems.
- The main parts: playbooks, configuration management, deployment, and orchestration
- install it on one machine, say a laptop, and manage an entire fleet of remote machines from that central point.
- no software is left installed or running on remote/managed machines => no need to worry about future upgrades of Ansible on them
- Running From Source
- runs so easily from source + does not require any installation of software on remote machines => many use the development version of Ansible all of the time straight from GitHub
- taking advantage of new features (and also easily contribute to the project)
- Installation: `git clone` + `cd` + `source`
- see http://docs.ansible.com/intro_installation.html#running-from-source
- Ansible’s release cycles are usually about two months long
- minor bugs will generally be fixed in the next release
- Major bugs will still have maintenance releases when needed
- For installation of the latest released version of Ansible, use the OS package manager
- Commercial Ansible = Tower
- Enterprise users may also be interested in Ansible Tower. Tower provides a very robust database logging feature where it is possible to drill down and see history based on hosts, projects, and particular inventories over time – explorable both graphically and through a REST API.
- powerful configuration/deployment/orchestration features of Ansible = handled by playbooks
- A mode called ‘ansible-pull’ can also invert the system and have systems ‘phone home’ via scheduled git checkouts to pull configuration directives from a central repository.
- Ansible is not just about running commands, it also has powerful configuration management and deployment features.
- Ansible’s inventory file
  - defaults to `/etc/ansible/hosts`.
  - an INI-like format
  - group names in brackets to classify systems
  - variable precedence
  - variable files are in YAML format
  - patterns
- Modules FIXME
- Plugins FIXME
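To make the INI-like inventory format concrete (group names in brackets, one host per line), here is a deliberately simplified Python sketch. It is not Ansible's own parser. It ignores features such as host ranges, child groups and group variables. The host names in `SAMPLE_INVENTORY` and the function name `parse_inventory` are hypothetical.

```python
def parse_inventory(text):
    """Group hosts by the [group] header they appear under."""
    groups = {"ungrouped": []}
    current = "ungrouped"
    for raw_line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups.setdefault(current, [])
        else:
            # The first token is the host; key=value tokens after it are
            # per-host variables, which this sketch ignores.
            groups[current].append(line.split()[0])
    return groups


SAMPLE_INVENTORY = """\
mail.example.com

[webservers]
foo.example.com
bar.example.com

[dbservers]
one.example.com ansible_ssh_port=5309
"""

if __name__ == "__main__":
    for group, hosts in sorted(parse_inventory(SAMPLE_INVENTORY).items()):
        print(group, hosts)
```

The standard-library `configparser` is not used here because inventory lines are bare host names rather than `key = value` pairs.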
- Chef uses a subset of Ruby, and server configurations are written as a series of steps. e.g. install apache, enable authn_ldap module, etc.
- Puppet uses a DSL (Ruby-based) that tries to be declarative, as in, ensure a machine is in a certain state, but not done in any particular order.
- Vagrant is a tool for quickly spinning up virtual machines.
- There's a configuration file (again in Ruby) that specifies what vm image to start with, what additional tasks to run, how the network should be configured, and so on. Vagrant has pretty good support for running configuration management software as well.
- One place Vagrant really shines is setting up test environments. You're using Ubuntu Trusty with Apache 2.4 in production? Just spin up a few equivalent boxes locally with the same configuration. Using something like Beaker (another Puppetlabs project), you can even test your Puppet code itself!
- Another use case for Vagrant is as a provisioning tool, since you're not technically limited to the default Virtualbox provider. You can use it with XEN, KVM, EC2, basically whatever. However, there are probably better tools for this use case. Razor and the Foreman come to mind.
- Vagrant manages virtual machines, using another service (such as VirtualBox or AWS) as its provider. You can launch many different types of virtual environments with Vagrant, but the most common is a Linux server.
- Vagrant on the other hand is a wonderful tool for automatically provisioning multiple virtual machines each with their own configurations managed with puppet and/or chef. For its virtualisation it can use different providers. Originally the default provider was virtualbox, but it now supports many more, including vmware fusion and even amazon-ec2.
- Interestingly, Vagrant has a Docker provider now, so you can use vagrant to manage your Docker builds and deployments.
- Vagrant has similar challenges, as boxes can become out of date, and sometimes boxes can be hard to find and/or update. There are tools like packer and the older veewee to help you build so called 'base' boxes.
- Vagrant uses VirtualBox to spin up a virtual machine where you can set up your own environment and install everything you need, mostly through provisioning scripts. Its most common use case is to test your application on different environments and to spin up test servers/machines that your teammates can ssh to and test their work on, where everyone else can see and use it.
- Vagrant talks to virtualization systems (VMware, VirtualBox, AWS, even Docker) mainly to create full virtual machines with their own IPs, running any OS and, of course, everything that booting a full VM implies.
- We've tried Vagrant + Chef (solo and server) and it gets really complicated fast; the addition of berkshelf created huge problems for local installs on Windows (sometimes for Macs as well).
- I recommend using the Vagrant solution if you are not planning to use Docker in production.
- the disposable nature of a Vagrant development environment
- `:shell` provisioner
- "Towards a better software development lifecycle with Vagrant"
- `vagrant --version`
- `vagrant [init|up|ssh|halt|destroy]`
- `vagrant up` = starts VM off Vagrantfile
  - Vagrant reads the Vagrantfile and builds a virtual machine to that specification.
- `Vagrantfile` and `config.vm.box`
- Every Vagrant development environment requires a box. Search for boxes at https://atlas.hashicorp.com/search.
- https://cloud-images.ubuntu.com/vagrant/
- Provisioning with a shell script, but additional provisioners such as Puppet, Chef, Ansible, Salt, and Docker are also available.
- The name “Vagrant” refers to the transient nature of these guests.
- vagrant = a person who is poor and does not have a home or job http://dictionary.cambridge.org/dictionary/british/vagrant
- A fully functional development environment by cloning a github repo
- All teammates can use the same image
- Tweak the image to match your needs, alter `Vagrantfile` and send a pull request with the change to the git repo.
- Your teammates can now `git pull` and follow along
- Upgrade the dev env, `vagrant destroy`, `git pull`, `vagrant up`, and do your work
- All we need is `Vagrantfile` - it's a text file, remember?
- DevOps thinking is to experiment and fail fast.
➜ ~ docker version
Client version: 1.7.0
Client API version: 1.19
Go version (client): go1.4.2
Git commit (client): 0baf609
OS/Arch (client): darwin/amd64
- Docker is a tool for managing Linux containers. At first glance it seems similar to Vagrant, but the use case is actually quite different. With containers, you generally don't want an entire distribution installed, but rather just a single service.
- Think of Docker as something a little closer to package management, rather than a virtual machine.
- One really nice thing about Docker (and containers in general), is that you can get away with running more bleeding-edge software in production, since when you deploy a Docker container, you're deploying the entire stack, not just a few files. So you do an update on everything in a container? It goes through QA just like the rest of the code.
- Docker is not a full-fledged Virtual Machine, but rather a container. Docker enables you to run instances of services/servers in a specific virtual environment. A good example of this would be running a Docker container with Ruby on Rails on Ubuntu Linux.
- The great thing about Docker is that it is light-weight (because it relies on shared-kernel linux containers) and it is distribution agnostic. While the kernel between all instances is shared (but isolated from the host and each other), the user space for different instances can be based on different linux distributions.
- Docker is still limited in its flexibility - 'everything is an image', and you can create variant images, and whole stacks of images, where each one adds features to a previous one. Managing that can become a challenge.
- Docker on the other hand uses images and containers to package your application. An image is basically a snapshot of your application with all of its environment and requirements installed; however, it's not a machine. A container is basically a process/service that runs in the background from an image. It acts like a virtual machine, but it's not one; it's just a process that runs on top of a machine. You can run many containers from one image and you can run many containers on one machine.
- If you have your Docker image, you can run your application on any machine; all you need is to have Docker installed. Docker uses Docker Hub as a registry where users can pull/push images from repositories.
- Docker talks to the Linux kernel (and, in the near future, Windows) to create containers (think of a chroot that has its own network interfaces, users and processes, isolated from the rest of the system) to launch applications in an opinionated way (the emphasis is on running single apps, declaring exposed network ports, and image-based filesystems).
- Docker will be more complicated to set up, because in the case of a multi-tier app, you'll want to have more than one container; this means images for each container type, the fig template, some setup/docs and a build system/procedure to tie them together. However, once you get the ball rolling, people will get updates fast, and will generally have an easier time with it. Caveat: win and mac folks will need to do some tinkering with boot2docker and such.
- Docker did not come to market to play a "versus" role against Vagrant. Docker does a good job, but the basic fact remains that it needs an OS to run on. A VM serves the purpose of giving Docker something to run on.
- Docker up and destroy is faster than Vagrant, so just use one Vagrant VM and multiple Docker images over it.
- a Docker development environment
- Docker is a provisioning tool.
- On Mac OS, Docker creates and runs containers inside a VirtualBox VM
- Docker is a daemon that runs on a Linux machine and allows you to run one or more light weight virtualization containers. Loosely, the Docker daemon can be thought of as a hypervisor.
- `docker` command to interact with the Docker daemon and obtain information or perform actions on containers and images.
- `docker info`
- `docker ps` to list all running containers.
  - Just a single line with the headers is fine as it proves that from our workstation we are connecting to the Docker daemon on the guest.
- `docker run -d --name redis -p 6379:6379 redis`
  - `--name` gives the name of a container
  - the last option, i.e. `redis` above, is the name of the image to use
  - `-d` flag to run a container in detached mode which gives you back control of your command line.
  - `-p host-port:container-port` maps `host-port` to `container-port`
    - The command names the running container “redis” and opens port 6379 on the container. It maps the same port on the host.
  - `-v host-path:container-path` to mount a directory on a workstation `host-path` to `container-path` in the container
  - `--link redis:redis` to link to `redis` container
  - the full hash of the container = the primary name by which the container is known
    - can also be referenced by its human name “redis” which was assigned with the `--name` switch.
  - looks for an image on the guest and when it can't find it, it downloads the public Redis image from the Docker Hub.
    - a number of hashes representing “layers.”
    - each command in a Dockerfile creates a layer
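The order of the pieces in a `docker run` command line is easy to get wrong, so a small helper that assembles the argument list can make it explicit. This is a hypothetical Python sketch (`docker_run_args` is not part of any Docker library). It only builds the list of arguments and never contacts a daemon. Docker takes the host side first in `-p host-port:container-port` and `-v host-path:container-path`, and the image name comes last.

```python
def docker_run_args(image, name=None, detach=False,
                    ports=None, volumes=None, links=None):
    """Build the argv list for a `docker run` invocation."""
    args = ["docker", "run"]
    if detach:
        args.append("-d")
    if name:
        args += ["--name", name]
    # Each port pair is (host_port, container_port).
    for host_port, container_port in (ports or []):
        args += ["-p", "%d:%d" % (host_port, container_port)]
    # Each volume pair is (host_path, container_path).
    for host_path, container_path in (volumes or []):
        args += ["-v", "%s:%s" % (host_path, container_path)]
    # Each link pair is (container_name, alias).
    for container, alias in (links or []):
        args += ["--link", "%s:%s" % (container, alias)]
    # The image name always comes after the options.
    args.append(image)
    return args


if __name__ == "__main__":
    print(" ".join(docker_run_args("redis", name="redis", detach=True,
                                   ports=[(6379, 6379)])))
```

On a machine with Docker installed, the resulting list could be handed to `subprocess.call` to actually start the container.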
- The Docker daemon uses Linux-specific kernel features so it only runs on Linux. You can’t run Docker natively in OS X. You will need a Linux virtual machine running on your workstation. The Boot2Docker application provides one on the other OSes.
- The application includes a VirtualBox Virtual Machine (VM), Docker itself, and the Boot2Docker management tool.
- The Boot2Docker VM is a lightweight Linux virtual machine made specifically to run the Docker daemon on Mac OS X.
- The VirtualBox VM runs completely from RAM, is a small ~24MB download, and boots in approximately 5s.
- `boot2docker help`
- `boot2docker version`
- `boot2docker up` for VM and Docker daemon to start
  - boot2docker has created a guest Linux machine that is running the Docker daemon.
- `docker run hello-world`
- `boot2docker halt`
- `boot2docker upgrade`
- `boot2docker shellinit`
- `docker stop` stops a container
  - use `docker ps` to list running containers
  - After stopping, docker doesn’t let you run two containers with the same name, so either pick a new name, or remove the existing containers with `docker rm`
- Boot2Docker is also known as a remote Docker daemon
- `Dockerfile` = a text file used to build a Docker image and run a container based on that image.
  - Images are like classes in OOP while containers are instantiations (objects) of that image (class).
  - Just like you can instantiate any number of objects from a class, you can start any number of containers from an image, provided certain runtime configuration doesn’t conflict.
- `docker build -t myapp:latest .` builds a docker image named `myapp` with tag `latest` on the guest host’s disk
  - takes a long time because the python base image is nearly a gigabyte
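The class/object analogy for images and containers can be played out in a few lines of Python. This is a toy model only. `Image`, `Container` and `ToyDaemon` are hypothetical names and have nothing to do with Docker's real API. It shows one image backing several containers, and a container name acting as the kind of runtime configuration that must not conflict.

```python
class Image(object):
    """Stands in for a built image: created once, reused many times."""

    def __init__(self, name, tag="latest"):
        self.name = name
        self.tag = tag


class Container(object):
    """Stands in for a running instance started from an image."""

    def __init__(self, image, name):
        self.image = image
        self.name = name


class ToyDaemon(object):
    """Tracks containers by name and refuses duplicate names."""

    def __init__(self):
        self.containers = {}

    def run(self, image, name):
        if name in self.containers:
            raise ValueError("container name %r is already in use" % name)
        container = Container(image, name)
        self.containers[name] = container
        return container

    def rm(self, name):
        del self.containers[name]


if __name__ == "__main__":
    daemon = ToyDaemon()
    redis_image = Image("redis")
    daemon.run(redis_image, "redis")
    daemon.run(redis_image, "redis-2")  # same image, second container
    print(sorted(daemon.containers))
```

Calling `daemon.run(redis_image, "redis")` a second time raises `ValueError` until `daemon.rm("redis")` frees the name, which mirrors picking a new name or removing the old container first.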
- a disposable environment
- All application dependencies are described in the Dockerfile.
- By building the new image, we have standardized the runtime environment for the application.
- any change to the environment is expressed as code (in Dockerfile)
- Once you are sure the change is good, you share it the same way you share all code.
- If the change is bad, then you are back to work in minutes.
- you are now free to experiment
- Anyone new to the project gets all this information when they clone the project
- Pick an image and give it a go - don't like it? pick another one? All's in Dockerfile and versioned.
- Adopting Docker means a cultural shift in QA and Operations that must be viewed as a process.
- Don’t automate without first considering what problems you want to solve and what philosophy governs your implementation.
- Docker Hub https://hub.docker.com
- https://docs.docker.com/docker-hub/userguide/
- An account on the public Docker Hub provides you with your own namespace to push and pull images from.
- This correlates directly to GitHub’s "social coding" paradigm except you share Docker images rather than source code.
- a repo is a fully built image
- `docker login` to log in to the Docker Hub
  - `~/.docker/config.json` with login credentials
- You can register images as repositories in your namespace by tagging and pushing them.
- Kitematic = set up Docker and run containers using a graphical user interface (GUI).
- Read Docker for the Impatient where I describe the other goodies of Docker.
- http://www.quora.com/What-is-difference-between-docker-puppet-chef-and-vagrant
- http://www.quora.com/What-is-the-difference-between-Docker-and-Vagrant-When-should-you-use-each-one
- http://www.odin.com/fileadmin/media/hcap/pcs/documents/ParCloudStorage_Mini_WP_EN_042014.pdf
- http://www.quora.com/What-is-the-difference-between-Docker-and-Vagrant-1
- https://www.facebook.com/laskowski.jacek/posts/10154000971874027?comment_id=10154001172484027
- http://docs.ansible.com/
- https://youtu.be/Qi0AhK7PMCI
- http://stackengine.com/vagrant-development-environments-how-and-why/
- http://stackengine.com/docker-101-01-docker-development-environments/
- http://odewahn.github.io/docker-jumpstart/
- Quick Hadoop Startup in a Virtual Environment
Thanks @matixo! Even though it's a 2-day research project, I'm going to spend more time with Ansible with docker and vagrant to see how well it fits current/team's and future/mostly-my needs.