MrNice/blogpost.md

## blogpost.md

      
    Ansible

Understanding Ansible

Ansible is a powerful, simple, and easy to use tool for managing computers. It is most often used to update programs and configuration on dozens of servers at once, but the abstractions are the same whether you're managing one computer or a hundred.
Ansible can even do "fun" things like change the desktop photo or back up personal files to the cloud. It can take a while to learn how to use Ansible because it has an extensive terminology, but once you understand the why and the how of Ansible, its power is readily apparent.
Ansible's power comes from its simplicity. Under the hood, Ansible is just a domain-specific language (DSL) for a task runner built on secure shell (ssh). You write ansible YAML (.yml) files which describe the tasks which must run to turn plain old / virtualized / cloud computers into production ready server-beasts. These tasks, in turn, use modules with easy to understand names like "copy", "file", "command", "ping", or "lineinfile". Each module turns into commands which are run over ssh on the client machine. For example, "copy" behaves much like secure copy, also known as scp, and is used to move files from the ansible runner onto the client.
The output from these commands is collected and sent back to the ansible runner. This output can then affect the task execution flow. For example, if a "copy" command fails, it will by default stop task execution, but ansible can be instructed to instead ignore the failure, retry the operation, or even select a new source to copy from. In this way, ansible is very much like an imperative domain-specific language. Tasks are run sequentially. If the first task copies a file to the computer, and the last removes it, it will still exist on the computer if the task pipeline fails somewhere in the middle.
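Here is a rough sketch of those three escape hatches. The file names, URL, and registered variable names are made up for illustration, and the `is failed` / `is succeeded` tests assume a reasonably recent ansible (older releases spelled them as filters).

```yaml
# Sketch only: site.conf, the example.com URL, and the register names are hypothetical.
- name: copy the site config, but keep going if it fails
  copy:
    src: site.conf
    dest: /etc/myapp/site.conf
  register: primary_copy
  ignore_errors: true

- name: fall back to a second source when the first copy failed
  copy:
    src: site.conf.fallback
    dest: /etc/myapp/site.conf
  when: primary_copy is failed

- name: retry a flaky download a few times before giving up
  get_url:
    url: https://example.com/site.conf
    dest: /tmp/site.conf
  register: download
  until: download is succeeded
  retries: 3
  delay: 5
```

Note that the tasks still run top to bottom; the keywords only change what happens after each one reports back.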
However, a single task is generally declarative. It describes the state of the computer we want, and ansible ensures that the computer ends up in that state. Because ansible "gathers facts" about a system during setup and as it runs, it knows whether files it cares about exist on the computer. Instead of creating a directory with mkdir, you tell the "file" module to ensure that a certain path is set to directory mode, as everything is a file in Unix. The file module is smart enough to not do anything if the path is already a directory.
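As a small sketch, the mkdir replacement might look like this. The path is just an example I picked, not something the role needs.

```yaml
# Declares the end state: "this path is a directory".
# /Users/Shared/media is a hypothetical path.
- name: ensure the media directory exists
  file:
    path: /Users/Shared/media
    state: directory
    mode: "0755"
```

Run it twice and the second run should report "ok" rather than "changed", because the directory is already there.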
The glaring exceptions to this rule are the core modules "shell" and "command", though they can be treated declaratively with the creates option or the when task parameter. With creates, the command does not run and no change is reported if the named file already exists; with when, the task is skipped if its condition is false.
Ansible attempts to be idempotent: when a playbook of tasks is run twice successively, or on two congruent computers, little should be different. There are many ways to subvert this in the imperative DSL, but for most ansible use cases, the same playbook should affect the computer in the same way every time. This presumption allows ansible to avoid doing work. For example, if the server already has the right node.js installed, or maybe just any version of node.js installed, the task will be "OK"'d and nothing is changed. Note that "skipped" is a task end state for when a conditional isn't met, while "ok" is a task end state for when the computer was already in the end state.
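To make the "ok" versus "skipped" distinction concrete, here is a minimal sketch. It assumes fact gathering is on and that node.js comes from the homebrew package named node.

```yaml
# Reports "changed" on the first run and "ok" once node is already installed.
- name: install node.js
  homebrew:
    name: node
    state: present

# Reports "skipping" on any host where the condition is false.
- name: say hello, but only on Macs
  debug:
    msg: "this host is a Mac"
  when: ansible_os_family == "Darwin"
```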
This allows the ansible runner computer to not matter, as long as the runner has the correct files. This seemingly difficult task is fairly easy to ensure, as ansible encourages you to keep important configuration files along with ansible yaml files in source control, either as a configurable "template" or as a whole.
If every task were just a stateful function call, or a call to an object's method, then task include statements are how you create your own functions. A task list can include tasks which simply pass arguments to other task lists. In this way, you can compose functions of task lists, effectively giving us meta-tasks.
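A sketch of a meta-task, with the caveat that the file name install_app.yml and the variable cask_name are my own inventions. It uses include_tasks, which newer ansible versions provide; older versions used a plain include for the same job.

```yaml
# media_mac/tasks/main.yml: the "function call"
- name: install VLC through the shared task list
  include_tasks: install_app.yml
  vars:
    cask_name: vlc

# media_mac/tasks/install_app.yml: the "function body" (hypothetical file)
- name: "install {{ cask_name }}"
  homebrew_cask:
    name: "{{ cask_name }}"
    state: present
```

The two comment headers mark two separate files; they are shown together only to keep the example short.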
Tasks and meta tasks can be included in either playbooks or roles. A "role" is a description of what a computer is: "mysql", "programmer", "youtube-streamer", etc. This is what makes ansible an idempotent task runner. Remember, ansible runs tasks in order to get a computer's software into some end state. A role describes the configuration needed to take a standard computer and transform it into a home media server. But what if you want your home media server to also be, perhaps, a SteamBox? You could use a new role, but this is a case for a playbook.
Playbooks are selections of roles which are applied to specific user logins and computer ip addresses. Your media serving home computer can also be a steam box, or a "bitcoin_miner", or whatever else you may want it to be. Of course, you can create conflicting roles, but that's what virtualization and containers help manage.
The inventory file provides a mapping between a group of computers, and the login information for each computer. That's all that ansible needs in order to ssh into your "tumblr-scrapers" and get them ready for action, without touching your ever-ready "airBnB for iguanas" service server. One day the world will catch up.
So, to recap, the inventory file provides logins for computers. A playbook maps groups of logins to specific computer roles: "wordpress" or "dev2" or "abc" for the cruel hearted. A role contains everything necessary to turn a computer into a server-beast, including task lists, configuration files, and templates, as well as meta data such as "this role needs this other role in order to work". Tasks describe specific pieces of state which must be true. And modules turn tasks into ssh commands!
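The "this role needs this other role" metadata lives in the role's meta/main.yml. A minimal sketch, assuming a separate role named homebrew exists (it is hypothetical here):

```yaml
---
# media_mac/meta/main.yml
dependencies:
  - role: homebrew   # hypothetical role that would run before media_mac
```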
Roles also have special "handler" tasks, which are "globally unique" and can be notified by any other task. They are best used to restart services such as apache servers or for triggering computer reboots.
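Roughly, a handler and the task that notifies it could look like this. The webserver role, the template name, and the apache2 service name are assumptions for the sake of the example.

```yaml
# roles/webserver/tasks/main.yml (hypothetical role)
- name: write the apache config
  template:
    src: httpd.conf.j2
    dest: /etc/apache2/httpd.conf
  notify: restart apache

# roles/webserver/handlers/main.yml
- name: restart apache
  service:
    name: apache2
    state: restarted
```

The handler only fires if the notifying task reports "changed", so an untouched config should not bounce the server.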
The last key piece of ansible is the humble variable system. Ansible yaml files can contain variables which control their behavior. Often these variables instruct the computer to download a new or otherwise specific program version, such as OpenSSL version 1.0.1f. They are also often used for machine specific configuration, such as naming the machine specially on DNS so everyone knows not to touch "production-load-balancer-plz-no-fail".
Variable rules are pretty simple: you define default variable values, then later you can overwrite them. There's a straightforward (if confusing) precedence order that interested parties can find in the docs. It is similar to: command line extra variables (-e) always win, then multiple levels of ansible YAML file and inventory rules, then finally a role's defaults/main.yml.
Because variables can be set anywhere and are everywhere, this can lead to confusing and hard to debug situations with variable name clashes, until precedence rules are internalized.
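A tiny sketch of the two ends of that precedence order. The variable name browser_cask is made up; the point is only where the value comes from.

```yaml
---
# media_mac/defaults/main.yml: the value that loses to everything else
browser_cask: google-chrome
---
# media_mac/tasks/main.yml
- name: install the browser
  homebrew_cask:
    name: "{{ browser_cask }}"
    state: present
```

Running ansible-playbook playbooks/test.yml -e "browser_cask=firefox" should then install Firefox instead, since extra variables given on the command line beat the role default.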
A workflow for making a role

Let's walk through installing the bare essentials for any Mac OS X box: Google Chrome, Transmission torrent client, and VLC. You pay for HBO, but you want Game of Thrones anywhere, anytime, on any device.
It often makes sense to think at the role level of abstraction when writing ansible scripts. "This computer is a dev box configured with my settings, stored in environmental variables." You can use the ansible role manager (arm) application to scaffold new playbooks and roles with arm init -r {{ role_name }}. This will create the new role directory structure in the current working directory.
Once you've scaffolded the "media_mac" role, open the tasks/main.yml file (it may have the .arm suffix as well). Let's think about what needs to happen in order for the computer to be ready for use:

Install Google Chrome
Install VLC
Install Transmission

Seems straightforward. Let's list these out:
---
# media_mac/tasks/main.yml
- name: Install Google Chrome
- name: Install VLC
- name: Install Transmission
How should we install these three apps? Why, the homebrew_cask module is perfect for this.
---
# media_mac/tasks/main.yml
- name: Install Google Chrome
  homebrew_cask: name=google-chrome state=present
Remember that we are declaring a state we want, in this case, please have google-chrome installed through homebrew_cask. We can also make the yaml more git line diff friendly by taking advantage of yaml syntax.
---
# media_mac/tasks/main.yml
- name: Install Google Chrome
  homebrew_cask: >
   name=google-chrome state=present
Now, we must test this role. Don't bother writing out the other two installations; there's no point if the Google Chrome one doesn't work. In order to imprint a role onto a computer, you need a playbook and a hosts file. Ansible can configure the computer it's run on, so your ansible_hosts file will look like this:
[self]
# IP         special host variable settings
127.0.0.1    ansible_connection=local
Now let's make a playbook, in playbooks/test.yml. Don't scaffold with arm yet, because we need to type this path often. This playbook is tiny:
---
- hosts: self
  roles:
  - role: media_mac
And now run ansible-playbook playbooks/test.yml... and the debugging starts. If you've installed homebrew, then used homebrew to install the cask command, then run the cask command, you set up ansible and its dependencies, and ansible hasn't changed yet, and this tutorial has all the required steps, and you're lucky, the command will work.
Let's update the role yaml to prevent future you from running into the homebrew problem. We're going to check to see if homebrew exists on the media_mac already. If homebrew were more programmer friendly or I were smarter, we would simply ensure homebrew's existence or install it, but right now we're going to push the problem onto future you, using the ansible stat module.
The stat module lets you do light system fact checking at run time. You register the end result of the stat command, and then you can reference that result later. Here, we check to see if brew is installed, and choose to fail if it isn't.
---
# media_mac/tasks/main.yml
- name: check if homebrew is already installed
  stat: "path=/usr/local/bin/brew"
  register: brew_exists

- fail: msg="Please install homebrew with the ruby installer script, then cask, then run cask once for permissions reasons"
  when: brew_exists.stat.exists == False

- name: Install Google Chrome
...
Now we've already started debugging, before we've even gotten "hello world" working. Welcome to devops. Let's move on and hope nothing else bad happens and forces us to adjust our engineering estimate again.
Use Caskroom.io/search to discover that VLC and transmission can also be installed with homebrew_cask. Other installations might require unzipping a tar archive somewhere, or running an installation script with the shell command. Luckily for us, these things all exist already.
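Assuming the cask names are vlc and transmission (check the search page to be sure), the other two installs would be something like:

```yaml
- name: Install VLC
  homebrew_cask:
    name: vlc
    state: present

- name: Install Transmission
  homebrew_cask:
    name: transmission
    state: present
```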
Now that you can install everything you need, let's do some configuration. Media Macs should be friendly to everyone, even the family dog. Let's add these apps to the dock. Normally, on a Mac, that's a matter of messing around with an XML file called a property list. Property lists (plists) are similar to YAML, but look like HTML with all those <words> tags.
Instead let's use dockutil, a python program which can manage the dock more easily than we can. Let's use brew for this.
- name: install /usr/local/bin/dockutil to manage the dock
  homebrew: >
    name=dockutil
    state=present
Note the /usr/local/bin/dockutil. This is the path the shell module uses to run dockutil. Prefer absolute paths if possible. Let's use dockutil to add Google Chrome to the Dock.
- name: "add google chrome to the dock"
  shell: /usr/local/bin/dockutil --add "/opt/homebrew-cask/Caskroom/google-chrome/latest/Google Chrome.app"
Note that this task must run after the dockutil install command, otherwise it won't work on untouched computers. If you run this command again, there will be two Chromes. Oops. Let's fix that. First, let's collect the output of dockutil --list and then if "Google Chrome" is in that output, don't add another dock item.
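- name: read defaults to know what to add to the dock
  shell: /usr/local/bin/dockutil --list
  register: dock_list
  # listing the dock changes nothing, so report "ok" instead of "changed"
  changed_when: false

- name: "add google chrome to the dock"
  shell: /usr/local/bin/dockutil --add "/opt/homebrew-cask/Caskroom/google-chrome/latest/Google Chrome.app"
  # find() returns -1 when "Google Chrome" is not in the dock list yet
  when: dock_list.stdout.find("Google Chrome") == -1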
Do that for the other two apps, and you're good to go. If you want to do more, check out the list of ansible modules and how to use them. Also check out the tips section below, as it illustrates how I develop with ansible.
Tips: to insure promptness

It takes a day or two to get used to ansible. This section should help you past most of the ansible humps.
Debugging


Use the debug and assert modules to assist in debugging
Use the --step CLI flag to enable interactive mode
Use the --start-at-task CLI directive to skip to the step you're currently debugging
Run ps aux | grep ansible on the remote host to track the ansible process.
Run ps aux | grep {{ task_underlying_command }} to track the amount of CPU time a long running task has taken.
Understand ssh, privilege escalation, and ssh remote agents.
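For the first tip, a sketch of debug and assert in action. It assumes the brew_exists and dock_list variables registered in the walkthrough above are available at this point in the play.

```yaml
# Print a registered result so you can see what ansible actually got back.
- name: show what dockutil reported
  debug:
    var: dock_list.stdout_lines

# Fail fast, with a readable message, when an assumption does not hold.
- name: stop early if homebrew is missing
  assert:
    that:
      - brew_exists.stat.exists
    msg: "homebrew was not found at /usr/local/bin/brew"
```

Pair this with --step, or with --start-at-task and the name of the task you are stuck on, to avoid replaying the whole playbook each time.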

Getting better at ansible

Also check out ansible galaxy, and read through some other roles to see what's possible. Favor an iterative approach when building playbooks, knocking out installation problems as you go along. Combine tasks into meta tasks, and use variables and loops to write less and do more. Favor actions which can be "OK"'d over "CHANGED", though this is not always necessary or possible.
Try starting specific, then becoming more abstract as the role grows. Knock one problem down at a time, and refactor and add variables once you know your patterns.
If you have the data you need to know whether or not to run a task, and just need to get that data into ansible, there's usually a way. Aside from computer fact gathering, you can offer a prompt to a user to ask for input. You can also share encrypted data (such as ssh keys?) with ansible vault. You can control ansible with anything, as lookups allow you to communicate with external APIs. If you need certain programs to be installed on the same server rack, use ansible tags to control deployment to inventories.
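As a sketch of two of those data sources, here is a small play that prompts the user and reads an environment variable through a lookup. The variable names media_user and runner_home are invented for the example; the self host group comes from the hosts file earlier in this post.

```yaml
---
- hosts: self
  vars_prompt:
    # asks the person running the playbook for a value
    - name: media_user
      prompt: "Which account should own the media folder?"
      private: no
  vars:
    # the env lookup reads from the environment of the ansible runner
    runner_home: "{{ lookup('env', 'HOME') }}"
  tasks:
    - name: show the collected values
      debug:
        msg: "user={{ media_user }} home={{ runner_home }}"
```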

  
## outline-cheatsheet.md

      
    How to write a great Ansible role / playbook / task

I am by no means an ansible expert, but I'm working on getting there


Ansible is a great tool

Fast to script / update
Easy to use and understand
Good abstractions


Since it's a DSL, there's a learning curve

Have to understand quite a bit before it "clicks"
Making new roles can be daunting, but it shouldn't be


High Level Overview

Everything boils down to idempotent module calls
Tasks call modules
Meta-tasks call tasks using includes (meta is my word)
Roles combine tasks with metadata to raise abstraction

tasks for the role
default variables if nothing else is set
files which must exist for the role to work
templates for files which must be created and configured
handlers are globally unique tasks which can be notified

"Handlers are best used to restart services and trigger reboots"


An inventory creates a mapping between SSH and human readable names
Playbooks combine hosts and roles
Variables can be set anywhere, and are everywhere

Because it's declarative, this isn't so bad
Can still cause debugging issues


Tools to aid development

ps aux | grep ansible
ps aux | grep {{ task_name }}
debug module
arm command line tool
A strong understanding of ssh, privilege escalation
Command line flag --step lets you interactively run a playbook


A workflow for making a new role

arm init -r {{ role_name }}
open {{ role_name }}/tasks/main.yml.arm, remove .arm from name
list every step you know you need with - name:
write out the first task, don't use variables
make shell command to run the playbook

if necessary, add your sudo pass to it w/ --ask-become-pass


run the playbook, ensure the task works, check with ssh session

This is "running the test suite" in TDD


write out the next test, repeat
add #TODO's for edge cases like "homebrew cask can't install vagrant"
when the script works perfectly, or all edge cases are discovered, you're done
find patterns, turn them into their own task file, use includes
find constants, add them as a default variable
adding a shell command? add a when: clause, maybe utilizing the stat module
need to do something conditionally? Inspect the stdout or stderr from a previously run task, using register


Tips: to insure promptness

Start Specific, Become Abstract

Avoid loops until you need them


When you use jinja templating, always add double quotes "{{ some_variable }}"
Use the greater than sign (>) for line-diffs in git
Use task includes to make meta-tasks
Always set a default value for a variable, because someone might use it in a conditional check and hell will break loose
Favor "OK" and "SKIPPED" over "CHANGED"

Use fact gathering and stat + register checking to your advantage


Write descriptive fail states to support the user self debugging
When developing or debugging, don't disable expensive or time consuming tasks, front load them and use --start-at-task to skip ahead of setup tasks
Follow the best practices around organization: http://docs.ansible.com/ansible/playbooks_best_practices.html#what-this-organization-enables-examples
If you have data somewhere, but don't know how to give it to ansible, check Special Topics: http://docs.ansible.com/ansible/playbooks_special_topics.html

Data known only to the user who is running the playbook: use prompt
Data must be encrypted: use vault
Data in external service: use lookups
Data describes machine instances: use tags


Examples:

Using > for git line diffs

- name: tap php cask
  homebrew_tap: >
    name=homebrew/php/
    state=present
Using creates to control shell command running

- name: install composer through php
  shell: curl -sS https://getcomposer.org/installer | php && mv composer.phar /usr/local/bin/composer
  args:
    creates: /usr/local/bin/composer
Using Check and Fail together

- name: check if vagrant is already installed
  stat: "path=/usr/local/bin/vagrant"
  register: vagrant_exists

- fail: msg="Please install vagrant with brew cask install vagrant"
  when: vagrant_exists.stat.exists == False
descriptive failure states

- name: check if vagrant is already installed
  stat: "path=/usr/local/bin/vagrant"
  register: vagrant_exists

- fail: msg="Please install vagrant with brew cask install vagrant"
  when: vagrant_exists.stat.exists == False
Also works if the user needs to make a file, can be used as a koan tool
# Prerequisite tasks to fail if there's nothing there
- name: ensure {{ ssh_key }} exists
  stat: "path=~/.ssh/{{ ssh_key }}"
  register: homestead_key

- fail: msg="Please create your {{ ssh_key }} or change your playbook variable ssh_key"
  when: homestead_key.stat.exists == False
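Quoting jinja and defaulting variables

A rough sketch of the two variable tips from the list above. The variable names cask_name and reboot_requested are hypothetical; the default and bool filters are standard.

```yaml
# Double quotes around "{{ ... }}" keep YAML from misreading the braces,
# and default() keeps the task working when nobody set the variable.
- name: "install {{ cask_name | default('vlc') }}"
  homebrew_cask:
    name: "{{ cask_name | default('vlc') }}"
    state: present

# A conditional that stays safe even if reboot_requested was never defined.
- name: announce a reboot only when asked to
  debug:
    msg: "a reboot was requested"
  when: reboot_requested | default(false) | bool
```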