JM1/Ansible_Roles_with_OS-specific_Defaults.md

## Ansible_Roles_with_OS-specific_Defaults.md

      
Ansible Roles with OS-specific Defaults

This Ansible guide discusses several approaches to setting role default
variables that depend on the host operating system, i.e. on
ansible_distribution / ansible_facts.distribution, or on other variables.
For example, a role variable image_uri should point to the latest cloud
image for the host. For CentOS 8 or Red Hat Enterprise Linux (RHEL) 8 the
default value should be:
image_uri: 'https://cloud.centos.org/centos/8/x86_64/images/CentOS-8-GenericCloud-8.2.2004-20200611.2.x86_64.qcow2'
For Ubuntu 20.04 LTS (Focal Fossa) it is supposed to be:
image_uri: 'https://cloud-images.ubuntu.com/focal/20200616/focal-server-cloudimg-amd64.img'
This guide applies to Ansible 2.9 and later, up to the latest (21.06.2020)
revision in Ansible's devel branch on GitHub.com.
First off, Ansible loads default variables from the defaults/main.yml file in
the role directory. Role default variables have a very low precedence /
priority compared to variables defined in other places:

Anything that goes into “role defaults” (the defaults folder
inside the role) is the most malleable and easily overridden

For details see Variable precedence: Where should I put a variable?.
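To illustrate the low precedence, here is a minimal sketch. The role name example_role and the choice of group_vars/all.yml are hypothetical and only serve as an illustration; the URLs are the two listed above.

```yaml
# roles/example_role/defaults/main.yml (hypothetical role)
image_uri: 'https://cloud.centos.org/centos/8/x86_64/images/CentOS-8-GenericCloud-8.2.2004-20200611.2.x86_64.qcow2'
---
# group_vars/all.yml
# Inventory group variables outrank role defaults, so this value is the one
# example_role sees when it reads image_uri.
image_uri: 'https://cloud-images.ubuntu.com/focal/20200616/focal-server-cloudimg-amd64.img'
```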
Approach using include_vars

Put the supposed-to-be-default variables into distinct files in the role's
vars/ folder and load these files with include_vars:

vars/Ubuntu.yml, vars/CentOS.yml etc.:
image_uri: https://...

tasks/main.yml:
- name: Load OS-specific variables
  include_vars: '{{ ansible_facts.distribution }}.yml'

- name: Do something with variable
  debug:
    var: image_uri


Downsides:
The big issue is that variables loaded with include_vars have a higher
precedence than variables from most other places, e.g. they
override variables in group_vars and host_vars. Most often, this
behaviour is not wanted for role defaults.
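The effect can be sketched as follows. The host name host1, the role name example_role and the example.invalid URL are assumptions made up for this illustration.

```yaml
# host_vars/host1.yml (hypothetical host)
# The user wants a custom image for this host.
image_uri: 'https://example.invalid/custom.qcow2'
---
# roles/example_role/vars/Ubuntu.yml (hypothetical role)
image_uri: 'https://cloud-images.ubuntu.com/focal/20200616/focal-server-cloudimg-amd64.img'
---
# roles/example_role/tasks/main.yml
- name: Load OS-specific variables
  include_vars: '{{ ansible_facts.distribution }}.yml'

# include_vars outranks host_vars, so on an Ubuntu host this prints the
# value from vars/Ubuntu.yml and the host_vars override is silently ignored.
- name: Do something with variable
  debug:
    var: image_uri
```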
Approach using prefixed variables, include_vars and conditional set_fact

The intention here is to give variables from e.g. host_vars and group_vars
priority over default role variables. To achieve this, include_vars
is combined with a conditional set_fact:
Add a prefix such as double underscores __ to the default variables defined
in the vars/ folder:

vars/Ubuntu.yml, vars/CentOS.yml etc.:
__image_uri: https://...


Load the variables using include_vars in tasks/main.yml but assign the
non-prefixed variable only if it has not been defined yet:

tasks/main.yml:
- name: Load OS-specific default variables
  include_vars: '{{ ansible_facts.distribution }}.yml'

- name: Set image_uri variable to default value
  set_fact:
    image_uri: "{{ __image_uri }}"
  when: image_uri|default(None) == None

- name: Do something with variable
  debug:
    var: image_uri


The conditional set_fact is a workaround for the high variable precedence
of include_vars which caused unwanted side effects in the previous approach.
This approach is used by Jeff Geerling (@geerlingguy).
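The condition can also be spelled with Jinja2's built-in defined and none tests, which some readers may find easier to scan. This is only an alternative spelling, sketched here and not taken from the roles referenced above; it is meant to behave like image_uri|default(None) == None.

```yaml
- name: Set image_uri variable to default value
  set_fact:
    image_uri: "{{ __image_uri }}"
  # Runs only if image_uri is missing or explicitly null.
  when: image_uri is not defined or image_uri is none
```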
Downsides:
To allow multiple role executions, e.g. using import_role or include_role,
non-prefixed variables may have to be "undefined" i.e. reset to !!null / none:

tasks/main.yml:
- name: Load OS-specific default variables
  include_vars: '{{ ansible_facts.distribution }}.yml'

- name: Set image_uri variable to default value
  set_fact:
    image_uri: "{{ __image_uri }}"
  when: image_uri|default(None) == None

- name: Do something with variable
  debug:
    var: image_uri

- name: Cleanup role variables
  set_fact:
    image_uri: !!null


Otherwise subsequent role executions might be affected by previous role executions.
Setting variables to !!null has side effects though. Suppose one default
(prefixed) variable is a Jinja2 template
that uses a non-prefixed variable, such as:
__image: '{{image_uri|urlsplit("path")|basename}}'
Later one dumps all variables with e.g.:
- name: List all known variables and facts
  debug:
    var: hostvars
Ansible will try to evaluate __image but fail because image_uri has been
set to !!null / none during the variable cleanup at the end of the role.
Hence any Jinja2 template in default (prefixed) variables must handle invalid
and !!null values properly to avoid those 'NoneType' object errors.
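A minimal sketch of such a null-tolerant template, assuming an empty result is acceptable whenever image_uri is unset, could use the default filter with its second argument set to true:

```yaml
# vars/Ubuntu.yml
# default("", true) substitutes an empty string when image_uri is undefined,
# null or empty, so urlsplit and basename always receive a string.
__image: '{{ image_uri | default("", true) | urlsplit("path") | basename }}'
```

With this variant, dumping hostvars after the cleanup should yield an empty __image instead of an error.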
One might be tempted to work around this by cleaning default variables as well:
- name: Load OS-specific default variables
  include_vars: '{{ ansible_facts.distribution }}.yml'

- name: Set image_uri variable to default value
  set_fact:
    image_uri: "{{ __image_uri }}"
  when: image_uri|default(None) == None

- name: Set image variable to default value
  set_fact:
    image: "{{ __image }}"
  when: image|default(None) == None

- name: Cleanup role default variables
  set_fact:
    __image_uri: !!null
    __image: !!null

- name: Do something with variables
  debug:
    msg: '{{ image_uri }} / {{ image }}'

- name: Cleanup role variables
  set_fact:
    image_uri: !!null
    image: !!null
Unfortunately this breaks subsequent role executions, because set_fact
has precedence over include_vars. Hence once e.g. __image has been set to
!!null using set_fact, a subsequent call to include_vars won't change that
nullified value back to the value defined in the vars/*.yml files.
Another drawback is that set_fact causes Ansible to immediately evaluate
and template variables:

Because of the nature of tasks, set_fact will produce ‘static’ values for
a variable. Unlike normal ‘lazy’ variables, the value gets evaluated and
templated on assignment.
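The difference between lazy and static values can be observed with a small standalone playbook. This is a sketch for illustration; the variable names base, lazy_value and static_value are made up.

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    base: one
    # Play variable: the template is re-evaluated on every use.
    lazy_value: "{{ base }}-suffix"
  tasks:
    - name: Freeze the current result of the template
      set_fact:
        static_value: "{{ base }}-suffix"

    - name: Change the variable both values depend on
      set_fact:
        base: two

    # lazy_value follows the new value of base (two-suffix), while
    # static_value keeps what was rendered on assignment (one-suffix).
    - name: Compare both values
      debug:
        msg: "lazy={{ lazy_value }} static={{ static_value }}"
```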

Approach using include_vars with os_vars dictionary and conditional set_fact

This approach is similar to the previous one. First, use include_vars to fetch
default variables from the vars/ folder. But instead of making them top-level variables,
assign them to a variable named os_vars. Then loop through all variables in
os_vars and set them as top-level variables if no variable with the same name
already exists, i.e. they have not been defined by the user:


vars/Ubuntu.yml, vars/CentOS.yml etc.:
image_uri: https://...


tasks/main.yml:
- name: Fetch OS dependent variables
  include_vars:
    file: '{{ item }}'
    name: 'os_vars'
  with_first_found:
    - files:
        - '{{ ansible_facts.distribution }}_{{ ansible_facts.distribution_major_version }}.yml'
        - '{{ ansible_facts.distribution }}.yml'
        - '{{ ansible_facts.os_family }}_{{ ansible_facts.distribution_major_version }}.yml'
        - '{{ ansible_facts.os_family }}.yml'
      skip: true

# We only override variables with our defaults if they have not been specified already.
# By default the varnames lookup finds all variable names containing the string, therefore
# we add ^ and $ to denote start and end of string, so it returns only exact matches.
- name: Set OS dependent variables, if not already defined by user  # noqa var-naming
  set_fact:
    '{{ item.key }}': '{{ item.value }}'
  when: "not lookup('varnames', '^' + item.key + '$')"
  loop: '{{ os_vars|dict2items }}'

- name: Do something with variables
  debug:
    var: image_uri


This approach is used in Ansible collection devsec.hardening,
e.g. refer to roles/ssh_hardening/tasks/hardening.yml.
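Why the ^ and $ anchors matter can be checked with a standalone playbook. This is a sketch; the second variable name image_uri_backup is hypothetical and exists only to provoke a substring match.

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    image_uri: 'https://...'
    image_uri_backup: 'https://...'
  tasks:
    # The pattern is treated as a regular expression searched anywhere in the
    # variable name, so this matches image_uri and image_uri_backup.
    - name: Unanchored match
      debug:
        msg: "{{ query('varnames', 'image_uri') }}"

    # Anchoring the pattern restricts the result to the exact name image_uri.
    - name: Anchored match
      debug:
        msg: "{{ query('varnames', '^image_uri$') }}"
```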
Downsides:
Using set_fact causes Ansible to immediately evaluate and template variables:

Because of the nature of tasks, set_fact will produce ‘static’ values for
a variable. Unlike normal ‘lazy’ variables, the value gets evaluated and
templated on assignment.

Hence variables defined in vars/, loaded with include_vars and set with
set_fact cannot include references to variables from the same file, because
Ansible does not lazy evaluate those variables. For example:

vars/Ubuntu.yml:
conf_dir: /etc/foo
conf_file: "{{ conf_dir }}/foo.conf"


This will fail with a 'conf_dir' is undefined error if conf_dir has
not been defined outside of vars/Ubuntu.yml before calling set_fact.
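One possible way around this, sketched here as an assumption and not taken from devsec.hardening, is to keep only self-contained values in vars/*.yml and to move derived values to defaults/main.yml, where they stay lazy:

```yaml
# vars/Ubuntu.yml
# Only values without references to other variables.
conf_dir: /etc/foo
---
# defaults/main.yml
# Evaluated lazily on first use, i.e. after conf_dir has been set by set_fact.
conf_file: "{{ conf_dir }}/foo.conf"
```

The trade-off is that conf_file itself can then no longer differ in structure between operating systems, only through the values it references.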
Approach using custom include_defaults plugin

Ansible Plugin include_defaults
has been developed by Daniele Varrazzo (@dvarrazzo). But:

Warning! unfortunately this implementation of include_defaults has an issue:
because it changes some data structures in-place it doesn't work when ansible
runs in parallel on many hosts, because the process forks and the modified
variables get lost.

The author does suggest some workarounds though.
Details
include_defaults has been proposed for inclusion
in Ansible but the pull request has been rejected for these reasons:


- we've discussed this topic on ansible-devel and cannot pin down a use case
  where this can't be modelled more idiomatically through other ansible-means
- we believe introducing extra syntax for this feature would add to complexity
  in learning the application that we would like to solve through more
  idiomatic means


Approach using lookup('file', ...)

This approach uses Lookup Plugins
and indirections in defaults/main.yml to load the OS-specific default
variables. Hence, role default variables have the intended precedence.
Put default variables into distinct files in the role's defaults/ folder:

defaults/Ubuntu.yml, defaults/CentOS.yml etc.:
image_uri: https://...


Use the file lookup
to load the OS-specific variables from disk and then convert this string to a dict
using the from_yaml filter:

tasks/main.yml:
- name: Load OS-specific default variables
  set_fact:
    role_default_vars: |
        {{ lookup('file', '../defaults/' + ansible_facts.distribution + '.yml')|from_yaml }}

- name: Do something with variable
  debug:
    var: image_uri


The role's defaults/main.yml then uses indirections to initialize
default variables from the role_default_vars dictionary:

defaults/main.yml:
image_uri: "{{ role_default_vars['image_uri'] }}"


Downsides:
One assumption that must hold is that the set of variables is
the same across all operating systems.
The lookup('file', ...) call does not render any Jinja2 Template,
hence e.g. image: '{{image_uri|urlsplit("path")|basename}}' will not evaluate
to a filename, instead it will contain the raw string
{{image_uri|urlsplit("path")|basename}}. The template lookup plugin
would render templates inside the defaults/*.yml files immediately during load.
But template evaluation is done before the from_yaml filter has been executed,
hence if a template inside defaults/*.yml uses any default variable that is
defined inside the same file, then Ansible may raise errors because this
variable has not yet been defined.
One has to force Ansible to render those templated default variables after the
indirection inside defaults/main.yml or later on their first use.
Unfortunately, Ansible does not provide any filters that render templates,
and a custom filter plugin does not work either: the template rendering
is done in class Templar,
but apparently no instance of this class is available inside the FilterModule classes.
An instance of Templar is available to the LookupModule
though. Hence a custom LookupModule class makes it possible to force Ansible into rendering
the templates, e.g. inside defaults/main.yml. An example lookup plugin might look
like this:


NAMESPACE/COLLECTION/plugins/lookup/template.py (irrelevant code stripped for the sake of brevity):
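# Imports the stripped-down class below relies on (module paths as of Ansible 2.9)
from ansible.errors import AnsibleError
from ansible.module_utils.six import string_types
from ansible.plugins.lookup import LookupBase
from ansible.utils.unsafe_proxy import AnsibleUnsafeBytes, AnsibleUnsafeText


class LookupModule(LookupBase):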

    def run(self, terms, variables=None, **kwargs):
        if variables is not None:
            self._templar.available_variables = variables

        ret = []
        for term in terms:

            if isinstance(term, AnsibleUnsafeBytes):
                term = super(AnsibleUnsafeBytes, term).decode().encode()
            elif isinstance(term, AnsibleUnsafeText):
                term = super(AnsibleUnsafeText, term).encode().decode()

            if not isinstance(term, string_types):
                raise AnsibleError('Invalid setting identifier, "%s" is not a string, its a %s' % (term, type(term)))

            ret.append(self._templar.template(term, fail_on_undefined=True))
        return ret


The eagle-eyed reader might wonder about super(AnsibleUnsafeText, term).encode().decode():
Ansible marks text (i.e. bytes and strings) that is assigned using set_fact
as unsafe.
In practice, Ansible wraps unsafe text in AnsibleUnsafe
objects. For example, all variables inside the role_default_vars dictionary
are marked unsafe. Unsafe variables are skipped during template rendering.
To remove the outer AnsibleUnsafe wrapper, strings are encoded to bytes
and decoded back to strings.
Side note:
Lookup plugins do provide an allow_unsafe=True argument, which skips this
unsafe wrapper, but this only applies to the current evaluation context:
Once task set_fact: { role_default_vars: "{{ lookup('file', ..., allow_unsafe=True)|from_yaml }}" }
has been completed, all entries inside the role_default_vars dictionary
are unsafe (AnsibleUnsafe) texts ultimately. One cannot simply call the
custom lookup plugin inside the same evaluation context for the same reason
it is not possible to use the template lookup plugin here.
Let's get back to how to use the custom template.py lookup plugin:

defaults/main.yml:
image_uri: "{{ lookup('NAMESPACE.COLLECTION.template', role_default_vars['image_uri']) }}"


First, variable image_uri is extracted from the role_default_vars dict,
then plugins/lookup/template.py removes the AnsibleUnsafe wrapper and
uses class Templar
to render the Jinja2 template. This works because Ansible delays these steps
until image_uri is actually used.
NOTE: It still has to be determined whether this approach causes side effects.
Approach using modified group variable precedence merge order

Change Ansible's group variable precedence rules with
configuration setting VARIABLE_PRECEDENCE
as explained by George Shuklin.
Downsides:
Ansible only allows changing the merge order of group variables.
It is not possible to completely override Ansible's variable
precedence rules.
Changing the group variable precedence rules might cause
conflicts with external Ansible content, i.e. third party
roles from Ansible Galaxy which most likely assume default
precedence rules.
Messing with variable precedence rules might cause confusion
for external developers and might be counterintuitive even for
developers working on the project.
Approach using OS-agnostic dictionaries in defaults/main.yml

Create OS-agnostic dictionaries in defaults/main.yml and assign suitable
values from those dictionaries to default variables using as keys e.g.
ansible_facts.distribution:

defaults/main.yml:
image_uri: |-
    {{
        {
            'CentOS': 'https://...',
            'Ubuntu': 'https://...'
        }[ansible_facts.distribution]
    }}


This approach is used in Ansible collection jm1.cloudy,
e.g. refer to roles/tftpd/defaults/main.yml.
Downsides:
As before, one assumption that must hold is that the set of
variables is the same across all operating systems.
With an increasing number of variables and operating systems, the
syntax might become hard to read.
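One way to keep the syntax readable as the number of entries grows might be to move the dictionary into a separate variable and index it afterwards. This is a sketch; the helper variable name __image_uri_by_distribution is hypothetical and not taken from jm1.cloudy.

```yaml
# defaults/main.yml
# Hypothetical helper variable holding one value per distribution.
__image_uri_by_distribution:
  CentOS: 'https://...'
  Ubuntu: 'https://...'

# Resolved lazily on first use, once facts have been gathered for the host.
image_uri: "{{ __image_uri_by_distribution[ansible_facts.distribution] }}"
```

Because both variables are plain role defaults, users can still override image_uri directly or replace single entries of the helper dictionary.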
Author

Jakob Meng
@jm1 (github, galaxy, web)