=======
Modules
=======
This is an in-depth dive into understanding how Ansible makes use of modules.
It will be of most use to people working on the portions of the Core Ansible
Engine that execute a module. It may also be of interest to people writing
Ansible Modules. People simply wanting to use Ansible Modules will likely
want to read a different paper.
Types of Modules
================
Ansible supports several different types of modules in its code base. Some of
these are for backwards compatibility and others to enable flexibility.
Action Plugins
--------------
Action Plugins look like modules to end users who are writing playbooks, but
they're distinct entities for the purposes of this paper. Action Plugins
always execute on the controller. Sometimes they are able to do all of their
work there (for instance, the debug action plugin, which prints some text for
the user to see, or the assert action plugin, which can test whether several
values in a playbook satisfy certain criteria).
More often, Action Plugins set up some values and environment on the
controller and then invoke an actual module on the managed node that does
something with these values. An easy-to-understand example of this is the
template action plugin. The template action plugin takes values from the
user to construct a file in a temporary location on the controller, using
variables from the playbook environment. It then transfers the temporary
file to a temporary file on the remote system. After that, it invokes the
copy module, which operates on the remote system to move the file into its
final location, set file permissions, and so on.
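
To make the division of labor concrete, here is a minimal sketch of the
general shape an action plugin takes. Only ``ActionBase``, ``run``, and
``_execute_module`` come from Ansible's plugin API; the module name and
arguments below are hypothetical::

    from ansible.plugins.action import ActionBase

    class ActionModule(ActionBase):

        def run(self, tmp=None, task_vars=None):
            # Work that happens on the controller goes here: validating
            # arguments, rendering files, staging data, and so on.
            result = super(ActionModule, self).run(tmp, task_vars)

            # Then hand off to a real module that runs on the managed node.
            result.update(self._execute_module(
                module_name='copy',
                module_args=dict(src='/tmp/rendered', dest='/etc/motd'),
                task_vars=task_vars,
            ))
            return result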
New-style Modules
-----------------
All of the modules that ship with Ansible fall into this category.
New-style modules have the arguments to the module embedded inside of them in
some manner. Non-new-style modules have to copy a separate file over to the
managed node, which is less efficient, as it requires two over-the-wire
connections instead of only one.
Python
^^^^^^
New-style python modules use the ziploader framework for constructing modules.
All official modules (shipped with Ansible) use either this or the Powershell
module framework.
These modules use imports from :code:`ansible.module_utils` in order to pull in
a lot of boilerplate module code such as argument parsing, formatting of return
values as json, and various file operations.
.. note:: In Ansible up to 2.0 the official python modules used the module
   replacer framework. For module authors, ziploader is largely a superset of
   module replacer functionality so you usually do not need to know about one
   versus the other.
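
For example, a minimal new-style python module looks something like the
following sketch (the argument names here are hypothetical)::

    from ansible.module_utils.basic import AnsibleModule

    def main():
        module = AnsibleModule(
            argument_spec=dict(
                name=dict(required=True, type='str'),
                state=dict(default='present', choices=['present', 'absent']),
            ),
        )
        # A real module would do its work here using module.params.
        module.exit_json(changed=False, name=module.params['name'])

    if __name__ == '__main__':
        main()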
Powershell
^^^^^^^^^^
New-style powershell modules use the module replacer framework for constructing modules.
These modules get a library of powershell code embedded in them before being
sent to the managed node.
JSONARGS
^^^^^^^^
Scripts can arrange for an argument string to be placed within them by placing
the string ``<<INCLUDE_ANSIBLE_MODULE_JSON_ARGS>>`` somewhere inside of the
file. The module will typically set a variable to that value like this::

    json_arguments = """<<INCLUDE_ANSIBLE_MODULE_JSON_ARGS>>"""

Which is expanded as::

    json_arguments = """{"param1": "test's quotes", "param2": "\"To be or not to be\" - Hamlet"}"""
.. note:: Ansible outputs a json string with bare quotes. Double quotes are
   used to quote string values, double quotes inside of string values are
   backslash escaped, and single quotes may appear unescaped inside of
   a string value. To use JSONARGS, your scripting language must have a way
   to handle this type of string. The example uses python's triple quoted
   strings to do this. Other scripting languages may have a similar quote
   character that won't be confused by any quotes in the json, or they may
   allow you to define your own start-of-quote and end-of-quote characters.
   If the language doesn't give you any of these then you'll need to write
   a ``Non-native wants json`` or old-style module instead.
The module will typically parse the contents of json_arguments using a json
library and then use them as native variables throughout the rest of its code.
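
Putting it together, a complete JSONARGS-style module might look like this
minimal, hypothetical sketch::

    import json

    json_arguments = """<<INCLUDE_ANSIBLE_MODULE_JSON_ARGS>>"""

    params = json.loads(json_arguments)
    # A real module would act on params here.
    print(json.dumps({'changed': False, 'params': params}))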
Non-native wants json modules
-----------------------------
If a module has the string ``WANT_JSON`` in it anywhere then Ansible will
treat it as a non-native module that accepts a filename as its only command
line parameter. The filename is for a temporary file containing a json string
with the module's parameters. The module needs to open the file, read and
parse the parameters, operate on the data, and print its return data as a
json encoded dictionary to stdout before exiting.
These types of modules are self-contained entities. As of Ansible 2.1, Ansible
only modifies them to change a shebang line if present.
.. seealso:: Examples of Non-native modules written in ruby are in the
   `Ansible for Rubyists <https://github.com/ansible/ansible-for-rubyists>`_
   repository.
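
As an illustration, a minimal non-native wants-json module written in python
(any language that can read a file and print json would work) might look
like this hypothetical sketch::

    #!/usr/bin/env python
    # WANT_JSON
    import json
    import sys

    # Ansible passes the path to a temporary file of json parameters.
    with open(sys.argv[1]) as f:
        params = json.load(f)

    # A real module would act on params here.
    print(json.dumps({'changed': False, 'params': params}))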
Old-style Modules
-----------------
Old-style modules are similar to non-native wants json modules except that the
file that they take contains key=value pairs for their parameters instead of
json.
Ansible decides that a module is old-style when it doesn't have any of the
markers that would show that it is one of the other types.
.. note:: this isn't BabyJSON is it?
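
A minimal, hypothetical old-style module could parse its parameter file like
this (Ansible's own parsing is more thorough; this is only a sketch)::

    #!/usr/bin/env python
    import json
    import shlex
    import sys

    # The single argument is a file containing key=value pairs.
    with open(sys.argv[1]) as f:
        args_data = f.read()

    params = dict(
        field.split('=', 1) for field in shlex.split(args_data) if '=' in field
    )
    # A real module would act on params here.
    print(json.dumps({'changed': False, 'params': params}))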
How modules are executed
========================
When a user uses ansible or ansible-playbook, they specify a task to execute.
The task is usually the name of a module along with several parameters to be
passed to the module. Ansible takes these values and processes them in various
ways before they are finally executed on the remote machine.
executor/task_executor
----------------------
The TaskExecutor receives the module name and parameters that were parsed from
the playbook (or command line in the case of /usr/bin/ansible). It uses the
name to decide whether it's looking at a module or an action plugin. If it's
a module, it loads the ``normal`` action plugin and passes the name, variables,
and other information about the task and play to that Action Plugin for further
processing.
Normal Action Plugin
--------------------
The normal Action Plugin executes the module on the remote host. It is the
primary coordinator of much of the work to actually execute the module on the
managed machine.

* It takes care of creating a connection to the managed machine by
  instantiating a Connection class according to the inventory configuration
  for that host.
* It adds any internal ansible variables to the module's parameters (for
  instance, the ones that pass along no_log to the module).
* It takes care of creating any temporary files on the remote machine and
  cleans up afterwards.
* It does the actual work of pushing the module and module parameters to the
  remote host, although the module_common code described next does the work
  of deciding which format those will take.
* It handles any special cases regarding modules (for instance, various
  complications around Windows modules that need to have the same names as
  python modules so that internal calling of modules from other Action
  Plugins works.)

Much of this functionality comes from the ActionBase class, which lives in
plugins/action/__init__.py. It makes use of Connection and Shell objects to
do its work.
executor/module_common.py
-------------------------
Code in executor/module_common.py takes care of assembling the module to be
shipped to the managed node. The module is first read in, then examined to
determine its type. Powershell and JSONARGS modules are passed through module
replacer. New-style python modules are assembled by ziploader.
Non-native wants-json and old-style modules aren't touched by either of these
and pass through unchanged. After the assembling step, one final modification is
made to all modules that have a shebang line. We check whether the interpreter
in the shebang line has a specific path configured via an
ansible_$X_interpreter inventory variable. If it does we substitute that path
for the interpreter path given in the module. After this we return the
complete module data and the module type to the Normal Action which continues
execution of the module.
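
The shebang rewriting step amounts to something like the following sketch
(the function and variable names here are hypothetical, not Ansible's actual
code)::

    import os

    def rewrite_shebang(module_data, host_vars):
        # Swap the shebang interpreter for a host-configured one,
        # e.g. ansible_python_interpreter (simplified illustration).
        lines = module_data.split('\n')
        if not lines[0].startswith('#!'):
            return module_data
        interpreter = lines[0][2:].strip().split()[0]  # e.g. /usr/bin/python
        name = os.path.basename(interpreter)           # e.g. python
        configured = host_vars.get('ansible_%s_interpreter' % name)
        if configured:
            lines[0] = '#!%s' % configured
        return '\n'.join(lines)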
Next we'll go into some details of the two assembler frameworks.
module replacer
^^^^^^^^^^^^^^^
Module replacer is essentially a preprocessor (like the C Preprocessor for
those familiar with that language). It does straight substitutions of specific
substring patterns in the module file. There are two types of substitutions:

* Replacements that only happen in the module file. These are public
  replacement strings that modules can utilize to get helpful boilerplate or
  access to arguments.

  - :code:`from ansible.module_utils.MOD_LIB_NAME import *` is replaced with
    the contents of :file:`ansible/module_utils/MOD_LIB_NAME.py`. These
    should only be used with new-style python modules.
  - :code:`#<<INCLUDE_ANSIBLE_MODULE_COMMON>>` is equivalent to
    :code:`from ansible.module_utils.basic import *` and should also only
    apply to new-style python modules.
  - :code:`# POWERSHELL_COMMON` substitutes the contents of
    :file:`ansible/module_utils/powershell.ps1`. It should only be used with
    new-style Powershell modules.

* Replacements that are used by ansible module_utils code. These are internal
  replacement patterns. They may be used internally in the above public
  replacements but shouldn't be used directly by modules.

  - :code:`"<<ANSIBLE_VERSION>>"` is substituted with the ansible version. In
    a new-style python module, it's better to use ``from ansible import
    __version__`` and then use ``__version__`` instead.
  - :code:`"<<INCLUDE_ANSIBLE_MODULE_COMPLEX_ARGS>>"` is substituted with
    a string which is the python repr of the json encoded module parameters.
    Using repr on the json string makes it safe to embed in a python file.
    In new-style python modules, this will be passed in via an environment
    variable instead.
  - :code:`<<SELINUX_SPECIAL_FILESYSTEMS>>` substitutes a string which is
    a comma separated list of filesystems which have a file system dependent
    security context in selinux. In new-style python modules, this will be
    found by looking up ``SELINUX_SPECIAL_FS`` from the
    ``ANSIBLE_MODULE_CONSTANTS`` environment variable. See the ziploader
    documentation for details.
  - :code:`<<INCLUDE_ANSIBLE_MODULE_JSON_ARGS>>` substitutes the module
    parameters as a json string. Care must be taken to properly quote the
    string, as JSON data may contain quotes. JSON_ARGS is not substituted in
    new-style python modules, as they can get the module parameters via the
    environment variable.
  - The string :code:`syslog.LOG_USER` is replaced wherever it occurs with
    the value of ``syslog_facility`` from ansible.cfg or any
    ansible_syslog_facility inventory variable that applies to this host. In
    new-style python modules you can get the value of the ``syslog_facility``
    by looking up ``SYSLOG_FACILITY`` in the ``ANSIBLE_MODULE_CONSTANTS``
    environment variable. See the ziploader documentation for details.
.. attention:: Confirm that all of the things in the second category can be
   considered internal and not to be used in module code. We want to stop
   supporting them where a replacement exists (ie: everything except
   JSON_ARGS). If we cannot stop supporting them for backwards compatibility,
   confirm whether we can at least limit them to the module itself
   (module_utils code shipped with ansible will be ported to the ziploader
   constructs. Custom module_utils code is not supported).
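
In essence, module replacer performs a handful of string substitutions over
the module source. A simplified, hypothetical sketch of that pass (Ansible's
real implementation handles more cases)::

    import json

    def replace_module(module_source, module_args, ansible_version):
        with open('ansible/module_utils/basic.py') as f:
            basic_py = f.read()

        # Public replacement: splice in the module boilerplate.
        module_source = module_source.replace(
            '#<<INCLUDE_ANSIBLE_MODULE_COMMON>>', basic_py)

        # Internal replacements: the version string and the repr of the
        # json encoded module parameters.
        module_source = module_source.replace(
            '"<<ANSIBLE_VERSION>>"', '"%s"' % ansible_version)
        module_source = module_source.replace(
            '"<<INCLUDE_ANSIBLE_MODULE_COMPLEX_ARGS>>"',
            repr(json.dumps(module_args)))
        return module_source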
ziploader
^^^^^^^^^
Ziploader differs from module replacer in that it uses real python imports of
things in module_utils instead of merely preprocessing the module. It does
this by constructing a zipfile of the module, files in ansible/module_utils
that are imported by the module, and some boilerplate to pass in the constants.
The zipfile is then base64 encoded and wrapped in a small python script which
unzips the file on the managed node and then invokes python on the file. (We
have to wrap the zipfile in the python script so that pipelining will work.)
In ziploader, any imports of python modules from the ``ansible.module_utils``
package trigger inclusion of that python file into the zipfile. Instances of
:code:`#<<INCLUDE_ANSIBLE_MODULE_COMMON>>` in the module are turned into
:code:`from ansible.module_utils.basic import *` and
:file:`ansible/module_utils/basic.py` is then included in the zipfile. Files
that are included from module_utils are themselves scanned for imports of other
python modules from module_utils to be included in the zipfile as well.
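
Conceptually, the payload construction looks something like this sketch (the
function and argument names are hypothetical; Ansible's real wrapper script
does more)::

    import base64
    import io
    import zipfile

    def build_payload(module_name, module_source, module_util_sources):
        # Bundle the module plus the module_utils files it imports into
        # a base64-encoded zipfile (simplified illustration only).
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
            zf.writestr('ansible_module_%s.py' % module_name, module_source)
            for path, source in module_util_sources.items():
                # e.g. path == 'ansible/module_utils/basic.py'
                zf.writestr(path, source)
        # The encoded zip gets embedded in a small python wrapper script
        # that unzips and runs it on the managed node.
        return base64.b64encode(buf.getvalue())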
Passing args
~~~~~~~~~~~~
In module replacer, module arguments are turned into a json-ified string and
substituted into the combined module file. In ziploader, the json-ified string
is placed in the ``ANSIBLE_MODULE_ARGS`` environment variable. When
ansible.module_utils.basic is imported, it places this string in the global
variable ``ansible.module_utils.basic.MODULE_COMPLEX_ARGS`` and removes it from
the environment. Modules probably should not access this variable directly.
Instead, they should instantiate an AnsibleModule() and use
AnsibleModule.params to access the parsed version of the arguments.
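
For instance, a module retrieves its parsed arguments like this (the
argument name is hypothetical)::

    from ansible.module_utils.basic import AnsibleModule

    module = AnsibleModule(argument_spec=dict(path=dict(required=True)))

    # module.params holds the parsed arguments that were shipped over in
    # the ANSIBLE_MODULE_ARGS environment variable.
    path = module.params['path']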
Passing constants
~~~~~~~~~~~~~~~~~
Currently there are three constants passed from the controller to the modules:
ANSIBLE_VERSION, SELINUX_SPECIAL_FS and SYSLOG_FACILITY. In module replacer,
ANSIBLE_VERSION and SELINUX_SPECIAL_FS were substituted into the global variables
``ansible.module_utils.basic.ANSIBLE_VERSION`` and
``ansible.module_utils.basic.SELINUX_SPECIAL_FS``. ``SYSLOG_FACILITY`` didn't
get placed into a variable. Instead, any occurrences of the string
``syslog.LOG_USER`` were replaced with ``syslog.`` followed by the string
contained in SYSLOG_FACILITY.
The ansible version can now be used by a module by importing __version__ from
ansible::

    from ansible import __version__
    module.exit_json(msg='module invoked by ansible %s' % __version__)
For now ANSIBLE_VERSION is also available at its old location inside of
``ansible.module_utils.basic`` but that will eventually be removed.
SELINUX_SPECIAL_FS and SYSLOG_FACILITY have changed much more. Ziploader
passes these as another json-ified string inside of the
``ANSIBLE_MODULE_CONSTANTS`` environment variable. When
``ansible.module_utils.basic`` is imported it places this string in the global
variable ``ansible.module_utils.basic.MODULE_CONSTANTS`` and removes it from
the environment. The constants are parsed when an AnsibleModule is
instantiated. Modules shouldn't access any of those directly. Instead they
should instantiate an AnsibleModule() and use AnsibleModule.constants to access
the parsed version of these values.
Unlike the ANSIBLE_ARGS and ANSIBLE_VERSION where some efforts were made to
keep the old backwards compatible globals available, these two constants are
not available at their old names. This is a combination of the degree to which
these are internal to the needs of module_utils.basic and (in the case of
SYSLOG_FACILITY) how hacky and unsafe the previous implementation was.
Porting code from the module replacer method of getting SYSLOG_FACILITY to the
new one is a little more tricky than the other constants and args due to just
how hacky the old way was. Here's an example of using it in the new way::

    import syslog

    facility_name = module.constants.get('SYSLOG_FACILITY')
    facility = getattr(syslog, facility_name)
    syslog.openlog(str(module), 0, facility)
Special Considerations
----------------------
Pipelining
^^^^^^^^^^
Ansible can transfer a module to a remote machine in two ways. Either it can
write out the module to a temporary file on the remote host and then use
a second connection to the remote host to execute it with the interpreter that
the module needs or it can use what's known as pipelining to execute the module
by piping it into the remote interpreter's stdin. Pipelining only works with
modules written in python at this time, because Ansible only knows that python
supports this mode of operation. Supporting pipelining means that whatever
format the module payload takes before being sent over the wire must be
executable by python via stdin.
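
To illustrate the idea, pipelined execution is roughly equivalent to feeding
the assembled module into the interpreter over the existing connection.
A rough local sketch (the hostname and file name are hypothetical; Ansible's
connection plugins do the real work)::

    import subprocess

    with open('assembled_module.py', 'rb') as f:
        payload = f.read()

    # Roughly: run the remote interpreter once with the module on stdin,
    # instead of copying the module file over and executing it separately.
    proc = subprocess.Popen(['ssh', 'managed-node', '/usr/bin/python'],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    stdout, _ = proc.communicate(payload)
    print(stdout.decode('utf-8'))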