Skip to content

Instantly share code, notes, and snippets.

@mhitza
Last active September 5, 2023 10:59
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save mhitza/b6eb99314b45444e531ead74aa9ec406 to your computer and use it in GitHub Desktop.
Save mhitza/b6eb99314b45444e531ead74aa9ec406 to your computer and use it in GitHub Desktop.
Two lines of shell script in 4 hours of custom Ansible module development

In the last couple of days I've typed away at some Ansible automation, and one specific group of tasks I dealt with was the creation of a target symlink and parent directories. I wasn't able to do this in one go, and thus arrived at the following result:

- name: "absolute home {{ config_path }}"
  set_fact:
    absolute_home_config_path: "{{ user_home_dir + '/.' + config_path }}"

- name: "{{ absolute_home_config_path|dirname }} present"
  file:
    dest: "{{ absolute_home_config_path|dirname }}"
    state: directory
  become_user: "{{ username }}"

- name: "{{ config_name }}"
  file:
    src: "{{ playbook_dir }}/files/home/{{ config_path }}"
    dest: "{{ absolute_home_config_path }}"
    state: link
    force: true
    mode: "{{ config_mode if config_mode is defined else '0600' }}"

The shell script translation for the task above is:

mkdir --parents ~/.config/parent
ln --force --symbolic files/home/config/parent/file ~/.config/parent/file

While I don't mind the fact that Ansible is verbose - a fine trade-off for the extra properties it offers (such as idempotency) - when I run the playbook, the output feels rather cluttered for this specific project. It would be nice if the ansible.builtin.file module would be able to do this out of the box, but as things stand I started looking for alternatives.


First I combed through the documentation looking at all the various callback plugin available. As unintuitive as it may seem, callback plugins are used as output filtering tools. The one that came closest to what I've been trying to do was the selective plugin. To take advantage of this plugin, I needed to go through all my tasks and tag the ones I would like to get printed with 'print_action' For this automation, this was the single circumstance in which I would have liked to ignore output, and it didn't make sense to go through my playbooks and tag every other task.

Writing a module that combines those two tasks into a single unit is what I was going to attempt next.

While I've been writing Ansible playbooks for some years at this point, going as far back as 2013, I never needed to step outside the confines of the built-in primitives. I know some Python, Ansible is well established, plentiful of examples out there, how hard could it be? Right? Right?

I was expecting to write something quick based on available code out there.

Developing a custom Ansible module

The official documentation has an introductory page on "Developing modules".

After copying the boilerplate example Python code, which I've named deepsymlink.py, in a new ./library/ folder within my project, I was able to invoke this new module via:

ANSIBLE_LIBRARY=./library ansible -m deepsymlink -a "name=hello new=true" localhost

Arguments passed in via the -a flag are those found in the example code. At this point the single difference from the example code was the filename.

"Great, now I just have to find the file module code", thinking that I can just copy/paste the code paths I'm interested in, effectively merging the symlinking procedure with the one that recursively creates directories.

The state parameter of the file module switches the module behavior. With link as value it will create a symlink, and with directory it will create the desired directory and all its parents along the way.

Incidentally on the same day - after seeing it praised by random users on HN - I've signed up for a Kagi search trial account. While I tend to use a mix of Google and DuckDuckGo, I wanted to see how it fares in comparison for development sessions. This sidequest will be a good opportunity for review.

The builtin file module is a 987 line Python script, with ~220 lines of documentation strings (used in the generation of official docs). It became clear from the start that the patterns used within diverge visibly from the boilerplate template code.

After about half an hour of going through some cycles of code deletion, imports juggling, and reruns I didn't feel like I was making progress. In part because there were other operations at play, with some level of indirection, that I wasn't wrapping my head around.

"Can I write a module, that calls other modules?"

The answer to that questions is no!

Based on Konstantin Suvorov's StackOverflow answer, modules must be self-contained, as they run in isolation from one another on remote hosts. Couple things I didn't check at the time:

  • are these module restrictions mentioned anywhere in the official documentation?
  • if I'm running the tasks locally (connection: local), can I "cheat" and access the other modules anyway?

Developing a custom Ansible module plugin

Starting from the example posted in the StackOverflow answer, I had the following code.

from ansible.plugins.action import ActionBase

class ActionModule(ActionBase):
    def run(self, tmp=None, task_vars=None):
        result = super(ActionModule, self).run(tmp, task_vars)

        print(self._task.args.get('src'))
        print(self._task.args.get('dest'))
        pass

"This seems less boilerplate-y. I like it!". Although, when it came to testing, things didn't seem as straightforward. While skimming the documentation page, I didn't notice the fact that action plugins need to bear the name of a module which they "augment". This became clear when I looked over the bundled action plugins (I recognized module names in there).

Renamed my plugin to file.py and tried to run the file module hoping to see the extra outputs:

ANSIBLE_LIBRARY=./library ansible -m file -a "src=main.yaml dest=/tmp/main.yaml state=link" localhost
localhost | FAILED! => {
    "msg": "module (file) is missing interpreter line"
}

This error, based on reports I've seen on GitHub show up in a variety of circumstances when there's some name overlap between various files, playbooks, etc. For me, like another commenter, and which seems reasonable given the error message, was to add a Python shebang line to my script (#!/usr/bin/env python3). I'm not sure why this was necessary, as plugins are written in Python and none of the bundled plugins seem to have a shebang line.

While the action now progressed further, it wasn't outputting anything:

localhost | FAILED! => {
    "changed": false,
    "module_stderr": "",
    "module_stdout": "",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 0
}

Spent some time with extra verbosity flags, a Display class, etc. But in the end progress stalled here. Without any output or error message, I did not know how to proceed.

Took a break, and came back ready to reconsider my approach. Since Ansible modules can be written in any language, and bash is generally available on the systems I'm targeting with this automation…

When in doubt, "bash" it out

First search result links to a well written guide on Ansible bash modules. Had to take a refresher on how to use jq to create objects without excessive string interpolation, and wrote the following deepsymlink script:

#!/usr/bin/env bash

# $1 - file that contains module arguments key=value pairs on separate lines
source "$1" 
changed=0

alias json="jq --null-input '\$ARGS.named'"

if [ ! -f "$src" ]; then
  json --arg msg "src=$src not found" --argjson failed 'true'
  exit 1
fi

realsrc=$(realpath "$src")

parent=$(dirname "$dest")

if [ ! -d "$parent" ]; then
  mkdir -p "$parent"
  changed=1
fi

if [ ! -L "$dest" ]; then
  ln -f -s "$realsrc" "$dest"
  changed=1
else 
  target=$(readlink -f "$dest")
  if [ "$realsrc" != "$target" ]; then
    ln -f -s "$realsrc" "$dest"
    changed=1
  fi
fi

json --argjson changed "$changed"

Encountered a bash: json: command not found error. Forgot that aliases aren't expanded by default when shell scripts run in non-interactive environments. StackOverflow answer, I need to set shopt -s expand_aliases somewhere within my bash script before I use the json alias.

ANSIBLE_LIBRARY=./library ansible -m deepsymlink -a 'src=main.yaml dest=/tmp/a/b/cx/y/z/.main.yaml' localhost
localhost | CHANGED => {
    "changed": 1
}

ANSIBLE_LIBRARY=./library ansible -m deepsymlink -a 'src=main.yaml dest=/tmp/a/b/cx/y/z/.main.yaml' localhost
localhost | SUCCESS => {
    "changed": 0
}

Good enough, and as a compromise I was fine with having jq as a dependency on my system (or target systems) for the shell script to reliably work. I'll find a better solution next time

A couple of days after the bash solution, I realized that I could have patched the file module more easily, without having to write a module from scratch. Even if I would have thought of this before, I would still have started with the same approach, but would have had a quick fallback plan. The clean, from scratch, approach is desirable after all.

Copied the source of the file module (exact revision link), and patched it out to always attempt to create parent directory hierarchy + symlink.

680,684d679
<     changed = module.set_fs_attributes_if_different(file_args, changed, diff, expand=False)
<     changed |= update_timestamp_for_file(file_args['path'], mtime, atime, diff)
<     if recurse:
<         changed |= recursive_set_attributes(b_path, follow, file_args, mtime, atime)
<
941a937,940
>     module.params['state'] = 'absent' if module.params['state'] == 'absent' else 'link'
>     module.params['force'] = True
>     if module.params['state'] == 'absent':
>         module.params['src'] = None
970,980c969
<     if state == 'file':
<         result = ensure_file_attributes(path, follow, timestamps)
<     elif state == 'directory':
<         result = ensure_directory(path, follow, recurse, timestamps)
<     elif state == 'link':
<         result = ensure_symlink(path, src, follow, force, timestamps)
<     elif state == 'hard':
<         result = ensure_hardlink(path, src, follow, force, timestamps)
<     elif state == 'touch':
<         result = execute_touch(path, follow, timestamps)
<     elif state == 'absent':
---
>     if state == 'absent':
981a971,973
>     else:
>         result = ensure_directory(os.path.dirname(path), follow, recurse, timestamps)
>         result = ensure_symlink(path, src, follow, force, timestamps)

If you have file.py and this diff saved in a file named deepsymlink.diff, you can apply the patch using: patch file.py deepsymlink.diff

The patched version:

  • makes more operational sense than an action plugin, and I'm able to actually test.
  • is dependency free whereas the bash version depends on jq.

Just have ChatGPT do it for you

I've experimented with both Copilot and ChatGPT around their release, to get a feel on how well they work. ChatGPT was fun on release, when it had fewer guardrails. They are not tools that I use actively use, thus I often forget about them as an option.

While writing this article, I thought that I should give it a try, since this task is: small, self-contained, and I don't doubt that Ansible code from somewhere™ got mixed into it's training data.

Quoted my prompt, and within the section separators ChatGPT's verbatim response.

create a custom Ansible module, that given a src= and a dest= argument, creates a symlink to the absolute path of src, at the destination path of dest. If the parent directory of the destination doesn't exist, recursively create those directories (a la mkdir -p)


Creating a custom Ansible module involves writing Python code that interacts with Ansible's API. Below is an example of a custom Ansible module that fulfills the requirements you've described. This module creates a symbolic link at the specified destination path to the absolute path of the source.

Save the following Python code in a file named create_symlink.py:

#!/usr/bin/python

from ansible.module_utils.basic import AnsibleModule
import os

def create_symlink(src, dest):
    try:
        os.makedirs(os.path.dirname(dest), exist_ok=True)  # Recursively create parent directories if needed
        os.symlink(os.path.abspath(src), dest)
        return True, None
    except Exception as e:
        return False, str(e)

def main():
    module = AnsibleModule(
        argument_spec=dict(
            src=dict(type='str', required=True),
            dest=dict(type='str', required=True)
        )
    )

    src = module.params['src']
    dest = module.params['dest']

    success, error_msg = create_symlink(src, dest)

    if success:
        module.exit_json(changed=True, msg=f"Symlink created from {src} to {dest}")
    else:
        module.fail_json(msg=f"Failed to create symlink: {error_msg}")

if __name__ == '__main__':
    main()

Make the script executable:

chmod +x create_symlink.py

You can now use this script as a custom Ansible module. Here's an example playbook that demonstrates how to use the module:

---
- name: Create Symbolic Link
  hosts: localhost
  tasks:
    - name: Create symlink
      create_symlink:
        src: /path/to/source/file
        dest: /path/to/destination/symlink
      become: yes

Replace /path/to/source/file with the actual source file path and /path/to/destination/symlink with the desired destination path for the symlink.

Remember to put the create_symlink.py script in the same directory as your playbook or specify the full path to the script in the create_symlink module invocation.


Things to note for the response generated. While I'm sure it doesn't hurt, the executable flag is not necessary, and neither is the shebang. If the module is going to run as a commandline tool (an example later), I still think in practice the non Python3 explicit shebang can still cause some runtime errors. Some older, yet still active, distribution still have the bin path python binary pointed at Python2 for compatibility with older software.

Don't quote me on this, as a non-Python developer I haven't kept up to date with what distros do nowadays, but I have encountered similar issues when sharing with people the instruction to run pip commands, when I should have told them instead to use pip3 due to system defaults.

Another thing to note is that if II create the sample playbook as shown in the example, and I place create_symlink.py withn the same directory it won't get just picked up during execution.

ls create_symlink.py 
create_symlink.py

ansible-playbook play.yaml
ERROR! couldn't resolve module/action 'create_symlink'. This often indicates a misspelling, missing collection, or incorrect module path.

The error appears to be in '/home/dm/Workspace/personal/workstation/library/play.yaml': line 4, column 7, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  tasks:
    - name: Create symlink
      ^ here

And as ar as the suggestion to "specify the full path to the script in the create_symlink module invocation" I don't know enough to know if something like that is possible or its just commingling different data it has about tasks (for example when using command/shell modules; when other script paths might be involved).

One of those ChatGPT chat lines made me curious. Since the modules themselves have a main section, they must be callable programs, so I've ran the module a couple of times until I figured out what the expected input looks like. Here's how to invoke an Ansible module on the commandline as a program:

echo '{ "ANSIBLE_MODULE_ARGS": { "src": "main.yaml", "dest": "/tmp/a/b/c/main.yaml" }}' | ./create_symlink.py

Excluded in this post, but I've played around more with ChatGPT, prompting it to refine the code to account for possible errors, and restarted the prompt to have it present alternative implementations. Each of those had some minor gotcha, but overall I could have used them for a baseline "from scratch" implementation and tack on the features the patched code supports (eg: file attributes, selinux context, ownership).

I got something that works well enough for an automation that I'm going to use just locally.

I did expect to spend 1-2 hours off course, yet by the time I finished the bash script 4 hours have passed. I didn't keep track of the time it took to rewrite all my notes into something I could publish, or the extra time it took to act on plan B and the ChatGPT solution exploration.

I have PyCharm installed locally for some OSS software. Maybe If I would have started with it I could have benefited on my first attempt with intellisense for the Ansible library, or help me slice through the file module faster. It could have helped me debug the runtime issues I've had with the action plugin approach. Having never used a step debugger with Python, I wasn't inclined to detour once again.

In the end an interesting experience diving into the topic, better than other debugging sessions I've had in the past. It also was the first time I actively set out to take notes with the intent on publishing an article at the end of the development/debugging session. I think I might (no promise) make a habit out of this going forward, even if most of the time I probably won't be publishing my results/notes.

As for my Kagi search review, my post got larger than I'd like, and the review itself seems to continue to expand as well. Will leave it out for another time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment