Salt Debugging Best Practices (SaltConf 18)

These notes should prove useful for those looking for tips on how to find and fix bugs, as well as those who are developing Salt and would like to improve/streamline the process.

NOTE: These notes are written from the perspective of a developer working in a Linux environment. Those on macOS may need to make some adjustments. Those on Windows may need to make several adjustments.

Additionally, you will see several uses of $PWD in the CLI examples below. It is expected that when you run these commands, you are doing so from the root of a git checkout of Salt. This will mount the repo into the container so that it can be used to run Salt.

What you will need

  • Docker
  • Git
  • Your text editor of choice

salt-docker

salt-docker is a tool which builds Docker images that have all the prerequisites to run Salt and its tests, and launches containers with a clone of Salt mounted into the container. Within the container, PATH and PYTHONPATH are set such that when you run Salt commands, you are running them against the mounted-in copy of the Salt codebase.

Review the README for salt-docker for help getting set up.

The images built by salt-docker can be used to set up reproducible test cases to share with others (or include in a bug report). After launching into a container, setting up some files under /srv/salt and/or /srv/pillar, installing needed packages, etc., you can then (from outside the container) run docker commit container_id user/image:tag to save the contents of the running container under a new image name. If you are unsure of the container_id, it is the hexadecimal string you see when you are in a salt-docker container:

(saltdev) root@45515ec019d1:/#

For this container, you can use 45515ec019d1 as the container_id when committing a new image.
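
For example, committing that container might look like this (the image name and tag here are just placeholders, as in the docker run example below):

docker commit 45515ec019d1 user/image:tag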

I like to use an image name of issues and then a tag number which references an associated GitHub issue number, where applicable. An example of this would be terminalmage/issues:12345. By naming it that way (with my Docker Hub username included), I can do a docker push and share the image with others. With this image pushed up to the Docker Hub, you can instruct someone to run the container like so:

docker run --rm -it -v $PWD:/testing user/image:tag salt-call state.apply test

This would download and launch the container, and run the states in /srv/salt/test.sls that were saved in the container when you ran docker commit.

Git Worktrees

If you have git 2.5 or newer, you should be using git worktrees. Normally, if you're working on code in one branch and need to stop and work on something else, you would have to stash your changes, create a new branch to do the other work, and then later come back to the original branch and apply the stash to continue working. Using worktrees, you instead have separate working copies of the repo in their own directories, but they all use the "main" checkout's .git directory to store their metadata.

In my workflow, I have the Salt repo cloned to ~/git/salt/main. I never write code in this directory. Whenever I have something to work on, I navigate to ~/git/salt/main, switch to the branch from which I wish to make changes, and create a worktree:

% git worktree add ../issue12345

This command does two things:

  1. Creates a new worktree at the specified path
  2. Creates a new branch issue12345 and checks it out in that directory

You can use -b branchname to specify the branch, and you can also specify a revision to use when checking out the branch (the default is HEAD). If you do not specify a branch, git will create one matching the basename of the path you specify for the worktree.
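
For example, to create a worktree with an explicitly named branch based on a specific start point (the branch name and start point below are just placeholders):

% git worktree add -b issue12345 ../issue12345 v2018.3.1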

To remove the worktree, just delete the directory and remove the branch. You can also then run git worktree prune -v to clean up the worktree metadata.

rm -rf /path/to/worktrees/issue12345
git worktree prune -v
git branch -d issue12345

If you have pushed this branch to GitHub (for example, to open a PR), and no longer need it, don't forget to clean up the old branch!

git push origin :issue12345

Git Bisect

Using a git bisect is a helpful way of discovering which commit introduced a given bug. Given two commits (one from before the bug appeared and one from after), a binary search is performed (with feedback from the user at each step) to find the commit where the bug first occurred.

Before you start a bisect, you need to find a commit where the bug does not exist (i.e. a "good" commit), as well as one where the bug does exist (i.e. a "bad" commit). It is important that the "good" commit is a direct ancestor of the "bad" commit.

The easiest case for finding a good commit is when you know the bug does not exist in one Salt release, but does in another. In those cases you can simply use the tags for those releases as your good and bad commits. However, when you don't have specific information about when the bug appeared, sometimes the easiest way to find a "good" commit is to do a hard reset to several commits before HEAD (e.g. git reset --hard HEAD~20) and keep trying until the bug is no longer present.

Once you have the good and bad commits, it's time to start the bisect. To do so, run git bisect start. You can then specify the good and bad commits:

git bisect good abcdef1
git bisect bad 012345a

Tags and other refs can also be used:

git bisect good v2018.3.1
git bisect bad HEAD

Once both a good and a bad commit have been specified, git will point the repo at the commit which is at the midpoint between them. From here, you can run the code to see if the bug exists. If it does, run git bisect bad; if it does not, run git bisect good. Either way, this will point the repo at another commit, and you can repeat the process: run the code, then run either git bisect bad or git bisect good. After at most about a dozen steps, the bisect will be complete and git will tell you which commit was the first to contain the bug.

Once you are done, or at any point during the bisect, you can run git bisect reset and git will point HEAD at the location it was at before you ran git bisect start.

Using salt-docker is great for git bisects, as you can test the code from a fresh copy of the image for each step of the bisect. As described above, you can set up a container with everything in place to reproduce a bug, and then use docker commit to save that setup to a new image. You can then use that image to run the code for each step of the bisect:

docker run --rm -it -v $PWD:/testing user/image:tag salt-call state.apply foo

You could also just stay launched into a salt-docker container and run salt-call state.apply foo for each step of the bisect.

Automated Git Bisects using salt-docker Docker Images

While git bisects can be run manually, they can also be automated using git bisect run <command>. The command will be repeated for each step of the bisect, and the exit status of the command will be used to mark the commit being tested as good/bad.

This requires a little extra setup at the beginning, but it allows the entire bisect to run without any interaction.

You can write a shell script which runs Salt, then does some sort of check to see if the bug is present. For example, in the below script, imagine a bug where the sl package fails to install and the state fails. The script below will attempt to run a single state, and then check the output for a True result:

#!/bin/bash

# Ensure that the state output goes to the CLI so we can see the results of
# each step as it runs.
salt-call state.single pkg.installed name=sl | tee /tmp/out

# Look for a True result in the state's output
fgrep -q "Result: True" /tmp/out && exit 0 || exit 1

It's important here that your script returns 0 when the bug is not present, and nonzero when it is. This is because an automated git bisect will use the return code of the command you give it to determine whether the commit is "good" or "bad".
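
One additional wrinkle worth knowing (this is standard git bisect run behavior rather than anything specific to Salt): an exit status of 125 tells git to skip the current commit as untestable, rather than marking it good or bad. If some commits in your range can't even run the test, you could, for example, add a check like the following to the script before the final result check:

# Hypothetical extra check: if salt-call died with a traceback, this commit
# can't be tested, so tell 'git bisect run' to skip it
fgrep -q "Traceback" /tmp/out && exit 125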

Save your script as /test.sh, and don't forget to give it execute privileges (chmod +x /test.sh), then docker commit your container to save it as an image. You can then use this image to run an automated git bisect:

git bisect run docker run --rm -it -v $PWD:/testing user/image:tag /test.sh

For each step of the bisect, git will check out a commit, then run the docker run command it was given. If the shell script you wrote returns 0, it marks the commit as "good", otherwise it marks it as "bad".

Don't forget, you still need to start the bisect and tell git your known "good" and "bad" commits, before you use git bisect run to start automatically bisecting. Otherwise, git won't know the correct range of commits to search.

git bisect start
git bisect good abcdef1
git bisect bad 012345a
git bisect run docker run --rm -it -v $PWD:/testing user/image:tag /test.sh

Consider the case where what you're testing takes a minute or two to run. Waiting for each step to complete, and then manually marking the step as good or bad, could take a while and keep you from getting other things done. But with a little bit of extra setup, you can let git do the rest of the work for you.

Troubleshooting States / Execution Modules (i.e. Stuff That Runs on a Minion)

Use Masterless

When testing something that runs on the minion, testing in masterless mode offers a couple benefits:

  • No need to run a master or exchange keys, so it's much easier to set up your test case

  • Runs in the foreground, making debuggers like pdb/pudb easy to use

To run in masterless mode, you would use salt-call instead of salt. In addition, you must do one of two things:

  • Add --local to the salt-call command
  • Add file_client: local to /etc/salt/minion

Any additional configuration (pillar, fileserver, etc.) must also be done in /etc/salt/minion (or within /etc/salt/minion.d/somefile.conf) when running masterless.
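
For example, a minimal masterless configuration with local file and pillar roots might be dropped into place like this (the filename masterless.conf is just a placeholder; file_client, file_roots, and pillar_roots are standard minion config options):

cat > /etc/salt/minion.d/masterless.conf <<'EOF'
file_client: local
file_roots:
  base:
    - /srv/salt
pillar_roots:
  base:
    - /srv/pillar
EOF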

The salt-docker project pre-configures images using file_client: local, so salt-call commands are masterless by default in those images.

$ salt-docker centos7 salt-call pkg.version bash
local:
    4.2.46-29.el7_4

However, often it can be better to first launch into a shell in the container, so that you can run multiple commands before the container exits.

$ salt-docker centos7
[root@60948f923223 /]# salt-call pkg.version zsh
local:
[root@60948f923223 /]# salt-call pkg.install zsh
local:
    ----------
    zsh:
        ----------
        new:
            5.0.2-28.el7
        old:
[root@60948f923223 /]# salt-call pkg.version zsh
local:
    5.0.2-28.el7

pudb

pudb is a console-based debugger that is a user-friendly alternative to the pdb debugger in the Python stdlib.

To launch it, simply add the following line where you want to launch the debugger:

import pudb; pu.db

When you run the function being tested, the debugger will start once execution reaches that line of code, and you can use it to step through line-by-line.

pudb is easiest to use when you are running salt-call, but it has a remote debugging component which can be used to test the master and other processes which do not run in the foreground. More on this later.

pdb

Personally, I am a much bigger fan of pudb, but pdb has the benefit of being part of the Python standard library. Launching it is similar to pudb:

import pdb; pdb.set_trace()

From here, you can do pretty much all of what pudb can do, the difference being that you don't get a persistent view of the code as you step through. The last command entered at the (pdb) prompt will be repeated if you hit Enter without typing another command, so this can be used to keep stepping forward. If you use the l or list command, it will show you a few lines before and after your current position, and repeating the command (as long as you haven't advanced by stepping forward) will show the next several lines. This lets you run l and then hit Enter a few times to get a picture of the next 20-30 lines of code.

Troubleshooting the Master Using Remote PUDB

Launching a remote pudb session is slightly different from opening pudb in the foreground. Since you will be using telnet to connect to the session, you must tell it what the screen dimensions are, so that pudb knows how large a window to draw:

from pudb.remote import set_trace
set_trace(term_size=(80, 24))

For best results, you should use a fullscreen terminal, and get the number of columns and lines to pass to set_trace():

% tput cols; tput lines
174
40

By default, remote pudb will listen only on localhost:6899. To connect to remote pudb on a Docker container, you should also pass the host parameter to set_trace(). The port can also be specified using the port parameter. For example:

from pudb.remote import set_trace
set_trace(term_size=(174, 40), host='0.0.0.0', port=9999)

When execution reaches the call to set_trace(), pudb will (if possible) write a message to the console telling you the port on which to connect. You can then telnet to the container's IP on that port to connect to the pudb session.
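
For example, assuming the container ID and port used earlier in this document, and that the container is on the default bridge network, connecting might look like this:

telnet "$(docker inspect --format '{{ .NetworkSettings.IPAddress }}' 45515ec019d1)" 9999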

Some caveats to keep in mind when using remote pudb:

  • If multiple processes/threads hit the code path which launches pudb, then pudb will start up separate debuggers for each, and will count up from the initial port to select a listening port

  • The telnet interface is a little finicky. If your goal is to write a script or something to check for an open port and then connect to it, the act of probing for the open port (using nmap, netcat, etc.) will start the pudb session and immediately end it, and by the time you try to connect the port will already be closed and the session over. Best to just loop trying to connect to telnet every N seconds and break from the loop if successful. I wrote a few shell functions to work with debugging using Docker containers, which I've shared alongside this document.

Using salt-docker to Assist in Developing Modules

If you're doing development on existing Salt code, or code that you plan to submit upstream, then you can just edit files inside the git checkout you've mounted into the salt-docker container (i.e. within salt/modules/, salt/states/, etc.).

However, if you want to develop custom modules that you only plan to use internally, you can separately mount the directory where these custom modules reside as another volume. For example:

$ salt-docker --mount /path/to/custom/mods /var/cache/salt/minion/extmods centos7

This would mount /path/to/custom/mods into the location where custom modules would normally be synced to (using one of the saltutil.sync_* functions). Note however that in this case, the module would need to be in a subdirectory of /path/to/custom/mods (i.e. /path/to/custom/mods/states for states, /path/to/custom/mods/modules for execution modules, etc.). If you know that you are only developing an execution module, you could instead mount /path/to/custom/mods to /var/cache/salt/minion/extmods/modules.
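
For example, assuming the same --mount syntax shown above, mounting a directory of execution modules directly into the extmods modules subdirectory might look like this:

$ salt-docker --mount /path/to/custom/mods /var/cache/salt/minion/extmods/modules centos7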

If developing custom types that run on the master (e.g. runners), then you would of course be mounting /path/to/custom/mods to the extmods dir in the master cachedir (i.e. /var/cache/salt/master/extmods).

Running Tests

Whether troubleshooting a failing test, or attempting to run a test you are writing, these images are good ways of easily running the test suite against the code in the repository you've mounted into the container.

Note that the upstream documentation recommends running tests using nox. However, nox attempts to set up a virtualenv and installs the test deps into it, i.e. things that salt-docker already does. For that reason, you should simply be able to run pytest directly.

First, launch into a container:

salt-docker centos7

This will get you a shell in that image. From here you can run pytest on a test file directly. Note that salt-docker mounts the salt codebase at /testing, so the path to the test file will be /testing/ followed by the path to the test file, relative to the root of the git repo:

py.test -vvv /testing/tests/pytests/unit/test_fileclient.py

You can also run on entire directories full of test modules.
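
For example, using the same path layout as above, this would run everything under the unit test directory:

py.test -vvv /testing/tests/pytests/unit/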

To run a smaller subset of tests, you can identify the tests you wish to run using filename::funcname or filename::classname::funcname, for example:

py.test -vvv /testing/tests/pytests/unit/test_fileclient.py::test_fsclient_master_no_fs_update

Running a Debugger Within the Test Suite

Debuggers can be used in the test suite. salt-docker has pudb pre-installed, making it a great option.

For unit tests, just add import pudb; pu.db wherever you want to launch the debugger, and make sure that you add --capture=no to your command when running pytest (otherwise pudb won't work).
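
For example, re-using the test file from the earlier example:

py.test -vvv --capture=no /testing/tests/pytests/unit/test_fileclient.py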

For integration tests, you will need to use the remote pudb procedure to launch the debugger. However, when running integration tests, the helper functions that run states/functions often invoke salt itself, so your set_trace may need to be placed in the code being tested rather than in the test module in order to get the debugger to step through the code being tested.

There are some issues with running pudb (and to a smaller extent, pdb) within unit tests where functions such as os.path.exists(), os.path.islink(), os.path.isfile(), or os.path.isdir() are mocked. This is because the mocking affects the debuggers as well, so any references to these functions within pudb or pdb's source code could result in an error due to the outcome of those functions being mocked. Thus, when writing tests which mock these functions, the best approach is to use a MagicMock with a side_effect rather than a return_value. For example:

import os
from tests.support.mock import MagicMock, patch, DEFAULT

# Return False only for the path being tested; any other path falls back to
# the mock's default behavior (DEFAULT), so the debugger's own filesystem
# checks aren't all forced to return False.
isfile_mock = MagicMock(side_effect=lambda x: False if x == name else DEFAULT)
with patch.object(os.path, 'isfile', isfile_mock):
    assert somemod.somefunc(name)

The mock defined above will cause os.path.isfile() to return False if the path matches whatever path is defined by the name variable; for any other path, returning DEFAULT from the side_effect means the mock falls back to its normal return value instead of unconditionally returning False. How you define your mocks will depend on the code being tested, and it may not always be possible to know precisely which path(s) will need to have their results mocked. But taking care when crafting mocks involving the functions described above from os.path will make pudb/pdb run smoother in the event that it becomes necessary to use a debugger to step through the code being tested.

Miscellaneous Tips

  • When using salt-docker, most of the time I find myself just working in a bash shell. In these cases, to start the master/minion daemons you can use -d (e.g. salt-master -d or salt-minion -d). If you want to stop the daemons, use pkill -f salt-master (or pkill -f salt-minion, or just pkill -f salt). The -f flag tells pkill to match against the full command line, so it will kill any process whose command line contains the given string.

  • It can also be helpful to run the master in the foreground with debug output (e.g. salt-master -l debug). But this means that you lose your shell, because it will be taken up by salt running in the foreground. However, this is easily worked around. Simply get the container_id before you start the daemon (remember, it's in the prompt):

    (saltdev) root@45515ec019d1:/#
    

    You can then run docker exec -it 45515ec019d1 bash, and you will have a new shell in that same container.

# Feel free to add these to your shell RC file

# Run a one-off container with the current directory mounted at /testing
function drun {
    docker run --rm -it -v "$PWD":/testing "$@"
}

# Launch a detached container running systemd as PID 1
function drun-systemd {
    local image=$1
    test -n "$2" && local container_name=$2 || local container_name="$image-systemd"
    if test -z "$image"; then
        echo "Missing image name!" 1>&2
        return 1
    fi
    docker run --detach --rm --name $container_name --hostname $container_name \
        --cap-add SYS_ADMIN -v $PWD:/testing -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
        $image /usr/lib/systemd/systemd
}

# Print a container's IP address (optionally from a specific network)
function dgetip {
    local container=$1
    local network=$2
    if test -z "$container"; then
        echo "Missing container name!" 1>&2
        return 1
    fi
    local cfgpath
    test -n "$network" && cfgpath=".NetworkSettings.Networks.${network}.IPAddress" || cfgpath=".NetworkSettings.IPAddress"
    echo $(docker inspect --format "{{ $cfgpath }}" $container 2>/dev/null)
}

# SSH into a container as root, skipping host key checks
function dssh {
    local container=$1
    if test -z "$container"; then
        echo "Missing container name!" 1>&2
        return 1
    fi
    local network=$2
    ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null "root@$(dgetip $container $network)"
    test -n "$TMUX" && tmux set-window-option automatic-rename
}

# Return 0 if the given TCP port is open on the given host
function dport_open {
    local host=$1
    local port=$2
    test -z "$port" && return 1
    nmap -p "$port" "$host" 2>/dev/null | egrep -q "$port/tcp +open"
    return $?
}

# Keep retrying a telnet connection to a container (default port 9999, i.e.
# the remote pudb port used in the examples above) until it succeeds
function dtelnet {
    local container=$1
    local port=$2
    if test -z "$container"; then
        echo "Missing container name!" 1>&2
        return 1
    fi
    test -z "$port" && port=9999
    local ip=$(dgetip $container)
    if test -z "$ip"; then
        echo "Failed to get IP for container '$container'" 1>&2
        return 1
    fi
    while [ 1 ]; do
        telnet $ip $port 2>/dev/null && break
        echo "Waiting for port $port to open up on $container ($ip)..."
        sleep 1
    done
}

# List all image tags whose history includes the given image ID
function dchildren () {
    local image_id=$1
    if test -z "$image_id"; then
        echo "Missing image ID!" 1>&2
        return 1
    fi
    local image
    local ret
    for image in $(docker images -q); do
        docker history -q $image | fgrep -q $image_id || continue
        for tag in $(docker inspect --format="{{.RepoTags}}" $image | cut -f2 -d'[' | cut -f1 -d']'); do
            ret="$ret\n$tag"
        done
    done
    echo "$ret" | egrep -v '^$' | sort -u
}