Shell Process Groups and the behaviour of CTRL+C and `kill` #cli

Shells create process groups when running commands. This is true for both synchronous and asynchronous commands.

Unix processes are not automatically supervisory processes (unlike Erlang processes). This extends to Unix shells as well.

However, features like pipelines and terminal TTY shortcuts (CTRL+C) obscure this fact.

Let's clear this up.

Imagine we have a Python program ./script.py which runs forever, but also listens for signals, prints them out and exits.

#!/usr/bin/env python3
import sys
import time
import signal

def sighandler(signum, frame):
    print('SIGNAL HANDLER called with SIGNAL:', signum)
    sys.exit(0)

signal.signal(signal.SIGINT, sighandler)
signal.signal(signal.SIGTERM, sighandler)
signal.signal(signal.SIGQUIT, sighandler)

print('READY')

while True:
    print('LIVE')
    time.sleep(1)

Then we have a shell script ./orchestrate.sh that "orchestrates" the Python program:

#!/usr/bin/env sh

echo $$

# it does not matter if this was asynchronous using `&` and wait
python3 ./script.py

If you run this script, you'll get something like:

> ./orchestrate.sh
16933
READY
LIVE
LIVE
^CSIGNAL HANDLER called with SIGNAL: 2

What has happened is that the CTRL+C terminal shortcut was handled by the kernel's TTY line discipline, which sends SIGINT to the foreground process group (here 16933). This interrupts both the shell and the Python program, which results in the entire process hierarchy stopping.

However, if you instead use kill -SIGINT 16933 from a different terminal, this sends SIGINT to just the shell process, and the shell will not propagate the SIGINT to its Python child process. ./orchestrate.sh exits, but ./script.py continues running.
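
This pitfall can be reproduced with a throwaway helper script (a sketch only; the /tmp paths and the sleep stand-in are illustrative): SIGTERM the orchestrating shell alone, and its child keeps running.

```shell
# Write a tiny stand-in orchestrator that backgrounds a long sleep
# and records the child's PID.
cat > /tmp/orchestrate_demo.sh <<'EOF'
#!/bin/sh
sleep 30 &
echo "$!" > /tmp/orchestrate_demo.child
wait
EOF

sh /tmp/orchestrate_demo.sh &
parent=$!
sleep 1                              # give the child time to start
child=$(cat /tmp/orchestrate_demo.child)

kill -TERM "$parent"                 # signals only the orchestrator PID
sleep 1

survived=0
if kill -0 "$child" 2>/dev/null; then
  survived=1
  echo "child $child survived its parent"
fi

kill "$child" 2>/dev/null            # clean up the stray child
rm -f /tmp/orchestrate_demo.sh /tmp/orchestrate_demo.child
```

The child `sleep` is reparented when the orchestrator dies, instead of dying with it.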

To ensure that you are killing the entire process hierarchy you need to instead use kill -SIGINT -16933. The - prefix sends the signal to every process in the process group with that ID, instead of to a single process.

Obviously our ./orchestrate.sh is not very robust. To make it robust, we must explicitly turn it into a supervisor.

To do this, we can use traps:

#!/usr/bin/env sh

echo $$

trap 'exit' INT QUIT TERM
trap 'kill -TERM 0' EXIT

python3 ./script.py

The above will trap SIGINT, SIGQUIT, SIGTERM and raise an EXIT condition.

The EXIT condition is then handled by kill -TERM 0, which sends SIGTERM to the current process group. Note that SIGTERM is the default kill signal, but it is good to be explicit. Make sure to use -TERM for POSIX compatibility.

The above is just an example, your orchestrator traps should be customised to your situation.

Remember that it is possible to receive signals multiple times, so the Python signal handler can be called multiple times. You must make sure that the handler is idempotent.
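
The same concern applies to shell traps, since an EXIT trap can fire after an INT or TERM trap has already started cleanup. A guard flag is a minimal sketch of the idea (the cleaned flag, the calls counter and the cleanup body are illustrative):

```shell
calls=0
cleaned=0

cleanup() {
  if [ "$cleaned" -eq 1 ]; then
    return 0             # already cleaned up: do nothing
  fi
  cleaned=1
  calls=$((calls + 1))   # count real cleanups, for demonstration
  echo "cleaning up"
}

cleanup
cleanup   # second invocation is a no-op
```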

Finally:

# sends SIGINT to just the PID
kill -INT <pid>
# sends SIGINT to the process group using the pid
kill -INT -<pid>

The only good way of creating shell script orchestrators is like this:

cleanup () {
  kill -TERM $(jobs -p) 2>/dev/null || true
}

trap 'exit' INT QUIT TERM
trap cleanup EXIT

comm &   # `comm` stands in for any long-running command

{
  comm &
  comm &
  wait
} &      # a nested group of jobs, itself backgrounded

wait
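
Substituting sleep for the comm placeholders gives a self-contained sketch of the same pattern that you can run directly:

```shell
cleanup () {
  # signal any background jobs still in the job table
  kill -TERM $(jobs -p) 2>/dev/null || true
}

trap 'exit' INT QUIT TERM
trap cleanup EXIT

sleep 1 &

{
  sleep 1 &
  sleep 1 &
  wait
} &

wait
echo "all jobs reaped"
done=1
```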

If you use set -m inside a script, job control is enabled and each background job (each pipeline) gets its own process group ID.

If you don't, background jobs inherit the shell's process group ID.
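
You can observe this inheritance with ps (assuming your ps supports the -o pgid= output format): without set -m, a background child reports the same PGID as the script itself.

```shell
sleep 2 &
child=$!

# strip the padding ps adds around the numeric PGID
self_pgid=$(ps -o pgid= -p $$ | tr -d ' ')
child_pgid=$(ps -o pgid= -p "$child" | tr -d ' ')

echo "script pgid: $self_pgid"
echo "child pgid:  $child_pgid"

wait "$child"
```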

A session ID (SID) is meant to associate processes with a "tty", or with a "session" that is separate from any tty, which is why daemons get their own SID.

Each terminal (pty, vty or tty) also gets its own SID. The SID groups processes at the "session" level, whether terminal-attached or not.

Process groups are intended to allow signalling a group of processes together. Unix has no first-class "process tree" object that can be signalled; process groups fill that role. A process tree still exists, but only implicitly, through PID-to-PPID relationships.
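
A sketch of recovering that implicit tree by walking the PID-to-PPID links upward (assuming a ps that supports -o ppid=; the depth cap just guards against surprises):

```shell
pid=$$
depth=0

# walk from this shell towards PID 1, printing each ancestor
while [ "$pid" -gt 1 ] && [ "$depth" -lt 32 ]; do
  echo "pid=$pid"
  pid=$(ps -o ppid= -p "$pid" 2>/dev/null | tr -d ' ')
  [ -n "$pid" ] || pid=0
  depth=$((depth + 1))
done
```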


CMCDragonkai commented Jun 27, 2018

Oh, it turns out that kill 0 is a bad idea when you're a script, because you inherit the parent's process group. (This means you could in fact kill the parent process as well, so you're not really simulating process tree semantics.)

Instead we have to use this:

cleanup () {
  kill -TERM $(jobs -p) 2>/dev/null || true
}

trap 'exit' INT QUIT TERM
trap cleanup EXIT

Unfortunately, relying on jobs -p is Bash-specific behaviour. There's nothing in zsh that I can find that allows you to get a simple list of the PIDs in the current job table.
