Sleep is an executable that we can use to simulate a blocking process for the purposes of demonstrating how to handle child processes.
Here's our base case.
The execution pattern is going to be: zsh (interactive) -> bash (./parent.sh) -> bash (./child.sh) -> sleep. Because both Bash processes perform an exec without a fork, the bash (./parent.sh) process is replaced by bash (./child.sh), which is itself replaced by sleep. This means there are no child processes to manage other than the shell's immediate child, sleep. Upon sending SIGINT, the sleep process is terminated and there is no resource leak: no orphaned processes and no zombie processes.
./parent.sh:
#!/usr/bin/env bash
exec ./child.sh
./child.sh:
#!/usr/bin/env bash
exec sleep 1000
> alias pss='ps -A --format comm,pid,ppid,pgid,sid' # just aliasing the command to use for later
> ./parent.sh &
[1] 4254
> pss | grep 4254 # acquire the sleep process
sleep 4254 4014 4254 4014
> pss | grep 4014 # 4014 is the parent ID and session ID, which would be the ZSH shell that launched the program
zsh 4014 4008 4014 4014
sleep 4254 4014 4254 4014
ps 4378 4014 4378 4014
grep 4379 4014 4378 4014
Two important things to consider here. Firstly, the final process hierarchy is:
            zsh (4014)
          /     |      \
         /      |       \
sleep (4254) ps (4378) grep (4379)
Secondly, the PGID of sleep is 4254, which is the same as the PID of sleep. We also notice that the PGIDs of ps and grep are both equal to the PID of ps. It turns out that a process group, with respect to shell job control, is simply the set of processes launched together as one job, whether that is a single command or several commands joined by pipes. In the case of sleep, it stands alone. But ps | grep are bundled together, with the first process in the pipeline acting as the group leader. So process groups don't tell us which shell the processes belong to; for that we need the session ID, which here is the PID of the zsh process. Note that a shell's SID doesn't necessarily have to equal its PID, although it often does, because the shell attached to a terminal is usually the session leader.
Note that ZSH's job control table will not show sleep; it will only show ./parent.sh. The ps table, on the other hand, will only show sleep.
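The reason the job table and ps disagree on the name is that exec keeps the PID and only swaps the program. A minimal sketch of that, assuming Linux (it reads /proc/PID/comm) and bash; the inline bash -c command is a stand-in for ./child.sh, not the original script:

```shell
#!/usr/bin/env bash
# Sketch: exec without fork keeps the PID but replaces the program image.
# Assumes Linux (/proc) for reading the current command name.
pidfile=$(mktemp)

# Stand-in for ./child.sh: record our PID, then replace ourselves with sleep.
bash -c 'echo "$$" > "$1"; exec sleep 5' _ "$pidfile" &
bg_pid=$!

# Wait (up to ~2s) until the exec has happened.
for i in 1 2 3 4 5 6 7 8 9 10; do
  [ "$(cat "/proc/$bg_pid/comm" 2>/dev/null)" = "sleep" ] && break
  sleep 0.2
done

shell_pid=$(cat "$pidfile")              # PID the shell saw before exec
comm_now=$(cat "/proc/$bg_pid/comm")     # program now running under that PID
echo "shell PID $shell_pid, background PID $bg_pid, now running: $comm_now"

kill -TERM "$bg_pid"
status=0
wait "$bg_pid" || status=$?
rm -f "$pidfile"
echo "exit status: $status"
```

The PID recorded by the shell and the PID reported by $! should be identical, while the command name has changed from bash to sleep.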
Now let's try a situation where there is fork + exec. This time the parent will fork and exec to launch ./child.sh as a child process, and ./child.sh will be replaced with sleep. We should expect a process hierarchy of zsh parenting bash, which parents sleep.
./parent2.sh:
#!/usr/bin/env bash
./child.sh
> ./parent2.sh &
[1] 4421
> pss | grep 4421
bash 4421 4014 4421 4014
sleep 4422 4421 4421 4014
> kill -TERM %1
[1] + terminated ./parent2.sh
> pss | grep 4421
> pss | grep 4422
Here we see that we do indeed get 2 processes, bash and sleep, which share the same PGID and SID but have different PPIDs. The process hierarchy looks like this:
zsh (4014)
|
bash (4421)
|
sleep (4422)
Running kill through ZSH's job control feature ends up killing the entire job, which kills the entire tree. There are no orphans, nor are there any zombies. Is ZSH somehow propagating the SIGTERM signal to all processes in the group? Or is it the ./parent2.sh bash process that is propagating SIGTERM to its child?
However, what if instead of going through ZSH's job control feature, we directly killed the ./parent2.sh bash process? What would happen then? Would we get an orphaned process?
> ./parent2.sh &
[1] 4516
> pss | grep 4516
bash 4516 4014 4516 4014
sleep 4517 4516 4516 4014
> kill -TERM 4516
[1] + terminated ./parent2.sh
> pss | grep 4516
sleep 4517 1 4516 4014
Now, instead of killing the entire process tree, we're left with an orphaned process. The ./parent2.sh bash process is gone, but the sleep process has become an orphan, and its PPID is now 1, which is PID 1, init. PID 1 has inherited the orphan and will reap it when it finishes execution. What hasn't changed is its PGID and SID. Its PGID is still 4516, and at the moment it is the only process in that process group. At the same time, because the SID hasn't changed, we can still tell which shell or "session" an orphaned process was launched from. I usually consider this a resource leak. But it is still a technique people use to launch daemons from the foreground. Now when you exit from the shell, the shell will not send SIGHUP to the daemon, leaving it to run potentially forever.
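The re-parenting can be reproduced in a script. This is a minimal sketch, assuming Linux (/proc) and bash; the generated child.sh and parent.sh are stand-ins for the scripts in this post, and proc_field is a hypothetical helper. It only checks that the PPID changes, because the new parent may be PID 1 or some other reaper depending on the environment.

```shell
#!/usr/bin/env bash
# Sketch: kill only the parent and watch the child get re-parented.
# Assumes Linux (/proc) and bash; script names and proc_field are hypothetical.

# Print field N of /proc/PID/stat, counted after the "(comm)" column:
# 1 = state, 2 = PPID, 3 = PGID, 4 = SID.
proc_field() {
  local stat
  stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
  echo "${stat##*) }" | cut -d' ' -f"$2"
}

dir=$(mktemp -d)
# Stand-in for ./child.sh: record our PID, then exec into sleep.
cat > "$dir/child.sh" <<'EOF'
echo "$$" > "$(dirname "$0")/child.pid"
exec sleep 30
EOF
# Stand-in for ./parent2.sh: run the child synchronously (fork + exec).
cat > "$dir/parent.sh" <<'EOF'
bash "$(dirname "$0")/child.sh"
exit "$?"
EOF

bash "$dir/parent.sh" 2>/dev/null &
parent_pid=$!
for i in 1 2 3 4 5 6 7 8 9 10; do       # wait up to ~2s for the child to start
  [ -s "$dir/child.pid" ] && break
  sleep 0.2
done
child_pid=$(cat "$dir/child.pid")
ppid_before=$(proc_field "$child_pid" 2)

kill -TERM "$parent_pid"                 # signal the parent only
parent_status=0
wait "$parent_pid" || parent_status=$?
ppid_after=$(proc_field "$child_pid" 2)  # child is still there, with a new parent
echo "child $child_pid: PPID $ppid_before -> $ppid_after (parent exited $parent_status)"

kill -TERM "$child_pid"                  # clean up the orphan ourselves
rm -rf "$dir"
```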
Without a proper service wrapper like systemd or others, this fork, exec and disown pattern is the main way to run daemons. Of course there are easier ways of doing this than running a parent + child process and directly killing the parent, such as using disown, nohup, or ZSH's &| operator.
Note that by default ZSH will send SIGHUP to all of its running jobs, foregrounded or backgrounded, when it exits. Only processes launched with nohup or disowned will survive ZSH closing. This however doesn't seem to happen in Cygwin ZSH, so that seems like a bug. Also, for Bash, you need to enable the huponexit option.
So it turns out that it cannot have been the ./parent2.sh bash process that propagated the SIGTERM earlier. It must have been ZSH. Sending a signal to a job such as %1 signals the job's whole process group, and it appears that a PGID not only encapsulates commands that are joined via pipes, but parent-child process trees too.
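Signalling a whole process group does not need job control syntax: kill accepts a negative PID, which addresses every process in that group. The following is a minimal sketch, assuming Linux (/proc) and bash; the generated scripts are stand-ins for ./parent2.sh and ./child.sh, and proc_field and is_alive are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Sketch: signal a whole process group by passing a negative PID to kill.
# Assumes Linux (/proc) and bash; script and helper names are hypothetical.
set -m   # job control: each background job becomes its own process group

# Print field N of /proc/PID/stat, counted after the "(comm)" column:
# 1 = state, 2 = PPID, 3 = PGID, 4 = SID.
proc_field() {
  local stat
  stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
  echo "${stat##*) }" | cut -d' ' -f"$2"
}
# True if the PID exists and is not a zombie waiting to be reaped.
is_alive() {
  local state
  state=$(proc_field "$1" 1) || return 1
  [ "$state" != "Z" ]
}

dir=$(mktemp -d)
cat > "$dir/child.sh" <<'EOF'
echo "$$" > "$(dirname "$0")/child.pid"
exec sleep 30
EOF
cat > "$dir/parent.sh" <<'EOF'
bash "$(dirname "$0")/child.sh"
exit "$?"
EOF

bash "$dir/parent.sh" 2>/dev/null &
parent_pid=$!
for i in 1 2 3 4 5 6 7 8 9 10; do       # wait up to ~2s for the child to start
  [ -s "$dir/child.pid" ] && break
  sleep 0.2
done
child_pid=$(cat "$dir/child.pid")
child_pgid=$(proc_field "$child_pid" 3)  # expected to equal the parent's PID

kill -TERM -- "-$parent_pid"             # negative PID: the whole group
parent_status=0
wait "$parent_pid" || parent_status=$?
for i in 1 2 3 4 5 6 7 8 9 10; do       # give the child a moment to die
  is_alive "$child_pid" || break
  sleep 0.2
done
if is_alive "$child_pid"; then child_alive=yes; else child_alive=no; fi
echo "child pgid=$child_pgid parent=$parent_pid status=$parent_status child alive: $child_alive"
rm -rf "$dir"
```

Unlike signalling the parent's PID alone, this leaves no orphan behind, which matches the behaviour seen with kill -TERM %1.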
What if we kill the child directly instead of killing the parent?
> ./parent2.sh &
[1] 4589
> pss | grep 4589
bash 4589 4014 4589 4014
sleep 4590 4589 4589 4014
> kill -TERM 4590
./parent2.sh: line 3: 4590 Terminated ./child.sh
[1] + exit 143 ./parent2.sh
> pss | grep 4589
Because the parent was running the child process synchronously, when the child was terminated its exit status bubbled up to the parent, and the parent carried on until it finished and exited. In this case the parent had nothing else to do, so it returned the child's exit status. This is why the job status at the end is exit 143 (128 + 15, where 15 is the signal number of SIGTERM) rather than done. However, if we had changed ./parent2.sh to run another task after ./child.sh, and that task succeeded, then job control would report done instead of exit 143. This behaviour makes sense: if our parent simply launches a child process without being replaced by it, then we should expect the child's exit status to bubble up to the parent's exit status whenever the child is the last command.
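Both outcomes can be compared side by side. This is a minimal sketch, assuming bash; parent_last.sh and parent_more.sh are hypothetical variants of ./parent2.sh, and run_and_kill_child is a hypothetical helper that terminates the child and records the parent's exit status.

```shell
#!/usr/bin/env bash
# Sketch: a child's "killed by signal" status only becomes the parent's
# exit status when the child was the parent's last command.
# Script and helper names are hypothetical stand-ins.
dir=$(mktemp -d)
cat > "$dir/child.sh" <<'EOF'
echo "$$" > "$(dirname "$0")/child.pid"
exec sleep 30
EOF
cat > "$dir/parent_last.sh" <<'EOF'
bash "$(dirname "$0")/child.sh"
EOF
cat > "$dir/parent_more.sh" <<'EOF'
bash "$(dirname "$0")/child.sh"
true
EOF

# Run the given parent script, terminate its child, and store the
# parent's exit status in the global variable parent_status.
run_and_kill_child() {
  local parent_pid i
  rm -f "$dir/child.pid"
  bash "$dir/$1" 2>/dev/null &
  parent_pid=$!
  for i in 1 2 3 4 5 6 7 8 9 10; do     # wait up to ~2s for the child to start
    [ -s "$dir/child.pid" ] && break
    sleep 0.2
  done
  kill -TERM "$(cat "$dir/child.pid")"
  parent_status=0
  wait "$parent_pid" || parent_status=$?
}

run_and_kill_child parent_last.sh
status_last=$parent_status
run_and_kill_child parent_more.sh
status_more=$parent_status
echo "child last: $status_last, child followed by true: $status_more"
# prints "child last: 143, child followed by true: 0"
rm -rf "$dir"
```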
Let's investigate some more asynchronous behaviour.
./parent3.sh:
#!/usr/bin/env bash
./child.sh &
> ./parent3.sh &
[1] 4647
[1] + done ./parent3.sh
> jobs
> pss | grep 4647
sleep 4648 1 4647 4014
This situation is equivalent to running disown or, as in our case above, directly killing the parent. Here, however, the parent exited successfully, and since the parent finished, the child automatically became an orphan adopted by PID 1. This is really important to understand: orphaned processes don't get inherited by their parent's parent, but simply by PID 1. It's kind of unintuitive.
To prevent the above from happening, we can make the parent process wait on backgrounded child processes.
./parent4.sh:
#!/usr/bin/env bash
./child.sh &
wait
> ./parent4.sh &
[1] 4680
> pss | grep 4680
bash 4680 4014 4680 4014
sleep 4681 4680 4680 4014
> kill -TERM 4681
./parent4.sh: line 5: 4681 Terminated ./child.sh
[1] + done ./parent4.sh
Notice the difference: wait finishes successfully with exit 0 even though the child was terminated, so the job is reported as done. After all, wait did succeed in its directive.
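If the parent does care how the child ended, it can pass the child's PID to wait. A minimal sketch of the difference in bash, using a plain sleep as a stand-in for ./child.sh:

```shell
#!/usr/bin/env bash
# Sketch: "wait" with no arguments reports 0, "wait PID" reports the
# child's own status. sleep is a stand-in for ./child.sh.

sleep 30 &
pid=$!
kill -TERM "$pid"
status_all=0
wait || status_all=$?          # no arguments: wait for all children

sleep 30 &
pid=$!
kill -TERM "$pid"
status_one=0
wait "$pid" || status_one=$?   # explicit PID: returns that child's status

echo "wait: $status_all, wait PID: $status_one"
# prints "wait: 0, wait PID: 143"
```

Using wait with an explicit PID is what lets a parent distinguish a clean exit from a termination by signal.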
Now what about adding in a grandparent?
./grandparent.sh:
#!/usr/bin/env bash
./parent2.sh
> ./grandparent.sh &
[1] 4693
> jobs
[1] + running ./grandparent.sh
> pss | grep 4693
bash 4693 4014 4693 4014
bash 4694 4693 4693 4014
sleep 4695 4694 4693 4014
> kill -TERM 4693
[1] + terminated ./grandparent.sh
> pss | grep 4693
bash 4694 1 4693 4014
sleep 4695 4694 4693 4014
> kill -TERM 4694
> pss | grep 4693
sleep 4695 1 4693 4014
> kill -TERM 4695
> pss | grep 4693
As we can see, extending it to grandparents results in the same behaviour as a single parent and child. The entire process tree shares the same PGID, that of the initial top parent process. Now what do you think would happen if we killed the parent, but left the grandparent and child? I suspect this would cause the grandparent to terminate, but leave the child as an orphan.
> ./grandparent.sh &
[1] 4709
> pss | grep 4709
bash 4709 4014 4709 4014
bash 4710 4709 4709 4014
sleep 4711 4710 4709 4014
> kill -TERM 4710
./grandparent.sh: line 3: 4710 Terminated ./parent2.sh
[1] + exit 143 ./grandparent.sh
> pss | grep 4709
sleep 4711 1 4709 4014
We can see that terminating the middle process makes the grandparent receive the exit status, while the child is just left as an orphan. This is true even if the grandparent survives to do other things. The grandparent does not necessarily inherit the grandchild.
This gives us the surprising fact that killing any process in an arbitrary process tree in Unix does not directly imply that any other process will be killed. When a process is killed, its parent simply receives the status code, either from a synchronous call or from a later wait call. Its children are just inherited by PID 1. Unix processes simply do not form strict process trees by default, and they don't maintain tight relationships with each other.
So how do we get nicely behaved process trees, where killing children allows parents to continue, but killing parents automatically kills all transitive children? A process that behaves this way is called a supervisor process. In order to create supervisor processes, we need to add some extra code to handle all sorts of signals. We need to handle termination signals and propagate them to all children. We need to handle child termination signals and reap the children to avoid zombie processes. We need to consider whether PGID or SID come into play here. And most importantly, we also need to consider what happens if our supervisor launches process trees whose members don't behave like supervisors themselves. How does a supervisor deal with non-supervisory parents? This is all quite complicated, and now we can see why Unix/POSIX didn't make all processes supervisors by default. It's not trivial.
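The first two requirements, propagating termination and reaping the child, can be sketched in a few lines of Bash. This is only a minimal sketch under simple assumptions: one direct child, SIGTERM only, and a hypothetical supervisor function standing in for a real supervisor script. It does not address nested non-supervisory parents.

```shell
#!/usr/bin/env bash
# Sketch: a tiny supervisor that forwards SIGTERM to its child and reaps it.
# The supervisor function and pidfile are hypothetical, for illustration only.
pidfile=$(mktemp)

supervisor() {
  # Install the handler first so an early SIGTERM is not missed.
  trap 'kill -TERM "$child" 2>/dev/null; wait "$child"; exit 143' TERM
  sleep 30 &                 # run the child in the background ...
  child=$!
  echo "$child" > "$pidfile"
  wait "$child"              # ... and block in wait, which a trapped signal interrupts
}

supervisor &                 # the supervisor runs as its own background process
sup_pid=$!
for i in 1 2 3 4 5 6 7 8 9 10; do   # wait up to ~2s for the child to start
  [ -s "$pidfile" ] && break
  sleep 0.2
done
child_pid=$(cat "$pidfile")

kill -TERM "$sup_pid"        # signal the supervisor only
sup_status=0
wait "$sup_pid" || sup_status=$?

# The supervisor has already reaped the child, so the PID should be gone.
if kill -0 "$child_pid" 2>/dev/null; then child_alive=yes; else child_alive=no; fi
echo "supervisor exited $sup_status, child alive: $child_alive"
rm -f "$pidfile"
```

The child is started in the background and awaited with wait because Bash runs a trap handler only after a foreground command has finished, whereas the wait builtin returns as soon as a trapped signal arrives.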
Note that SIGKILL and SIGSTOP cannot be caught or handled. A supervisor killed with SIGKILL never gets the chance to propagate anything, so SIGKILL will always leave orphans if you are running process trees, supervisor or not, while SIGSTOP will suspend the supervisor without stopping its children. As an exercise, try to figure out how to kill an entire process tree without relying on shell job control; this can come up if, for some reason, the parent supervisor was killed and left an orphaned process tree behind.
For now, I shall leave it here. And later we can explore some Unix C based supervisors and compare them to Erlang supervisors.
- http://stackoverflow.com/questions/392022/best-way-to-kill-all-child-processes
- http://riccomini.name/posts/linux/2012-09-25-kill-subprocesses-linux-bash/
- https://en.wikipedia.org/wiki/Orphan_process
- http://www.linusakesson.net/programming/tty/
Updates regarding the above.
Running process1 | process2 results in this dataflow architecture (assuming /dev/pts/0 is your attached terminal device for the shell):
/dev/pts/0 -> STDIN - process1 - STDOUT -> pipe0 -> STDIN - process2 - STDOUT -> /dev/pts/0
                         |                                     |
                       STDERR                                STDERR
                         |                                     |
                         v                                     v
                     /dev/pts/0                            /dev/pts/0
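The fact that only STDOUT travels through the pipe is easy to confirm. In this minimal sketch a temporary file stands in for the terminal on STDERR, a brace group plays the role of process1, and tr plays the role of process2:

```shell
#!/usr/bin/env bash
# Sketch: in "process1 | process2" only process1's STDOUT enters the pipe.
# A temporary file stands in for the terminal that STDERR would go to.
errfile=$(mktemp)

# process1 writes one line to each stream; process2 upper-cases its STDIN.
piped=$( { echo "to stdout"; echo "to stderr" >&2; } 2> "$errfile" | tr 'a-z' 'A-Z' )
bypassed=$(cat "$errfile")
rm -f "$errfile"

echo "through the pipe: $piped"     # prints "through the pipe: TO STDOUT"
echo "around the pipe:  $bypassed"  # prints "around the pipe:  to stderr"
```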
Use this instead:
alias pss='ps -A --format comm,pid,ppid,pgid,sid,stat,wchan,tty'
The stat column shows process status, and can illuminate more about what's going on:
PROCESS STATE CODES
    Here are the different values that the s, stat and state output specifiers (header "STAT" or "S") will display to describe the state of a process:
        D    uninterruptible sleep (usually IO)
        R    running or runnable (on run queue)
        S    interruptible sleep (waiting for an event to complete)
        T    stopped, either by a job control signal or because it is being traced.
        W    paging (not valid since the 2.6.xx kernel)
        X    dead (should never be seen)
        Z    defunct ("zombie") process, terminated but not reaped by its parent.
    For BSD formats and when the stat keyword is used, additional characters may be displayed:
        <    high-priority (not nice to other users)
        N    low-priority (nice to other users)
        L    has pages locked into memory (for real-time and custom IO)
        s    is a session leader
        l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
        +    is in the foreground process group.
Pay attention to the + and s codes.
It isn't always true that an orphaned process will be inherited by PID 1. From
Linux 3.4 onwards, a process can call prctl with the PR_SET_CHILD_SUBREAPER
option, which makes it a "subreaper" that adopts any of its orphaned
descendants in place of PID 1. This is now implemented in both Upstart user
instances and systemd user instances. This gets us one step closer to Erlang
style supervisor trees.
See: