Set up an init process for your container in Kubernetes

Roaming Roadster
3 min read · Oct 21, 2023

If you run your application in Kubernetes, you may see a process tree like this in your container:

$ kubectl exec <YOUR_POD> -n <YOUR_NAMESPACE> -c <YOUR_CONTAINER> -it -- bash
...
...
$ ps -aef --forest
UID PID PPID C STIME TTY TIME CMD
app 59 0 0 20:59 pts/1 00:00:00 bash
app 78 59 0 20:59 pts/1 00:00:00 \_ ps -aef --forest
app 1 0 0 20:57 ? 00:00:00 /bin/bash
app 7 1 7 20:57 ? 00:02:29 /java -Dcom........

You can see that bash (not the pts one) is PID 1 and is the parent of your Java program. If your container starts the program directly instead of through bash, the program itself is PID 1.

In the Linux world, PID 1 is the init process. When you run a Docker container, PID 1 is whatever you set as ENTRYPOINT in your Dockerfile.
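
To confirm what ended up as PID 1 without relying on ps being installed in the image, the command line can be read from /proc. This is a minimal sketch, assuming a Linux /proc filesystem; the function name cmdline_of is hypothetical:

```bash
# Hypothetical helper: print the command line of a PID (default: 1).
# /proc/<pid>/cmdline separates arguments with NUL bytes, so they are
# turned into spaces for display.
cmdline_of() {
  local pid=${1:-1}
  tr '\0' ' ' < "/proc/$pid/cmdline"
  echo
}
```

Run inside the container from the listing above, cmdline_of 1 should print /bin/bash, followed by its arguments if there are any.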

What does PID 1 do?

As an init process, PID 1 is supposed to do 2 things:

  • signal forwarding: PID 1 should catch signals such as SIGTERM and forward them to its child processes.
  • reaping zombies: when a process's parent exits, the orphaned process is re-parented to PID 1. PID 1 should reap these processes when they exit so that they do not linger as zombies.

Signal forwarding

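Let’s say you are running a bash script as the init process in your container. By default, a bash script does not forward signals to its child processes. In addition, the kernel does not apply the default signal actions to PID 1, so a SIGTERM that PID 1 has no handler for is simply ignored. So if you run docker stop, your container is only stopped after the grace period, when it is killed with SIGKILL.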

When you run docker stop, it does 2 things:

  • send SIGTERM to your init process
  • wait for the process to terminate, or wait until the grace period (default 10 seconds) expires and then send SIGKILL
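
The two steps can be sketched in shell for a single process. This only mimics the sequence described above and is not how Docker implements it; the function name stop_with_grace and its return codes are hypothetical:

```bash
# Hypothetical helper mimicking the stop sequence for one PID:
# send SIGTERM, poll once per second, send SIGKILL after the grace period.
# Returns 0 if the process exited on its own, 1 if SIGKILL was needed.
stop_with_grace() {
  local pid=$1
  local grace=${2:-10}   # seconds; 10 mirrors the default mentioned above
  local waited=0

  # Nothing to do if the process is already gone.
  kill -TERM "$pid" 2>/dev/null || return 0

  # kill -0 sends no signal; it only checks that the PID still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$grace" ]; then
      kill -KILL "$pid" 2>/dev/null
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

A process that ignores SIGTERM, which is the situation of a bash script running as PID 1 without a trap, makes stop_with_grace return 1 after the full grace period.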

Ideally, your init program should handle SIGTERM and do some cleanup work. For example, it should forward the signal to its child processes to make sure they are terminated properly. You can use the trap command to catch and handle signals, for example:

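#!/bin/bash

# Forward SIGTERM to the child process.
handler() {
  kill -TERM "$child" 2>/dev/null
}

trap handler SIGTERM

# Run the program in the background so that bash can run the trap
# while it is waiting.
/my_command &

child=$!
# The first wait returns early when the trap fires. Wait again so the
# script exits only after the child has terminated.
wait "$child"
wait "$child"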

Reap zombies

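A process becomes a zombie when

  • it has finished, and
  • its parent has not reaped it (collected its exit status with wait). If the parent is gone, the process is re-parented to PID 1, which then has to reap it.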

You can easily produce a zombie process. For example,

docker run -d --rm --name my_app centos bash -c "sleep 10 & exec sleep 100"

exec replaces the current shell with the command, so bash, which started as PID 1, is replaced by sleep 100. After 10 seconds, the process list looks like this:

UID PID PPID C STIME TTY TIME CMD
app 1 0 0 12:14 ? 00:00:00 sleep 100
app 7 1 0 12:14 ? 00:00:00 [sleep] <defunct>

Because bash (the parent of sleep 10) was replaced by sleep 100 under the same PID, sleep 10 is now a child of sleep 100 (PID 1). When sleep 10 finishes, it becomes defunct, that is, a zombie. Because sleep 100 never reaps its children, the zombie stays there until PID 1 exits and your container dies.
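
To check for zombies without scanning ps output by eye, the process state can be read from /proc. This is a minimal sketch, assuming a Linux /proc filesystem with the usual Pid, PPid, and State fields in /proc/<pid>/status; the function name zombies_of is hypothetical:

```bash
# Hypothetical helper: print the PIDs of zombie processes whose parent
# is the given PID (default: 1).
zombies_of() {
  local parent=${1:-1}
  local f
  for f in /proc/[0-9]*/status; do
    # A process may disappear between the glob and the read, so errors
    # from awk are discarded.
    awk -v parent="$parent" '
      $1 == "Pid:"   { pid = $2 }
      $1 == "PPid:"  { ppid = $2 }
      $1 == "State:" { state = $2 }   # "Z" marks a zombie
      END { if (state == "Z" && ppid == parent) print pid }
    ' "$f" 2>/dev/null
  done
}
```

In the container from the example above, zombies_of 1 should print the PID of the defunct sleep (7 in the listing) once the 10 seconds have passed.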

Use tini as the init process

If you run your program as PID 1, your program should handle signals properly and reap zombies, which is cumbersome if you need to do it for every program. Fortunately, there is a small program called tini which does both for you. There are two ways to use it:

  • Run docker run --init. tini is bundled with Docker as docker-init, so nothing needs to be added to your image.
$ ps -ef
UID PID PPID C STIME TTY TIME CMD
app 1 0 0 07:48 pts/0 00:00:00 /sbin/docker-init -- /bin/bash
app 7 1 0 07:48 pts/0 00:00:00 /bin/bash
app 16 7 0 07:50 pts/0 00:00:00 ps -ef

You can see that docker-init is PID 1 now.

  • If you are using another container runtime such as containerd, install tini in your image and reference it in your pod spec. Simply use it to start your program:
command:
- /bin/tini
- --
- /bin/bash
- .....

With tini in place, the process tree in your container looks like this:

$ ps -aef --forest
UID PID PPID C STIME TTY TIME CMD
app 253 0 0 21:10 pts/0 00:00:00 bash
app 334 253 0 21:10 pts/0 00:00:00 \_ ps -aef --forest
app 1 0 0 21:09 ? 00:00:00 tini
app 6 1 0 21:09 ? 00:00:00 /bin/bash
app 8 6 0 21:09 ? 00:00:00 \_ sleep 365d
app 100 1 99 21:09 ? 00:01:22 /bin/java -Dcom....

Now tini is PID 1 and other processes are parented to it.
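
If the container already starts through a shell entrypoint, one option is to let that script hand over to tini when it is present. This is a minimal sketch, assuming tini is on the PATH inside the image; the function name with_init is hypothetical:

```bash
# Hypothetical helper: replace the current shell with "tini -- <command>"
# when tini is installed, otherwise with the command itself.
# Because of exec, the new program keeps the shell's PID (PID 1 if this
# script is the container's entrypoint).
with_init() {
  if command -v tini >/dev/null 2>&1; then
    exec tini -- "$@"
  else
    exec "$@"
  fi
}
```

Calling with_init /my_command as the last line of an entrypoint script would then start the program under tini where tini is available, and directly otherwise.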

Tini has other use cases too. If you are interested, see the detailed explanation on the official project page.

Conclusions

  • A common pitfall of containerization is not handling signals and zombie processes properly.
  • Using tini helps you set up the init process properly in your containers. If you are not sure whether you need it, simply use it; it will not do any harm.
