SIGTERM sent to docker run command doesn't send a SIGTERM to the main process running inside the container anymore. #5241

Closed
shambhushrestha-bpi opened this issue Jul 8, 2024 · 4 comments · Fixed by #5247

@shambhushrestha-bpi

Description

In Docker version 26.1.4, sending a SIGTERM to the docker run process would also send a SIGTERM to the main process inside the container. In version 27.0.3, this behavior has changed: the docker run process now exits with the message context canceled, and the main process never receives the SIGTERM.

Reproduce

  1. Save the following as $HOME/test/handle_sigterm.sh and make it executable.
#!/bin/sh

# Function to handle SIGTERM
handle_sigterm() {
    echo "Received SIGTERM, exiting..."
    exit 0
}

trap 'handle_sigterm' TERM

while true; do
    echo "Waiting for sigterm"
    sleep 10
done
  2. Run it in a container as follows:
docker run -i \
    -v $HOME/test:$HOME/test:ro \
    alpine:latest \
    $HOME/test/handle_sigterm.sh
  3. In a separate terminal, find the PID of the docker run process and send it a SIGTERM:
ps -ef | grep 'handle_sigterm.sh' | grep 'docker run'
kill -15 <pid from above>
  4. You should see the following output:
Waiting for sigterm
context canceled

Expected behavior

Up until version 26.x, the SIGTERM would be received by the main process, which would output the following:

Waiting for sigterm
Received SIGTERM, exiting...
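
The manual steps above can be scripted. The following is a minimal sketch, assuming a POSIX shell; `run_and_term` is a hypothetical helper name, not a Docker command. It starts a command in the background, sends it SIGTERM after a delay, and waits for it to exit so that any remaining output is still printed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): start a command in the background,
# give it time to start, send it SIGTERM, then wait for it to exit.
# Usage: run_and_term <seconds-before-signal> <command> [args...]
run_and_term() {
    delay=$1
    shift
    # Run the command as a background job; $! is its PID.
    "$@" &
    pid=$!
    sleep "$delay"
    kill -TERM "$pid"
    # Block until the command exits; the function returns its exit status.
    wait "$pid"
}
```

For example, `run_and_term 5 docker run -i -v $HOME/test:$HOME/test:ro alpine:latest $HOME/test/handle_sigterm.sh` should end with `Received SIGTERM, exiting...` on 26.1.4 and with `context canceled` on 27.0.3, going by the outputs reported above. The 5-second delay is an arbitrary choice to let the container start.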

docker version

Client: Docker Engine - Community
 Version:           27.0.3
 API version:       1.46
 Go version:        go1.21.11
 Git commit:        7d4bcd8
 Built:             Sat Jun 29 00:02:33 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.0.3
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.11
  Git commit:       662f78c
  Built:            Sat Jun 29 00:02:33 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.18
  GitCommit:        ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client: Docker Engine - Community
 Version:    27.0.3
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.15.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.28.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 22
  Running: 8
  Paused: 0
  Stopped: 14
 Images: 9
 Server Version: 27.0.3
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
 runc version: v1.1.13-0-g58aa920
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.5.0-1022-aws
 Operating System: Ubuntu 22.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.718GiB
 Name: <removed>
 ID: 2b79cfde-c13d-466d-bea6-96ef75d4d0d7
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional Info

Tried in both Ubuntu 20 and Ubuntu 22 with Docker 26.1.4 and Docker 27.0.3.

@Benehiko
Member

Benehiko commented Jul 8, 2024

It seems like we aren't forwarding signals to the container, or maybe we're exiting the CLI before they are forwarded.

I'll take a look in a bit

@Benehiko
Member

Benehiko commented Jul 8, 2024

I looked into this, and the container does seem to be terminated correctly. However, there may be a race condition that terminates the CLI before the signal is forwarded to the daemon.

The output you received looks confusing because the CLI is terminated before the script's own output, which the CLI prints, can be shown.

There are two processes, the CLI and the container.
The CLI sends API requests to the daemon to create and attach to the container.
The container, which runs your handle_sigterm.sh script, streams its output back to the CLI.

cli (main process) -> create container (daemon) -> attach container -> (stream input/output)
container -> run script

When we send a termination signal to the CLI we do the following:

1. Handle termination signal by catching it
2. Cancel the main go context
3. Everything attached to this context starts returning
4. Background `ForwardAllSignals` also catches the termination signal
5. `ForwardAllSignals` sends an API request with the termination signal to the daemon
6. Daemon tries to kill the container (script running inside the container)
7. Main CLI exits, closing the channel that accepts signals inside the `ForwardAllSignals` background task, which causes the task to exit.
8. You see `context canceled`

In the console output below, I added a log statement inside ForwardAllSignals that prints once the signal has been sent to the daemon:

time="2024-07-08T13:10:30+02:00" level=info msg="Forwarding signal: TERM"
context canceled

I'll take a look at how this can be improved so we can wait for the container to be killed before detaching and exiting the main CLI process.
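
The idea of waiting for the container before the CLI exits resembles a common pattern for wrapper processes: forward the signal, then keep waiting for the child. The sketch below is a plain-shell analogy under that assumption, not the Docker CLI's implementation; `forward_and_wait` is a hypothetical name, with the wrapper standing in for the CLI and the child for the container.

```shell
#!/bin/sh
# Plain-shell analogy (assumption): the wrapper plays the CLI, the child plays
# the container. forward_and_wait is a hypothetical name.
forward_and_wait() {
    "$@" &
    child=$!
    got_term=0
    # On SIGTERM, forward the signal to the child instead of exiting at once.
    trap 'got_term=1; kill -TERM "$child" 2>/dev/null || true' TERM
    # wait returns early, with a status above 128, when a trapped signal arrives.
    status=0
    wait "$child" || status=$?
    if [ "$got_term" -eq 1 ]; then
        # Wait again so the child can run its handler and print its output.
        status=0
        wait "$child" || status=$?
    fi
    trap - TERM
    return "$status"
}
```

Exiting right after the first `wait` returns would correspond to the behavior described in the steps above: the signal is forwarded, but the wrapper is gone before the child's handler output can be shown.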

@shambhushrestha-bpi
Author

Thanks for looking into this. The container is indeed being terminated properly. The main problem is that, in production, the processes running inside the container often have cleanup routines that run on receiving specific signals like SIGINT. Since the container exits before the signal handler is even called, the cleanup routines can't run.
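
For illustration, a cleanup routine of this kind might look like the sketch below. It is a hypothetical example, not our production code: `run_service`, `cleanup`, and the messages are placeholders. The loop sleeps in the background and waits on it because, unlike a foreground `sleep`, `wait` is interrupted by a trapped signal, so the handler starts promptly.

```shell
#!/bin/sh
# Hypothetical service loop with a cleanup handler; all names are placeholders.
sleep_pid=

cleanup() {
    # Stop the background sleep so it does not outlive the service.
    if [ -n "$sleep_pid" ]; then
        kill "$sleep_pid" 2>/dev/null || true
    fi
    echo "Cleaning up..."
    exit 0
}

run_service() {
    trap cleanup TERM INT
    while true; do
        echo "Working"
        # Background sleep plus wait: a trapped signal interrupts wait.
        sleep 10 &
        sleep_pid=$!
        wait "$sleep_pid" || true
    done
}

# An entrypoint script would end by calling: run_service
```

The handler only helps if the signal actually reaches the process, which is the point of this issue.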

@shambhushrestha-bpi
Author

Thank you @Benehiko @vvoland for addressing this quickly.
