Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forward signals to child process with micromamba run #3152

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
97 changes: 79 additions & 18 deletions libmamba/src/core/run.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ extern "C"
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
}
#else
Expand All @@ -35,6 +36,7 @@ extern "C"
#include "mamba/core/execution.hpp"
#include "mamba/core/output.hpp"
#include "mamba/core/run.hpp"
#include "mamba/core/thread_utils.hpp"
#include "mamba/core/util_os.hpp"
#include "mamba/util/environment.hpp"
#include "mamba/util/random.hpp"
Expand Down Expand Up @@ -281,6 +283,40 @@ namespace mamba
}
}
}

// Wait for a process to finish
// Similar to reproc::process.wait, but also handles suspension
int wait_for_process(pid_t pid)
{
int status;
pid_t wpid;
while (1)
{
wpid = waitpid(pid, &status, WUNTRACED);
if (wpid == pid && WIFSTOPPED(status))
{
// If the child was suspended (e.g. CTRL-Z),
// suspend entire process group as well
kill(0, SIGSTOP);
}
else
{
break;
}
}

// Compute the return status mimicing the private logic in reproc++
// https://github.com/DaanDeMeyer/reproc/blob/1c07bdbec3f2ecba7125b9499b9a8a77bf9aa8c7/reproc/src/process.posix.c#L455
Comment on lines +308 to +309
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Compute the return status mimicing the private logic in reproc++
// https://github.com/DaanDeMeyer/reproc/blob/1c07bdbec3f2ecba7125b9499b9a8a77bf9aa8c7/reproc/src/process.posix.c#L455
// Compute the return status mimicing the private logic in reproc++:
// https://github.com/DaanDeMeyer/reproc/blob/1c07bdbec3f2ecba7125b9499b9a8a77bf9aa8c7/reproc/src/process.posix.c#L471
// https://github.com/DaanDeMeyer/reproc/blob/1c07bdbec3f2ecba7125b9499b9a8a77bf9aa8c7/reproc/src/process.posix.c#L455

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or

Suggested change
// Compute the return status mimicing the private logic in reproc++
// https://github.com/DaanDeMeyer/reproc/blob/1c07bdbec3f2ecba7125b9499b9a8a77bf9aa8c7/reproc/src/process.posix.c#L455
// Compute the return status mimicing the private logic in reproc++
// https://github.com/DaanDeMeyer/reproc/blob/1c07bdbec3f2ecba7125b9499b9a8a77bf9aa8c7/reproc/src/process.posix.c#L465-L478

if (wpid < 0)
{
status = -errno;
}
else
{
status = WIFEXITED(status) ? WEXITSTATUS(status) : WTERMSIG(status) + 128;
}
return status;
}
#endif

int run_in_environment(
Expand Down Expand Up @@ -427,7 +463,7 @@ namespace mamba
);
}
#endif
PID pid;
static PID pid;
std::error_code lec;
static reproc::process proc;

Expand All @@ -442,24 +478,38 @@ namespace mamba
}

#ifndef _WIN32
MainExecutor::instance().schedule(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I forgot this change from one of the experiments I did. I'll revert it.

Do you know why the signal handler was being set with the MainExecutor instead of directly?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No 🤷

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I traced this back to 60d4b3a, but still don't know why the signal was being set in a separate thread.

With the new changes, it seems more difficult to do the modifications correctly in a separate thread. The MainExecutor doesn't seem to have functionality to check if a scheduled task has finished, meaning that it wouldn't be possible to check that modifying and restoring the signal masks happen before the end of the run_in_environment call without using some sort of synchronization primitives. At the same time, the signal mask is set on a per-thread basis, so the calls to pthread_sigmask would have to anyhow be done on the main thread. What do you think about leaving this as is?

Copy link
Collaborator

@jonashaag jonashaag Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wolfv do you recall why?

[]()
sigset_t empty_sigmask, prev_sigmask;
sigemptyset(&empty_sigmask);

// Stop the normal signal handler so that SIGINT is forwarded as well,
// and unblock all signals, while saving the previous state
// TODO: Is this the right place to change the mask and is this the right order?
stop_receiver_thread();
pthread_sigmask(SIG_SETMASK, &empty_sigmask, &prev_sigmask);

std::map<int, struct sigaction> prev_actions;

// Signal handler that forwards all signals except CHLD to the child
auto forward_signal = [](int signum)
{
if (signum != SIGCHLD)
{
signal(
SIGTERM,
[](int /*signum*/)
{
LOG_INFO << "Received SIGTERM on micromamba run - terminating process";
reproc::stop_actions sa;
sa.first = reproc::stop_action{ reproc::stop::terminate,
std::chrono::milliseconds(3000) };
sa.second = reproc::stop_action{ reproc::stop::kill,
std::chrono::milliseconds(3000) };
proc.stop(sa);
}
);
kill(pid, signum);
}
);
};

// Set handlers for all signals and save the previous state.
for (int signum = 1; signum < NSIG; signum++)
{
struct sigaction sa, prev_sa;
sa.sa_flags = SA_RESTART;
sa.sa_handler = forward_signal;
int err = sigaction(signum, &sa, &prev_sa);
if (!err)
{
prev_actions[signum] = prev_sa;
}
}
#endif

// check if we need this
Expand All @@ -470,12 +520,23 @@ namespace mamba

ec = reproc::drain(proc, reproc::sink::null, reproc::sink::null);

std::tie(status, ec) = proc.stop(opt.stop);
#ifndef _WIN32
status = wait_for_process(pid);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Between the place where we installed the signal handlers up until after this line, is it possible to get any exceptions? If so, should we put the cleanup/restoring code into a finally?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not check, but that sounds like a good idea. Unfortunately I'm not sure how to do that in C++


// Restore the signal handlers and the signal mask
set_default_signal_handler();
pthread_sigmask(SIG_SETMASK, &prev_sigmask, nullptr);
for (const auto& [signum, sa] : prev_actions)
{
sigaction(signum, &sa, nullptr);
}
#else
std::tie(status, ec) = proc.stop(opt.stop);
if (ec)
{
std::cerr << ec.message() << " ; error code " << ec.value() << '\n';
}
#endif
}
// exit with status code from reproc
return status;
Expand Down
1 change: 1 addition & 0 deletions libmamba/src/core/thread_utils.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,7 @@ namespace mamba
void set_default_signal_handler()
{
set_signal_handler(default_signal_handler);
// std::signal(SIGINT, [](int /*signum*/) { set_sig_interrupted(); });
}
#else
void set_default_signal_handler()
Expand Down
57 changes: 57 additions & 0 deletions micromamba/tests/signal_forwarding.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
#!/usr/bin/env python
import signal
import os
from time import sleep

# Define the sleep time
sleep_time = 0.005

# Get all valid signals and remove the unhandled ones.
# SIGKILL and SIGSTOP cannot be caught, while SIGCHLD
# is not forwarded by umamba because it makes no sense to do so
all_signals = signal.valid_signals() - {signal.SIGKILL, signal.SIGCHLD, signal.SIGSTOP}

# Initialize a dictionary to keep track of received signals.
# Keep it global to have it accessible from the signal handler.
received_signals = {int(s): False for s in all_signals}


# Define the signal handler
def signal_handler(sig, _):
received_signals[sig] = True


def set_signal_handlers():
"""Set signal handlers for all signals."""
for s in all_signals:
signal.signal(s, signal_handler)


def send_signals():
"""Send all signals to the parent process."""
ppid = os.getppid()
for s in all_signals:
os.kill(ppid, s)
# wait so that the handlers are executed (and in order)
sleep(sleep_time)


def main():
set_signal_handlers()
sleep(sleep_time)

# Send signals and give time for the handlers to be called
send_signals()

ok = all(received_signals.values())
if not ok:
# Print the signals that were not handled
unhandled_signals = [k for k, v in received_signals.items() if not v]
print(f"No signal handled for signals {unhandled_signals}")
exit(1)

print("Signal forwarding ok")


if __name__ == "__main__":
main()
25 changes: 24 additions & 1 deletion micromamba/tests/test_run.py
Original file line number Diff line number Diff line change
@@ -1,13 +1,15 @@
import os
import random
import shutil
import signal
import string
import subprocess
from sys import platform
import time

import pytest

from .helpers import create, random_string, subprocess_run, umamba_run
from .helpers import create, random_string, subprocess_run, umamba_run, get_umamba

common_simple_flags = ["", "-d", "--detach", "--clean-env"]
possible_characters_for_process_names = (
Expand Down Expand Up @@ -91,6 +93,27 @@ def test_run_non_existing_env(self):
except subprocess.CalledProcessError as e:
assert "critical libmamba The given prefix does not exist:" in e.stderr.decode()

@pytest.mark.skipif(platform == "win32", reason="not supported for windows yet")
def test_signal_forwarding(self):
test_script_file_name = "signal_forwarding.py"
test_script_path = os.path.join(os.path.dirname(__file__), test_script_file_name)
res = umamba_run("python", test_script_path)
assert res.strip() == "Signal forwarding ok"

@pytest.mark.skipif(platform == "win32", reason="not supported for windows yet")
def test_suspension(self):
umamba = get_umamba()
proc = subprocess.Popen([umamba, "run", "sleep", "60"])
pid = proc.pid
os.kill(pid, signal.SIGTSTP)
time.sleep(0.1)
_, status = os.waitpid(pid, os.WNOHANG | os.WUNTRACED)
assert os.WIFSTOPPED(status), "umamba should be stopped"
os.kill(pid, signal.SIGCONT)
time.sleep(0.1)
proc.terminate()
proc.wait()


@pytest.fixture()
def temp_env_prefix():
Expand Down