Waiting for things to shut down - 2 #6669

GilShoshan94 · 2024-07-01T14:02:40Z

Is your feature request related to a problem? Please describe.
This follows naturally from #5585.

Hi, I have a big async codebase and am using Tokio.
In my binary, I spawn a lot of async tasks (tokio::task::spawn()) and manage sockets and connections.

I like the pattern where my structs spawn background Tokio tasks in their new method and keep a handle to a channel; in their Drop implementation, they signal those tasks to shut down through that channel.

When the program shuts down, I need to wait in main to let those background tasks shut down gracefully (send disconnection messages, clean up, ...), because by default:

Shutting down a Tokio runtime (e.g. by returning from #[tokio::main]) immediately cancels all tasks on it, except for blocking sync tasks, which run on a dedicated thread pool; the runtime will wait indefinitely for all blocking operations to finish unless shutdown_timeout is called.

I have read the tutorial Waiting for things to finish shutting down in the Graceful Shutdown topic.
TaskTracker is very useful and looks great, but it seems more like a fine-tuned tool for specific scenarios: the user needs to keep a handle to the tracker and manually register each task to track. The API is very well done, but the user has to make an extra call compared with the regular tokio::spawn(). Overall, for the general use case, I think the ergonomics can be improved.

Describe the solution you'd like

Add a method named wait_active_tasks to Handle so we can do something like this:

use tokio::runtime::Handle;

#[tokio::main]
async fn main() {
    // ... code ... entry point ...

    let handle = Handle::current();
    handle.wait_active_tasks().await;
}

This seems possible since the ("unstable") method active_tasks_count already knows the number of tasks actively running. I looked into the scheduler and context code and, to be honest, had a hard time following it, but to my understanding there is a global (thread-local) context that keeps the state of the runtime, so the information already exists.

It is also nice because, in the async main, we could use this in a select to add a timeout or other conditions.

One downside of this approach is that if the user awaits this in one of the spawned tasks rather than in the async main function, it will never be ready. If the Tokio context can know internally that we are not in the entry point (the first block_on on the main thread, from the perspective of the runtime), then we could simply panic, exactly as we panic if the user calls Handle::current() outside of a runtime context.

If that is not possible, then adding the method on the runtime itself would guarantee that we are not nested in another task. But the function would be sync, so it would also require a "_timeout" version:

use tokio::runtime::Runtime;

fn main() {
    let rt = Runtime::new().unwrap();
    // ... code ... entry point ...

    rt.block_on(async {
        // ... code ... entry point ...
    });
    rt.wait_active_tasks(); // wait indefinitely
    // OR
    // rt.wait_active_tasks_timeout(std::time::Duration::from_millis(1500));
}

(If we go that route, it would be nice to also add an argument to the attribute macro: #[tokio::main(wait_tasks)] and #[tokio::main(wait_tasks_timeout_millis = 1500)].)

Describe alternatives you've considered

1) Use a TaskTracker in main, pass it around by cloning it, and spawn on it (then close and wait on it in main):

Pros:
- In case of multiple runtimes (not my case), we can correctly have one tracker per runtime.
Cons:
- Need to add a TaskTracker parameter to the new method (and other relevant methods) of each struct type, and sometimes even store the tracker in a field.
- Need to clone the tracker and pass it around.
- Each tokio::spawn() needs to be rewritten as tracker.spawn().

2) Use a TaskTracker as a static in the global scope:

use std::sync::OnceLock;
use tokio_util::task::TaskTracker;

pub fn tracker() -> &'static TaskTracker {
    static TRACKER: OnceLock<TaskTracker> = OnceLock::new();
    TRACKER.get_or_init(TaskTracker::new)
}

Then, elsewhere in the codebase, we can access this tracker without needing to clone and pass around a handle.

Pros:
- tracker accessible from anywhere in the codebase.
Cons:
- Each tokio::spawn() needs to be rewritten as tracker().spawn().
- In case of multiple runtimes, we don't have one tracker per runtime, which is bad because we could block a runtime from being dropped even though all its tasks are already done.

3) Use a TaskTracker as a static item from a trait defined on Handle:
This is the same as 2), except that the tracker is obtained from Handle::current().tracker(). It's not better in any way.

Describe the solution you'd like PART 2

While writing the alternatives I have considered, I thought of another solution:

Add a TaskTracker to the runtime context (I am not exactly sure about the internals) so the user could get the current tracker and register the tasks that should be waited for at runtime shutdown.
This resolves the main issue with multiple runtimes from 2) and 3).
Add to impl Handle a method pub fn current_tracker() -> &TaskTracker.
The user would call it like this: Handle::current_tracker().

Pros:
- tracker accessible from anywhere in the codebase.
- In case of multiple runtimes (not my case), we correctly have one tracker per runtime.
Cons:
- Each tokio::spawn() needs to be rewritten as Handle::current_tracker().spawn().
- Also adds a dependency on tokio_util (but the code could be added directly to tokio too).

I think it's also a nice solution. It would be less automatic than the first solution, and it wouldn't prevent users from closing and waiting on the tracker inside a nested tracked task and blocking (the same is true of TaskTracker today), but it would give users more flexibility to choose which tasks to track while being coupled to the correct (current) runtime.

GilShoshan94 added A-tokio Area: The main tokio crate C-feature-request Category: A feature request. labels Jul 1, 2024

Darksonn added the M-task Module: tokio/task label Jul 2, 2024
