Add non-blocking wait #766
Conversation
Should we also add a
If so, should one add it in a similar location as the
I think we should leave that for a separate PR.
Can you add it to the docs, and add a test?
Done. Feel free to update the docstring, @simonbyrne.
test/test_cooperative_wait.jl
Threads.@spawn begin
    # send to self
    recv_req = MPI.Irecv!(recv_arr[i], MPI.COMM_WORLD; source=myrank, tag=i)
    send_req = MPI.Isend(send_arr[i], MPI.COMM_WORLD; dest=myrank, tag=i)

    wait(recv_req)
    @test MPI.isnull(recv_req)
    recv_check[i] += 1

    wait(send_req)
    @test MPI.isnull(send_req)
    send_check[i] += 1
end
I'm not sure that this is correct, in that it seems to assume that you can wait on the receive before the send is complete (this is typically true for short messages, but not guaranteed by the spec).
How about something like:
Threads.@spawn begin
    recv_req = MPI.Irecv!(recv_arr[i], MPI.COMM_WORLD; source=myrank, tag=i)
    wait(recv_req)
    @test MPI.isnull(recv_req)
    recv_check[i] += 1
end
Threads.@spawn begin
    send_req = MPI.Isend(send_arr[i], MPI.COMM_WORLD; dest=myrank, tag=i)
    wait(send_req)
    @test MPI.isnull(send_req)
    send_check[i] += 1
end
I am not sure I follow your point. Also, how would having 2 threads guarantee that the following is true?
you can wait on the receive before the send is complete
Why would the semantics change from the usual non-blocking MPI point-to-point communication, which follows the initially proposed workflow (as in here)? As initially suggested, the non-blocking workflow is done per MPI process in a sequential way, with one thread per process.
In your original proposal, the Isend and Irecv! are in one Julia task, and you wait on the receive first: this means that the receive has to complete before it can wait on the send. This is typically fine, but there is no guarantee that it actually will complete. From https://stackoverflow.com/a/67099566/392585:
MPI_Isend does not necessarily progress in the background but only when the MPI implementation is given the chance to progress it. MPI_Wait progresses the operation and guarantees its completion. Some MPI implementations can progress operations in the background using progression threads. Some cannot. It is implementation-dependent and one should never rely on one or another specific behaviour.
My proposal splits the send and receive into two Julia tasks: if the task is waiting on the receive, it can then switch to the send task and complete that.
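(As an aside, a minimal sketch of what such a cooperative wait could look like, purely for intuition: it assumes MPI.Test(req) returns true once the request has completed, and it is not necessarily how this PR implements wait.)

function cooperative_wait(req)
    # Poll the request and yield to the Julia scheduler, so other tasks
    # (e.g. the one holding the matching send) can run and drive MPI progress.
    while !MPI.Test(req)
        yield()
    end
    return nothing
end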
Ok. Thanks for the details.
@simonbyrne I don't quite understand the need for threads. Page 75 of the MPI 4.0 standard (section 3.7 Nonblocking Communication) states:
Progress: A call to MPI_WAIT that completes a receive will eventually terminate and return if a matching send has been started, unless the send is satisfied by another receive. In particular, if the matching send is nonblocking, then the receive should complete even if no call is executed by the sender to complete the send. Similarly, a call to MPI_WAIT that completes a send will eventually return if a matching receive has been started, unless the receive is satisfied by another send, and even if no call is executed to complete the receive.
Is there something that I am missing? The above indicates to me that the wait on the receive should return even if the send has not been waited on.
p.s. This statement is also on page 45 of the MPI 1.3 standard.
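(To make the quoted progress rule concrete, here is a minimal self-contained sketch, not part of this PR's test suite; the buffer names are made up and it uses the wait being added in this PR, though MPI.Wait would work equally. The wait on the receive must return even though nothing has yet completed the matching send.)

using MPI
MPI.Init()

comm   = MPI.COMM_WORLD
myrank = MPI.Comm_rank(comm)

send_buf = fill(1.0, 4)
recv_buf = zeros(4)

# Post both halves of a self-message from a single task.
recv_req = MPI.Irecv!(recv_buf, comm; source=myrank, tag=0)
send_req = MPI.Isend(send_buf, comm; dest=myrank, tag=0)

wait(recv_req)   # guaranteed to return: the matching (nonblocking) send has been started
wait(send_req)   # the send request still needs to be completed afterwards

MPI.Finalize()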
Ah, TIL! Thanks @lcw.
You're welcome. I was worried I had been using an invalid assumption for 20 years, so I had to look it up.
I don't quite understand the need for threads
The usage of threads here emulates a use case in which one would use a TLS approach to perform some async work on GPUs, which requires launching kernels on different streams from different Julia tasks/threads. This in turn requires the "cooperative" wait. But indeed, I was also confused by the fact that the non-blocking communication would stall in the single-thread spawn implementation.
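(A hedged sketch of the multi-task pattern described above: do_async_work is a placeholder for, say, launching a GPU kernel on a task-local stream, and the array names are invented. Each task posts its own non-blocking self-message and waits cooperatively, so a wait in one task does not block the others.)

using MPI, Test

MPI.Init()
comm   = MPI.COMM_WORLD
myrank = MPI.Comm_rank(comm)

ntasks   = 4
send_arr = [fill(Float64(i), 8) for i in 1:ntasks]
recv_arr = [zeros(Float64, 8) for i in 1:ntasks]

do_async_work(i) = nothing  # stand-in for real per-task async work (e.g. a kernel launch)

@sync for i in 1:ntasks
    Threads.@spawn begin
        do_async_work(i)
        recv_req = MPI.Irecv!(recv_arr[i], comm; source=myrank, tag=i)
        send_req = MPI.Isend(send_arr[i], comm; dest=myrank, tag=i)
        wait(recv_req)   # cooperative: the other tasks keep running while this one waits
        wait(send_req)
        @test recv_arr[i] == send_arr[i]
    end
end

MPI.Finalize()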
Co-authored-by: Valentin Churavy <[email protected]>
Could this be reviewed and finalised 🙏?
Shall we thus revert the test and spawn a single thread that does the non-blocking send and receive within it, as originally proposed, given #766 (comment)?
A first implementation of non-blocking wait as discussed in #762