Add non-blocking wait #766
Conversation
Should we also add a
If so, should one add it in a similar location as the
I think we should leave that for a separate PR.
Can you add it to the docs, and add a test?
Done. Feel free to update the docstring, @simonbyrne.
test/test_cooperative_wait.jl
Threads.@spawn begin
    # send to self
    recv_req = MPI.Irecv!(recv_arr[i], MPI.COMM_WORLD; source=myrank, tag=i)
    send_req = MPI.Isend(send_arr[i], MPI.COMM_WORLD; dest=myrank, tag=i)

    wait(recv_req)
    @test MPI.isnull(recv_req)
    recv_check[i] += 1

    wait(send_req)
    @test MPI.isnull(send_req)
    send_check[i] += 1
end
I'm not sure that this is correct, in that it seems to assume that you can wait on the receive before the send is complete (this is typically true for short messages, but not guaranteed by the spec).
How about something like:
Threads.@spawn begin
    recv_req = MPI.Irecv!(recv_arr[i], MPI.COMM_WORLD; source=myrank, tag=i)
    wait(recv_req)
    @test MPI.isnull(recv_req)
    recv_check[i] += 1
end
Threads.@spawn begin
    send_req = MPI.Isend(send_arr[i], MPI.COMM_WORLD; dest=myrank, tag=i)
    wait(send_req)
    @test MPI.isnull(send_req)
    send_check[i] += 1
end
I am not sure I follow your point. Also, how would having 2 threads guarantee that the following is true?
you can wait on the receive before the send is complete
Why would the semantics change from the usual non-blocking MPI point-to-point communication, which follows the initially proposed workflow (as in here)? As initially suggested, the non-blocking workflow is done per MPI process in a sequential way, with one thread per process.
In your original proposal, the Isend and Irecv! are in one Julia task, and you wait on the receive first: this means that the receive has to complete before it can wait on the send. This is typically fine, but there is no guarantee that it actually will complete. From https://stackoverflow.com/a/67099566/392585:
MPI_Isend does not necessarily progress in the background but only when the MPI implementation is given the chance to progress it. MPI_Wait progresses the operation and guarantees its completion. Some MPI implementations can progress operations in the background using progression threads. Some cannot. It is implementation-dependent and one should never rely on one or another specific behaviour.
My proposal splits the send and receive into two Julia tasks: if the task is waiting on the receive, it can then switch to the send task and complete that.
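(As an aside, a minimal sketch of what such a cooperative wait could look like, purely for intuition: it assumes MPI.Test(req) returns true once the request has completed, and it is not necessarily how this PR implements wait.)

function cooperative_wait(req)
    # Poll the request and yield to the Julia scheduler, so other tasks
    # (e.g. the one holding the matching send) can run and drive MPI progress.
    while !MPI.Test(req)
        yield()
    end
    return nothing
end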
Ok. Thanks for the details.
@simonbyrne I don't quite understand the need for threads. Page 75 of the MPI 4.0 standard (section 3.7 Nonblocking Communication) states:
Progress: A call to MPI_WAIT that completes a receive will eventually terminate and return if a matching send has been started, unless the send is satisfied by another receive. In particular, if the matching send is nonblocking, then the receive should complete even if no call is executed by the sender to complete the send. Similarly, a call to MPI_WAIT that completes a send will eventually return if a matching receive has been started, unless the receive is satisfied by another send, and even if no call is executed to complete the receive.
Is there something that I am missing? The above indicates to me that the wait on the receive should return even if the send has not been waited on.
p.s. This statement is also on page 45 of the MPI 1.3 standard.
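(To make the quoted progress rule concrete, here is a minimal self-contained sketch, not part of this PR's test suite; the buffer names are made up and it uses the wait being added in this PR, though MPI.Wait would work equally. The wait on the receive must return even though nothing has yet completed the matching send.)

using MPI
MPI.Init()

comm   = MPI.COMM_WORLD
myrank = MPI.Comm_rank(comm)

send_buf = fill(1.0, 4)
recv_buf = zeros(4)

# Post both halves of a self-message from a single task.
recv_req = MPI.Irecv!(recv_buf, comm; source=myrank, tag=0)
send_req = MPI.Isend(send_buf, comm; dest=myrank, tag=0)

wait(recv_req)   # guaranteed to return: the matching (nonblocking) send has been started
wait(send_req)   # the send request still needs to be completed afterwards

MPI.Finalize()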
Ah, TIL! Thanks @lcw.
You're welcome. I was worried I had been using an invalid assumption for 20 years, so I had to look it up.
I don't quite understand the need for threads
The usage of threads here emulates a use case in which one would use a TLS approach to perform some async work on GPUs, which requires launching kernels on different streams from different Julia tasks/threads. This in turn requires the "cooperative" wait. But indeed, I was also confused by the fact that the non-blocking communication would stall in the single-thread spawn implementation.
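(A hedged sketch of the multi-task pattern described above: do_async_work is a placeholder for, say, launching a GPU kernel on a task-local stream, and the array names are invented. Each task posts its own non-blocking self-message and waits cooperatively, so a wait in one task does not block the others.)

using MPI, Test

MPI.Init()
comm   = MPI.COMM_WORLD
myrank = MPI.Comm_rank(comm)

ntasks   = 4
send_arr = [fill(Float64(i), 8) for i in 1:ntasks]
recv_arr = [zeros(Float64, 8) for i in 1:ntasks]

do_async_work(i) = nothing  # stand-in for real per-task async work (e.g. a kernel launch)

@sync for i in 1:ntasks
    Threads.@spawn begin
        do_async_work(i)
        recv_req = MPI.Irecv!(recv_arr[i], comm; source=myrank, tag=i)
        send_req = MPI.Isend(send_arr[i], comm; dest=myrank, tag=i)
        wait(recv_req)   # cooperative: the other tasks keep running while this one waits
        wait(send_req)
        @test recv_arr[i] == send_arr[i]
    end
end

MPI.Finalize()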
Co-authored-by: Valentin Churavy <[email protected]>
Could this be reviewed and finalised 🙏?
Shall we thus revert the test and spawn a single thread that does the non-blocking send and receive within it, as originally proposed, given #766 (comment)?
A first implementation of non-blocking wait as discussed in #762