[WIP] Use FunctionWrappers for custom reduction operators #637

Draft
wants to merge 1 commit into
base: master

giordano
Member

My attempt at addressing #404. This is using FunctionWrappers.jl instead of @cfunction, as suggested in a recent JuliaHPC call.
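
For readers unfamiliar with the package, here is a minimal sketch of what a FunctionWrapper does on its own, outside of MPI. It assumes FunctionWrappers.jl is installed, uses the same operator as the reproducer below, and says nothing about how this PR passes the wrapper to MPI; the name `wrapped` is only illustrative.

```julia
using FunctionWrappers: FunctionWrapper

# Same operator as in the reproducer; algebraically it is just x + y.
op = (x, y) -> 2x + y - x

# Wrap the closure behind a fixed signature: (Int, Int) -> Int.
wrapped = FunctionWrapper{Int, Tuple{Int, Int}}(op)

# Calling the wrapper forwards to the closure and returns an Int.
result = wrapped(1, 2)
```

The wrapper has a concrete type that depends only on the signature, not on the closure, which is presumably what makes it attractive as a replacement for a closure `@cfunction`.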

Note: it doesn't work for me even on x86_64 when using more than 1 rank. Simple reproducer:

using MPI
MPI.Init()
let
    T = Int
    dims = 1
    op = (x,y) -> 2x+y-x
    send_arr = Array(zeros(T, Tuple(3 for i in 1:dims)))
    send_arr[:] .= 1:length(send_arr)
    recv_arr = Array{T}(undef, size(send_arr))
    MPI.Reduce!(send_arr, recv_arr, op, MPI.COMM_WORLD; root=0)
end

Run the script with

mpiexecjl -np 2 julia /path/to/script

The program should die with a segmentation fault (but the generated pointer should be non-null, as far as I can tell).
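
For comparison, the result a correct run would produce can be computed serially. This is only a sketch: it assumes 2 ranks (as in the `-np 2` invocation above), identical input on every rank, and that the reduction applies `op` elementwise across ranks; `contributions` and `expected` are illustrative names.

```julia
# Operator from the reproducer; it simplifies to x + y.
op = (x, y) -> 2x + y - x

nranks = 2                      # matches `-np 2`
send_arr = collect(1:3)         # every rank sends [1, 2, 3]

# One entry per rank, combined pairwise and elementwise with `op`.
contributions = fill(send_arr, nranks)
expected = foldl((a, b) -> op.(a, b), contributions)
```

With these inputs `expected` is `[2, 4, 6]`, so anything else in `recv_arr` on the root indicates a problem in the reduction.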

Member Author

giordano commented Sep 21, 2022

I thought that there was a segfault in the call to MPI_Reduce at

MPI.jl/src/collective.jl

Lines 598 to 600 in 4cd7118

@mpichk ccall((:MPI_Reduce, libmpi), Cint,
              (MPIPtr, MPIPtr, Cint, MPI_Datatype, MPI_Op, Cint, MPI_Comm),
              rbuf.senddata, rbuf.recvdata, rbuf.count, rbuf.datatype, op, root, comm)

but putting some debug printing shows that the ccall should work fine. I'm wondering if something should be GC.@preserve'd.

Edit: "should work fine" was probably a bit optimistic, as the result of the reduction is garbage.
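
To make the `GC.@preserve` idea concrete, here is a generic sketch of the pattern, not a diagnosis of this bug. Whether a missing preserve is actually the cause here is unconfirmed. `sum_via_pointer` is a hypothetical stand-in for foreign code that only sees a raw pointer.

```julia
# Hypothetical helper: mimics a foreign call that receives only a raw pointer,
# so the garbage collector cannot tell that `data` is still in use.
function sum_via_pointer(ptr::Ptr{Int}, n::Integer)
    total = 0
    for i in 1:n
        total += unsafe_load(ptr, i)
    end
    return total
end

data = collect(1:3)

# Keep `data` rooted for as long as the raw pointer is being used.
total = GC.@preserve data begin
    sum_via_pointer(pointer(data), length(data))
end
```

The same reasoning would apply to any object whose address is handed to MPI and used after the Julia-side reference could otherwise be collected.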
