Try out PoCL-R #258

Open
maleadt opened this issue Oct 16, 2024 · 1 comment


maleadt commented Oct 16, 2024

This feature of PoCL makes it possible to offload to multiple servers, see pocl/pocl#1621 (comment). That could be an interesting approach to distributing code, although I fear that our current SPMD-style kernels are written with local GPUs in mind and thus won't scale well to a distributed execution environment.

(I know OpenCL.jl isn't the best place to keep track of this, but it's close enough, and issues on pocl_jll.jl have no visibility.)

@pjaaskel

There's also a sub-buffer migration manager in PoCL-R. That is, data decomposition can be done with sub-buffers of the main buffer, and the runtime then manages the sub-buffer transfers directly between nodes as well (this is a rather fresh feature for which I'd like to see more testing). Thus, distributing the workload using only OpenCL would minimally require generating multiple SPMD kernel commands (one for each local/remote GPU/CPU) instead of a single one, and splitting the data with sub-buffer offsets. And, of course, launching a few more kernel commands on top of that, just to hide the data transfer latencies.
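
To make the decomposition concrete, here is a rough sketch of the offset arithmetic only; it makes no OpenCL or PoCL-R calls. The names `split_into_subbuffers` and `assign_round_robin`, the 128-byte default alignment, and the round-robin assignment are all assumptions for illustration. In a real program the alignment would come from the device's `CL_DEVICE_MEM_BASE_ADDR_ALIGN` (reported in bits), since sub-buffer origins must respect it, and each region would be passed to `clCreateSubBuffer`.

```python
def split_into_subbuffers(total_bytes, num_devices, align_bytes=128,
                          chunks_per_device=1):
    """Return (origin, size) byte regions covering a buffer of total_bytes.

    Hypothetical helper: every origin is a multiple of align_bytes, so the
    regions could serve as sub-buffer origins. chunks_per_device > 1 yields
    more, smaller regions (and thus more kernel commands), which is one way
    to give the runtime room to overlap transfers with compute.
    """
    if min(total_bytes, num_devices, align_bytes, chunks_per_device) <= 0:
        raise ValueError("all arguments must be positive")

    wanted_chunks = num_devices * chunks_per_device
    # Ceiling division, then round the chunk size up to the alignment.
    ideal = -(-total_bytes // wanted_chunks)
    chunk = -(-ideal // align_bytes) * align_bytes

    regions = []
    origin = 0
    while origin < total_bytes:
        # The last region may be shorter than the others.
        regions.append((origin, min(chunk, total_bytes - origin)))
        origin += chunk
    # Note: rounding up can produce fewer regions than wanted_chunks.
    return regions


def assign_round_robin(regions, num_devices):
    """Pair each region with a device index (assumed scheduling policy)."""
    return [(i % num_devices, origin, size)
            for i, (origin, size) in enumerate(regions)]


if __name__ == "__main__":
    # Two devices, two kernel commands each, over a 1000-byte buffer.
    regions = split_into_subbuffers(1000, 2, chunks_per_device=2)
    for device, origin, size in assign_round_robin(regions, 2):
        print(f"device {device}: origin={origin} size={size}")
```

Each `(device, origin, size)` triple would then correspond to one kernel command whose global work size covers only that region; whether round-robin is a sensible placement for remote devices is an open question here.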
