Fix GPU tests #764

Merged: simonbyrne merged 5 commits into JuliaParallel:master from lr/fix-test on Sep 6, 2023

Conversation

luraess (Contributor) commented Sep 6, 2023

This PR tries to fix the issue we hit when several JuliaGPU packages (here CUDA.jl and AMDGPU.jl) need to be accessible for testing in the GPU CI on Buildkite. These packages may not depend on the same versions of, e.g., GPUCompiler.jl, which can result in unresolvable compat bounds, as is the case without this PR.

One way to address this is to introduce weakdeps and add packages specifically for the target backend platform. Here, this is done by parsing an extra flag passed when calling the test:

import Pkg
Pkg.test("MPI"; test_args=["--backend=CUDA"])

Issue: this works fine on Julia 1.9 and above, but it currently fails for the MPI + CUDA tests run on Julia 1.6, as weakdeps may not be supported by 1.6. Shall we drop the 1.6 tests and keep current (+ nightly)?
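
The flag-based backend selection described above could be sketched roughly as follows. This is only an illustration, not the code from this PR: the helper names `parse_backend` and `add_backend_package`, the `"CPU"` default, and the `"AMDGPU"` flag value are assumptions. Only the `--backend=CUDA` flag and the `Pkg.test` call appear in the description.

```julia
import Pkg

# Hypothetical helper: return the value of `--backend=<name>` from a list of
# command-line arguments, or `default` when the flag is absent.
function parse_backend(args::AbstractVector{<:AbstractString}; default::String="CPU")
    prefix = "--backend="
    for arg in args
        if startswith(arg, prefix)
            # The prefix is pure ASCII, so indexing by its length is safe.
            return String(arg[(length(prefix) + 1):end])
        end
    end
    return default
end

# Hypothetical helper: add only the GPU package the selected backend needs,
# so CUDA.jl and AMDGPU.jl never have to resolve in the same environment.
function add_backend_package(backend::AbstractString)
    if backend == "CUDA"
        Pkg.add("CUDA")
    elseif backend == "AMDGPU"   # assumed flag value, not shown in the PR text
        Pkg.add("AMDGPU")
    end
    return nothing               # any other value (e.g. "CPU"): nothing to add
end

# Inside test/runtests.jl, ARGS holds whatever was passed through `test_args`.
backend = parse_backend(ARGS)
add_backend_package(backend)
@info "Selected test backend" backend
```

With this layout, a plain `Pkg.test("MPI")` would install no GPU package, while each Buildkite job would pass the flag for the one backend it targets.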

luraess (Contributor, Author) commented Sep 6, 2023

Besides the 1.6 issue, the tests now fail again because of MPI + GPU, not because of the test environment setup.

simonbyrne (Member) commented

Our AMD CI tests are still failing...

simonbyrne (Member) commented

@luraess did you want to look into this? Or just merge as-is?

luraess (Contributor, Author) commented Sep 6, 2023

> Our AMD CI tests are still failing...

CUDA is also failing

luraess (Contributor, Author) commented Sep 6, 2023

> @luraess did you want to look into this? Or just merge as-is?

@simonbyrne I'd say we could merge the infrastructure update and look into fixing the respective ROCm and CUDA tests in a separate PR?

@simonbyrne simonbyrne merged commit 06d52e8 into JuliaParallel:master Sep 6, 2023
38 of 44 checks passed
@luraess luraess deleted the lr/fix-test branch September 6, 2023 19:57
@luraess luraess mentioned this pull request Sep 6, 2023