Version incompatibilities with the JuliaGPU stack #694

Closed
anicusan opened this issue Nov 11, 2024 · 4 comments

@anicusan
Member

anicusan commented Nov 11, 2024

We recently set up a GPU CI for all the GPU backends of AcceleratedKernels.jl - it seems AMDGPU is the only platform where the call to strides(::ROCArray) fails inside a KernelAbstractions.jl kernel:

Image

A few notes: I know @pxl-th ran the N-dimensional reduction and it worked on his machine, but it may have been with GPUArraysCore=0.2. The ecosystem hasn't yet fully updated to that version, so I had to add GPUArraysCore="0.1, 0.2" to [compat] - it may have been due to some combination of Adapt, Metal, and GPUArrays resulting in version conflicts.
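For reference, the widened bound described above can also be set from the Julia REPL instead of editing Project.toml by hand. This is a minimal sketch, assuming Julia 1.8 or newer and an active project that already lists GPUArraysCore as a dependency; the project path is a placeholder.

```julia
using Pkg

# Assumption: the current directory holds the dev-ed package's Project.toml.
Pkg.activate(".")

# Same effect as writing `GPUArraysCore = "0.1, 0.2"` under [compat]:
# both the 0.1.x and the 0.2.x series are accepted by the resolver.
Pkg.compat("GPUArraysCore", "0.1, 0.2")
```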

I can do a hacky fix and pass the strides to the kernel as an explicit argument, but I wanted to know if this is a known issue and where / how it could be solved.
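A rough sketch of that workaround, run here on the CPU backend so it needs no GPU. The kernel name `copy_with_strides!` and the 2D copy it performs are made up for illustration and are not taken from AcceleratedKernels.jl; the only point is that `strides` is called on the host and the resulting tuple is handed to the kernel as an ordinary argument.

```julia
using KernelAbstractions

# Hypothetical kernel: copies a 2D array element by element, computing the
# linear offset from strides received as a plain tuple argument instead of
# calling strides(src) inside the kernel body.
@kernel function copy_with_strides!(dst, @Const(src), src_strides)
    i, j = @index(Global, NTuple)
    offset = (i - 1) * src_strides[1] + (j - 1) * src_strides[2] + 1
    dst[i, j] = src[offset]
end

src = rand(Float32, 4, 3)
dst = similar(src)

backend = get_backend(src)              # CPU() for a plain Array
kernel! = copy_with_strides!(backend)

# strides(src) is evaluated on the host, outside the kernel.
kernel!(dst, src, strides(src); ndrange=size(src))
KernelAbstractions.synchronize(backend)

@assert dst == src
```

With a `ROCArray` the same call pattern should apply, with `get_backend` returning the AMDGPU backend, though that path is untested here.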

anicusan changed the title from "Function strides not found" to "Function strides not found inside KernelAbstractions kernel" on Nov 11, 2024

@pxl-th
Collaborator

pxl-th commented Nov 11, 2024

That was fixed here #692.
Make sure CI pulls the AMDGPU release that contains the fix.

@anicusan
Member Author

anicusan commented Nov 11, 2024

You're right, AMDGPU is downgraded in the GPU CI, along with GPUArrays, GPUArraysCore and GPUCompiler.
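One way to spot this kind of silent downgrade is to print the resolved versions at the start of the CI job. A small sketch, assuming Julia 1.8 or newer and that the CI environment is the active project:

```julia
using Pkg

# Print the versions actually resolved in the active environment.
Pkg.status()

# List packages held below their latest release, together with the
# packages whose [compat] bounds are holding them back.
Pkg.status(outdated=true)
```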

I can reproduce the error locally by setting [compat] to GPUArraysCore="0.2" only - trying to add Metal to the dev-ed AcceleratedKernels then gives me:

ERROR: Unsatisfiable requirements detected for package GPUArrays [0c68f7d7]:
 GPUArrays [0c68f7d7] log:
 ├─possible versions are: 0.3.0 - 11.1.0 or uninstalled
 ├─restricted by compatibility requirements with GPUArraysCore [46192b85] to versions: [0.3.0 - 8.3.2, 11.0.0 - 11.1.0] or uninstalled
 │ └─GPUArraysCore [46192b85] log:
 │   ├─possible versions are: 0.1.0 - 0.2.0 or uninstalled
 │   └─restricted to versions 0.2 by AcceleratedKernels [6a4ca0a5], leaving only versions: 0.2.0
 │     └─AcceleratedKernels [6a4ca0a5] log:
 │       ├─possible versions are: 0.3.0 or uninstalled
 │       ├─restricted to versions * by project [bcbbefdc], leaving only versions: 0.3.0
 │       │ └─project [bcbbefdc] log:
 │       │   ├─possible versions are: 0.0.0 or uninstalled
 │       │   └─project [bcbbefdc] is fixed to version 0.0.0
 │       └─AcceleratedKernels [6a4ca0a5] is fixed to version 0.3.0-DEV
 └─restricted by compatibility requirements with Metal [dde4c033] to versions: 8.4.0 - 10.3.1 — no versions left
   └─Metal [dde4c033] log:
     ├─possible versions are: 0.0.1 - 1.4.2 or uninstalled
     └─restricted to versions * by an explicit requirement, leaving only versions: 0.0.1 - 1.4.2

Same with trying to add AMDGPU instead - a clusterduck of version incompatibilities:

   Resolving package versions...
ERROR: Unsatisfiable requirements detected for package AMDGPU [21141c5a]:
 AMDGPU [21141c5a] log:
 ├─possible versions are: 0.1.0 - 1.1.0 or uninstalled
 ├─restricted to versions * by an explicit requirement, leaving only versions: 0.1.0 - 1.1.0
 ├─restricted by compatibility requirements with AcceleratedKernels [6a4ca0a5] to versions: 0.1.0 - 1.0.4 or uninstalled, leaving only versions: 0.1.0 - 1.0.4
 │ └─AcceleratedKernels [6a4ca0a5] log:
 │   ├─possible versions are: 0.3.0 or uninstalled
 │   ├─restricted to versions * by project [bcbbefdc], leaving only versions: 0.3.0
 │   │ └─project [bcbbefdc] log:
 │   │   ├─possible versions are: 0.0.0 or uninstalled
 │   │   └─project [bcbbefdc] is fixed to version 0.0.0
 │   └─AcceleratedKernels [6a4ca0a5] is fixed to version 0.3.0-DEV
 ├─restricted by compatibility requirements with LLVM [929cbde3] to versions: 0.2.9 - 1.1.0 or uninstalled, leaving only versions: 0.2.9 - 1.0.4
 │ └─LLVM [929cbde3] log:
 │   ├─possible versions are: 0.9.0 - 9.1.3 or uninstalled
 │   ├─restricted by julia compatibility requirements to versions: 4.0.0 - 9.1.3 or uninstalled
 │   ├─restricted by compatibility requirements with AMDGPU [21141c5a] to versions: [2.0.0 - 7.2.1, 8.1.0 - 9.1.3], leaving only versions: [4.0.0 - 7.2.1, 8.1.0 - 9.1.3]
 │   │ └─AMDGPU [21141c5a] log: see above
 │   ├─restricted by compatibility requirements with UnsafeAtomicsLLVM [d80eeb9a] to versions: 4.12.0 - 9.1.3, leaving only versions: [4.12.0 - 7.2.1, 8.1.0 - 9.1.3]
 │   │ └─UnsafeAtomicsLLVM [d80eeb9a] log:
 │   │   ├─possible versions are: 0.1.0 - 0.2.1 or uninstalled
 │   └─restricted by compatibility requirements with GPUCompiler [61eb1bfa] to versions: [6.0.0 - 6.6.3, 7.1.0 - 9.1.3], leaving only versions: [6.0.0 - 6.6.3, 7.1.0 - 7.2.1, 8.1.0 - 9.1.3]
 │     └─GPUCompiler [61eb1bfa] log:
 │       ├─possible versions are: 0.1.0 - 1.0.1 or uninstalled
 │       ├─restricted by julia compatibility requirements to versions: 0.23.0 - 1.0.1 or uninstalled
 │       └─restricted by compatibility requirements with AMDGPU [21141c5a] to versions: [0.4.0 - 0.5.5, 0.7.0 - 0.17.3, 0.19.0 - 0.19.4, 0.21.0 - 1.0.1], leaving only versions: 0.23.0 - 1.0.1
 │         └─AMDGPU [21141c5a] log: see above
 ├─restricted by compatibility requirements with Adapt [79e6a3ab] to versions: 0.8.4 - 1.1.0 or uninstalled, leaving only versions: 0.8.4 - 1.0.4
 │ └─Adapt [79e6a3ab] log:
 │   ├─possible versions are: 0.3.0 - 4.1.1 or uninstalled
 │   ├─restricted by compatibility requirements with AMDGPU [21141c5a] to versions: 0.4.0 - 4.1.1
 │   │ └─AMDGPU [21141c5a] log: see above
 │   └─restricted by compatibility requirements with GPUArraysCore [46192b85] to versions: 4.0.0 - 4.1.1
 │     └─GPUArraysCore [46192b85] log:
 │       ├─possible versions are: 0.1.0 - 0.2.0 or uninstalled
 │       └─restricted to versions 0.2 by AcceleratedKernels [6a4ca0a5], leaving only versions: 0.2.0
 │         └─AcceleratedKernels [6a4ca0a5] log: see above
 └─restricted by compatibility requirements with GPUArrays [0c68f7d7] to versions: 1.1.0 or uninstalled — no versions left
   └─GPUArrays [0c68f7d7] log:
     ├─possible versions are: 0.3.0 - 11.1.0 or uninstalled
     ├─restricted by compatibility requirements with AMDGPU [21141c5a] to versions: [2.0.0 - 10.3.1, 11.1.0]
     │ └─AMDGPU [21141c5a] log: see above
     ├─restricted by compatibility requirements with GPUArraysCore [46192b85] to versions: [0.3.0 - 8.3.2, 11.0.0 - 11.1.0] or uninstalled, leaving only versions: [2.0.0 - 8.3.2, 11.1.0]
     │ └─GPUArraysCore [46192b85] log: see above
     ├─restricted by compatibility requirements with AbstractFFTs [621f4979] to versions: [0.3.0 - 1.0.4, 2.0.1 - 11.1.0] or uninstalled, leaving only versions: [2.0.1 - 8.3.2, 11.1.0]
     │ └─AbstractFFTs [621f4979] log:
     │   ├─possible versions are: 0.3.0 - 1.5.0 or uninstalled
     │   └─restricted by compatibility requirements with AMDGPU [21141c5a] to versions: 0.5.0 - 1.5.0
     │     └─AMDGPU [21141c5a] log: see above
     └─restricted by compatibility requirements with Adapt [79e6a3ab] to versions: [0.3.0 - 0.5.0, 10.0.0 - 11.1.0] or uninstalled, leaving only versions: 11.1.0
       └─Adapt [79e6a3ab] log: see above

This is now not the same issue anymore, but would you know how to read these errors and what a fix could be?
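On reading these logs: each "restricted by ... to versions" line narrows the set of candidate versions for the package named at the top, and the line ending in "no versions left" is where the intersection becomes empty. In the Metal log above, GPUArraysCore 0.2.0 allows GPUArrays 0.3.0 - 8.3.2 or 11.0.0 - 11.1.0, while Metal allows only 8.4.0 - 10.3.1, so nothing satisfies both. The sketch below replays that intersection in plain Julia; the ranges are copied from the log, while the `candidates` list is just a hand-picked sample of boundary versions, not the actual registry contents.

```julia
# Ranges copied from the resolver log for GPUArrays.
allowed_by_core(v)  = (v"0.3.0" <= v <= v"8.3.2") || (v"11.0.0" <= v <= v"11.1.0")
allowed_by_metal(v) = v"8.4.0" <= v <= v"10.3.1"

# Hypothetical sample of versions at the edges of each range.
candidates = [v"8.3.2", v"8.4.0", v"10.3.1", v"11.0.0", v"11.1.0"]

# Keep only versions that satisfy both constraints at once.
feasible = filter(v -> allowed_by_core(v) && allowed_by_metal(v), candidates)

@assert isempty(feasible)   # "no versions left"
```

The AMDGPU log follows the same pattern, only with more constraints chained together before the set runs out.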

anicusan changed the title from "Function strides not found inside KernelAbstractions kernel" to "Version incompatibilities with the JuliaGPU stack" on Nov 11, 2024

@anicusan
Member Author

This was because of the circular dependency between AK and AMDGPU, even though it does not really show in the logs above. I'll keep AK at 0.2 and defer bumping the version number until we coordinate the update on both sides.

@pxl-th
Collaborator

pxl-th commented Nov 12, 2024

Ah, yes. I usually defer updating the version until the very moment I want to tag a release; this makes things easier for CI.
