MI300X (gfx942) support for broadcast operations #621
Comments
Worth noting that this works on an MI50 and on the integrated GPU of a 7950X.
Are we missing something needed to support gfx942, @pxl-th?
Note: gfx942 is new and not widely available, so I didn't expect everything to work. I'm happy to work on this with you, though.
This is probably because Julia 1.10 ships LLVM 15, whereas gfx942 support was officially added in LLVM 17, IIUC. You can try the Julia 1.11 early release (which has LLVM 16), but I haven't tested it with AMD GPUs at all yet.
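To make the version argument above concrete, here is a small sketch that compares the LLVM bundled with the running Julia against the LLVM 17 threshold mentioned in this comment. The threshold and the names `GFX942_MIN_LLVM` and `llvm_supports_gfx942` are assumptions introduced for illustration; they are not part of AMDGPU.jl.

```julia
# Assumption taken from this thread: gfx942 support landed in LLVM 17.
const GFX942_MIN_LLVM = v"17"

# True when the given LLVM version is at least the assumed minimum.
# Defaults to the LLVM that the running Julia was built against.
llvm_supports_gfx942(llvm::VersionNumber = Base.libllvm_version) =
    llvm >= GFX942_MIN_LLVM

println("Julia ", VERSION, " bundles LLVM ", Base.libllvm_version)
println("New enough for gfx942 (assuming LLVM >= 17): ", llvm_supports_gfx942())
```

This only checks a necessary condition. As other comments in this thread show, a new enough LLVM does not by itself guarantee that AMDGPU.jl works on the device.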
Julia 1.11:
AMDGPU 0.9 now supports Julia 1.11 and maybe MI300X.
I just hit an issue similar to the original post with Julia 1.11.0-beta2, ROCm 6.1.2, and AMDGPU 0.9.5. With and without setting
I have been testing on Runpod and built a Julia-1.11-rc AMD ROCm template you can use to deploy an MI300X. I am happy to help with any debugging as well.
Then we need Julia 1.12, which has LLVM 17 (1.11 has LLVM 16).
I just built Julia from source (and also added version 17 to the compatible versions of LLD_jll and LLVM_jll for AMDGPU), and got the same issue:
# ./julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.12.0-DEV.706 (2024-06-11)
 _/ |\__'_|_|_|\__'_|  |  Commit e7893a1fa4 (0 days old master)
|__/                   |
julia> versioninfo()
Julia Version 1.12.0-DEV.706
Commit e7893a1fa4 (2024-06-11 09:53 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 192 × AMD EPYC 9474F 48-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-17.0.6 (ORCJIT, znver4)
Threads: 1 default, 0 interactive, 1 GC (on 192 virtual cores)
Environment:
  JULIA_DEPOT_PATH = /root/
julia> using AMDGPU
julia> AMDGPU.versioninfo()
[ Info: AMDGPU versioninfo
┌───────────┬──────────────────┬───────────┬─────────────────────────────────────────────────────────────────────────┐
│ Available │ Name             │ Version   │ Path                                                                    │
├───────────┼──────────────────┼───────────┼─────────────────────────────────────────────────────────────────────────┤
│     +     │ LLD              │ -         │ /opt/rocm/llvm/bin/ld.lld                                               │
│     +     │ Device Libraries │ -         │ /root/artifacts/5ad5ecb46e3c334821f54c1feecc6c152b7b6a45/amdgcn/bitcode │
│     +     │ HIP              │ 6.1.40093 │ /opt/rocm/lib/libamdhip64.so                                            │
│     +     │ rocBLAS          │ 4.1.2     │ /opt/rocm/lib/librocblas.so.4                                           │
│     +     │ rocSOLVER        │ 3.25.0    │ /opt/rocm/lib/librocsolver.so.0                                         │
│     +     │ rocALUTION       │ -         │ /opt/rocm/lib/librocalution.so.1                                        │
│     +     │ rocSPARSE        │ -         │ /opt/rocm/lib/librocsparse.so.1                                         │
│     +     │ rocRAND          │ 2.10.5    │ /opt/rocm/lib/librocrand.so.1                                           │
│     +     │ rocFFT           │ 1.0.27    │ /opt/rocm/lib/librocfft.so.0                                            │
│     +     │ MIOpen           │ 3.1.0     │ /opt/rocm/lib/libMIOpen.so.1                                            │
└───────────┴──────────────────┴───────────┴─────────────────────────────────────────────────────────────────────────┘
[ Info: AMDGPU devices
┌────┬─────────────────────┬────────────────────────┬───────────┬─────────────┐
│ Id │                Name │               GCN arch │ Wavefront │      Memory │
├────┼─────────────────────┼────────────────────────┼───────────┼─────────────┤
│  1 │ AMD Instinct MI300X │ gfx942:sramecc+:xnack- │        64 │ 191.984 GiB │
└────┴─────────────────────┴────────────────────────┴───────────┴─────────────┘
julia> a_d = ROCMatrix(rand(Float16,5,5))
5×5 ROCArray{Float16, 2, AMDGPU.Runtime.Mem.HIPBuffer}:
 0.5596  0.292   0.8354  0.3677   0.641
 0.1567  0.978   0.4614  0.2144   0.717
 0.4023  0.8706  0.9004  0.9033   0.2319
 0.3042  0.3652  0.48    0.02197  0.1309
 0.7817  0.1909  0.4595  0.3193   0.846
julia> z_d = a_d .- Float16(0.5)
ERROR: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#35#37")(::AMDGPU.ROCKernelContext, ::AMDGPU.Device.ROCDeviceMatrix{…}, ::Base.Broadcast.Broadcasted{…}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to pointerset(ptr::Core.LLVMPtr{T, A}, x::T, i::I, ::Val{align}) where {T, A, I, align} @ LLVM.Interop none:0)
Stacktrace:
 [1] unsafe_store! (repeats 3 times)
   @ ~/packages/LLVM/6cDbl/src/interop/pointer.jl:88
...
Notably, the
AMDGPU.jl needs to account for changes in Julia 1.12; I haven't done that yet.
Can you give an indication of what needs to be done? I can't promise anything, but I may or may not have a chance to look into this (if it doesn't take too long 🥲)
Simple reproducer; I'm not sure whether this specific use case is supported. CPU and GPU versions are included for comparison. Setup: MI300X GPU, Ubuntu 22.04, ROCm 6.1 pre-release.
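For readers who want to try a comparison of this kind, here is a minimal sketch modeled on the REPL session elsewhere in this thread. It is not the original poster's code. `sub_half` is a hypothetical helper name, and the GPU lines are left as comments because they assume AMDGPU.jl is installed and a ROCm device is visible.

```julia
# Generic over the array type, so the same broadcast can be run on a CPU
# Array and, in principle, on a GPU array.
sub_half(a::AbstractArray{Float16}) = a .- Float16(0.5)

# CPU reference.
a_h = rand(Float16, 5, 5)
z_h = sub_half(a_h)
println("CPU result: ", size(z_h), " matrix of ", eltype(z_h))

# GPU side (assumes AMDGPU.jl and a working ROCm device):
# using AMDGPU
# a_d = ROCArray(a_h)        # copy the host matrix to the device
# z_d = sub_half(a_d)        # the kind of broadcast reported to fail on gfx942
# @assert Array(z_d) ≈ z_h   # compare against the CPU reference
```

Keeping the operation in one generic function means both paths run exactly the same broadcast, so any difference comes from the GPU backend rather than from the test itself.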
The a_h and z_h are as expected. The a_d and b_d are properly set, though the subtraction yields this