rocWMMA support #560
Comments
One could, but it would probably require quite a bit of work by someone. CUDA.jl's WMMA support was the work of a full-time master's student.
well, it has to be done at some point. why should only NVIDIA get all the goodies? :P
For matrix multiplication we are using rocBLAS; adding WMMA support won't affect its performance.
And at the moment, matrix multiplication is not a bottleneck in DL applications for AMDGPU.
Are you volunteering?
I would like to try.
Could you implement rocWMMA for use with Navi3 GPUs? From what I understood, it uses the AI accelerators present in them for faster matrix multiplication. I guess this could make DL tasks faster.
https://github.com/ROCmSoftwarePlatform/rocWMMA
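For context, rocWMMA exposes a C++ fragment-based API very similar to CUDA's `nvcuda::wmma`: a wavefront cooperatively loads tiles of the input matrices into fragments, issues a fused multiply-accumulate on the matrix hardware, and stores the result. A minimal single-tile kernel might look roughly like the sketch below (the fp16 inputs, fp32 accumulator, 16×16×16 tile shape, and layouts are one illustrative configuration chosen here, not something stated in this thread):

```cpp
#include <hip/hip_runtime.h>
#include <rocwmma/rocwmma.hpp>

using rocwmma::float16_t;
using rocwmma::float32_t;

// One possible tile shape supported by the WMMA hardware path.
constexpr int WMMA_M = 16, WMMA_N = 16, WMMA_K = 16;

// Each wavefront computes one 16x16 output tile of D = A * B.
__global__ void wmma_gemm_tile(const float16_t* a, const float16_t* b,
                               float32_t* d, int lda, int ldb, int ldd)
{
    // Fragments are wavefront-wide register tiles of A, B, and the accumulator.
    rocwmma::fragment<rocwmma::matrix_a, WMMA_M, WMMA_N, WMMA_K,
                      float16_t, rocwmma::row_major> fragA;
    rocwmma::fragment<rocwmma::matrix_b, WMMA_M, WMMA_N, WMMA_K,
                      float16_t, rocwmma::col_major> fragB;
    rocwmma::fragment<rocwmma::accumulator, WMMA_M, WMMA_N, WMMA_K,
                      float32_t> fragAcc;

    rocwmma::fill_fragment(fragAcc, 0.0f);          // zero the accumulator
    rocwmma::load_matrix_sync(fragA, a, lda);       // load A tile
    rocwmma::load_matrix_sync(fragB, b, ldb);       // load B tile
    rocwmma::mma_sync(fragAcc, fragA, fragB, fragAcc); // acc += A * B
    rocwmma::store_matrix_sync(d, fragAcc, ldd, rocwmma::mem_row_major);
}
```

Exposing something like this from Julia would mean generating the equivalent matrix-core intrinsics through AMDGPU.jl's compiler pipeline, which is the "quite a bit of work" referred to above.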