rocWMMA support #560

Open
radudiaconu0 opened this issue Dec 3, 2023 · 6 comments

Comments


radudiaconu0 commented Dec 3, 2023

Could you implement rocWMMA support for use with Navi3 GPUs? From what I understand, it uses the AI accelerators present in them for faster matrix multiplication. I guess this could make DL tasks faster.

https://github.com/ROCmSoftwarePlatform/rocWMMA

radudiaconu0 changed the title from "rocWMMA.jl support" to "rocWMMA support" on Dec 3, 2023
@vchuravy
Member

One could, but it would probably require quite a bit of work by someone.

CUDA.jl WMMA support was the work of a full-time master's student.

@radudiaconu0
Author

> One could, but it would probably require quite a bit of work by someone.
>
> CUDA.jl WMMA support was the work of a full-time master's student.

Well, it has to be done at some point. Why should only NVIDIA get all the goodies? :P

@pxl-th
Member

pxl-th commented Dec 18, 2023

For matrix multiplication we are using rocBLAS, so adding WMMA support won't affect its performance.

@pxl-th
Member

pxl-th commented Dec 18, 2023

And at this moment matrix multiplication is not a bottleneck in DL applications for AMDGPU.
Timely memory freeing is.

@vchuravy
Member

> Well, it has to be done at some point. Why should only NVIDIA get all the goodies? :P

Are you volunteering?

@radudiaconu0
Author

> > Well, it has to be done at some point. Why should only NVIDIA get all the goodies? :P
>
> Are you volunteering?

I would like to try.
