This repository has been archived by the owner on Jun 28, 2024. It is now read-only.

what about ait #15

Open
Boom-Hacker opened this issue Oct 9, 2023 · 1 comment

Comments

@Boom-Hacker

I only ran AITemplate on navi3_rel_ver_1.0, and it is quite old.

@evshiron
Owner

evshiron commented Oct 9, 2023

Yes, that branch is very old. I made ad hoc fixes while debugging and only managed to bring it to a point where it can reach about 25 it/s. According to reports, using this commit of ROCm LLVM can reach 30 it/s.

The submodule in this branch is pinned to the specified branch of Composable Kernel, which has a Fused Attention implementation for Navi 3x.

I previously spent a lot of time trying to integrate this Fused Attention into PyTorch; you can find my efforts here:

If you're interested, you can check out the repos in the org to conduct further research.
