what about ait #15

Boom-Hacker · 2023-10-09T14:02:03Z

I have only run AITemplate on the navi3_rel_ver_1.0 branch, and it is very old.

evshiron · 2023-10-09T15:19:27Z

Yes, that branch is very old. I made ad hoc fixes while debugging and only got it to the point where it reaches 25 it/s. According to reports, using this commit of ROCm LLVM can reach 30 it/s.

The submodule in this branch is linked to the specified branch of Composable Kernel, which has a Fused Attention implementation for Navi 3x.

I previously spent a lot of time trying to integrate this Fused Attention into PyTorch. You can find my efforts here:

https://github.com/orgs/are-we-gfx1100-yet/repositories

If you're interested, you can check out the repos in the org to conduct further research.
