Add support for smem_epilogue when mma output is not cast to half #3620

protonu · 2024-12-19T14:48:24Z

Support non-stmatrix stores from registers to shared memory, followed by a TMA store, when the output of the mma op is not cast back to half precision. stmatrix works with half precision only.
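
The choice described above can be sketched as a small selection function. This is only an illustration under stated assumptions: `DType`, `RegToSmemStore`, `pickRegToSmemStore`, and `checkStoreSelection` are hypothetical names introduced here, not nvFuser APIs, and the sketch models nothing beyond the rule stated in this description (stmatrix for half-precision output, a regular store otherwise).

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are NOT nvFuser types.
enum class DType { Half, Float };

// How the epilogue moves the mma result from registers to shared memory
// before the TMA store writes it out to global memory.
enum class RegToSmemStore { StMatrix, Plain };

// Per the PR description, stmatrix works with half precision only, so any
// output that was not cast back to half falls back to a plain store.
constexpr RegToSmemStore pickRegToSmemStore(DType output_dtype) {
  return output_dtype == DType::Half ? RegToSmemStore::StMatrix
                                     : RegToSmemStore::Plain;
}

// Small usage example for the two cases this PR distinguishes.
inline void checkStoreSelection() {
  assert(pickRegToSmemStore(DType::Half) == RegToSmemStore::StMatrix);
  assert(pickRegToSmemStore(DType::Float) == RegToSmemStore::Plain);
}
```

In both cases the data still goes through shared memory and is then written out with TMA; only the register-to-shared-memory step differs.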

protonu · 2024-12-19T15:27:43Z

!test

csrc/scheduler/hopper_multi_matmul.cpp

tests/cpp/test_matmul_scheduler.cpp

protonu · 2024-12-19T15:58:20Z

!test

protonu · 2024-12-19T16:01:32Z

!test

csrc/scheduler/hopper_multi_matmul.cpp

tests/cpp/test_matmul_scheduler.cpp

csrc/scheduler/hopper_multi_matmul.cpp

rdspring1

Refactor Follow-Up Proposal:
It looks like scheduleEpilogue could easily be split into two functions for readability, instead of remaining a single monolithic function.

void HopperMultipleMatmulScheduler::scheduleEpilogueWithVectorization() {}
void HopperMultipleMatmulScheduler::scheduleSmemEpilogue() {}

void HopperMultipleMatmulScheduler::scheduleEpilogue() {
  if (!params_->use_smem_epilogue) {
    scheduleEpilogueWithVectorization();
  } else {
    // Use stmatrix (optional) and tma store
    scheduleSmemEpilogue();
  }
}
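
To make the shape of this proposal easy to try in isolation, here is a self-contained mock of the same dispatch. It is a sketch, not the real scheduler: `ParamsSketch`, `EpilogueDispatchSketch`, `trace()`, and `checkDispatch` are hypothetical names, and the two helpers only record which path was taken instead of scheduling anything.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the scheduler parameters; only the flag used in
// the proposal above is modeled.
struct ParamsSketch {
  bool use_smem_epilogue = false;
};

// Hypothetical mock, not the real HopperMultipleMatmulScheduler. Each helper
// just records its name so the dispatch can be observed from outside.
class EpilogueDispatchSketch {
 public:
  explicit EpilogueDispatchSketch(const ParamsSketch* params)
      : params_(params) {}

  // Same branching as the proposal: one flag selects one of two helpers.
  void scheduleEpilogue() {
    if (!params_->use_smem_epilogue) {
      scheduleEpilogueWithVectorization();
    } else {
      scheduleSmemEpilogue();
    }
  }

  const std::vector<std::string>& trace() const { return trace_; }

 private:
  void scheduleEpilogueWithVectorization() { trace_.push_back("vectorized"); }
  void scheduleSmemEpilogue() { trace_.push_back("smem"); }

  const ParamsSketch* params_;
  std::vector<std::string> trace_;
};

// Usage example: flip the flag and confirm which helper ran.
inline void checkDispatch() {
  ParamsSketch params;
  EpilogueDispatchSketch direct(&params);
  direct.scheduleEpilogue();
  assert(direct.trace().size() == 1 && direct.trace()[0] == "vectorized");

  params.use_smem_epilogue = true;
  EpilogueDispatchSketch staged(&params);
  staged.scheduleEpilogue();
  assert(staged.trace().size() == 1 && staged.trace()[0] == "smem");
}
```

Keeping the branch in one short function means each path can be read and tested separately, which is the readability gain the proposal is after.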

jacobhinkle

LGTM

protonu · 2024-12-20T15:14:32Z

!test

protonu · 2024-12-20T19:00:26Z

!test

protonu · 2024-12-21T04:27:44Z

!test

protonu added 2 commits December 19, 2024 06:45

support for smem_epilogue when mma output is not cast to half

3bcb1b5

support propagation of dc when it's not a mma output

c5100cc

protonu requested review from jacobhinkle and rdspring1 December 19, 2024 15:27

removing the change in test precision check

2d53e61

protonu marked this pull request as ready for review December 19, 2024 15:32

jacobhinkle requested changes Dec 19, 2024

comments and reviewer points

103fd17

protonu requested a review from jacobhinkle December 19, 2024 15:58

format

99c2f49

protonu added 2 commits December 19, 2024 08:10

minor error

c979210

Merge branch 'main' into pbasu_smem_epi_no_stmatrix

cf27203

jacobhinkle reviewed Dec 19, 2024

csrc/scheduler/hopper_multi_matmul.cpp (resolved)

csrc/scheduler/hopper_multi_matmul.cpp (outdated, resolved)

tests/cpp/test_matmul_scheduler.cpp (resolved)

protonu added 2 commits December 19, 2024 09:27

comments

158655c

Merge branch 'pbasu_smem_epi_no_stmatrix' of github.com:nvidia/fuser into pbasu_smem_epi_no_stmatrix

39c20ab

protonu requested a review from jacobhinkle December 19, 2024 17:40

rdspring1 reviewed Dec 19, 2024

csrc/scheduler/hopper_multi_matmul.cpp (outdated, resolved)

rdspring1 reviewed Dec 19, 2024

csrc/scheduler/hopper_multi_matmul.cpp (outdated, resolved)

rdspring1 reviewed Dec 19, 2024

jacobhinkle approved these changes Dec 19, 2024

protonu added 3 commits December 19, 2024 14:27

Merge branch 'main' into pbasu_smem_epi_no_stmatrix

e3a5241

reviewer comments

dee2d78

Merge branch 'main' into pbasu_smem_epi_no_stmatrix

82b89de

updating stmatrix memory test

390c650

updating a mma test with stmatrix

1056263

Merge branch 'main' into pbasu_smem_epi_no_stmatrix

13598d4
