Make some fbgemm fp8 triton ops pt2 friendly #3188

Closed
wants to merge 1 commit

Commits on Sep 30, 2024

  1. Make some fbgemm fp8 triton ops pt2 friendly (pytorch#3188)

    Summary:
    X-link: facebookresearch/FBGEMM#283
    
    Pull Request resolved: pytorch#3188
    
    Make some fbgemm fp8 triton ops pt2 friendly.
    
    # What this diff tries to do
    * Stop using TensorWrapper and tl.reinterpret (see the sketch after this list)
    * Remove the use of triton_heuristics for _kernel_matmul_fp8_row
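
    To make the TensorWrapper point concrete, here is a minimal sketch (the toy kernel and helper names are made up, not FBGEMM's actual ops) of the pattern change: instead of viewing fp8 storage as uint8 and rewrapping it on the host with TensorWrapper / triton.reinterpret, the torch.float8_e4m3fn tensor is passed to the kernel directly, which recent Triton accepts and torch.compile can trace.

    ```python
    import torch
    import triton
    import triton.language as tl

    # Toy dequantize kernel: loads fp8 values and writes float32.
    @triton.jit
    def _dequant_kernel(x_ptr, out_ptr, scale, n, BLOCK: tl.constexpr):
        offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
        mask = offs < n
        x = tl.load(x_ptr + offs, mask=mask)  # fp8 pointer, no reinterpret needed
        tl.store(out_ptr + offs, x.to(tl.float32) * scale, mask=mask)

    def dequant_fp8(x: torch.Tensor, scale: float) -> torch.Tensor:
        # Old, pt2-unfriendly pattern (for contrast):
        #   x = triton.reinterpret(x.view(torch.uint8), tl.float8e4nv)
        # torch.compile cannot trace through reinterpret/TensorWrapper objects.
        out = torch.empty_like(x, dtype=torch.float32)
        n = x.numel()
        _dequant_kernel[(triton.cdiv(n, 1024),)](x, out, scale, n, BLOCK=1024)
        return out

    x = torch.randn(4096, device="cuda").to(torch.float8_e4m3fn)
    print(dequant_fp8(x, scale=0.5)[:4])
    ```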
    
    # What this diff won't help with:
    * The triton_heuristics use cases of EVEN_K. One option is to merge that into the autotuning configs (see the sketch after this list)
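
    For the EVEN_K item, one hedged sketch of folding the heuristic away (a toy kernel with hypothetical names, not the real _kernel_matmul_fp8_row): compute the flag on the host and pass it as an ordinary tl.constexpr argument instead of having @triton.heuristics inject it at launch time, so the launch stays visible to torch.compile.

    ```python
    import torch
    import triton
    import triton.language as tl

    # Toy reduction standing in for _kernel_matmul_fp8_row: EVEN_K plays the
    # same role as in the matmul, letting the kernel skip the bounds mask on
    # the K loop when BLOCK_K divides K exactly.
    @triton.jit
    def _sum_kernel(x_ptr, out_ptr, K, BLOCK_K: tl.constexpr, EVEN_K: tl.constexpr):
        acc = 0.0
        for k in range(0, K, BLOCK_K):
            offs = k + tl.arange(0, BLOCK_K)
            if EVEN_K:
                x = tl.load(x_ptr + offs)  # no mask: BLOCK_K divides K
            else:
                x = tl.load(x_ptr + offs, mask=offs < K, other=0.0)
            acc += tl.sum(x, axis=0)
        tl.store(out_ptr, acc)

    # Instead of:
    #   @triton.heuristics({"EVEN_K": lambda args: args["K"] % args["BLOCK_K"] == 0})
    # compute the same condition on the host and pass it explicitly.
    def row_sum(x: torch.Tensor, BLOCK_K: int = 128) -> torch.Tensor:
        out = torch.empty(1, device=x.device, dtype=torch.float32)
        K = x.numel()
        _sum_kernel[(1,)](x, out, K, BLOCK_K=BLOCK_K, EVEN_K=(K % BLOCK_K == 0))
        return out

    print(row_sum(torch.randn(1000, device="cuda")))
    ```

    The commit message's other option, merging EVEN_K into the autotuning configs, would instead enumerate both variants as triton.Config entries and prune the invalid ones during tuning.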
    
    # Need to do in the future:
    * Update other ops, like quantize_fp8_row.
    * Update the documentation. It feels pretty outdated, and some of it still references TensorWrapper.
    
    Reviewed By: henrylhtsang
    
    Differential Revision: D63560103
    jwfromm authored and facebook-github-bot committed Sep 30, 2024
    Commit be457d5