Tuned fp8 gemm for LDM cases #3142

Open: wants to merge 1 commit into base: main

Commits on Sep 17, 2024

  1. Tuned fp8 gemm for LDM cases

    Summary:
    ```
    buck2 run @//mode/opt-amd-gpu -c fbcode.rocm_arch=mi300 --modifier ovr_config//third-party/rocm/constraints:6.0.1 //deeplearning/fbgemm/fbgemm_gpu/experimental/gen_ai/bench:quantize_bench -- --enable_amd_env_vars --kernels=ck_rowwise  --N 3584 --M 8192 --K 9728 --use_rotating_buffer_bench
    
    ck_rowwise sim: 13.812.
    ck_rowwise ms: 0.558.
    ck_rowwise TFLOPS: 1022.833.
    ck_rowwise GB/s: 310.266.
    ```
    
    Differential Revision: D62776861
    zjing14 authored and facebook-github-bot committed Sep 17, 2024
    Commit 69d7dac
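
The throughput figures in the benchmark output can be sanity-checked from the problem shape and the reported runtime. The sketch below is not taken from `quantize_bench`; it assumes the usual GEMM accounting of `2 * M * N * K` floating-point operations, 1-byte fp8 inputs, and a 2-byte output element (for example bf16). The function names and the byte sizes are illustrative assumptions, not something the commit message states.

```python
# Back-of-the-envelope check of the reported TFLOPS and GB/s.
# Assumptions (not stated in the commit message): fp8 inputs are 1 byte
# per element and the output is 2 bytes per element.


def gemm_flops(m: int, n: int, k: int) -> int:
    """Multiply-add count for an (M x K) @ (K x N) GEMM, 2 FLOPs per MAC."""
    return 2 * m * n * k


def gemm_bytes(m: int, n: int, k: int, in_bytes: int = 1, out_bytes: int = 2) -> int:
    """Bytes read for both inputs plus bytes written for the output."""
    return (m * k + n * k) * in_bytes + m * n * out_bytes


def tflops(m: int, n: int, k: int, ms: float) -> float:
    """Achieved TFLOPS given a runtime in milliseconds."""
    return gemm_flops(m, n, k) / (ms * 1e-3) / 1e12


def gbps(m: int, n: int, k: int, ms: float) -> float:
    """Achieved GB/s given a runtime in milliseconds."""
    return gemm_bytes(m, n, k) / (ms * 1e-3) / 1e9


if __name__ == "__main__":
    # Shape and runtime from the benchmark run in the commit summary.
    M, N, K, MS = 8192, 3584, 9728, 0.558
    print(f"TFLOPS: {tflops(M, N, K, MS):.1f}")
    print(f"GB/s:   {gbps(M, N, K, MS):.1f}")
```

Under these assumptions the computed values land within about 0.1% of the reported `ck_rowwise TFLOPS` and `ck_rowwise GB/s`; the small gap is expected because the printed runtime is rounded to three decimals.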