[GPU] Fp8 compute backports #2266

kealan-barbieri · 2024-12-13T17:46:50Z

Description

Backport of mixed fp8 support and additional scale support for compute primitives.

Checklist

General

Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
Have you formatted the code using clang-format?

dzarukin · 2024-12-13T19:03:57Z

src/common/convolution.cpp

-                            && utils::one_of(mask_wei, 0, with_groups ? 3 : 1),
+            VCHECK_CONV_UNIMPL(utils::one_of(mask_wei, 0, with_groups ? 3 : 1)
+                            && utils::one_of(mask_dst, 0, 2)
+                            && utils::one_of(mask_src, 0, 3),


Could you, please, clarify why mask=3 for src but mask=2 for dst?
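For readers following the question: a scale mask is a bit field in which a set bit `d` means a separate scale is used for each index along logical dimension `d`. The sketch below decodes a mask into dimension indices. It is only an illustration; `dims_from_mask` is a hypothetical helper, not part of the library, and the logical dimension order (batch first, channels second) is an assumption about the convolution activations discussed here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not a oneDNN API): lists the logical dimensions
// selected by a scale mask. Bit d set => scales vary along dimension d.
std::vector<int> dims_from_mask(int mask, int ndims) {
    assert(ndims >= 0 && ndims < 32);
    std::vector<int> dims;
    for (int d = 0; d < ndims; ++d) {
        if (mask & (1 << d)) dims.push_back(d);
    }
    return dims;
}
```

Under that assumption, `mask_src = 3` selects dimensions 0 and 1 (per batch and per channel), while `mask_dst = 2` selects dimension 1 only (per channel), which is the asymmetry the comment asks about.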

dzarukin · 2024-12-13T19:05:16Z

src/common/convolution.cpp

+                    | smask_t::zero_points_runtime_data_type
+                    | smask_t::scales_runtime_groups
+                    | smask_t::scales_runtime_data_type;
+
        if (engine->kind() == engine_kind::gpu)


The logic here and below seems duplicated. Could you, please, consolidate it?

src/common/matmul.cpp

src/common/matmul_pd.hpp

dzarukin · 2024-12-13T19:13:03Z

src/common/matmul_pd.hpp

@@ -195,7 +207,13 @@ struct matmul_pd_t : public primitive_desc_t {
                                sc.group_dims_[0] == 1
                                        && K() % sc.group_dims_[1] == 0);
            } else {
-                ok = ok && (mask == 0);
+                ok = ok


There must be a check here that distinguishes fp8 from classic quantization.

tests/benchdnn/inputs/matmul/option_set_fp8_mixed

dzarukin · 2024-12-13T19:28:16Z

tests/benchdnn/utils/cfg.hpp

@@ -185,7 +185,7 @@ struct base_cfg_t {
        }
        const int64_t safe_digits = get_safe_digits();
        const int64_t safe_n_acc = (1LL << safe_digits) / max_value;
-        return safe_n_acc;
+        return std::max((int64_t)1L, safe_n_acc);


Returning safe_n_acc = 0 is intentional here; it signals that the input values are not reasonable. In which case did it return zero?
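To make the zero case concrete, here is a minimal standalone sketch of the arithmetic in the hunk above. The function names and the sample inputs are hypothetical; the real `safe_digits` and `max_value` come from benchdnn configuration code that is not shown in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Same expression as in the diff: integer division truncates toward zero,
// so the result is 0 whenever max_value exceeds 2^safe_digits.
int64_t safe_n_acc_raw(int64_t safe_digits, int64_t max_value) {
    assert(safe_digits >= 0 && safe_digits < 63 && max_value > 0);
    return (int64_t(1) << safe_digits) / max_value;
}

// Variant proposed in the diff: clamps the result to at least 1,
// which hides the "inputs are not reasonable" signal.
int64_t safe_n_acc_clamped(int64_t safe_digits, int64_t max_value) {
    return std::max<int64_t>(1, safe_n_acc_raw(safe_digits, max_value));
}
```

With the made-up inputs `safe_digits = 3` and `max_value = 16`, the raw version yields 0 and the clamped version yields 1; for `safe_digits = 24` both yield 1048576.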

xe: jit: backport mixed fp8 compute

18b58c3

kealan-barbieri added the platform:gpu-intel (Codeowner: @oneapi-src/onednn-gpu-intel) and backport labels Dec 13, 2024

kealan-barbieri requested review from a team as code owners December 13, 2024 17:46

atkassen approved these changes Dec 13, 2024


github-actions bot removed the backport label Dec 13, 2024

hidefromkgb approved these changes Dec 13, 2024


dzarukin reviewed Dec 13, 2024


kealan-barbieri added 7 commits December 13, 2024 15:05

xe: jit: backport src, dst compute scales

8b99790

xe: jit: gemm: adjust strategies for fp8 weights decomp

94581ea

tests: benchdnn: matmul: reduce int4 weights range

5c8c716

tests: benchdnn: add mixed fp8 conv, matmul inputs

a673bc3

tests: gtests: remove dst scale checks

22672f8

xe: jit: gemm: enable mixed bf16->fp8

dec5a07

tests: benchdnn: restrict dst scales to common for cpu

8aa5c64

kealan-barbieri force-pushed the kealanba/compute_backports branch from ded32f2 to 8aa5c64 Compare December 13, 2024 23:06
