
src: attr: quantization refactor (part 1) #2270

Open · wants to merge 6 commits into base: main
Conversation

dzarukin
Contributor

Every year around Christmas time something happens to quantization:

  • 2022 - the move to runtime values happened; lots of obsolete code was left behind.
  • 2023 - advanced quantization with groups appeared; even more code that could use some love was left behind.
  • 2023.5 - an extension of zero-points for the SRC argument happened; zero-points became a warehouse of variables accessed directly.
  • 2024 - time for a refactor!

The whole point of this refactor is to move quantization attributes to a C++ way of doing things: provide clear and simple interfaces to operate on objects, and make members private.
Part 1 covers scales, which were not that bad in terms of interfaces but could be better with per-argument members, which this part addresses.

The interface for both scales and their new underlying object, quant_entry_t, provides getters for the mask, data type, and groups (no need to worry about ndims any longer!), as well as default-value checks.
Initialization is still done through set; reset was replaced with set.

Any operations with masks should happen only after verifying that the specific argument's scale is not default! This is now enforced through an invalid default mask value that can't be used as-is.

Beyond the legacy use cases, here are the main changes across sources:

common:
  • Some primitive attribute checkers got more checks added.

cpu:
  • Many places were updated from mask != 0 (the former default for common and non-initialized scales) to mask > 0, since the default mask is now negative.
  • A check for equality is still valid, while inequality comparisons are highly discouraged (unless you really know what you are doing).

gpu:
  • Found some bugs in the gemm_with_post_ops implementations.
  • Changed the logic in several generic/cudnn kernels to match the new behavior.

tests:
  • Updated gtests to comply with the updated primitive checks.

Part 2 will cover zero-points.

Disclaimer: the change is somewhat fundamental, so bugs may well have slipped through even if all tests pass. Feel free to report anything I missed.

@dzarukin dzarukin requested review from a team as code owners December 13, 2024 21:37
@dzarukin
Contributor Author

make test

@github-actions github-actions bot added platform:cpu-x64 Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64 platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64 platform:gpu-nvidia Codeowner: @oneapi-src/onednn-gpu-nvidia platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel platform:gpu-generic Codeowner: @oneapi-src/onednn-gpu-generic labels Dec 13, 2024
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from ad9b2d3 to 9b4efc6 Compare December 13, 2024 23:33
@dzarukin
Contributor Author

make test

@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 9b4efc6 to 58b979e Compare December 16, 2024 18:10
@dzarukin
Contributor Author

make test
