
src: attr: quantization refactor (part 1) #2270

Open · wants to merge 6 commits into base: main
Conversation

dzarukin
Contributor

Every year around Christmas time something happens to quantization:

  • 2022 - the move to runtime values happened; lots of obsolete code was left behind.
  • 2023 - advanced quantization with groups appeared; even more code that could use some love was left behind.
  • 2023.5 - an extension of zero-points for the SRC argument happened; zero-points became a warehouse of variables accessed directly.
  • 2024 - time for a refactor!

The whole point of this refactor is to move quantization attributes to a C++ way of doing things: provide clear and simple interfaces to operate on objects, and make members private.
Part 1 covers scales, which were not that bad in terms of interfaces but could be better with per-argument members, which this part addresses.

The interface for both scales and their new underlying object, quant_entry_t, provides getters for the mask, data type, and groups (no need to worry about ndims any longer!), as well as default-value checks.
Initialization is still done through set; reset was replaced with set.

Any operations with masks should happen only after verifying that the specific argument's scale is not default! This is now enforced through an invalid default mask value that can't be used as-is.

Beyond the legacy use cases, here are the main changes across sources:

common:
  • Some primitive attribute checkers got more checks added.

cpu:
  • Many places were updated from mask != 0 (the former default for common and non-initialized scales) to mask > 0, since the default mask is now negative.
  • A check for equality is still valid, while inequality comparisons are highly discouraged (unless you really know what you are doing).

gpu:
  • Found some bugs in the gemm_with_post_ops implementations.
  • Changed the logic in several generic/cudnn kernels to match the new behavior.

tests:
  • Updated gtests to comply with the updated primitive checks.

Part 2 will cover zero-points.

Disclaimer: the change is somewhat fundamental, so bugs may well have slipped through even if all tests pass. Feel free to report anything I missed.

@dzarukin dzarukin requested review from a team as code owners December 13, 2024 21:37
@dzarukin
Contributor Author

make test

@github-actions github-actions bot added platform:cpu-x64 Intel64/AMD64 processors. Codeowner: @oneapi-src/onednn-cpu-x64 platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64 platform:gpu-nvidia Codeowner: @oneapi-src/onednn-gpu-nvidia platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel platform:gpu-generic Codeowner: @oneapi-src/onednn-gpu-generic labels Dec 13, 2024
@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from ad9b2d3 to 9b4efc6 Compare December 13, 2024 23:33
@dzarukin
Contributor Author

make test

@dzarukin dzarukin force-pushed the dzarukin/refactor_quant branch from 9b4efc6 to 58b979e Compare December 16, 2024 18:10
@dzarukin
Contributor Author

make test
