
Feature/quantizer factory refactoring #1963

Draft: wants to merge 62 commits into master
Conversation

@BloodAxe (Contributor) commented on Apr 15, 2024:

List of breaking changes:

  1. QATRecipeModificationCallback is not required anymore; the modify_params_for_qat method is now part of the TensorRT Quantizer class.
     Fallback mechanism: the class can be left as-is, with a warning message indicating it is deprecated. The implementation becomes a no-op (it makes no changes to the config file); see the sketch after this list.

  2. The quantization params config section now has a different structure.
     Fallback mechanism: we can probably detect whether a new or old config is given and fall back to the TRT quantizer in the latter case.

  3. Trainer.ptq / Trainer.qat methods: we can make them work as before, but this would limit us to using only a TRT quantizer.
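A minimal sketch of the fallback described in item 1, assuming the shim keeps the old callback name and call shape (only QATRecipeModificationCallback and modify_params_for_qat come from this PR; everything else is an assumption for illustration):

import warnings

class QATRecipeModificationCallback:
    """Deprecated shim: old recipes still load, but the callback no longer changes anything."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "QATRecipeModificationCallback is deprecated; "
            "modify_params_for_qat now lives on the TensorRT quantizer class.",
            DeprecationWarning,
        )

    def __call__(self, cfg):
        # No-op: the config passes through unchanged.
        return cfg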


# TODO: Should this really be here?
Contributor:

Why not?
We can do PTQ/QAT on multiple GPUs (I specifically remember writing support for that in the past...)

@BloodAxe (Author):

Okay, removing the TODO then.

@@ -0,0 +1,34 @@
quantizer:
TRTPTQQuantizer:
Contributor:

I don't understand the logic of decoupling the exporter from the quantizer...
I feel this puts more risk of errors (like pairing the TRT quantizer with int8 and the ONNX exporter, which might mislead users into trying to run that ONNX file on onnxruntime).
There is some benefit for debugging, though, I guess...

Collaborator:

I agree. I think the exporter and quantizer should be part of the same object.

output_path: ${architecture}_trt_ptq_int8.onnx

# Model-specific parameters
export_params:
Contributor:

We need to properly document all the export params together with their corresponding exporters.
I would feel a bit lost if I wanted to, let's say, try to build a TRT engine for a segmentation model, for instance.

detection_max_predictions_per_image: 128
detection_num_pre_nms_predictions: 1000
detection_predictions_format: flat
detection_postprocessing_use_tensorrt_nms: False
Contributor:

I find it pretty confusing that we have an Exporter object, but model-specific export_params...
Especially given that there is a detection_postprocessing_use_tensorrt_nms argument here.

@BloodAxe (Author) commented on Apr 18, 2024:

This is nothing new, by the way: ExportParams is not introduced in this PR. I have just explicitly added all the fields that exist in the ExportParams structure, to help users start editing these parameters.
I can duplicate the docstrings on each setting, but that would make for quite a long YAML (see ExportParams).
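Piecing together the fragments quoted in this review, the new config file appears to look roughly like this (a partial reconstruction from the visible lines only; nesting is inferred, and the remaining fields of the 34-line file are not shown here):

quantizer:
  TRTPTQQuantizer:
    # quantizer-specific parameters (not visible in the quoted fragments)

output_path: ${architecture}_trt_ptq_int8.onnx

# Model-specific parameters
export_params:
  detection_max_predictions_per_image: 128
  detection_num_pre_nms_predictions: 1000
  detection_predictions_format: flat
  detection_postprocessing_use_tensorrt_nms: False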

@ofrimasad (Collaborator) left a comment:

Very impressive.
I'm not sure I understand the logic, though. Why do we convert to ONNX first and then apply the specific quantizer and exporter?

"""

@abc.abstractmethod
def export_from_onnx(self, source_onnx: str, output_file: str) -> str:
Collaborator:

I don't think I understand the logic of this.
We said that SG will not export the TRT engine or the OpenVINO engine. We export an ONNX file (even if that file is framework-specific).
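For context, the abstract method quoted above implies an exporter contract along these lines (a sketch only: export_from_onnx and its signature appear in the diff, but the class names are hypothetical):

import abc
import shutil

class AbstractExporter(abc.ABC):
    # Hypothetical base class; only the abstract method below is visible in the diff.
    @abc.abstractmethod
    def export_from_onnx(self, source_onnx: str, output_file: str) -> str:
        """Turn an intermediate ONNX file into the final artifact and return its path."""
        ...

class PlainONNXExporter(AbstractExporter):
    # Hypothetical subclass: for a plain ONNX target, "export" just materializes
    # the intermediate file at the requested location.
    def export_from_onnx(self, source_onnx: str, output_file: str) -> str:
        shutil.copy(source_onnx, output_file)
        return output_file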


- model = ptq(
+ quantized_model = tensorrt_ptq(
Collaborator:

I don't understand. This is only for TRT; what if I want to convert to something else?
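This concern is presumably what the quantizer factory in the PR title is meant to address: a backend-keyed registry instead of a hard-coded TensorRT call. A minimal sketch of that idea (all names here are illustrative assumptions, not the PR's actual API):

from typing import Any, Callable, Dict

QUANTIZER_REGISTRY: Dict[str, Callable[..., Any]] = {}

def register_quantizer(name: str):
    # Decorator that adds a backend-specific PTQ function to the registry.
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        QUANTIZER_REGISTRY[name] = fn
        return fn
    return decorator

@register_quantizer("tensorrt")
def tensorrt_ptq(model, calibration_loader=None):
    # Placeholder body: a real implementation would insert TRT-compatible
    # fake-quantization nodes and calibrate ranges on calibration_loader.
    return model

def ptq(model, backend: str = "tensorrt", **kwargs):
    # Backend-agnostic entry point: dispatch to whichever quantizer is registered.
    return QUANTIZER_REGISTRY[backend](model, **kwargs)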

