
TensorRT is encountering issues with models quantized using pytorch-quantization. #3976

wjx10210 opened this issue Jul 1, 2024 · 1 comment



wjx10210 commented Jul 1, 2024

I am currently learning two tools: pytorch-quantization and trt-engine-explorer. I created a small test model with the following structure:
[screenshot: model structure]

When I used trt-engine-explorer to visualize the engine structure, two 'reformat' nodes appeared. How can I remove these two 'reformat' nodes?
[screenshot: trt-engine-explorer engine graph showing the two Reformat nodes]
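For context, this is roughly how the engine graph for trt-engine-explorer is produced (a minimal sketch, not the attached code.py; the trex calls follow the toolkit's sample notebooks and may differ between versions):

```python
import subprocess

# Build an INT8 engine from the exported Q/DQ ONNX and dump the layer graph
# (trtexec flags available in TensorRT 8.6).
subprocess.run([
    "trtexec", "--onnx=quan.onnx", "--int8",
    "--saveEngine=quan.engine",
    "--profilingVerbosity=detailed",
    "--exportLayerInfo=graph.json",
    "--exportProfile=profile.json",
], check=True)

# Render the graph with trt-engine-explorer (API names assumed from the trex samples).
import trex
plan = trex.EnginePlan("graph.json", "profile.json")
dot = trex.to_dot(plan, trex.layer_type_formatter)
trex.render_dot(dot, "quan_engine", "svg")
```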

Environment

TensorRT Version: 8.6.1.6

NVIDIA GPU: RTX A1000 laptop

NVIDIA Driver Version: 535.129.03

CUDA Version: 11.8

CUDNN Version: 8.6.0

Operating System:

Python Version (if applicable): 3.8

PyTorch Version (if applicable): 1.13

Here is the ONNX file:
quan.onnx.zip

Here is the complete code:

code.py.zip


lix19937 commented Jul 1, 2024

When I used trt-engine-explorer to visualize the engine structure, two 'reformat' nodes appeared. How can I remove these two 'reformat' nodes?

Reformatting layers can be inserted by TensorRT for internal tensors to improve performance.

By default, TensorRT assumes that the network inputs/outputs are in FP32 linear format. However, many tactics in TensorRT require other formats, such as NHWC8 or NC/32HW32, so TensorRT automatically inserts Reformat layers to convert tensors between these formats.

If you want to remove these additional Reformat layers, you can explicitly specify the formats of the adjacent (previous/next) layers' tensors. See the docs here: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#reformat-free-network-tensors
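For example, with the TensorRT Python builder API you can pin the I/O tensor formats and ask the builder to honour them (a minimal sketch assuming the attached quan.onnx; internal Reformat layers inserted purely for faster tactics may still remain):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("quan.onnx", "rb") as f:          # assumed: the attached Q/DQ ONNX model
    assert parser.parse(f.read()), parser.get_error(0)

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)       # enable INT8 tactics for the Q/DQ graph
config.set_flag(trt.BuilderFlag.DIRECT_IO)  # fail instead of reformatting next to I/O
                                            # tensors whose allowed_formats are set

# Pin the format of the network inputs/outputs so TensorRT does not have to
# reformat them (LINEAR here; pick the format your producer/consumer actually uses).
for i in range(network.num_inputs):
    network.get_input(i).allowed_formats = 1 << int(trt.TensorFormat.LINEAR)
for i in range(network.num_outputs):
    network.get_output(i).allowed_formats = 1 << int(trt.TensorFormat.LINEAR)

engine = builder.build_serialized_network(network, config)
```

Any Reformat layers that remain after this sit between internal tensors and come from tactic selection; you can only influence those indirectly, e.g. by constraining layer precisions together with BuilderFlag.OBEY_PRECISION_CONSTRAINTS.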
