Examples of TensorRT models using ONNX

A collection of useful sample code for building TensorRT models from ONNX.

  • To do later
    • Performance measurement
    • Remove duplicate code
    • Refactoring
    • Add comments

0. Development Environment

  • Device
    • RTX 4090
  • Dependencies
    • CUDA 12.2
    • TensorRT 10.5.0
    • PyTorch 2.4.1
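
A quick way to confirm the environment above (a minimal sketch, assuming the `tensorrt` and `torch` Python packages are installed):

```python
import tensorrt as trt
import torch

# Print the versions these examples were tested against.
print("TensorRT:", trt.__version__)                # expected 10.5.0
print("PyTorch :", torch.__version__)              # expected 2.4.1
print("CUDA    :", torch.version.cuda)             # expected 12.2
print("GPU     :", torch.cuda.get_device_name(0))  # expected an RTX 4090
```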

1. Basic step

  1. Generating a TensorRT model from ONNX (see the build sketch after this list)
    1.1 TensorRT C++ API
    1.2 TensorRT Python API
    1.3 Polygraphy

  2. Dynamic shapes for TensorRT (see the optimization-profile sketch after this list)
    2.1 Dynamic batch
    2.2 Dynamic input size
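
A minimal sketch of step 1.2, building an engine from an ONNX file with the TensorRT Python API; the file names `model.onnx` and `model.engine` are placeholders:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)  # TensorRT 10: networks are always explicit batch
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB workspace
config.set_flag(trt.BuilderFlag.FP16)  # optional: allow FP16 kernels

with open("model.engine", "wb") as f:
    f.write(builder.build_serialized_network(network, config))
```

The same conversion can also be done from the command line with `trtexec --onnx=model.onnx --saveEngine=model.engine`; step 1.1 uses the equivalent C++ API and step 1.3 drives the build through Polygraphy.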
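
For step 2, dynamic dimensions are handled with an optimization profile; a sketch assuming the ONNX model was exported with a dynamic batch dimension and an input tensor named `input`:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)
parser = trt.OnnxParser(network, logger)
with open("model_dynamic.onnx", "rb") as f:
    parser.parse(f.read())

config = builder.create_builder_config()

# The profile tells TensorRT which shape range to support and which shape to tune for.
profile = builder.create_optimization_profile()
profile.set_shape("input",
                  (1, 3, 224, 224),    # min: smallest allowed shape
                  (8, 3, 224, 224),    # opt: shape TensorRT tunes for
                  (32, 3, 224, 224))   # max: largest allowed shape
config.add_optimization_profile(profile)

engine = builder.build_serialized_network(network, config)
```

At inference time the actual shape is selected with `context.set_input_shape("input", ...)` before execution.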

2. Intermediate step

  3. Custom Plugin (see the plugin-loading sketch after this list)
    3.1 Adding a pre-processing layer with CUDA

  4. Modifying an ONNX graph with ONNX GraphSurgeon (see the Grad-CAM sketch after this list)
    4.1 Extracting the feature map of the last Conv for Grad-CAM
    4.2 Generating a TensorRT model with a custom plugin and ONNX

  5. TensorRT Model Optimizer (see the PTQ sketch after this list)
    5.1 Explicit Quantization (PTQ)
    5.2 Explicit Quantization (QAT)
    5.3 Explicit Quantization (ONNX PTQ)
    5.4 Implicit Quantization (TensorRT PTQ)
    5.5 Sparsity (2:4 sparsity pattern)
    5.6 Pruning
    5.7 Distillation
    5.8 NAS (Neural Architecture Search)
    5.9 Combining multiple methods
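
For step 3, the plugin itself is written in C++/CUDA and compiled into a shared library; a sketch of how such a library is typically made visible from Python before parsing the ONNX graph (the file name `libpreprocess_plugin.so` is hypothetical):

```python
import ctypes
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)

# Loading the shared library runs its static initializers, which register the
# custom plugin creator with TensorRT's global plugin registry.
ctypes.CDLL("./libpreprocess_plugin.so")

# Also register the built-in plugins shipped with TensorRT.
trt.init_libnvinfer_plugins(logger, "")

# The ONNX parser can now resolve nodes whose op type matches the registered plugin.
```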
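
For step 4.1, a sketch with ONNX GraphSurgeon that exposes the output of the last Conv node as an extra graph output so Grad-CAM can read the feature map (file names and the FP32 assumption are placeholders):

```python
import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("model.onnx"))

# Pick the last Conv node in graph order and expose its output tensor.
last_conv = [node for node in graph.nodes if node.op == "Conv"][-1]
feature_map = last_conv.outputs[0]
if feature_map.dtype is None:
    feature_map.dtype = np.float32  # assumption: the network runs in FP32
graph.outputs.append(feature_map)

graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "model_gradcam.onnx")
```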
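
For step 5.1, a sketch of explicit post-training quantization with the TensorRT Model Optimizer package (`nvidia-modelopt`), assuming its `modelopt.torch.quantization` API; the model and calibration data below are placeholders:

```python
import torch
import modelopt.torch.quantization as mtq
from torchvision.models import resnet50

model = resnet50().eval().cuda()
calib_batches = [torch.randn(8, 3, 224, 224, device="cuda") for _ in range(16)]

def forward_loop(m):
    # Feed representative data so the inserted quantizers collect calibration statistics.
    with torch.no_grad():
        for batch in calib_batches:
            m(batch)

# Insert Q/DQ nodes (explicit quantization) and calibrate with the INT8 defaults.
model = mtq.quantize(model, mtq.INT8_DEFAULT_CFG, forward_loop)

# Export to ONNX; TensorRT then builds an INT8 engine from the Q/DQ graph.
torch.onnx.export(model, torch.randn(1, 3, 224, 224, device="cuda"),
                  "resnet50_int8.onnx", opset_version=17)
```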

3. Advanced step

  6. Super Resolution
    6.1 Real-ESRGAN
  7. Object Detection
    7.1 yolo11
  8. Instance Segmentation
  9. Semantic Segmentation
  10. Depth Estimation
    10.1 Depth Pro (currently under repair due to an accuracy issue)
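
The application examples above all run a previously built engine in roughly the same way; a sketch of TensorRT 10 inference with torch-managed GPU buffers (engine path, static shapes, and FP32 I/O are assumptions):

```python
import tensorrt as trt
import torch

logger = trt.Logger(trt.Logger.WARNING)
with open("model.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Bind every I/O tensor to a CUDA buffer (assumes static shapes and FP32 I/O).
buffers = {}
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    shape = tuple(context.get_tensor_shape(name))
    buffers[name] = torch.empty(shape, dtype=torch.float32, device="cuda")
    context.set_tensor_address(name, buffers[name].data_ptr())

# Copy the preprocessed input into its buffer here, then launch inference.
stream = torch.cuda.Stream()
context.execute_async_v3(stream.cuda_stream)
stream.synchronize()
```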

4. References