Examples of TensorRT models using ONNX

A collection of useful sample code for building TensorRT models from ONNX.

  • To do later
    • Performance measurement
    • Remove duplicate code
    • Refactoring
    • Add comments

0. Development Environment

  • Device
    • RTX 4090
  • Dependencies
    • CUDA 12.2
    • TensorRT 10.5.0
    • PyTorch 2.4.1
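
A quick way to confirm the environment above (a minimal sketch, assuming the `tensorrt` and `torch` Python packages are installed):

```python
import tensorrt as trt
import torch

# Print the versions these examples were tested against.
print("TensorRT:", trt.__version__)                # expected 10.5.0
print("PyTorch :", torch.__version__)              # expected 2.4.1
print("CUDA    :", torch.version.cuda)             # expected 12.2
print("GPU     :", torch.cuda.get_device_name(0))  # expected an RTX 4090
```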

1. Basic step

  1. Generating a TensorRT model from ONNX (see the build sketch after this list)
    1.1 TensorRT C++ API
    1.2 TensorRT Python API
    1.3 Polygraphy

  2. Dynamic shapes for TensorRT (see the optimization-profile sketch after this list)
    2.1 Dynamic batch
    2.2 Dynamic input size
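
A minimal sketch of step 1.2, building an engine from an ONNX file with the TensorRT Python API; the file names `model.onnx` and `model.engine` are placeholders:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)  # TensorRT 10: networks are always explicit batch
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB workspace
config.set_flag(trt.BuilderFlag.FP16)  # optional: allow FP16 kernels

with open("model.engine", "wb") as f:
    f.write(builder.build_serialized_network(network, config))
```

The same conversion can also be done from the command line with `trtexec --onnx=model.onnx --saveEngine=model.engine`; step 1.1 uses the equivalent C++ API and step 1.3 drives the build through Polygraphy.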
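
For step 2, dynamic dimensions are handled with an optimization profile; a sketch assuming the ONNX model was exported with a dynamic batch dimension and an input tensor named `input`:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)
parser = trt.OnnxParser(network, logger)
with open("model_dynamic.onnx", "rb") as f:
    parser.parse(f.read())

config = builder.create_builder_config()

# The profile tells TensorRT which shape range to support and which shape to tune for.
profile = builder.create_optimization_profile()
profile.set_shape("input",
                  (1, 3, 224, 224),    # min: smallest allowed shape
                  (8, 3, 224, 224),    # opt: shape TensorRT tunes for
                  (32, 3, 224, 224))   # max: largest allowed shape
config.add_optimization_profile(profile)

engine = builder.build_serialized_network(network, config)
```

At inference time the actual shape is selected with `context.set_input_shape("input", ...)` before execution.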

2. Intermediate step

  3. Custom Plugin (see the plugin-loading sketch after this list)
    3.1 Adding a pre-processing layer with CUDA

  4. Modifying an ONNX graph with ONNX GraphSurgeon (see the Grad-CAM sketch after this list)
    4.1 Extracting the feature map of the last Conv for Grad-CAM
    4.2 Generating a TensorRT model with a custom plugin and ONNX

  5. TensorRT Model Optimizer (see the PTQ sketch after this list)
    5.1 Explicit Quantization (PTQ)
    5.2 Explicit Quantization (QAT)
    5.3 Explicit Quantization (ONNX PTQ)
    5.4 Implicit Quantization (TensorRT PTQ)
    5.5 Sparsity (2:4 sparsity pattern)
    5.6 Pruning
    5.7 Distillation
    5.8 NAS (Neural Architecture Search)
    5.9 Combining multiple methods
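
For step 3, the plugin itself is written in C++/CUDA and compiled into a shared library; a sketch of how such a library is typically made visible from Python before parsing the ONNX graph (the file name `libpreprocess_plugin.so` is hypothetical):

```python
import ctypes
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)

# Loading the shared library runs its static initializers, which register the
# custom plugin creator with TensorRT's global plugin registry.
ctypes.CDLL("./libpreprocess_plugin.so")

# Also register the built-in plugins shipped with TensorRT.
trt.init_libnvinfer_plugins(logger, "")

# The ONNX parser can now resolve nodes whose op type matches the registered plugin.
```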
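
For step 4.1, a sketch with ONNX GraphSurgeon that exposes the output of the last Conv node as an extra graph output so Grad-CAM can read the feature map (file names and the FP32 assumption are placeholders):

```python
import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("model.onnx"))

# Pick the last Conv node in graph order and expose its output tensor.
last_conv = [node for node in graph.nodes if node.op == "Conv"][-1]
feature_map = last_conv.outputs[0]
if feature_map.dtype is None:
    feature_map.dtype = np.float32  # assumption: the network runs in FP32
graph.outputs.append(feature_map)

graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "model_gradcam.onnx")
```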
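
For step 5.1, a sketch of explicit post-training quantization with the TensorRT Model Optimizer package (`nvidia-modelopt`), assuming its `modelopt.torch.quantization` API; the model and calibration data below are placeholders:

```python
import torch
import modelopt.torch.quantization as mtq
from torchvision.models import resnet50

model = resnet50().eval().cuda()
calib_batches = [torch.randn(8, 3, 224, 224, device="cuda") for _ in range(16)]

def forward_loop(m):
    # Feed representative data so the inserted quantizers collect calibration statistics.
    with torch.no_grad():
        for batch in calib_batches:
            m(batch)

# Insert Q/DQ nodes (explicit quantization) and calibrate with the INT8 defaults.
model = mtq.quantize(model, mtq.INT8_DEFAULT_CFG, forward_loop)

# Export to ONNX; TensorRT then builds an INT8 engine from the Q/DQ graph.
torch.onnx.export(model, torch.randn(1, 3, 224, 224, device="cuda"),
                  "resnet50_int8.onnx", opset_version=17)
```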

3. Advanced step

  6. Super Resolution
    6.1 Real-ESRGAN
  7. Object Detection
    7.1 yolo11
  8. Instance Segmentation
  9. Semantic Segmentation
  10. Depth Estimation
    10.1 Depth Pro (currently under repair due to an accuracy issue)
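
The application examples above all run a previously built engine in roughly the same way; a sketch of TensorRT 10 inference with torch-managed GPU buffers (engine path, static shapes, and FP32 I/O are assumptions):

```python
import tensorrt as trt
import torch

logger = trt.Logger(trt.Logger.WARNING)
with open("model.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Bind every I/O tensor to a CUDA buffer (assumes static shapes and FP32 I/O).
buffers = {}
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    shape = tuple(context.get_tensor_shape(name))
    buffers[name] = torch.empty(shape, dtype=torch.float32, device="cuda")
    context.set_tensor_address(name, buffers[name].data_ptr())

# Copy the preprocessed input into its buffer here, then launch inference.
stream = torch.cuda.Stream()
context.execute_async_v3(stream.cuda_stream)
stream.synchronize()
```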

4. References