🚀TensorRT-YOLO is a user-friendly and highly efficient inference and deployment tool for the YOLO series, designed specifically for NVIDIA devices. The project not only integrates TensorRT plugins to enhance post-processing, but also uses CUDA kernels and CUDA Graphs to accelerate inference. TensorRT-YOLO provides support for both C++ and Python inference, aiming to offer a plug-and-play deployment experience, and covers task scenarios such as object detection, instance segmentation, image classification, pose estimation, oriented bounding box detection, and video analysis, meeting developers' multi-scenario deployment needs.
- Diverse YOLO Support: Fully compatible with YOLOv3 to YOLOv11, as well as PP-YOLOE and PP-YOLOE+, meeting the needs of different versions.
- Multi-scenario Applications: Provides example code for diverse scenarios such as Detect, Segment, Classify, Pose, and OBB.
- Model Optimization and Inference Acceleration:
  - ONNX Support: Supports static and dynamic export of ONNX models, including TensorRT custom plugin support, simplifying the model deployment process.
  - TensorRT Integration: Integrates TensorRT plugins, including custom plugins, to accelerate post-processing for Detect, Segment, Pose, OBB, and other scenarios, improving inference efficiency.
  - CUDA Acceleration: Optimizes pre-processing with CUDA kernels and accelerates inference with CUDA Graph technology, achieving high-performance computing.
  - Language Support: Supports C++ and Python (bound via Pybind11 to improve Python inference speed), meeting the needs of different programming languages.
- Deployment Convenience:
  - Dynamic Library Compilation: Provides support for dynamic library compilation, making it easy to call and deploy.
  - No Third-Party Dependencies: Apart from CUDA and TensorRT, all features are implemented with the standard library, requiring no additional dependencies and simplifying deployment.
- Rapid Development and Deployment:
  - CLI Tools: Provides a command-line interface (CLI) tool for quick model export and inference.
  - Cross-Platform Support: Supports various devices such as Windows, Linux, ARM, and x86, adapting to different hardware environments.
  - Docker Deployment: Supports one-click deployment with Docker, simplifying environment configuration and deployment.
  - TensorRT Compatibility: Compatible with TensorRT 10.x versions, keeping pace with the latest releases.
- Installation Guide
- Quick Start
- Usage Examples
- API Documentation
  - Python API Documentation (⚠️ Not Implemented)
  - C++ API Documentation (⚠️ Not Implemented)
- Frequently Asked Questions (⚠️ Collecting...)
- Model Support List
- Recommended CUDA version >= 11.0.1 (minimum CUDA version: 11.0.1)
- Recommended TensorRT version >= 8.6.1 (minimum TensorRT version: 8.6.1); a quick version check is sketched below
- OS: Linux x86_64 (recommended) / ARM / Windows
- Refer to the 📦 Quick Compilation and Installation documentation
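As a quick sanity check after installation, the TensorRT Python bindings can be queried for their version. This is only a minimal sketch and assumes the tensorrt Python package is installed in the current environment:

```python
# Minimal environment check: confirm the TensorRT Python bindings meet the minimum version.
import tensorrt as trt

print("TensorRT version:", trt.__version__)
major, minor = (int(x) for x in trt.__version__.split(".")[:2])
assert (major, minor) >= (8, 6), "TensorRT >= 8.6.1 is required"
```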
Important
Before inference, please refer to the 🔧 CLI Model Export documentation to export the ONNX model suitable for this project's inference and build it into a TensorRT engine.
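For reference, the sketch below shows one way to build an engine from an exported ONNX model with the TensorRT Python API. It is only a minimal illustration, not this project's export pipeline: the file names are placeholders, FP16 is optional, and if the exported model uses the project's custom TensorRT plugins, the corresponding plugin library must be loaded and registered first as described in the CLI Model Export documentation.

```python
# Minimal sketch: build a TensorRT engine from an exported ONNX model.
# Assumptions: file names are placeholders; only TensorRT's built-in plugins are needed
# (init_libnvinfer_plugins registers them); any project-specific plugin library must be
# loaded separately (see the CLI Model Export documentation).
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")  # register TensorRT's built-in plugins

builder = trt.Builder(logger)
# EXPLICIT_BATCH is required on TensorRT 8.x; it is deprecated (and the default) on 10.x.
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("yolo11n.onnx", "rb") as f:  # placeholder ONNX path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # optional: enable FP16 if the GPU supports it

serialized_engine = builder.build_serialized_network(network, config)
if serialized_engine is None:
    raise RuntimeError("Engine build failed")

with open("yolo11n-with-plugin.engine", "wb") as f:  # placeholder engine path
    f.write(serialized_engine)
```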
Note
Using the --cudaGraph option can significantly improve inference speed, but note that this feature is only available for static models.
The -m, --mode parameter can be used to select different model types, where 0 represents detection (Detect), 1 represents oriented bounding box (OBB), 2 represents segmentation (Segment), 3 represents pose estimation (Pose), and 4 represents image classification (Classify).
- Use the tensorrt_yolo library's trtyolo command-line tool for inference. Run the following command to view help information:
  trtyolo infer --help
- Run the following command for inference:
  trtyolo infer -e models/yolo11n.engine -m 0 -i images -o output -l labels.txt --cudaGraph
  Inference results will be saved to the output folder, and visualized results will be generated.
Note
DeployDet, DeployOBB, DeploySeg, DeployPose, and DeployCls correspond to detection (Detect), oriented bounding box (OBB), segmentation (Segment), pose estimation (Pose), and image classification (Classify) models, respectively.
For these models, the CG version utilizes CUDA Graph to further accelerate the inference process, but please note that this feature is limited to static models.
import cv2
from tensorrt_yolo.infer import DeployDet, generate_labels, visualize
# Initialize the model
model = DeployDet("yolo11n-with-plugin.engine")
# Load the image
im = cv2.imread("test_image.jpg")
# Model prediction
result = model.predict(cv2.cvtColor(im, cv2.COLOR_BGR2RGB))
print(f"==> detect result: {result}")
# Visualization
labels = generate_labels("labels.txt")
vis_im = visualize(im, result, labels)
cv2.imwrite("vis_image.jpg", vis_im)
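For the CUDA Graph variants mentioned in the note, the call pattern is the same. Below is a minimal sketch that assumes the detection variant is exposed as DeployCGDet (following the note's "CG" naming) and that the engine was built from a static model:

```python
import cv2
# DeployCGDet is assumed here from the note's "CG" naming convention.
from tensorrt_yolo.infer import DeployCGDet, generate_labels, visualize

# CUDA Graph-accelerated variant: only valid for engines built from static (fixed-shape) models
model = DeployCGDet("yolo11n-with-plugin.engine")
im = cv2.imread("test_image.jpg")
result = model.predict(cv2.cvtColor(im, cv2.COLOR_BGR2RGB))

labels = generate_labels("labels.txt")
vis_im = visualize(im, result, labels)
cv2.imwrite("vis_image_cg.jpg", vis_im)
```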
The C++ interface follows the same pattern: the note above about the Deploy* model classes and their CG (CUDA Graph) variants applies here as well.
#include <opencv2/opencv.hpp>
// For convenience, the module uses standard libraries except for CUDA and TensorRT
#include "deploy/vision/inference.hpp"
#include "deploy/vision/result.hpp"
int main() {
// Initialize the model
auto model = deploy::DeployDet("yolo11n-with-plugin.engine");
// Load the image
cv::Mat cvim = cv::imread("test_image.jpg");
// Convert the image from BGR to RGB
cv::cvtColor(cvim, cvim, cv::COLOR_BGR2RGB);
deploy::Image im(cvim.data, cvim.cols, cvim.rows);
// Model prediction
deploy::DetResult result = model.predict(im);
// Visualization (code omitted)
// ...
return 0;
}
For more deployment examples, please refer to the Model Deployment Examples section.
Symbol legend: (1) ✅: Supported; (2) ❔: In progress; (3) ❎: Not supported; (4) 🟢: Self-implemented export required for inference.
Task Scenario | Model | CLI Export | Inference Deployment |
---|---|---|---|
Detect | ultralytics/yolov3 | ✅ | ✅ |
Detect | ultralytics/yolov5 | ✅ | ✅ |
Detect | meituan/YOLOv6 | ❎ Refer to official export tutorial | ✅ |
Detect | WongKinYiu/yolov7 | ❎ Refer to official export tutorial | ✅ |
Detect | WongKinYiu/yolov9 | ❎ Refer to official export tutorial | ✅ |
Detect | THU-MIG/yolov10 | ✅ | ✅ |
Detect | ultralytics/ultralytics | ✅ | ✅ |
Detect | PaddleDetection/PP-YOLOE+ | ✅ | ✅ |
Segment | ultralytics/yolov3 | ✅ | ✅ |
Segment | ultralytics/yolov5 | ✅ | ✅ |
Segment | meituan/YOLOv6-seg | ❎ Implement yourself referring to tensorrt_yolo/export/head.py | 🟢 |
Segment | WongKinYiu/yolov7 | ❎ Implement yourself referring to tensorrt_yolo/export/head.py | 🟢 |
Segment | WongKinYiu/yolov9 | ❎ Implement yourself referring to tensorrt_yolo/export/head.py | 🟢 |
Segment | ultralytics/ultralytics | ✅ | ✅ |
Classify | ultralytics/yolov3 | ✅ | ✅ |
Classify | ultralytics/yolov5 | ✅ | ✅ |
Classify | ultralytics/ultralytics | ✅ | ✅ |
Pose | ultralytics/ultralytics | ✅ | ✅ |
OBB | ultralytics/ultralytics | ✅ | ✅ |
Open source projects require effort. If this project has been helpful to you, consider buying the author a coffee. Your support is the greatest motivation for the developer to keep maintaining the project!
TensorRT-YOLO is licensed under the GPL-3.0 License, an OSI-approved open-source license that is ideal for students and enthusiasts, fostering open collaboration and knowledge sharing. Please refer to the LICENSE file for more details.
Thank you for choosing TensorRT-YOLO; we encourage open collaboration and knowledge sharing, and we hope you comply with the relevant provisions of the open-source license.
For bug reports and feature requests regarding TensorRT-YOLO, please visit GitHub Issues!