
English | 简体中文

🚀 TensorRT YOLO


🚀TensorRT-YOLO is a user-friendly and highly efficient inference deployment tool for the YOLO series, designed specifically for NVIDIA devices. The project integrates TensorRT plugins to accelerate post-processing and uses CUDA kernels and CUDA Graphs to speed up inference. TensorRT-YOLO supports both C++ and Python inference, aiming to offer a plug-and-play deployment experience. It covers task scenarios such as object detection, instance segmentation, image classification, pose estimation, oriented bounding box detection, and video analysis, meeting developers' multi-scenario deployment needs.

(Demo images: Detect, Segment, Pose, OBB)

✨ Key Features

  • Diverse YOLO Support: Fully compatible with YOLOv3 to YOLOv11, as well as PP-YOLOE and PP-YOLOE+, meeting the needs of different versions.
  • Multi-scenario Applications: Provides example code for diverse scenarios such as Detect, Segment, Classify, Pose, and OBB.
  • Model Optimization and Inference Acceleration:
    • ONNX Support: Supports static and dynamic export of ONNX models, including TensorRT custom plugin support, simplifying the model deployment process.
    • TensorRT Integration: Integrates TensorRT plugins, including custom ones, to accelerate post-processing for Detect, Segment, Pose, OBB, and other scenarios, improving inference efficiency.
    • CUDA Acceleration: Optimizes pre-processing with CUDA kernels and accelerates inference processes with CUDA graph technology, achieving high-performance computing.
  • Language Support: Supports C++ and Python (bound via Pybind11 to speed up Python inference), meeting the needs of different programming languages.
  • Deployment Convenience:
    • Dynamic Library Compilation: Can be built as a dynamic library for easy integration and deployment.
    • No Third-Party Dependencies: All features are implemented with the standard library (plus CUDA and TensorRT), requiring no additional dependencies and simplifying deployment.
  • Rapid Development and Deployment:
    • CLI Tools: Provides a command-line interface (CLI) tool for quick model export and inference.
    • Cross-Platform Support: Runs on Windows and Linux, on both x86 and ARM devices, adapting to different hardware environments.
    • Docker Deployment: Supports one-click deployment with Docker, simplifying environment configuration and deployment processes.
  • TensorRT Compatibility: Compatible with TensorRT 10.x releases, keeping pace with current NVIDIA tooling.

🔮 Documentation and Tutorials

💨 Quick Start

🔸 Prerequisites

  • CUDA >= 11.0.1 (the minimum supported version)
  • TensorRT >= 8.6.1 (the minimum supported version)
  • OS: Linux x86_64 (recommended), Linux ARM, or Windows
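To sanity-check the environment, the following commands report the installed CUDA toolkit and TensorRT versions (assuming the TensorRT Python bindings are installed):

    nvcc --version
    python -c "import tensorrt; print(tensorrt.__version__)"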

🎆 Quick Installation

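A minimal installation sketch, assuming the package is published on PyPI under the same name as the Python library used below (tensorrt_yolo):

    pip install -U tensorrt_yolo

After installation, the trtyolo command-line tool used in the examples below should be on your PATH.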

Python SDK Quick Start

Important

Before inference, please refer to the 🔧 CLI Model Export documentation to export the ONNX model suitable for this project's inference and build it into a TensorRT engine.

Python CLI Inference Example

Note

Using the --cudaGraph option can significantly improve inference speed, but note that this feature is only available for static models.

The -m, --mode parameter can be used to select different model types, where 0 represents detection (Detect), 1 represents oriented bounding box (OBB), 2 represents segmentation (Segment), 3 represents pose estimation (Pose), and 4 represents image classification (Classify).

  1. Use the tensorrt_yolo library's trtyolo command-line tool for inference. Run the following command to view help information:

    trtyolo infer --help
  2. Run the following command for inference:

    trtyolo infer -e models/yolo11n.engine -m 0 -i images -o output -l labels.txt --cudaGraph

    Inference results will be saved to the output folder, and visualized results will be generated.
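  3. For other task types, only the engine and the -m value change. A hypothetical segmentation run (the engine name yolo11n-seg.engine is illustrative; add --cudaGraph only for static models, as noted above):

    trtyolo infer -e models/yolo11n-seg.engine -m 2 -i images -o output -l labels.txt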

Python Inference Example

Note

DeployDet, DeployOBB, DeploySeg, DeployPose and DeployCls correspond to detection (Detect), oriented bounding box (OBB), segmentation (Segment), pose estimation (Pose) and image classification (Classify) models, respectively.

For these models, the CG version utilizes CUDA Graph to further accelerate the inference process, but please note that this feature is limited to static models.

import cv2
from tensorrt_yolo.infer import DeployDet, generate_labels, visualize

# Initialize the model
model = DeployDet("yolo11n-with-plugin.engine")
# Load the image
im = cv2.imread("test_image.jpg")
# Model prediction
result = model.predict(cv2.cvtColor(im, cv2.COLOR_BGR2RGB))
print(f"==> detect result: {result}")
# Visualization
labels = generate_labels("labels.txt")
vis_im = visualize(im, result, labels)
cv2.imwrite("vis_image.jpg", vis_im)
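For a static-model engine, the CUDA Graph variant mentioned above can be swapped in. A minimal sketch, assuming the CG class is exposed as DeployCGDet, mirroring the naming of DeployDet (the rest of the code is unchanged):

from tensorrt_yolo.infer import DeployCGDet  # assumed name of the CG variant

# CUDA Graph-accelerated model; only valid for static-shape engines
model = DeployCGDet("yolo11n-with-plugin.engine")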

C++ SDK Quick Start

Note

DeployDet, DeployOBB, DeploySeg, DeployPose and DeployCls correspond to detection (Detect), oriented bounding box (OBB), segmentation (Segment), pose estimation (Pose) and image classification (Classify) models, respectively.

For these models, the CG version utilizes CUDA Graph to further accelerate the inference process, but please note that this feature is limited to static models.

#include <opencv2/opencv.hpp>
// For convenience, the module uses standard libraries except for CUDA and TensorRT
#include "deploy/vision/inference.hpp"
#include "deploy/vision/result.hpp"

int main() {
    // Initialize the model
    auto model = deploy::DeployDet("yolo11n-with-plugin.engine");
    // Load the image
    cv::Mat cvim = cv::imread("test_image.jpg");
    // Convert the image from BGR to RGB
    cv::cvtColor(cvim, cvim, cv::COLOR_BGR2RGB);
    deploy::Image im(cvim.data, cvim.cols, cvim.rows);
    // Model prediction
    deploy::DetResult result = model.predict(im);
    // Visualization (code omitted)
    // ...
    return 0;
}

For more deployment examples, please refer to the Model Deployment Examples section.

🖥️ Model Support List


Symbol legend: (1) ✅ : Supported; (2) ❔ : In progress; (3) ❌ : Not supported; (4) ❎ : Self-implemented export required for inference.

Task Scenario | Model | CLI Export | Inference Deployment
--- | --- | --- | ---
Detect | ultralytics/yolov3 | ✅ | ✅
Detect | ultralytics/yolov5 | ✅ | ✅
Detect | meituan/YOLOv6 | ❎ Refer to official export tutorial | ✅
Detect | WongKinYiu/yolov7 | ❎ Refer to official export tutorial | ✅
Detect | WongKinYiu/yolov9 | ❎ Refer to official export tutorial | ✅
Detect | THU-MIG/yolov10 | ✅ | ✅
Detect | ultralytics/ultralytics | ✅ | ✅
Detect | PaddleDetection/PP-YOLOE+ | ✅ | ✅
Segment | ultralytics/yolov3 | ✅ | ✅
Segment | ultralytics/yolov5 | ✅ | ✅
Segment | meituan/YOLOv6-seg | ❎ Implement yourself referring to tensorrt_yolo/export/head.py | ✅
Segment | WongKinYiu/yolov7 | ❎ Implement yourself referring to tensorrt_yolo/export/head.py | ✅
Segment | WongKinYiu/yolov9 | ❎ Implement yourself referring to tensorrt_yolo/export/head.py | ✅
Segment | ultralytics/ultralytics | ✅ | ✅
Classify | ultralytics/yolov3 | ✅ | ✅
Classify | ultralytics/yolov5 | ✅ | ✅
Classify | ultralytics/ultralytics | ✅ | ✅
Pose | ultralytics/ultralytics | ✅ | ✅
OBB | ultralytics/ultralytics | ✅ | ✅

☕ Buy the Author a Coffee

Open source projects require effort. If this project has been helpful to you, consider buying the author a coffee. Your support is the greatest motivation for the developer to keep maintaining the project!

📄 License

TensorRT-YOLO is licensed under the GPL-3.0 License, an OSI-approved open-source license that is ideal for students and enthusiasts, fostering open collaboration and knowledge sharing. Please refer to the LICENSE file for more details.

Thank you for choosing TensorRT-YOLO; we encourage open collaboration and knowledge sharing, and we hope you comply with the relevant provisions of the open-source license.

📞 Contact

For bug reports and feature requests regarding TensorRT-YOLO, please visit GitHub Issues!

🙏 Thanks

Featured on HelloGitHub