detr module

The detr module contains the DetrLearner class, which inherits from the abstract class Learner.

Class DetrLearner

Bases: engine.learners.Learner

The DetrLearner class is a wrapper of the DETR [1] object detection algorithm based on the original DETR implementation. It can be used to perform object detection on images (inference) and train DETR object detection models.

The DetrLearner class has the following public methods:

`DetrLearner` constructor

DetrLearner(self, model_config_path, iters, lr, batch_size, optimizer, backbone, checkpoint_after_iter, checkpoint_load_iter,
temp_path, device, threshold, num_classes, panoptic_segmentation)

Constructor parameters:

model_config_path: str, default="OpenDR/src/perception/object_detection_2d/detr/algorithm/config/model_config.yaml"
Specifies the path to the config file that contains the additional parameters from the original DETR implementation.
iters: int, default=10
Specifies the number of epochs the training should run for.
lr: float, default=1e-4
Specifies the initial learning rate to be used during training.
batch_size: int, default=1
Specifies number of images to be bundled up in a batch during training. This heavily affects memory usage, adjust according to your system.
optimizer: {'sgd', 'adam', 'adamw'}, default='adamw'
Specifies the type of optimizer that is used during training.
backbone: {'resnet50', 'resnet101'}, default='resnet50'
Specifies the backbone architecture. Other Torchvision backbones are also valid, but have no pretrained DETR models available. Therefore other backbone models have to be learned from scratch.
checkpoint_after_iter: int, default=0
Specifies per how many training iterations a checkpoint should be saved. If it is set to 0 no checkpoints will be saved.
checkpoint_load_iter: int, default=0
Specifies which checkpoint should be loaded. If it is set to 0, no checkpoints will be loaded.
temp_path: str, default='temp'
Specifies a path where the algorithm looks for pretrained backbone weights, the checkpoints are saved along with the logging files.
device: {'cpu', 'cuda'}, default='cuda'
Specifies the device to be used.
threshold: float, default=0.7
Specifies the threshold for object detection inference. An object is detected if the confidence of the output is higher than the specified threshold.
num_classes: int, default=91
Specifies the number of classes of the model. The default is 91, since this is the number of classes in the COCO dataset, but modifying the num_classes allows the user to train on its own dataset. It is also possible to use pretrained DETR models with the specified num_classes, since the head of the pretrained model with be modified appropriately. In this way, a model that was pretrained on the coco dataset can be finetuned to another dataset. Training on other datasets than COCO can be done by creating a DatasetIterator that outputs (Image, BoundingBoxList) tuples. Below you can find an example that shows how you can create such a DatasetIterator.
panoptic_segmentations: bool, default=False
Specifies whether panoptic segmentation is performed. If True, the download() method will download COCO panoptic models and the model returns, next to bounding boxes, segmentations of objects.

`DetrLearner.fit`

DetrLearner.fit(self, dataset, val_dataset, logging_path, silent, verbose, annotations_folder, train_images_folder,
train_annotations_file, val_images_folder, val_annotations_file)

This method is used for training the algorithm on a train dataset and validating on a val dataset. Returns a dictionary containing stats regarding the last evaluation ran.

Parameters:

dataset: object
Object that holds the training dataset. Can be of type ExternalDataset or a custom dataset inheriting from DatasetIterator.
val_dataset : object, default=None
Can be of type ExternalDataset or a custom dataset inheriting from DatasetIterator. Object that holds the validation dataset.
logging_path : str, default=''
Path to save TensorBoard log files. If set to None or '', TensorBoard logging is disabled.
silent : bool, default=False
If True, all printing of training progress reports and other information to STDOUT are disabled.
verbose : bool, default=True
Enables the maximum verbosity.
annotations_folder : str, default='Annotations'
Folder name of the annotations json file. This folder should be contained in the dataset path provided.
train_images_folder : str, default='train2017'
Name of the folder that contains the train dataset images. This folder should be contained in the dataset path provided. Note that this is a folder name, not a path.
train_annotations_file : str, default='instances_train2017.json'
Filename of the train annotations json file. This file should be contained in the dataset path provided.
val_images_folder : str, default='val2017'
Folder name that contains the validation images. This folder should be contained in the dataset path provided. Note that this is a folder name, not a path.
val_annotations_file : str, default='instances_val2017.json'
Filename of the validation annotations json file. This file should be contained in the dataset path provided in the annotations folder provided.

`DetrLearner.eval`

DetrLearner.eval(self, dataset, images_folder, annotations_folder, annotations_file)

This method is used to evaluate a trained model on an evaluation dataset. Returns a dictionary containing stats regarding evaluation.

Parameters:

dataset : object
ExternalDataset class object or DatasetIterator class object. Object that holds the evaluation dataset.
images_folder : str, default='val2017'
Folder name that contains the dataset images. This folder should be contained in the dataset path provided. Note that this is a folder name, not a path.
annotations_folder : str, default='Annotations'
Folder name of the annotations json file. This file should be contained in the dataset path provided.
annotations_file : str, default='instances_val2017.json'
Filename of the annotations json file. This file should be contained in the dataset path provided.
verbose : bool, default=True
Enables the maximum verbosity.

`DetrLearner.infer`

DetrLearner.infer(self, image)

This method is used to perform object detection on an image. Returns an engine.target.BoundingBoxList object, which contains bounding boxes that are described by the left-top corner and its width and height, or returns an empty list if no detections were made.

Parameters:

image : object
Image of type engine.data.Image class or np.array. Image to run inference on.

`DetrLearner.save`

DetrLearner.save(self, path, verbose)

This method is used to save a trained model. Provided with the path, it creates the "name" directory, if it does not already exist. Inside this folder, the model is saved as "detr_[backbone_model].pth" and the metadata file as "detr_[backbone].json". If the directory already exists, the "detr_[backbone_model].pth" and "detr_[backbone].json" files are overwritten.

If self.optimize was run previously, it saves the optimized ONNX model in a similar fashion with an ".onnx" extension, by copying it from the self.temp_path it was saved previously during conversion.

Parameters:

path: str
Path to save the model, including the filename.
verbose: bool, default=False
Enables the maximum verbosity.

`DetrLearner.load`

DetrLearner.load(self, path)

This method is used to load a previously saved model from its saved folder. Loads the model from inside the directory of the path provided, using the metadata .json file included.

Parameters:

path: str
Path of the model to be loaded.

`DetrLearner.optimize`

DetrLearner.optimize(self, do_constant_folding)

This method is used to optimize a trained model to ONNX format which can be then used for inference.

Parameters:

do_constant_folding: bool, default=False
ONNX format optimization. If True, the constant-folding optimization is applied to the model during export. Constant-folding optimization will replace some of the ops that have all constant inputs, with pre-computed constant nodes.

`DetrLearner.download`

DetrLearner.download(self, path, mode, verbose)

Download utility for various DETR components. Downloads files depending on mode and saves them in the path provided. It supports downloading:

The default resnet50 and resnet101 pretrained models.
The weights for resnet50 and resnet101 bacbones.
A test dataset with a single COCO image and its annotation.

Parameters:

path : str, default=None.
Local path to save the files.
mode : {'pretrained', 'weights', 'test_data'}, default='pretrained'
This str determines what file to download. Note that for modes 'weights' and 'pretrained' a model is downloaded and loaded according to the value of self.backbone. Backbones for which pretrained models are available, are: 'resnet50' and 'resnet101'. Also, a pretrained model with dilation is downloaded in case self.args.dilation is True. In case self.panoptic_segmentation is True, a model that was pretrained on the COCO panoptic dataset is downloaded.
verbose : bool, default=True
Enables the maximum verbosity.

ROS Node

A ROS node is available for performing inference on an image stream. Documentation on how to use this node can be found here.

Tutorials and Demos

A tutorial on performing inference is available here. Furthermore, demos on performing training, evaluation and inference are also available.

Examples

Training example using an ExternalDataset.

To train properly, the backbone weights are downloaded automatically in the temp_path. Default backbone is 'resnet50'. The training and evaluation dataset should be present in the path provided, along with the JSON annotation files. The default COCO 2017 training data can be found here (train, val, annotations). The batch_size argument should be adjusted according to available memory.

from opendr.perception.object_detection_2d import DetrLearner
from opendr.engine.datasets import ExternalDataset

detr_learner = DetrLearner(temp_path='./parent_dir', batch_size=8, device="cuda")

training_dataset = ExternalDataset(path="./data", dataset_type="COCO")
validation_dataset = ExternalDataset(path="./data", dataset_type="COCO")

detr_learner.fit(dataset=training_dataset, val_dataset=validation_dataset, logging_path="./logs")
detr_learner.save('./saved_models/trained_model')

Training example with a custom DatasetIterator.

This example serves to show how a custom dataset can be created by a user and used for training. In this way, the user can easily train on its own dataset. In order to do this, the user should create a DatasetIterator object that outputs (Image, BoundingBoxList) tuples. Here we show an example for doing this for the COCO dataset, but this can be done for any dataset as long as the DatasetIterator outputs (Image, BoundingBoxList) tuples.

import os
import numpy as np
from pycocotools.coco import COCO
from opendr.engine.datasets import DatasetIterator
from opendr.engine.data import Image
from opendr.engine.target import BoundingBoxList
from opendr.perception.object_detection_2d.detr.detr_learner import DetrLearner
from PIL import Image as im

# We create a DatasetIterator object that loads coco images and annotations and outputs (Image, BoundingBoxList) tuples.
class CocoDatasetIterator(DatasetIterator):
    def __init__(self, image_folder, annotations_file):
        super().__init__()
        self.root = os.path.expanduser(image_folder)
        self.coco = COCO(annotations_file)
        self.ids = list(self.coco.imgs.keys())

    def __getitem__(self, idx):
        # Get ids of image and annotations
        img_id = self.ids[idx]
        ann_ids = self.coco.getAnnIds(imgIds=img_id)

        # Load the annotations with pycocotools
        target = self.coco.loadAnns(ann_ids)

        # Convert coco annotations to BoundingBoxList objects
        bounding_box_list = BoundingBoxList.from_coco(target, image_id=img_id)

        # Load images
        path = self.coco.loadImgs(img_id)[0]['file_name']
        img = im.open(os.path.join(self.root, path)).convert('RGB')

        # Convert image to Image object
        image = Image(np.array(img))

        return image, bounding_box_list

    def __len__(self):
        return len(self.ids)

# We create a learner that trains for 3 epochs
learner = DetrLearner(iters=3, temp_path="temp")

# We download a pretrained detr model from the detr repo
learner.download()

# Download dummy dataset with a single picture
learner.download("test_data")

# The dummy dataset is stored in the temp_path
image_folder = "temp/nano_coco/image"
annotations_file = "temp/nano_coco/instances.json"

dataset = CocoDatasetIterator(image_folder, annotations_file)

learner.fit(dataset)

Inference and result drawing example on a test .jpg image, similar to and partially copied from detr_demo colab.

This example shows how to perform inference on an image and draw the resulting bounding boxes using a detr model that is pretrained on the coco dataset.

import numpy as np
import urllib
import cv2
from opendr.perception.object_detection_2d import DetrLearner
from opendr.perception.object_detection_2d.detr.algorithm.util.draw import draw


# Download an image
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
req = urllib.request.urlopen(url)
arr = np.asarray(bytearray(req.read()), dtype=np.uint8)
img = cv2.imdecode(arr, -1)

learner = DetrLearner(threshold=0.7, backbone='resnet101')
learner.download()
bounding_box_list = learner.infer(img)
cv2.imshow('Detections', draw(img, bounding_box_list))
cv2.waitKey(0)

Inference and result drawing example on a test .jpg image with segmentations, similar to detr_demo colab.

This example shows how to perform inference on an image and draw the resulting bounding boxes and segmentations using a detr model that is pretrained on the coco_panoptic dataset.

import numpy as np
import urllib
import cv2
from opendr.perception.object_detection_2d import DetrLearner
from opendr.perception.object_detection_2d.detr.algorithm.util.draw import draw

# Download an image
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
req = urllib.request.urlopen(url)
arr = np.asarray(bytearray(req.read()), dtype=np.uint8)
img = cv2.imdecode(arr, -1)

# We want to return the segmentations and plot those, so we set panoptic_segmentation to True.
# Also, we have to modify the number of classes, since the number of panoptic classes in the pretrained detr model is 250.
learner = DetrLearner(panoptic_segmentation=True, num_classes=250)
learner.download()
bounding_box_list = learner.infer(img)
cv2.imshow('Detections', draw(img, bounding_box_list))
cv2.waitKey(0)

Optimization example for a previously trained model.

Inference can be run with the trained model after running self.optimize.

from opendr.perception.object_detection_2d.detr.detr_learner import DetrLearner

detr_learner = DetrLearner()
detr_learner.download()
detr_learner.optimize()
detr_learner.save('./parent_dir/optimized_model')

References

[1] End-to-end Object Detection with Transformers, arXiv.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

detr.md

detr.md

detr module

Class DetrLearner

`DetrLearner` constructor

`DetrLearner.fit`

`DetrLearner.eval`

`DetrLearner.infer`

`DetrLearner.save`

`DetrLearner.load`

`DetrLearner.optimize`

`DetrLearner.download`

ROS Node

Tutorials and Demos

Examples

References

Files

detr.md

Latest commit

History

detr.md

File metadata and controls

detr module

Class DetrLearner

DetrLearner constructor

DetrLearner.fit

DetrLearner.eval

DetrLearner.infer

DetrLearner.save

DetrLearner.load

DetrLearner.optimize

DetrLearner.download

ROS Node

Tutorials and Demos

Examples

References

`DetrLearner` constructor

`DetrLearner.fit`

`DetrLearner.eval`

`DetrLearner.infer`

`DetrLearner.save`

`DetrLearner.load`

`DetrLearner.optimize`

`DetrLearner.download`