Skip to content

Latest commit

 

History

History
103 lines (83 loc) · 6.53 KB

ROS-Nodes.md

File metadata and controls

103 lines (83 loc) · 6.53 KB

This document describes the ROS nodes that are used or implemented by the project.

  1. Camera node
  2. DNN node
    1. TrailNet
    2. YOLO
    3. Running tests
  3. Controller node
  4. Image publisher node

Camera node

The camera node (gscam) publishes to the /camera/image_raw topic using the standard ROS Image message. This is the master source which provides information to other nodes like the TrailNet DNN, and obstacle avoidance.

DNN node

The caffe_ros node provides support for DNN inference using TensorRT library. This node currently supports Caffe models such as classification (TrailNet), YOLO-based object detection, regression and semantic segmentation. The node produces messages in different formats depending on the type of the network and post-processing options (e.g. TrailNet vs YOLO).

The timestamp in the header of the message is the same as the timestamp from the corresponding camera message. This simplifies synchronization of the image frames with DNN output and other nodes for the purpose of debugging and logging.

TrailNet

This node publishes output of the DNN (e.g. softmax layer) using standard Image message.

  • for 1D output like that of the softmax layer, the dimensions of the published image are 1x1xC where C is the size of the layer (e.g. 6 for the TrailNet out layer or 1000 for ImageNet's softmax layer).
  • The encoding field of the message is in the following format: 32FCXXX where XXX is equal to number of channels. For example, for TrailNet: 32FC6, for ImageNet: 32FC1000. 32 means 32-bit and F means float. See the ROS documentation for more details.
  • The data field contains the output of the DNN in byte array representation so it should be cast to a 32-bit float type.

The default topic name is /caffe_ros/network/output but it can be changed using launch file or command line.

The node requires two parameters:

  • prototxt_path - path to the Caffe model .prototxt file
  • model_path - path to the Caffe model binary file Additionally, make sure that other parameters, like the input/output layer names (input_layer/output_layer) are set correctly. An example of the command line for TrailNet:
rosrun caffe_ros caffe_ros_node __name:=trails_dnn _prototxt_path:=/data/src/redtail/models/pretrained/TrailNet_SResNet-18.prototxt _model_path:=/data/src/redtail/models/pretrained/TrailNet_SResNet-18.caffemodel _output_layer:=out

Note: when using rosrun remember to prefix parameters with _ (__ for system parameters) and use := when assigning a value.

To check that the DNN node is producing some data, make sure that camera node (a real camera via gscam or an image_pub node) is running, and execute rostopic echo /trails_dnn/network/output. An example of the output:

header: 
  seq: 201
  stamp: 
    secs: 1501106944
    nsecs: 796543294
  frame_id: ''
height: 1
width: 1
encoding: 32FC6
is_bigendian: 0
step: 24
data: [110, 74, 230, 58, 135, 125, 95, 60, 228, 14, 124, 63, 133, 26, 6, 61, 156, 75, 115, 63, 91, 87, 138, 60]

To convert the data byte array to float array, use this simple Python script:

import struct
data = bytearray([110, 74, 230, 58, 135, 125, 95, 60, 228, 14, 124, 63, 133, 26, 6, 61, 156, 75, 115, 63, 91, 87, 138, 60])
print(struct.unpack('<%df' % (len(data) / 4), data))

YOLO

In case the node is launched with post_proc:=YOLO argument, then a special YOLO post-processing is applied. In such case, the node publishes output of the DNN using standard Image message in a certain format: the output is a 2D, single-channel "image" that has the following format: WxHx1 (so encoding == 32FC1) where W is fixed and equals 6, and H is equal to the number of detected objects. For example, if the DNN has detected 2 objects, then the output is 6x2 image. For each detected object, the 6 values are the following:

0  : label (class) of the detected object (e.g. person or a dog).
1  : probability of this object.
2,3: x and y coordinates of the top left corner of the object in image coordinates.
4,5: width and height of the object in image coordinates.

All values are 32-bit floats, including label. Label indices correspond to 20 classes from PASCAL VOC 2012 dataset.

For example, if DNN detected a person (label:14) and a dog (label: 12) in the image with dimensions 320x180 then the output might look something like that:

Label Prob X Y Width Height
14.0 0.5 120.0 80.0 30.0 60.0
12.0 0.4 160.0 115.0 40.0 20.0

Running tests

To verify that the DNN node is working properly, run the basic end-to-end tests that come with caffe_ros package. In the catkin workspace directory, run the following (replace paths if needed):

catkin_make caffe_ros_tests
export REDTAIL_TEST_DIR=/data/src/redtail/ros/packages/caffe_ros/tests/data
export REDTAIL_MODEL_DIR=/data/src/redtail/models/pretrained
rostest caffe_ros tests_basic.launch test_data_dir:=$REDTAIL_TEST_DIR \
    trail_prototxt_path:=$REDTAIL_MODEL_DIR/TrailNet_SResNet-18.prototxt trail_model_path:=$REDTAIL_MODEL_DIR/TrailNet_SResNet-18.caffemodel \
    object_prototxt_path:=$REDTAIL_MODEL_DIR/yolo-relu.prototxt object_model_path:=$REDTAIL_MODEL_DIR/yolo-relu.caffemodel

Note: running the tests on the device like TX2 may take about two minutes.

Controller node

Image publisher

The image_pub node is a simple ROS node that reads video or image files and publishes frames as a ROS topic as an Image message. There is one mandatory parameter, img_path, which specifies the path to an image or video file. This node supports setting custom frame rates as well as repeating an image indefinitely. The default topic is /camera/image_raw which can be changed using the camera_topic parameter.

An example of publishing frames from the video file using the file's native frame rate:

rosrun image_pub image_pub_node _img_path:=/data/videos/trail_test.mp4

And example of publishing the same image repeatedly at 30 frames/sec:

rosrun image_pub image_pub_node _img_path:=/data/images/trail_right.png _pub_rate:=30 _repeat:=true