SUN RGB-D Scene Recognition

Data Preparation

The SUN RGB-D Scene dataset is available here. In addition, the allsplit.mat and SUNRGBDMeta.mat files need to be downloaded from the SUN RGB-D toolbox. To localize the paths provided in SUNRGBDMeta.mat and to organize the dataset for the local system, run:

sh run_steps.sh step="SAVE_SUNRGBD"
python main_steps.py --dataset-path "../data/sunrgbd/" --data-type "RGB_JPG" --debug-mode 0

This copies the RGB images into train/test folders, renaming the files with category information according to the provided train/test splits.
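As a rough illustration of what this step does, the sketch below copies images into train/test folders and prefixes each file name with its category. The function name `copy_with_category`, the `<category>__<filename>` naming pattern, and the record layout are assumptions made for illustration, not the repository's actual implementation.

```python
import shutil
from pathlib import Path


def copy_with_category(records, out_root):
    """Copy each image into out_root/<split>/ as <category>__<original name>.

    records: iterable of (source_path, category, split) tuples, where split
    is "train" or "test". This record layout is hypothetical.
    """
    out_root = Path(out_root)
    copied = []
    for src, category, split in records:
        src = Path(src)
        dst_dir = out_root / split
        dst_dir.mkdir(parents=True, exist_ok=True)
        # Encode the category in the file name so that labels can be
        # recovered from the name alone.
        dst = dst_dir / f"{category}__{src.name}"
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Encoding the label in the file name keeps the train/test folders flat, at the cost of having to parse the name when loading.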

sh run_steps.sh step="SAVE_SUNRGBD"
python main_steps.py --dataset-path "../data/sunrgbd/" --data-type "Depth_Colorized_HDF5" --debug-mode 0

This converts the depth maps to the proposed colorized RGB-like representations using the provided camera intrinsic values and saves the results in train/test folders in HDF5 format. See the file structure here for the location of the saved files.
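Camera intrinsics usually enter depth processing through pinhole back-projection, which turns each depth pixel into a 3D point. The sketch below shows only that generic step. It is not the repository's colorization method, and the function name and the (fx, fy, cx, cy) argument layout are assumptions.

```python
def backproject(depth, fx, fy, cx, cy):
    """Convert a depth map (list of rows) into 3D camera-space points.

    Uses the standard pinhole model:
        x = (u - cx) * z / fx,  y = (v - cy) * z / fy
    Pixels with non-positive depth are treated as missing and skipped.
    Returns a list of (x, y, z) tuples in the same unit as the depth values.
    """
    points = []
    for v, row in enumerate(depth):      # v: row index (image y)
        for u, z in enumerate(row):      # u: column index (image x)
            if z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points
```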

Note that data preparation is quite slow, especially for the depth data. However, it only needs to be run once.

Preparing Source Codes

To make the source code suitable for the SUN RGB-D Scene dataset, some minor changes need to be made. We have evaluated the system with the ResNet-101 backbone model (the best in terms of recognition accuracy). Other models can be used as well. However, to apply multi-level fusion with other models, the best layers on the dataset need to be evaluated and assigned (see the get_best_modality_layers and get_best_trio_layers functions in each model). Refer to the paper for details. The other changes can be summarized as follows.

Unlike the Washington RGB-D Object dataset with its 10 train/test splits, SUN RGB-D has a single train/test split. The --split-no parameter is therefore not needed, and the split number should be removed from the file paths of the saved/loaded model records (e.g. params.split_no in overall_struct.py).
The dataset loaders in base_model.py need to be changed to the SUN RGB-D dataset loader (see the train/test data loaders in the eval function). Custom data loaders for both the SUN RGB-D Scene and Washington RGB-D Object datasets are already provided in the utils package. The same change should be made in extract_cnn_features.py.
Finally, use DataTypesSUNRGBD instead of DataTypes for the --data-type parameter, and use get_data_transform in demo_scene/demo.py instead of the current data transforms for the Washington RGB-D dataset.

Params for Overall Run

The command line parameters to run the overall pipeline:

--dataset-path "../data/sunrgbd/"

This is the root path of the dataset.

--features-root "models-features"

This is the root folder for saving/loading models, features, weights, etc.

--data-type "RGB_JPG"

The data type to process: RGB_JPG for RGB data, Depth_Colorized_HDF5 for depth data, and RGBD for multi-modal fusion.

--net-model "alexnet"

Backbone CNN model to be employed as the feature extractor. Could be one of these: alexnet, vgg16_bn, resnet50, resnet101, and densenet121.

--debug-mode 1

This controls whether to run on the whole dataset (0) or on a small portion of it (1). The default value is 1, so that the setup can be checked quickly.

--debug-size 3

This sets the sample size for debug mode. The default value of 3 means that 3 samples are taken for every instance of a category.
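The effect of --debug-size can be pictured as keeping at most N samples per group. A minimal sketch, assuming samples arrive as (path, group) pairs and that the first N per group are kept; the helper `take_per_group` is hypothetical, and the repository may select samples differently.

```python
from collections import defaultdict


def take_per_group(samples, debug_size=3):
    """Keep at most debug_size samples per group, preserving input order.

    samples: iterable of (path, group) pairs, where "group" stands for
    whatever unit the loader subsamples (hypothetical layout).
    """
    counts = defaultdict(int)
    kept = []
    for path, group in samples:
        if counts[group] < debug_size:
            counts[group] += 1
            kept.append((path, group))
    return kept
```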

--log-dir "../logs"

This is the root folder for saving log files.

--batch-size 64

You can set the batch size with this parameter.

--run-mode 2

There are 3 run modes: 1 uses the finetuned backbone models, 2 uses fixed pretrained CNN models, and 3 is the fusion run. Before running fusion (3), you should first run the framework for RGB and depth with run-mode 1 or 2.
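The ordering constraint can be sketched as two single-modality runs followed by the fusion run. Only flags documented in this file are used. The helper `fusion_plan` is hypothetical, pairing --data-type "RGBD" with --run-mode 3 is an assumption, and the entry-point script is deliberately left out because it is not named here.

```python
def fusion_plan(dataset_path, net_model, single_run_mode=2):
    """Return argument lists for an RGB run, a depth run, then a fusion run."""
    if single_run_mode not in (1, 2):
        raise ValueError("single-modality runs use --run-mode 1 or 2")
    common = ["--dataset-path", dataset_path, "--net-model", net_model]
    mode = str(single_run_mode)
    return [
        common + ["--data-type", "RGB_JPG", "--run-mode", mode],
        common + ["--data-type", "Depth_Colorized_HDF5", "--run-mode", mode],
        # Assumption: the fusion run combines the RGBD data type with mode 3.
        common + ["--data-type", "RGBD", "--run-mode", "3"],
    ]
```

Each returned list can be appended to the command that launches the overall pipeline, in the order given.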

--num-rnn 128

You can set the number of random RNNs with this parameter.

--save-features 0

If you want to save features, you can set this parameter to 1.

--reuse-randoms 1

This decides whether already saved random weights are reused. If it is set to 1 and no saved weights are available, the weights are saved for later runs. If it is set to 0, weights are neither saved nor loaded, and the program generates new random weights for each run.
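The load-or-generate behavior described above can be sketched with a small on-disk cache. This is an illustration only: the function `get_random_weights`, the pickle file format, and the uniform weight distribution are assumptions, not the repository's implementation.

```python
import pickle
import random
from pathlib import Path


def get_random_weights(path, size, reuse_randoms=1, seed=None):
    """Return a list of random weights, optionally cached on disk.

    reuse_randoms=1: load the weights from path if the file exists;
    otherwise generate them and save them for later runs.
    reuse_randoms=0: always generate new weights and never touch the file.
    """
    path = Path(path)
    if reuse_randoms and path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(size)]
    if reuse_randoms:
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("wb") as f:
            pickle.dump(weights, f)
    return weights
```

Reusing the saved weights makes runs comparable, since differences in accuracy then come from the parameters under study and not from a new random draw.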

--pooling "random"

Pooling method can be one of max, avg, and random.

--load-features 0

If the features have already been saved (with --save-features 1), they can be loaded without running the whole pipeline again by setting this parameter to 1.

There is one other parameter, --trial. It is a control parameter for multiple runs and can be used to evaluate different parameters in a controlled way.
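For reference, the parameters above can be collected into a single argparse parser. This is a sketch mirroring the documented flags, not the repository's parser: only the defaults for --debug-mode and --debug-size are stated in this file, the other defaults simply reuse the example values shown above, and the type and default of --trial are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the flags documented in this file."""
    p = argparse.ArgumentParser(description="SUN RGB-D overall run (sketch)")
    p.add_argument("--dataset-path", default="../data/sunrgbd/")
    p.add_argument("--features-root", default="models-features")
    p.add_argument("--data-type", default="RGB_JPG",
                   choices=["RGB_JPG", "Depth_Colorized_HDF5", "RGBD"])
    p.add_argument("--net-model", default="alexnet",
                   choices=["alexnet", "vgg16_bn", "resnet50",
                            "resnet101", "densenet121"])
    p.add_argument("--debug-mode", type=int, default=1, choices=[0, 1])
    p.add_argument("--debug-size", type=int, default=3)
    p.add_argument("--log-dir", default="../logs")
    p.add_argument("--batch-size", type=int, default=64)
    p.add_argument("--run-mode", type=int, default=2, choices=[1, 2, 3])
    p.add_argument("--num-rnn", type=int, default=128)
    p.add_argument("--save-features", type=int, default=0, choices=[0, 1])
    p.add_argument("--reuse-randoms", type=int, default=1, choices=[0, 1])
    p.add_argument("--pooling", default="random",
                   choices=["max", "avg", "random"])
    p.add_argument("--load-features", type=int, default=0, choices=[0, 1])
    # Assumption: --trial is an integer run identifier starting at 0.
    p.add_argument("--trial", type=int, default=0)
    return p
```

For example, `build_parser().parse_args(["--net-model", "resnet101", "--debug-mode", "0"])` selects the ResNet-101 backbone on the full dataset and leaves the remaining values at the defaults listed here.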

Run Individual Steps

See here for the details to run individual steps.

Back to Home Page
