The SUN RGB-D Scene dataset is available here. In addition, the allsplit.mat and SUNRGBDMeta.mat files need to be downloaded from the SUN RGB-D toolbox.
To localize the paths provided in the SUNRGBDMeta.mat file and to simplify the dataset layout on the local system:
sh run_steps.sh step="SAVE_SUNRGBD"
python main_steps.py --dataset-path "../data/sunrgbd/" --data-type "RGB_JPG" --debug-mode 0
This copies the RGB images into train/test folders, renaming the files with category information according to the provided train/test splits.
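As a rough illustration of this copy-and-rename step, the sketch below places images into a train or test folder and prefixes each file name with its category. It is not the repository's implementation: the function name `organize_split`, the `(image_path, category)` input format, and the `<category>__<file name>` naming scheme are all assumptions made for the example.

```python
import shutil
from pathlib import Path


def organize_split(samples, out_root, split):
    """Copy images into out_root/split, prefixing each name with its category.

    `samples` is an iterable of (image_path, category) pairs. This input
    format is hypothetical and chosen only for the illustration.
    """
    target_dir = Path(out_root) / split
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for image_path, category in samples:
        image_path = Path(image_path)
        # Hypothetical naming scheme: "<category>__<original file name>".
        target = target_dir / f"{category}__{image_path.name}"
        shutil.copy2(image_path, target)
        copied.append(target)
    return copied
```

Calling `organize_split(train_samples, "../data/sunrgbd/", "train")` and then the same with `"test"` would produce the two folders, with the category recoverable from each file name.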
sh run_steps.sh step="SAVE_SUNRGBD"
python main_steps.py --dataset-path "../data/sunrgbd/" --data-type "Depth_Colorized_HDF5" --debug-mode 0
This converts the depth maps to the proposed colorized RGB-like representations using the provided camera intrinsic values and saves the files in train/test folders in hdf5 file format. See the file structure here for the location of the saved files.
Note that data preparation is quite slow, especially for the depth data. However, it only needs to be run once.
To make the source code suitable for the SUN RGB-D Scene dataset, some minor changes are needed. We have evaluated the system with the ResNet-101 backbone model (the best in terms of recognition accuracy). Other models can be applied as well. However, to apply multi-level fusion with other models, the best layers on the dataset need to be evaluated and assigned (the get_best_modality_layers and get_best_trio_layers functions in each model). Refer to the paper for the details. The other changes can be summarized as follows.
- Unlike the Washington RGB-D Object dataset with its 10 train/test splits, the SUN RGB-D Scene dataset has a single train/test split. Therefore, the --split-no parameter is not needed, and the split number should be removed from the file paths of the saved/loaded model records (e.g. params.split_no in overall_struct.py).
- The dataset loaders in base_model.py need to be changed to the SUN RGB-D dataset loader (see the train/test data loaders in the eval function). We have already provided the related custom data loaders in the utils package for both the SUN RGB-D Scene and Washington RGB-D Object datasets. The same should be done in extract_cnn_features.py as well.
- Finally, use DataTypesSUNRGBD instead of DataTypes for the --data-type parameters, and use get_data_transform in demo_scene/demo.py instead of the current data transforms of the Washington RGB-D dataset.
The command line parameters to run the overall pipeline:
--dataset-path "../data/sunrgbd/"
This is the root path of the dataset.
--features-root "models-features"
This is the root folder for saving/loading models, features, weights, etc.
--data-type "RGB_JPG"
Data type to process: RGB_JPG for RGB data, Depth_Colorized_HDF5 for depth data, and RGBD for multi-modal fusion.
--net-model "alexnet"
Backbone CNN model to be employed as the feature extractor. It can be one of alexnet, vgg16_bn, resnet50, resnet101, and densenet121.
--debug-mode 1
This controls whether to run with the whole dataset (0) or with a small portion of the dataset (1). The default value is 1, to check that everything is fine with the setup.
--debug-size 3
This determines the subset size for debug mode. The default value of 3 means that 3 samples are taken for every instance of a category.
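The subsampling described for debug mode can be pictured with the short sketch below, which keeps at most `debug_size` samples per group. The function name `take_debug_subset` and the `(group_key, item)` input format are hypothetical; the actual data loaders may select their samples differently.

```python
from collections import defaultdict


def take_debug_subset(samples, debug_size=3):
    """Keep at most `debug_size` samples for each group key.

    `samples` is an iterable of (group_key, item) pairs, where the key
    identifies one instance of a category (hypothetical format).
    """
    kept = []
    counts = defaultdict(int)
    for key, item in samples:
        if counts[key] < debug_size:
            counts[key] += 1
            kept.append((key, item))
    return kept
```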
--log-dir "../logs"
This is the root folder for saving log files.
--batch-size 64
You can set the batch size with this parameter.
--run-mode 2
There are 3 run modes: 1 uses the finetuned backbone models, 2 uses fixed pretrained CNN models, and 3 is the fusion run. Before running fusion (3), you should first run the framework for RGB and depth with run-mode 1 or 2.
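The required order of runs (RGB and depth first, fusion last) can be sketched as a small helper that builds the three command lines. The helper name `build_fusion_commands` and the script name passed to it are hypothetical, since the entry script of the overall pipeline is not named here; only the flags and values come from the parameter descriptions. Whether the single-modality runs need further flags (for example `--save-features`) is not stated, so none are added.

```python
def build_fusion_commands(script, dataset_path, net_model="resnet101", single_mode=2):
    """Return argument lists for the RGB run, the depth run, and the fusion run, in order.

    `script` is the pipeline entry point; its real name is an assumption
    left to the caller.
    """
    if single_mode not in (1, 2):
        raise ValueError("single-modality runs use run-mode 1 or 2")
    common = ["python", script, "--dataset-path", dataset_path, "--net-model", net_model]
    return [
        # Single-modality runs must come first.
        common + ["--data-type", "RGB_JPG", "--run-mode", str(single_mode)],
        common + ["--data-type", "Depth_Colorized_HDF5", "--run-mode", str(single_mode)],
        # Fusion run, executed only after both runs above have finished.
        common + ["--data-type", "RGBD", "--run-mode", "3"],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in sequence, so that a failure in an earlier run stops the process before the fusion step.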
--num-rnn 128
You can set the number of random RNNs with this parameter.
--save-features 0
If you want to save features, you can set this parameter to 1.
--reuse-randoms 1
This decides whether already saved random weights are reused. If no saved weights are available, the weights are saved for later runs. If it is set to 0, weights are neither saved nor loaded, and the program generates new random weights for each run.
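The load-or-generate behaviour described for this parameter follows a common caching pattern, sketched below with the standard library only. The function name `get_random_weights`, the pickle file format, and the uniform range of the generated values are assumptions for illustration and are not taken from the repository.

```python
import pickle
import random
from pathlib import Path


def get_random_weights(path, shape, reuse=True, seed=None):
    """Return a rows x cols list of random weights, cached at `path` when `reuse` is set."""
    rows, cols = shape
    path = Path(path)
    if reuse and path.exists():
        # Saved weights are available: load them instead of generating new ones.
        with path.open("rb") as f:
            return pickle.load(f)
    rng = random.Random(seed)
    # Hypothetical initialization range, used only for the example.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    if reuse:
        # No saved weights yet: store them for later runs.
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("wb") as f:
            pickle.dump(weights, f)
    return weights
```

With `reuse=False` nothing is read from or written to disk, which mirrors the behaviour described for a value of 0.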
--pooling "random"
The pooling method can be one of max, avg, and random.
--load-features 0
If the features have already been saved (with --save-features 1), they can be loaded without running the whole pipeline again by setting this parameter to 1.
There is one other parameter, --trial. This is a control parameter for multiple runs; it can be used to evaluate different parameters in a controlled way.
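For reference, the parameters listed in this section could be declared with `argparse` roughly as follows. This is a sketch, not the repository's parser: only the defaults for `--debug-mode` and `--debug-size` are stated explicitly, so using the other example values as defaults is an assumption, as are the integer type and default of `--trial`.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command line parameters (illustrative only)."""
    parser = argparse.ArgumentParser(description="Overall pipeline parameters (sketch)")
    parser.add_argument("--dataset-path", default="../data/sunrgbd/")
    parser.add_argument("--features-root", default="models-features")
    parser.add_argument("--data-type", default="RGB_JPG",
                        choices=["RGB_JPG", "Depth_Colorized_HDF5", "RGBD"])
    parser.add_argument("--net-model", default="alexnet",
                        choices=["alexnet", "vgg16_bn", "resnet50", "resnet101", "densenet121"])
    parser.add_argument("--debug-mode", type=int, default=1, choices=[0, 1])
    parser.add_argument("--debug-size", type=int, default=3)
    parser.add_argument("--log-dir", default="../logs")
    parser.add_argument("--batch-size", type=int, default=64)
    parser.add_argument("--run-mode", type=int, default=2, choices=[1, 2, 3])
    parser.add_argument("--num-rnn", type=int, default=128)
    parser.add_argument("--save-features", type=int, default=0, choices=[0, 1])
    parser.add_argument("--reuse-randoms", type=int, default=1, choices=[0, 1])
    parser.add_argument("--pooling", default="random", choices=["max", "avg", "random"])
    parser.add_argument("--load-features", type=int, default=0, choices=[0, 1])
    # Assumed type and default; the text only says this is a control parameter.
    parser.add_argument("--trial", type=int, default=0)
    return parser
```

`argparse` maps dashes to underscores, so `--dataset-path` is read as `args.dataset_path`, which matches the `params.split_no` style of access mentioned earlier.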
See here for details on running the individual steps.