DeepLab-v3 Semantic Segmentation in TensorFlow

This repo attempts to reproduce DeepLabv3 in TensorFlow for semantic image segmentation on the PASCAL VOC dataset. The implementation is largely based on DrSleep's DeepLab v2 implementation and the ResNet implementation from tensorflow/models.

Setup

Please install the latest version of TensorFlow (r1.5) and use Python 3.

  • Download and extract the PASCAL VOC training/validation data (2 GB tar file), specifying its location with --data_dir.
  • Download and extract the augmented segmentation data (thanks to DrSleep), specifying its location with --data_dir and --label_data_dir (namely, $data_dir/$label_data_dir).
  • For inference, the trained model with 75.14% mIoU on the PASCAL VOC 2012 validation dataset is available here. Download and extract it to --model_dir.
  • For training, you need to download and extract the pre-trained ResNet v2 101 model from slim, specifying its location with --pre_trained_model.

Training

To train the model, you first need to convert the original data to the TensorFlow TFRecord format. This speeds up training.

python create_pascal_tf_record.py --data_dir DATA_DIR \
                                  --image_data_dir IMAGE_DATA_DIR \
                                  --label_data_dir LABEL_DATA_DIR 
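Conceptually, the converter pairs each JPEG in the image directory with the PNG label mask that shares its filename stem, then serializes each pair into a TFRecord. A minimal sketch of the pairing step (a hypothetical helper for illustration, not the script's actual code, which reads the dataset's train/val split lists):

```python
import os

def pair_examples(image_names, label_names):
    """Pair each image file with the label mask sharing its filename stem.

    Hypothetical helper illustrating the idea behind the TFRecord converter;
    images without a ground-truth mask are skipped.
    """
    labels = {os.path.splitext(n)[0]: n for n in label_names}
    pairs = []
    for img in sorted(image_names):
        stem = os.path.splitext(img)[0]
        if stem in labels:  # only keep images that have a matching mask
            pairs.append((img, labels[stem]))
    return pairs
```

Each resulting (image, mask) pair would then be encoded as a tf.train.Example and written with a TFRecord writer.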

Once you have created the TFRecords for the PASCAL VOC training and validation data, you can start training the model as follows:

python train.py --model_dir MODEL_DIR --pre_trained_model PRE_TRAINED_MODEL

Here, --pre_trained_model contains the pre-trained ResNet model, whereas --model_dir contains the trained DeepLabv3 checkpoints. If --model_dir contains valid checkpoints, training resumes from the latest checkpoint in --model_dir.
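The resume-vs-fine-tune decision can be sketched as follows. TensorFlow marks valid checkpoints with a `checkpoint` index file inside the model directory (this is what tf.train.latest_checkpoint relies on); the tuple labels here are illustrative, not the repo's actual code:

```python
import os

def choose_init(model_dir, pre_trained_model):
    """Pick where training starts from, mimicking the behavior described above.

    Returns ("resume", model_dir) when model_dir holds valid DeepLabv3
    checkpoints, else ("fine-tune", pre_trained_model) to start from the
    pre-trained ResNet weights.
    """
    # TensorFlow writes a "checkpoint" index file alongside valid checkpoints.
    if os.path.exists(os.path.join(model_dir, "checkpoint")):
        return ("resume", model_dir)
    return ("fine-tune", pre_trained_model)
```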

You can see other options with the following command:

python train.py --help

The training process can be visualized with TensorBoard as follows:

tensorboard --logdir MODEL_DIR

Evaluation

To evaluate how the model performs, one can use the following command:

python evaluate.py --help

The current best model built by this implementation achieves 75.14% mIoU on the PASCAL VOC 2012 validation dataset.

        Method                                      OS   mIoU
paper   MG(1,2,4) + ASPP(6,12,18) + Image Pooling   16   77.21%
repo    MG(1,2,4) + ASPP(6,12,18) + Image Pooling   16   75.14%

Here, our model was trained for about 9 hours (on a GTX 1080 Ti) with the following parameters:

python train.py --train_epochs 33 --batch_size 9 --model_dir models/ba=9,wd=5e-4,max_iter=35k --max_iter 35000
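For reference, mIoU is the mean over classes of intersection-over-union, computed from the pixel-level confusion matrix. A self-contained sketch of the metric (the repo's evaluate.py uses TensorFlow's streaming metrics instead):

```python
def mean_iou(confusion):
    """Mean intersection-over-union from a pixel-level confusion matrix.

    confusion[i][j] counts pixels of true class i predicted as class j.
    Classes absent from both prediction and ground truth are skipped,
    mirroring the usual mIoU convention.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, wrongly
        fn = sum(confusion[c]) - tp                       # true c, missed
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)
```

A perfect segmentation gives an mIoU of 1.0; the 75.14% above is this quantity averaged over the 21 PASCAL VOC classes.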

Inference

To apply semantic segmentation to your own images, one can use the following command:

python inference.py --data_dir DATA_DIR --infer_data_list INFER_DATA_LIST --model_dir MODEL_DIR 

The trained model is available here. A detailed explanation of the output masks, such as the meaning of each color, can be found in DrSleep's repo.

TODO:

Pull requests are welcome.

  • Freeze batch normalization during training
  • Multi-GPU support
  • Channels first support (Apparently large performance boost on GPU)
  • Model pretrained on MS-COCO
  • Unit test

Acknowledgment

This repo borrows code heavily from DrSleep's DeepLab v2 implementation and the ResNet implementation in tensorflow/models.
