Tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation

dubvulture/tensor_fcn

TensorFCN

Tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation (FCN-8 in particular) based on the code written by shekkizh and modified to be used with ease for any given task.

The model can be applied to the Scene Parsing Challenge dataset provided by MIT straight after cloning this repo. For a pretrained model, check out this link. It is not very precise, but it is the only one I could get due to my lack of computing resources.

Table of contents

  1. Prerequisites
  2. Differences
  3. Usage
  4. Datasets

Prerequisites

  • numpy
  • scipy
  • opencv (both 2.4.x and 3.x should work)
  • Tested only with tensorflow 1.1.0 and python 2.7.12 on Ubuntu 16.04. I tried to make this python 3 compatible, but I haven't yet checked whether it works.

Differences

As frequently pointed out in the issue tracker of FCN.tensorflow, there were some discrepancies between the Caffe and the Tensorflow implementations. Here are the main ones and how I handled them:

  • Conv6 padding

In the original implementation there was no padding, in order to shrink the tensor down to [batch_size, 1, 1, 4096]. This works well when the input images are 224x224, but for any other resolution it will break the deconvolution phase. Since the padding did no harm and the results were still acceptable, I decided to leave it as it is.
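The shape arithmetic behind this point can be sketched in plain Python. This is only an illustration, not code from this repo: the helper names are hypothetical, and it assumes five stride-2 poolings with ceil rounding before Conv6 and a 7x7 Conv6 kernel, as in the usual VGG-based FCN layout.

```python
def pooled_size(n, num_pools=5):
    """Spatial size after `num_pools` stride-2 poolings (ceil rounding, assumed)."""
    for _ in range(num_pools):
        n = (n + 1) // 2
    return n


def conv_size(n, kernel, padding):
    """Output size of a stride-1 convolution along one spatial dimension."""
    if padding == "SAME":
        return n               # zero padding keeps the spatial size
    if padding == "VALID":
        return n - kernel + 1  # no padding: the map shrinks by kernel - 1
    raise ValueError("padding must be 'SAME' or 'VALID'")


# Compare the unpadded and padded Conv6 for a few input resolutions.
for side in (224, 256, 512):
    pool5 = pooled_size(side)
    unpadded = conv_size(pool5, 7, "VALID")
    padded = conv_size(pool5, 7, "SAME")
    print(side, pool5, unpadded, padded)
```

Under these assumptions, only a 224x224 input makes the unpadded Conv6 collapse to a 1x1 map. With padding, the Conv6 output keeps the size of the last pooling layer at every resolution.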

  • Average Pooling or Max Pooling

I just stuck to the original implementation using max pooling.

  • Final layer of VGG

I just stuck to the original implementation using the relu'd layer.

  • VGG19 or VGG16

The original implementation used VGG16, shekkizh used VGG19. I leave you with the freedom of choice.

Usage

example.py should be self-explanatory for basic usage. Note that a trained model can also be run on arbitrarily sized images, which are padded accordingly to avoid information loss during pooling.
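The padding idea can be sketched with numpy as follows. This is a hedged illustration, not the repo's actual code: the function name pad_to_multiple is hypothetical, and the choice of 32 as the multiple is an assumption based on five 2x poolings in the VGG encoder.

```python
import numpy as np


def pad_to_multiple(image, multiple=32):
    """Zero-pad an [height, width, channels] image on the bottom and right.

    The multiple of 32 is an assumption (five 2x poolings), not a value
    taken from this repo.
    """
    h, w = image.shape[:2]
    pad_h = (-h) % multiple  # rows needed to reach the next multiple
    pad_w = (-w) % multiple  # columns needed to reach the next multiple
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    return padded, (h, w)


image = np.ones((100, 150, 3), dtype=np.uint8)
padded, original_hw = pad_to_multiple(image)

# After inference, a prediction of the padded size can be cropped back
# to the original resolution.
cropped = padded[:original_hw[0], :original_hw[1]]
```

Padding only on the bottom and right keeps the original pixels at the same coordinates, so cropping the output back is a plain slice.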

Things you can set while setting up the network and training phase:

  • Number of classes
  • Validation Set (if you want to keep track of its loss)
  • Learning Rate
  • Keep Probability (1 - Dropout) for some layers
  • Training loss summary frequency
  • Validation loss summary frequency
  • Model saving frequency
  • Maximum number of steps
  • Choosing between VGG19 or VGG16 (and maybe others if implemented) as encoders.

Things you can tweak in the code:

Scene Parsing Dataset / ADEChallenge2016

get_ade_dataset.sh is a simple script to download, verify and extract the whole dataset. It also changes the file/directory structure in order to be more coherent and compatible with the ADE_Dataset class.

Datasets

In dataset_reader there are two classes: BatchDataset, which is meant to be an abstract class, and ADE_Dataset, which is an example of how to specialize BatchDataset and is ready to be used for training. Basic usage of a subclass:

dt = MyDataset(*args, **kwargs)
images, annotations, weights, names = dt.next_batch()

where images, annotations and weights are numpy arrays of shape [batch_size, height, width, channels] (3 channels for images and 1 for both annotations and weights). In ADE_Dataset weights are not used but for other tasks with different datasets they might be useful.

When subclassing BatchDataset there are things to keep in mind:

  • the argument names is required to identify the dataset's elements, but the user does not have to pass it to the subclass unless you want to be able to specify a subset of the dataset (as I did with ADE_Dataset when creating the validation set)
  • the argument image_op is often required to perform cropping and resizing when handling batch sizes greater than 1 unless you have a homogeneous dataset
  • image_size has to be specified only if batch_size is greater than 1
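The points above can be illustrated with a toy stand-in. The README does not show BatchDataset's constructor, so the class below does not subclass it: ToyDataset, its constructor arguments, the fake loader and the center_crop helper are all hypothetical. It only mimics the next_batch contract described above.

```python
import numpy as np


def center_crop(array, size):
    """Hypothetical image_op: crop the central (height, width) region."""
    h, w = array.shape[:2]
    top = (h - size[0]) // 2
    left = (w - size[1]) // 2
    return array[top:top + size[0], left:left + size[1]]


class ToyDataset(object):
    """Stand-in for a BatchDataset subclass; NOT the repo's class."""

    def __init__(self, names, batch_size, image_size=None, image_op=None):
        if batch_size > 1 and image_size is None:
            raise ValueError("image_size is required when batch_size > 1")
        self.names = list(names)  # identifies the elements of the dataset
        self.batch_size = batch_size
        self.image_size = image_size
        self.image_op = image_op
        self.cursor = 0

    def _load(self, name):
        # Fake loader: a real subclass would read the files identified by name.
        rng = np.random.RandomState(len(name))
        h, w = rng.randint(40, 60, size=2)
        image = rng.randint(0, 256, size=(h, w, 3)).astype(np.uint8)
        annotation = rng.randint(0, 5, size=(h, w, 1)).astype(np.uint8)
        return image, annotation

    def next_batch(self):
        count = len(self.names)
        batch_names = [self.names[(self.cursor + i) % count]
                       for i in range(self.batch_size)]
        self.cursor = (self.cursor + self.batch_size) % count
        images, annotations = [], []
        for name in batch_names:
            image, annotation = self._load(name)
            if self.image_op is not None:
                # Bring every element to a common size so they can be stacked.
                image = self.image_op(image, self.image_size)
                annotation = self.image_op(annotation, self.image_size)
            images.append(image)
            annotations.append(annotation)
        images = np.stack(images)
        annotations = np.stack(annotations)
        weights = np.ones(annotations.shape, dtype=np.float32)
        return images, annotations, weights, batch_names


dt = ToyDataset(["a", "b", "c"], batch_size=2,
                image_size=(32, 32), image_op=center_crop)
images, annotations, weights, names = dt.next_batch()
```

Because the fake images have different sizes, stacking them into one array only works after image_op has brought them to image_size, which is why both arguments matter when batch_size is greater than 1.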
