
A full Keras implementation of Fully Convolutional Networks (FCNs) for the task of semantic image segmentation.


Fully-Convolutional-Networks-for-Semantic-Segmentation

A Keras implementation of the Fully Convolutional Networks (FCN-32s and FCN-8s) proposed by Shelhamer et al., applied to semantic segmentation on the PASCAL VOC dataset. For more details on the task of semantic segmentation, the methodology, datasets, network internals, results, and discussion, please see the detailed report.

Semantic image segmentation is the task of assigning every pixel in an image to a class, i.e. a dense prediction/classification task at the pixel level. The following example shows an image and its semantically segmented version.


Networks

Fully Convolutional Networks are encoder-decoder architectures: the encoder progressively downsamples the image into coarse, semantically rich feature maps, while the decoder, built from upsampling/deconvolutional layers, turns them into pixel-wise dense predictions. FCN-8s improves on FCN-32s by fusing the outputs from shallower layers (rich in location information) with those from deeper layers, producing sharper, boundary-preserving predictions. The following panel shows an image, its ground truth, the FCN-32s result, and the FCN-8s result (boosted with skip connections), respectively.
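The skip-fusion idea above can be sketched in a few lines of NumPy: upsample the deeper (coarser) class-score map to the resolution of a shallower one, add the two element-wise, and take a per-pixel argmax. This is a toy illustration, not the repository's Keras code; the actual networks use learned bilinear deconvolution rather than the nearest-neighbour stand-in below, and the shapes here are invented for the example.

```python
import numpy as np

def upsample(scores, factor):
    """Nearest-neighbour upsampling of an (H, W, C) score map.
    (The real FCNs use learned bilinear deconvolution; this is a stand-in.)"""
    return scores.repeat(factor, axis=0).repeat(factor, axis=1)

def fuse(coarse, fine, factor=2):
    """FCN-style skip fusion: upsample the deeper (coarser) score map
    and add it element-wise to the shallower (finer) one."""
    return upsample(coarse, factor) + fine

# Toy score maps: 2 classes, deep map at half the resolution of the shallow one.
deep = np.zeros((2, 2, 2))
deep[..., 1] = 1.0            # the deep layer votes for class 1 everywhere
shallow = np.zeros((4, 4, 2))
shallow[0, 0, 0] = 2.0        # the shallow layer corrects one corner pixel

fused = fuse(deep, shallow)
labels = fused.argmax(axis=-1)  # per-pixel class prediction
print(labels)
```

The fused map keeps the deep layer's class-1 prediction everywhere except the corner pixel, where the shallow layer's stronger class-0 evidence wins, which is exactly how skip connections recover fine boundary detail.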


Data

The PASCAL VOC 2012 dataset is used, augmented with the Berkeley Segmentation Boundaries Dataset (SBD), which contains 11,355 labelled images (8,498 training, 2,857 validation). For training, the 676 unique images from the PASCAL VOC dataset were combined with both the SBD training set and the last 1,657 images (out of 2,857) of the SBD validation set. The first 1,200 images of the SBD validation set were used for validating our models. Refer to the report and the code file for details and links to the preprocessed and augmented dataset.
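As a sanity check on the split described above (all counts are taken from the text; the variable names are ours):

```python
# Counts as stated in the data description above.
sbd_train = 8498          # SBD training images
sbd_val = 2857            # SBD validation images
voc_unique = 676          # unique PASCAL VOC images added for training

# Training pool: unique VOC images + SBD train + last 1,657 SBD val images.
sbd_val_to_train = 1657
train_total = voc_unique + sbd_train + sbd_val_to_train

# Validation: the first 1,200 SBD validation images.
val_total = 1200

print(train_total, val_total)  # 10831 1200
```

Note that the two slices of the SBD validation set are disjoint and together account for all of it (1,657 + 1,200 = 2,857), so no validation image leaks into training.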

Drawbacks and Future Work

The relatively small margin (~2-3%) in mean Intersection over Union (IoU) between our models and the originals by Shelhamer et al. is believed to be due to differences in framework, optimization, and dataset augmentation. Our implementation is Keras-based, while the original networks were built and trained in Caffe. The authors benefited from their augmented dataset, GPU resources, and per-layer learning-rate multipliers (via wrappers around network layers), allowing finer fine-tuning at the cost of training time (3 days). Future work will investigate data augmentation, better utilization of GPU resources, and other optimizers.
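For reference, mean IoU is typically computed as the per-class intersection-over-union of the predicted and ground-truth label maps, averaged over classes. The sketch below shows a common formulation (classes absent from both maps are skipped); it is an illustration, not the evaluation code used by the authors.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean Intersection over Union across classes present in pred or gt."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:               # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 label maps with classes {0, 1}: one mislabelled pixel.
gt   = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
print(mean_iou(pred, gt, num_classes=2))  # (1/2 + 2/3) / 2
```

Because IoU penalizes both false positives and false negatives per class, small boundary errors translate into the few-percent mIoU gaps discussed above.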

  • The following repos were helpful for the preparation of this project:
