
Paper Summary : Rich feature hierarchies for accurate object detection and semantic segmentation

This paper proposes an algorithm that detects and classifies objects at the same time,
which together amounts to semantic object segmentation. The task differs from full
segmentation: we do not need to label every pixel, but are only interested in detecting and
identifying all objects of interest in the image.
To achieve this they propose a two-step process. The first module proposes all the
regions that are likely to contain objects. The second module takes the proposed regions and
classifies each one into one of the given classes. The paper rests on two ideas. First, they
apply high-capacity convolutional neural networks to bottom-up region proposals in order to
localize and segment objects. Second, they address how to train large CNNs when labeled
training data is very limited: their experiments show that supervised pre-training on an
auxiliary task with abundant data (image classification), followed by fine-tuning on the
scarce target data, is highly effective. They expect this fine-tuning approach to carry over to
other problems where data is scarce and models pre-trained on large datasets are available.
They also argue that it is significant that these results were achieved by combining classical
computer vision tools (bottom-up region proposals) with deep learning (convolutional neural
networks). Rather than opposing lines of scientific inquiry, the two are natural and
inevitable partners.
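
As a rough illustration of the two-step process described above, the following Python
sketch wires a proposal stage to a classification stage. Everything in it is a hypothetical
stand-in: `propose_regions` enumerates fixed-size windows rather than real bottom-up
proposals, and `classify_region` thresholds mean intensity instead of running a CNN. It
only shows how the two blocks fit together and is not the authors' implementation.

```python
def propose_regions(image, window=2, stride=1):
    """Stage 1 (placeholder): return candidate boxes as (top, left, bottom, right).

    Assumption: plain sliding windows stand in for a real region proposal method.
    """
    height, width = len(image), len(image[0])
    return [
        (r, c, r + window, c + window)
        for r in range(0, height - window + 1, stride)
        for c in range(0, width - window + 1, stride)
    ]


def crop(image, box):
    """Cut the pixels inside a box out of a 2-D list-of-lists image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def classify_region(patch, threshold=0.75):
    """Stage 2 (placeholder): label a patch from its mean intensity.

    Assumption: a real system would compute CNN features and score each class.
    """
    values = [v for row in patch for v in row]
    score = sum(values) / len(values)
    label = "object" if score >= threshold else "background"
    return label, score


def detect(image):
    """Run both stages and keep only regions not classified as background."""
    detections = []
    for box in propose_regions(image):
        label, score = classify_region(crop(image, box))
        if label != "background":
            detections.append((box, label, score))
    return detections
```

For a 3x3 image whose bottom-right 2x2 block is bright, `detect` keeps only the window
covering that block. The two stages share nothing except the list of boxes, which is the
separation the weakness listed for this paper refers to.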

Strengths :
- Fast detection algorithm
- Simple fine-tuning is enough on scarce datasets
- Gets state of the art performance

Weakness :
- Training not end to end, separate blocks for region proposals and classification

Paper Summary : Fully Convolutional Networks for Semantic Segmentation

This paper proposes semantic segmentation using convolutional neural networks. The
authors present it as the first work to train such networks end-to-end for pixelwise
prediction. They propose a fully convolutional architecture that uses no fully connected
layers, so every module in the network is a convolution block. They also propose an
upsampling block that turns the reduced feature maps back into a segmented image.
One of the major contributions of the paper is the fully convolutional architecture
itself, in which all blocks are convolutions. The other is the upsampling stage, which takes
the encoded, reduced-resolution feature maps and upsamples them using transposed
convolution. A pixel-wise cross-entropy loss on the output image is used to train the
network. They train this network on the Pascal VOC 07 segmentation dataset and show that
their method gives state-of-the-art segmentation performance.
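
The two mechanisms named above, transposed convolution and a pixel-wise cross-entropy
loss, can be sketched in a few lines of plain Python. This is a minimal illustration under
simplifying assumptions: the upsampling is shown in 1-D with a hand-picked kernel, and the
function names and list-of-lists data layout are hypothetical, not taken from the paper.

```python
import math


def transposed_conv1d(x, kernel, stride=2):
    """Upsample a 1-D signal with a transposed convolution.

    Each input value adds a scaled copy of the kernel to the output,
    with copies placed `stride` positions apart; overlaps are summed.
    """
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for k, wk in enumerate(kernel):
            out[i * stride + k] += xi * wk
    return out


def softmax(scores):
    """Turn per-class scores into probabilities (shifted for numerical stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pixelwise_cross_entropy(score_map, label_map):
    """Mean cross-entropy over all pixels.

    score_map[r][c] is a list of per-class scores for pixel (r, c);
    label_map[r][c] is the index of the true class for that pixel.
    """
    total, count = 0.0, 0
    for score_row, label_row in zip(score_map, label_map):
        for scores, label in zip(score_row, label_row):
            total += -math.log(softmax(scores)[label])
            count += 1
    return total / count
```

With stride 2 and kernel `[0.5, 1.0, 0.5]`, the input `[1.0, 3.0]` becomes
`[0.5, 1.0, 2.0, 3.0, 1.5]`, so the interior values are linearly interpolated. In a trained
network the kernel weights would be learned rather than fixed.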

Strengths :
- Performing segmentation using neural network
- State of the art results

Weakness :
- Results computed on only one dataset

Paper Summary : Unsupervised Domain Adaptation by Backpropagation

This paper introduces a novel approach to unsupervised domain adaptation within the
deep learning framework. It is among the first papers to tackle unsupervised domain
adaptation with an adversarial training approach. Unsupervised domain adaptation deals with
transferring knowledge from one data domain that has labels to another data domain without
any label information. It is assumed that both domains share the same set of classes and that
the number of classes is known a priori. The task is then to use the labelled data in one
domain, call it the source domain, to perform inference on the unlabelled second domain,
the target domain.
The proposed method tackles this problem by using labelled data from the source
domain and unlabelled data from the target domain. They introduce an adversarial approach
to adapting between the two domains. The network has three components: a Feature
Extractor (FE), a Label Classifier (LC) and a Domain Classifier (DC). The feature extractor
is a convolutional neural network that extracts features from images. When the input comes
from the source domain, the label classifier uses the label information to classify it into one
of the given classes. The domain classifier predicts which domain the input image comes
from. For target-domain inputs only domain classification is done, since label information is
not available. The label classifier is trained with a cross-entropy loss and the domain
classifier with a binary cross-entropy loss. FE and LC both try to minimize the label
cross-entropy loss. DC tries to minimize the binary cross-entropy loss, while FE tries to
maximize that same loss. Intuitively, the job of the domain classifier is to correctly identify
the domain of each image, and the feature extractor tries to fool it, which leads to robust
features that are domain invariant.
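
The opposing objectives described above can be made concrete with a deliberately tiny
model. In the sketch below each of FE, LC and DC is a single scalar weight (`w`, `v`, `u`),
labels are binary, and the sign flip on the feature extractor's domain gradient is written
out by hand, which is the role the paper's gradient reversal layer plays. The scalar model,
the function names and the weight `lam` are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(p, y):
    """Binary cross-entropy for one prediction p and target y in {0, 1}."""
    eps = 1e-12  # guards against log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def dann_gradients(w, v, u, x, domain, label=None, lam=1.0):
    """Gradients for one sample, to be used with ordinary gradient descent.

    w: feature extractor weight (feature f = w * x)
    v: label classifier weight, u: domain classifier weight
    domain: 0 for source, 1 for target; label: 0/1 for source, None for target
    Returns (grad_w, grad_v, grad_u).
    """
    f = w * x

    # Domain branch: DC descends on the domain loss as usual.
    err_dom = sigmoid(u * f) - domain      # d(domain loss) / d(logit)
    grad_u = err_dom * f
    # FE receives the reversed domain gradient, so descent on grad_w
    # increases the domain loss (FE tries to fool DC).
    grad_w = -lam * (err_dom * u * x)

    # Label branch: only source samples carry labels.
    grad_v = 0.0
    if label is not None:
        err_lab = sigmoid(v * f) - label   # d(label loss) / d(logit)
        grad_v = err_lab * f
        grad_w += err_lab * v * x          # FE and LC both minimize the label loss

    return grad_w, grad_v, grad_u
```

A training step would then apply `w -= lr * grad_w`, `v -= lr * grad_v` and
`u -= lr * grad_u` with a hypothetical learning rate `lr`. For a target sample `grad_v` is
zero, matching the statement that only domain classification is done on target data.
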
They run experiments on digit datasets (MNIST, MNIST-M, SVHN and SYN) and on the
Office dataset for objects, and show that their method outperforms the classical
unsupervised domain adaptation approaches they compare against.

Strengths :
- Novel idea for domain adaptation
- Domain adaptation using adversarial approach
- State of the art results for unsupervised domain adaptation

Weakness :
- No discussion of domain adaptation from multiple domains, where we have
unlabelled data from more than one domain
