
International Journal of Scientific Research Engineering & Technology (IJSRET)

Volume 2 Issue 10 pp 659-664 January 2014 www.ijsret.org ISSN 2278 0882

IJSRET @ 2014
Literature Survey for Face Detection under Illumination Variation
PG Scholar Department of ECE, PET Engineering College, India
Professor Department of ECE, PET Engineering College, India
Abstract: Face detection is important in the field of computer vision. It is widely used in applications such as security control, human-system interaction, interactive games and intelligent user interfaces. Face detection remains a major problem under conditions such as cluttered scenes, color variation and illumination variation, and several techniques have been used to overcome these difficulties. A recently introduced technique is hybrid feature extraction, which combines several local transform features, including the local binary pattern, which is robust to illumination changes, and the local gradient pattern, which is robust to intensity changes. This hybridization improves detection performance and provides accurate face detection even under uncontrolled illumination variation. This literature survey discusses the existing methods available for face detection.
Keywords: Face Detection, Illumination Invariant,
Face detection identifies and locates faces in an image regardless of their scale, position, orientation and illumination. Illumination variation is among the biggest problems in face detection, so several strategies have been developed to cope with it. In the neural network approach, the network examines each detection window in an image and decides whether the window contains a face; each network is trained to output the presence or absence of a face. Training such a network is challenging because of the difficulty of characterizing non-face images [1]. Detection is also difficult in unconstrained, cluttered scenes, where the specific goal is to differentiate the face from everything else. To mitigate this, a descriptive model of the object class is used that is rich enough to model the possible shapes, colors and lighting variations of an object, described in terms of an overcomplete dictionary representation [2]. This representation captures the structure of the class of objects to be detected while ignoring the noise in the image. The learning machine used is the SVM classifier, which is particularly attractive for learning in high-dimensional spaces with very few training examples [2]. In the Bayesian discriminating features (BDF) method, features are analyzed discriminatively; statistical models of face and non-face images are then built and used to train a Bayes classifier, which results in robust generalization performance [5]. A major problem is that
textures in the real world are not uniform, owing to variations in orientation, scale and other aspects of visual appearance. Gray-scale invariance is often necessary because of uneven illumination, so a grayscale and rotation invariant texture operator is used, which allows uniform patterns to be detected in a circular neighborhood for any quantization of the angular space and at any spatial resolution [12]. Facial components can be extracted using Haar-like features, and their topology can be verified using graph matching techniques; this gives robust face detection under out-of-plane rotation, illumination variation and partial occlusion [11]. A rotation invariant multiview face detector is able to detect faces under different pose changes, which gives more accurate and faster multiview face detection [9]. SURF relies on the integral image representation for speed. It is partly inspired by SIFT, and its descriptor is obtained from gradient magnitude and orientation information around the key points. The authors also present an upright version of the SURF descriptor, which is not rotation invariant and is better suited to cases where the camera remains more or less horizontal [14]. A face can be described by dividing it into several regions and extracting LBP features from each region to build histograms. This gives an efficient description of the face under global illumination changes [16]. Histogram-of-gradient approaches to face and object detection are illustrated in [15], [17], [18]. To overcome the problem of detecting faces across multispectral illumination, a regularized transfer boosting algorithm is used, which
minimizes an exponential loss criterion [19]. Gradient features are extracted from all over the face using LGP, which is robust to local intensity variations along edge components and has more discriminative power between face and non-face patterns. The authors also show that several local transform features can be combined for better detection performance [20].
[A] Haar wavelet transform and SVM classification:
Constantine Papageorgiou and Tomaso Poggio [2] proposed a trainable system that detects objects even in cluttered scenes by using a new representation. It describes objects in terms of intensity differences between adjacent regions, transforming the pixel space into an overcomplete dictionary of wavelet features. This representation can be computed efficiently as a Haar wavelet transform and gives a richer description of patterns.
They also use a specific learning engine, the SVM, for classification. It learns the difference between the object class and all other patterns. It minimizes the empirical error and the complexity of the classifier at the same time, and it is capable of learning in high-dimensional spaces from only a few training examples. The major drawback of this paper is that the finest wavelet scales are not used as features for learning: these scales capture high-frequency detail, which does not characterize the class well.
Fig 1: wavelet coefficients
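As a rough illustration of the representation described in [A], the following minimal sketch computes Haar-style responses as differences of average intensity between adjacent regions of a window, and slides the window with a step smaller than its size to obtain an overlapping (overcomplete) set of features. The function names, the `shift` parameter and the toy image are assumptions made for this example; they are not taken from [2].

```python
from typing import List, Tuple

Image = List[List[float]]


def region_mean(img: Image, top: int, left: int, height: int, width: int) -> float:
    """Average intensity of the rectangle starting at (top, left)."""
    total = sum(img[r][c]
                for r in range(top, top + height)
                for c in range(left, left + width))
    return total / (height * width)


def haar_responses(img: Image, top: int, left: int, size: int) -> Tuple[float, float, float]:
    """Vertical, horizontal and diagonal responses of one size x size window.

    Each response is a difference of average intensities between adjacent
    regions of the window (size is assumed to be even).
    """
    half = size // 2
    tl = region_mean(img, top, left, half, half)
    tr = region_mean(img, top, left + half, half, half)
    bl = region_mean(img, top + half, left, half, half)
    br = region_mean(img, top + half, left + half, half, half)
    vertical = (tl + bl) / 2 - (tr + br) / 2    # left half vs right half
    horizontal = (tl + tr) / 2 - (bl + br) / 2  # top half vs bottom half
    diagonal = (tl + br) / 2 - (tr + bl) / 2    # one diagonal vs the other
    return vertical, horizontal, diagonal


def wavelet_features(img: Image, size: int, shift: int) -> List[float]:
    """Slide the window by `shift` pixels; shift < size gives overlapping windows."""
    features: List[float] = []
    for top in range(0, len(img) - size + 1, shift):
        for left in range(0, len(img[0]) - size + 1, shift):
            features.extend(haar_responses(img, top, left, size))
    return features


# Toy 4x4 image with a bright left half and a dark right half.
image = [[10, 10, 0, 0] for _ in range(4)]
print(haar_responses(image, 0, 0, 4))
print(len(wavelet_features(image, 2, 2)), len(wavelet_features(image, 2, 1)))
```

On the toy image only the left-versus-right response is non-zero, and reducing the shift from 2 to 1 raises the number of features from 12 to 27, which is the sense in which the dictionary is overcomplete. The resulting feature vector would then be passed to a classifier such as an SVM.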
[B] Width First Search Tree Structure and Vector Boosting algorithm:
Chang Huang, Haizhou Ai, Yuan Li and Shihong Lao [9] proposed a tree-structured detector. If a route passes from the root node to some leaf node, the input pattern is identified as a face. The width first search (WFS) procedure accesses every node on the selected routes with the help of deterministic vectors, using a non-exclusive pass route selection mechanism. For an unidentified input pattern the detector takes a moderate decision that is neither overly aggressive nor excessively cautious, unlike pyramid or scalar structures. The advantage of this method is greater flexibility in decision making.
In this paper they use the vector boosting algorithm to learn the strong classifiers, which output the deterministic vectors G(x) used in the WFS tree structure. These deterministic vectors are helpful in learning the unified branching nodes. The vectorization can handle different multiclass problems.
$$G(x) = [g_1(x), \ldots, g_n(x)], \quad g_i(x) \in \{0, 1\} \qquad (1)$$
The major drawback of this paper is that handling faces across both rotation-off-plane (ROP) and rotation-in-plane (RIP) angles for rotation invariant multiview face detection becomes a formidable problem, because the detectable face range is greatly extended.
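The route selection described in [B] can be sketched as a breadth-first traversal in which each branching node outputs a binary vector with one entry per child, and a sample may be passed to several children at once. This is only an illustrative sketch: the `Node` class, the `(faceness, yaw)` input and the hand-written `root_decision` rule are hypothetical stand-ins for the boosted node classifiers of [9].

```python
from collections import deque
from typing import Callable, List, Optional, Sequence, Tuple

Sample = Tuple[float, float]  # hypothetical input: (faceness, yaw angle in degrees)


class Node:
    """Tree node; `decide` returns one 0/1 entry per child (unused for leaves)."""

    def __init__(self, name: str, children: Optional[List["Node"]] = None,
                 decide: Optional[Callable[[Sample], Sequence[int]]] = None) -> None:
        self.name = name
        self.children = children or []
        self.decide = decide


def wfs_detect(root: Node, x: Sample) -> List[str]:
    """Return the leaves reached by x; an empty list means 'not a face'."""
    queue = deque([root])
    reached: List[str] = []
    while queue:
        node = queue.popleft()
        if not node.children:          # a leaf was reached along some route
            reached.append(node.name)
            continue
        g = node.decide(x)             # binary vector, one entry per child
        for child, keep in zip(node.children, g):
            if keep:                   # non-exclusive: several children may pass
                queue.append(child)
    return reached


def root_decision(x: Sample) -> List[int]:
    """Hand-written stand-in for a learned branching node."""
    faceness, yaw = x
    if faceness < 0.5:
        return [0, 0, 0]               # all routes rejected
    return [int(yaw < 10), int(abs(yaw) <= 20), int(yaw > -10)]


tree = Node("root", [Node("left"), Node("frontal"), Node("right")], root_decision)
print(wfs_detect(tree, (0.9, 0.0)))    # ambiguous pose: several routes stay open
print(wfs_detect(tree, (0.9, -30.0)))  # clear pose: a single route
print(wfs_detect(tree, (0.1, 0.0)))    # rejected at the root
```

A sample with an ambiguous pose is kept on several routes instead of being forced down one, which is the flexibility in decision making that the paragraph above attributes to the method.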
[C] Component Based Approach:
Lutz Goldmann, Ulrich J. Mönich and Thomas Sikora [11] proposed a component-based approach that combines the statistical and structural pattern recognition domains. In this method, facial components
are detected individually in order to localize the left eye, right eye, nose and mouth. An individual detector is built for each facial component, so that the components are detected separately before the topology verification and face localization procedures are carried out. The component detectors differ only in the training patterns used to train each individual detector.
A graph-based topology is used to handle multiview face detection. A graphical model is constructed for the detected components: each component is a node, and the nodes are connected by edges. The face region is then estimated from the fixed relationships evaluated from the graph.
Classification is done using a classifier cascade trained with the AdaBoost learning strategy. The cascade consists of multiple strong classifiers, each of which is a combination of multiple weak classifiers. At each stage the cascade decides whether the window contains a face or not. The major drawback mentioned in this paper is computational complexity, because of the construction of the graphical model.
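The cascade structure described in [C] can be illustrated with a small sketch in which each stage is a weighted vote of weak classifiers and a window is rejected as soon as one stage score falls below its threshold. The decision stumps, weights and thresholds below are invented for the example; in a real detector they would be learned with AdaBoost.

```python
from typing import Callable, List, Sequence, Tuple

WeakClassifier = Callable[[Sequence[float]], int]           # returns +1 or -1
Stage = Tuple[List[Tuple[float, WeakClassifier]], float]    # (weighted votes, threshold)


def stump(index: int, cut: float) -> WeakClassifier:
    """Weak classifier: thresholds a single feature value."""
    return lambda x: 1 if x[index] > cut else -1


def strong_score(weak: List[Tuple[float, WeakClassifier]], x: Sequence[float]) -> float:
    """Strong classifier score: weighted sum of the weak decisions."""
    return sum(alpha * h(x) for alpha, h in weak)


def cascade_predict(stages: List[Stage], x: Sequence[float]) -> bool:
    """Accept x only if every stage accepts it; reject at the first failing stage."""
    for weak, threshold in stages:
        if strong_score(weak, x) < threshold:
            return False
    return True


# Two hypothetical stages with hand-picked weights (not learned).
stages: List[Stage] = [
    ([(1.0, stump(0, 0.5))], 0.0),
    ([(0.6, stump(1, 0.2)), (0.4, stump(2, 0.7))], 0.0),
]
print(cascade_predict(stages, [0.9, 0.3, 0.1]))  # passes both stages
print(cascade_predict(stages, [0.1, 0.9, 0.9]))  # rejected by the first stage
```

Because most windows are rejected by the early, cheap stages, later stages only run on the few windows that still look like the target component.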
[D] Deformable Part Models:
Pedro F. Felzenszwalb, Ross B. Girshick, David McAllester and Deva Ramanan [17] proposed mixtures of deformable part models. The system can be trained using partially labeled data, for which latent variable information is used. In this paper star models are used: a root filter, which represents the entire object, is combined with part filters, which represent only parts of the object. The root filter defines the detection window, and the part filters are placed λ levels down in the feature pyramid, so that the features at that level have twice the resolution of the features at the root level. After filtering, the part filter responses are transformed and a score is computed at each position. The overall score at each root location is the sum of the root filter response, the transformed part filter responses and a bias term.
$$\mathrm{score}(x_0, y_0, l_0) = R_{0,l_0}(x_0, y_0) + \sum_{i=1}^{n} D_{i,l_0-\lambda}\big(2(x_0, y_0) + v_i\big) + b \qquad (2)$$
The system detects objects by computing the score at each root position and then thresholding it. High-scoring root locations define the object detections.
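$$\mathrm{score}(p_0) = \max_{p_1, \ldots, p_n} \mathrm{score}(p_0, \ldots, p_n) \qquad (3)$$
where $p_0 = (x_0, y_0, l_0)$ is the root location and $p_1, \ldots, p_n$ are the part placements.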
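To make the scoring rule of Eq. (2) concrete, the sketch below evaluates it by brute force on tiny response maps. It assumes the usual form of the transformed response, a maximum over part displacements of the part filter response minus a quadratic deformation cost; the `radius` limit, the data layout and the toy numbers are assumptions for illustration, and a practical implementation would use distance transforms instead of exhaustive search.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]
Part = Tuple[Grid, Tuple[int, int], Sequence[float]]  # (response map, anchor v_i, deformation d_i)


def transformed_response(part_resp: Grid, x: int, y: int,
                         deform: Sequence[float], radius: int = 2) -> float:
    """D(x, y) = max over (dx, dy) of R(x+dx, y+dy) - d . (dx, dy, dx^2, dy^2)."""
    height, width = len(part_resp), len(part_resp[0])
    best = float("-inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            px, py = x + dx, y + dy
            if 0 <= px < width and 0 <= py < height:
                cost = (deform[0] * dx + deform[1] * dy
                        + deform[2] * dx * dx + deform[3] * dy * dy)
                best = max(best, part_resp[py][px] - cost)
    return best


def dpm_score(root_resp: Grid, parts: List[Part], x0: int, y0: int, bias: float) -> float:
    """Root response + transformed part responses at 2*(x0, y0) + v_i + bias."""
    total = root_resp[y0][x0] + bias
    for part_resp, (vx, vy), deform in parts:
        total += transformed_response(part_resp, 2 * x0 + vx, 2 * y0 + vy, deform)
    return total


def detect(root_resp: Grid, parts: List[Part], bias: float,
           threshold: float) -> List[Tuple[int, int]]:
    """Root locations (x0, y0) whose score reaches the threshold."""
    return [(x0, y0)
            for y0 in range(len(root_resp))
            for x0 in range(len(root_resp[0]))
            if dpm_score(root_resp, parts, x0, y0, bias) >= threshold]


# Toy data: 2x2 root responses, one part with 4x4 responses at twice the resolution.
root = [[1.0, 0.0], [0.0, 0.0]]
part = [[0.0] * 4 for _ in range(4)]
part[1][1] = 5.0
parts: List[Part] = [(part, (0, 0), (0.0, 0.0, 1.0, 1.0))]
print(dpm_score(root, parts, 0, 0, bias=0.5))
print(detect(root, parts, bias=0.5, threshold=4.0))
```

In the toy data the part prefers a position one pixel away from its anchor, so it pays a deformation cost of 2 to collect a response of 5; only the root location (0, 0) reaches the threshold.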
The classifier used is the latent SVM. Its training leads to a non-convex optimization problem, which becomes convex once the latent information is specified for the positive examples. Rather than training on a very large number of negative examples, it is common to construct training data from the positive examples and the hard negative examples. A bootstrapping method is applied: a model is trained with an initial set of negative examples, and all the negative examples that the model classifies incorrectly are collected to form a set of hard negative examples. The process is then repeated by training a new model with the hard negative examples. The major drawback of this paper is that confusion among the classes may occur and that it cannot represent higher-level models.
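The bootstrapping loop described above can be sketched independently of the classifier. In the example below the learner is deliberately trivial, a one-dimensional threshold placed midway between the lowest positive and the highest negative training score, and stands in for the latent SVM of [17]; the function names and the toy scores are assumptions for illustration only.

```python
from typing import List, Tuple


def fit_threshold(positives: List[float], negatives: List[float]) -> float:
    """Toy learner: midpoint between the lowest positive and the highest negative."""
    return (min(positives) + max(negatives)) / 2.0


def is_face(threshold: float, x: float) -> bool:
    """Toy classifier: a score at or above the threshold is labelled 'face'."""
    return x >= threshold


def bootstrap(positives: List[float], initial_negatives: List[float],
              negative_pool: List[float], max_rounds: int = 10) -> Tuple[float, List[float]]:
    """Retrain repeatedly, each time adding the negatives the model gets wrong."""
    negatives = list(initial_negatives)
    model = fit_threshold(positives, negatives)
    for _ in range(max_rounds):
        # Hard negatives: pool examples wrongly accepted by the current model.
        hard = [x for x in negative_pool if is_face(model, x) and x not in negatives]
        if not hard:
            break
        negatives.extend(hard)
        model = fit_threshold(positives, negatives)
    return model, negatives


positives = [0.8, 0.9, 1.0]
pool = [0.1, 0.2, 0.3, 0.6, 0.7, 0.55]
model, used_negatives = bootstrap(positives, [0.1, 0.2], pool)
print(model, sorted(used_negatives))
```

Starting from two easy negatives, the first model accepts three pool negatives; after they are added as hard negatives the retrained threshold rejects the whole pool, while the easy negative 0.3 is never needed for training.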
[E] Local Binary Pattern:
T. Ahonen, A. Hadid and M. Pietikainen [16] proposed using the local binary pattern for an efficient description of the face. The face is divided into a number of regions, LBP features are extracted from each region, and these are concatenated into an enhanced feature vector that describes the face. Uniform LBP codes contain at most two bitwise transitions between 0 and 1. Global descriptors are built by combining the local descriptors. Finally, to encode both appearance and spatial relationships, an extended version of histogram modeling is used, the spatially enhanced histogram. A histogram is computed for each region, containing information about the patterns at the pixel level. This regional-level information is used to build a global-level description.
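The region-wise histogram construction can be sketched as follows, assuming the LBP codes have already been computed for every pixel. The grid size, the number of bins and the toy code map are assumptions chosen to keep the example small; a real face descriptor would typically use 256 bins (or a reduced set of uniform patterns) per region.

```python
from typing import List


def regional_histograms(codes: List[List[int]], grid_rows: int, grid_cols: int,
                        bins: int = 256) -> List[int]:
    """Concatenate one histogram of codes per region of a grid_rows x grid_cols grid.

    Rows or columns left over when the image size is not divisible by the
    grid are ignored in this sketch.
    """
    region_h = len(codes) // grid_rows
    region_w = len(codes[0]) // grid_cols
    feature: List[int] = []
    for gr in range(grid_rows):
        for gc in range(grid_cols):
            hist = [0] * bins
            for r in range(gr * region_h, (gr + 1) * region_h):
                for c in range(gc * region_w, (gc + 1) * region_w):
                    hist[codes[r][c]] += 1
            feature.extend(hist)   # region order keeps the spatial layout
    return feature


# Toy code map with four regions, each filled with a single code value.
code_map = [[0, 0, 1, 1],
            [0, 0, 1, 1],
            [2, 2, 3, 3],
            [2, 2, 3, 3]]
descriptor = regional_histograms(code_map, 2, 2, bins=4)
print(descriptor)
```

Each block of `bins` entries describes one region, so the concatenated vector records both which patterns occur and where on the face they occur.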
In [16], [20] the local binary pattern encodes each image pixel as an 8-bit binary pattern by comparing the center pixel value of a 3×3 block with its neighboring values within the block, that is, by taking the intensity difference between each neighborhood pixel and the center pixel. First the examined window is divided into cells. Each pixel in a cell is compared with its eight neighbors: if the neighbor value is greater than or equal to the center pixel value the bit is written as 1, otherwise 0. A histogram is then computed over the cell and normalized to produce the feature vector. The main advantage of LBP is that it is invariant to monotonic
changes in illumination. In a region of uniform color the variations obtained from the intensity differences are small, and these variations can be suppressed by setting a threshold. The major drawback is that, even though it is illumination invariant, it discards the property
11100001 = 225
Fig 2: Operation Of Local Binary Pattern
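$$\mathrm{LBP}(x_c, y_c) = \sum_{n=0}^{p-1} s(i_n - i_c)\, 2^n \qquad (4)$$
where $(x_c, y_c)$ is the given center pixel position, $i_n$ denotes the neighborhood pixel value, $i_c$ denotes the center pixel value, $p$ is the number of neighbors ($p = 8$ for the 3×3 block) and the function $s(x)$ is defined as
$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$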
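The LBP operator can be checked on the 3×3 block whose values appear with Fig 2 (137 135 115 / 99 82 79 / 70 54 45, code 225). The sketch below reads the neighbors clockwise starting at the top-left corner and treats the first neighbor as the most significant bit; this ordering is an assumption chosen because it reproduces the value 225, and other orderings give a different but equally valid labelling of the same pattern.

```python
from typing import List, Tuple

# Neighbor offsets (row, col), clockwise from the top-left corner (assumed order).
NEIGHBORS: List[Tuple[int, int]] = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(block: List[List[int]]) -> int:
    """LBP code of the center pixel of a 3x3 block."""
    center = block[1][1]
    code = 0
    for dr, dc in NEIGHBORS:
        bit = 1 if block[1 + dr][1 + dc] >= center else 0   # s(i_n - i_c)
        code = (code << 1) | bit                            # first neighbor = MSB
    return code


block = [[137, 135, 115],
         [99, 82, 79],
         [70, 54, 45]]
print(format(lbp_code(block), "08b"), lbp_code(block))

# A monotonic brightness change (here v -> 2v + 10) leaves the code unchanged.
brighter = [[2 * v + 10 for v in row] for row in block]
print(lbp_code(brighter) == lbp_code(block))
```

The second check illustrates the invariance to monotonic illumination changes mentioned above: only the sign of each difference matters, not its size.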
[F] Local gradient pattern:
Bongjin Jun, Inho Choi and Daijin Kim [20] proposed the local gradient pattern, which uses the gradient values of the eight neighbors of a given pixel, computed as the absolute value of the intensity difference between the given pixel and each neighboring pixel. The average of these gradient values is used as the threshold for LGP encoding. A neighbor is assigned the value 1 if its gradient value is greater than or equal to the threshold, and 0 otherwise. The LGP code is then produced by concatenating these 0s and 1s. LBP is sensitive to local intensity variations, which generates many different patterns and makes the training of face detectors difficult. To overcome this problem, the authors propose the LGP representation, which generates constant patterns irrespective of local intensity variations along the edges.
Fig 3: Operation Of Local Gradient Pattern
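At a given center pixel location $(x_c, y_c)$, LGP takes the $(2r+1) \times (2r+1)$ block of neighboring pixels surrounding the center pixel. The gradient value between the center pixel and the $n$-th neighboring pixel is defined as
$$g_n = |i_n - i_c|$$
and the average of the $p$ gradient values is given as
$$\bar{g} = \frac{1}{p} \sum_{n=0}^{p-1} g_n$$
The LGP code at $(x_c, y_c)$ can then be expressed as
$$\mathrm{LGP}(x_c, y_c) = \sum_{n=0}^{p-1} s(g_n - \bar{g})\, 2^n \qquad (6)$$
where the function $s(x)$ is defined as
$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$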
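The LGP operator can be traced on the 3×3 block whose values appear with Fig 3 (70 20 20 / 50 60 120 / 20 20 120, mean gradient 37.5). As a sketch under the same assumed convention as a clockwise reading from the top-left corner with the first neighbor as the most significant bit, the code below computes the gradients, their mean and the resulting pattern.

```python
from typing import List, Tuple

# Neighbor offsets (row, col), clockwise from the top-left corner (assumed order).
NEIGHBORS: List[Tuple[int, int]] = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lgp_code(block: List[List[int]]) -> Tuple[int, List[int], float]:
    """Return (LGP code, neighbor gradients, mean gradient) for a 3x3 block."""
    center = block[1][1]
    grads = [abs(block[1 + dr][1 + dc] - center) for dr, dc in NEIGHBORS]  # g_n
    mean = sum(grads) / len(grads)                                         # threshold
    code = 0
    for g in grads:
        code = (code << 1) | (1 if g >= mean else 0)   # s(g_n - mean), first = MSB
    return code, grads, mean


block = [[70, 20, 20],
         [50, 60, 120],
         [20, 20, 120]]
code, grads, mean = lgp_code(block)
print(grads, mean)
print(format(code, "08b"), code)

# Inverting the contrast (v -> 255 - v) keeps every gradient, hence the code.
inverted = [[255 - v for v in row] for row in block]
print(lgp_code(inverted)[0] == code)
```

Because the code depends only on the absolute differences and their mean, a contrast inversion or a uniform brightness offset leaves the pattern unchanged in this example.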
In order to improve the overall detection performance, several local transform features are hybridized. By hybridizing two or more local transform features, face detection can be made robust to global illumination changes through LBP as well as to local intensity changes through LGP.
The local transform features and their hybridization are applied to face detection in order to evaluate their performance. The face detection rates of the LBP, LGP and hybrid features are analyzed on two databases, the MIT+CMU database and the FDDB database. The results indicate that the LGP feature has a better detection rate than the LBP feature, while the hybrid feature has the best detection rate of the detectors compared.
Table 1: Performance analysis of several face detectors using MIT+CMU database
Detector    Detection rate (%)    False positive per image
LBP         90                    8
LGP         93                    4
Hybrid      96                    2
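Fig 2 example values: 3×3 block (137 135 115 / 99 82 79 / 70 54 45) with center 82; thresholded neighbors (1 1 1 / 1 · 0 / 0 0 0).
Fig 3 example values: 3×3 block (70 20 20 / 50 60 120 / 20 20 120) with center 60; gradients (10 40 40 / 10 · 60 / 40 40 60) with mean 37.5; thresholded gradients (0 1 1 / 0 · 1 / 1 1 1).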
Table 2: Performance analysis of several face detectors using FDDB database
Detector    Detection rate (%)    False positive per image
LBP         72                    5
LGP         74                    2
Hybrid      78                    0.2
This paper presents a literature survey of face detection under illumination variation. Face detection is currently a very active research area. Many advanced algorithms have been developed to detect faces in cluttered backgrounds, in low-quality images and under lighting variations, and several face detectors are used for detecting faces under uncontrolled variation. Comparing the performance analysis shown in the tables, hybridizing the local transform features gives better performance than the other detectors compared.
Apart from my own efforts, the success of any work depends largely on the encouragement and guidance of many others. I take this opportunity to express my gratitude to the people who have been instrumental in the successful completion of this work, and I would like to extend my sincere thanks to all of them. I owe a sincere prayer to the LORD ALMIGHTY for his kind blessings and for giving me full support to do this work, without which it would not have been possible. I wish to express my gratitude to all who helped me directly or indirectly to complete this paper.
[1] H. Rowley, S. Baluja, and T. Kanade, "Neural Network-Based Face Detection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 23-38, Jan. 1998.
[2] C. Papageorgiou and T. Poggio, "A Trainable System for Object Detection," Int'l J. Computer Vision, vol. 38, no. 1, pp. 15-33, 2000.
[3] T. Randen and J. Husoy, "Filtering for Texture Classification: A Comparative Study," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 21, no. 4, pp. 291-310, Apr. 1999.
[4] P. Viola and M. Jones, "Robust Real-Time Face Detection," Int'l J. Computer Vision, vol. 57, no. 2, pp. 137-154, 2004.
[5] Chengjun Liu, "A Bayesian Discriminating Features Method for Face Detection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 6, June
[6] D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," Int'l J. Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[7] X. Huang, S.Z. Li, and Y. Wang, "Shape Localization Based on Statistical Method Using Extended Local Binary Pattern," Proc. Int'l Conf. Image and Graphics, pp. 184-187, 2004.
[8] C. Shan, S. Gong, and P. McOwan, "Facial Expression Recognition Based on Local Binary Patterns: A Comprehensive Study," Image and Vision Computing, vol. 27, pp. 803-816, 2009.
[9] Chang Huang, Haizhou Ai, Yuan Li and Shihong Lao, "High-Performance Rotation Invariant Multiview Face Detection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 29, no. 4, 2007.
[10] D. Grimes and R. Rao, "A Bilinear Model for Sparse Coding," Proc. Neural Information Processing Systems, vol. 15, pp. 1287-1294, 2003.
[11] Lutz Goldmann, Ulrich J. Mönich and Thomas Sikora, "Components and Their Topology for Robust Face Detection in the Presence of Partial Occlusions," vol. 2, no. 3, September 2007.
[12] T. Ojala, M. Pietikainen, and T. Maenpaa, "Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971-987, July 2002.
[13] N. Sun, W. Zheng, C. Sun, C. Zou, and L. Zhao, "Gender Classification Based on Boosting Local Binary Pattern," Proc. Int'l Symp. Neural Networks, pp. 194-201, 2006.
[14] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "SURF: Speeded Up Robust Features," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346-359,
[15] O. Deniz, G. Bueno, and J. Salido, "Face Recognition Using Histograms of Oriented Gradients," Pattern Recognition Letters, vol. 32, no. 12, pp. 1598-1603, 2011.
[16] T. Ahonen, A. Hadid, and M. Pietikainen, "Face Description with Local Binary Patterns: Application to Face Recognition," IEEE Trans. Pattern Analysis and
Machine Intelligence, vol. 28, no. 12, pp. 2037-2041, Dec. 2006.
[17] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan, "Object Detection with Discriminatively Trained Part Based Models," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1627-1645, Sept. 2010.
[18] P. Felzenszwalb, R. Girshick, and D. McAllester, "Cascade Object Detection with Deformable Part Models," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 2241-2248, 2010.
[19] Zhiwei Zhang, Dong Yi, Zhen Lei and Stan Z. Li, "Regularized Transfer Boosting for Face Detection Across Spectrum," vol. 19, no. 3, March 2012.
[20] Bongjin Jun, Inho Choi and Daijin Kim, "Local Transform Features and Hybridization for Accurate Face Detection and Human Detection," vol. 35, no. 6, June 2013.