978-1-4577-2078-9/12/$26.00 ©2011 IEEE
2012 International Conference on Communication, Information & Computing Technology (ICCICT), Oct. 19-20, Mumbai, India
Fig. 5. 2-D Wavelet Decomposition of a face image. (a) Original image (b) 1-level wavelet decomposition (c) 2-level wavelet decomposition
give a global description while the high frequency components concentrate on the finer details of the image. As shown in Fig. 5, at the end of each level of wavelet decomposition, four new images called LL, LH, HL and HH are created from the original image. The LL image is a reduced version of the original and retains most details. The LH image contains horizontal edge features, while the HL image contains vertical edge features. The HH image contains only high frequency information and is typically noisy. In wavelet decomposition, only the LL image is used to produce the next level of decomposition, as in Fig. 5.

In this paper, face recognition using DWT is based on the facial features extracted from the Reverse Biorthogonal Wavelet Transform.

C. Binary Particle Swarm Optimization

Particle Swarm Optimization, first introduced by Kennedy and Eberhart, is an algorithm based on the social behavior of bees and birds [7], [8]. This method searches the problem space iteratively and converges to an optimum solution. The position of each particle is updated by a velocity vector using prior knowledge about the best position of the particle and that of the swarm as a whole. In each iteration, each particle is evaluated based on the value returned by a fitness function. In Binary Particle Swarm Optimization (BPSO) [9], the particle position is coded as a binary string. The parameters used are Swarm Size N=30, cognitive factor c1=2, social factor c2=2, inertia weight ω=0.6 and Number of Iterations=100.

1) BPSO Algorithm: The algorithm for BPSO operation is as follows:
Step 1: Initialize parameters c1, c2, r1 and r2
Step 2: Generate N particles with random positions and velocities
Step 3: Evaluate each particle's fitness using the fitness function given in Eq. (4)
Step 4: If the fitness > the fitness of the particle's LBest, update LBest
Step 5: If the fitness > the fitness of the present GBest, update GBest
Step 6: If the stopping criteria are satisfied, terminate the process and return the feature vector. Else, update the velocity of each particle using Eq. (2) and its position using Eq. (3)

The velocities of the particles are updated as

Vi+1 = ω·Vi + r1·c1·(LBest − Xi) + r2·c2·(GBest − Xi)   (2)

where Vi+1 is the updated velocity vector, Vi is the present velocity vector, LBest is the local best position of the particle, GBest is the best position attained by the entire swarm, and ω is the inertia factor.

The positions of the particles are updated as

if rand < 1/(1 + e^(−Vi+1)), then Xi+1 = 1, else Xi+1 = 0   (3)

where Xi+1 is the updated position and rand is a string of random numbers between 0 and 1.

2) Fitness Function: The fitness function mainly aims at increasing the class separation, which optimizes the recognition process. The class means and the grand mean are calculated as follows:

Mi = (1/Ni) Σ(j=1..Ni) Wj(i),   i = 1, 2, ..., L

where Wj(i), j = 1, 2, ..., Ni represents the sample images from class wi, and the grand mean M0 is given by

M0 = (1/N) Σ(i=1..L) Ni·Mi

where N is the total number of images over all the classes. The fitness function F is then computed as

F = sqrt( Σ(i=1..L) (Mi − M0)^T (Mi − M0) )   (4)

where T denotes the transpose of the matrix. In this paper, the BPSO is used as a feature selector with the parameters defined above.

3) Euclidean Classifier: The Euclidean Distance Classifier is used to measure the similarity between the test vector and the reference vectors in the gallery [2]. The reference vectors are the feature vectors obtained from the training images by applying DWT and BPSO feature selection. The Euclidean distance between two vectors p and q is given as

D = sqrt( Σ(i=1..N) (pi − qi)^2 )   (5)

where pi and qi are the coordinates of p and q respectively in dimension i.
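The single-level split of an image into LL, LH, HL and HH subbands can be sketched in Python. For brevity this sketch uses the Haar wavelet rather than the paper's reverse biorthogonal family; `haar_dwt2` and its 2×2-block formulas are this sketch's own simplification, not the paper's MATLAB implementation.

```python
def haar_dwt2(img):
    """One level of 2-D Haar wavelet decomposition.

    img is a 2-D list of pixel values with even dimensions; returns
    the four half-size subband images (LL, LH, HL, HH).
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 2)  # approximation (reduced image)
            lh_row.append((a + b - d - e) / 2)  # horizontal edge features
            hl_row.append((a - b + d - e) / 2)  # vertical edge features
            hh_row.append((a - b - d + e) / 2)  # diagonal, high-frequency detail
        ll.append(ll_row); lh.append(lh_row); hl.append(hl_row); hh.append(hh_row)
    return ll, lh, hl, hh

# A flat image has no edges: every detail subband is zero.
LL, LH, HL, HH = haar_dwt2([[1, 1], [1, 1]])
```

The next decomposition level applies the same split to LL alone, as in Fig. 5.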
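The velocity and position updates of Eqs. (2) and (3) can be sketched for a single particle as follows. The parameter values come from the text; drawing fresh r1, r2 and rand per bit is an assumption of this sketch, not a detail the paper specifies.

```python
import math
import random

C1, C2 = 2.0, 2.0   # cognitive and social factors (from the text)
OMEGA = 0.6         # inertia weight

def bpso_update(x, v, lbest, gbest):
    """One BPSO iteration for one particle.

    x, lbest, gbest are bit lists; v is the real-valued velocity
    vector. Returns the updated (position, velocity) per Eqs. (2)-(3).
    """
    new_x, new_v = [], []
    for xi, vi, lb, gb in zip(x, v, lbest, gbest):
        r1, r2 = random.random(), random.random()
        vi = OMEGA * vi + r1 * C1 * (lb - xi) + r2 * C2 * (gb - xi)  # Eq. (2)
        sig = 1.0 / (1.0 + math.exp(-vi))       # sigmoid of the new velocity
        xi = 1 if random.random() < sig else 0  # Eq. (3): bit flips stochastically
        new_x.append(xi)
        new_v.append(vi)
    return new_x, new_v

x2, v2 = bpso_update([0, 1, 1, 0], [0.0] * 4, [1, 1, 0, 0], [1, 0, 1, 0])
```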
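Eq. (4)'s fitness reduces to the root of the summed squared deviations of the class means from the grand mean. A sketch with feature vectors as plain lists; the helper names are this sketch's own, not the paper's.

```python
import math

def class_means(classes):
    """Per-class means Mi and grand mean M0.

    classes is a list of classes, each a list of equal-length feature
    vectors; the grand mean weights each Mi by its class size Ni.
    """
    means = [[sum(col) / len(cls) for col in zip(*cls)] for cls in classes]
    n = sum(len(cls) for cls in classes)
    grand = [sum(len(cls) * m for cls, m in zip(classes, col)) / n
             for col in zip(*means)]
    return means, grand

def fitness(classes):
    """Eq. (4): larger F means better-separated class means."""
    means, grand = class_means(classes)
    return math.sqrt(sum((mi - g) ** 2
                         for m in means for mi, g in zip(m, grand)))
```

BPSO maximizes this F over binary feature-selection masks: each candidate mask keeps a subset of DWT coefficients, and the mask whose surviving features spread the class means furthest apart wins.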
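The matching stage, Eq. (5) against a gallery of BPSO-selected feature vectors, can be sketched as below. The gallery layout, subject labels and 4-coefficient vectors are invented for illustration, and the DWT step that would produce the coefficients is omitted.

```python
import math

def select(features, mask):
    """Keep only the coefficients the binary BPSO mask marks with 1."""
    return [f for f, m in zip(features, mask) if m == 1]

def best_match(test_features, gallery, mask):
    """Nearest-neighbour matching per Eq. (5).

    gallery maps a subject label to its already-masked reference
    vector; returns the label with the least Euclidean distance
    to the masked test vector.
    """
    probe = select(test_features, mask)
    def dist(ref):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(probe, ref)))
    return min(gallery, key=lambda label: dist(gallery[label]))

# Hypothetical 4-coefficient vectors with a 2-bit feature mask.
mask = [1, 0, 1, 0]
gallery = {"subject-1": select([0.0, 9.0, 0.0, 9.0], mask),
           "subject-2": select([5.0, 9.0, 5.0, 9.0], mask)}
```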
3) Proposed FR model for FERET database: The face recognition model is implemented for the FERET database with the preprocessing block as shown in Fig. 9(a). A fixed number of images from each class are chosen to be the training set. The FERET images are subjected to the preprocessing techniques as in Fig. 8. The two-dimensional wavelet transform is applied to each of these images and features are selected using BPSO. This collection of images after feature extraction and selection forms the face gallery. A test image is randomly picked from the remaining preprocessed FERET images and the rbio DWT is applied to it. The transformed test image is multiplied with the feature vector returned by the BPSO in the training stage. Using the Euclidean classifier of Eq. (5), the test image is compared with each of the images in the face gallery. The one with the least distance is returned as the best match.

Initially, we choose a value randomly for c (of Eq. (1)) between 0.25 and 2, and a DWT level of decomposition between 5 and 8. One wavelet family among the four popular families, namely haar, symlet, biorthogonal and reverse biorthogonal, is chosen randomly. Then, by the trial and error method, we found that the following parameter values resulted in the highest recognition rate: c=0.25, level of decomposition=7, wavelet family Reverse Biorthogonal (rbio). The results are shown in Fig. 10(b), (c), (d). Fixing these values, an experiment was conducted for different training:testing ratios. As seen in Fig. 10(a), the recognition rate remains almost constant at the maximum value beyond the ratio 8:12. Hence, we find the ratio 8:12 optimum for minimum computation time. The average recognition rate of 86.14% thus obtained is better than that of existing systems [17].

Fig. 10. Results for Experiment 1: FERET

Fig. 11. RR v/s Training:Testing Ratios. (a) Extended Yale B (b) ORL (c) UMIST

TABLE I
PARAMETERS AND RESULTS OF EXPERIMENTS

                                  ORL    UMIST   Ext. Yale B   FERET
DWT level (rbio1.3)                 5        5             7       7
Number of features selected       194      355            62     260
Average testing time/image (ms)  9.16     7.19            46      40
Training to Testing Ratio         4:6     7:12          3:16    8:12
Peak Recognition Rate (%)       98.33    99.16          97.9    94.2
Average Recognition Rate (%)    97.68    98.51         96.32   86.14

B. Experiment 2: ORL and UMIST Databases

The ORL database contains different images of 40 distinct subjects [18]. Images are taken at different times, varying the facial expressions and facial details. All the images are taken against a dark homogeneous background with the subjects in an upright, frontal position. The size of each image is 112 × 92 pixels, with 256 gray levels per pixel.

The UMIST Face Database contains grayscale images of size 112 × 92 pixels [19]. It has images of 20 unique subjects.
The number of images per person varies from 19 to 36. It contains images with varying angles from the left profile to the right profile. We have chosen 19 images per person.

In this experiment, we apply the general face recognition model in Fig. 1 with the preprocessing techniques in the order log transform and unsharp masking for the ORL and UMIST databases, as shown in Fig. 9(b). As ORL and UMIST database images do not contain much background information, k-means fails to isolate it and hence is not used. The preprocessed images are given to a feature extractor, which is the DWT (reverse biorthogonal wavelet). As in Experiment 1 described above, we determine the values of c and the decomposition level as 0.25 and 5 respectively by the trial and error method. The reverse biorthogonal wavelet was found to give the best results. Using these values, the recognition rates for various training:testing ratios for both the ORL and UMIST databases are shown in Fig. 11(b), (c). The training:testing ratios were fixed at 4:6 for ORL and 7:12 for UMIST, as there is no significant improvement in the recognition rate beyond these values. It is seen that the wavelet family and the value of c are consistent with those of FERET. Due to the smaller size of ORL and UMIST images, decomposition levels higher than 5 did not provide higher accuracies. The results are tabulated in Table I.

C. Experiment 3: Extended Yale B Database

Extended Yale B contains 16128 images of 28 human subjects under 9 poses and 64 illumination conditions [20]. We have used 19 images from Subset 5 (Pose 1) for each of the 28 subjects. The size of the images is 640 × 480 pixels.

In this experiment, we apply the general face recognition model in Fig. 1 with the preprocessing techniques for the Extended Yale B database as shown in Fig. 9(c). To normalize the illumination variance, histogram equalization is used before performing the log transform and unsharp masking. As in the previous experiments, we fix c=0.25 and level of decomposition=7 by trial and error. Again, the reverse biorthogonal wavelet was found to give the best results. Fixing these parameters, results for different training:testing ratios were obtained, as shown in Fig. 11(a). Extended Yale B (Pose 1) images differ from each other only in terms of illumination and have no pose variations. Illumination variations are to an extent neutralised by histogram equalization. Thus, the recognition rate does not significantly increase beyond the ratio 3:16. The results are tabulated in Table I.

IV. CONCLUSION

A novel approach for a flexible FR system is proposed, which uses the combination of k-means clustering for preprocessing, DWT-based feature extraction and BPSO-based feature selection, implemented using MATLAB [21]. k-means clustering has played a key role in efficient background removal, which is the main contributor to the high recognition rates obtained on the ColorFERET database. A successful attempt has been made to handle all image variations (facial expressions, pose and illumination) equally. The proposed method exhibits extremely good performance under frontal poses with variations in facial expressions and facial details (the ORL and UMIST databases). The experimental results indicate that the proposed method performs well under severe illumination variations, with top recognition rates reaching 97.9% for Subset 5 of Extended Yale B considering only Pose 1. It is also successful in tackling the most challenging task of pose variance in FR, with an average recognition rate of 86.14% for ColorFERET considering all 13 poses. Thus, the proposed method has proven to be a promising technique under arbitrary variations in illumination, pose and background.

This paper uses a simple Euclidean classifier. By using other classifiers such as SVM, Random Forest, etc., the performance of the FR system is expected to improve substantially.

REFERENCES

[1] W. Zhao, R. Chellappa, P. J. Phillips, A. Rosenfeld, Face Recognition: A Literature Survey, ACM Computing Surveys, Vol. 35, No. 4, pp. 399-458, 2003.
[2] Rabab M. Ramadan, Rehab F. Abdel-Kader, Face Recognition Using Particle Swarm Optimization-Based Selected Features, International Journal of Signal Processing, Image Processing and Pattern Recognition, Vol. 2, No. 2, 2009.
[3] Rong Xiao, Wujun Li, Yuandong Tian, Xiaoou Tang, Joint Boosting Feature Selection for Robust Face Recognition, Proceedings of Computer Vision and Pattern Recognition, IEEE Computer Society, pp. 1415-1422, 2006.
[4] R. Gonzalez, R. Woods, Digital Image Processing, Addison Wesley Publishing Company, 3rd Edition, 2009.
[5] Rajesh Garg, Bhawna Mittal, Sheetal Garg, Histogram Equalization Techniques for Image Enhancement, Proceedings of IJECT, Vol. 2, Issue 3, 2011.
[6] Muhammad Almas Anjum, M. Y. Javed, A. Basit, Face Recognition Using Double Dimension Reduction, World Academy of Science, Engineering and Technology, 2005.
[7] J. Kennedy, R. Eberhart, Particle Swarm Optimization, Proceedings of the IEEE International Conference on Neural Networks, pp. 1942-1948, 1995.
[8] J. Kennedy, R. Eberhart, A New Optimizer Using Particle Swarm Theory, Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39-43, 1995.
[9] J. Kennedy, R. C. Eberhart, A Discrete Binary Version of the Particle Swarm Algorithm, Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, Vol. 5, pp. 4104-4108, 1997.
[10] FERET Database: http://face.nist.gov/colorferet.
[11] Susanta Mukhopadhyay, Bhabatosh Chanda, Multiscale Morphological Segmentation of Gray-Scale Images, IEEE Transactions on Image Processing, Vol. 12, No. 5, 2003.
[12] Nikhil R. Pal, Sankar K. Pal, A Review on Image Segmentation Techniques, Pattern Recognition, Vol. 26, No. 9, pp. 1277-1294, 1993.
[13] Wikipedia contributors, K-means Clustering.
[14] Wikipedia contributors, Cluster Analysis.
[15] Suman Tatiraju, Avi Mehta, Image Segmentation Using k-means Clustering, EM and Normalized Cuts, 2008.
[16] Madhuri A. Dalal, Nareshkumar D. Harale, Umesh L. Kulkarni, An Iterative Improved k-means Clustering, Proceedings of the International Conference on Advances in Computer Engineering, 2011.
[17] G. M. Deepa, R. Keerthi, N. Meghana, K. Manikantan, Face Recognition Using Spectrum-Based Feature Extraction, Applied Soft Computing Journal, Vol. 12, Issue 9, pp. 2913-2923, 2012.
[18] ORL Database: http://www.cl.cam.ac.uk/Research/DTG/attarchive/facedatabase.html.
[19] UMIST Database: http://www.sheffield.ac.uk/eee/research/iel/research/face.
[20] Extended Yale B Database: http://cvc.yale.edu/projects/yalefacesB/subsets.html.
[21] MATLAB: www.mathworks.com.