0-8186-8606-4/98 $10.00 © 1998 IEEE
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY BOMBAY. Downloaded on December 2, 2008 at 23:59 from IEEE Xplore. Restrictions apply.
n-tuple defines a set of n locations in the image space. Recognition is achieved by computing the distance between the stored n-tuples and the n-tuple generated by the image to be recognized. Here too, the ORL face database is used.

2 Background

and the inverse DCT is

    f(x, y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} α(u) α(v) C(u, v) cos((2x + 1)uπ / 2N) cos((2y + 1)vπ / 2N)    (5)
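The transform pair around Eq. (5) can be checked numerically. The sketch below (the helper names dct2 and idct2 are ours) implements the orthonormal 2D DCT and its inverse directly from the definition, using the direct O(N^4) sums for clarity rather than speed:

```python
import numpy as np

def alpha(k, N):
    # Normalization: sqrt(1/N) for k == 0, sqrt(2/N) otherwise
    return np.sqrt(1.0 / N) if k == 0 else np.sqrt(2.0 / N)

def dct2(f):
    # Forward 2D DCT of an N x N block, direct from the definition
    N = f.shape[0]
    C = np.zeros((N, N))
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += f[x, y] * np.cos((2*x + 1) * u * np.pi / (2*N)) \
                                 * np.cos((2*y + 1) * v * np.pi / (2*N))
            C[u, v] = alpha(u, N) * alpha(v, N) * s
    return C

def idct2(C):
    # Inverse 2D DCT, Eq. (5)
    N = C.shape[0]
    f = np.zeros((N, N))
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += alpha(u, N) * alpha(v, N) * C[u, v] \
                         * np.cos((2*x + 1) * u * np.pi / (2*N)) \
                         * np.cos((2*y + 1) * v * np.pi / (2*N))
            f[x, y] = s
    return f

block = np.random.default_rng(0).random((8, 8))
coeffs = dct2(block)
recovered = idct2(coeffs)
```

With the α(u)α(v) weights, the pair is orthonormal and the round trip recovers the block exactly (up to floating-point error).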
vector is determined by the number of DCT coefficients that are retained. This number is initially chosen based on some heuristics, and it can be frozen after experimentation. Each image gives an observation sequence, whose length is determined by the size of the sliding window, the amount of overlap between adjacent windows, and the size of the image.

The 2D face image is now represented by a corresponding 1D DCT-domain observation sequence, which is suitable for HMM training or testing.

3.2 HMM for Face Recognition

To do face recognition, a separate HMM is trained for every subject to be recognized, i.e. to recognize M subjects we have M distinct HMMs at our disposal. An ergodic HMM is employed for the present work. The training of each HMM is carried out using the DCT-based training sequences.

HMM Training: The following steps give a procedure for HMM training.

Step 1: Cluster all R training sequences, generated from the R face images of one particular subject, i.e. {O^w}, 1 ≤ w ≤ R, each of length T, into N clusters using some clustering algorithm, say the k-means clustering algorithm. Each cluster will represent a state of the training vectors. Let them be numbered from 1 to N.

Step 2: Assign the cluster number of the nearest cluster to each training vector, i.e. the t-th training vector is assigned a number i if its distance (say, the Euclidean distance) to cluster i is smaller than its distance to any other cluster j, j ≠ i.

Step 3: Calculate the mean {μ_i} and covariance matrix {Σ_i} for each state (cluster):

    μ_i = (1/N_i) Σ_{o_t ∈ i} o_t                              (7)

    Σ_i = (1/N_i) Σ_{o_t ∈ i} (o_t − μ_i)(o_t − μ_i)^T         (8)

where N_i is the number of vectors assigned to state i.

Step 4: Calculate the A and Π matrices using event counting:

    π_i = (No. of occurrences of o_1 ∈ i) / (No. of training sequences = R),  for 1 ≤ i ≤ N    (9)

    a_ij = (No. of occurrences of o_t ∈ i and o_{t+1} ∈ j) / (No. of occurrences of o_t ∈ i)    (10)

for 1 ≤ i, j ≤ N and for 1 ≤ t ≤ T − 1.

Step 5: Calculate the B matrix of probability densities of the training vectors for each state. Here we assume b_j(o_t) is Gaussian: for 1 ≤ j ≤ N,

    b_j(o_t) = (1 / ((2π)^{D/2} |Σ_j|^{1/2})) exp(−(1/2)(o_t − μ_j)^T Σ_j^{−1} (o_t − μ_j))    (11)

where o_t is of size D × 1.

Step 6: Now use the Viterbi algorithm to find the optimal state sequence Q* for each training sequence. Here the state reassignment is done: a vector is assigned state i if q_t* = i.

Step 7: The reassignment is effective only for those training vectors whose earlier state assignment is different from the Viterbi state. If there is any effective state reassignment, then repeat Steps 3 to 6; else STOP: the HMM is trained for the given training sequences.

The details of the Viterbi algorithm can be found in [6].

Face Recognition: Once all the HMMs are trained, we can go ahead with recognition. For the face image to be recognized, the data generation step is followed as described in Section 3.1. The trained HMMs are used to compute the likelihood function as follows. Let O be the DCT-based observation sequence generated from the face image to be recognized.

1: Using the Viterbi algorithm, we first compute

    Q_i* = argmax_Q P[O, Q | λ_i]    (12)

2: The recognized face corresponds to that i for which the likelihood function P[O, Q_i* | λ_i] is maximum.

4 Experimental Evaluation

The ORL face database is used in the experiments. There are 400 images of 40 subjects, 10 poses per subject. All the face images are of size 92 × 112 with 256 grey levels. The face data contains male and female subjects, with or without glasses, with or without a beard, and with or without some facial expression. A Pentium 200 MHz machine in a multi-user environment was used to evaluate our strategy.

Training: The training set (see Fig. 4) chosen has 5 different poses of each subject, implying that the number of training sequences per HMM is 5. For every subject an HMM is trained as follows:

1: Sample the image with a square 16 × 16 window. Take the DCT of the subimage enclosed by this window. Scan the DCT matrix in a zigzag fashion and select only a few significant DCT coefficients, say 15. Arrange them in vector form. Complete scanning of the face image from the top-left corner down to the bottom-right of the image with 75% overlap. This gives an observation sequence.

2: Repeat Step 1 for all the training images. This step gives the set of observation sequences for HMM training.
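The data-generation step above (sliding window, 2D DCT, zigzag selection of the most significant coefficients) can be sketched as follows. This is our own illustrative sketch: the helper names are ours, and the DCT is implemented via the orthonormal DCT-II basis matrix rather than a library call:

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis: D[k, m] = a(k) cos((2m+1)k*pi / 2n)
    D = np.zeros((n, n))
    for k in range(n):
        a = np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
        for m in range(n):
            D[k, m] = a * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    return D

def zigzag_indices(n):
    # Traverse anti-diagonals, alternating direction (JPEG-style zigzag)
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order

def observation_sequence(image, win=16, overlap=0.75, n_coef=15):
    # Slide a win x win window with the given overlap, take the 2D DCT
    # of each subimage, and keep the first n_coef zigzag coefficients
    # as one observation vector.
    step = max(1, int(win * (1 - overlap)))
    D = dct_matrix(win)
    zz = zigzag_indices(win)[:n_coef]
    seq = []
    for r in range(0, image.shape[0] - win + 1, step):
        for c in range(0, image.shape[1] - win + 1, step):
            block = image[r:r + win, c:c + win].astype(float)
            C = D @ block @ D.T          # 2D DCT of the subimage
            seq.append(np.array([C[i, j] for i, j in zz]))
    return np.stack(seq)                 # T x n_coef observation sequence

image = np.random.default_rng(1).random((92, 112))
O = observation_sequence(image)
```

For a 92 × 112 image, a 16 × 16 window and 75% overlap (step 4), this yields 20 × 25 = 500 observation vectors of 15 coefficients each.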
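The segmental training loop of Steps 1-7 (k-means initialization, count-based Π and A, Gaussian state densities, Viterbi reassignment until no vector changes state) can be sketched compactly. This is our own sketch, not the authors' code; for simplicity it assumes every state keeps at least one vector during training, which holds for well-separated data:

```python
import numpy as np

def log_gauss(x, mean, cov):
    # log of the multivariate Gaussian state density b_j(o_t), Eq. (11)
    d = len(mean)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + diff @ np.linalg.solve(cov, diff))

def estimate(sequences, assign, N):
    # Steps 3-5: per-state mean/covariance and count-based pi, A (Eqs. 7-10)
    X, z = np.vstack(sequences), np.concatenate(assign)
    means = np.stack([X[z == i].mean(0) for i in range(N)])
    covs = np.stack([np.cov(X[z == i].T) + 1e-6 * np.eye(X.shape[1]) for i in range(N)])
    pi = np.array([sum(a[0] == i for a in assign) for i in range(N)], float) / len(sequences)
    A = np.full((N, N), 1e-12)
    for a in assign:
        for t in range(len(a) - 1):
            A[a[t], a[t + 1]] += 1.0
    return pi, A / A.sum(1, keepdims=True), means, covs

def viterbi_path(seq, pi, A, means, covs):
    # Step 6: Q* = argmax_Q P[O, Q | lambda] by the Viterbi algorithm
    T, N = len(seq), len(pi)
    logB = np.array([[log_gauss(o, means[j], covs[j]) for j in range(N)] for o in seq])
    delta = np.log(pi + 1e-12) + logB[0]
    psi = np.zeros((T, N), int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)
        psi[t] = scores.argmax(0)
        delta = scores.max(0) + logB[t]
    path = np.zeros(T, int)
    path[-1] = delta.argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path

def train_hmm(sequences, n_states, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    X = np.vstack(sequences)
    # Step 1: k-means clustering of all training vectors into N states
    means = X[rng.choice(len(X), n_states, replace=False)]
    for _ in range(10):
        labels = ((X[:, None] - means[None]) ** 2).sum(-1).argmin(1)
        means = np.stack([X[labels == i].mean(0) if (labels == i).any() else means[i]
                          for i in range(n_states)])
    # Step 2: assign each vector to its nearest cluster, per sequence
    assign = [((s[:, None] - means[None]) ** 2).sum(-1).argmin(1) for s in sequences]
    for _ in range(n_iter):
        pi, A, means, covs = estimate(sequences, assign, n_states)       # Steps 3-5
        new = [viterbi_path(s, pi, A, means, covs) for s in sequences]   # Step 6
        if all(np.array_equal(a, b) for a, b in zip(assign, new)):       # Step 7: converged
            return pi, A, means, covs
        assign = new
    return pi, A, means, covs
```

Note the stopping rule is exactly Step 7: iteration ends when the Viterbi pass reassigns no training vector.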
3: Use the training algorithm described earlier to train the 5-state HMM.

Recognition: The recognition test is performed on the remaining 200 images, which are not part of the training set. Sample test images are shown in Fig. 5. Each image is recognized separately following the steps outlined below.

1: For the input image, generate the DCT-based observation sequence. Note that the window size, amount of overlap and number of DCT coefficients are exactly the same as in the training phase.

3: Use the Viterbi algorithm to decode the state sequence and find the state-optimized likelihood function for all the stored HMMs, namely, ∀i, P[O, Q_i* | λ_i], where Q_i* = argmax_Q P[O, Q | λ_i], and O is the DCT-based observation sequence corresponding to the image to be recognized.

4: Select that label of the HMM for which the likelihood function is maximum.

Figure 1. Sampling scheme to generate subimages (image input; X and Y subimage size; x and y overlap).

By varying the size of the sampling window, the percentage overlap and the number of DCT coefficients, experiments are carried out with the same set of images to test the efficacy of the proposed scheme. The highest recognition score achieved is 99.5%, i.e. only one image (shown in Fig. 5 with an 'X' on it) out of 200 images is misclassified, with a sampling window of 16 × 16, an overlap of 75% and 10 significant DCT coefficients. Nevertheless, the recognition rate varies from 74.5% to 99.5%, depending on the size of the subimage, the amount of overlap and the number of DCT coefficients. Table 1 gives the detailed experimental evaluation. Results tabulated are for subimages of size 8 × 8 and 16 × 16, with 50% and 75% overlap, and with 10, 15 and 21 DCT coefficients.
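The state-optimized likelihood search used in the recognition steps can be sketched in a few lines. This is our own minimal numpy sketch; it assumes each trained model is already available as log-probability arrays (the names logB, logA and logpi are ours):

```python
import numpy as np

def viterbi_loglik(logB, logA, logpi):
    # max over state sequences Q of log P[O, Q | lambda], Eq. (12)
    # logB: T x N log emission probs, logA: N x N log transitions,
    # logpi: length-N log initial-state probs
    T, N = logB.shape
    delta = logpi + logB[0]
    for t in range(1, T):
        delta = np.max(delta[:, None] + logA, axis=0) + logB[t]
    return delta.max()

def recognize(models):
    # models: list of (logB, logA, logpi), one per trained HMM;
    # return the label i maximizing P[O, Q_i* | lambda_i]
    return int(np.argmax([viterbi_loglik(*m) for m in models]))

# Toy usage: model 0 explains the observations better than model 1
logA = np.log(np.full((2, 2), 0.5))
logpi = np.log(np.full(2, 0.5))
models = [(np.zeros((4, 2)), logA, logpi),
          (np.full((4, 2), -5.0), logA, logpi)]
best = recognize(models)
```

Because only the single best state path is scored (rather than summing over all paths), this is the state-optimized likelihood rather than the full forward probability.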
5 Conclusions

For the face data set considered, the results of the proposed DCT-HMM based scheme show substantial improvement over the results obtained from some other methods. We are currently investigating the robustness of the proposed scheme with respect to noise and possible scale variations.

Figure 2. Scheme of converting 2D DCT coefficients to a vector.
Table 1. Results of experimentation for the proposed DCT-HMM based face recognition scheme.

Table 2. Comparative recognition results of some of the other methods, as reported by the respective authors, on the ORL face database.
References

[1] R. Chellappa, C.L. Wilson and S.A. Sirohey, 'Human and Machine Recognition of Faces: A Survey', IEEE Proceedings, Vol. 83, No. 3, pp. 704-740, May 1995.

[2] Joachim Hornegger, Heinrich Niemann, D. Paulus and G. Schlottke, 'Object Recognition using Hidden Markov Models', Proc. of Pattern Recognition in Practice, pp. 37-44, 1994.

[3] S. Lawrence, C.L. Giles, A.C. Tsoi and A.D. Back, 'Face Recognition: A Convolutional Neural-Network Approach', IEEE Trans. on Neural Networks, Vol. 8, No. 1, pp. 98-113, Jan. 1997.

[4] S.H. Lin, S.Y. Kung and L.J. Lin, 'Face Recognition/Detection by Probabilistic Decision Based Neural Network', IEEE Trans. on Neural Networks, Vol. 8, No. 1, pp. 114-132, Jan. 1997.

[5] S.M. Lucas, 'Face Recognition with the Continuous n-tuple Classifier', BMVC97, 1997.

[6] L.R. Rabiner, 'A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition', IEEE Proceedings, Vol. 77, No. 2, pp. 257-285, 1989.

[7] A.N. Rajagopalan, K. Sunil Kumar, Jayashree Karlekar, Manivasakan, Milind Patil, U.B. Desai, P.G. Poonacha, S. Chaudhuri, 'Finding Faces in Photographs', ICCV98, pp. 640-645, Jan. 1998.
[Figure: block diagram of the recognition scheme — the input observation sequence is scored against the stored HMM face models, and the label for which P[·] is maximum is selected.]
Figure 5. Sample ORL test face images. The face image with 'X' on it is the only misrecognised face image out of the 200 test face images, for the recognition rate of 99.5%.