$$\theta = \tan^{-1}\!\left(\frac{y}{x}\right), \qquad \phi = \cos^{-1}\!\left(\frac{z}{r}\right) \qquad (1)$$

where r is the diameter of mesh M (Figure 2 shows a spherical projection). Then, we can store the normals of mesh M in a two-dimensional structure, namely the normal sphere N, by using the 2D-projected vertex coordinates given by equation (1).
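As a minimal sketch of this projection (assuming the mesh vertices have already been centered and r chosen so that |z| ≤ r; the function name is illustrative), equation (1) might be implemented as:

```python
import numpy as np

def project_to_sphere(vertices, r):
    """Map centered 3D mesh vertices (x, y, z) to spherical coordinates
    (theta, phi) following equation (1); r bounds the mesh extent."""
    x, y, z = vertices[:, 0], vertices[:, 1], vertices[:, 2]
    theta = np.arctan2(y, x)                     # tan^-1(y / x)
    phi = np.arccos(np.clip(z / r, -1.0, 1.0))   # cos^-1(z / r)
    return np.stack([theta, phi], axis=1)
```

The clip guards against rounding errors pushing z/r slightly outside [-1, 1].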
Figure 2. Projecting the (a) 3D mesh vertices into (b) spherical coordinates.
In order to obtain an optimized tessellation of the 3D geometry we have to reshape the triangles, fixing the vertex coordinates according to a quite regular structure recursively generated from an icosahedron (see Figure 3). There is then only one normal for each point (triangle) on the surface of the normal sphere; each normal is characterized by its components, so the spatial distribution of the normals can be represented in a two-dimensional structure N by representing each normal as a point on the normal
sphere surface. In detail, for each triangle $t_i$ in N with vertex coordinates $(v_1^i, v_2^i, v_3^i) \in (\theta, \phi)$ we assign the RGB value obtained from the corresponding normal $n_i$ to the triangle $t_i$ on the mesh M. We refer to this resulting structure as the Normal Sphere N of the mesh M.
The 18th International Conference on Pattern Recognition (ICPR'06)
0-7695-2521-0/06 $20.00 2006
Figure 3. Mesh details before and after resampling.
In order to nullify the effects of face rotations we
consider the angles between normals of contiguous
triangles. In more detail, for each triangle $t_i$ on N its three contiguous neighbors $t_j$, $t_k$ and $t_l$ are considered, and the angles $\alpha(t_i, t_j)$, $\alpha(t_i, t_k)$ and $\alpha(t_i, t_l)$ between their normals are computed and sorted ($\alpha_1 \le \alpha_2 \le \alpha_3$). Then $\alpha_1$, $\alpha_2$ and $\alpha_3$ are quantized with 8 bits in the range [0, 255] and treated as the R, G and B bands, so the resulting codification can be seen as a 24-bit color (see Figure 1). This is the new facial shape that the GFD descriptor is applied to.
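A small sketch of this encoding (assuming the angles are normalized over [0, π] before the 8-bit quantization, which the text leaves implicit; names are illustrative):

```python
import numpy as np

def angles_to_rgb(n_i, n_j, n_k, n_l):
    """Encode the angles between triangle t_i's unit normal and those of
    its three contiguous neighbors as a 24-bit (R, G, B) color."""
    angles = [np.arccos(np.clip(np.dot(n_i, n), -1.0, 1.0))
              for n in (n_j, n_k, n_l)]
    a1, a2, a3 = sorted(angles)              # alpha1 <= alpha2 <= alpha3
    # Quantize each angle from [0, pi] into 8 bits, i.e. the range [0, 255].
    return tuple(int(a / np.pi * 255) for a in (a1, a2, a3))
```

The sort makes the codification independent of the order in which the three neighbors are visited.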
3. Face Classification
Turning the 3D classification problem into a 2D classification task allows us to save a meaningful part of the computational complexity. However, we want the rotation invariance property to be preserved, so some attention has to be paid in selecting a 2D face descriptor. In this case, we opted for a Fourier-based operator. The Fourier Transform (FT) is widely used in image processing, because working in the frequency domain often allows some limitations (e.g., noise, shifts) to be overcome. In particular, we adopted an FT-based descriptor defined by Zhang in [7] for classifying binary trademarks, and readapted it to our purposes.
Given an image I in the Cartesian space xOy, we convert it into a polar space by relating I(x, y) with I(ρ, θ), defining

$$\rho = \sqrt{(x - x_c)^2 + (y - y_c)^2} \quad \text{and} \quad \theta = \operatorname{atan2}(y - y_c,\; x - x_c),$$

where $O(x_c, y_c)$ is the center of the Cartesian space. The polar Fourier Descriptor is defined on this polar space as follows:

$$FD(\rho, \phi) = \sum_{r} \sum_{i} I(r, \theta_i) \exp\!\left[ j2\pi \left( \frac{r}{R}\rho + \frac{2\pi i}{T}\phi \right) \right] \qquad (2)$$
where $0 \le r < R$ and $\theta_i = i(2\pi/T)$, $0 \le i < T$; $0 \le \rho < R$ and $0 \le \phi < T$. The $\rho$ and $\phi$ represent the number of selected radial and angular frequencies. The FD descriptor is made rotation invariant by retaining only the magnitude of the coefficients, while robustness with respect to scaling is achieved by dividing the first coefficient by the area containing the image and all the remaining coefficients by the first one, as follows:
$$FD = \left\{ \frac{FD(0,0)}{\text{area}},\; \frac{FD(0,1)}{FD(0,0)},\; \ldots,\; \frac{FD(m,n)}{FD(0,0)} \right\} \qquad (3)$$
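The pipeline of equations (2) and (3) can be sketched as follows. This is only an illustrative implementation: the angular resolution T, the frequency counts m and n, and the use of the standard DFT kernel $\exp[-j2\pi(r\rho/R + i\phi/T)]$ in place of the paper's exact exponent are assumptions.

```python
import numpy as np

def polar_fourier_descriptor(img, xc, yc, m=4, n=9):
    """Illustrative polar Fourier Descriptor: resample img on a polar grid
    centered at (xc, yc), take low radial/angular frequencies (eq. 2), then
    normalize magnitudes as in equation (3)."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    rad = np.sqrt((xs - xc) ** 2 + (ys - yc) ** 2)
    ang = np.arctan2(ys - yc, xs - xc) % (2 * np.pi)
    R, T = int(rad.max()) + 1, 36               # radial / angular resolution
    polar = np.zeros((R, T))                    # accumulate I(r, theta_i)
    np.add.at(polar, (rad.astype(int),
                      (ang / (2 * np.pi) * T).astype(int) % T), img)
    r_idx = np.arange(R)[:, None]
    i_idx = np.arange(T)[None, :]
    fd = np.empty((m, n), dtype=complex)
    for rho in range(m):                        # equation (2)
        for phi in range(n):
            kernel = np.exp(-2j * np.pi * (r_idx * rho / R + i_idx * phi / T))
            fd[rho, phi] = (polar * kernel).sum()
    mag = np.abs(fd).ravel()                    # rotation invariance
    feat = mag / mag[0]                         # equation (3): scale invariance
    feat[0] = mag[0] / (h * w)
    return feat
```

Taking magnitudes discards the phase shift a rotation introduces, while the two divisions in equation (3) remove the dependence on image scale.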
The most important factor is the way in which the point $O(x_c, y_c)$ is localized. Indeed, imposing restrictions here would place this method on a par with all the approaches requiring a pre-alignment of the face. We solve this problem by keeping in mind that the objects to be classified are faces, and extensive experimentation proved that we can locate the nose tip as the point of maximum curvature on the face surface. By fixing this point onto the normal sphere as the center of the Cartesian space, left/right and bottom/up rotations are made irrelevant, while rollings (7th case in Figure 4) vanish thanks to the rotation/scaling invariance of the FD descriptor.
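The paper does not spell out its curvature estimator; one simple possibility, sketched here under that assumption, is to score each vertex by the mean angle between its unit normal and its neighbors' normals and take the maximum:

```python
import numpy as np

def locate_nose_tip(normals, neighbors):
    """Return the index of the vertex of maximum local curvature,
    estimated as the mean angle between a vertex's unit normal and
    those of its neighbors (a hypothetical, illustrative estimator).
    normals: (n, 3) array; neighbors: dict vertex -> neighbor indices."""
    curvature = np.zeros(len(normals))
    for v, nbrs in neighbors.items():
        cos = np.clip(normals[list(nbrs)] @ normals[v], -1.0, 1.0)
        curvature[v] = np.arccos(cos).mean()
    return int(np.argmax(curvature))
```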
4. Experimental Results
We built a gallery of 120 subjects enrolled with a structured light scanner from Inspeck Corp. In this case the pose variations were acquired through multiple scans of the same individual. Acquisitions differ in pose and expression; in more detail, ten 3D models with different facial expressions have been acquired for each subject, while 5 poses per subject have been considered. Figure 4 shows a subset of the poses and expressions we considered.
Figure 4. Pose variations (1st, 2nd and 3rd neutral, 45° and 60° right side, bottom, and rolled orientation).
In the first experiment we investigated the resolution providing the highest value of the CMS (Cumulative Match Score). The
gallery set contains 120 images with neutral expression, one per subject, while the probe set consists of 1080 images, i.e., the remaining 9 expressions for each of the 120 subjects. Figure 5 shows that 16×16 pixels are too few, giving very poor results in terms of CMS. The best results are achieved at a resolution of 32×32, while performance worsens again as the resolution increases further. This behavior can be explained by considering that high resolutions carry too much high-frequency information, which is of no use in this case.
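For reference, the CMS curve used here can be computed from a probe-gallery distance matrix as below (a generic sketch; variable names are illustrative):

```python
import numpy as np

def cumulative_match_scores(dist, probe_ids, gallery_ids):
    """CMS curve: cms[k-1] is the fraction of probes whose true identity
    appears among the k closest gallery entries.
    dist: (n_probes, n_gallery) distances between descriptors."""
    order = np.argsort(dist, axis=1)             # gallery sorted per probe
    ranked = np.asarray(gallery_ids)[order]
    ranks = np.argmax(ranked == np.asarray(probe_ids)[:, None], axis=1)
    return np.array([np.mean(ranks < k)
                     for k in range(1, dist.shape[1] + 1)])
```

cms[0] is then the usual rank-1 recognition rate.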
The second experiment tests the robustness of the method with respect to pose variations. In particular, as discussed in Section 3, left/right and bottom/up face rotations are made irrelevant by locating the nose tip on the normal sphere, so here we considered a gallery set of 20 aligned images, as in the first experiment, while the probe set consists of 180 images (9 per subject) with different facial expressions, rolled by a random angle $\alpha \in [0, \pi/6]$. In Figure 6 we compare the performance of NS+GFD, NS+PCA and the normal map (NM) based method from [1] when a 30° head rolling occurs. The results confirm the superiority of the proposed method with respect to both facial expression and pose variation.
Figure 5. CMS for different image resolutions.
5. Concluding remarks
We presented a novel 3D face recognition approach aimed at biometric applications, based on the normal sphere, a 2D data structure representing the local curvature of the facial surface. The 2D classification task allows us to save a meaningful part of the computational complexity while preserving the rotation invariance property. The method proved to be simple, invariant to posture variations, fast, and with a high average recognition rate. As the normal sphere is a 2D
mapping of mesh features, ongoing research will
integrate additional 2D color info (texture) captured
during the same enrolment session. Implementing a
true multi-modal version of the basic algorithm which
correlates the texture and normal image could further
enhance the discriminating power even for complex
3D recognition issues such as the presence of beard,
moustache, eyeglasses, etc.
Figure 6. CMS on rolled faces.
6. References
[1] A. F. Abate, M. Nappi, S. Ricciardi, G. Sabatino, "Fast
3D Face Recognition Based On Normal Map", in Proc. of
IEEE International Conference on Image Processing
(ICIP05), Genova, Italy, 2005.
[2] C. Beumier and M. Acheroy, "Face verification from
3D and grey level cues", in Pattern Recognition Letters, Vol.
22, No. 12, pp. 1321-1329, 2001.
[3] K. Bowyer, K. Chang, P. Flynn, "A survey of approaches to three-dimensional face recognition", in Proc. of 17th International Conference on Pattern Recognition (ICPR 2004), Vol. 1, pp. 358-361, Aug. 2004.
[4] A. M. Bronstein, M. M. Bronstein, and R. Kimmel,
"Expression invariant 3D face recognition", in Proc. of
Audio- and Video- Based Person Authentication (AVBPA
2003), Guildford, UK, Lecture Notes in Computer Science, J.
Kittler and M.S. Nixon, Vol. 2688, pp. 62-69, 2003.
[5] R. Chellappa, C.L. Wilson, S. Sirohey, "Human and
machine recognition of faces: A Survey," in Proc. of the
IEEE, Vol. 83, No. 5, pp. 705-740, 1995.
[6] G. Medioni and R. Waupotitsch, "Face recognition and
modeling in 3D", in Proc. of IEEE Int'l Workshop on
Analysis and Modeling of Faces and Gestures (AMFG
2003), Nice, France, pp. 232-233, Oct. 2003.
[7] D. Zhang, G. Lu, "Shape-based image retrieval using generic Fourier descriptor", in Signal Processing: Image Communication, Vol. 17, pp. 825-848, 2002.