
2010 International Conference on Educational and Information Technology (ICEIT 2010)

3D Face Reconstruction using a Single 2D Face Image


Yue Ming, Qiuqi Ruan, Senior Member, IEEE, Xiaoli Li
Institute of Information Science,
Beijing JiaoTong University,
Beijing 100044, P.R. China
myname35875235@126.com
Abstract—In this paper, a novel framework for 3D face reconstruction from a single 2D face image is proposed. We focus on generating a 3D face model without expensive devices or complicated calculation. First, we preprocess the 2D face image, including illumination compensation, face detection and feature point extraction. The method is based on a 3D morphable face model that encodes shape and texture in terms of model parameters. The prior 3D face model is a linear combination of "eigenheads" obtained by applying PCA to a training set of laser-scanned 3D faces. To account for pose and illumination variations, the algorithm simulates the process of image formation in 3D and estimates the 3D shape and texture of faces from a single image. As opposed to previous literature, our method can locate facial feature points automatically in a 2D facial image. Moreover, we also show that our proposed method for pose estimation can be successfully applied to 3D face reconstruction. In our experimental results, the proposed method shows satisfactory performance in terms of computation time and the reconstruction of photorealistic 3D faces.

Keywords—3D face reconstruction; 3D morphable model; illumination compensation; pose estimation; feature point extraction
I. INTRODUCTION
Information and Communication Technologies are increasingly entering all aspects of our life and all sectors, opening a world of unprecedented scenarios where people interact with electronic devices embedded in environments that are sensitive and responsive to the presence of users. Indeed, since the first examples of "intelligent" buildings featuring computer-aided security and fire safety systems, the request for more sophisticated services, provided according to each user's specific needs, has characterized the new tendencies within domestic research.
3D face reconstruction using a single 2D facial image is a challenging task requiring several processes such as depth estimation or face modeling, since it is an ill-posed problem. Research in the areas of computer graphics and machine vision has given rise to a number of concepts that aid in facial reconstruction [1], for instance Structure from Motion (SFM) [2], Shape from Contours [3], and Shape from Silhouette [4]. However, all of these methods require several 2D images or video frames per person to reconstruct the 3D face, and it is not always practically feasible to obtain several 2D images per person. In this paper, we focus exclusively on the existing state of the art of 3D facial reconstruction using only a single frontal 2D image as the training input for each person. Blanz et al. [5] proposed a 3D alignment algorithm to recover the shape and texture parameters of a 3D morphable model. Romdhani et al. [6] introduced linear shape and texture fitting for 3D face reconstruction. The shape alignment and interpolation method correction (SAIMC) was introduced by Jiang et al. [7] to retrieve a 3D facial model from a single frontal 2D image. Applying the EM algorithm to facial landmarks in a 2D image, Sung et al. [8] proposed a pose estimation algorithm to infer the pose parameters of rotation, scaling and translation for reconstructing photorealistic 3D faces. In general, the aforementioned 2D-based methods do not consider the specific structures of human faces, and thus frequently perform worse on profile-pose face samples. 3D-based methods overcome this problem, but they either require heavy manual labeling work or are time-consuming [1].
In this paper, we propose an efficient and fully automatic 2D-to-3D integrated face reconstruction method to provide a solution to the above problems. Our system includes three components: preprocessing of the 2D facial image, 3D face shape reconstruction, and texture correspondence. First, combining the AdaBoost algorithm with skin-color likelihood estimation, we find the face region, and ASM [9] is used for facial feature point detection. As a result, the input is no longer arbitrary, and it is advantageous to use prior knowledge of the structure of a human face. Then, the 3D face shape is reconstructed according to the feature points and a 3D face database. From a single image, the algorithm estimates facial texture considering specular reflection and cast shadows. Compared with previous work, our framework is fully automatic and avoids burdensome enrollment work.
The remainder of this paper is organized as follows. First, we describe the automatic 2D facial image preprocessing in Section 2. In Section 3, we describe the algorithm for recovering shape parameters from images. Then, we introduce the texture correspondence algorithm in Section 4. Section 5 concludes the paper.
II. PREPROCESSING FOR STEREO PAIRS
Under non-ideal conditions, many issues affect 2D image acquisition, such as illumination, image angle and distance. The preprocessing scheme is based on three main tasks: illumination compensation, face detection and feature point detection. We present these tasks in more detail in the rest of the section. The fully automated tasks handle noisy and incomplete input data, are immune to rotation and translation, and are suitable for different resolutions, as shown in Figure 1.

978-1-4244-8035-7/10/$26.00 ©2010 IEEE
Figure 1. The framework of 2D facial image preprocessing.
A. Illumination Compensation

Illumination compensation is used here to denote the normalization of the input brightness image so that the effect of varying illumination conditions is diminished. In this section, illumination compensation is performed by generating from the input image a novel image relit from a frontal direction. We draw on recent work on image-based scene relighting used for rendering realistic images, e.g. placing a virtual object in a real image [10]. The idea behind image relighting is very simple. Malassiotis [11] indicated that the image irradiance (brightness) is a function of scene geometry, surface reflectance (bidirectional reflectance distribution function, BRDF), viewer position and illumination distribution. Given one or more images of the scene from different viewpoints, together with the scene geometry and surface reflectance, one may invert this function to recover the illumination distribution. The recovered illumination can then be used either to render a virtual object or to relight the scene under novel illumination. The method not only enhances local properties and details of the texture information of facial images, but also extracts local details, effectively improving disparity acquisition.
Given the estimate of the light source direction L, relighting the input image with frontal illumination L_0 is straightforward. Let I_c, I_D be respectively the input pose-compensated color image and the depth image obtained from shape from shading (SFS), and Î_c the illumination-compensated image [11]. Then the image irradiance for each pixel u is approximated by

    I_c(u) = A(u) R(I_D, L, u),    Î_c(u) = A(u) R(I_D, L_0, u)    (1)

where A is the unknown face texture function (geometry-independent component) and R is a rendering of the surface with constant albedo. When R(u) = n(u)^T L, with n(u) denoting the surface normal at u, we obtain the well-known Lambertian reflection model. Eq. (1) can be rewritten as

    Î_c(u) = I_c(u) · R(I_D, L_0, u) / R(I_D, L, u)    (2)

i.e. the illumination-compensated image is given by the multiplication of the input image with a ratio image.
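Under the Lambertian model R(u) = n(u)^T L, the ratio-image relighting of Eqs. (1)-(2) reduces to a per-pixel rescaling. A minimal sketch, assuming per-pixel unit normals from SFS are already available (the function and variable names below are our own, not from the paper):

```python
import numpy as np

def relight_frontal(image, normals, light, frontal_light, eps=1e-6):
    """Relight an image to frontal illumination via a ratio image (Eq. 2).

    image:         (H, W) input brightness image I_c
    normals:       (H, W, 3) unit surface normals n(u), e.g. from SFS
    light:         (3,) estimated light direction L
    frontal_light: (3,) frontal illumination direction L_0
    """
    # Lambertian rendering R(u) = n(u)^T L with constant albedo;
    # clip to eps to avoid dividing by zero in shadowed regions
    shading_in = np.clip(normals @ light, eps, None)
    shading_out = np.clip(normals @ frontal_light, eps, None)
    # Eq. (2): multiply the input image by the ratio image
    return image * (shading_out / shading_in)
```

With the estimated and frontal light directions equal, the ratio image is identically one and the input is returned unchanged, which is a useful sanity check.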
Figure 2. Illumination compensation of facial image
B. Face Detection
This paper integrates two face detection methods into one new algorithm, based on the respective advantages and disadvantages of AdaBoost and skin-color segmentation [12]. First, the majority of background regions are eliminated by AdaBoost segmentation, which has a fast detection speed and detects most human faces, including side-view and multi-pose faces. Then we utilize the YCbCr color space, which separates brightness and chroma very well; besides, the color space is discrete, which makes clustering algorithms easy to implement, among other merits. We use a Gaussian skin-color model, whose detection accuracy for a single face is very high; it avoids wrongly handling multi-face and multi-pose images as well as missed detections, and finally eliminates only the non-face regions. The conversion formula from the RGB space to the YCbCr space is as follows:
    [ Y  ]   [  0.2990   0.5870   0.1140     0 ] [ R ]
    [ Cb ] = [ -0.1687  -0.3313   0.5000   128 ] [ G ]    (3)
    [ Cr ]   [  0.5000  -0.4187  -0.0813   128 ] [ B ]
    [ 1  ]   [  0        0        0          1 ] [ 1 ]

Figure 3. Face detection of 2D facial image
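The conversion in Eq. (3) followed by a Gaussian likelihood in the (Cb, Cr) plane can be sketched as below. The skin-color mean and covariance are assumed to have been estimated from training skin samples beforehand; the function names are our own:

```python
import numpy as np

# RGB -> YCbCr conversion matrix and offset from Eq. (3)
M = np.array([[ 0.2990,  0.5870,  0.1140],
              [-0.1687, -0.3313,  0.5000],
              [ 0.5000, -0.4187, -0.0813]])
OFFSET = np.array([0.0, 128.0, 128.0])

def rgb_to_ycbcr(rgb):
    """Convert an (..., 3) RGB array (0-255 range) to YCbCr."""
    return rgb @ M.T + OFFSET

def skin_likelihood(rgb, mean, cov):
    """Gaussian skin-color likelihood in the (Cb, Cr) plane.

    mean (2,) and cov (2, 2) are assumed pre-estimated from skin samples.
    """
    cbcr = rgb_to_ycbcr(rgb)[..., 1:]          # drop luminance Y
    d = cbcr - mean
    mahal = np.einsum('...i,ij,...j->...', d, np.linalg.inv(cov), d)
    return np.exp(-0.5 * mahal)
```

Note that dropping Y makes the model largely insensitive to brightness, which is exactly why the YCbCr space is preferred here over RGB.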
The process of face detection effectively reflects low-level image features like edges, peaks, valleys, and ridges, which is equivalent to enhancing key facial element information such as the nose, eyes and mouth, plus local characteristics like dimples, melanotic nevi and scars. These features not only preserve global facial information, but also enhance local characteristics. When the pose, expression and position of a face change, the local changes are smaller than the global changes, resulting in very robust feature detection.
C. Feature Points Detection

Automatic facial feature detection is a difficult but key task for many practical applications of face image analysis. In this paper, invisible points under occlusion or undetected points are estimated through the global shape and texture constraints using Active Appearance Models [13]. We define the facial feature points as mostly located around the eyes, nose, eyebrows, mouth, and the boundary of the face. These points provide general shape information about any face. First, we use the General Whole Face Shape Template with open eyes (or the one with closed eyes) to initialize the whole face, thus obtaining the approximate locations of the two outer corners of the eyes. Then, we apply a local ASM to the mouth to estimate the mouth contour and obtain the true edge of the mouth with the Canny operator. If the eyes are detected as open and the mouth is detected as O-shaped, we choose the Whole Face Templates for open eyes and an O-shaped mouth to search the whole face contour, and so on. Taking advantage of multi-resolution searching [13], we can obtain the whole face contour when the ASM converges or the maximum number of ASM iterations is reached. In total, 68 feature points are located automatically on the face.
Figure 4. Facial feature points of 2D facial image
The approach preserves the local neighbor structure of a facial image and increases global discriminant information. This has many positive benefits, such as compressing data and thereby reducing storage requirements, removing unnecessary noise, and extracting effective features for visualizing higher-dimensional data.
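The ASM search described above alternates between moving landmarks toward local image evidence and projecting the result back onto the learned shape subspace. The projection step, which keeps the 68 points forming a plausible face, can be sketched as follows (the function name and the conventional ±3σ clipping are standard ASM practice [9, 13], not specifics of this paper):

```python
import numpy as np

def constrain_shape(points, mean_shape, modes, eigvals, k=3.0):
    """Project a candidate landmark set onto the ASM shape subspace.

    points, mean_shape: (2t,) stacked landmark coordinates
    modes:   (2t, m) shape eigenvectors;  eigvals: (m,) their eigenvalues
    Each mode coefficient is clipped to +/- k*sqrt(eigenvalue), the usual
    plausibility constraint in Active Shape Models.
    """
    b = modes.T @ (points - mean_shape)    # project onto the modes
    limit = k * np.sqrt(eigvals)
    b = np.clip(b, -limit, limit)          # keep the shape plausible
    return mean_shape + modes @ b          # reconstruct constrained shape
```

An implausible candidate (a coefficient beyond ±3σ) is pulled back to the boundary of the allowed shape space, while components outside the subspace are discarded entirely.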
III. 3D FACE SHAPE RECONSTRUCTION
First, for training, we use 200 laser-scanned 3D faces from the USF Human-ID database [5]. Each face in the database has 75972 vertices. The original scans consist of considerably dense points in 3D space. We exactly align the range images and approximate them with a simple and regular mesh by the multi-resolution fitting scheme [14] for better performance. The geometry of a 3D face model is represented with a shape vector S = (x_1, y_1, z_1, x_2, ..., y_n, z_n)^T ∈ R^{3n} [15]. PCA is conducted to obtain a more compact and regular shape representation of the face by its principal components. Here, S̄ is the average shape and P ∈ R^{3n×m} is the matrix of the first m eigenvectors (in descending order of their eigenvalues) [15]. A new face shape S' can be expressed as

    S' = S̄ + P a    (4)

where a = (a_1, a_2, ..., a_m)^T ∈ R^m is the vector of coefficients of the shape eigenvectors.
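The eigenhead model of Eq. (4) can be built directly with an SVD of the centered training shapes. A minimal sketch (our own function names; in practice the rows would be the 3n-dimensional USF scan vectors):

```python
import numpy as np

def build_shape_model(shapes, m):
    """Build the PCA shape model of Eq. (4) from training shape vectors.

    shapes: (N, 3n) matrix, one laser-scanned face shape per row
    m:      number of eigenheads to keep
    Returns the mean shape S_bar, eigenvector matrix P (3n x m),
    and the corresponding eigenvalues.
    """
    s_bar = shapes.mean(axis=0)
    X = shapes - s_bar
    # SVD of the centered data gives the principal components directly
    _, sing, Vt = np.linalg.svd(X, full_matrices=False)
    eigvals = sing[:m] ** 2 / (len(shapes) - 1)
    P = Vt[:m].T
    return s_bar, P, eigvals

def synthesize(s_bar, P, a):
    """Eq. (4): a new shape is the mean plus a combination of eigenheads."""
    return s_bar + P @ a
```

Any training shape lying in the retained subspace is recovered exactly by projecting onto P and synthesizing back, which is a convenient correctness check.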
We select t 2D facial feature points for 3D reconstruction, as discussed in Section II.C. We define S_f = (x_1, y_1, x_2, y_2, ..., x_t, y_t)^T ∈ R^{2t}, the set of x, y coordinates of the feature points on the surface, as the sub-shape vector of S. A new face, based on the x, y coordinates of those feature points, can be expressed as

    S'_f = S̄_f + P_f a    (5)

where S̄_f ∈ R^{2t} and P_f ∈ R^{2t×m} are the x, y coordinates of the feature points on S̄ and P, respectively.
In the reconstruction step, we transform face coordinates to image coordinates to obtain the transformed shape S̃:

    S̃ = c S + T    (6)

where T ∈ R^{2t} is the translation vector and c ∈ R is the scale coefficient; a frontal view is assumed, so no rotation matrix is required.

We apply an iterative procedure to compute the face geometry coefficients a. Let S_f be the initial value of S, and T_x and T_y be the average offsets of all t feature points of S̃ to the origin along the x and y axes, respectively. Then

    (T_x, T_y) = (1/t) Σ_{i=1}^{t} S̃_f^i    (7)

    c = Σ_{i=1}^{t} ⟨ S̃^i − (T_x, T_y)^T , S^i ⟩ / Σ_{i=1}^{t} ‖S^i‖²    (8)
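Eqs. (7)-(8) amount to estimating the translation as the centroid of the detected points and the scale as a least-squares fit between the centered image points and the model points. A minimal sketch, assuming the model feature points are centered on the origin (function names are our own):

```python
import numpy as np

def estimate_transform(image_pts, model_pts):
    """Estimate translation T and scale c of Eqs. (6)-(8), frontal view.

    image_pts: (t, 2) detected 2D feature points S_tilde
    model_pts: (t, 2) x,y coordinates of the model feature points S,
               assumed centered on the origin
    """
    # Eq. (7): translation is the mean offset of the image points
    T = image_pts.mean(axis=0)
    centered = image_pts - T
    # Eq. (8): least-squares scale between centered image and model points
    c = np.sum(centered * model_pts) / np.sum(model_pts ** 2)
    return c, T

def apply_transform(model_pts, c, T):
    """Eq. (6): S_tilde = c * S + T (frontal view, no rotation)."""
    return c * model_pts + T
```

Applying the estimated (c, T) to the model points reproduces the image points exactly when they are related by a pure scale-plus-translation, which is the assumption Eq. (6) makes.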
The face geometry coefficients a can be computed following [15], derived from (4) and (5), as the regularized least-squares solution

    a = (P_f^T P_f + λ Λ^{-1})^{-1} P_f^T (S_f − S̄_f)    (9)

where S_f denotes the observed feature points mapped back into model coordinates through Eq. (6), Λ = diag(v_1, v_2, ..., v_m) is applied to constrain a and avoid outliers, λ is the weighting factor, and v_i is the i-th eigenvalue. Then a new S can be obtained by applying a to Eq. (4).
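The closed form in Eq. (9) is the standard ridge-regularized least-squares solve; modes with small eigenvalues are penalized most, keeping the reconstruction close to the average face. A minimal sketch (our own function and variable names, assuming S_f has already been mapped back to model coordinates):

```python
import numpy as np

def solve_coefficients(s_f, s_f_bar, P_f, eigvals, lam=1.0):
    """Regularized least-squares solve for the geometry coefficients (Eq. 9).

    s_f, s_f_bar: (2t,) observed and mean feature-point vectors
    P_f:          (2t, m) feature-point rows of the eigenvector matrix
    eigvals:      (m,) PCA eigenvalues v_i;  lam is the weighting factor.
    Larger lam pulls a toward zero, i.e. toward the average face.
    """
    Lambda_inv = np.diag(1.0 / eigvals)
    A = P_f.T @ P_f + lam * Lambda_inv
    return np.linalg.solve(A, P_f.T @ (s_f - s_f_bar))
```

With lam near zero the solve reduces to an unconstrained fit of the observed points; with lam very large the coefficients shrink toward zero, illustrating the regularizer's role.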
IV. TEXTURE CORRESPONDENCE
In an analysis-by-synthesis loop, the morphable face model can be fitted to a novel face shown in an input image I_input(x, y), aiming at finding the model parameters β for texture correspondence. For fitting the model to an image, we only consider the centers of triangles, about 0.3 mm² in size [16]. The Phong illumination model approximately describes the diffuse and specular reflection on a surface. On each vertex k, the red channel is

    I_{r,model}(x, y) = R_k · L_{r,amb} + R_k · L_{r,dir} · (n_k · l) + k_s · L_{r,dir} · (r_k · v̂_k)^ν    (10)

where R_k is the red component of the diffuse reflection coefficient stored in the texture vector T, L_{r,amb} and L_{r,dir} are the red intensities of the ambient and direct light, l is the direction of illumination, k_s is the specular reflectance, ν defines the angular distribution of the specularities, v̂_k is the viewing direction, and r_k = 2(n_k · l) n_k − l is the direction of maximum specular reflection [16, 5]. The green and blue channels are computed in the same way. The resulting I_{r,model}, I_{g,model} and I_{b,model} are drawn at a position (p_x, p_y) in the final image I_model. The optimization algorithm starts from the average face at a position and orientation roughly aligned with the face in the image. Gradient descent is applied to minimize the sum of squared differences over all color channels and all pixels between the input image and the synthetic reconstruction:

    E = Σ_{k∈K} ‖ I_input(x_k, y_k) − I_model(x_k, y_k) ‖²    (11)

where K is a stochastic point set and (x_k, y_k) is the barycenter of a triangular face projected onto the image plane. For each iteration of the optimization process, the fitting algorithm analytically computes the gradient of the cost function and then updates the parameters:

    β ← β − λ ∂E/∂β    (12)

where λ is the learning rate. If |E − E_last| is smaller than a given threshold ε, the iteration terminates and the final parameters β are obtained.
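The per-vertex shading of Eq. (10) can be sketched for one color channel as follows (our own function names; the clamping of the dot products to zero is a standard rendering guard, not stated in the paper):

```python
import numpy as np

def phong_channel(albedo, n, l, v, L_amb, L_dir, k_s, nu):
    """Per-vertex Phong shading of one color channel, as in Eq. (10).

    albedo:  diffuse reflectance R_k of the vertex for this channel
    n, l, v: unit surface normal, light direction and viewing direction
    L_amb, L_dir: ambient and directed light intensities
    k_s, nu: specular reflectance and specular exponent
    """
    # direction of maximum specular reflection: r_k = 2 (n . l) n - l
    r = 2.0 * np.dot(n, l) * n - l
    diffuse = albedo * L_dir * max(np.dot(n, l), 0.0)
    specular = k_s * L_dir * max(np.dot(r, v), 0.0) ** nu
    return albedo * L_amb + diffuse + specular
```

The green and blue channels reuse the same geometry terms (n, l, v, r) with their own albedo and light intensities, so in practice the reflection vector is computed once per vertex.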
Figure 5. The flowchart of 3D face reconstruction
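The fitting loop of Eqs. (11)-(12) can be sketched generically as below. The paper computes the gradient analytically; this sketch substitutes a finite-difference gradient so that any renderer can be plugged in (function names and the learning-rate value are our own):

```python
import numpy as np

def fit_parameters(render, beta0, target, rate=0.01, eps=1e-8, max_iter=500):
    """Analysis-by-synthesis fitting loop of Eqs. (11)-(12).

    render: function beta -> synthetic pixel vector I_model at the
            stochastic point set K
    target: observed pixel vector I_input sampled at the same points
    """
    beta = np.asarray(beta0, dtype=float)
    E_last = np.inf
    for _ in range(max_iter):
        E = np.sum((target - render(beta)) ** 2)    # Eq. (11)
        if abs(E - E_last) < eps:                   # convergence test
            break
        E_last = E
        # finite-difference gradient of the cost function
        grad = np.zeros_like(beta)
        h = 1e-6
        for i in range(len(beta)):
            b = beta.copy()
            b[i] += h
            grad[i] = (np.sum((target - render(b)) ** 2) - E) / h
        beta = beta - rate * grad                   # Eq. (12)
    return beta
```

For a well-conditioned renderer the loop converges to the parameters that generated the target; in the full system, stochastically resampling K each iteration (as in [16]) keeps the per-step cost low.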
V. CONCLUSION
This paper presents a new attempt at a practical 3D face reconstruction system using a single 2D image. On the basis of thorough research on preprocessing, 3D face shape reconstruction and texture correspondence, a lifelike 3D face model is obtained. Given the large variations in illumination and changes in viewpoint from frontal to profile, the performance of our algorithm seems promising. The results clearly demonstrate the potential of creating a cost-effective, easy-to-use facial model acquisition system applicable to a wide range of 3D face reconstruction tasks. For further evaluation, the method needs to be applied to a larger database.
ACKNOWLEDGMENT
This work was supported in part by the National Natural Science Foundation of China (Grant No. 60973060), the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 200800040008), the Doctoral Candidate Outstanding Innovation Foundation (Grant No. 141092522) and the Fundamental Research Funds for the Central Universities (Grant No. 2009YJS025).
REFERENCES

[1] Martin D. Levine, Yingfeng (Chris) Yu, "State-of-the-art of 3D facial reconstruction methods for face recognition based on a single 2D training image per person", Pattern Recognition Letters, vol. 30, no. 10, pp. 908-913, 15 July 2009.
[2] C. Bregler, A. Hertzmann, H. Biermann, "Recovering non-rigid 3D shape from image streams", in Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, 2000, pp. 690-696.
[3] G. Himaanshu, A. K. RoyChowdhury, R. Chellappa, "Contour-based 3D face modeling from a monocular video", in British Machine Vision Conference, BMVC04, Kingston University, London, September 7-9, 2004.
[4] B. Moghaddam, J. H. Lee, H. Pfister, R. Machiraju, "Model-based 3D face capture with shape-from-silhouettes", in IEEE Int. Workshop on Analysis and Modeling of Faces and Gestures (AMFG), Nice, France, pp. 20-27, 2003.
[5] V. Blanz, T. Vetter, "Face recognition based on fitting a 3D morphable model", IEEE Trans. Pattern Anal. Machine Intell., vol. 25, no. 9, pp. 1063-1074, 2003.
[6] S. Romdhani, V. Blanz, T. Vetter, "Face identification by fitting a 3D morphable model using linear shape and texture error functions", in Proc. ECCV, vol. 4, pp. 3-19, 2002.
[7] D. Jiang, Y. Hu, S. Yan, L. Zhang, H. Zhang, W. Gao, "Efficient 3D reconstruction for face recognition", Pattern Recognition, vol. 38, no. 6, pp. 787-798, 2005.
[8] Sung Won Park, Jingu Heo, Marios Savvides, "3D face reconstruction", in IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 1-8, 2008.
[9] S. Milborrow, F. Nicolls, "Locating facial features with an extended Active Shape Model", ECCV 2008.
[10] I. Sato, Y. Sato, K. Ikeuchi, "Acquiring a radiance distribution to superimpose virtual objects onto a real scene", IEEE Trans. Visualiza. Comput. Graph., vol. 5, no. 1, pp. 1-12, 1999.
[11] Sotiris Malassiotis, Michael G. Strintzis, "Robust face recognition using 2D and 3D data: Pose and illumination compensation", Pattern Recognition, vol. 38, no. 12, pp. 2537-2548, 2005.
[12] Ying Li, J. H. Lai, P. C. Yuen, "Multi-template ASM method for feature points detection of facial image with diverse expressions", in 7th International Conference on Automatic Face and Gesture Recognition.
[13] N. Uchida, T. Shibahara, T. Aoki, H. Nakajima, K. Kobayashi, "3D face recognition using passive stereo vision", in IEEE International Conference on Image Processing, 2005.
[14] Gabriel Peyré, "Numerical Mesh Processing", Chapter 4.
[15] Dalong Jiang, Yuxiao Hu, Shuicheng Yan, Lei Zhang, HongJiang Zhang, Wen Gao, "Efficient 3D reconstruction for face recognition", Pattern Recognition, vol. 38, no. 6, pp. 787-798, June 2005.
[16] V. Blanz, S. Romdhani, T. Vetter, "Face identification across different poses and illuminations with a 3D morphable model", in Fifth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 192-197, 2002.