
Face Detection

Henry Chang and Ulises Robles


Introduction
In this project we developed and implemented a color-based technique for detecting frontal human faces in images.
Some research has been done in this area. Face detection is often achieved by training neural networks and measuring distances to training sets in order to detect areas that might contain a human face. Another approach uses grayscale and color information directly; with this method there is no need, for instance, to spend time training a neural network. We implement an algorithm that detects faces independently of the background color of the scene.
The method consists of two image-processing steps. First, we separate skin regions from non-skin regions. Then we locate the frontal human face(s) within the skin regions.
In the first step, we build a chroma chart that gives the likelihood of skin colors. This chroma chart is used to generate a grayscale image from the original color image, with the property that the gray value at a pixel shows the likelihood of that pixel representing skin. We segment this grayscale image to separate skin regions from non-skin regions. The luminance component is then used, together with template matching, to determine whether a given skin region represents a frontal human face.
This document is divided into several pages, each describing one part of the detection process.
The project was implemented in Matlab using the Image Processing Toolbox, and the code is provided at the end as well.

Next: Skin Color Contents: Face Detection

Henry Chang and Ulises Robles


Last modified: Thu. May 25, 2000

Skin Color Model
In order to segment human skin regions from non-skin regions based on color, we need a reliable skin color model that is adaptable to people of different skin colors and to different lighting conditions [1]. In the following section, we will describe a model of skin color in the chromatic color space for segmenting skin.
The common RGB representation of color images is not suitable for characterizing skin color. In the RGB space, the triple component (r, g, b) represents not only color but also luminance. Luminance may vary across a person's face due to the ambient lighting and is not a reliable measure in separating skin from non-skin regions [2]. Luminance can be removed from the color representation in the chromatic color space. Chromatic colors [3], also known as "pure" colors in the absence of luminance, are defined by a normalization process shown below:

r = R / (R + G + B)
b = B / (R + G + B)

Note: The green component is redundant after the normalization because r + g + b = 1.
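The normalization above can be sketched in a few lines. The project itself was written in Matlab; the NumPy version below is only an illustration (the function name and the H x W x 3 image layout are our own choices):

```python
import numpy as np

def to_chromatic(img):
    """Convert an RGB image (H x W x 3) to its chromatic components (r, b).

    Luminance is removed by normalizing each channel by the pixel's total
    intensity; g is dropped because r + g + b = 1.
    """
    img = img.astype(np.float64)
    total = img.sum(axis=2)
    total[total == 0] = 1.0          # avoid division by zero on black pixels
    r = img[:, :, 0] / total
    b = img[:, :, 2] / total
    return r, b
```

A pure red pixel maps to (r, b) = (1, 0), and any gray pixel maps to (1/3, 1/3), which is exactly the luminance-removal property the text describes.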
Chromatic colors have been effectively used to segment color images in many applications [4]. They are also well suited in this case to segment skin regions from non-skin regions. The color distribution of skin colors of different people was found to be clustered in a small area of the chromatic color space. Although skin colors of different people appear to vary over a wide range, they differ much less in color than in brightness.

In other words, skin colors of different people are very close, but they differ mainly in intensity [1]. With this finding, we could proceed to develop a skin color model in the chromatic color space.
A total of 32,500 skin samples from 17 color images were used to determine the color distribution of human skin in chromatic color space. Our samples were taken from persons of different ethnicities: Asian, Caucasian and African. As the skin samples were extracted from color images, they were filtered using a low-pass filter to reduce the effect of noise in the samples. The impulse response of the low-pass filter is given by:
h = (1/9) * [1 1 1; 1 1 1; 1 1 1]

Figure 1 shows the color distribution of these skin samples in the chromatic color space.

Figure 1. Color distribution for skin color of different people.

The color histogram revealed that the distributions of skin color of different people are clustered in the chromatic color space, and that a skin color distribution can be represented by a Gaussian model N(m, C), where:

Mean: m = E{x}, where x = (r, b)^T
Covariance: C = E{(x - m)(x - m)^T}

Figure 2 shows the Gaussian distribution N(m, C) fitted to our data.

Figure 2. Fitting skin color into a Gaussian distribution.

With this Gaussian-fitted skin color model, we can now obtain the likelihood of skin for any pixel of an image. Therefore, if a pixel, having been transformed from RGB color space to chromatic color space, has a chromatic pair value of (r, b), the likelihood of skin for this pixel can be computed as follows:

likelihood(r, b) = exp[ -0.5 (x - m)^T C^(-1) (x - m) ],  where x = (r, b)^T

Hence, this skin color model can transform a color image into a grayscale image such that the gray value at each pixel shows the likelihood of the pixel belonging to skin. With appropriate thresholding, the grayscale image can then be further transformed into a binary image showing skin regions and non-skin regions. This process of transforming a color image to a skin-likelihood image and then to a skin-segmented image is detailed in the next section.
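The per-pixel likelihood can be sketched as follows. This is a NumPy illustration rather than the project's Matlab code; the mean m and covariance C are assumed to come from the fitted skin model above:

```python
import numpy as np

def skin_likelihood(r, b, m, C):
    """Per-pixel skin likelihood exp(-0.5 (x - m)^T C^-1 (x - m)).

    r, b : chromatic images (H x W); m : mean (2,); C : covariance (2, 2).
    Values lie in (0, 1], reaching 1 at the mean of the skin model.
    """
    Cinv = np.linalg.inv(C)
    x = np.stack([r - m[0], b - m[1]], axis=-1)       # H x W x 2 deviations
    d2 = np.einsum('...i,ij,...j->...', x, Cinv, x)   # squared Mahalanobis distance
    return np.exp(-0.5 * d2)
```

Since the exponent is zero at x = m, a pixel whose chromatic value equals the model mean gets likelihood exactly 1, which is why skin regions appear bright in the skin-likelihood image.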

Next: Skin Segmentation Previous: Introduction Contents: Face Detection


Skin Segmentation
Beginning with a color image, the first stage is to transform it to a skin-likelihood image. This involves transforming every pixel from RGB representation to chroma representation and determining the likelihood value based on the equation given in the previous section. The skin-likelihood image will be a grayscale image whose gray values represent the likelihood of each pixel belonging to skin. A sample color image and its resulting skin-likelihood image are shown in Figure 3. All skin regions (like the face, the hands and the arms) appear brighter than the non-skin regions.

Figure 3. (Left) The original color image. (Right) The skin-likelihood image.
However, it is important to note that the detected regions may not necessarily correspond to skin. It is only reliable to conclude that the detected regions have the same color as skin. The important point here is that this process can reliably rule out regions that do not have the color of skin, and such regions need not be considered any further in the face-finding process.
Since the skin regions are brighter than the other parts of the image, they can be segmented from the rest of the image through a thresholding process. Because different images contain people with different skin tones, no single fixed threshold works for all of them; since different skins yield different likelihood values, an adaptive thresholding process is required to achieve the optimal threshold value for each run.
The adaptive thresholding is based on the observation that stepping the threshold value down increases the segmented region. The increase in segmented region gradually diminishes (as the percentage of skin pixels detected approaches 100%), but then rises sharply once the threshold becomes so small that non-skin regions get included. The threshold value at which the minimum increase in region size is observed while stepping down is taken as the optimal threshold. In our program, the threshold value is decremented from 0.65 to 0.05 in steps of 0.1. If the minimum increase occurs when the threshold value is changed from 0.45 to 0.35, then the optimal threshold is taken as 0.4.
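The adaptive search can be sketched like this. It is a Python illustration (the original is Matlab), and the bookkeeping of the "minimum increase" step is our reading of the text:

```python
import numpy as np

def adaptive_threshold(likelihood, thresholds=None):
    """Pick the threshold at the step with the smallest growth in skin area.

    Steps the threshold from 0.65 down to 0.05 in decrements of 0.1 and
    returns the midpoint of the step where the segmented region grew least.
    """
    if thresholds is None:
        thresholds = np.arange(0.65, 0.04, -0.1)   # 0.65, 0.55, ..., 0.05
    areas = [(likelihood >= t).sum() for t in thresholds]
    increases = np.diff(areas)                     # area growth at each step down
    i = int(np.argmin(increases))
    return (thresholds[i] + thresholds[i + 1]) / 2
```

For example, if the segmented area grows least between thresholds 0.45 and 0.35, the function returns 0.4, matching the worked example in the text.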
Using this technique of adaptive thresholding, many images yield good results; the skin-colored regions are effectively segmented from the non-skin-colored regions. The skin-segmented version of the previous color image resulting from this technique is shown in Figure 4. We present some more results using this skin detection technique in Figures 5 and 6.

Figure 4. (Left) Skin-likelihood image. (Right) Skin-segmented image.

Figure 5. Image processing sequence for "face.jpg": original image, skin-likelihood image, skin-segmented image.

Figure 6. Image processing sequence for "graduation.jpg": original image, skin-likelihood image, skin-segmented image.

It is clear from the results above that not all detected skin regions contain faces. Some correspond to the hands, arms and other exposed parts of the body, while others correspond to objects with colors similar to those of skin. Hence the second stage of the face finder will employ facial features to locate the face in all these skin-like segments.

Next: Skin Regions Previous: Skin Color Model Contents: Face Detection


Skin Regions
Using the result from the previous section, we proceed to determine which regions could possibly contain a frontal human face. To do so, we need to determine the number of skin regions in the image.

A skin region is defined as a closed region in the image, which can have 0, 1 or more holes inside it. In a binary image, its boundary is represented by pixels with value 1. We can also think of a region as a set of connected components within an image [2]. All holes in a binary image have a pixel value of zero (black).
We determine how many regions the binary image contains by labeling them. A label is an integer value. We used an 8-connected neighborhood (i.e., all eight neighbors of a pixel) to determine the labeling of a pixel. If any of the neighbors already has a label, we label the current pixel with that label; if not, we use a new label. At the end, we count the number of labels, which is the number of regions in the segmented image.
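The labeling step can be sketched with a flood fill, which is equivalent to the neighbor-propagation scheme described above but avoids having to merge labels afterwards; Matlab provides this via functions such as bwlabel. The Python version below is only an illustration of 8-connected labeling:

```python
from collections import deque

def label_regions(binary):
    """8-connected component labeling of a binary image (list of lists of 0/1).

    Returns (labels, count): labels has the same shape with integer labels
    starting at 1; count is the number of regions found.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not labels[y][x]:
                count += 1                       # start a new region
                queue = deque([(y, x)])
                labels[y][x] = count
                while queue:
                    cy, cx = queue.popleft()
                    for dy in (-1, 0, 1):        # visit all 8 neighbors
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count
```

Under 8-connectivity, two pixels touching only diagonally still belong to the same region, which is why the full 3 x 3 neighborhood is scanned.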
To separate the regions, we scan through the image looking for the label of interest and create a new image that has ones in the positions where that label occurs; all other pixels are set to zero. After this, we iterate through each of the regions found to determine whether the region might contain a frontal human face. Figure 7 shows the segmented skin regions from the last section, as well as a particular skin region selected by the system, corresponding to the face in the baby image.

Figure 7. (Left) Segmented skin regions. (Right) A skin region.

Number of holes inside a region


After experimenting with several images, we decided that a skin region should have at least one hole inside it. Therefore, we discard regions that have no holes. To determine the number of holes inside a region, we compute the Euler number [5] of the region, defined as:

E = C - H

where E is the Euler number, C the number of connected components, and H the number of holes in the region.
The development tool (Matlab) provides a way to compute the Euler number. In our case, the number of connected components (i.e., the skin region) is always 1, since we consider one skin region at a time. The number of holes is then:

H = 1 - E

where H is the number of holes in the region and E is the Euler number.
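Counting holes directly can be sketched as follows: background pixels reachable from the image border are exterior, and every remaining 4-connected background component is a hole. For a single region this is equivalent to H = 1 - E; the project itself uses Matlab's Euler-number computation, so the Python below is only an illustration:

```python
from collections import deque

def count_holes(region):
    """Count holes in a single binary region (list of lists of 0/1)."""
    h, w = len(region), len(region[0])
    visited = [[False] * w for _ in range(h)]

    def flood(sy, sx):
        # 4-connected flood fill over background pixels
        queue = deque([(sy, sx)])
        visited[sy][sx] = True
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < h and 0 <= nx < w
                        and not visited[ny][nx] and not region[ny][nx]):
                    visited[ny][nx] = True
                    queue.append((ny, nx))

    # mark all exterior background reachable from the border
    for y in range(h):
        for x in range(w):
            if (y in (0, h - 1) or x in (0, w - 1)) \
                    and not region[y][x] and not visited[y][x]:
                flood(y, x)
    # each remaining background component is one hole
    holes = 0
    for y in range(h):
        for x in range(w):
            if not region[y][x] and not visited[y][x]:
                holes += 1
                flood(y, x)
    return holes
```

A 3 x 3 ring of ones has one hole, while a solid block or a region whose background touches the border has none, which matches the "at least one hole" filter applied above.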
Once the system has determined that a skin region has one or more holes inside it, we proceed to analyze some characteristics of that particular region. We first create a new image containing that region only; the rest is set to black.

Center of mass

To study the region, we first need to determine its area and its center. There are many ways to do this; one efficient way is to compute the center of mass (i.e., centroid) of the region [5]. For binary images, the center of area is the same as the center of mass, and it is computed as shown below:

x_bar = (1/A) * sum_i sum_j ( j * B[i,j] )
y_bar = (1/A) * sum_i sum_j ( i * B[i,j] )

where B is the [n x m] matrix representation of the region and A is the area, in pixels, of the region.
Note that for this computation, we are also considering the holes that the region has.
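The centroid is simply a weighted average of pixel coordinates; a NumPy sketch (again standing in for the project's Matlab code):

```python
import numpy as np

def centroid(B):
    """Center of mass of a binary region B (H x W array of 0/1).

    Returns (x_bar, y_bar) in column/row coordinates; A is the pixel area.
    """
    B = np.asarray(B, dtype=np.float64)
    A = B.sum()                       # area in pixels
    ys, xs = np.indices(B.shape)      # row and column index grids
    x_bar = (xs * B).sum() / A
    y_bar = (ys * B).sum() / A
    return x_bar, y_bar
```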

Orientation
Most of the faces we considered in this project are vertically oriented; however, some of them have a slight inclination. We would like to obtain a better match by rotating our template face by the right angle. One way to determine a unique orientation is through the elongation of the object: the orientation of the axis of elongation determines the orientation of the region. Along this axis, the inertia of the region is minimal. The axis is computed by finding the line for which the sum of the squared distances between region points and the line is minimum; in other words, we compute the least-squares fit of a line to the region points in the image [5]. At the end of the process, the angle of inclination (theta) is given by:

theta = (1/2) * arctan( b / (a - c) )

where:

a = sum( x'^2 ),  b = 2 * sum( x'*y' ),  c = sum( y'^2 )

and:

(x', y') = (x - x_bar, y - y_bar) are the region coordinates relative to the centroid.
Width and height of the region

At this point, we have the center of the region and its inclination. We still need to determine the width and height of the region in order to resize our template face so that it has the same width and height as the region.
First, we fill the holes that the region might have, to avoid problems when we encounter them. Since the region is rotated by some angle theta, we need to rotate it by -theta degrees so that it is completely vertical. We then determine the height and width by moving four pointers inward: one each from the left, right, top and bottom of the image. When a pointer finds a pixel value different from 0, it stops; that coordinate is a boundary. With the four values, we compute the height by subtracting the top value from the bottom value, and the width by subtracting the left value from the right value.
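The four-pointer scan amounts to finding the first and last nonzero row and column; a NumPy sketch (here the extent is taken inclusively, with +1, whereas the text simply subtracts the coordinates):

```python
import numpy as np

def region_extent(B):
    """Width and height of a hole-filled, axis-aligned binary region.

    Equivalent to scanning in from the left, right, top and bottom until
    a nonzero pixel is met, as with the four moving pointers in the text.
    """
    B = np.asarray(B)
    rows = np.where(B.any(axis=1))[0]   # rows containing region pixels
    cols = np.where(B.any(axis=0))[0]   # columns containing region pixels
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return right - left + 1, bottom - top + 1   # (width, height)
```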

Region ratio
We can use the width and height of the region to improve our decision process. The height-to-width ratio of human faces is around 1. In order to have fewer misses, however, we determined that a good minimum value is 0.8. Ratio values below 0.8 do not suggest a face, since human faces are oriented vertically.

The ratio should also have an upper limit. By analyzing our experimental results, we determined that a good upper limit is around 1.6. There are situations, however, where we do have a human face but the ratio is higher. This happens when the person has no shirt or is dressed in such a way that part of the neck and below is uncovered. To account for these cases, we cap the ratio at 1.6 and eliminate the part of the region below the height corresponding to this ratio.
While the above improves the classification, it can also be a drawback in cases such as very long arms: if the skin region for the arms has holes near the top, this might yield a false classification.

Template Face
One of the most important characteristics of this method is that it uses a human face template to make the final decision on whether a skin region represents a face. This template was chosen by averaging 16 frontal-view faces of males and females wearing no glasses and having no facial hair. The template we used is shown in Figure 8. Notice that the left and right borders of the template are located at the centers of the left and right ears of the averaged faces. The template is also vertically centered at the tip of the nose of the model.

Figure 8. Template face (model) used to verify the existence of faces in skin regions.
At this point, we have all the required parameters to do the matching between the part of
the image corresponding to the skin region and the template human face. Template
matching is described in the next section.

Next: Template Matching Previous: Skin Segmentation Contents: Face Detection


Template Matching
This section shows how to do the matching between the part of the image corresponding to the skin region and the template face.
For the image corresponding to the skin region, we first close the holes in the region and multiply this image by the original one. The development toolkit provides a function to close the holes based on the neighboring pixels. In Figure 9, we show the same baby image with and without the holes, and the product of the hole-free image with the original image.

Figure 9. (Left) A skin region. (Middle) The same region without the holes. (Right) Result of multiplying the original grayscale image by the image in the middle.
The template face has to be positioned and rotated at the same coordinates as the skin region image. This is done as follows:

Resize the template frontal face according to the height and width of the region computed in the previous section (Figure 10).

Figure 10. (Left) Original template face. (Right) Resized according to height and width.
Rotate the resized template face by -theta, so that it is aligned in the same direction as the skin region. Generate a new image that keeps only the model region by cropping it to the boundary of the region (the rotation process usually makes the image bigger, i.e., adds black pixels to the image). After that, eliminate the aliasing present at the edges of the new image (Figure 11).

Figure 11. (Left) Rotated template face. (Right) The result of cropping the image on the left.
Compute the center of the rotated template face as shown in the previous section. Create the grayscale image that will hold the resized and rotated template face model. This image must be the same size as the original image (Figure 12).

Figure 12. The center of the template face is located at the center of the skin region.
We then compute the cross-correlation value between the part of the image corresponding to the skin region (right in Figure 9) and the template face, properly processed and centered (Figure 11). We determined empirically, from our experiments, that a good threshold for classifying a region as a face is a resulting cross-correlation value greater than 0.6.
Once the system has decided that the skin region corresponds to a frontal human face, we create a new image with a hole exactly the size and shape of the processed template face. We then invert the pixel values of this image to generate a new one which, multiplied by the original grayscale image, yields an image like the original but with the template face located in the selected skin region. This is shown in Figure 13(4), in which the face of the baby is replaced by the template face.

Figure 13. (1) As in Figure 12, but with a hole in the template face. (2) As in (1), but inverted. (3) The previous image multiplied by the original one. (4) As in (3), but with the template face added.

We finally get the coordinates of the part of the image that contains the template face. With these coordinates, we draw a rectangle on the original color image. This is the output of the system, which in this case detected the face of the baby, as shown in Figure 14.

Figure 14. Final Result


We present more results in the next section.

Next: Results and Discussion Previous: Skin Regions Contents: Face Detection


Results and Discussion

We tested the method on a set of 30 images and achieved a classification rate of 76%. Most of the misses involved regions that had very similar skin-likelihood values, or regions that were indeed skin but very tall, such as arms and legs with more than one hole in the upper part of the skin region. Other misses happened due to the constraint we set of having one or more holes in a skin region in order to process that region.
We present some images and their corresponding processing steps for detecting whether there is a face in the image.
In Figure 15, we see that the neck of the lady is long, which might cause the neck to be detected as well. As described in the previous section, we capped the ratio at 1.6 and decreased the height accordingly. Notice also that the template face was fitted to the skin region very accurately, giving a cross-correlation value greater than 0.8.

Figure 15. Image processing sequence for face detection for the image "blackgirl.jpg": original image, skin-likelihood image, skin-segmented image, image and template face, final detection.
In Figure 16, we see that the child has blond hair, which in this case is very similar to the child's face color. This results in a large skin region, as shown in the third image. Consequently, the face model was fitted to a larger area than the child's face. The region was detected with a cross-correlation value of 0.71.

Figure 16. Image processing sequence for face detection: original image, skin-likelihood image, skin-segmented image, image and template face, final detection.
In Figure 17, we see an image that was clean and easy to detect. The woman's skin region has 2 holes (the eyes are not included). The man's has 5, and the baby's has 2. The cross-correlation value for all three was greater than 0.8.

Figure 17. Image processing sequence for face detection: skin-likelihood image, skin-segmented image, image and template face.
Figure 18 was a bit more complicated, since the skin region corresponding to the man presented only one hole (hardly noticeable here), but the cross-correlation value was greater than 0.85, which resulted in a correct classification.

Figure 18. Image processing sequence for face detection for the image "chinesecouple.jpg": original image, skin-likelihood image, skin-segmented image, image and template face, final detection.
In Figure 19, we can see that our implementation can classify faces of different races. The skin segmentation was accurate, and the cross-correlation value was around 0.7. Notice that the template face is slightly off the real face. This is because the center of mass was to the left of the lady's nose: the left part of the image has a larger skin area than the right part (notice the opening in the hair to the left).

Figure 19. Image processing sequence for face detection for the image "naomi.jpg": skin-likelihood image, skin-segmented image, image and template face.
Finally, Figure 20 illustrates two human faces of slightly different skin colors. Notice that the hands and the cat regions were not detected, since their ratio was lower than 0.8 (wider than tall), which does not correspond to a human face region. In both faces, the template face was elongated a little due to the height-to-width ratio.

Figure 20. Image processing sequence for face detection for the image "women.jpg": skin-likelihood image, skin-segmented image, image and template face.

Next: Conclusion Previous: Template Matching Contents: Face Detection


Conclusion
The retrieval of images containing human faces requires detecting human faces in such images. We implemented a new method that segments skin regions and locates faces using template matching in order to detect frontal human faces. We used 30 images to test the performance of this implementation and obtained 76% accuracy.

The misses usually involved regions with similar skin-likelihood values, and regions that certainly were skin but corresponded to other parts of the body, such as legs and arms. In other cases, misses were due to the constraint we set of having one or more holes in a skin region for it to be considered in the processing described in the previous sections.
Our current implementation is limited to the detection of frontal human faces. A possible and interesting extension would be to expand the template matching process to include side-view faces as well.

Next: Source Code Previous: Results and Discussion Contents: Face Detection


Source Code
Please Note: We have received many inquiries about the source code. This code is not intended to solve anyone's particular project, and it is not complete (it is around 85%). The purpose of this code is to give you an idea of how we implemented part of the algorithm.
Please do not send us any email regarding code questions or skin samples, because it will NOT be answered. Other project questions are welcome. Thank you.

ChromaDist.m : returns the chromatic components of the image; low-pass filtering is carried out to remove noise.

ColorDistPlot.m : plots the chromatic color distribution of the image.

SegmentSkin.m : assumes skinmodel.m has been run; produces two images: the skin-likelihood grayscale image, skin1, and the skin-segmented binary image, skin2.

skinmodel.m : uses the 32,500 skin samples from 17 color images to determine the color distribution of a human face in chromatic color space.

detect.m : main routine that, given the skin-segmented image and the original image, determines which regions correspond to a face and marks them with a rectangle.

processregion.m : plot the chromatic color distribution of the image.

faceInfo.m : Gets some information of the region that might indicate a face.

center.m: Computes the center of mass or centroid of a skin region.

orient.m: Determines the inclination angle of the region with respect to the
vertical line.

isThisAFace.m

clean_model.m

recsize.m

Next: References Previous: Conclusion Contents: Face Detection

