
ASIAN JOURNAL OF ENGINEERING, SCIENCES & TECHNOLOGY, VOL. 4, ISSUE 1, MARCH 2014

Unconcealed Gun Detection using Haar-like and HOG Features - A Comparative Approach
Sorath Asnani, Syed Danial Waseem, Ali Asghar Manjotho
Department of Computer Systems Engineering
Mehran University of Engineering and Technology, Jamshoro, Sindh, Pakistan.
sorath.asnani@hotmail.com, engr.syeddanialwaseem@outlook.com, ali.manjotho@faculty.muet.edu.pk

Abstract: Owing to its wide variety of applications, object detection has been a focus of research in digital image processing and computer vision. When trained with a sample dataset, various object classifiers can detect and classify objects with high accuracy and precision. The major step in any object classification algorithm is feature selection, because the performance of the classifier depends on the robustness of the selected feature vector. This paper presents an unconcealed gun detection method based on a Boosted Cascade Classifier. The classifier was trained with two widely known feature types: Haar-like features and Histogram of Oriented Gradients (HOG) features. The paper also presents a comparative study of the two feature types for unconcealed gun detection. The classifier was trained separately with each feature type on a dataset of 11,257 images and tested on a dataset of 700 images. Using the Haar-like features, the classifier attained an accuracy of 42.14% with a precision of 45.73%, while using the HOG features it attained an accuracy of 88.57% with a precision of 95.30%. The evaluation metrics clearly show the superiority of HOG features over Haar-like features in unconcealed gun detection.

Keywords: Object Detection, Cascade Classifier, Haar-like features, HOG features

I. INTRODUCTION
Unconcealed weapon detection is a modern research area in the field of Computer Vision. It is an application of smart surveillance that aims to provide a reliable, cost-efficient and accurate security system. Weapon detection itself is not a new research field, because much work has been done on concealed weapon detection; unconcealed weapon detection, however, is relatively new.

To detect any object, a classifier is trained. A classifier is a computer program capable of identifying an object of interest among all other objects. It needs input on which it can be trained, and this input takes the form of training images. An image contains not only the object of interest but other objects as well. For example, an image of a gun may contain a person holding the gun, a person at whom the gun is pointed, the environmental background and so on. It is therefore not advisable to provide the whole image to a classifier; it is more beneficial to extract features from the object of interest in the image.
A feature is simply a piece of information related to the object in an image, typically its shape, edges, color, size or texture. The stronger the feature set, the more accurate the classifier. The most important step in training a classifier is therefore to decide on the feature set and to select a robust feature extraction technique.
Various feature extraction algorithms are used in practice for classification, among them SIFT (Scale Invariant Feature Transform) [1], SURF (Speeded-Up Robust Features) [2], FAST (Features from Accelerated Segment Test) [3], MSER (Maximally Stable Extremal Regions) [4], BRISK (Binary Robust Invariant Scalable Keypoints) [5], HOG (Histogram of Oriented Gradients) [6], Haar wavelets [7], LBP (Local Binary Patterns) [8] and many others.

The biggest problem in object detection is finding the most appropriate feature type for the particular application. The literature survey indicates that Haar and HOG are the most powerful of the feature types mentioned above. At the same time, the other features cannot be ignored, since they formed the basis for the development of these two well-known feature types.

This paper provides a comparative study of two well-known feature types, HOG and Haar-like features, for unconcealed weapon detection. HOG features, introduced by Dalal and Triggs [6], are well known for their high detection accuracy, whereas Haar-like features, proposed by Viola and Jones [7], are known for their fast training speed. This paper takes a practical approach to validating these properties of both feature types for unconcealed gun detection.

The rest of the paper is organized as follows: Section II outlines the previous work, Section III provides the details of the feature types used in this paper, Section IV describes the classifier training for gun detection, Section V discusses the training and testing corpus, Section VI describes the methodology used to train the classifier, Section VII presents the experimental results, and Section VIII summarizes and concludes the proposed system. Finally, the future work and references are provided.


III. FEATURE SELECTION


Feature selection and the training algorithm are the two key
steps of creating an efficient classifier. In this section we
explain the working mechanism of Haar-like and HOG
features one by one.
A. HAAR-LIKE FEATURES

II. RELATED WORK

The Haar-like features, introduced by Viola and Jones in [7], have been used widely for face detection. The Haar-like feature takes its name from Haar wavelets. It is a rectangular feature in which each rectangle is made up of alternating white and black blocks. To compute the feature, first calculate the sum of the pixels in all the white regions and the sum of the pixels in all the black regions separately, then subtract one sum from the other. The resulting difference is the value of the Haar-like feature.
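The white-minus-black computation described above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper (which relied on MATLAB); the function names and the toy 4x4 patch are hypothetical, and the integral-image shortcut is borrowed from the Viola and Jones framework [7] as an assumption about how the rectangle sums would typically be obtained.

```python
def integral_image(gray):
    """Summed-area table with an extra zero row and column."""
    rows, cols = len(gray), len(gray[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_total = 0
        for c in range(cols):
            row_total += gray[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_total
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Two-rectangle feature: white (left half) minus black (right half)."""
    half = width // 2
    white = rect_sum(ii, top, left, height, half)
    black = rect_sum(ii, top, left + half, height, half)
    return white - black


# Toy patch (hypothetical data): bright left half, dark right half.
patch = [[9, 9, 1, 1] for _ in range(4)]
ii = integral_image(patch)
value = two_rect_feature(ii, 0, 0, 4, 4)  # 72 - 8 = 64
```

A large magnitude indicates a strong vertical edge inside the rectangle; a flat region gives a value near zero.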

This section examines the related work on HOG and Haar-like features.
The Haar-like features are used extensively for face detection, where they have proved successful, as in [7]. The HOG features have been used successfully for human and pedestrian detection in [9]. Research has not stopped there: researchers continue to compare the two feature types for many other applications.

Haar-like features can be classified as two-, three- and four-rectangle features, as shown in Figure 1.

A pedestrian detection method based on both feature types, HOG and Haar-like features, is proposed in [10]. The combined approach was suggested to take advantage of both features: greater detection precision from HOG and greatly reduced training time from the Haar-like features.
Liang et al. [11] proposed a multiple kernel approach for vehicle detection based on HOG and Haar-like features. The experimental results in [11] show that the vehicle detection system is 88% accurate using HOG features alone and 86% accurate using Haar-like features alone; when both features are combined, the overall accuracy increases to 91%, a significant improvement for the final trained classifier.

Figure 1: Haar-like features (a) Two-Rectangle features, (b) Three-Rectangle features, and (c) Four-Rectangle features

Figure 2 shows the Haar-like features extracted from the training dataset.

Krerngkamjornkit et al. [12] enhanced the Viola and Jones framework, experimenting on human body detection and tracking. Their work reports that the original Viola and Jones face detector is 70% accurate, which was increased to 84% in [12] by reducing the drawbacks of the framework.

Figure 2: Haar-like feature extraction

An investigation of the use of Haar-like features is provided in [13]. An improvement on the Viola-Jones algorithm was proposed in [14]. An improved approach to feature selection for the Viola-Jones framework is proposed in [15].

B. HOG FEATURES
HOG features were introduced by Dalal and Triggs in [6]. HOG stands for Histogram of Oriented Gradients. As the name suggests, these features are histograms of the gradient orientations of the pixels. The image gradient is the directional change in the intensity or color of an image, and it is used to extract information from images. HOG is suitable where classification is required on the basis of shape: it captures the overall appearance and silhouette of an object because it works on the edges and gradient directions in an image. In our case, to detect unconcealed guns, we need to

The MATLAB documentation on training a cascade object detector [16] highlights the significance of HOG features. According to [16], HOG features can be used to detect any type of object and are used extensively for creating custom classifiers. This is the major reason for using HOG features to detect unconcealed guns. On the other hand, the success of Haar-like features in detecting faces, humans, pedestrians, vehicles and stop signs directed our attention to this feature type for gun detection.


focus on its shape, which is the main reason for using HOG
features.


known as a Cascade Classifier. The boosted cascade classifier works on the concept of a sliding window. A small rectangular window is slid over the entire image, from left to right and from top to bottom. Each region covered by the window is passed through the stages in turn. If a stage decides that the region does not contain the object of interest, the region is marked as negative and discarded from further processing. If, on the other hand, the stage finds the object of interest, the region is passed on to the subsequent stage. The final detection result is provided by the last stage.
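The sliding-window procedure with early rejection can be sketched in Python as follows. This is only an illustration under stated assumptions: the two stage functions (`mean_above`, `min_above`) are hypothetical stand-ins for boosted stages of weak learners, and the 6x6 image is toy data rather than anything from our corpus.

```python
def sliding_windows(img_w, img_h, win_w, win_h, step):
    """Yield top-left corners, left to right and then top to bottom."""
    for top in range(0, img_h - win_h + 1, step):
        for left in range(0, img_w - win_w + 1, step):
            yield left, top


def run_cascade(window, stages):
    """Accept a window only if every stage accepts it."""
    for stage in stages:
        if not stage(window):
            return False  # rejected: later stages never see this window
    return True


def detect(image, win_w, win_h, step, stages):
    """Return the top-left corners of all windows that pass the cascade."""
    img_h, img_w = len(image), len(image[0])
    hits = []
    for left, top in sliding_windows(img_w, img_h, win_w, win_h, step):
        window = [row[left:left + win_w] for row in image[top:top + win_h]]
        if run_cascade(window, stages):
            hits.append((left, top))
    return hits


# Hypothetical stages: a cheap mean test first, a stricter minimum test second.
def mean_above(threshold):
    return lambda w: sum(map(sum, w)) / (len(w) * len(w[0])) > threshold


def min_above(threshold):
    return lambda w: min(map(min, w)) > threshold


# Toy 6x6 image with one bright 2x2 "object" at column 3, row 2.
image = [[0] * 6 for _ in range(6)]
for r in range(2, 4):
    for c in range(3, 5):
        image[r][c] = 10

hits = detect(image, 2, 2, 1, [mean_above(5), min_above(8)])
```

Placing the cheapest test first is the design choice that makes a cascade attractive: most background windows are discarded after very little computation.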

Now we briefly discuss the working mechanism of HOG feature extraction. An image is a collection of pixels. A cluster of adjacent pixels forms a spatial region known as a Cell, and adjacent cells are combined into a larger region called a Block. Neighboring blocks overlap one another by 50%. Horizontal and vertical gradients are calculated for each pixel of a cell, giving a gradient vector, and the gradient orientations are accumulated into a local histogram computed over the pixels of the cell. The histograms of the cells inside each block are concatenated to form the final feature vector. Experiments indicate that good results are obtained with 8x8 pixels per cell and 2x2 cells per block, as shown in Figure 3.
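The cell and block organization can be made concrete with a simplified Python sketch. It is not the MATLAB routine used in this work. The 9 orientation bins, the unsigned 0-180 degree range and the L2 block normalization are assumptions taken from common HOG practice rather than from the text above, and votes are placed in a single bin without interpolation to keep the example short.

```python
import math


def cell_histogram(gray, top, left, cell=8, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    rows, cols = len(gray), len(gray[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(top, top + cell):
        for c in range(left, left + cell):
            # Centred differences, clamped at the image border.
            gx = gray[r][min(c + 1, cols - 1)] - gray[r][max(c - 1, 0)]
            gy = gray[min(r + 1, rows - 1)][c] - gray[max(r - 1, 0)][c]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def block_descriptor(gray, top, left, cell=8, cells_per_block=2, bins=9):
    """Concatenate the cell histograms of one block and L2-normalise them."""
    vec = []
    for i in range(cells_per_block):
        for j in range(cells_per_block):
            vec += cell_histogram(gray, top + i * cell, left + j * cell, cell, bins)
    norm = math.sqrt(sum(v * v for v in vec)) + 1e-6
    return [v / norm for v in vec]


def hog(gray, cell=8, cells_per_block=2, bins=9):
    """Move the block one cell at a time (50% overlap for 2x2-cell blocks)."""
    block = cell * cells_per_block
    feats = []
    for top in range(0, len(gray) - block + 1, cell):
        for left in range(0, len(gray[0]) - block + 1, cell):
            feats += block_descriptor(gray, top, left, cell, cells_per_block, bins)
    return feats


# Toy 32x32 image (hypothetical data) containing a single vertical edge.
image = [[0] * 16 + [255] * 16 for _ in range(32)]
features = hog(image)  # 3 x 3 blocks, 4 cells each, 9 bins per cell
```

For the 32x32 toy image the descriptor has 3 x 3 x 4 x 9 = 324 entries, which shows how quickly the feature length grows with image size when small cells are used.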

In general, the greater the number of stages and the number of training images, the more accurate the classifier.
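The effect of the number of stages can be illustrated with a short calculation. A common approximation for cascades treats the stages as independent, so that the overall rates are the per-stage rates raised to the power of the number of stages. The sketch below applies this to the per-stage true positive rate of 99.5% and false alarm rate of 0.5% quoted in the methodology; the stage count of 20 is purely hypothetical, since the number of stages actually trained is not stated here.

```python
def cascade_rates(stage_tpr, stage_far, num_stages):
    """Overall rates of a cascade, assuming independent stages."""
    overall_tpr = stage_tpr ** num_stages
    overall_far = stage_far ** num_stages
    return overall_tpr, overall_far


# Per-stage rates from the training settings; 20 stages is an assumed value.
tpr, far = cascade_rates(0.995, 0.005, 20)
print(f"overall true positive rate: {tpr:.4f}")
print(f"overall false alarm rate:   {far:.3e}")
```

Under these assumptions each added stage drives the false alarm rate down sharply while costing only a little true positive rate, which is why deeper cascades are usually preferred when enough training images are available.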
V. TRAINING AND TEST CORPUS
We have created two separate classifiers, one based on HOG features and the other based on Haar-like features. To train a classifier, an image corpus is required that contains both Positive and Negative images. Positive images contain the object of interest, while Negative images do not.
The Cascade Classifier requires a large number of images for training. It is not invariant to scale and rotation, so to train it accurately the training dataset must contain images from all the directions in which detection is required. Taking this into consideration, we set our camera at a fixed distance and angle for collecting the dataset: about 7 ft. above the ground and at an angle of 65 degrees in the z-axis. All images were captured with these parameters under different lighting conditions. The dimensions of all the images are 640 x 480.

Figure 3: Block and Cell Organization

We have computed the HOG features from the training image dataset using MATLAB functions. The HOG features are shown in Figure 4.

The training corpus comprises 11,257 images, of which 5000 are Positive images and 6257 are Negative images. The testing corpus contains 350 Positive and 350 Negative images, i.e. 700 images in total.

The test dataset contains the best cases (i.e. images with a clear view of the gun and little occlusion) as well as the worst cases (i.e. images in which the gun is not clearly visible). The images were taken under different lighting conditions.


Figure 4: HOG features. (a) Input image, (b) HOG features using cell size
8x8, (c) HOG features by using cell size 32x32.

As the above figure shows, the HOG features capture the overall structure of an object. HOG features depend on the cell and block size, which must be chosen so that the shape remains clearly visible within the HOG feature set.

Tables 1 and 2 show the Training and Testing Corpus, respectively.
Table 1: Training Corpus

IV. CASCADE CLASSIFIER


This section briefly describes the basic working principle of the Cascade Classifier.
The cascade classifier comprises several stages, each of which is built from weak learners. The output of one stage is provided as the input to the next stage. That is why it is

Table 2: Testing Corpus


The training and testing datasets are the same for both classifiers.


Figures 5 and 6 show some sample Positive and Negative images from the image corpus, respectively.

Figure 5: Positive Images

Figure 6: Negative Images

VI. METHODOLOGY

In this section we discuss the methodology used to train the classifiers. First, the image dataset is collected as described in the previous section. In MATLAB's working directory, separate directories are created for the Positive and Negative training datasets.

The following steps are carried out to train the classifier:

i. Image Preprocessing
In this step, all the training images are converted into grayscale images. The images are then sharpened and blurred to add variety to the training images and thereby extend the image dataset.

ii. ROI Selection
The grayscale images are processed further to obtain the trained classifier. The complete image is not provided for feature extraction; instead, we select a particular Region of Interest (ROI) within the image and extract the features of that region. The ROI contains the gun in the image.

iii. Feature Extraction
After ROI selection, we extract the HOG features, which capture the overall silhouette of an object. Because the ROIs contain the unconcealed gun, extracting the HOG features captures the shape of the gun for further processing. In the same way, the Haar-like features are extracted from all the images to create the Haar feature based classifier.

iv. Training the Cascade Classifier
Finally, the extracted features are provided to the cascade classifier for training. Along with these features, the negative images are also provided to the classifier, so that the final classification model can distinguish between guns and other objects. The output of the training is an .xml file which contains all the information regarding the trained classifier. While training the classifier, the True Positive Rate is set to 99.5% and the False Alarm Rate is kept at 0.5%.

Figure 7: Training the Cascade Classifier

v. Testing the Trained Classifier
The performance of the trained model is then tested by providing the test images to the classifier.

VII. EXPERIMENTAL RESULTS

This section presents the test results of both trained classifiers.

Figures 8 and 9 depict the experimental results of the HOG and HAAR based classifiers, respectively.

Figure 8: Experimental Results of HOG based Classifier

Figure 9: Experimental Results of Haar based Classifier


The results in Figures 8 and 9 are striking: the two detectors behave completely differently. The results of the HOG based classifier appear more accurate, whereas the HAAR based classifier does detect the gun but also mistakenly detects many other objects. Now let us compare both classifiers on the basis of training time, as shown in Table 3.
Table 3: Training Time Duration

Figure 10: Evaluation Metrics of both Classifiers

Figure 11 shows the comparison of the Precision and Accuracy of both classifiers. It is quite clear that the Precision and Accuracy of the HAAR based classifier are almost half those of the HOG based classifier.

The HAAR feature based classifier took far longer to train than the HOG based classifier: 4.4 days against 2.5 hours. Despite this long training time, the performance of the HAAR based classifier is not satisfactory.
Now let us compare the classifiers on the basis of Evaluation Metrics. The comparison results are shown in Table 4. The accuracy and precision of the HAAR based classifier are almost half those of the HOG based classifier.
Table 4: Comparison on the basis of Evaluation Metrics

Evaluation Metrics     HOG Based Classifier   HAAR Based Classifier
True Positive Rate     81.14%                 84.28%
False Positive Rate    4%                     100%
True Negative Rate     96%                    0%
False Negative Rate    18.86%                 15.71%
Precision              95.30%                 45.73%
Accuracy               88.57%                 42.14%
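As a consistency check, the Precision and Accuracy rows of Table 4 can be recomputed from the true and false positive rates together with the test corpus size of 350 Positive and 350 Negative images. The Python sketch below uses the standard definitions; the function name is hypothetical, and the counts are obtained by multiplying the rates by the class sizes rather than taken from raw detection logs.

```python
def metrics_from_rates(tpr, fpr, positives, negatives):
    """Precision and accuracy from rates and class sizes."""
    tp = tpr * positives          # guns correctly detected
    fp = fpr * negatives          # non-gun images flagged as guns
    tn = negatives - fp           # non-gun images correctly rejected
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (positives + negatives)
    return precision, accuracy


# Rates from Table 4; 350 Positive and 350 Negative test images.
hog_precision, hog_accuracy = metrics_from_rates(0.8114, 0.04, 350, 350)
haar_precision, haar_accuracy = metrics_from_rates(0.8428, 1.00, 350, 350)

print(f"HOG : precision {hog_precision:.2%}, accuracy {hog_accuracy:.2%}")
print(f"HAAR: precision {haar_precision:.2%}, accuracy {haar_accuracy:.2%}")
```

The recomputed values agree with the table to within rounding. They also make plain why the HAAR based classifier scores poorly despite its slightly higher true positive rate: with a false positive rate of 100%, every Negative image is counted as a detection.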

Figure 11: Precision and Accuracy of both Classifiers

Figure 10 shows a graphical comparison of the Evaluation Metrics for both classifiers. For a classifier to work correctly, the true positive rate must be as high as possible and the false positive rate as low as possible. For the HAAR based classifier, however, the false positive rate is very high, which is the main reason for its failure.

VIII. CONCLUSION
This paper has proposed a security system based on Unconcealed Gun Detection. To detect any object, the two main issues are: i) feature selection and ii) classification. A number of feature types are available, each with its own importance and applications, and not every feature suits every application. The most challenging task is therefore to identify the feature type which best suits the application. For that purpose we studied various feature types and found that HOG and HAAR-like features have contributed greatly to the field of Pattern Recognition. Most recent object recognition algorithms make extensive use of both of these feature types.

HOG and Haar-like features work particularly well for face recognition. For Weapon Detection, however, the performance of both of these features is unknown. In this


paper we have provided a comparative study of both feature types for Unconcealed Weapon Detection.

According to the experimental results of Section VII, it is quite evident that HOG is the more suitable and robust feature type for unconcealed weapon detection in image processing.

We compared both classifiers on various parameters: training time, accuracy, precision, true positive rate, false positive rate, true negative rate and false negative rate. On every measure except the true positive rate and its complement, the false negative rate, the HOG based classifier is much better than the HAAR based classifier.

The HOG based classifier was trained in 2.5 hours, whereas the HAAR based classifier took approximately 4.4 days to train. The precision and accuracy of the HAAR based classifier are almost half those of the HOG based classifier.

In view of the above discussion and results, we conclude that Haar-like features do not work well for gun detection; the more suitable technique is HOG feature extraction.

FUTURE WORK
The future work is:
- To make the HOG based classifier more accurate.
- To include multiple gun types for detection.
- To implement the Gun Detector to improve the existing security systems.

REFERENCES

[1] Lowe, D. G., "Distinctive Image Features from Scale-Invariant Keypoints", International Journal of Computer Vision, Vol. 60, No. 2, pp. 91-110, 2004.

[2] Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L., "SURF: Speeded Up Robust Features", Computer Vision and Image Understanding (CVIU), Vol. 110, No. 3, pp. 346-359, 2008.

[3] Rosten, E., and Drummond, T., "Fusing Points and Lines for High Performance Tracking", Proceedings of the IEEE International Conference on Computer Vision, Vol. 2, pp. 1508-1511, October 2005.

[4] Nister, D., and Stewenius, H., "Linear Time Maximally Stable Extremal Regions", Lecture Notes in Computer Science, No. 5303, 10th European Conference on Computer Vision, Marseille, France, pp. 183-196, 2008.

[5] Leutenegger, S., Chli, M., and Siegwart, R., "BRISK: Binary Robust Invariant Scalable Keypoints", Proceedings of the IEEE International Conference, ICCV 2011, pp. 2548-2555.

[6] Dalal, N., and Triggs, B., "Histograms of Oriented Gradients for Human Detection", IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. 886-893, June 2005.

[7] Viola, P., and Jones, M., "Rapid Object Detection using a Boosted Cascade of Simple Features", Proceedings of the 2001 IEEE Computer Society Conference, Vol. 1, pp. 511-518, 15 April 2001.

[8] Ojala, T., Pietikainen, M., and Maenpaa, T., "Multiresolution Gray-scale and Rotation Invariant Texture Classification With Local Binary Patterns", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 7, pp. 971-987, July 2002.

[9] Jia, X. H., Zhang, J. Y., "Fast Human Detection by Boosting Histograms of Oriented Gradients", 4th International Conference on Image and Graphics, pp. 683-688, 2007.

[10] Xin, Y., Su Li, S., "A Combined Pedestrian Detection Method based on Haar-like Features and HOG Features", 3rd International Workshop on Intelligent Systems and Applications (ISA), pp. 1-4, 2011.

[11] Liang, P., Teodoro, G., Ling, H., Blasch, E., Chen, G., Bai, L., "Multiple Kernel Learning for Vehicle Detection in Wide Area Motion Imagery", 15th International Conference on Information Fusion (FUSION), pp. 1629-1636, July 2012.

[12] Krerngkamjornkit, R., Simic, M., "Enhancement of Human Body Detection and Tracking Algorithm based on Viola and Jones Framework", 11th International Conference on Telecommunication in Modern Satellite, Cable and Broadcasting Services (TELSIKS), Vol. 1, pp. 115-118, 2013.

[13] Peleshko, D., Soroka, K., "Research of Usage of Haar-like Features and AdaBoost Algorithm in Viola-Jones Method of Object Detection", 12th International Conference on the Experience of Designing and Application of CAD Systems in Microelectronics (CADSM), pp. 284-286, 2013.

[14] Li, Q., Niaz, U., Merialdo, B., "An Improved Algorithm on Viola-Jones Object Detector", 10th International Workshop on Content-based Multimedia Indexing (CBMI), pp. 1-6, 2012.

[15] Lang, R. S., Luerssen, H. M., Powers, W. M., "Evolutionary Feature Preselection for Viola-Jones Classifier Training", Spring Conference on Engineering and Technology (S-CET), pp. 1-4, 2012.

[16] mathworks.com, "Train a Cascade Object Detector", http://www.mathworks.com/help/vision/ug/train-a-cascade-object-detector.html, Accessed: 09 Nov, 2014.
