
Proceedings of 2019 IEEE International Conference on Mechatronics and Automation
August 4-7, Tianjin, China

A New Saliency Object Extraction Algorithm Based on Itti's Model and Region Growing

Yunwei Jia1,2, Chenxiang Hao1,2* and Kun Wang1,2
1. Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology.
2. National Demonstration Center for Experimental Mechanical and Electrical Engineering Education (Tianjin University of Technology).
Tianjin, 300384, China
15122305129@163.com

Abstract - A new method based on Itti's model is proposed in order to extract saliency objects as completely as possible. It combines the advantages of Itti's model and region growing. Firstly, early visual features such as intensity, color and orientation are extracted from the input static color image. Secondly, three conspicuity maps are created from these early features. Thirdly, thresholds based on the three conspicuity maps are set respectively in order to obtain the initial seed points. Fourthly, three maps are attained after region growing. Finally, the three maps obtained by region growing are combined into a saliency map. A series of experiments has demonstrated that the algorithm is effective. Compared with Itti's model, the precision, the recall rate and the F-measure of the saliency object extraction by the proposed algorithm are obviously improved.

Index Terms - Region growing. Itti's model. Saliency map. Feature extraction.

I. INTRODUCTION

The human eye can quickly select the part of interest from the input scene and pass it to the brain for processing. This is due to the information selection strategy of the human visual system, which uses the attention mechanism to guide the eye to focus on saliency regions in massive data [1].

Visual attention is a process of using the extracted useful information to locate, identify, and understand objects in the scene. It allows the vision system to analyze important regions first and less important regions afterwards. In order to simulate how the human eye prioritizes the regions to be processed, many models of visual attention have been proposed. These models are divided into two types, bottom-up processes and top-down processes [2-3]. The bottom-up process is mainly based on low-level features such as color, orientation and intensity, and on the relationships among these features, to obtain saliency regions. The top-down process not only uses the low-level features, but also uses prior knowledge, such as scenes, targets, textures, colors, and their interrelationships, which makes it very susceptible to people's subjective consciousness. Therefore, the bottom-up process is the foundation of visual attention.

Itti's model, which was proposed by L. Itti et al., is a typical bottom-up algorithm [4]. It creates the saliency map by using three types of features: color, orientation and intensity. The Itti model is widely accepted and applied to machine vision problems such as target detection [5-8], image compression encoding [9], and rendering of dynamic environments [10].

In all these papers, Itti's model was used to extract the saliency map or the saliency positions of a given image, but most of the saliency objects in the saliency map are not complete. This increases the difficulty of the follow-up processing. Therefore, it is very important to extract saliency objects as completely as possible.

Fig. 1 shows the processing results of Itti's model. It can be seen that not only are the saliency objects incompletely extracted, but some error points outside the saliency objects are extracted as well. In other words, both the false positives and the false negatives are obvious. Taking the first image as an example, no complete saliency object is detected. If we take the signpost as the saliency object, there are some background points in the saliency map, and the complete signpost cannot be seen in the saliency map. If we take the white clouds as the saliency objects, the result is the same. In order to solve this problem, an improved method based on Itti's model is proposed. The flow diagram of the new method is shown in Fig. 2. The detailed description of the theory is given in Section III.

Fig. 1 The saliency map obtained by Itti's model. The first column is the input image, and the second column is the experimental results.



Fig. 2 The flow diagram for extracting the saliency map by the new method.

II. THE MAIN IDEA OF THE PROPOSED METHOD

In this paper, Itti's model is used to obtain the conspicuity maps of color, orientation and intensity. Then the initial seed points are selected by threshold segmentation, and the final saliency map is obtained by region growing.

A. Principle of Region Growing
Region growing is a common image segmentation method based on the similarity of the points in a certain region. Starting from the initial seed points, adjacent points with similar characteristics are merged into the current region, so that the region grows gradually until no more points can be merged [11]. The difficulties of region growing mainly lie in the selection of the seed points, the rules of region growing, and the conditions to terminate the growth [12].

B. Selection of Seed Points
The initial seed points are selected by threshold segmentation [13]. The detailed steps are as follows:
Step 1: Itti's model is used to obtain the conspicuity maps of orientation, color and intensity, and the mean value u of the points of each conspicuity map is calculated:

u = (Σ_{i=1}^{n} x_i) / n    (1)

where x_i denotes the response value of each point in the conspicuity map and n denotes the total number of points in the conspicuity map.
Step 2: The mean absolute deviation β is obtained by the following formula [14]:

β = (Σ_{i=1}^{n} |x_i − u|) / n    (2)

Step 3: The threshold is set as

T = u + 3 × β    (3)

The selected seed points must be representative. According to the 3σ rule, when a set of data follows the normal distribution, the probability that a value is greater than (μ + 3 × σ) is only 0.13% [15], where μ denotes the average of the data and σ denotes its standard deviation. Formula (1) gives the average value and formula (2) gives an approximation of the standard deviation, so when T is used as the threshold the number of selected seeds is relatively small and the seeds are representative.
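As an illustration of this seeding step, the following Python sketch (an assumption of this edit, not the authors' code; the paper's experiments were run in MATLAB) computes u, β and T for one conspicuity map and returns the seed mask.

```python
import numpy as np

def select_seed_points(conspicuity_map):
    """Select initial seed points from a conspicuity map by thresholding.

    Implements formulas (1)-(3): u is the mean response, beta the mean
    absolute deviation, and T = u + 3*beta is the seed threshold.
    """
    x = conspicuity_map.astype(np.float64)
    u = x.mean()                      # formula (1)
    beta = np.abs(x - u).mean()       # formula (2)
    T = u + 3.0 * beta                # formula (3)
    return x > T                      # boolean mask of seed points

# Example: a synthetic 64x64 map with a small block of strong responses
demo = np.zeros((64, 64))
demo[30:34, 30:34] = 1.0
seeds = select_seed_points(demo)
print("number of seeds:", int(seeds.sum()))
```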
C. Rules of Region Growing
It has been found that the relationship among R, G and B is relatively stable when the illumination changes. Although the R, G and B values of an object often change because of uneven illumination, the relationship among them is usually consistent for one object. Therefore, the relationship among R, G and B is used for region growing in this paper.

III. SPECIFIC STEPS OF THE PROPOSED ALGORITHM

1) Input a static color image that includes three color channels: r (red), g (green), b (blue).
2) According to the three color channels r, g and b, the intensity channel is computed as:

I = (r + g + b) / 3    (4)

3) Obtain the 9-layer pyramids of the intensity channel, the red channel, the green channel, and the blue channel respectively by sub-sampling with a Gaussian filter.
4) Set the r, g, b values of a point to 0 if the intensity of this point is less than MI/10, where MI represents the maximum value of the intensity of the current layer.
5) Get the 9-layer pyramids of the four broadly-tuned color channels R, G, B and Y by formulas (5)-(8):

R = r − (g + b)/2    (5)
G = g − (r + b)/2    (6)
B = b − (r + g)/2    (7)
Y = (r + g)/2 − |r − g|/2 − b    (8)
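Steps 1)-5) above can be sketched in Python as follows. This is only a minimal illustration under the assumption that OpenCV's cv2.pyrDown is used for the Gaussian sub-sampling; the paper's own implementation is in MATLAB, and the function names here are illustrative.

```python
import numpy as np
import cv2  # assumed dependency; cv2.pyrDown applies a Gaussian blur and downsamples

def gaussian_pyramid(channel, levels=9):
    """Step 3): 9-layer Gaussian pyramid of one channel."""
    pyr = [channel]
    for _ in range(levels - 1):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def broadly_tuned_layer(r, g, b):
    """Steps 2), 4), 5) for one pyramid layer: intensity, masking, and R, G, B, Y."""
    I = (r + g + b) / 3.0                          # formula (4)
    mask = I >= I.max() / 10.0                     # step 4): suppress points below MI/10
    r, g, b = r * mask, g * mask, b * mask
    R = r - (g + b) / 2.0                          # formula (5)
    G = g - (r + b) / 2.0                          # formula (6)
    B = b - (r + g) / 2.0                          # formula (7)
    Y = (r + g) / 2.0 - np.abs(r - g) / 2.0 - b    # formula (8)
    return I, R, G, B, Y

def build_pyramids(img_bgr):
    """Return per-layer (I, R, G, B, Y) tuples for a float BGR image.

    Here I is derived layer by layer from the r, g, b pyramids; since both the
    intensity and the Gaussian sub-sampling are linear, this matches building
    the intensity pyramid separately as in step 3).
    """
    b, g, r = [img_bgr[:, :, k].astype(np.float64) for k in range(3)]
    layers = zip(gaussian_pyramid(r), gaussian_pyramid(g), gaussian_pyramid(b))
    return [broadly_tuned_layer(rl, gl, bl) for rl, gl, bl in layers]
```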

6) The color feature maps are obtained by formulas (9)-(12):

RG(c, s) = |(R(c) − G(c)) Θ (G(s) − R(s))|    (9)
BY(c, s) = |(B(c) − Y(c)) Θ (B(s) − Y(s))|    (10)
c ∈ {3, 4, 5}    (11)
s = c + δ, δ ∈ {1, 2}    (12)

where Θ denotes matrix subtraction after the two images have been adjusted to the same size. In this step, 12 color feature maps are obtained.
7) The feature maps of orientation and the feature maps of intensity are obtained by formulas (13) and (15) respectively:

O(c, s, θ) = |O(c, θ) Θ O(s, θ)|    (13)
θ ∈ {0, π/4, π/2, 3π/4}    (14)
I(c, s) = |I(c) Θ I(s)|    (15)

where O(c, s, θ) denotes a feature map of orientation and I(c, s) denotes a feature map of intensity. It can be seen that there are 6 intensity feature maps and 24 orientation feature maps.
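The across-scale Θ operation and the color feature maps of formulas (9)-(12) can be sketched as follows in Python. This is an illustration only, assuming OpenCV's resize for the size adjustment; the orientation maps of formula (13) use the same operator on Gabor-filtered pyramids, which are not shown here.

```python
import numpy as np
import cv2  # assumed dependency for resizing

def center_surround(center_map, surround_map):
    """Θ operator: bring the coarse (surround) map to the size of the fine
    (center) map, subtract point by point, and take the absolute value."""
    h, w = center_map.shape
    surround_up = cv2.resize(surround_map.astype(np.float32), (w, h),
                             interpolation=cv2.INTER_LINEAR)
    return np.abs(center_map - surround_up)

def color_feature_maps(pyr_R, pyr_G, pyr_B, pyr_Y):
    """Formulas (9)-(12): the 12 color feature maps RG(c, s) and BY(c, s)."""
    rg_maps, by_maps = [], []
    for c in (3, 4, 5):
        for delta in (1, 2):
            s = c + delta
            rg_maps.append(center_surround(pyr_R[c] - pyr_G[c], pyr_G[s] - pyr_R[s]))
            by_maps.append(center_surround(pyr_B[c] - pyr_Y[c], pyr_B[s] - pyr_Y[s]))
    return rg_maps, by_maps
```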
8) All the feature maps are normalized by the operator N(.) in order to increase the contrast of the conspicuity maps. First, normalize the input feature map to a unified range [0, Mf] (Mf is set to 10 in this paper). Second, calculate the mean value m of all the local maxima other than the global maximum. Then multiply the whole feature map by (Mf − m)².
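A possible Python sketch of the normalization operator N(.) is given below, under the assumption that local maxima are taken over 3x3 neighbourhoods; the paper does not specify the neighbourhood size.

```python
import numpy as np
from scipy.ndimage import maximum_filter  # assumed dependency

def normalize_map(feature_map, Mf=10.0):
    """Operator N(.): rescale to [0, Mf], then multiply by (Mf - m)^2, where m
    is the mean of the local maxima other than the global maximum."""
    fmap = feature_map.astype(np.float64)
    fmap -= fmap.min()
    if fmap.max() > 0:
        fmap *= Mf / fmap.max()                      # first step: unified range [0, Mf]
    local_max = (fmap == maximum_filter(fmap, size=3)) & (fmap > 0)
    peaks = fmap[local_max]
    peaks = peaks[peaks < fmap.max()]                # drop the global maximum
    m = peaks.mean() if peaks.size else 0.0          # second step: mean of other local maxima
    return fmap * (Mf - m) ** 2                      # third step: amplify maps with few strong peaks
```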
9) The conspicuity maps are computed by formulas (16)-(18):

S_I = ⊕_{c=3}^{5} ⊕_{s=c+δ, δ∈{1,2}} N(I(c, s))    (16)
S_C = ⊕_{c=3}^{5} ⊕_{s=c+δ, δ∈{1,2}} N(RG(c, s) + BY(c, s))    (17)
S_O = Σ_{θ∈{0, π/4, π/2, 3π/4}} N( ⊕_{c=3}^{5} ⊕_{s=c+δ, δ∈{1,2}} N(O(c, s, θ)) )    (18)

The intensity conspicuity map, the color conspicuity map and the orientation conspicuity map are represented by S_I, S_C and S_O respectively. The across-scale combination, denoted ⊕, is obtained by point-by-point addition after the maps are adjusted to the same size.
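The ⊕ combination can be sketched in Python as below. The choice of the common size is an assumption of this illustration (the paper does not state which layer the maps are resized to), and the normalization N(.) is the one sketched above.

```python
import numpy as np
import cv2  # assumed dependency for resizing

def across_scale_add(maps, target_shape):
    """⊕ operator: resize every map to a common size and add point by point."""
    h, w = target_shape
    total = np.zeros((h, w), dtype=np.float32)
    for m in maps:
        total += cv2.resize(m.astype(np.float32), (w, h), interpolation=cv2.INTER_LINEAR)
    return total

# Formula (16), assuming `intensity_maps` holds the six I(c, s) maps and
# normalize_map is the N(.) sketch above:
# S_I = across_scale_add([normalize_map(m) for m in intensity_maps],
#                        target_shape=intensity_maps[0].shape)
```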
10) Select the initial seed points of the three conspicuity maps respectively, according to the method described in part B of Section II.
11) Region growing is applied in the third layer of the pyramid based on the color feature. The third layer is used in this paper because it offers a good trade-off between accuracy and real-time performance. The detailed steps are as follows:
Step 1: In the third layer of the broadly-tuned color channels, count the occurrences of the main color among the eight points adjacent to the seed point. If R>G and R>B, then R is the main color of the point and x is increased by 1. If G>B and G>R, then G is the main color of the point and y is increased by 1. If B>G and B>R, then B is the main color of the point and z is increased by 1.
Step 2: The rule of region growing is chosen as the main color with the largest count. For example, in the case of x>y>z, the rule of region growing is R>G and R>B.
Step 3: Three maps are obtained by region growing.
12) The saliency map is obtained by linearly combining the three maps obtained in step 11).
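The dominant-color voting and the growth itself could look like the following Python sketch. The paper specifies only the voting rule, so the 8-connected breadth-first growth and the merging criterion used here are assumptions of this illustration.

```python
import numpy as np
from collections import deque

def choose_growing_rule(R, G, B, seed):
    """Steps 1-2 of 11): vote over the 8 neighbours of a seed for the main
    color and return the dominant channel index (0=R, 1=G, 2=B)."""
    i, j = seed
    votes = np.zeros(3, dtype=int)  # the x, y, z counters of the paper
    h, w = R.shape
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            ni, nj = i + di, j + dj
            if (di, dj) != (0, 0) and 0 <= ni < h and 0 <= nj < w:
                votes[int(np.argmax((R[ni, nj], G[ni, nj], B[ni, nj])))] += 1
    return int(np.argmax(votes))

def grow_region(R, G, B, seeds):
    """Step 3 of 11): grow each seed by merging 8-connected points whose main
    color matches the seed's dominant color."""
    h, w = R.shape
    grown = np.zeros((h, w), dtype=bool)
    stacked = np.stack([R, G, B])
    queue = deque()
    for seed in seeds:
        rule = choose_growing_rule(R, G, B, seed)
        grown[seed] = True
        queue.append((seed, rule))
    while queue:
        (i, j), rule = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w and not grown[ni, nj]:
                    if int(np.argmax(stacked[:, ni, nj])) == rule:
                        grown[ni, nj] = True
                        queue.append(((ni, nj), rule))
    return grown
```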

IV. EXPERIMENTS AND RESULTS

30 images were selected randomly from the MSRA dataset [16], which has been widely used for experiments in many papers. MATLAB 2015b was used as the platform for the experiments. The operating system of the PC is Windows 10, with an Intel Core i5 processor and 8 GB of memory.
Fig. 3 shows some experimental results. It can be seen that the saliency objects extracted by the proposed method are more complete than those extracted by Itti's model. Furthermore, almost no background was mistaken for the saliency object. From Fig. 3, we can also see that some points of the saliency objects are still missing with the proposed algorithm. The main reason is that the rules of region growing only consider the color feature. This can be resolved by follow-up processing.
The precision, the recall rate and the F-measure are used to quantitatively analyze the validity of the proposed algorithm. They are calculated by formulas (19)-(21) respectively:

pre = n_so / n_s    (19)
rec = n_so / n_o    (20)
fms = (2 × pre × rec) / (pre + rec)    (21)

where n_so represents the number of saliency points of the saliency object in the saliency map, n_s represents the total number of saliency points in the saliency map, and n_o represents the total number of points of the saliency object. pre, rec and fms represent the precision, the recall rate and the F-measure of the saliency object extraction respectively. The F-measure evaluates the quality of the obtained saliency map by combining the values of pre and rec.
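For completeness, a small Python sketch of formulas (19)-(21) computed from binary masks is given below; the function name and the mask convention are assumptions made for this illustration.

```python
import numpy as np

def precision_recall_fmeasure(saliency_mask, ground_truth_mask):
    """Formulas (19)-(21) from binary masks.

    saliency_mask: points marked salient by the algorithm.
    ground_truth_mask: points belonging to the true saliency object.
    """
    n_so = np.logical_and(saliency_mask, ground_truth_mask).sum()  # salient points on the object
    n_s = saliency_mask.sum()                                      # all salient points
    n_o = ground_truth_mask.sum()                                  # all object points
    pre = n_so / n_s if n_s else 0.0                               # formula (19)
    rec = n_so / n_o if n_o else 0.0                               # formula (20)
    fms = (2 * pre * rec / (pre + rec)) if (pre + rec) else 0.0    # formula (21)
    return pre, rec, fms
```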


The averages of the precision, the recall rate and the F-measure over these 30 images are shown in Fig. 4. It can be seen that the proposed method performs much better than Itti's model.
More details of the performance of the proposed algorithm are shown in Fig. 5 - Fig. 7. It can be seen that more than 90% of the saliency maps obtained by the proposed algorithm are significantly improved, and the objects extracted by the proposed algorithm are more complete than those of Itti's model. However, for some images with complex backgrounds the new method is not ideal, and the main reason is that the growing rules are not strict enough. Therefore, the following research should focus on the rules for region growing.

Fig. 3 Saliency map. (1) Original image. (2) Saliency object extracted by Photoshop. (3) Saliency object extracted by Itti's model. (4) Saliency object extracted by the new method.
Fig. 4 The average result of precision, recall rate and F-measure (Itti model vs. the new method).
Fig. 5 The precision of these 30 images.
Fig. 6 The recall rate of these 30 images.
Fig. 7 The F-measure of these 30 images.

V. CONCLUSION

This paper proposed a new saliency object extraction algorithm based on Itti's model and region growing. Experimental results show that the proposed algorithm reduces both false negatives and false positives compared with Itti's model. It not only obviously increases the completeness of the extracted saliency object, but also reduces the cases in which background is mistakenly extracted as the saliency object. For more than 90% of the images, the saliency object extracted by the proposed algorithm was better than that of Itti's model in terms of precision, recall rate and F-measure. However, for some images with complex backgrounds the results are not good. The reason may be that the rules of region growing are too simple. That is the focus of future study.

ACKNOWLEDGMENT

This paper was supported by the National Natural Science Foundation of China (No. 61201081) and the Tianjin Natural Science Foundation (No. 18JCYBJC19300).

REFERENCES
[1] Y. Dong, M.-T. Pourazad, P. Nasiopoulos, “Human Visual System-
Based Saliency Detection for High Dynamic Range Content,” IEEE
Transactions on Multimedia, vol. 18, no. 4, pp. 549-562, 2016.
[2] A. Borji, D. Sihite, L. Itti, “Quantitative analysis of human-model
agreement in visual saliency modeling: A comparative study,” IEEE
Trans. Image Process, vol. 22, no. 1, pp. 55-69, January 2013.
[3] N.-R. Barai, M. Kyan, D. Androutsos, “Human visual system inspired
saliency guided edge preserving tone-mapping for high dynamic range
imaging,” 2017 IEEE International Conference on Image Processing
(ICIP), 2017.
[4] L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual
attention for rapid scene analysis,” IEEE Trans. Pattern Analysis and
Machine Intelligence, vol. 20, no. 11, pp. 1254-1259, 1998.
[5] G. Fei, Z. Ye, J. Wang, et al., “Visual Attention Model Based Vehicle
Target Detection in Synthetic Aperture Radar Images: A Novel
Approach,” Cognitive Computation, vol. 7, no. 4, pp. 434-444, 2015.
[6] N. Xian, “Comparative study of swarm intelligence-based saliency
computation,” International Journal of Intelligent Computing &
Cybernetics, vol. 10, no. 3, pp. 348-361, 2017.
[7] S. Liu, J. Huang, Z. Liu, “An ESS Target Detection Method Based on
Itti's Saliency Map and Maximum Saliency Density,” Electronics Optics
& Control, vol. 22, no. 12, pp. 9-14, 2015.
[8] L. Liu, X. Yu, B. Dong, “A Fast Segmentation Algorithm of PET Images
Based on Visual Saliency Model,” Procedia Computer Science, vol. 92,
pp. 361-370, 2016.
[9] J. Zhang, L. Sui, L. Zhuo, et al., “An approach of bag-of-words based on
visual attention model for pornographic images recognition in
compressed domain,” Neurocomputing, vol. 110, no. 8, pp. 145-152,
2013.
[10] X. Hou, O. Sourina, “Stable adaptive algorithm for Six Degrees-of-
Freedom haptic rendering in a dynamic environment,” Visual Computer,
vol. 29, no. 10, pp. 1063-1075, 2013.
[11] B.-K. Chakraborty, M.-K. Bhuyan, S. Kumar, “Combining image and
global pixel distribution model for skin colour segmentation,” Pattern
Recognition Letters, vol.88, no. 3, pp. 33-40, 2017.
[12] S. Yang, H. Xu, Z. Tang, “Infrared target extraction method based on
visual saliency and regional growth,” Journal of Shenyang Aerospace
University, vol. 35, no. 4, pp. 85-89, 2018.
[13] J.-P. Sarkar, I. Saha, U. Maulik, “Rough possibilistic type-2 fuzzy C-
means clustering for MR brain image segmentation,” Applied Soft
Computing, vol. 46, pp. 527, 2016.
[14] R. R. Yager, “A note on mean absolute deviation,” Information Sciences,
vol. 279, 2014.
[15] Y. Jia, S. Sun, L. Yang, D. Wang, “Salient space detection algorithm for
signal extraction from contaminated and distorted spectrum,” Analyst,
vol. 143, pp. 2656-2664, 2018.
[16] T. Liu, J. Sun, N. Zheng, X. Tang, and H.-Y. Shum, “Learning to detect
a salient object,” Computer Vision and Pattern Recognition, pp. 1-8, 2007.

