Abstract - A new method based on Itti's model is proposed in order to extract saliency objects as completely as possible. It combines the advantages of Itti's model and region growing. Firstly, early visual features such as intensity, color and orientation are extracted from the input static color image. Secondly, three conspicuity maps are created according to the early features. Thirdly, in order to obtain the initial seed points, thresholds based on the three conspicuity maps are set respectively. Fourthly, three maps after region growing are attained. Finally, the three maps based on region growing are combined into a saliency map. A series of experiments has demonstrated that the algorithm is effective. Compared with Itti's model, the precision, the recall rate and the F-measure of the saliency object extraction by the proposed algorithm are obviously improved.

Index Terms - Region growing. Itti's model. Saliency map. Feature extraction.

I. INTRODUCTION

The human eye can quickly select the part of interest from the input scene and input it into the brain for processing. This is because of the information-selection strategy of the human visual system, which uses the attention mechanism of the human visual system to guide the human eye to focus on saliency regions in massive data [1].

Visual attention is a process of using the extracted useful information to locate, identify, and understand objects in the scene. It allows the vision system to prioritize important regions, and only then analyze regions that are not important. In order to simulate how the human eye assigns priority to the regions to be processed, many models of visual attention have been proposed. These models are divided into two types, bottom-up processes and top-down processes [2-3]. The bottom-up process is mainly based on low-level features such as color, orientation, intensity, and the relationships among these features to obtain saliency regions. The top-down process not only uses the low-level features, but also uses prior knowledge, such as scenes, targets, textures, colors, and their interrelationships, which is very susceptible to people's subjective consciousness. Therefore, the bottom-up process is the foundation of visual attention.

Itti's model, which was proposed by L. Itti et al., is a typical algorithm of the bottom-up process [4]. It creates the saliency map by using three types of features: color, orientation and intensity. The Itti model is widely accepted and applied to machine vision problems such as target detection [5-8], image compression encoding [9], and rendering of the dynamic environment [10].

In all these papers, Itti's model was used to extract the saliency map or the saliency position of a given image, but most of the saliency objects in the saliency map are not complete. This increases the difficulty of the follow-up processing. Therefore, it is very important to extract saliency objects as completely as possible.

Fig. 1 shows the processing results of Itti's model. It can be seen that not only are the saliency objects not completely extracted, but some erroneous points outside the saliency objects are also extracted. In other words, both the false positives and the false negatives are obvious. Taking the first image as an example, no complete saliency object is detected. If we take the signpost as a saliency object, there are some background points in the saliency map, and the complete signpost cannot be seen from the saliency map. If we take the white clouds as saliency objects, the result is the same. In order to solve this problem, an improved method based on Itti's model is proposed. The flow diagram of the new method is shown in Fig. 2. The detailed description of the theory is in section III.

Fig. 1 The saliency map obtained by Itti's model. The first column is the input image, and the second column is the experimental results.
β = (1/n) Σ_i |x_i − u|   (2)

Step 3: The threshold is set as

T = u + 3 × β   (3)
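The seed-point selection above can be sketched as follows. Note that the definition of u (formula (1)) falls outside this excerpt, so it is assumed here to be the mean of the conspicuity map, with β the mean absolute deviation of formula (2); the function name is illustrative, not the authors' implementation:

```python
import numpy as np

def select_seed_points(conspicuity, k=3.0):
    """Threshold a conspicuity map to obtain initial seed points."""
    x = np.asarray(conspicuity, dtype=float)
    u = x.mean()                    # assumed formula (1): mean value
    beta = np.abs(x - u).mean()     # formula (2): mean absolute deviation
    T = u + k * beta                # formula (3): threshold, k = 3
    return x > T                    # boolean mask of seed points
```

Only points well above the map's typical response survive the threshold; for example, a single bright peak in an otherwise flat map is selected as the only seed.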
Y = (r + g)/2 − |r − g|/2 − b   (8)

6) Then the color maps are obtained by formulas (9)-(12):

RG(c, s) = |(R(c) − G(c)) Θ (G(s) − R(s))|   (9)

BY(c, s) = |(B(c) − Y(c)) Θ (B(s) − Y(s))|   (10)

c ∈ {3, 4, 5}   (11)

s = c + δ, δ ∈ {1, 2}   (12)

where Θ denotes matrix subtraction after the two images are adjusted to the same size. In this step, 12 color feature maps are obtained.

7) Then the feature maps of orientation and the feature maps of intensity are obtained by formulas (13) and (15) respectively:

O(c, s, θ) = |O(c, θ) Θ O(s, θ)|   (13)

θ ∈ {0, π/4, π/2, 3π/4}   (14)

I(c, s) = |I(c) Θ I(s)|   (15)

where O(c, s, θ) denotes the feature map of orientation and I(c, s) denotes the feature map of intensity. It can be seen that there are 6 intensity feature maps and 24 orientation feature maps.

8) All the feature maps are normalized by N(·) in order to increase the contrast of the conspicuity maps. First, normalize the input feature map to a unified range [0, Mf] (Mf is set to 10 in this paper). Second, calculate the mean value m of all the other local maxima. Then multiply the whole feature map by (Mf − m)².

9) The conspicuity maps are computed by formulas (16)-(18):

S_I = ⊕_{c=3..5} ⊕_{s=c+δ, δ∈{1,2}} N(I(c, s))   (16)

S_C = ⊕_{c=3..5} ⊕_{s=c+δ, δ∈{1,2}} N(RG(c, s) + BY(c, s))   (17)

S_O = ⊕_{θ∈{0, π/4, π/2, 3π/4}} N(⊕_{c=3..5} ⊕_{s=c+δ, δ∈{1,2}} N(O(c, s, θ)))   (18)

The intensity conspicuity map, the color conspicuity map and the orientation conspicuity map are respectively represented by S_I, S_C and S_O. The conspicuity maps are obtained by point-by-point addition, which is denoted ⊕.

10) Select the initial seed points of the three conspicuity maps respectively, according to the method described in part B of section II.

11) Region growing is applied in the third layer of the pyramid, based on the color feature. The third layer is used in this paper because it gives good performance in both accuracy and real-time processing. The detailed steps are as follows:

Step 1: In the third layer of the broadly-tuned color channel, count the occurrences of the main color among the eight points adjacent to the seed point. If R > G and R > B, then R is the main color for this point, and x is increased by 1. If G > B and G > R, then G is the main color for this point, and y is increased by 1. If B > G and B > R, then B is the main color for this point, and z is increased by 1.

Step 2: The rule of region growing is chosen according to the main color with the largest count. For example, in the case of x > y > z, the rule of region growing is R > G and R > B.

Step 3: Three maps are obtained by region growing.

12) The saliency map can be obtained by linearly combining the three maps obtained in step 11).

IV. EXPERIMENTS AND RESULTS

30 images were selected randomly from the MSRA dataset [16], which has been widely used in many papers for experiments. MATLAB 2015b was used as the platform for the experiments. The operating system of the PC is Windows 10, and the processor is an Intel Core i5 with 8 GB of memory.

Fig. 3 shows some experimental results. It can be seen that the saliency objects extracted by the proposed method are more complete than those extracted by Itti's model. Furthermore, almost no background was mistaken for the saliency object.

From Fig. 3, we can also see that some points of the saliency objects are still missing with the proposed algorithm. The main reason is that the rules of region growing only consider the color feature. This can be resolved by follow-up processing.

The precision, the recall rate and the F-measure are used to quantitatively analyze the validity of the proposed algorithm. They can be calculated by formulas (19)-(21) respectively:

pre = n_so / n_s   (19)

rec = n_so / n_o   (20)

fms = 2 × pre × rec / (pre + rec)   (21)

where n_so represents the number of saliency points of the saliency object in the saliency map, n_s represents the total number of saliency points in the saliency map, and n_o represents
the total number of the points of the saliency object. pre, rec and fms represent the precision, the recall rate and the F-measure of the saliency object extraction, respectively.

[Figure: the average of the precision, the average of the recall rate and the average of the F-measure]
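Formulas (19)-(21) translate directly into code; the function name below is illustrative, a minimal sketch rather than the authors' implementation:

```python
def extraction_metrics(n_so, n_s, n_o):
    """Precision, recall rate and F-measure of saliency object extraction.

    n_so: saliency points belonging to the saliency object in the map,
    n_s:  total saliency points in the saliency map,
    n_o:  total points of the saliency object (ground truth).
    """
    pre = n_so / n_s                    # formula (19)
    rec = n_so / n_o                    # formula (20)
    fms = 2 * pre * rec / (pre + rec)   # formula (21): harmonic mean
    return pre, rec, fms
```

For example, a saliency map with 100 detected points, 80 of which fall on a 160-point object, gives pre = 0.8 and rec = 0.5.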
Fig. 3 Saliency map. (1) Original image. (2) Saliency object extracted by Photoshop. (3) Saliency object extracted by Itti's model. (4) Saliency object extracted by the new method.

ACKNOWLEDGMENT

This paper was supported by the National Natural Science Foundation of China (NO. 61201081) and the Tianjin Natural Science Foundation (NO. 18JCYBJC19300).
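As a concrete illustration of the main-color region growing described in step 11) of section III, consider the following sketch. It is a simplified illustration under stated assumptions: an H×W×3 RGB array stands in for the third pyramid layer of the broadly-tuned color channel, a single seed is grown with 8-connectivity (the paper does not specify the traversal), and the function and variable names are hypothetical:

```python
import numpy as np
from collections import deque

# Offsets of the eight neighbours of a pixel.
NEIGH = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
         (0, 1), (1, -1), (1, 0), (1, 1)]

def grow_from_seed(img, seed):
    """Sketch of main-color region growing from one seed point."""
    h, w, _ = img.shape
    r0, c0 = seed

    # Step 1: count the main color of the eight points adjacent to the
    # seed (x, y, z in the paper become counts["R"], ["G"], ["B"]).
    counts = {"R": 0, "G": 0, "B": 0}
    for dr, dc in NEIGH:
        r, c = r0 + dr, c0 + dc
        if not (0 <= r < h and 0 <= c < w):
            continue
        rr, gg, bb = img[r, c]
        if rr > gg and rr > bb:
            counts["R"] += 1
        elif gg > bb and gg > rr:
            counts["G"] += 1
        elif bb > gg and bb > rr:
            counts["B"] += 1

    # Step 2: the growing rule is the main color with the largest count,
    # e.g. for x > y > z the rule is R > G and R > B.
    idx = {"R": 0, "G": 1, "B": 2}[max(counts, key=counts.get)]

    def rule(px):
        others = [px[i] for i in range(3) if i != idx]
        return px[idx] > others[0] and px[idx] > others[1]

    # Step 3: grow the region breadth-first, accepting every connected
    # pixel that satisfies the main-color rule.
    grown = np.zeros((h, w), dtype=bool)
    grown[seed] = True
    queue = deque([seed])
    while queue:
        pr, pc = queue.popleft()
        for dr, dc in NEIGH:
            r, c = pr + dr, pc + dc
            if 0 <= r < h and 0 <= c < w and not grown[r, c] and rule(img[r, c]):
                grown[r, c] = True
                queue.append((r, c))
    return grown
```

Running this growth from the seed points of each of the three conspicuity maps would yield the three grown maps that step 12) combines linearly.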
REFERENCES
[1] Y. Dong, M.-T. Pourazad, P. Nasiopoulos, “Human Visual System-
Based Saliency Detection for High Dynamic Range Content,” IEEE
Transactions on Multimedia, vol. 18, no. 4, pp. 549-562, 2016.
[2] A. Borji, D. Sihite, L. Itti, “Quantitative analysis of human-model
agreement in visual saliency modeling: A comparative study,” IEEE
Trans. Image Process, vol. 22, no. 1, pp. 55-69, January 2013.
[3] N.-R. Barai, M. Kyan, D. Androutsos, “Human visual system inspired
saliency guided edge preserving tone-mapping for high dynamic range
imaging,” 2017 IEEE International Conference on Image Processing
(ICIP), 2017.
[4] L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual
attention for rapid scene analysis,” IEEE Trans. Pattern Analysis and
Machine Intelligence, vol. 20, no. 11, pp. 1254-1259, 1998.
[5] G. Fei, Z. Ye, J. Wang, et al, “Visual Attention Model Based Vehicle
Target Detection in Synthetic Aperture Radar Images: A Novel
Approach,” Cognitive Computation, vol. 7, no. 4, pp. 434-444, 2015.
[6] N. Xian, “Comparative study of swarm intelligence-based saliency
computation,” International Journal of Intelligent Computing &
Cybernetics, vol. 10, no. 3, pp. 348-361, 2017.
[7] S. Liu, J. Huang, Z. Liu, “An ESS Target Detection Method Based on
Itti's Saliency Map and Maximum Saliency Density,” Electronics Optics
& Control, vol. 22, no. 12, pp. 9-14, 2015.
[8] L. Liu, X. Yu, B. Dong, “A Fast Segmentation Algorithm of PET Images
Based on Visual Saliency Model,” Procedia Computer Science, vol. 92,
pp. 361-370, 2016.
[9] J. Zhang, L. Sui, L. Zhuo, et al, “An approach of bag-of-words based on
visual attention model for pornographic images recognition in
compressed domain,” Neurocomputing, vol. 110, no. 8, pp. 145-152,
2013.
[10] X. Hou, O. Sourina, “Stable adaptive algorithm for Six Degrees-of-
Freedom haptic rendering in a dynamic environment,” Visual Computer,
vol. 29, no. 10, pp. 1063-1075, 2013.
[11] B.-K. Chakraborty, M.-K. Bhuyan, S. Kumar, “Combining image and
global pixel distribution model for skin colour segmentation,” Pattern
Recognition Letters, vol.88, no. 3, pp. 33-40, 2017.
[12] S. Yang, H. Xu, Z. Tang, “Infrared target extraction method based on
visual saliency and regional growth,” Journal of Shenyang Aerospace
University, vol. 35, no. 4, pp. 85-89, 2018.
[13] J.-P. Sarkar, I. Saha, U. Maulik, “Rough possibilistic type-2 fuzzy C-
means clustering for MR brain image segmentation,” Applied Soft
Computing, vol. 46, pp. 527, 2016.
[14] R. Ronald, N.-A. Yager, “A note on mean absolute deviation,”
Information Sciences, vol. 279, 2014.
[15] Y. Jia, S. Sun, L. Yang, D. Wang, “Salient space detection algorithm for
signal extraction from contaminated and distorted spectrum,” Analyst,
vol. 143, pp. 2656-2664, 2018.
[16] T. Liu, J. Sun, N. Zheng, X. Tang, and H. -Y. Shum, “Learning to detect
a salient object,” Computer Vision and Pattern Recognition, pp. 1-8, 2007.