Dr. C. R. Venugopal, Dept. of Electronics and Communication, Sri Jayachamarajendra College of Engg., Mysore, Pin: 570 006. E-mail: cr_venu@yahoo.com
1. Introduction
Image databases are becoming popular in many applications. One of the key issues in these areas is content-based image retrieval (CBIR), which helps users retrieve relevant images based on their contents. Color is one of the most dominant and distinguishing visual features used for CBIR. Using color to index and search images dates back to some of the early work on color histograms [6]. Since then, many variants of histogram indexing have been proposed. Although the color histogram is widely used as a color descriptor and is easy to compute, it results in large feature vectors that are difficult to index and lead to high search and retrieval costs. Several color descriptors have been proposed recently, and they try to incorporate spatial information to varying degrees. These include compact color moments [1], [2], binary color sets [3], color coherence vectors [4], and color correlograms [5]. The feature vector dimensions of typical color descriptors are quite large. The representative color descriptor is compact; it is based on the observation that a small number of colors is usually sufficient to characterize the color information in an image region. Since the descriptor captures the representative or dominant colors in a given region, we refer to it as the dominant or representative color descriptor. A Euclidean distance measure is used for the color descriptor.
query image and forwards the most similar images to the interface module. The database images are indexed according to their feature vectors to speed up retrieval and similarity computation. Note that both the data insertion and the query processing functionalities use the feature vector extraction module.
To speed up the evaluation of range queries, we describe a cluster-based indexing method. It reduces the number of candidate images, that is, the images to be indexed on which the optimal region-matching problem has to be solved. The procedure is as follows:
1. Given n, the number of query regions, for each query region $q_j$, find the regions belonging to cluster $c_j$, where $j = 1, \ldots, n$.
2. For each region $r_i$ in the image database:
   a. Find the feature vector $f_i$ for region $r_i$.
   b. For each query region $q_j$ in the query set:
      i. Find the query feature vector $f_j$ for $q_j$.
      ii. Find the Euclidean distance between $f_i$ and $f_j$ using
          $d_{ij} = \sqrt{\sum_{k=1}^{m} (f_{ik} - f_{jk})^2}$,
          where m is the dimension of the feature vector. This score is zero if the region features are identical and increases as the match becomes less perfect.
      iii. Measure the similarity between $f_i$ and $f_j$ using $\delta_{ij} = \varepsilon - d_{ij}$, where $\varepsilon$ is the search range limit set by the user.
      iv. If $\delta_{ij} \ge 0$, then $f_i$ belongs to cluster $c_j$; go to step 2.
After the completion of the above procedure, we index only the representatives of $c_j$, where $j = 1, \ldots, n$, using an R*-Tree. Once the user selects the query, we apply a range search on the tree, and the selected regions belonging to an image are retrieved as the resultant set. Members of the resultant set are ranked according to their overall score, and the best matches are returned in decreasing order of similarity along with their related information.
In the case of the R*-Tree, all the regions in the database are indexed. When we pose a query-by-example, the images selected within the range are displayed as the resultant set, ranked in descending order of similarity. In sequential search, all the regions stored in the database are compared for similarity and in turn retrieved for display, making it inefficient. All three methods yield good performance when the accuracy of the resultant set is considered, but the proposed method outperforms the other two.
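The cluster-assignment steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `euclidean` and `assign_clusters`, the use of RGB triples as feature vectors, and the sample values are hypothetical and not taken from the paper. The instruction "go to step 2" is read here as "move on to the next database region once a cluster is found". The R*-Tree indexing of cluster representatives is not shown.

```python
import math


def euclidean(f_i, f_j):
    """Euclidean distance between two feature vectors of equal dimension m."""
    if len(f_i) != len(f_j):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_i, f_j)))


def assign_clusters(db_features, query_features, eps):
    """Assign each database region to the first query cluster within range eps.

    Returns a dict mapping query index j to a list of (region index, distance)
    pairs, sorted by increasing distance (i.e. decreasing similarity).
    """
    clusters = {j: [] for j in range(len(query_features))}
    for i, f_i in enumerate(db_features):
        for j, f_j in enumerate(query_features):
            d_ij = euclidean(f_i, f_j)
            # Region i belongs to cluster j when eps - d_ij >= 0.
            if eps - d_ij >= 0:
                clusters[j].append((i, d_ij))
                break  # assumption: continue with the next database region
    for members in clusters.values():
        members.sort(key=lambda pair: pair[1])
    return clusters


# Hypothetical data: dominant colors of four database regions and two queries.
db_features = [(255, 0, 0), (250, 5, 5), (0, 0, 255), (0, 255, 0)]
query_features = [(255, 0, 0), (0, 0, 250)]

clusters = assign_clusters(db_features, query_features, eps=10.0)
for j, members in clusters.items():
    print(j, members)
```

In this toy run the two reddish regions fall into the cluster of the first query, the blue region falls into the cluster of the second query, and the green region lies outside the search range of both, so it is never considered as a candidate.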
3. Experimental Results
The representative color descriptor is tested on a database of 200 color flag images and a database of 1200 Ground Truth images belonging to 20 categories. After segmentation, 440 regions are obtained from the flag images and 1300 regions from the Ground Truth images. Among them, 13 and 67 image regions containing a variety of colors are chosen as queries from the flag and Ground Truth databases, respectively. To determine the relevant matches in the database for each query image region, a subjective test is carried out before evaluation. The time taken by the proposed method (Cluster-Index) and the other methods (R*-Tree and sequential search) is shown in Fig. 3(a) and 3(b). As the graphs show, the proposed method is efficient, requiring less time. The retrieval accuracy is measured by precision and recall. The average recall versus average precision graph for the Ground Truth image queries is plotted in Figure 4 for all three methods. In general, a more effective system shows a higher precision at all values of recall. We can observe that the proposed method achieves good results in terms of retrieval accuracy. Figure 2 shows a snapshot of region-based image retrieval. The retrievals in the example show a good match to the query images.
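For reference, precision and recall for a single query can be computed as in the following minimal sketch. It uses the standard definitions (precision is the fraction of retrieved items that are relevant, recall is the fraction of relevant items that are retrieved). The function name `precision_recall` and the region IDs are hypothetical, chosen only for illustration; the paper does not give its evaluation code.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: IDs of the regions returned by the system
    relevant:  IDs judged relevant (e.g. by the subjective test)
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: 4 regions retrieved, 5 regions judged relevant, 3 in common.
precision, recall = precision_recall([3, 7, 9, 12], [3, 7, 12, 20, 25])
print(precision, recall)  # 0.75 0.6
```

Averaging these per-query values over all queries of a category gives the kind of per-category averages plotted against each other in a recall versus precision graph.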
Figure 3. Time in seconds versus number of retrievals: (a) Flags database; (b) Ground Truth image database. [Plot "Time complexity of Indexing Algorithms": Avg. Time(s) versus Avg. Number of Retrievals for Cluster_Index, R*-Tree_Index and Seqn_search.]
[Plot "Avg. Recall v/s avg. Precision per category": Avg. Precision versus Avg. Recall for Cluster_Index, R*-Tree_Index and Seqn_Search.]
4. Conclusions
A cluster-based R*-Tree indexing method for efficient retrieval of images is proposed and discussed. The technique is tested on a natural image database, a flag image database and a ground truth database as applications. The mean shift algorithm is used for segmentation to obtain regions of interest and so improve the effectiveness of the retrieval system. Experimental results show that the proposed method gives better performance in terms of efficiency and accuracy of retrieval. A query-by-example based toolbox, IMAGE, is implemented in Java for database manipulation and retrieval. As a further extension of the proposed work, the system should be tested on a larger database.
Figure 4. Average Recall versus average Precision per category of Ground Truth database.
5. References
[1] M. A. Stricker and M. Orengo, Similarity of color images, Proc. SPIE, Storage Retrieval Still Image Video Databases IV, vol. 2420, 1996, pp. 381-392.
[2] M. Stricker and A. Dimai, Color indexing with weak spatial constraints, Proc. SPIE, Storage Retrieval Still Image Video Databases IV, vol. 2670, 1996, pp. 29-40.
[3] J. Smith and S.-F. Chang, Tools and techniques for color image retrieval, Proc. SPIE, vol. 2670, 1996.
[4] G. Pass and R. Zabih, Histogram refinement for content based image retrieval, Proc. IEEE Workshop Applications Computer Vision, 1996, pp. 96-102.
[5] J. Huang, S. R. Kumar, M. Mitra, W. Zhu, and R. Zabih, Image indexing using color correlograms, Proc. IEEE Conf. Computer Vision and Pattern Recognition, 1997, pp. 762-768.
[6] W. Y. Ma and B. Manjunath, NeTra: A toolbox for navigating large image databases, Proc. IEEE Int. Conf. Image Processing, 1997, pp. 56-71.