
Mixed Query Image Retrieval System

Bingjing Cai, Chris Zheng, Sen Yang*, Jeffery Z. J. Zheng

School of Software, Yunnan University, Kunming, Yunnan, P.R. China, 650091 Email: bjcai117@gmail.com

Conjugate Systems Pty Ltd, 45 Greenways Road, Glen Waverley, Victoria 3150, Australia Email: veryhotsausage@gmail.com

* Department of CS&T, School of Information, Yunnan University, Kunming, Yunnan, P.R.China, 650091 Email: janssenkm@gmail.com

School of Software, Yunnan University, Kunming, Yunnan, P.R.China, 650091 Email: conjugate@tom.com

Abstract - This paper discusses a proposed mixed query image and text retrieval database that combines text-based search technology and image content-based search technology for more accurate results. The target application is large, categorized image sets, such as the image collections of libraries or patent offices. It can be shown that this mixed query image search engine can achieve better efficiency and higher quality than traditional text-based queries and content-based systems. This paper discusses the design principle of the mixed query image search engine, outlines the architecture, and shows results from an initial prototype database.

Index Terms - content-based image search technology, keyword-based image search technology, mixed query image retrieval system



A. Two Phases of Development of Image Retrieval System Technology

With the sudden explosion of the information age, all types of data are being produced at an enormous and still increasing rate. These data include large sets of image, sound, video and other multimedia data generated by cheap digital capture and storage facilities. Demand for processing and categorizing this rich multimedia information is growing, yet large amounts of data are being created without a way to classify or search through them. It is with this need in mind that image search engines have become one of the hottest and fastest growing areas of research and application development.

Currently, there are two ways of searching for an image: by keyword or by content. The first method was introduced with the development of text-based search technology; the second was introduced with content-based search technology.

Keyword-based image retrieval systems are based on traditional text-based retrieval technology. The concept is simple. The images are tagged with keywords and managed by the system. Users search for images via a textbox, and images tagged with the keyword are displayed. The search is easy but the tagging is laborious, because keywords have to be attached to each picture manually by the user. For most people, this way of organizing images is much too time consuming. If pictures are not tagged, they cannot be organized and data is lost. Furthermore, text labels do not represent the picture itself, only a very abstract description that differs between people and cultures. There is no standard for tagging, and the only dependable tagging comes from 'social tagging', an example being Flickr [1]. Even so, tags cannot represent the content of the picture, and the search results will be unsuitable in many cases.

Content-based search uses queries in the form of image objects rather than text tags. The principle behind content-based retrieval technology is to extract either meaning or measurement from within the picture itself. Image properties such as color, texture, shape and other qualities can be expressed as measurements that a computer can process. Once a picture can be defined quantitatively and qualitatively, it can be managed. The real attraction of content-based search is therefore that it promises the automation of image classification and search.

B. Research and development of image retrieval techniques

With the emergence of large scale image collections, content-based image retrieval was proposed. Since then, many techniques in this research direction have been developed and many image retrieval systems, both research and commercial, have been built. In the early 90s, IBM developed its first content-based image retrieval system, QBIC (Query By Image Content) [2]. It is the first commercial content-based image retrieval system, and its framework and techniques have had a profound effect on later image retrieval systems. Photobook [3] is a set of interactive tools for browsing and searching images developed at the MIT Media Lab. Photobook consists of three sub-books, from which shape, texture, and face features are extracted respectively. Users can then query based on the corresponding features in each of the three sub-books. Other content-based image retrieval systems include the VisualSEEK system [4] developed at Columbia University and the Like.com website [5] developed by the Riya team in the United States.

Like.com is one of the best true visual search engines: the contents of photos are used to search for and retrieve similar items. At launch it focuses on handbags, jewelry, shoes, and watches, allowing users to search for and purchase items from thousands of leading and boutique brands. It keeps classified image databases, such as separate image databases for handbags and for jewelry, and this makes its content-based image retrieval noticeably effective.

Currently, however, the popular model for image retrieval in most systems, such as Google, AltaVista and Yahoo, is still based on text-based search technology. Because the low-level features of images (color, shape, texture, etc.) do not represent the semantic information of the image, which means they do not tell what the image is, the results of content-based image retrieval are not always satisfying.

C. Existing problem and the objective of this paper

Currently, most researchers focus on improving and optimizing a particular kind of content-based image retrieval technique. Improving image content-based search techniques is certainly of great significance for obtaining accurate retrieval results in large image databases. However, an image retrieval system that applies only one kind of search technique, whether keyword-based or image content-based, cannot always obtain satisfactory results and meet users' requirements. An effective and efficient system architecture for image retrieval is needed, one that combines text-based search with image content-based search.

Therefore, in this paper, we discuss a method of improving image retrieval for large image databases. It combines keyword-based search and image content-based search technologies. We name it the Mixed Query Image Retrieval system. We present the system architecture and illustrate our system prototype. Our experiment demonstrated that the accuracy, effectiveness and efficiency of this mixed query image retrieval system are greatly enhanced.


In recent years, there has been a rapid increase in the size of digital image collections. Both military and civilian equipment generates gigabytes of images every day. For such large image databases, although image content-based search is faster and more accurate than text-based search, the retrieval results are not always ideal. For example, when we use an image content-based search engine to look for a picture of a red bus, the returned picture may be of a red house. An image content-based search engine extracts visual features from images, such as color, texture and shape, according to different feature extraction algorithms. It has neither human perception nor the ability to distinguish and identify the true meaning of images. The color, shape and texture of the house may be very similar to those of the red bus, but the house is not the result we expect. In a relatively small, categorized image database, however, an image content-based retrieval engine works better.

The design principle of the mixed query image retrieval system is as follows. First, we divide the original large image database into a number of relatively small image databases based on categories and establish keyword indexing for each category of the divided image databases. We then employ an image content-based search engine within each of the classified image databases.

A. Combination of Keyword-based Search and Image Content-based Search

There are two layers in the combination of text-based and image content-based search. First, the original large image database is divided by category into small image databases, and a keyword-on-category index associated with the small databases is established. Second, within every classified image database, both image content-based indexing and keyword-based indexing are established.

1) Classify image database

For a large image database, we can refer to the image content information or the descriptive information of the images to classify the database by category. A number of small image sub-databases are generated, each of which belongs to a certain category, for instance shoes or handbags. Take the image database of a patent office as an example: suppose that the number of patent pictures is about 5,000,000 and the number of patent categories is more than 40. If we divide the image database according to the standard international patent categories, a number of classified image sub-databases are generated, each containing about 100,000 to 200,000 images. In a smaller image database, both the image content-based and the text-based search engines can obtain more accurate retrieval results and work faster. The efficiency of the whole system is thereby enhanced and search results of higher quality can be obtained.

Classifying the original large image database generates a distributed cluster of sub-databases, as shown in figure 1. After classification, the structure of the image databases is similar to the category structure of library databases or the patent office IPC category structure [5]. Figure 2 shows an example of such a categorized structure.


Fig.1. Classify the large image database into small distributed image databases on categories


Fig.2. An example of classified image databases structure
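The classification step and the keyword-on-category index can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the record layout (`id`, `category`), the function names, and the sample keywords are all hypothetical.

```python
from collections import defaultdict


def partition_by_category(images):
    """Split a flat list of image records into per-category sub-databases."""
    sub_databases = defaultdict(list)
    for image in images:
        sub_databases[image["category"]].append(image)
    return dict(sub_databases)


def build_category_keyword_index(category_keywords):
    """Map each lower-cased keyword to the set of categories it names."""
    index = defaultdict(set)
    for category, keywords in category_keywords.items():
        for keyword in keywords:
            index[keyword.lower()].add(category)
    return dict(index)


def select_sub_databases(query, keyword_index, sub_databases):
    """Return the sub-databases whose category keywords occur in the query."""
    selected = {}
    for word in query.lower().split():
        for category in keyword_index.get(word, ()):
            if category in sub_databases:
                selected[category] = sub_databases[category]
    return selected


# Hypothetical sample data: three images in two categories.
images = [
    {"id": 1, "category": "shoes"},
    {"id": 2, "category": "handbags"},
    {"id": 3, "category": "shoes"},
]
category_keywords = {
    "shoes": ["shoe", "shoes", "sneaker"],
    "handbags": ["handbag", "bag"],
}

sub_databases = partition_by_category(images)
keyword_index = build_category_keyword_index(category_keywords)
selected = select_sub_databases("red sneaker", keyword_index, sub_databases)
```

Here the word "sneaker" routes the query to the shoes sub-database only, so any later content-based comparison runs over that category rather than over the whole collection.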

Then, according to these categories, we can employ traditional text-based search indexing techniques to establish a text index for the cluster of image databases. Thus, we can use keywords to search for a certain kind of image collection.

2) Establishing Content-based Indexing

After classifying the original large image database into relatively small image databases by category, we can apply image content-based search. Because the large image database has been divided into small, categorized sub-databases, the content-based search engine is more accurate and faster.

Feature (content) extraction is the basis of content-based image retrieval. In a broad sense, features may include both text-based features (keywords, annotations, etc.) and visual features (color, texture, shape, faces, etc.). Within the visual feature scope, features can be further classified as general features and domain-specific features. The former include color, texture and shape features, while the latter are application dependent and may include, for example, human faces and fingerprints. Feature extraction algorithms for color-based, texture-based and shape-based image retrieval are available and are under active discussion. Here, we propose mixed content-based image retrieval, combining color-based, texture-based and shape-based search, and our actual system uses such a mixed content-based image retrieval engine. Since this paper aims to introduce the method and architecture of the mixed query image retrieval system, we do not discuss content-based image retrieval techniques, including multi-dimensional indexing techniques, in detail.
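Since the paper does not specify how the color, texture and shape features are combined, the following Python sketch shows one common possibility, a weighted sum of per-feature Euclidean distances. The weights, the feature vectors and all names here are assumptions made for illustration, not the engine used in the actual system.

```python
import math

# Hypothetical weights; the paper does not state how features are balanced.
DEFAULT_WEIGHTS = {"color": 0.4, "texture": 0.3, "shape": 0.3}


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mixed_distance(query, candidate, weights=DEFAULT_WEIGHTS):
    """Weighted sum of the distances of the features named in `weights`."""
    return sum(
        weight * euclidean(query[name], candidate[name])
        for name, weight in weights.items()
    )


def rank(query, database, weights=DEFAULT_WEIGHTS):
    """Return database items ordered from most to least similar."""
    return sorted(
        database,
        key=lambda item: mixed_distance(query, item["features"], weights),
    )


# Toy feature vectors for the red bus / red house example (made-up values).
query = {"color": [1.0, 0.0, 0.0], "texture": [0.2, 0.8], "shape": [0.9, 0.1]}
database = [
    {"name": "red_house",
     "features": {"color": [1.0, 0.0, 0.0], "texture": [0.6, 0.4], "shape": [0.2, 0.8]}},
    {"name": "red_bus",
     "features": {"color": [0.9, 0.1, 0.0], "texture": [0.3, 0.7], "shape": [0.8, 0.2]}},
]

color_only = rank(query, database, {"color": 1.0})
mixed = rank(query, database)
```

With these toy values, ranking by color alone puts the red house first, while the mixed distance puts the red bus first, which is the kind of confusion that combining several features is meant to reduce.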


There are four parts in the mixed query image retrieval system: the user query interface, the management and control module, the text-based and image content-based indexing databases, and the image database.

A. User Query Interface

The user query interface should be friendly and flexible. To communicate with the user in a friendly manner, the query interface is graphics-based. The interface collects the information needed from users and displays the retrieval results back to them. Users can input category keywords to access a certain kind of image database, and then use digital image characteristics, a picture, or picture keywords to search within that image database. After processing, the system returns the results via the user query interface.

B. Management and Control Module

This module is the linchpin of the system, controlling and managing its operation. It receives the user request, determines whether the request is a text query or an image query, processes it accordingly, and then returns the results to the user.
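The dispatching role of this module can be sketched in Python as follows. The class and field names (`TextQuery`, `ImageQuery`, `ControlModule`) and the stub search functions are hypothetical and stand in for the real indexing back ends.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class TextQuery:
    keywords: str


@dataclass
class ImageQuery:
    pixels: bytes  # raw bytes of the example image supplied by the user


Query = Union[TextQuery, ImageQuery]


class ControlModule:
    """Routes each request to the text or the content-based search back end."""

    def __init__(self,
                 text_search: Callable[[str], List[str]],
                 image_search: Callable[[bytes], List[str]]):
        self._text_search = text_search
        self._image_search = image_search

    def handle(self, query: Query) -> List[str]:
        # Decide the request type first, then delegate to the matching engine.
        if isinstance(query, TextQuery):
            return self._text_search(query.keywords)
        if isinstance(query, ImageQuery):
            return self._image_search(query.pixels)
        raise TypeError(f"unsupported query type: {type(query).__name__}")


# Stub back ends that only report which engine was called.
control = ControlModule(
    text_search=lambda keywords: [f"text:{keywords}"],
    image_search=lambda pixels: [f"image:{len(pixels)} bytes"],
)

text_result = control.handle(TextQuery("handbag"))
image_result = control.handle(ImageQuery(b"\x00\x01\x02"))
```

Keeping the two search engines behind plain callables is one way to let the control module stay unchanged when an indexing back end is replaced.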

C. Text-based indexing and image content-based indexing


There are two parts: keyword-on-category indexing and image content-based indexing.

According to categories, a large image database is divided into relatively small classified image databases. A keyword-on-category index associated with the classified image databases is then established. After the management and control module has processed the text information, the system automatically passes control to the console of the corresponding image databases, according to the keyword-on-category indexing table.

The image content-based indexing database keeps image information and the established image indexing tables. After receiving a query request from the management and control module, the system searches the image database according to the indexing tables and then returns the result to the management and control module.

D. Database

The database contains two kinds of data, text and image. The text database keeps descriptive information related to the images. The image database contains the actual pictures. The system architecture is shown in figure 3.



Figure 4 shows the processing flow in the mixed query image retrieval system. The main processes are explained briefly below.

i) From the information that the user inputs, extract keywords based on category and, according to these keywords, select the corresponding cluster of indexing databases.

ii) Use either content-based search or text-based search in the selected image databases.

iii) Gather candidate results from every selected indexing database, re-sort them, and select the optimal set of results.

iv) According to the optimal set of image indexes, search the image databases and access the actual pictures.

v) Organize the result pages and return them to users.

Fig.3. System architecture of mixed query image retrieval system

Fig.4. Interactive Model of Mixed query image retrieval system
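The later part of the processing flow described above (gathering candidates from the selected indexing databases, re-sorting them, fetching the pictures and paging the output) might look roughly like the Python sketch below. The candidate format `(distance, image_id)`, the in-memory `image_store` and the function names are assumptions made for illustration.

```python
import heapq


def gather_and_rank(candidate_lists, top_k):
    """Merge (distance, image_id) candidates and keep the top_k closest."""
    merged = [candidate for candidates in candidate_lists for candidate in candidates]
    return heapq.nsmallest(top_k, merged)


def fetch_images(ranked, image_store):
    """Look up the actual pictures for the ranked image ids."""
    return [image_store[image_id] for _, image_id in ranked if image_id in image_store]


def build_result_pages(images, page_size):
    """Split the fetched pictures into pages of at most page_size items."""
    return [images[i:i + page_size] for i in range(0, len(images), page_size)]


# Hypothetical candidates returned by two selected indexing databases.
candidates_from_db1 = [(0.12, "a"), (0.40, "b")]
candidates_from_db2 = [(0.05, "c"), (0.33, "d")]
image_store = {"a": "a.jpg", "b": "b.jpg", "c": "c.jpg", "d": "d.jpg"}

ranked = gather_and_rank([candidates_from_db1, candidates_from_db2], top_k=3)
pages = build_result_pages(fetch_images(ranked, image_store), page_size=2)
```

In this toy run the three closest candidates are c, a and d, which are then returned as one page of two pictures and one page of one picture.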

In the current design, processes ii, iii and iv occupy most of the internal processing time of the entire search. In test queries, the processing time of process ii is less than 0.1 second, the processing time of process iii is less than 1 second, and the processing time of process iv is also less than 1 second. The whole process can be completed within 2 seconds. In actual operation, especially in a real internet environment, the network transmission speed of process v can be improved by outputting small size images.


According to the system architecture design in the previous section, we have developed a prototype of the mixed query image retrieval system, as shown in Fig. 5a and Fig. 5b.

Fig.5a. System prototype

Fig.5b. System prototype

VI. CONCLUSION



This paper proposes a method for enhancing the efficiency of image retrieval systems applied to large scale image collections. Categorized segmentation of a large image database is the core concept. We discuss a mixed query image retrieval system that combines image content-based and text-based search technologies, and we present the system architecture designed for this combined system together with the system prototype. Experiments confirm that the mixed query image retrieval system can manage a large image database better and achieve high efficiency and better quality.

In practice, many open issues still need to be solved before image retrieval systems can be put into use. To achieve faster retrieval and make an image retrieval system scalable to large image collections, multi-dimensional indexing techniques are of great importance. In this paper we discuss only one effective architecture for such systems. In conclusion, the integration of multiple techniques and of information sources from both humans and computers will lead to more successful image retrieval systems.


I would like to thank Tony Chen for editing my English writing.

REFERENCES

[1] Flickr web site, available at http://www.flickr.com



[3] Photobook system of MIT University home page, available at


[4] VisualSEEK system of The DVMM Lab at Columbia University, available














[5] LIKE.COM website, available at http://www.like.com

[6] China Intelligence and Patent Website, available at http://www.cnipr.com/

[7] Jeffrey Z. J. Zheng, Chris H. Zheng and Tosiyasu L. Kunii, “Concept Cell Model for Knowledge Representation”, International Journal of Information Acquisition, Vol. 1, No. 2, 149-168 (2004)

[8] [Beynon-Davies 1993] P. Beynon-Davies. “Information System Development”. The Macmillan Press, 1993.

[9] [Booch 1990] G. Booch. “Object-oriented Analysis and Design”. Addison Wesley, 1990.

[10] [Coad 1991] P. Coad and E. Yourdon. “Object-Oriented Analysis”. Prentice-Hall, 1991.

[11] [Dijkstra 1979] E. Dijkstra. “Programming Considered as a Human Activity.” Classics in Software Engineering. Yourdon Press, 1979.

[12] [Myers 1982] G. Myers. “Advances in Computer Architecture”. John Wiley and Sons, 1982.

[13] [Tsai 1988] J. Tsai and J. Ridge, “Intelligent Support for Specifications Transformation”. IEEE Software Vol. 5(6), p.34.

[14] [Yourdon 1979] E. Yourdon and L. Constantine, “Structured Design”. Prentice-Hall, Englewood-Cliffs, 1979.

[15] The google image search page, available at http://images.google.com

[16] Thomas Deselaers, Tobias Weyand, Daniel Keysers, Wolfgang Macherey and Hermann Ney, “FIRE in ImageCLEF 2005: Combining Content-based Image Retrieval with Textual Information Retrieval”, In Workshop of the Cross-Language Evaluation Forum (CLEF 2005), Lecture Notes in Computer Science, volume 4022, pages 652-661, Vienna, Austria, September 2005