SYNOPSIS (Phase I)
Submitted by
PRASAD BANOTH
(Register No: CS0604)
MASTER OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING
(Distributed Computing Systems)
SYNOPSIS
AIM
In the first phase of this project, Knowledge Summarization is carried out through
clustering techniques. The summarized knowledge is stored in a Buffer/Database, from
which Content-Based Retrieval is performed so that medical researchers can carry out
predictive measures on Medical Image Data.
MOTIVATION
We are drowning in data but starving for knowledge. Data Mining Techniques
[1][2][3] can be a solution for discovering knowledge from large data sets. In medicine
especially, large volumes of data are available whereas the knowledge discovered from
them is minimal, so little predictive action is taken.
RELATED WORK
The goal of content-based image retrieval (CBIR) [8] is to retrieve images similar
[4] to an image/sketch provided by the user.
Very large collections of images are growing ever more common. From stock
photo collections and proprietary databases to the World Wide Web, these collections are
diverse and often poorly indexed.
IMAGE RETRIEVAL
SHAPE
Boundary-Based methods use only the border of the object shape and completely ignore
its interior. On the other hand, the Region-Based techniques take into account internal
details besides the boundary details.
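The difference between the two families can be illustrated with a minimal sketch. It assumes a shape is given as a binary mask (1 = object, 0 = background) and uses two deliberately simple descriptors chosen only for illustration: the number of boundary pixels and the region area. The function name `shape_descriptors` and both descriptors are hypothetical and are not taken from the cited methods.

```python
def shape_descriptors(mask):
    """Return (boundary_pixel_count, region_area) for a binary mask.

    mask is a list of rows containing 0 (background) or 1 (object).
    Both descriptors are illustrative stand-ins, not a published method.
    """
    rows, cols = len(mask), len(mask[0])

    def is_object(r, c):
        # Positions outside the image count as background.
        return 0 <= r < rows and 0 <= c < cols and mask[r][c] == 1

    boundary = 0
    area = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1:
                continue
            area += 1  # region-based: every object pixel contributes
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            # boundary-based: only pixels touching the background contribute
            if not all(is_object(nr, nc) for nr, nc in neighbours):
                boundary += 1
    return boundary, area


solid = [[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]]
hollow = [[1, 1, 1, 1],
          [1, 0, 0, 1],
          [1, 0, 0, 1],
          [1, 1, 1, 1]]

print(shape_descriptors(solid))
print(shape_descriptors(hollow))
```

In this toy case the boundary count is the same for the solid and the hollow square, while the area differs, which is the kind of interior detail that only a region-based descriptor captures.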
COLOR
Color [6] is a commonly used feature for realizing content-based image retrieval
(CBIR) [8]. Many CBIR approaches are based on the well-known and widely used
color histogram.
There are three main Color-Based approaches to Content-Based Image Retrieval:
• Global Color Histogram (GCH)
• GRID
• Color-Shape Histograms
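As a rough illustration of the first approach, the sketch below builds a Global Color Histogram from a list of RGB pixels and compares two images. The quantization into 4 bins per channel and the use of histogram intersection as the similarity measure are assumptions made for this example; the function names are hypothetical.

```python
def global_color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram over quantized RGB colors.

    pixels is a list of (r, g, b) tuples with values in 0..255.
    """
    if not pixels:
        raise ValueError("image has no pixels")
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins_per_channel-1.
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1
    return [count / len(pixels) for count in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


red_image = [(255, 0, 0)] * 4
mixed_image = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)]

h_red = global_color_histogram(red_image)
h_mixed = global_color_histogram(mixed_image)
print(histogram_intersection(h_red, h_red))
print(histogram_intersection(h_red, h_mixed))
```

Because the histogram is computed over the whole image, it ignores where colors occur. A grid-based variant would compute one such histogram per image cell instead.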
Given a query image Q, the region-matching algorithm retrieves images that contain
regions similar to those of Q, where objects of Q may appear in the target images in
scaled, translated, or color-shifted form.
The algorithm performs an image indexing phase in which images in the database are
indexed before images matching a given query image Q can be retrieved. Indexing is
done only once at the beginning, and again when new images are added to the database,
while the querying steps need to be repeated for each query image.
The steps involved in both indexing images and querying for similar images are:
• Generating signatures for sliding windows
• Clustering the sliding windows
• Region matching
• Image matching
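The first of these steps can be sketched as follows. The text does not specify how a window signature is computed, so this example assumes a grayscale image and uses the mean intensity of each window as a stand-in signature. The function name, the window size, and the stride are all hypothetical.

```python
def sliding_window_signatures(image, window=2, stride=1):
    """Return a list of ((row, col), signature) pairs.

    image is a list of rows of grayscale values. The signature here is the
    mean intensity of the window, an assumed placeholder for whatever
    signature the indexing phase actually uses.
    """
    rows, cols = len(image), len(image[0])
    signatures = []
    for r in range(0, rows - window + 1, stride):
        for c in range(0, cols - window + 1, stride):
            values = [image[r + dr][c + dc]
                      for dr in range(window)
                      for dc in range(window)]
            signatures.append(((r, c), sum(values) / len(values)))
    return signatures


image = [[0, 0, 0],
         [0, 4, 4],
         [0, 4, 4]]
signatures = sliding_window_signatures(image)
for position, signature in signatures:
    print(position, signature)
```

The resulting signatures are what the second step would cluster, so that windows with similar signatures form candidate regions.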
CLUSTERING
Clustering [3][10] of data is a method by which large sets of data are grouped into
clusters of smaller sets of similar data.
CLUSTERING ALGORITHMS
CLUSTERING TECHNIQUES
K-Means Method [3][10]: For Content-Based Image Retrieval as the first phase.
X-Means: Enhanced version of K-Means Method.
K-MEANS CLUSTERING
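[Figure: K-Means clustering flowchart. Start; choose the number of clusters K; determine the centroids; compute the distance of each object to the centroids; group objects based on minimum distance; if no object moved to another group, End; otherwise repeat from the centroid step.]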
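The loop in the flowchart can be sketched in a few lines. This is a minimal illustration rather than the project's Visual Basic implementation: the random choice of initial centroids, the `max_iter` and `seed` parameters, and the function name `k_means` are assumptions introduced here.

```python
import math
import random


def dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster points into k groups; returns (centroids, assignment)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialization: k random points
    assignment = [None] * len(points)
    for _ in range(max_iter):
        # Group each object with its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_assignment == assignment:
            break  # no object moved to another group
        assignment = new_assignment
        # Recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignment = k_means(points, k=2)
print(assignment)
```

The cluster numbers themselves are arbitrary; what matters is that the three points near the origin share one label and the three points near (10, 10) share the other.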
SPSS (originally, Statistical Package for the Social Sciences)[11] was released in
its first version in 1968, and is among the most widely used programs for statistical
analysis in social science. It is used by market researchers, health researchers, survey
companies, government, education researchers, and others. In addition to statistical
analysis, data management and data documentation are features of the base software.
A single DICOM file [12] contains both a header (which stores information about
the patient's name, the type of scan, image dimensions, etc.), as well as all of the image
data (which can contain information in three dimensions).
SYSTEM MODEL
In this proposal, the main goal is Content-Based Image Retrieval (CBIR). Here,
performance is enhanced through X-Means. In the first phase, a very large
collection of medical images is gathered from medical databases (DB) on the World
Wide Web. Retrieval from this collection is not fast because of the size of the images,
so at the next level the images are indexed and stored as a Knowledge Summarization
(KS). Content-Based Retrieval can then be performed on the KS from the output unit.
CBIRC ARCHITECTURE
[Figure: CBIRC architecture. Databases DB1, DB2, DB3, ..., DBN at the DB level feed a DB-level clustering process, which produces the Knowledge Summarization used for Content-Based Retrieval.]
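One way to picture how a clustered summary can speed up retrieval is sketched below. It assumes each image has already been reduced to a small feature vector and assigned a cluster label. The summary layout, the function names `summarize` and `retrieve`, and the file names are hypothetical and serve only to illustrate the idea of searching one cluster instead of the whole collection.

```python
import math


def dist(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def summarize(features, labels):
    """Build the summary: one centroid and one member list per cluster."""
    clusters = {}
    for image_id, label in labels.items():
        clusters.setdefault(label, []).append(image_id)
    summary = {}
    for label, ids in clusters.items():
        vectors = [features[i] for i in ids]
        centroid = tuple(sum(x) / len(vectors) for x in zip(*vectors))
        summary[label] = {"centroid": centroid, "members": ids}
    return summary


def retrieve(query, summary, features, top_n=2):
    """Pick the nearest cluster, then rank only that cluster's images."""
    best = min(summary.values(), key=lambda c: dist(query, c["centroid"]))
    ranked = sorted(best["members"], key=lambda i: dist(query, features[i]))
    return ranked[:top_n]


# Hypothetical two-dimensional features for four images.
features = {"a.dcm": (0.1, 0.2), "b.dcm": (0.2, 0.1),
            "c.dcm": (0.9, 0.8), "d.dcm": (0.8, 0.9)}
labels = {"a.dcm": 0, "b.dcm": 0, "c.dcm": 1, "d.dcm": 1}

summary = summarize(features, labels)
result = retrieve((0.85, 0.8), summary, features)
print(result)
```

The query is compared against one centroid per cluster first, so the detailed ranking touches only the members of the closest cluster. The trade-off is that a relevant image assigned to a different cluster would be missed.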
IMPLEMENTATION
Functions performed:
Private Sub Form_Load ()
Private Sub cmdReset_Click () ..... Reset data.
Private Sub txtNumCluster_Change () ..... Change number of clusters and reset data.
Private Sub Picture1_MouseDown () ..... Collect data and show result.
Private Sub Picture1_MouseMove
Sub kMeanCluster () ..... Main function to cluster data into k clusters.
Function dist ..... Calculate Euclidean distance.
Private Function min2 (num1, num2) ..... Return the minimum of two numbers.
RESULTS
When the user clicks the picture box to input a new data point (X, Y), the program
clusters the data by minimizing the sum of squared Euclidean distances between each
data point and its corresponding cluster centroid. Each dot represents an object, and its
coordinates (X, Y) represent two attributes of the object. The color of each dot and its
label number indicate its cluster.
Figure: 1.3 Sample input and output for K-Means Clustering algorithm.
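The quantity minimized by the program can be written compactly. The symbols below are notation introduced here for illustration: K is the number of clusters, C_j is the set of data points assigned to cluster j, and \mu_j is the centroid of that cluster.

```latex
J = \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

Each K-Means iteration reassigns points to their nearest centroid and then recomputes the centroids, and neither step increases J.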
CONCLUSION
Very large collections of images are growing ever more common. With the
proliferation of image data, the need to search and retrieve images efficiently and
accurately from a large image database [8][9] or a collection of image databases [8][9] has
drastically increased. The Shape [5][6][7] and Color [6] of an object play an important
role in image retrieval. To address such a demand, Content-Based Image Retrieval [8]
through Clustering (CBIRC) is proposed in the system. In the first phase, Knowledge
Summarization is carried out through Clustering Techniques [3][10], and the result is
stored in an interfacing unit that can act as a Buffer/Database [8][9], from which
Content-Based Retrieval [8] is performed so that medical researchers can carry out
predictive measures on Medical Image Data.
With the advances in image processing, information retrieval, and database
management, there have been extensive studies on content-based image retrieval (CBIR)
[8][9] for large image databases. CBIR systems retrieve images based on their visual
contents. Earlier efforts in CBIR research focused on effective feature
representations for images. The visual features of images, such as color [6], texture, and
shape [5][6][7], have been extensively explored to represent and index image
contents, resulting in a collection of research prototypes and commercial systems. To
address the need for efficient retrieval, Content-Based Image Retrieval through
Clustering (CBIRC) is proposed. This method provides database clustering [3][10] and
improves query processing by analyzing the summarized knowledge.
REFERENCES