Académique Documents
Professionnel Documents
Culture Documents
Abstract – Fourier Transform Infrared Spectroscopy (FTIR) order to establish whether it is indeed possible to
is becoming a powerful tool for use in the study of biomedical distinguish normal and abnormal cells from analysis of the
conditions, including cancer diagnosis. As part of an ongoing entire spectra. If the characteristic spectra of abnormal and
programme of research into the potential early diagnosis of normal tissue components are known, it may be possible to
cervical cancer, Hierarchical Cluster Analysis (HCA) and
compare the spectra in each cluster to these reference
Fuzzy C-Means (FCM) have been applied to distinguish
FTIR spectra obtained from cancerous and non-cancerous spectra and hence achieve accurate diagnosis.
cells. In recent experimentation on non pre-processed FTIR In previous clinical work [6], FTIR spectral data was
spectra data, the FCM method has been shown to achieve first empirically pre-processed, using such techniques as
significantly better results. Nevertheless, two limitations mean-centring and variance scaling, and then various data
apply to this technique. Firstly, a priori assumption of the clustering techniques were compared to the manually
number of clusters is needed. This is a problem because, in obtained classifications. This showed that correct
general, this is not known in medical diagnosis. The other clustering could only be achieved by applying pre-
limitation is that it involves a greedy local search processing techniques which were hand-tuned to the
methodology such that sub-optimal solutions may be returned
particular sample characteristics. Moreover, these pre-
and, thus, misdiagnosis could occur. Bandyopadhyay [8] has
recently proposed a Simulated Annealing Fuzzy Clustering processing procedures need extra tools, time and expertise
algorithm (SAFC) which can avoid these two limitations. which are undesirable and, indeed, not practical in a
However, when we implemented the proposed algorithm, it clinical environment. Following this, a comparison was
was found that sub-optimal solutions could be obtained in made of FCM and HCA in classifying non pre-processed
certain circumstances. In this paper, we extend the SAFC FTIR oral cancer data [7], in which the FCM method was
algorithm to overcome this difficulty and apply this modified found to perform significantly better. However, in this
version to the classification of seven sets of FTIR spectra data study it was necessary to use the (previously established)
which have been taken from three oral cancer patients. With actual number of clusters as an input parameter to FCM
no prior specification of cluster number, our modified SAFC
clustering, Obviously, in a fully automated diagnostic tool,
algorithm is shown to obtain the correct (clinical)
classification of clusters in 4 out of 7 data sets. In the such cluster numbers would not be know a priori. Efficient
remaining 3 data sets it produces a number of clusters which, clustering techniques are required to automatically detect
while differing from the clinical classification, appear to the number of disease categories or different levels of
better match the underlying data when subjectively visualised disease.
using Principal Component Analysis (PCA). In this paper an algorithm that extends upon the ideas
presented by Bandyopadhyay for fuzzy clustering [8] is
I. INTRODUCTION developed that is able to automatically detect the number
of clusters. A detailed discussion of the standard FCM
Cancers have become one of the major health problems algorithm and the proposed extensions is presented in
that humankind endures and thus research into diagnosis section II and section III. Simulated annealing is combined
methods and treatment has become an important research with the conventional FCM method to guide a search
area for the scientific community. In Britain, more than through all cluster configurations. The Xie-Beni validity
one in three people will be diagnosed with cancer during measure (XB) [9] is used to optimise the number of
their lifetime and one in four will die from cancer. If clusters and their locales. The results show that the
cancers can be diagnosed earlier then patients can begin proposed simulated annealing fuzzy clustering (SAFC)
treatment programmes quicker and, ultimately, a lower algorithm can produce better clustering than the standard
mortality rate can be achieved. FTIR is becoming a FCM method as used in our previous work [7].
valuable tool for use in the study of a wide variety of
biomedical conditions, including the diagnosis of cancer II. BACKGROUND
and other diseases [1-5]. It is based on the differential
absorption of a range of wavelengths of Infrared Radiation Fuzzy clustering has been widely used in medical
(IR) dependent on the molecular structures and bonds of diagnosis, pattern recognition, exploratory data analysis
the substance(s) the IR passes through. Thus, this and image segmentation. The most prominent fuzzy
technology can detect changes in the chemical composition clustering algorithm used is the Fuzzy C-Means (FCM), a
of cell nuclei that can indicate the early onset of cancer. fuzzification of K-Means or ISODATA. It was originally
Unfortunately, it has also been established that ‘simple’ introduced by Bezdek in 1981 [10] as an extension to
measures, such as peak height(s) or peak location(s) are Dunn’s algorithm [11]. Nevertheless there are limitations
not sufficient to permit accurate diagnosis. As a precursor to the FCM technique. Firstly, it needs a priori assumption
to a diagnostic system, an investigation was carried out in on the number of clusters which is not practical in real
medical diagnosis applications; secondly, it is a nonconvex 1
method [12], so may often lead to local minimum solutions If d ij ≠ 0 µ ij = 2
(5)
c d ij
and hence misdiagnosis could occur. In order to overcome
these drawbacks, the SAFC algorithm was proposed by ∑(d
k =1
) m −1
ik
Bandyopadhyay [8]. It avoids these limitations in the FCM
technique and provides good quality solutions with Else µ ij = 1
accurate clusters.
5) Repeat step 2) to 4) until the minimum J value is
A. Fuzzy C-Means achieved.
∑µ
at that equilibrium. Therefore, the probability of accepting
ij =1, ∀i = 1,...n (3) a worse energy level is greater when T is high and smaller
j =1
when T is low. By gradually decreasing the temperature
and repeating the Metropolis simulation, new energy levels
The Euclidean distance between xi and v j is represented will be achieved until no more improvements are possible.
The first attempt to bring simulated annealing into
by || xi − v j || . The parameter m is used to control the optimisation problems was by Kirkpatrick et al [14] who
fuzziness of membership of each datum, m > 1. There is no used simulated annealing as a new optimisation search
theoretical basis for the optimal selection of m, but a value paradigm to escape local optima and hopefully converge to
of m = 2.0 is usually chosen. U = ( µ ij ) n*c is a fuzzy the global optimum. Since this time, simulated annealing
has been used on a wide range of combinatorial
partition matrix and V = {v1 , v2 ,...vc } is a set of cluster optimisation problems and achieved good results. Al-
centres. FCM is described by the following steps [10]: Sultan and Selim [15, 16] and Klein and Dubes [17] have
developed algorithms based on simulated annealing to find
the global minimum solution using FCM and other crisp
1) Initialize membership matrix µ ij with random value,
(non-fuzzy) clustering methods. However the (fixed)
make sure it satisfies conditions (2) and (3). number of clusters is a prerequisite parameter in both these
techniques. Maulik and Bandyopadhyay [18] developed a
2) Compute the fuzzy centres v j using fuzzy clustering technique which combines a genetic
n
algorithm and FCM clustering to automatically segment
∑ (µ ij ) m xi satellite images obtained by remote sensing.
vj = i =1
n
, ∀j = 1,..., c (4) III. THE PROPOSED ALGORITHM
∑ (µ
i =1
ij ) m
3) Calculate the new distance A good clustering has the properties that data points
within the same cluster are compact and data points in
d ij =|| xi − v j ||, ∀i = 1,..., n, ∀j = 1,...c different clusters are separated. Xie and Beni [9] created a
fuzzy clustering criterion based on this idea, referred to as
4) Update the fuzzy membership µ ij the XB validity measure. Let S represent the compactness
and separation validity function. It can be defined as
shown in the following equation:
c n centre respectively, and d = 1,...N , where N is the
∑∑ µ
j =1 i =1
ij
2
|| xi − v j ||2 number of dimensions. Perturb Centre can then be
S= (6) expressed as: vnew [ d ] = vcurrent [ d ] + cr . The parameter
n min ij || vi − v j ||2
perturb_rate was set through initial experimentation.
The parameters are the same as in section II A. In a
π
simpler form, it can be expressed as: S = , where π is
2) Split Centre
s As opposed to Perturb Centre, Split Centre will pick up
the compactness of the fuzzy c-partition of the data set. the centre of the biggest cluster and split it into two
c n
∑∑ µ
centres. This is performed by choosing a data point from
|| xi − v j || 2
2
ij the data set as a reference point, expressed as wchosen [d ] .
j =1 i =1
The π= component of equation There are two kinds of situation that must be taken into
n account when choosing the reference data point. The first
(6) measures the compactness of each class. The smaller
exists when the boundary between the biggest cluster and
π indicates more compact the clusters are. s is called the the other clusters are not obvious, that is to say there are
separation of the fuzzy c-partition of the data set. some data points whose membership degree to the chosen
s = (d min ) 2 , where d min = min ij || vi − v j || is the centre is close to 0.5. In this case, the data point which has
minimum distance between cluster centres. If s is bigger, membership degree that is smaller than but closest to 0.5 is
then the clusters are separated. Therefore, a smaller S chosen as the reference point. The second situation occurs
indicates all of the clusters are more compact and with when the biggest cluster is clearly separate from the other
greater separation from each other. clusters. This means that there exist some data points
The corresponding process of simulated annealing and which have a proportionally large membership degree to
SAFC is shown in Table I. This table shows the XB the biggest cluster centre (at least greater than 0.5) and all
validity function value as the energy associated with a of the other data points have membership degrees that are
configuration. In other words, it is used as an evaluation ‘small’. In our experimentation, we define ‘small’ as a
function to measure how the FTIR oral cancer data sets are membership degree which is less than 0.4. In this case, the
classified. There are two reasons for choosing the XB mean value of all the membership degrees above 0.5 is
validity measure as the evaluation function. Firstly, the calculated and the data point that has a membership degree
FCM algorithm needs a priori assumption of number of closest to the mean is chosen as the reference point. The
clusters. However, in the SAFC algorithm the number of distance between this chosen data point and current chosen
clusters is unknown, thus the original evaluation function centre is dist[d ] =| vcurrent [d ] − wchosen [d ] | . The two new
of FCM is not suitable. Secondly, of the existing validity centres are then generated by using vnew [d ] = vcurrent [d ]
measures available for fuzzy clustering, the XB validity
measure has been shown to be able to detect the correct ± dist[d ] . Finally, vcurrent [d ] is deleted.
number of clusters in several experiments [19].
The most interesting part of the proposed SAFC 3) Delete Centre
algorithm is how to perturb the current centres (current
configuration). Three perturbation functions are randomly As with Perturb Centre, the smallest cluster will be
chosen on each perturbation. They are: perturbing an chosen but this time is deleted from the current cluster set.
existing centre (Perturb Centre), splitting an existing
centre (Split Centre) and deleting an existing centre 4) SAFC process summary:
(Delete Centre). At the beginning of each perturbation i. Randomly initialise the number of clusters to num ,
function, an existing centre is chosen based on the size of
where 2 ≤ num ≤ n and n is number of data points.
the cluster. Let | C j | represent the size of cluster j; it is
ii. Initialise fuzzy membership µ ij , subject to conditions
expressed by:
n (2) and (3) being satisfied.
| C j |= ∑ µ ij , ∀j = 1,...c (7) iii. Calculate data centres using (4).
i =1 iv. Calculate current XB value using (6).
where c is the number of clusters. v. T = Tmax , setting maximum temperature as starting
temperature.
The three functions are described below and are
followed by a summary of the whole SAFC process. vi. If T ≥ Tmin , i = 1 to k
vii. Alter centres by randomly calling one of Peturb
1) Perturb Centre Centre, Split Centre or Delete Centre.
viii. Calculate new XB value using (6) based on the
The smallest cluster will be chosen computed by perturbation.
computing the cluster sizes using equation (7) in the set of ix. If new XB < current XB, accept new centre. Set new
centres. This centre position is then modified through XB to current XB, save best XB and best centres
addition of the change rate cr = perturb_rate * rand, where position.
perturb_rate ∈ [-0.007, 0.007] and rand ∈ [0, 1]. Let
vcurrent [d ] and vnew [d ] represent current centre and new
Table I
The corresponding process of simulated annealing and SAFC