A Comparative Study on Disease Classification using Different Soft Computing Techniques

Sudip Mandal*, Goutam Saha** & Rajat K. Pal***

*Department of Electronics & Communication Engineering, GIMT, Krishna Nagar, INDIA. E-Mail: sudip.mandal007{at}gmail{dot}com
**Department of Information Technology, NEHU, Shillong, INDIA. E-Mail: dr_goutamsaha{at}yahoo{dot}com
***Department of Computer Science Engineering, University of Calcutta, Kolkata, INDIA. E-Mail: pal.rajatk{at}gmail{dot}com

The SIJ Transactions on Computer Science Engineering & its Applications (CSEA), Vol. 2, No. 3, May 2014
ISSN: 2321-2381 © 2014 | Published by The Standard International Journals (The SIJ)

Abstract—The number of patients with cancer, heart disease, and diabetes is increasing day by day because of excessive consumption of alcohol, inhalation of harmful gases, intake of contaminated food or drugs, smoking, etc. A range of therapies has already been provided by researchers. Early diagnosis depends to a considerable degree on the physician's skill, knowledge, and experience, yet errors may still occur. The use of various Artificial Intelligence methods for the medical diagnosis of diseases has recently become widespread; these intelligent systems assist physicians as diagnosis aids. Currently, Artificial Neural Networks, Rough Sets, Decision Trees, and Bayesian Networks are very popular for this purpose. This paper provides a review of different soft computing methods for diagnosing and detecting the acuteness of the above-mentioned disorders. The survey is carried out on datasets for three different diseases, using cross validation and a percentage split for testing new data in each case. The results indicate that Rough Set Theory gives the maximum accuracy and coverage area but also the maximum computational time complexity, while Neural and Bayesian Networks give quite satisfactory results. Moreover, the obtained results suggest that accuracy depends on the quality of normalization of the data.
Keywords—Bayesian Classifier; Classification; Decision Tree; Disease Diagnosis; Rough Set; Soft Computing; Neural Network.
Abbreviations—Artificial Neural Network (ANN); Conditional Independence (CI); Diabetes Mellitus (DM); Directed Acyclic Graph (DAG); Independency Conditional Search (ICS); Rough Set Exploration System (RSES); Waikato Environment for Knowledge Analysis (WEKA).

I. INTRODUCTION
The use of intelligent expert systems in medical diagnosis is increasing gradually. There is no doubt that the evaluation of data taken from patients and the decisions of experts are the most important factors in disease diagnosis. However, expert systems and different artificial intelligence techniques for classification also help experts a great deal [Brachman et al., 1996; Mitchell, 1997; Han & Kamber, 2000], and several soft computing methods have already been proposed for this purpose. Classification systems help minimize the possible errors of fatigued or inexperienced experts and allow medical data to be examined in a shorter time and in more detail.
Breast cancer is a type of cancer originating from breast
tissue. Worldwide, breast cancer accounts for 22.9% of all
cancers (excluding non-melanoma skin cancers) in women.
The first noticeable symptom of breast cancer is typically a
lump that feels different from the rest of the breast tissue. The
primary risk factors for breast cancer are female sex and older
age. Other potential risk factors include smoking, genetics, lack of childbearing or lack of breastfeeding, higher levels of certain hormones, certain dietary patterns, and exposure to light pollution.
Diabetes Mellitus (DM) or simply diabetes is a group of
metabolic diseases in which a person has high blood sugar.
This high blood sugar produces the symptoms of frequent
urination, increased thirst, and increased hunger. Untreated,
diabetes can cause many complications. Acute complications
include diabetic ketoacidosis and nonketotic hyperosmolar
coma. Serious long-term complications include heart disease,
kidney failure, and damage to the eyes. Diabetes is due either to the pancreas not producing enough insulin or to the cells of the body not responding properly to the insulin that is produced.
Among various life-threatening diseases, heart diseases have received a great deal of attention in medical research, as they have a major impact on human health. Various heart diseases have been
discussed in the literature, together with how they lead to heart attack. The number one cause of death in industrialized countries is cardiovascular disease. Cardiovascular diseases not only have a major impact on individuals and their quality of life in general, but also on public health costs and countries' economies. Risk factors for these pathologies include diabetes, smoking, family history, obesity, and high cholesterol. Blood flow to the heart muscles decreases when a blockage occurs in the coronary arteries. Electrocardiogram recordings are analyzed to detect the heartbeat irregularities that occur due to cardiovascular diseases.
The healthcare industry collects huge amounts of healthcare data that need to be mined to discover hidden information for effective decision making; hidden patterns and relationships often go unexploited. Clinicians and patients need reliable information about an individual's risk of developing different diseases. Ideally, they would have entirely accurate data and a perfect model to estimate risk. Such a model would be able to separate people with a disease from those without it; indeed, the perfect model would even be able to predict the timing of the disease's onset.
In this paper, selected algorithms from different categories of classification algorithms are considered: Rough Set Theory, Decision Tree, Neural Network, and Bayesian Classifier are applied extensively to the datasets of the above three diseases. The rest of the paper is organized as follows. Data description and data preparation are described in Section II, followed by an overview of a few of the most popular techniques for computerized diagnosis in Section III. Detailed results are discussed in Section IV. The conclusion and references are given accordingly.
II. DATA DESCRIPTION AND DATA PREPARATION
The decision table contains attributes, i.e., conditions, and objects, i.e., different sample cases. The decision table describes the decision in terms of the conditions that must be satisfied in order to obtain the decisions specified in the table; based on different conditions, or different attribute values, the decision for a sample may vary. All datasets related to Breast Cancer, Diabetes, and Heart Disease are downloaded from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/dataset).
The Breast Cancer database was obtained from the University of Wisconsin Hospitals, Madison, from Dr. William H. Wolberg [Mangasarian & Wolberg, 1990]. There are 699 instances and 10 attributes: Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape, Marginal Adhesion, Single Epithelial Cell Size, Bare Nuclei, Bland Chromatin, Normal Nucleoli, Mitoses, and Class (2 for benign, 4 for malignant). It is to be noted that all data are normalized in this case.
The Pima Indians Diabetes Database from the National Institute of Diabetes and Digestive and Kidney Diseases is used for training the different techniques [Smith et al., 1988]. In this database, there are 768 instances and 9 attributes: Number of times pregnant, Plasma glucose concentration at 2 hours in an oral glucose tolerance test, Diastolic blood pressure (mm Hg), Triceps skin fold thickness (mm), 2-hour serum insulin (mu U/ml), Body mass index (weight in kg/(height in m)^2), Diabetes pedigree function, Age (years), and Class (tested positive or negative). In this dataset a few attributes are normalized and a few are not.
In the Heart Disease dataset there are 270 instances and 14 attributes: age, sex, chest pain type (4 values), resting blood pressure, serum cholesterol in mg/dl, fasting blood sugar > 120 mg/dl, resting electrocardiographic results (values 0, 1, 2), maximum heart rate achieved, exercise-induced angina, oldpeak = ST depression induced by exercise relative to rest, the slope of the peak exercise ST segment, number of major vessels (0-3) colored by fluoroscopy, thal (3 = normal; 6 = fixed defect; 7 = reversible defect), and class (disease present or not) [Detrano et al., 1989]. In this case no attribute is normalized.
These decision tables are used as a training dataset to
learn and infer from the training data and to know the hidden
dependency between different attributes for different disease
states.
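As an illustration of how such a decision table might be loaded outside WEKA/RSES, here is a short Python sketch for the Breast Cancer table. The leading sample-ID column and the '?'-as-missing convention follow the raw UCI file, while the file name and the preprocessing steps are assumptions for illustration, not the authors' procedure.

```python
import pandas as pd

# Attribute names for the Wisconsin Breast Cancer decision table, following
# the list given above; the raw UCI file carries an extra sample-ID column.
COLUMNS = [
    "Sample_ID", "Clump_Thickness", "Uniformity_of_Cell_Size",
    "Uniformity_of_Cell_Shape", "Marginal_Adhesion",
    "Single_Epithelial_Cell_Size", "Bare_Nuclei", "Bland_Chromatin",
    "Normal_Nucleoli", "Mitoses", "Class",  # Class: 2 = benign, 4 = malignant
]

# Hypothetical local copy of the UCI file; missing values are coded as '?'.
df = pd.read_csv("breast-cancer-wisconsin.data", names=COLUMNS, na_values="?")
df = df.dropna()  # drop the handful of rows with missing Bare_Nuclei values

# Condition attributes and the decision attribute of the decision table.
X = df.drop(columns=["Sample_ID", "Class"])
y = df["Class"]
print(X.shape, y.value_counts().to_dict())
```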
III. OVERVIEW OF THE TECHNIQUES EMPLOYED
3.1. Decision Tree (J-48)
The decision tree is one of the most popular and efficient techniques in data mining, established and well explored by many researchers. Decision trees are categorized as a supervised method that tries to find the relationship between the input attributes and the target attribute and represents that relationship as a structured model; the model is constructed by using the input attributes to predict the target. However, some decision tree algorithms may produce trees of large size that are difficult to understand. J48 is an implementation of the C4.5 algorithm [Witten & Frank, 2005]; C4.5 is a successor of an earlier algorithm developed by J. Ross Quinlan. Two pruning methods are supported by J48 [Floriana Esposito et al., 1997; Mohamed et al., 2012; Mandal et al., 2013]. The first is known as subtree replacement: it works by replacing nodes of the decision tree with leaves, basically reducing the number of tests along a certain path, starting from the leaves of the fully formed tree and working backward toward the root. The second, implemented in J48 as subtree raising, moves nodes upward toward the root of the tree, replacing other nodes along the way. According to Zhao & Zhang (2007), the C4.5 algorithm produces a decision tree classification for a given dataset by recursive division of the data, and the decision tree is grown using a depth-first strategy. On test data, this algorithm
emphasizes splitting the dataset by selecting the test that gives the best result in information gain. For discrete attributes, the algorithm considers a test with as many outcomes as the number of distinct values; as the tree grows, a binary test over the different values of each attribute is also considered. In order to gather the entropy gain of all these binary tests efficiently, the training dataset belonging to the node in consideration is sorted by the values of the continuous attribute, and the entropy gains of the binary cuts based on each distinct value are calculated in one scan of the sorted data. This process is repeated for each continuous attribute.
The training dataset is a set $S = \{s_1, s_2, \ldots\}$ of already classified samples. Each sample $s_i$ consists of a p-dimensional vector $(x_{1i}, x_{2i}, \ldots, x_{pi})$, where the $x_{ji}$ represent the attributes or features of the sample, together with the class in which $s_i$ falls. This algorithm has a few base cases:
1. All the samples in the list belong to the same class. When this happens, the algorithm simply creates a leaf node for the decision tree saying to choose that class.
2. None of the features provides any information gain. In this case, C4.5 creates a decision node higher up the tree using the expected value of the class.
3. An instance of a previously unseen class is encountered. Again, C4.5 creates a decision node higher up the tree using the expected value.
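As a rough illustration (not the authors' WEKA setup), the sketch below trains a comparable tree in Python with scikit-learn and evaluates it with 10-fold cross validation. Note the hedges: scikit-learn implements CART rather than C4.5/J48, so the entropy criterion only approximates the information-gain splitting described above; X and y are assumed to hold one of the prepared UCI datasets from Section II, and min_samples_leaf=2 mirrors J48's default minimum leaf size.

```python
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# CART with the entropy criterion approximates C4.5's information-gain
# splitting; scikit-learn ships no exact J48/C4.5 implementation.
tree = DecisionTreeClassifier(criterion="entropy", min_samples_leaf=2)

# X, y as prepared from one of the UCI decision tables (see Section II).
scores = cross_val_score(tree, X, y, cv=10)  # 10-fold cross validation
print("Mean 10-fold accuracy: %.4f" % scores.mean())
```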
3.2. Rough Set Theory
The rough set philosophy is founded by Pawlak (2002) on the assumption that with every object of the universe of discourse some information (data, knowledge) is associated; objects characterized by the same information are indiscernible (similar) in view of the available information about them. A rough set [Pawlak, 2002; Midelfart et al., 2002; Hassanien & Ali, 2004; David & Balakrishnan, 2010; Mandal & Saha, 2013] is defined in the following way. In an information system, let $X \subseteq U$ be a target set that we wish to represent using an attribute subset P; i.e., an arbitrary set of objects X comprises a single class, and we wish to express this class approximately, with respect to the universe, using the equivalence classes induced by the attribute subset P. In general, X cannot be expressed exactly, because the set may include and exclude objects which are indistinguishable on the basis of the attributes in P. The target set X can be approximated using only the information contained within P by constructing the P-lower and P-upper approximations of X.
The P-lower approximation, or positive region, $\underline{P}X = \bigcup \{ [x]_P : [x]_P \subseteq X \}$, is the union of all equivalence classes $[x]_P$ that are contained by the target set. The P-upper approximation, $\overline{P}X = \bigcup \{ [x]_P : [x]_P \cap X \neq \emptyset \}$, is the union of all equivalence classes $[x]_P$ that have a non-empty intersection with the target set. The lower approximation of a target set is a conservative approximation consisting of only those objects which can positively be identified as members of the set; the upper approximation is a liberal approximation, which includes all objects that might be members of the target set. The accuracy of the rough-set representation of the set X is given by the following:

$\alpha_P(X) = \dfrac{|\underline{P}X|}{|\overline{P}X|}$    (1)
In an information system there often exist some condition attributes that do not provide any additional information about the objects in U; we should remove those attributes, since the complexity and cost of the decision process can be reduced if such condition attributes are eliminated. Given a classification task mapping a set of variables P to a set of labelling decisions D, a reduct R (a reduced set of variables) is defined as any $R \subseteq P$ such that $\gamma(P, D) = \gamma(R, D)$, where $\gamma$ denotes the classification accuracy. The set of attributes common to all reducts is called the core: the core is the set of attributes possessed by every legitimate reduct, and therefore consists of attributes which cannot be removed from the information system without causing collapse of the equivalence-class structure. Let $S = (U, C, D)$ be a reduced decision table, where C denotes the reduced set of attributes, i.e., the reduct. Every $x \in U$ determines a sequence $c_1(x), \ldots, c_n(x), d_1(x), \ldots, d_m(x)$, where $\{c_1, \ldots, c_n\} = C$ and $\{d_1, \ldots, d_m\} = D$. This sequence is called a decision rule induced by x (in S) and is denoted by $c_1(x) \wedge \ldots \wedge c_n(x) \rightarrow d_1(x) \wedge \ldots \wedge d_m(x)$, or, in short, $C \rightarrow_x D$. The learned rules generated in this way can be used for classification of a new dataset, i.e., new instances with new attribute values, to predict the disease state or decision.
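To make these definitions concrete, the following self-contained Python sketch computes the P-lower and P-upper approximations and the accuracy of Eq. (1) for a toy decision table; the objects and attribute values are invented for illustration and are not drawn from the paper's datasets.

```python
from collections import defaultdict

def approximations(objects, P, target):
    """Compute the P-lower and P-upper approximations of a target set.

    objects: dict mapping object id -> dict of attribute values
    P:       iterable of attribute names inducing the equivalence classes
    target:  set of object ids (the target set X)
    """
    # Partition U into equivalence classes: objects identical on P.
    classes = defaultdict(set)
    for obj, attrs in objects.items():
        classes[tuple(attrs[a] for a in P)].add(obj)

    lower, upper = set(), set()
    for eq in classes.values():
        if eq <= target:      # equivalence class contained in X
            lower |= eq
        if eq & target:       # non-empty intersection with X
            upper |= eq
    return lower, upper

# Toy decision table (hypothetical values, for illustration only).
U = {
    1: {"Clump": "high", "Mitoses": "low"},
    2: {"Clump": "high", "Mitoses": "low"},
    3: {"Clump": "low",  "Mitoses": "low"},
}
X_target = {1, 3}             # e.g. the malignant cases
lo, up = approximations(U, ["Clump", "Mitoses"], X_target)
alpha = len(lo) / len(up)     # accuracy of Eq. (1)
print(lo, up, alpha)          # {3} {1, 2, 3} 0.333...
```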
3.3. Bayesian Network
The Bayesian belief network [Friedman & Goldszmidt, 1998; Friedman et al., 2000; Masys, 2001; Darwiche, 2010; Mandal et al., 2013A] is a kind of probabilistic model, also used for the construction of genetic networks. It uses a Directed Acyclic Graph (DAG) consisting of nodes representing attributes and directed edges between them according to their dependency; each node is attached to a probability distribution table that represents the dependency relationships between variables. Since every independence statement in a belief network satisfies a group of axioms, we can construct belief networks from data by analyzing conditional independence relationships. The Conditional Independence (CI) test based method is used by all algorithms of this category, which analyze the relations between different quantities based on their dependency relationships. For any three disjoint node sets X, Y, and Z in a belief network, X is said to be d-separated from Y by Z if there is no active undirected path between X and Y.
The amount of information flow between two nodes can
be measured by using mutual information, when no nodes are
instantiated, or conditional mutual information, when some
other nodes are instantiated. In information theory, the mutual information of two nodes is defined as

$I(X_i, X_j) = \sum_{X_i, X_j} P(X_i, X_j) \log \dfrac{P(X_i, X_j)}{P(X_i)\,P(X_j)}$    (2)

where $X_i$ and $X_j$ are two nodes and C is a condition set of nodes.
Conditional mutual information is used as the CI test to measure the average information between two nodes when the statuses of some nodes are changed by the condition set C. When $I(X_i, X_j \mid C)$ is smaller than a certain threshold value, we say that $X_i$ and $X_j$ are d-separated by the condition set C, i.e., they are conditionally independent.
At the moment, only the ICS algorithm is implemented in this category. The algorithm takes two steps: first, find a skeleton (the undirected graph that has an edge wherever there is an arrow in the network structure), and second, direct all the edges in the skeleton to get a DAG. Starting with a complete undirected graph, we try to find conditional independencies $P(x, y \mid Z)$ in the data. For each pair of nodes x, y, we consider sets Z starting with cardinality 0, then 1, up to a user-defined maximum; furthermore, the set Z is a subset of the nodes that are neighbors of both x and y. If an independency is identified, the edge between x and y is removed from the skeleton. The test is performed by using any of the score metrics to check whether variables x and y are conditionally independent given the set of variables Z.
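As a concrete illustration of the CI test, the sketch below estimates the (unconditional) mutual information of Eq. (2) from observed samples and compares it with a threshold. The sample pairs and the threshold value are assumptions for illustration, and a full ICS implementation would additionally condition on the candidate sets Z described above.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Estimate I(Xi, Xj) of Eq. (2) from a list of observed (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)             # empirical counts of (Xi, Xj)
    px = Counter(x for x, _ in pairs)  # marginal counts of Xi
    py = Counter(y for _, y in pairs)  # marginal counts of Xj
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy / (p_x * p_y) simplifies to c * n / (px[x] * py[y])
        mi += p_xy * math.log(c * n / (px[x] * py[y]))
    return mi

# Hypothetical discretized samples of two attributes; in an ICS-style
# structure search the edge x-y would be dropped when the (conditional)
# mutual information falls below a chosen threshold.
samples = [(0, 0), (0, 0), (1, 1), (1, 1), (0, 1), (1, 0)]
THRESHOLD = 0.05                       # assumed value, not from the paper
print(mutual_information(samples) < THRESHOLD)
```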
3.4. Neural Network
Nowadays the Artificial Neural Network (ANN) [Cowan & Sharp, 1988; Nayak et al., 2001; Gonçalves et al., 2005; Rocha et al., 2007; Emad W. Saad & Donald C. Wunsch, 2007; Mehdi Neshat & Abas E. Zadeh, 2010] is widely used as a tool for solving many decision modeling problems. A multilayer perceptron is a feed-forward ANN model that is used extensively for the solution of a number of different problems. An ANN is a simulation of the human brain; it is a supervised learning technique used for non-linear classification and data mining applications. A neural network is a set of processing units which, when assembled in a closely interconnected network, offers a rich structure exhibiting some features of the biological neural network. The structure of a neural network gives the user an opportunity to implement parallelism at each layer level; another significant characteristic of ANNs is fault tolerance. An ANN is typically defined by three types of parameters:
1. The interconnection pattern between the different layers of neurons;
2. The learning process for updating the weights of the interconnections;
3. The activation function that converts a neuron's weighted input to its output activation.
Mathematically, a neuron's network function $f(x)$ is defined as a composition of other functions $g_i(x)$, which can further be defined as compositions of other functions. This can be conveniently represented as a network structure, with arrows depicting the dependencies between variables. A widely used type of composition is the nonlinear weighted sum $f(x) = K\left(\sum_i w_i\, g_i(x)\right)$, where K (commonly referred to as the activation function) is some predefined function. It is convenient in the following to refer to a collection of functions $g_i$ simply as a vector $g = (g_1, g_2, \ldots, g_n)$.
A key feature of neural networks is an iterative learning
process in which data cases are presented to the network one
at a time, and the weights associated with the input values are
adjusted each time. After all cases are presented, the process
often starts over again. During this learning phase, the
network learns by adjusting the weights so as to be able to
predict the correct class label of input samples. Once a
network has been structured for a particular application, that
network is ready to be trained. To start this process, the initial
weights are chosen randomly; then the training, or learning, begins. The most popular neural network training algorithm is back-propagation. This back-propagation architecture is
the most popular, effective, and easy-to-learn model for
complex, multi-layered networks. Its greatest strength is in
non-linear solutions to ill-defined problems. The typical
back-propagation network has an input layer, an output layer,
and at least one hidden layer. Training inputs are applied to
the input layer of the network, and desired outputs are
compared at the output layer. During the learning process, a
forward sweep is made through the network, and the output
of each element is computed layer by layer. The difference
between the output of the final layer and the desired output is
back-propagated to the previous layers, usually modified by
the derivative of the transfer function, and the connection
weights are adjusted accordingly. This process proceeds through the previous layers until the input layer is reached. The iteration stops when a certain stopping criterion, such as a minimum error or a maximum number of iterations, is reached.
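For orientation, the following scikit-learn sketch configures a back-propagation MLP roughly along the lines of the experiment settings reported in Section IV (one hidden layer, learning rate 0.2, momentum 0.2, 500 iterations). The hidden-layer width of 5 and the scaling step are assumptions, since WEKA's MultilayerPerceptron derives its hidden-layer size from the attribute and class counts and the paper does not state it.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Back-propagation MLP with stochastic gradient descent; the settings
# below mirror those reported in Section IV, except the assumed width.
clf = MLPClassifier(hidden_layer_sizes=(5,), solver="sgd",
                    learning_rate_init=0.2, momentum=0.2,
                    max_iter=500, random_state=0)

# X, y as prepared in Section II; scaling stands in for normalization.
clf.fit(StandardScaler().fit_transform(X), y)
```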
IV. RESULTS AND DISCUSSION
For each dataset we performed two types of classification experiments with all the above-mentioned soft computing processes, for each of the diseases, i.e., Breast Cancer, Heart Disease, and Diabetes. The first experiment is 10-fold cross validation, where the whole dataset is used as training data and the generated rules or networks are then applied to the same data for testing. In the second case, we use a percentage split: 80% of the data are used for training and the rest of the data are used for testing. We used the WEKA software tool [http://www.cs.waikato.ac.nz/~ml/weka/downloading.html] for the implementation of J-48, ANN, and the Bayesian Classifier on the disease classification datasets; Rough Set Theory is implemented using the RSES software tool.
All experiments are performed on the Windows 7 platform. For each case, we measure three important parameters, i.e., accuracy, coverage factor, and time complexity. Accuracy is defined as the ratio of the number of correctly predicted instances to the total number of predicted instances of the testing data. The coverage factor can be defined as the ratio of the number of testing instances that can be classified to the total number of input testing instances. Time shows the time complexity of each process. The following tables show the comparative study of the different classification techniques for cross validation and for a percentage split testing new data.
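A minimal sketch of these two measures, under the assumption that a classifier marks instances it cannot classify (e.g., cases no rule covers) with None:

```python
def accuracy_and_coverage(predictions, truths):
    """Accuracy and coverage factor as defined above.

    predictions: predicted labels, with None for unclassified instances.
    truths:      true labels, same length.
    """
    covered = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    coverage = len(covered) / len(truths)
    correct = sum(1 for p, t in covered if p == t)
    accuracy = correct / len(covered) if covered else 0.0
    return accuracy, coverage

# Hypothetical outputs: one instance left unclassified.
print(accuracy_and_coverage([2, 4, None, 4], [2, 2, 4, 4]))  # (0.666..., 0.75)
```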
In the Rough Set Theory implementation, the Exhaustive Algorithm is used for calculation of the reducts and core. The numbers of reducts for Breast Cancer, Heart Disease, and Diabetes for the cross-validation and percentage-split processes are 19 & 24, 109 & 99, and 25 & 26, respectively. It is also found that for Breast Cancer, "Bare Nuclei" is the core, the inseparable attribute, which denotes the most important factor for the disease; so Rough Set can be used for attribute reduction and feature selection. Rough Set shows the highest accuracy and coverage factor in the cross-validation cases for each disease. But during the percentage-split process, where the data are not normalized (for Diabetes), the coverage factor falls drastically, i.e., the method cannot decide the status of the disease, which is one of the disadvantages of this process. Moreover, the time taken for the exhaustive reduct calculation is very large compared to the other techniques.
With the number of folds set to 10 and a confidence factor of 0.25, the J48 algorithm is used for construction of the decision tree. The Decision Tree provides very fast prediction but has lower accuracy than the other approaches. The Decision Tree shows a one-directional path, or inference, for the classification of new data based on the different attribute values.
With 1 hidden layer, a learning rate of 0.2, a momentum of 0.2, and 500 iterations, a Multilayer Perceptron Neural Network is used for the training and testing process. From the tables below it can be observed that it provides acceptably good accuracy, with less time complexity than Rough Set but more than the Decision Tree and Bayesian Classifier, due to the iterative process of weight adjustment and error minimization. It is interesting to observe that the classification accuracy and coverage area increase for the percentage-split experiment.
Using the Simple Estimator algorithm to calculate the probability distribution tables and the CI search algorithm for the structure of the network, a Bayesian Network is constructed for each disease. The Bayesian Network gives the best overall result in terms of acceptable accuracy, coverage area, and speed of the process. Moreover, from the Bayesian Network, the dependencies among the attributes can be easily observed, which is an advantage of this process.
From the two tables below, it can be noticed that the type of data, i.e., the normalization of the dataset, plays a great role in classification accuracy and coverage area: where the data are less normalized, the accuracy and coverage area are lower. Moreover, the time complexity depends on the number of instances in the dataset during the training process; as the number of instances increases, the time complexity increases, along with the prediction accuracy. The following figures show each technique on the Breast Cancer data for a better understanding of the whole study; similar figures for the other diseases are not shown.
Figure 1 shows the reduct set related to breast cancer, where each reduct is able to generate a set of classification rules that are used for classification of a new dataset or of the training set itself.
Figure 2 shows the feed-forward neural network with 1 hidden layer, 9 input nodes, and 2 output nodes, with different weights on the edges. These weights are iteratively optimized for better accuracy. Depending upon the input values and the weights of the edges between nodes, the decision will be either benign or malignant.
Figure 3 shows the decision tree, with the different conditional attributes at the parent or child nodes and a binary criterion for moving from one node to another. The decision attributes (benign or malignant) remain at the leaf nodes.
Figure 4 shows the Bayesian network, where we consider the class or decision attribute as a virtual attribute or node. The directed edges show the dependencies among the attributes; each node has a probability distribution table with its neighbor nodes that denotes the probability of influence.

Figure 1: Reduct Set for Breast Cancer Dataset

Figure 2: ANN for Breast Cancer

Figure 3: Decision Tree for Breast Cancer

Figure 4: Bayesian Network for Breast Cancer
Table 1: Comparative Study of Performance for Different Classification Techniques (Cross Validation)

| Disease       | No. of Training Data | No. of Testing Data | No. of Attributes | Classification Technique | Type                          | Accuracy | Coverage Area | Time (sec) |
|---------------|----------------------|---------------------|-------------------|--------------------------|-------------------------------|----------|---------------|------------|
| Breast Cancer | 699                  | 699                 | 10                | Rough Set                | All attributes are normalized | 100%     | 1.00          | 11.11      |
|               |                      |                     |                   | Decision Tree            |                               | 94.56%   | 0.950         | 0.050      |
|               |                      |                     |                   | Neural Network           |                               | 95.13%   | 0.989         | 2.29       |
|               |                      |                     |                   | Bayesian Classifier      |                               | 97.13%   | 0.992         | 0.03       |
| Heart Statlog | 270                  | 270                 | 14                | Rough Set                | Few attributes are normalized | 100%     | 1.00          | 59.07      |
|               |                      |                     |                   | Decision Tree            |                               | 76.67%   | 0.744         | 0.02       |
|               |                      |                     |                   | Neural Network           |                               | 78.14%   | 0.803         | 1.30       |
|               |                      |                     |                   | Bayesian Classifier      |                               | 81.11%   | 0.902         | 0.02       |
| Diabetes      | 768                  | 768                 | 9                 | Rough Set                | Not normalized                | 100%     | 1.00          | 46.14      |
|               |                      |                     |                   | Decision Tree            |                               | 73.81%   | 0.751         | 0.03       |
|               |                      |                     |                   | Neural Network           |                               | 75.78%   | 0.812         | 2.09       |
|               |                      |                     |                   | Bayesian Classifier      |                               | 74.39%   | 0.806         | 0.02       |
Table 2: Comparative Study of Performance for Different Classification Techniques (Testing New Cases)

| Disease       | No. of Training Data | No. of Testing Data | No. of Attributes | Classification Technique | Type                          | Accuracy | Coverage Area | Time (sec) |
|---------------|----------------------|---------------------|-------------------|--------------------------|-------------------------------|----------|---------------|------------|
| Breast Cancer | 559                  | 140                 | 10                | Rough Set                | All attributes are normalized | 95.9%    | 0.968         | 11.68      |
|               |                      |                     |                   | Decision Tree            |                               | 92.86%   | 0.955         | 0.02       |
|               |                      |                     |                   | Neural Network           |                               | 95.71%   | 0.994         | 2.12       |
|               |                      |                     |                   | Bayesian Classifier      |                               | 96.43%   | 0.992         | 0.02       |
| Heart Statlog | 216                  | 54                  | 14                | Rough Set                | Few attributes are normalized | 68.80%   | 0.889         | 40.08      |
|               |                      |                     |                   | Decision Tree            |                               | 79.63%   | 0.726         | 0.02       |
|               |                      |                     |                   | Neural Network           |                               | 85.19%   | 0.902         | 1.17       |
|               |                      |                     |                   | Bayesian Classifier      |                               | 81.48%   | 0.906         | 0.02       |
| Diabetes      | 614                  | 154                 | 9                 | Rough Set                | Not normalized                | 83.33%   | 0.078         | 30.35      |
|               |                      |                     |                   | Decision Tree            |                               | 75.97%   | 0.777         | 0.02       |
|               |                      |                     |                   | Neural Network           |                               | 75.97%   | 0.831         | 1.97       |
|               |                      |                     |                   | Bayesian Classifier      |                               | 82.30%   | 0.862         | 0.02       |

V. CONCLUSIONS
Breast Cancer, Diabetes, and Heart Disease are among the leading causes of death worldwide, and the early prediction of these diseases is a very important task. A computer-aided disease prediction system helps the physician as a tool for disease diagnosis. In this paper, four intelligence techniques, Rough Set, Decision Tree, Neural Network, and Bayesian Classifier, are implemented and tested on the datasets of these three diseases. Throughout the study, ten-fold cross validation and a percentage split for training and testing are performed using the RSES and WEKA software tools. From the analysis it is concluded that soft computing intelligence techniques play a major role in disease classification and prediction. Rough Set gives very high accuracy for normalized data but requires more time. Neural Network and Bayesian Classifier are widely used today for their high accuracy, good coverage area, and low time requirement. Decision Tree is easy to understand but gives less accuracy. The classification accuracy can be improved by reduction of features. Although many other hybridized intelligence techniques, expert systems, and optimization techniques are under research to improve accuracy, the techniques studied here are the most commonly used for classification with acceptable accuracy in medical diagnosis, and various algorithms under research will further improve the efficiency of these approaches.
REFERENCES
[1] J.D. Cowan & D.H. Sharp (1988), "Neural Nets and Artificial Intelligence", Proceedings of the American Academy of Arts and Sciences, Vol. 117, No. 1, Pp. 85–121.
[2] J.W. Smith, J.E. Everhart, W.C. Dickson, W.C. Knowler & R.S. Johannes (1988), "Using the ADAP Learning Algorithm to Forecast the Onset of Diabetes Mellitus", Proceedings of the Symposium on Computer Applications and Medical Care, IEEE Computer Society Press, Pp. 261–265.
[3] R. Detrano, A. Janosi, W. Steinbrunn, M. Pfisterer, J. Schmid, S. Sandhu, K. Guppy, S. Lee & V. Froelicher (1989), "International Application of a New Probability Algorithm for the Diagnosis of Coronary Artery Disease", American Journal of Cardiology, Vol. 64, Pp. 304–310.
[4] O.L. Mangasarian & W.H. Wolberg (1990), "Cancer Diagnosis via Linear Programming", SIAM News, Vol. 23, No. 5, Pp. 1 & 18.
[5] R. Brachman, T. Khabaza, W. Kloesgen, G. Piatetsky-Shapiro & E. Simoudis (1996), "Mining Business Databases", Communications of the ACM, Vol. 39, No. 11, Pp. 42–48.
[6] Floriana Esposito, Donato Malerba & Giovanni Semeraro (1997), "A Comparative Analysis of Methods for Pruning Decision Trees", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 5, Pp. 476–491.
[7] T. Mitchell (1997), "Machine Learning", McGraw Hill.
[8] N. Friedman & M. Goldszmidt (1998), "Learning Bayesian Networks with Local Structure", Kluwer Academic Publishers, Pp. 421–459.
[9] N. Friedman, M. Linial, I. Nachman & D. Pe'er (2000), "Using Bayesian Networks to Analyze Expression Data", Journal of Computational Biology, Vol. 7, Pp. 601–620.
[10] J. Han & M. Kamber (2000), "Data Mining: Concepts and Techniques", Morgan Kaufmann Publishers.
[11] D.R. Masys (2001), "Linking Microarray Data to the Literature", Nature Genetics, Vol. 28, Pp. 9–10.
[12] R. Nayak, L.C. Jain & B.K.H. Ting (2001), "Artificial Neural Networks in Biomedical Engineering: A Review", Asia-Pacific Conference on Advanced Computation.
[13] H. Midelfart, J. Komorowski, K. Nørsett, F. Yadetie, A.K. Sandvik & A. Lægreid (2002), "Learning Rough Set Classifiers from Gene Expression and Clinical Data", Fundamenta Informaticae, Pp. 155–183.
[14] Z. Pawlak (2002), "Rough Set Theory and its Applications", Journal of Telecommunications and Information Technology, Pp. 7–10.
[15] E. Hassanien & J.M.H. Ali (2004), "Rough Set Approach for Generation of Classification Rules of Breast Cancer Data", Informatica, Vol. 15, No. 1, Pp. 23–38.
[16] L.B. Gonçalves, M.M.B.R. Vellasco, M.A.C. Pacheco & F.J. De Souza (2005), "Inverted Hierarchical Neuro-Fuzzy BSP System: A Novel Neuro-Fuzzy Model for Pattern Classification and Rule Extraction in Databases", IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, Vol. 36, No. 2, Pp. 236–248.
[17] I.H. Witten & E. Frank (2005), "Data Mining: Practical Machine Learning Tools and Techniques", Second Edition, Morgan Kaufmann Publishers, United States of America.
[18] M. Rocha, Paulo Cortez & José Neves (2007), "Evolution of Neural Networks for Classification and Regression", Neurocomputing, Vol. 70, No. 16–18, Pp. 2809–2816.
[19] Emad W. Saad & Donald C. Wunsch (2007), "Neural Network Explanation using Inversion", Neural Networks, Vol. 20, Pp. 78–93.
[20] Y. Zhao & Y. Zhang (2007), "Comparison of Decision Tree Methods for Finding Active Objects", National Astronomical Observatories, Advances in Space Research, Vol. 41, Pp. 1955–1959.
[21] A. Darwiche (2010), "Review Article on Bayesian Network", Communications of the ACM, Vol. 53, No. 12, Pp. 80–90.
[22] J.M. David & K. Balakrishnan (2010), "Machine Learning Approach for Prediction of Learning Disabilities in School-Age Children", International Journal of Computer Applications, Vol. 9, No. 11, Pp. 7–12.
[23] Mehdi Neshat & Abas E. Zadeh (2010), "Hopfield Neural Network and Fuzzy Hopfield Neural Network for Diagnosis of Liver Disorders", 5th IEEE International Conference on Intelligent Systems (IS), Pp. 162–167.
[24] W.N.H.W. Mohamed, M.N.M. Salleh & A.H. Omar (2012), "A Comparative Study of Reduced Error Pruning Method in Decision Tree Algorithms", IEEE International Conference on Control System, Computing and Engineering, Pp. 392–397.
[25] S. Mandal, G. Saha & R.K. Pal (2013), "An Approach Towards Automated Disease Diagnosis & Drug Design using Hybrid Rough-Decision Tree from Microarray Dataset", Journal of Computer Science & Systems Biology, Vol. 6, Pp. 337–343.
[26] S. Mandal & G. Saha (2013), "Rough Set Theory based Automated Disease Diagnosis using Lung Adenocarcinoma as a Test Case", The SIJ Transactions on Computer Science Engineering & its Applications (CSEA), Vol. 1, No. 3, Pp. 75–82.
[27] S. Mandal, G. Saha & R.K. Pal (2013A), "Reconstruction of Dominant Gene Regulatory Network from Microarray Data using Rough Set and Bayesian Approach", Journal of Computer Science & Systems Biology, Vol. 6, Pp. 262–270.
Sudip Mandal. He received his M.Tech. in ECE from Kalyani Government Engineering College. He currently holds the position of Head of the Electronics and Communication Engineering Department at GIMT, India, and is pursuing a Ph.D. degree at the University of Calcutta. His current work includes bioinformatics, soft computing, and tomography. He is also a member of the Computational Intelligence Society and the Systems, Man & Cybernetics Society of the IEEE. He has published 2 national conference papers and 5 international journal papers so far.
Dr. Goutam Saha. He received his Ph.D. from the Indian Institute of Technology, Kharagpur, in the area of applied biological engineering. He holds a patent on the development of a novel bioreactor for animal cell line culture. He was a postdoctoral research fellow at the Indian Institute of Technology, Kharagpur, and at Ben-Gurion University, Israel. His current area of interest is systems biology. He was Head of the IT Department, Govt. College of Engineering and Leather Technology, Kolkata, India, for many years. He currently holds the position of Professor in the IT Department at NEHU, India. He has published 2 national/international conference papers and 6 international journal papers so far.
Dr. Rajat Kr. Pal. He received his Ph.D. from IIT Kharagpur. He is now an Associate Professor in the CSE Department of the University of Calcutta. His research interests include VLSI, algorithms, and graphs. He has published more than 50 national/international journal and conference papers, including with IEEE, Elsevier, and Springer. He is the author of the book "Multi-Layer Channel Routing: Complexity and Algorithms" and holds a patent on "Methods and systems configured to compute a guard zone of a three-dimensional object".
