Abstract
The objective of this paper is to test different classifiers based on Bayesian networks and to compare their results. The experiments reported in this paper were carried out with the laboratory software BayesiaLab [4], on the IRIS dataset [2], described later.
Contents
1 Introduction
2 Dataset
3 Experiment
3.1 Naive Bayes and Sons and Spouses
3.2 Augmented Naive Bayes
3.3 Markov Blanket, Augmented Markov Blanket and Minimal Augmented Markov Blanket
3.4 Semisupervised Learning
4 Conclusions
4.1 Subsection
4.1.1 Subsubsection
1 Introduction
Bayesian networks belong to the family of probabilistic graphical models, which are used to represent knowledge about a certain domain. Each node in the graph represents a random variable, and the edges between them represent their probabilistic dependences. In this experiment we are going to compare different classifiers based on Bayesian networks (naive Bayes, augmented naive Bayes, sons and spouses, Markov blanket, augmented Markov blanket, minimal augmented Markov blanket, semisupervised learning).
For this purpose we are going to use the demo version of the software BayesiaLab [4]. BayesiaLab is a Bayesian network publishing and automatic learning program that represents expert knowledge and allows one to discover it within a mass of data.
To test and compare the mentioned classifiers we also need a dataset suited to the classification task. This dataset is described in the following section of the paper.
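To make the idea of nodes, edges and conditional dependences concrete, the following is a minimal illustrative sketch (the variable names and probability values are invented for this example, not taken from the experiment): a two-node network in which an edge encodes one conditional dependence, and the joint distribution factorizes by the chain rule.

```python
# Illustrative two-node Bayesian network Rain -> WetGrass.
# Each node is a random variable; the edge encodes P(WetGrass | Rain).
p_rain = {True: 0.2, False: 0.8}                    # P(Rain)
p_wet_given_rain = {True: {True: 0.9, False: 0.1},  # P(WetGrass | Rain=True)
                    False: {True: 0.2, False: 0.8}} # P(WetGrass | Rain=False)

def joint(rain, wet):
    """Joint probability by the chain rule: P(R, W) = P(R) * P(W | R)."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]

# Marginal P(WetGrass=True), obtained by summing out Rain.
p_wet = sum(joint(r, True) for r in (True, False))
print(round(p_wet, 2))  # 0.2*0.9 + 0.8*0.2 = 0.34
```

The classifiers compared in this paper differ only in which edges they allow between the class node and the attribute nodes; the underlying chain-rule factorization is the same.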
2 Dataset
The IRIS Dataset[2] is perhaps the best known database to be found in
the pattern recognition literature. Fisher’s paper is a classic in the field
and is referenced frequently to this day. The data set contains 3 different
classes of 50 instances each, where each class refers to a type of iris plant.
One class is linearly separable from the other 2; the latter are not linearly
separable from each other. As for the attributes, there are 4 input attributes, all expressed in centimeters: the length and the width of the petal and of the sepal.
The only formatting applied to the data file found in the UCI Repository was the addition of a header line with the names of the attributes. The dataset was saved to a .data file and later imported into the BayesiaLab [4] software.
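The formatting step can be sketched as follows (a few sample rows are shown inline; the attribute names in the header are an assumption, since the paper does not list the exact labels used):

```python
# Prepend a header line with attribute names to the raw UCI-style rows,
# so the importing software can pick up the column labels.
raw_rows = [
    "5.1,3.5,1.4,0.2,Iris-setosa",
    "7.0,3.2,4.7,1.4,Iris-versicolor",
    "6.3,3.3,6.0,2.5,Iris-virginica",
]
header = "sepal_length,sepal_width,petal_length,petal_width,class"

with open("iris.data", "w") as f:
    f.write(header + "\n")
    f.write("\n".join(raw_rows) + "\n")
```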
3 Experiment
In this section of the paper we present the experiment results. First of all we have to load the dataset into the software, which applies a k-means algorithm with 4 intervals to discretize the data. This process can be seen in Figure 1.
Once we have imported the dataset, we have an empty (edgeless) Bayesian network (Figure 2), ready to apply the different learners that we are going to work with.
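The discretization can be illustrated with a stdlib-only sketch of 1-D k-means with 4 centroids, similar in spirit to what the import step does (BayesiaLab's exact algorithm and settings are not reproduced here, and the sample values below are invented):

```python
# Discretize one continuous attribute into 4 intervals via 1-D k-means.
def kmeans_1d(values, k=4, iters=50):
    # Initialize centroids spread across the sorted value range.
    s = sorted(values)
    centroids = [s[i * (len(s) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Recompute each centroid as the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

def discretize(value, centroids):
    """Map a continuous value to the index of its nearest centroid."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))

petal_lengths = [1.4, 1.5, 1.3, 4.7, 4.5, 4.9, 6.0, 5.9, 6.6, 5.1]
cents = kmeans_1d(petal_lengths)
print([discretize(v, cents) for v in petal_lengths])
```

Each continuous attribute is thus replaced by a discrete variable with 4 states, which is what the Bayesian network learners operate on.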
Figure 1: Table with the dataset import report.
Figure 3: Naive Bayes and Sons and Spouses trained Bayesian network.
P(C | PL, PW, SL, SW).
The results obtained from this learning reach a precision of 100% for Iris setosa, 90% for Iris versicolor and 94% for Iris virginica, for a total precision of 94.67%. In a later section these results will be compared with the results of the other learners.
Using the Sons and Spouses learner, one obtains as sons of the class attribute a subset of nodes that can have other parents, but in this case this learner gives the very same solution as the first one (Figure 3), since it is a simple solution with a very high performance, having exactly the same dependencies and precision P(C | PL, PW, SL, SW).
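The per-class figures combine into the reported overall precision as follows (a quick arithmetic check, assuming the 50 instances per class stated in the dataset section):

```python
# Verify that the per-class precisions aggregate to the reported total.
per_class = {"setosa": 1.00, "versicolor": 0.90, "virginica": 0.94}
n_per_class = 50
correct = sum(p * n_per_class for p in per_class.values())  # 50 + 45 + 47
total = n_per_class * len(per_class)                        # 150
print(round(100 * correct / total, 2))  # 94.67
```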
Figure 5: Markov Blanket, Augmented Markov Blanket and Minimal Aug-
mented Markov Blanket trained Bayesian network.
is again Petal Width (PW), but this node has relations with the other attributes. The dependencies between the attributes are the following: P(C | PW), P(PW | PL) and P(PL | SW, SL).
Regarding the performance, this algorithm gives exactly the same results as the previous Markov blanket algorithm.
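The dependency structure just listed can be encoded as a parent map, from which the chain-rule factorization of the joint distribution follows directly (a sketch; the conditional probability tables themselves are omitted):

```python
# Encode the learned structure as node -> parents; the joint factorizes
# as the product of each node's conditional given its parents.
parents = {"C": ["PW"], "PW": ["PL"], "PL": ["SW", "SL"], "SW": [], "SL": []}

def factorization(parents):
    """Render the chain-rule factorization implied by the graph."""
    terms = []
    for node, pa in parents.items():
        terms.append(f"P({node}|{','.join(pa)})" if pa else f"P({node})")
    return " * ".join(terms)

print(factorization(parents))
# P(C|PW) * P(PW|PL) * P(PL|SW,SL) * P(SW) * P(SL)
```

Compared with the naive Bayes factorization P(C | PL, PW, SL, SW), the class node here depends directly on only one attribute, with the remaining attributes influencing it through the chain.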
4 Conclusions
Here goes the text.
S = πr² (1)
One can refer to equations like this: see equation (1). One can also refer
to sections in the same way: see section 4.1. Or to the bibliography like
this: [1].
4.1 Subsection
More text.
4.1.1 Subsubsection
More text.
References
[1] Author, Title, Journal/Editor, (year)
[2] Fisher, IRIS Dataset, UCI Machine Learning Repository, (1936)
[3] Ben-Gal I., Bayesian Networks, Encyclopedia of Statistics in Quality and Reliability, Wiley and Sons, (2007)
[4] Bayesia SAS, BayesiaLab 5.0 Demo Version, (2010-2011)