
Fuzzy Decision Trees

Professor J. F. Baldwin

Classification and Prediction


For classification the universe for the target attribute is a discrete set.
For prediction the universe for the target attribute is continuous.
For prediction, use a fuzzy partition of the target universe:
[Figure: fuzzy partition {f1, f2, f3, f4, f5} of the target universe T, with interval endpoints a, b, c, d, e.]
Arrange the fuzzy sets so that there is an equal number of training data points in each of the intervals [a, b], [b, c], [c, d], [d, e].
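As a rough illustration (not from the original slides), the endpoints a, b, c, d, e can be taken as quantiles of the training target values, giving triangular fuzzy sets with an equal number of points in each interval. The helper names equal_frequency_partition and membership below are illustrative.

import numpy as np

def equal_frequency_partition(values, n_sets=5):
    """Return triangular fuzzy sets f1..fn whose peaks a, b, c, ... are
    quantiles of the training data, so each interval holds an equal
    number of points."""
    peaks = np.quantile(values, np.linspace(0.0, 1.0, n_sets))
    sets = []
    for i, peak in enumerate(peaks):
        left = peaks[max(i - 1, 0)]             # first set is flat below its peak
        right = peaks[min(i + 1, n_sets - 1)]   # last set is flat above its peak
        sets.append((float(left), float(peak), float(right)))
    return sets

def membership(x, fuzzy_set):
    """Membership of x in a triangular fuzzy set (left, peak, right)."""
    left, peak, right = fuzzy_set
    if x <= left:
        return 1.0 if peak == left else 0.0
    if x >= right:
        return 1.0 if peak == right else 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

With this construction the memberships of adjacent fuzzy sets sum to 1 between consecutive peaks, so any target value has at most two non-zero memberships, as assumed in the following slides.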

Target translation for prediction


Training set Tr (one row shown):

  A1    A2    ...   An    T     Pr
  a11   a12   ...   a1n   t1    p1

Translated training set Tr':

  A1    A2    ...   An    T      Pr
  a11   a12   ...   a1n   fi     p1·fi(t1)
  a11   a12   ...   a1n   fi+1   p1·fi+1(t1)

Here fi and fi+1 are the two target fuzzy sets with non-zero membership at t1.

This is now a classification set.

Repeat for each row of Tr, collecting equivalent rows and adding their probabilities.
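A minimal Python sketch of this translation step (not from the original slides), reusing the illustrative membership helper above; translate_targets is a hypothetical name.

from collections import defaultdict

def translate_targets(rows, target_sets):
    """Translate the prediction training set Tr into a classification set Tr'.
    rows:        list of (attrs, t, p) -- attribute tuple, continuous target, probability
    target_sets: list of (name, triangle) pairs for the target partition f1..fn
    Each row becomes one row per fuzzy set with non-zero membership,
    with probability p * fi(t); equivalent rows are merged by adding probabilities."""
    merged = defaultdict(float)
    for attrs, t, p in rows:
        for name, triangle in target_sets:
            m = membership(t, triangle)   # at most two of these are non-zero
            if m > 0.0:
                merged[(tuple(attrs), name)] += p * m
    return [(attrs, name, pr) for (attrs, name), pr in merged.items()]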

Preparing a one-attribute reduced database for a continuous attribute
Continuous attribute Ai. Use Tr' if prediction, Tr if classification.

Marginalise over all the other attributes:

Pr(Ai, T) = Σ_{A1} ... Σ_{Ai-1} Σ_{Ai+1} ... Σ_{An} Pr(A1, ..., An, T)

[Figure: fuzzy partition {g1, g2, g3, g4, g5} of the universe of Ai, with interval endpoints a, b, c, d, e placed so that there is an equal number of data points in each interval; the number of fuzzy sets is chosen by the user. The reduced database has columns Ai (taking the values gi) and Pr.]
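A small Python sketch of building this reduced database (not from the original slides); reduced_database is an illustrative name and the partition of Ai is the equal-frequency one sketched earlier.

from collections import defaultdict

def reduced_database(rows, i, attr_sets):
    """Build the one-attribute reduced database Pr(Ai, T).
    rows:      (attrs, target, p) taken from Tr' (prediction) or Tr (classification)
    i:         index of the continuous attribute Ai
    attr_sets: the partition {g1, ..., gk} of Ai as (name, triangle) pairs
    Summing over every other attribute happens implicitly: rows that differ
    only in those attributes land on the same (gi, target) key."""
    pr = defaultdict(float)
    for attrs, target, p in rows:
        for name, triangle in attr_sets:
            m = membership(attrs[i], triangle)   # fuzzify the raw value of Ai
            if m > 0.0:
                pr[(name, target)] += p * m
    return dict(pr)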

Fuzzy ID3
Using the training set Tr' and the one-attribute reduced databases for all continuous attributes, we can use the method of ID3 previously given to determine the decision tree for predicting or classifying the target, and also apply post pruning.
We modify the stopping condition. Do not expand node N if

S = -Σ_T Pr(T) ln{Pr(T)}

for that node is < some value v.
Node N has a probability distribution {ti : αi} over the target values.
You can also limit the depth of the tree to some value, for example expand the tree to depth 4.
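A minimal sketch of this modified stopping condition (not from the original slides); the threshold v = 0.1 is purely illustrative, while the depth limit of 4 matches the example above.

import math

def node_entropy(dist):
    """S = -sum_T Pr(T) ln Pr(T) over the node's target distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0.0)

def should_stop(dist, v=0.1, depth=0, max_depth=4):
    """Do not expand node N if its entropy falls below v,
    or if the depth limit has been reached."""
    return node_entropy(dist) < v or depth >= max_depth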

Evaluating a new case for classification
[Figure: fuzzy partition {g1, g2, ..., gn} of the universe of continuous attribute Ai.]
The value of a continuous attribute will have a probability distribution over {gi}, with only two non-zero probabilities.

A new case will propagate through many branches of the tree, arriving at leaf node Nj with probability λj, determined by multiplying the probabilities of all branches on the path to Nj.
Let the distribution at leaf node Nj be {ti : αij}, so the case supports ti through Nj with probability λj·αij.
The overall distribution is {ti : Σj λj αij}.
Decision: choose tk where Σj λj αkj = MAX_i {Σj λj αij}.
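A minimal Python sketch of this combination step (not from the original slides); classify, leaf_probs and leaf_dists are illustrative names.

def classify(leaf_probs, leaf_dists):
    """leaf_probs: {j: lambda_j}, the probability of reaching leaf Nj
                   (product of branch probabilities on the path to Nj)
       leaf_dists: {j: {t: alpha_ij}}, the target distribution at leaf Nj
       Returns the overall distribution over targets and the chosen class."""
    overall = {}
    for j, lam in leaf_probs.items():
        for t, alpha in leaf_dists[j].items():
            overall[t] = overall.get(t, 0.0) + lam * alpha
    chosen = max(overall, key=overall.get)   # tk with maximal overall probability
    return overall, chosen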

Evaluating a new case for prediction
The value of a continuous attribute will have a probability distribution over {gi}, with only two non-zero probabilities.


A new case will propagate through many branches of the tree, arriving at leaf node Nj with probability λj, determined by multiplying the probabilities of all branches on the path to Nj.
Let the distribution at leaf node Nj be {fi : αij}.
The overall distribution is {fi : Σj λj αij}.
Predicted value = Σi E(fi) Σj λj αij, where E(fi) is the expected (defuzzified) value of the fuzzy set fi.
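A minimal sketch of this defuzzification step (not from the original slides); predict and expected_value are illustrative names, and taking E(fi) as the peak of the triangular set is an assumption.

def predict(leaf_probs, leaf_dists, expected_value):
    """Same propagation as for classification, but the targets are fuzzy sets fi.
       expected_value: {fi: E(fi)}, a representative value for each target
       fuzzy set (e.g. the peak of its triangle).
       Returns sum_i E(fi) * overall probability of fi."""
    overall = {}
    for j, lam in leaf_probs.items():
        for f, alpha in leaf_dists[j].items():
            overall[f] = overall.get(f, 0.0) + lam * alpha
    return sum(expected_value[f] * p for f, p in overall.items())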

Fuzzy Sets important for Data Mining
Partition each universe with {small, large}.
[Figure: fuzzy partitions {small, large} on the income and outgoing universes, and a decision tree branching on OUTGOING and INCOME with leaf probabilities for profit, e.g. large profit: 0.874, small profit: 0.543, profit: 0.165.]

Two crisp sets on each universe can give at most only 50% accuracy.
We would require 16 crisp sets on each universe to give the same accuracy as a two-fuzzy-set partition.
Profit: 94.14% correct.

Ellipse Example
[Figure: the region inside an ellipse in the square [-1.5, 1.5]² is classed legal; the region outside is illegal.]

X, Y universes each partitioned into 5 fuzzy sets:
about_-1.5  = [-1.5:1, -0.75:0]
about_-0.75 = [-1.5:0, -0.75:1, 0:0]
about_0     = [-0.75:0, 0:1, 0.75:0]
about_0.75  = [0:0, 0.75:1, 1.5:0]
about_1.5   = [0.75:0, 1.5:1]

Tree learnt on 126 random points from [-1.5, 1.5]².
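A small Python sketch (not from the original slides) of reading these [x:μ, ...] definitions as piecewise-linear membership functions; fril_membership is an illustrative name.

def fril_membership(x, points):
    """Membership in a fuzzy set written as [x1:mu1, x2:mu2, ...],
    read as a piecewise-linear function, constant outside the listed range.
    points: list of (x, mu) pairs in increasing x order."""
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, m0), (x1, m1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return m0 + (m1 - m0) * (x - x0) / (x1 - x0)
    return 0.0

about_0 = [(-0.75, 0.0), (0.0, 1.0), (0.75, 0.0)]
print(fril_membership(0.3, about_0))   # 0.6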

Tree for Ellipse example


[Figure: fuzzy decision tree for the ellipse example. The root branches on the five fuzzy sets of X; each X branch then branches on the five fuzzy sets of Y, and each leaf carries supports for legal (L) and illegal (I), e.g. L:0 I:1 when X is about_1.5, and L:1 I:0 when X is about_0 and Y is about_0. The leaf supports are listed in full in the Fril rule below.]

General Fril Rule


((Classification = legal) if (
((X is about_-1.5))
((X is about_-0.75)& (Y is about_-1.5))
((X is about_-0.75) & (Y is about_-0.75))
((X is about_-0.75) & (Y is about_0))
((X is about_-0.75) & (Y is about_0.75))
((X is about_-0.75) & (Y is about_1.5))
((X is about_0) & (Y is about_-1.5))
((X is about_0) & (Y is about_-0.75))
((X is about_0) & (Y is about_0))
((X is about_0) & (Y is about_0.75))
((X is about_0) & (Y is about_1.5))
((X is about_0.75) & (Y is about_-1.5))
((X is about_0.75) & (Y is about_-0.75))
((X is about_0.75) & (Y is about_0))
((X is about_0.75) & (Y is about_0.75))
((X is about_0.75) & (Y is about_1.5))
((X is about_1.5)))) :

((0 0)(0.0092 0.0092)(0.3506 0.3506)
(0.5090 0.5090)(0.3455 0.3455)(0.0131 0.0131)
(0.1352 0.1352)(0.8131 0.8131)(1 1)
(0.8178 0.8178)(0.1327 0.1327)(0.0109 0.0109)
(0.3629 0.3629)(0.5090 0.5090)(0.3455 0.3455)
(0.0131 0.0131)(0 0))
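A rough Python sketch (not Fril, and not from the original slides) of how this rule could be read for a new point, reusing the illustrative fril_membership helper above: each body's probability is the product of the X and Y memberships, and the support-weighted sum mirrors the tree-evaluation procedure described earlier. The combination shown is an assumption about how the support pairs are used.

def evaluate_legal(x, y, bodies, supports, x_sets, y_sets):
    """bodies:   list of (x_set_name, y_set_name or None), one per rule body
       supports: list of (n, p) support pairs, one per body (here n == p)
       Returns Pr(Classification = legal) for the point (x, y)."""
    pr_legal = 0.0
    for (xname, yname), (n, _) in zip(bodies, supports):
        pr_body = fril_membership(x, x_sets[xname])
        if yname is not None:
            pr_body *= fril_membership(y, y_sets[yname])
        pr_legal += n * pr_body
    return pr_legal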

Results

The above tree was tested on 960 points forming a regular grid on [-1.5, 1.5]², giving 99.168% correct classification.
[Figure: the control surface for the positive quadrant.]

Iris Classification
Data
3 classes: Iris-Setosa, Iris-Versicolor and Iris-Virginica, with 50 instances of each class.
Attributes
1. sepal length in cm -- universe [4.3, 7.9]
2. sepal width in cm -- universe [2, 4.4]
3. petal length in cm -- universe [1, 6.9]
4. petal width in cm -- universe [0.1, 2.5]
Fuzzy partition of 5 fuzzy sets on each universe

Iris Decision tree


[Figure: fuzzy decision tree for the Iris data. The root branches on the fuzzy sets of attribute 4 (petal width), with further splits on attributes 3, 2 and 1; each leaf carries a probability distribution over (Setosa, Versicolor, Virginica), e.g. (1 0 0) for v_small4.]
Gives 98.667% accuracy on test data.

Diabetes in Pima Indians


Diabetes mellitus in the Pima Indian population living near Phoenix, Arizona. 5 fuzzy sets used for each attribute.
Data
768 females over 21 years: 384 training cases, 384 test cases. 2 classes: diabetic (d) and not diabetic (nd).
Attributes
1 Number of times pregnant
2 Plasma glucose concentration
3 Diastolic blood pressure
4 Triceps skin fold thickness
5 2-Hour serum insulin
6 Body mass index
7 Diabetes pedigree function
8 Age

The decision tree was generated to a maximum depth of 4, giving a tree of 161 branches.
This gave an accuracy of 81.25% on the training set and 79.9% on the test set.

With the forward pruning algorithm the tree complexity is halved to 80 branches. This reduced tree gives an accuracy of 80.46% on the training set and 78.38% on the test set.
Post pruning reduces the complexity to 28 branches, giving 78.125% on the training set and 78.9% on the test set.

Diabetes Tree
[Figure: fuzzy decision tree for the Pima Indian diabetes data, branching on fuzzy sets of attributes 8 (age), 7 (diabetes pedigree function), 2 (plasma glucose), 3 (blood pressure), 5 (serum insulin) and 6 (body mass index); each leaf carries a probability pair over (nd, d), e.g. (nd:0.99 d:0.01).]

Decision Tree for Pima Indian Problem

SIN XY Prediction Example


The database consists of 528 triples (X, Y, sin XY), where the pairs (X, Y) form a regular grid on [0, 3]².

X and Y universes each partitioned into 10 fuzzy sets:

about_0      = [0:1, 0.333333:0]
about_0.3333 = [0:0, 0.333333:1, 0.666667:0]
about_0.6667 = [0.333333:0, 0.666667:1, 1:0]
about_1      = [0.666667:0, 1:1, 1.33333:0]
about_1.333  = [1:0, 1.33333:1, 1.66667:0]
about_1.667  = [1.33333:0, 1.66667:1, 2:0]
about_2      = [1.66667:0, 2:1, 2.33333:0]
about_2.333  = [2:0, 2.33333:1, 2.66667:0]
about_2.6667 = [2.33333:0, 2.66667:1, 3:0]
about_3      = [2.66667:0, 3:1]

Target universe (sin XY) partitioned into 5 fuzzy sets:

class_1 = [-1:1, 0:0]
class_2 = [-1:0, 0:1, 0.380647:0]
class_3 = [0:0, 0.380647:1, 0.822602:0]
class_4 = [0.380647:0, 0.822602:1, 1:0]
class_5 = [0.822602:0, 1:1]

Fuzzy ID3 decision tree with 100 branches.
Percentage error of 4.22% on a regular test set of 1023 points.
[Figure: sin XY control surface.]
