
Fuzzy Decision Trees

Professor J. F. Baldwin

Classification and Prediction


For classification the universe for the target attribute is a discrete set.
For prediction the universe for the target attribute is continuous.
For prediction, use a fuzzy partition of the target universe:
[Figure: fuzzy partition {f1, f2, f3, f4, f5} of the target universe T, with interval endpoints a, b, c, d, e.]
Arrange the fuzzy sets so that there is an equal number of training data points in each of the intervals [a, b], [b, c], [c, d], [d, e].
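As a rough illustration (not from the original slides), the endpoints a, b, c, d, e can be taken as quantiles of the training target values, giving triangular fuzzy sets with an equal number of points in each interval. The helper names equal_frequency_partition and membership below are illustrative.

import numpy as np

def equal_frequency_partition(values, n_sets=5):
    """Return triangular fuzzy sets f1..fn whose peaks a, b, c, ... are
    quantiles of the training data, so each interval holds an equal
    number of points."""
    peaks = np.quantile(values, np.linspace(0.0, 1.0, n_sets))
    sets = []
    for i, peak in enumerate(peaks):
        left = peaks[max(i - 1, 0)]             # first set is flat below its peak
        right = peaks[min(i + 1, n_sets - 1)]   # last set is flat above its peak
        sets.append((float(left), float(peak), float(right)))
    return sets

def membership(x, fuzzy_set):
    """Membership of x in a triangular fuzzy set (left, peak, right)."""
    left, peak, right = fuzzy_set
    if x <= left:
        return 1.0 if peak == left else 0.0
    if x >= right:
        return 1.0 if peak == right else 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

With this construction the memberships of adjacent fuzzy sets sum to 1 between consecutive peaks, so any target value has at most two non-zero memberships, as assumed in the following slides.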

Target translation for prediction


Training set Tr (one row shown):

  A1    A2    ...   An    T     Pr
  a11   a12   ...   a1n   t1    p1

Translated training set Tr':

  A1    A2    ...   An    T      Pr
  a11   a12   ...   a1n   fi     p1·fi(t1)
  a11   a12   ...   a1n   fi+1   p1·fi+1(t1)

Here fi and fi+1 are the two target fuzzy sets with non-zero membership at t1.

This is now a classification set.

Repeat for each row of Tr, collecting equivalent rows and adding their probabilities.
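A minimal Python sketch of this translation step (not from the original slides), reusing the illustrative membership helper above; translate_targets is a hypothetical name.

from collections import defaultdict

def translate_targets(rows, target_sets):
    """Translate the prediction training set Tr into a classification set Tr'.
    rows:        list of (attrs, t, p) -- attribute tuple, continuous target, probability
    target_sets: list of (name, triangle) pairs for the target partition f1..fn
    Each row becomes one row per fuzzy set with non-zero membership,
    with probability p * fi(t); equivalent rows are merged by adding probabilities."""
    merged = defaultdict(float)
    for attrs, t, p in rows:
        for name, triangle in target_sets:
            m = membership(t, triangle)   # at most two of these are non-zero
            if m > 0.0:
                merged[(tuple(attrs), name)] += p * m
    return [(attrs, name, pr) for (attrs, name), pr in merged.items()]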

Preparing a one-attribute reduced database for a continuous attribute
Continuous attribute Ai. Use Tr' if prediction, Tr if classification.

Marginalise over all the other attributes:

Pr(Ai, T) = Σ_{A1} ... Σ_{Ai-1} Σ_{Ai+1} ... Σ_{An} Pr(A1, ..., An, T)

[Figure: fuzzy partition {g1, g2, g3, g4, g5} of the universe of Ai, with interval endpoints a, b, c, d, e placed so that there is an equal number of data points in each interval; the number of fuzzy sets is chosen by the user. The reduced database has columns Ai (taking the values gi) and Pr.]
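A small Python sketch of building this reduced database (not from the original slides); reduced_database is an illustrative name and the partition of Ai is the equal-frequency one sketched earlier.

from collections import defaultdict

def reduced_database(rows, i, attr_sets):
    """Build the one-attribute reduced database Pr(Ai, T).
    rows:      (attrs, target, p) taken from Tr' (prediction) or Tr (classification)
    i:         index of the continuous attribute Ai
    attr_sets: the partition {g1, ..., gk} of Ai as (name, triangle) pairs
    Summing over every other attribute happens implicitly: rows that differ
    only in those attributes land on the same (gi, target) key."""
    pr = defaultdict(float)
    for attrs, target, p in rows:
        for name, triangle in attr_sets:
            m = membership(attrs[i], triangle)   # fuzzify the raw value of Ai
            if m > 0.0:
                pr[(name, target)] += p * m
    return dict(pr)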

Fuzzy ID3
Using the training set Tr' and the one-attribute reduced databases for all continuous attributes, we can use the method of ID3 previously given to determine the decision tree for predicting or classifying the target, and also apply post pruning.
We modify the stopping condition. Do not expand node N if

S = -Σ_T Pr(T) ln{Pr(T)}

for that node is < some value v.
Node N has a probability distribution {ti : αi} over the target values.
You can also limit the depth of the tree to some value, for example expand the tree to depth 4.
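A minimal sketch of this modified stopping condition (not from the original slides); the threshold v = 0.1 is purely illustrative, while the depth limit of 4 matches the example above.

import math

def node_entropy(dist):
    """S = -sum_T Pr(T) ln Pr(T) over the node's target distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0.0)

def should_stop(dist, v=0.1, depth=0, max_depth=4):
    """Do not expand node N if its entropy falls below v,
    or if the depth limit has been reached."""
    return node_entropy(dist) < v or depth >= max_depth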

Evaluating a new case for classification
[Figure: fuzzy partition {g1, g2, ..., gn} of the universe of continuous attribute Ai.]
The value of a continuous attribute will have a probability distribution over {gi}, with only two non-zero probabilities.

A new case will propagate through many branches of the tree, arriving at leaf node Nj with probability λj, determined by multiplying the probabilities of all branches on the path to Nj.
Let the distribution at leaf node Nj be {ti : αij}, so the case supports ti through Nj with probability λj·αij.
The overall distribution is {ti : Σj λj αij}.
Decision: choose tk where Σj λj αkj = MAX_i {Σj λj αij}.
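A minimal Python sketch of this combination step (not from the original slides); classify, leaf_probs and leaf_dists are illustrative names.

def classify(leaf_probs, leaf_dists):
    """leaf_probs: {j: lambda_j}, the probability of reaching leaf Nj
                   (product of branch probabilities on the path to Nj)
       leaf_dists: {j: {t: alpha_ij}}, the target distribution at leaf Nj
       Returns the overall distribution over targets and the chosen class."""
    overall = {}
    for j, lam in leaf_probs.items():
        for t, alpha in leaf_dists[j].items():
            overall[t] = overall.get(t, 0.0) + lam * alpha
    chosen = max(overall, key=overall.get)   # tk with maximal overall probability
    return overall, chosen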

Evaluating a new case for prediction
The value of a continuous attribute will have a probability distribution over {gi}, with only two non-zero probabilities.


A new case will propagate through many branches of the tree, arriving at leaf node Nj with probability λj, determined by multiplying the probabilities of all branches on the path to Nj.
Let the distribution at leaf node Nj be {fi : αij}.
The overall distribution is {fi : Σj λj αij}.
Predicted value = Σi E(fi) Σj λj αij, where E(fi) is the expected (defuzzified) value of the fuzzy set fi.
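A minimal sketch of this defuzzification step (not from the original slides); predict and expected_value are illustrative names, and taking E(fi) as the peak of the triangular set is an assumption.

def predict(leaf_probs, leaf_dists, expected_value):
    """Same propagation as for classification, but the targets are fuzzy sets fi.
       expected_value: {fi: E(fi)}, a representative value for each target
       fuzzy set (e.g. the peak of its triangle).
       Returns sum_i E(fi) * overall probability of fi."""
    overall = {}
    for j, lam in leaf_probs.items():
        for f, alpha in leaf_dists[j].items():
            overall[f] = overall.get(f, 0.0) + lam * alpha
    return sum(expected_value[f] * p for f, p in overall.items())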

Fuzzy Sets important for Data Mining
Partition each universe with {small, large}.
[Figure: fuzzy partitions {small, large} on the income and outgoing universes, and a decision tree branching on OUTGOING and INCOME with leaf probabilities for profit, e.g. large profit: 0.874, small profit: 0.543, profit: 0.165.]

Two crisp sets on each universe can give at most only 50% accuracy.
We would require 16 crisp sets on each universe to give the same accuracy as a two-fuzzy-set partition.
Profit: 94.14% correct.

Ellipse Example
[Figure: the region inside an ellipse in the square [-1.5, 1.5]² is classed legal; the region outside is illegal.]

X, Y universes each partitioned into 5 fuzzy sets:
about_-1.5  = [-1.5:1, -0.75:0]
about_-0.75 = [-1.5:0, -0.75:1, 0:0]
about_0     = [-0.75:0, 0:1, 0.75:0]
about_0.75  = [0:0, 0.75:1, 1.5:0]
about_1.5   = [0.75:0, 1.5:1]

Tree learnt on 126 random points from [-1.5, 1.5]².
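A small Python sketch (not from the original slides) of reading these [x:μ, ...] definitions as piecewise-linear membership functions; fril_membership is an illustrative name.

def fril_membership(x, points):
    """Membership in a fuzzy set written as [x1:mu1, x2:mu2, ...],
    read as a piecewise-linear function, constant outside the listed range.
    points: list of (x, mu) pairs in increasing x order."""
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, m0), (x1, m1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return m0 + (m1 - m0) * (x - x0) / (x1 - x0)
    return 0.0

about_0 = [(-0.75, 0.0), (0.0, 1.0), (0.75, 0.0)]
print(fril_membership(0.3, about_0))   # 0.6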

Tree for Ellipse example


[Figure: fuzzy decision tree for the ellipse example. The root branches on the five fuzzy sets of X; each X branch then branches on the five fuzzy sets of Y, and each leaf carries supports for legal (L) and illegal (I), e.g. L:0 I:1 when X is about_1.5, and L:1 I:0 when X is about_0 and Y is about_0. The leaf supports are listed in full in the Fril rule below.]

General Fril Rule


((Classification = legal) if (
((X is about_-1.5))
((X is about_-0.75)& (Y is about_-1.5))
((X is about_-0.75) & (Y is about_-0.75))
((X is about_-0.75) & (Y is about_0))
((X is about_-0.75) & (Y is about_0.75))
((X is about_-0.75) & (Y is about_1.5))
((X is about_0) & (Y is about_-1.5))
((X is about_0) & (Y is about_-0.75))
((X is about_0) & (Y is about_0))
((X is about_0) & (Y is about_0.75))
((X is about_0) & (Y is about_1.5))
((X is about_0.75) & (Y is about_-1.5))
((X is about_0.75) & (Y is about_-0.75))
((X is about_0.75) & (Y is about_0))
((X is about_0.75) & (Y is about_0.75))
((X is about_0.75) & (Y is about_1.5))
((X is about_1.5)))) :

((0 0)(0.0092 0.0092)(0.3506 0.3506)
(0.5090 0.5090)(0.3455 0.3455)(0.0131 0.0131)
(0.1352 0.1352)(0.8131 0.8131)(1 1)
(0.8178 0.8178)(0.1327 0.1327)(0.0109 0.0109)
(0.3629 0.3629)(0.5090 0.5090)(0.3455 0.3455)
(0.0131 0.0131)(0 0))
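A rough Python sketch (not Fril, and not from the original slides) of how this rule could be read for a new point, reusing the illustrative fril_membership helper above: each body's probability is the product of the X and Y memberships, and the support-weighted sum mirrors the tree-evaluation procedure described earlier. The combination shown is an assumption about how the support pairs are used.

def evaluate_legal(x, y, bodies, supports, x_sets, y_sets):
    """bodies:   list of (x_set_name, y_set_name or None), one per rule body
       supports: list of (n, p) support pairs, one per body (here n == p)
       Returns Pr(Classification = legal) for the point (x, y)."""
    pr_legal = 0.0
    for (xname, yname), (n, _) in zip(bodies, supports):
        pr_body = fril_membership(x, x_sets[xname])
        if yname is not None:
            pr_body *= fril_membership(y, y_sets[yname])
        pr_legal += n * pr_body
    return pr_legal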

Results

The above tree was tested on 960 points forming a regular grid on [-1.5, 1.5]², giving 99.168% correct classification.
[Figure: the control surface for the positive quadrant.]

Iris Classification
Data
3 classes: Iris-Setosa, Iris-Versicolor and Iris-Virginica, with 50 instances of each class.
Attributes
1. sepal length in cm -- universe [4.3, 7.9]
2. sepal width in cm -- universe [2, 4.4]
3. petal length in cm -- universe [1, 6.9]
4. petal width in cm -- universe [0.1, 2.5]
Fuzzy partition of 5 fuzzy sets on each universe

Iris Decision tree


[Figure: fuzzy decision tree for the Iris data. The root branches on the fuzzy sets of attribute 4 (petal width), with further splits on attributes 3, 2 and 1; each leaf carries a probability distribution over (Setosa, Versicolor, Virginica), e.g. (1 0 0) for v_small4.]
Gives 98.667% accuracy on test data.

Diabetes in Pima Indians


Diabetes mellitus in the Pima Indian population living near Phoenix, Arizona. 5 fuzzy sets used for each attribute.
Data
768 females over 21 years: 384 training cases, 384 test cases. 2 classes: diabetic (d) and not diabetic (nd).
Attributes
1 Number of times pregnant
2 Plasma glucose concentration
3 Diastolic blood pressure
4 Triceps skin fold thickness
5 2-Hour serum insulin
6 Body mass index
7 Diabetes pedigree function
8 Age

The decision tree was generated to a maximum depth of 4, giving a tree of 161 branches.
This gave an accuracy of 81.25% on the training set and 79.9% on the test set.

With the forward pruning algorithm the tree complexity is halved to 80 branches. This reduced tree gives an accuracy of 80.46% on the training set and 78.38% on the test set.
Post pruning reduces the complexity to 28 branches, giving 78.125% on the training set and 78.9% on the test set.

Diabetes Tree
[Figure: fuzzy decision tree for the Pima Indian diabetes data, branching on fuzzy sets of attributes 8 (age), 7 (diabetes pedigree function), 2 (plasma glucose), 3 (blood pressure), 5 (serum insulin) and 6 (body mass index); each leaf carries a probability pair over (nd, d), e.g. (nd:0.99 d:0.01).]

Decision Tree for Pima Indian Problem

SIN XY Prediction Example


The database consists of 528 triples (X, Y, sin XY), where the pairs (X, Y) form a regular grid on [0, 3]².

X and Y universes each partitioned into 10 fuzzy sets:

about_0      = [0:1, 0.333333:0]
about_0.3333 = [0:0, 0.333333:1, 0.666667:0]
about_0.6667 = [0.333333:0, 0.666667:1, 1:0]
about_1      = [0.666667:0, 1:1, 1.33333:0]
about_1.333  = [1:0, 1.33333:1, 1.66667:0]
about_1.667  = [1.33333:0, 1.66667:1, 2:0]
about_2      = [1.66667:0, 2:1, 2.33333:0]
about_2.333  = [2:0, 2.33333:1, 2.66667:0]
about_2.6667 = [2.33333:0, 2.66667:1, 3:0]
about_3      = [2.66667:0, 3:1]

Target universe (sin XY) partitioned into 5 fuzzy sets:

class_1 = [-1:1, 0:0]
class_2 = [-1:0, 0:1, 0.380647:0]
class_3 = [0:0, 0.380647:1, 0.822602:0]
class_4 = [0.380647:0, 0.822602:1, 1:0]
class_5 = [0.822602:0, 1:1]

Fuzzy ID3 decision tree with 100 branches.
Percentage error of 4.22% on a regular test set of 1023 points.
[Figure: sin XY control surface.]
