Académique Documents
Professionnel Documents
Culture Documents
CIS 467
Yarmouk University
Department of Computer Information Systems
١
Algorithm for Decision Tree Induction
٢
Information Gain (ID3/C4.5)
p p n n
I ( p, n ) = − log 2 − log 2
p+n p+n p+n p+n
pi + ni
ν
E ( A) = ∑ I ( pi , ni )
i =1 p + n
z The encoding information that would be gained by
branching on A
Gain( A) = I ( p, n) − E ( A)
٣
Example 1 : Training Dataset
This
follows an
example
from
Quinlan’s
ID3
Hence
Gain(age) = I ( p, n) − E (age)
Similarly
age pi ni I(pi, ni) Gain(income) = 0.029
Youth 2 3 0.971 Gain( student ) = 0.151
Middle_Aged 4 0 0 Gain(credit _ rating ) = 0.048
Sinior 3 2 0.971
٤
Output: A Decision Tree for “buys_computer”
age?
Youth
overcast
Middle-aged Sinior
no yes no yes
٥
Extracting Classification Rules from Trees