
Decision Tree Algorithm

Muhammad Monjurul Islam 11-95100-3 Jafiul Mahmud 10-95000-3

ID3 Decision Tree Algorithm


Iterative Dichotomiser 3 (ID3) builds a decision tree from a fixed set of examples, and the resulting tree is used to classify future samples. Each example has several attributes and belongs to a class (such as Yes or No). Here we will look at the attribute selection procedure used in the ID3 algorithm.

ID3 Decision Tree Algorithm (cont)


The attribute selection section first covers basic information about the data set, then discusses entropy and information gain, and uses a few examples to show how to calculate entropy and information gain from the example data.

Attribute selection
Attribute selection is the fundamental step in constructing a decision tree. Two terms, Entropy and Information Gain, are used for attribute selection: using them, the ID3 algorithm decides which attribute becomes the next node of the decision tree, and so on recursively. Before diving deeper, we need to introduce a few terms used in the attribute selection process.
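For orientation, the recursion can be sketched as below. This is a minimal illustrative sketch in Python, assuming the data set is given as a list of dicts; the helper names entropy, info_gain and id3, and the tiny example rows, are ours and not part of the slides.

```python
import math
from collections import Counter

def entropy(rows, target):
    # Entropy(S) = sum over classes I of -p(I) * log2 p(I)
    counts = Counter(r[target] for r in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def info_gain(rows, attr, target):
    # Gain(S, A) = Entropy(S) - sum over values v of (|Sv|/|S|) * Entropy(Sv)
    total = len(rows)
    remainder = 0.0
    for v in set(r[attr] for r in rows):
        subset = [r for r in rows if r[attr] == v]
        remainder += (len(subset) / total) * entropy(subset, target)
    return entropy(rows, target) - remainder

def id3(rows, attributes, target):
    classes = set(r[target] for r in rows)
    if len(classes) == 1:                 # all records share one class: leaf node
        return classes.pop()
    if not attributes:                    # no attributes left: take the majority class
        return Counter(r[target] for r in rows).most_common(1)[0][0]
    # Pick the attribute with the highest information gain to become the next node.
    best = max(attributes, key=lambda a: info_gain(rows, a, target))
    tree = {best: {}}
    for v in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == v]
        rest = [a for a in attributes if a != best]
        tree[best][v] = id3(subset, rest, target)
    return tree

# Tiny made-up rows in the spirit of the Play ball table, just to show the call.
rows = [
    {"Wind": "Weak", "Play ball": "Yes"},
    {"Wind": "Strong", "Play ball": "No"},
    {"Wind": "Weak", "Play ball": "Yes"},
]
print(id3(rows, ["Wind"], "Play ball"))   # e.g. {'Wind': {'Weak': 'Yes', 'Strong': 'No'}}
```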

Attribute selection (cont.)

Attributes

In the example table (fig 1), Day, Outlook, Temperature, Humidity, Wind and Play ball are the attributes.

Class(C) or Classifier
Among these attributes, Play ball is referred to as the Class (C) or classifier: based on Outlook, Temperature, Humidity and Wind we need to decide whether we can play ball or not, and that is why Play ball is the classifier used to make the decision.

Collection (S)

All the records in the table together are referred to as the Collection (S).

Entropy Calculation
Entropy is a measure from information theory. Here we will see how to calculate the entropy of a given set of data; the test data used here is the fig 1 data.

Entropy Calculation (cont.)


Entropy(S) = Σ -p(I)log2p(I), where the summation runs over every class I. p(I) refers to the proportion of S belonging to class I. In the example table we have two classes {No, Yes} with counts {5, 9}, and the collection size is |S| = 14. So p(I) over C for the entire collection is No (5/14) and Yes (9/14).

Entropy Calculation (cont.)


log2p(I) refers to log2(5/14) and log2(9/14) over C. Σ is taken over C, i.e. it is the summation of the terms for all the classifier values. In this case it is the sum of the No and Yes terms: Entropy(S) = -p(No)log2p(No) - p(Yes)log2p(Yes).

Entropy Calculation (cont.)


Entropy(S) = Σ -p(I)log2p(I)
=> Entropy(S) = -p(No)log2p(No) - p(Yes)log2p(Yes)
=> Entropy(S) = ( -(5/14)log2(5/14) ) + ( -(9/14)log2(9/14) )
=> Entropy(S) = ( -0.35714 x -1.48543 ) + ( -0.64286 x -0.63743 )
=> Entropy(S) = 0.53051 + 0.40977 = 0.940

So the Entropy of S is 0.940
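As a quick check of this arithmetic, here is a minimal Python sketch; the entropy helper and the class-count list are ours, assuming the 5 No / 9 Yes counts from the example table.

```python
import math

def entropy(counts):
    # Entropy = sum over classes of -p(I) * log2 p(I); empty classes contribute 0.
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(round(entropy([5, 9]), 3))  # 0.94  (5 "No" and 9 "Yes" out of 14 records)
```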

Entropy Calculation (cont.)


Entropy is 0 if all members of S belong to the same class (the data is perfectly classified). For a two-class problem, the range of entropy is 0 ("perfectly classified") to 1 ("totally random").
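For instance, with the same 14 records (a quick check, not from the slides): if all 14 belong to one class, Entropy = -(14/14)log2(14/14) = 0; if the split is 7 and 7, Entropy = -(7/14)log2(7/14) - (7/14)log2(7/14) = 0.5 + 0.5 = 1.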

Information Gain G(S,A)


Information gain is the measure used to select a particular attribute to be a decision node of the decision tree. Information gain is written G(S, A), where S is the collection of data in the data set and A is the attribute for which information gain will be calculated over the collection S. So Gain(S, Wind) refers to the gain of Wind over S.

Information Gain G(S,A) (cont.)


Gain(S, A) = Entropy(S) - Σ ( ( |Sv|/|S| ) x Entropy(Sv) )
Where S is the total collection of records, A is the attribute for which gain will be calculated, and v ranges over all possible values of attribute A; for instance, if we consider the Wind attribute, then the set of values v is { Weak, Strong }.

Information Gain G(S,A) (cont.)


Sv is the subset of S having value v, and |Sv| is its number of elements; for instance |Sweak| = 8 and |Sstrong| = 6. Σ is the summation of ( ( |Sv|/|S| ) x Entropy(Sv) ) over all the values in the set of v, i.e. ( ( |Sweak|/|S| ) x Entropy(Sweak) ) + ( ( |Sstrong|/|S| ) x Entropy(Sstrong) ).

Information Gain G(S,A) (cont.)


If we want to calculate the information gain of Wind over the collection S using the following formula,

Gain(S, A) = Entropy(S) - Σ ( ( |Sv|/|S| ) x Entropy(Sv) )

then it will be as below:
=> Gain(S, Wind) = Entropy(S) - ( ( |Sweak|/|S| ) x Entropy(Sweak) ) - ( ( |Sstrong|/|S| ) x Entropy(Sstrong) )

Information Gain G(S,A) (cont.)


=> Gain(S, Wind) = 0.940 - ( (8/14) x Entropy(Sweak) ) - ( (6/14) x Entropy(Sstrong) )
Where
Entropy(Sweak) = Σ -p(I)log2p(I) = ( -(2/8) x log2(2/8) ) + ( -(6/8) x log2(6/8) ) = 0.811
Entropy(Sstrong) = Σ -p(I)log2p(I) = ( -(3/6) x log2(3/6) ) + ( -(3/6) x log2(3/6) ) = 1.000

Information Gain G(S,A) (cont.)


So Gain(S, Wind) will be as below:
=> Gain(S, Wind) = 0.940 - ( (8/14) x Entropy(Sweak) ) - ( (6/14) x Entropy(Sstrong) )
=> Gain(S, Wind) = 0.940 - ( (8/14) x 0.811 ) - ( (6/14) x 1.000 )
=> Gain(S, Wind) = 0.940 - 0.463 - 0.429 = 0.048
So the information gain of Wind over S is 0.048.
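The same calculation as a short Python sketch; the helper names and count lists are ours, while the per-value class counts (Weak: 2 No / 6 Yes, Strong: 3 No / 3 Yes) come from the slides.

```python
import math

def entropy(counts):
    # Entropy = sum over classes of -p(I) * log2 p(I); empty classes contribute 0.
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def gain(total_counts, subsets):
    # Gain = Entropy(S) - sum over values v of (|Sv|/|S|) * Entropy(Sv)
    total = sum(total_counts)
    remainder = sum((sum(sub) / total) * entropy(sub) for sub in subsets)
    return entropy(total_counts) - remainder

# S has 5 No / 9 Yes; Wind splits it into Sweak (2 No / 6 Yes) and Sstrong (3 No / 3 Yes).
print(round(gain([5, 9], [[2, 6], [3, 3]]), 3))  # 0.048
```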
