ICSAP '10

Classification Rule Construction Using Particle Swarm Optimization Algorithm for Breast Cancer Data Sets

INTRODUCTION
The results show that Particle Swarm Optimization (PSO) is very competitive in terms of accuracy.
PSO produces significantly simpler (smaller) rule sets.

GENETIC ALGORITHM (1/2)
Genetic algorithms (GA) are a particular class of evolutionary algorithms (a branch of evolutionary computation) that use techniques inspired by evolutionary biology, such as inheritance, mutation, selection, and crossover.
GA involves three phases: Initialization, Operation, Termination.

GENETIC ALGORITHM (2/2)
The different Particle Swarm Data Mining Algorithms were implemented and tested against a GA and a Tree Induction Algorithm.
First, from the obtained results, PSO proved to be a suitable candidate for classification tasks.
Second, PSO is competitive, and not only with other evolutionary techniques [1].

PITTSBURGH LEARNT FUZZY RULE BASE FOR FEATURE SUBSET SELECTION (1/4)
This module investigates feature subset selection as a pre-processing step for a method that learns fuzzy rule bases using a GA implementing the Pittsburgh approach [5,6].
The learning solution found by the GA can be improved by applying PSO and the Pittsburgh fuzzy rule algorithm to the GA's mutation procedure in order to obtain an optimal solution.

PITTSBURGH LEARNT FUZZY RULE BASE FOR FEATURE SUBSET SELECTION (2/4)
STEP 1: Choose the initial population.
STEP 2: Define the generation.
STEP 3: Evaluate the fitness of each individual in the population.
STEP 4: Assign a mutation threshold value (5%).
STEP 5: Breed a new generation through crossover.
STEP 6: In the uniform crossover scheme (UX), individual bits in the string are compared between two parents. The bits are swapped with a fixed probability, typically 0.5.
STEP 7: Repeat until termination.
STEP 8: Select the best-ranking individuals and the worst-ranking individuals.
STEP 9: Repeat the algorithm for the collection of data sets.
STEP 10: Eliminate the worst-case individuals.
STEP 11: Remove the irrelevant and redundant data.
STEP 12: Construct rules from the new data sets.
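
The GA steps listed above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each individual is a bit string in which bit i marks whether feature i is kept, it treats the 5% mutation threshold as a per-bit flip rate, and it keeps the better half of each generation while dropping the worse half. The names `uniform_crossover`, `mutate`, and `evolve`, the population size, and the caller-supplied `fitness` function are all hypothetical. The swap probability is left to the caller.

```python
import random


def uniform_crossover(parent_a, parent_b, swap_prob, rng):
    """Uniform crossover (UX): swap each bit position with a fixed probability."""
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(len(child_a)):
        if rng.random() < swap_prob:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(fitness, n_bits, swap_prob, mutation_rate=0.05,
           pop_size=20, generations=30, seed=0):
    """Evolve bit-string feature masks; higher fitness is assumed to be better."""
    rng = random.Random(seed)
    # STEP 1: choose the initial population at random.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # STEP 3 and STEP 8: evaluate fitness and rank the individuals.
        ranked = sorted(population, key=fitness, reverse=True)
        # STEP 10: eliminate the worst-ranking half (an assumed cut-off).
        survivors = ranked[:pop_size // 2]
        # STEP 5 and STEP 6: breed children by uniform crossover, then mutate.
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            for child in uniform_crossover(a, b, swap_prob, rng):
                children.append(mutate(child, mutation_rate, rng))
        population = (survivors + children)[:pop_size]
    return max(population, key=fitness)
```

Because the survivors are carried over unchanged, the best fitness in the population never decreases from one generation to the next. In a feature selection setting, `fitness` would score a mask by the quality of the rule base learnt from the selected features; that scoring is outside the scope of this sketch.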

PITTSBURGH LEARNT FUZZY RULE BASE FOR FEATURE SUBSET SELECTION (3/4)
The Pittsburgh approach uses a GA to generate a suitable fuzzy rule base that correctly classifies the data set.
In the Pittsburgh approach, the entire fuzzy rule base is encoded in each chromosome.
This implies that determining the fitness of a chromosome requires evaluating an entire version of the rule base.

PITTSBURGH LEARNT FUZZY RULE BASE FOR FEATURE SUBSET SELECTION (4/4)
The learning process:
Uses a GA for evolving a set of fuzzy rules.
Uses a GA for excluding the unnecessary and redundant rules created during the first step.
Each feature is associated with a number of fuzzy sets that represent the linguistic values the feature can assume.

DEVELOPING CLASSIFICATION RULES USING PARTICLE SWARM OPTIMIZATION ALGORITHM (1/3)
In classification, the knowledge or patterns discovered in the data set can be represented as a set of rules of the form:
IF <attrib = value> AND ... AND <attrib = value> THEN <class>
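
As a minimal sketch of this rule form, a rule can be held as a mapping from attributes to required values plus a predicted class. The `Rule` class, the `classify` helper, and the attribute names and class labels used below are hypothetical and chosen only for illustration; they are not taken from the slides or from any particular data set.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Rule:
    """One rule: IF <attrib = value> AND ... AND <attrib = value> THEN <class>."""
    conditions: Dict[str, str]  # attribute name -> required value
    label: str                  # class predicted when every condition holds

    def covers(self, record: Dict[str, str]) -> bool:
        # A rule fires only when every condition matches the record.
        return all(record.get(attr) == val
                   for attr, val in self.conditions.items())

    def __str__(self) -> str:
        body = " AND ".join(f"{a} = {v}" for a, v in self.conditions.items())
        return f"IF {body} THEN {self.label}"


def classify(rules: List[Rule], record: Dict[str, str], default: str) -> str:
    # Return the class of the first rule that covers the record,
    # falling back to a default class when no rule fires.
    for rule in rules:
        if rule.covers(record):
            return rule.label
    return default


# Hypothetical attribute names and values, for illustration only.
rule = Rule({"clump_thickness": "high", "cell_size": "large"}, "malignant")
print(rule)  # IF clump_thickness = high AND cell_size = large THEN malignant
```

Using the first matching rule with a default class is one common way to apply an ordered rule set; other conflict resolution schemes are possible.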
The knowledge representation in the form of rules has the advantage of being intuitively comprehensible to the user.

DEVELOPING CLASSIFICATION RULES USING PARTICLE SWARM OPTIMIZATION ALGORITHM (2/3)
The PSO algorithm steps:
Evaluate the fitness of each particle.
Update the individual and global best fitnesses and positions.
Update the velocity and position of each particle.

DEVELOPING CLASSIFICATION RULES USING PARTICLE SWARM OPTIMIZATION ALGORITHM (3/3)
For each particle
    Initialize particle
End

Do
    For each particle
        Calculate fitness value
        If the fitness value is better than the best fitness value (pBest) in history
            Set current value as the new pBest
    End

    Choose the particle with the best fitness value of all the particles as the gBest
    For each particle
        Calculate particle velocity
        Update particle position
    End
While maximum iterations or minimum error criteria is not attained

CONCLUSION
Learning fuzzy rule bases with the Pittsburgh approach benefits from pre-processing the data by selecting the relevant subset of features.
The proposed work can help produce a smaller fuzzy rule base with higher accuracy.
The method is applied to Breast Cancer data sets.
From the results of classification with the PSO algorithm, we conclude that a larger maximum number of iterations provides better success rates.

REFERENCES
[1] Tiago Sousa, Arlindo Silva, Ana Neves, "Particle swarm based data mining algorithms for classification tasks," Parallel Computing 30 (2004) 767-783.
[2] Pablo A. D. de Castro, Daniel M. Santoro, Heloisa A. Camargo, Maria C. Nicoletti, "Improving a Pittsburgh learnt fuzzy rule base using feature subset selection," Proceedings of the Fourth International Conference on Hybrid Intelligent Systems, HIS'04.
[3] Hisao Ishibuchi, Tomoharu Nakashima, Tetsuya Kuroda, "A hybrid fuzzy genetics-based machine learning algorithm: hybridization of Michigan approach and Pittsburgh approach," Pattern Recognition, vol. 37, no. 6, 2005, pp. 1287-1298.
[4] Pablo A. D. Castro, Heloisa A. Camargo, "Learning and optimization of fuzzy rule base by means of self-adaptive genetic algorithm," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, Nov 2007, pp. 607-612.
[5] Kosuke Yamamoto, Hiroharu Kawanaka, Tomohiro Yoshikawa, Tsuyoshi Shinogi, Shinji Tsuruoka, "The effects of inactivation of rules for knowledge acquisition," Proceedings of the 2001 IEEE International Conference on Systems, Man and Cybernetics, vol. 3, 2001, pp. 1612-1617.
[6] Jaume Bacardit, Natalio Krasnogor, "Smart crossover operator with multiple parents for a Pittsburgh learning classifier system," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, Nov 1993, pp. 1148-1160.
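
The PSO loop outlined in the "DEVELOPING CLASSIFICATION RULES" slides can be sketched as below. This is a hedged illustration rather than the authors' code. The slides do not give the velocity formula, so the sketch swaps in the standard inertia-weight PSO update, and the parameter values (`w`, `c1`, `c2`), the search bounds, and the function name `pso` are assumptions. It also assumes that a lower fitness value is better and that particles are real-valued vectors; a rule-mining variant would need an encoding of rules as particles and a rule-quality fitness, neither of which is specified here.

```python
import random


def pso(fitness, dim, n_particles=20, max_iter=100, min_error=1e-6,
        w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimise `fitness` over `dim` real values (all defaults are assumptions)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # For each particle: initialize particle.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest_pos = [p[:] for p in pos]
    pbest_fit = [float("inf")] * n_particles
    gbest_pos, gbest_fit = pos[0][:], float("inf")

    for _ in range(max_iter):
        # For each particle: calculate fitness and update pBest if improved.
        for i in range(n_particles):
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest_fit[i] = f
                pbest_pos[i] = pos[i][:]
        # Choose the particle with the best fitness of all as gBest.
        best = min(range(n_particles), key=pbest_fit.__getitem__)
        if pbest_fit[best] < gbest_fit:
            gbest_fit = pbest_fit[best]
            gbest_pos = pbest_pos[best][:]
        # Stop early once the minimum error criterion is attained.
        if gbest_fit <= min_error:
            break
        # For each particle: calculate velocity, then update position.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Keep the particle inside the assumed search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
    return gbest_pos, gbest_fit
```

For a quick check, the function can be run on a simple test objective such as `lambda x: sum(v * v for v in x)`. Since gBest is only replaced by a strictly better value, the returned fitness never gets worse as `max_iter` grows for a fixed seed.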