
Artificial Neural Network Weights Optimization Using ICA, GA, ICA-GA and R-ICA-GA: Comparing Performances

Vahid Khorani
Islamic Azad University, Qazvin Branch, Qazvin, Iran
Kh.v.1993@gmail.com

Nafiseh Forouzideh
University of Tehran, Kish International Campus, Kish Island, Iran
n.forouzideh@gmail.com

Ali Motie Nasrabadi
Shahed University, Biomedical Engineering Department, Tehran, Iran
nasrabadi@shahed.ac.ir

Abstract: Artificial neural networks (ANNs) and evolutionary algorithms are two relatively young research areas that have attracted steadily growing interest in the past years. This paper examines the use of different evolutionary algorithms, the imperialist competitive algorithm (ICA), the genetic algorithm (GA), ICA-GA and the recursive ICA-GA (R-ICA-GA), to train a multilayer perceptron (MLP) neural network on classification problems. All of the named evolutionary training algorithms are compared with each other in this paper. The first goal of the paper is to apply the new evolutionary optimization algorithms ICA-GA and R-ICA-GA to training the ANN, and the second goal is to compare the different evolutionary algorithms. It is shown that ICA-GA has the best performance, in terms of the number of epochs, compared to the other algorithms. For this purpose, the learning algorithms are applied to six well-known classification datasets (WINE, PIMA, WDBC, IRIS, SONAR and GLASS).

Keywords: ANN, optimization, hybrid evolutionary algorithms, ICA, GA, ICA-GA, R-ICA-GA

I. INTRODUCTION

Artificial neural networks are successful tools used in many problems such as pattern recognition, classification and regression [1]. Neural network learning is, in general, a nonlinear minimization problem with many local minima [2], and its outcome depends on the network weights, the architecture (including the number of hidden layers, the number of hidden neurons and the node transfer functions) and the learning rules [3]. Because of the importance of this problem, many methods such as back propagation [4], pruning algorithms [5], simulated annealing [6], particle swarm optimization (PSO) [7], GA [8] and ICA [9] have been used to determine the ANN parameters. Because of the complexity of ANN research, further work in this field is still necessary [10]. Many learning algorithms find their roots in function minimization algorithms, which can be classified into local minimization and global minimization. Local minimization algorithms, such as gradient-descent algorithms, are fast but usually converge to local minima. In contrast, global minimization algorithms use heuristic strategies to help escape from local minima [2]. Evolutionary optimization algorithms such as GA and ICA are global minimization algorithms that have been used to train neural networks [9]. The optimization method used to determine the weight adjustments has a large influence on the performance of a neural network [11].

In this paper, two new hybrid evolutionary algorithms, ICA-GA and R-ICA-GA, are tested and compared to other evolutionary algorithms on classification problems. The results of the four algorithms ICA, GA, ICA-GA and R-ICA-GA are compared to show which algorithm has the better performance in training the ANN.

This paper is organized as follows: Section II provides a brief description of the evolutionary algorithms ICA, GA, ICA-GA and R-ICA-GA. Section III explains why more evolutionary algorithms need to be tested for training ANNs. Section IV describes the research data and experiments. Section V compares the test results of the different algorithms, and Section VI discusses the conclusions and future research issues.

II. INTRODUCTION OF LEARNING ALGORITHMS

A. ICA

ICA optimizes the objective function using the idea of imperialistic competition. The algorithm uses the assimilation policy that the imperialistic countries adopted after the 19th century. Under this policy the imperialists try to improve the economy, culture and political situation of their colonies, which draws the colonies toward the imperialists. In this theory, an imperialist together with its colonies is called an empire, and the power of an empire depends on the power of its imperialist and its colonies. Through imperialistic competition, the weaker imperialists lose their colonies, and these colonies join more powerful empires for greater support. After a while, the weaker empires lose all their colonies and their imperialists turn into colonies of the other empires; in the end, all the weak empires collapse and only one powerful empire remains.


Fig. 1 depicts the flowchart of the ICA [12]. To implement this algorithm, an initial population is created randomly. Each member of this population is a vector of random numbers called a country. The cost of the objective function is calculated for each of these countries and indicates its power. The countries with the most optimized costs are selected as imperialists, and the other countries become colonies which are divided among the imperialists. An imperialist with its relevant colonies is called an empire, and the cost of each empire determines its power.
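As an illustration of this initialization step, the following minimal Python sketch (our assumption, not the authors' original code; the function name, the search range and the parameter values are hypothetical) creates random countries, ranks them by cost and splits them into imperialists and colonies:

import numpy as np

def create_empires(cost_fn, dim, n_countries=50, n_imperialists=8, seed=None):
    # Randomly create countries, rank them by cost and split them into
    # imperialists (the best countries) and colonies (the rest).
    rng = np.random.default_rng(seed)
    countries = rng.uniform(-1.0, 1.0, size=(n_countries, dim))  # each row is a country
    costs = np.array([cost_fn(c) for c in countries])            # lower cost = more power
    order = np.argsort(costs)
    imperialists = countries[order[:n_imperialists]]
    colonies = countries[order[n_imperialists:]]
    # the normalized power of each imperialist decides how many colonies it receives
    imp_costs = costs[order[:n_imperialists]]
    power = (imp_costs.max() - imp_costs) + 1e-12
    n_colonies = np.floor(power / power.sum() * len(colonies)).astype(int)
    n_colonies[0] += len(colonies) - n_colonies.sum()            # remainder to the strongest empire
    return imperialists, colonies, n_colonies

# example use: countries are candidate weight vectors, cost_fn could be the network MSE
# imps, cols, n_cols = create_empires(lambda w: float(np.sum(w ** 2)), dim=10)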

Figure 1. Flowchart of ICA

Equation (1) shows how the power (total cost) of an empire can be calculated:

T.C._n = Cost(imperialist_n) + ξ · mean{Cost(colonies of empire_n)}      (1)

where T.C._n is the total cost of the nth empire, ξ is a positive number which is considered to be less than 1, and colonies of empire_n denotes the colonies of the nth empire.

The colonies move toward their imperialists in each iteration. Fig. 2 depicts how the colonies move toward their relevant imperialists. In this figure, x is a random variable with uniform distribution between 0 and β×d. Equation (2) shows how x is calculated:

x ~ U(0, β × d)      (2)

where β is a number greater than 1 and d is the distance between the colony and the imperialist. To search different points around the imperialist, a random amount of deviation is added to the direction of movement, which is shown by θ in Fig. 2. In this figure, θ is a random number with uniform distribution between −γ and +γ. Equation (3) shows how θ is calculated:

θ ~ U(−γ, +γ)      (3)

where γ is a number that adjusts the deviation from the original direction. In this method, a colony exchanges its position with its imperialist when it reaches a more optimized cost than the imperialist. Fig. 3 depicts how the positions are exchanged.

Figure 2. Moving colonies toward their relevant imperialists

Figure 3. Exchanging positions of the imperialists and a colony
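A minimal sketch of the assimilation move described by (2) and (3), together with the position exchange of Fig. 3, is given below. The default values of beta and gamma and the way the angular deviation is applied in an n-dimensional space are our assumptions; the paper only states that β > 1 and that θ lies between −γ and +γ.

import numpy as np

def assimilate(colony, imperialist, beta=2.0, gamma=np.pi / 4, seed=None):
    # Move a colony toward its imperialist by a random step x, Eq. (2),
    # with a random angular deviation theta, Eq. (3).
    rng = np.random.default_rng(seed)
    direction = imperialist - colony
    d = np.linalg.norm(direction)              # distance between colony and imperialist
    x = rng.uniform(0.0, beta * d)             # step length, Eq. (2)
    theta = rng.uniform(-gamma, gamma)         # deviation angle, Eq. (3)
    unit = direction / (d + 1e-12)
    # apply the deviation in a random plane containing the original direction
    # (a simple approximation of the angular deviation in n dimensions)
    noise = rng.normal(size=colony.shape)
    noise -= noise.dot(unit) * unit            # keep only the orthogonal component
    noise /= np.linalg.norm(noise) + 1e-12
    deviated = np.cos(theta) * unit + np.sin(theta) * noise
    return colony + x * deviated

def maybe_swap(colony, imperialist, cost_fn):
    # Exchange positions if the colony has become better than its imperialist (Fig. 3).
    if cost_fn(colony) < cost_fn(imperialist):
        return imperialist.copy(), colony.copy()   # the colony becomes the new imperialist
    return colony, imperialist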


The next step is imperialistic competition. All empires try to take possession of the colonies of the other empires and control them. In this model, the most powerful empire takes possession of a colony of the weakest empire. Fig. 4 depicts how this can be modeled.
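The following sketch models one competition step exactly as simplified above (the weakest empire loses its weakest colony to the most powerful empire, and an empire with no colonies left collapses). The dictionary representation of an empire and the cost callables are hypothetical choices of ours.

def imperialistic_competition(empires, total_cost_fn, colony_cost_fn):
    # One competition step (Fig. 4): the weakest empire loses its weakest colony
    # to the most powerful empire; a powerless empire collapses.
    # `empires` is a list of dicts with keys 'imperialist' and 'colonies'.
    if len(empires) < 2:
        return empires
    total_costs = [total_cost_fn(e) for e in empires]           # Eq. (1) for every empire
    weakest = max(range(len(empires)), key=total_costs.__getitem__)
    strongest = min(range(len(empires)), key=total_costs.__getitem__)
    if weakest == strongest:
        return empires
    loser = empires[weakest]
    if loser['colonies']:
        idx = max(range(len(loser['colonies'])),
                  key=lambda i: colony_cost_fn(loser['colonies'][i]))
        empires[strongest]['colonies'].append(loser['colonies'].pop(idx))
    if not loser['colonies']:
        # a collapsed empire's imperialist becomes a colony of the winning empire
        empires[strongest]['colonies'].append(loser['imperialist'])
        del empires[weakest]
    return empires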

Figure 4. Imperialistic competition. The most powerful empire will possess the weakest colony of the weakest empire.

After a few iterations, powerless empires collapse in the imperialistic competition and their colonies are divided among the other empires. At the end, all the empires except the most powerful one have collapsed and all colonies are under the control of this unique empire. In this new world, all the colonies have the same position and the same cost, and they are controlled by an imperialist with the same position and cost as themselves. In this condition the imperialistic competition ends and the algorithm stops. The position and cost of the remaining imperialist give, respectively, the optimized variables and the optimized result of the problem [13].

B. GA

The GA is inspired by the Darwinian theory of evolution. Fig. 5 shows the flowchart of this algorithm. For a specific problem, the GA encodes a solution as an individual chromosome. It then defines an initial population of such individuals, which represents a part of the solution space of the problem. The search space is therefore defined as the solution space in which each feasible solution is represented by a distinct chromosome. Before the search starts, a set of chromosomes is randomly chosen from the search space to form the initial population. Next, the individuals are selected in a competitive manner, based on their fitness as measured by a specific objective function. The genetic search operators such as selection, mutation and crossover are then applied one after another to obtain a new generation of chromosomes in which the expected quality over all the chromosomes is better than that of the previous generation. This process is repeated until the termination criterion is met, and the best chromosome of the last generation is reported as the final solution [13].

Figure 5. Flowchart of GA
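A minimal sketch of one GA generation following the description above. The particular operator choices (binary tournament selection, uniform crossover, Gaussian mutation) and the rates are assumptions of ours; the paper only names selection, crossover and mutation.

import numpy as np

def ga_generation(population, fitness_fn, p_cross=0.8, p_mut=0.1, seed=None):
    # Produce the next generation from `population` (rows are chromosomes);
    # lower fitness value = better individual here.
    rng = np.random.default_rng(seed)
    n, dim = population.shape
    fitness = np.array([fitness_fn(ind) for ind in population])
    def pick():                                                 # binary tournament selection
        i, j = rng.integers(0, n, size=2)
        return population[i] if fitness[i] < fitness[j] else population[j]
    children = []
    while len(children) < n:
        a, b = pick().copy(), pick().copy()
        if rng.random() < p_cross:                              # uniform crossover
            mask = rng.random(dim) < 0.5
            a[mask], b[mask] = b[mask].copy(), a[mask].copy()
        for child in (a, b):
            mutate = rng.random(dim) < p_mut                    # Gaussian mutation
            child[mutate] += rng.normal(0.0, 0.1, size=mutate.sum())
            children.append(child)
    return np.array(children[:n])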

C. ICA-GA

By combining the two evolutionary algorithms ICA and GA, each can be used to cover the weaknesses of the other. The suggested method is based on using the ICA in the first iterations of the optimization, so that the strength of the ICA in the early iterations is exploited. In the second part of the optimization, the population developed by the ICA is handed to the GA; thus the ability of the GA to find the global optimum without forgetting other regions is used in this part of the algorithm. Fig. 6 shows the flowchart of the algorithm [13].

Figure 6. Flowchart of ICA-GA
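A minimal sketch of this hybrid schedule, assuming hypothetical ica_step and ga_step helpers that each advance the population by one epoch; the default of 20 ICA epochs out of 200 follows the setting reported later for the WINE experiment.

def ica_ga(population, cost_fn, ica_step, ga_step, n_ica_epochs=20, n_total_epochs=200):
    # Hybrid ICA-GA (Fig. 6): ICA drives the first n_ica_epochs, then the
    # developed population is handed to the GA for the remaining epochs.
    for epoch in range(n_total_epochs):
        step = ica_step if epoch < n_ica_epochs else ga_step
        population = step(population, cost_fn)
    return min(population, key=cost_fn)      # best individual found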


D. R-ICA-GA

This algorithm combines the two basic algorithms ICA and GA by repeating them recursively. The GA always prevents the search from concentrating too strongly on the promising regions of the optimization area. The ICA, on the other hand, always sends the colonies toward the imperialists and gathers a large fraction of the population near the imperialists. In most problems the global optimum is usually near the local optima, so the populations gathered near the imperialists of the ICA (which may be near the global optimum) alternately expand and contract under the recursive combination of these two algorithms. In fact, this algorithm behaves like the ICA, but it uses the help of the GA to expand the population of colonies gathered near the imperialists [13]. Fig. 7 shows the behavior of the colonies in this algorithm. As shown in Fig. 7, the algorithm uses the ICA to move the population toward the imperialists, with the difference that the population movement is done in a billowy (wave-like) way. This billowy movement leads to a fast decrease of the convergence curve at the times when the algorithm switches from ICA to GA and vice versa; this fast decrease shows that more optimized regions next to the imperialists are found. Fig. 8 shows the flowchart of the R-ICA-GA algorithm.

Figure 7. Behavior of R-ICA-GA while getting close to the global optimum

Figure 8. Flowchart of R-ICA-GA
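A minimal sketch of the recursive alternation described above. The reading of N1, N2 and N3 as the length of the initial ICA phase and of the alternating GA and ICA phases is our interpretation of the experimental settings reported later (20, 5 and 5); ica_step and ga_step are assumed helpers, not the authors' code.

def r_ica_ga(population, cost_fn, ica_step, ga_step,
             n1=20, n2=5, n3=5, n_total_epochs=200):
    # Recursive ICA-GA (Fig. 8): run ICA for n1 epochs, then keep alternating
    # GA phases of n2 epochs with ICA phases of n3 epochs.
    epoch = 0
    while epoch < min(n1, n_total_epochs):
        population = ica_step(population, cost_fn)
        epoch += 1
    use_ga = True                              # after the ICA phase, switch to GA first
    while epoch < n_total_epochs:
        phase_len = n2 if use_ga else n3
        step = ga_step if use_ga else ica_step
        for _ in range(phase_len):
            if epoch >= n_total_epochs:
                break
            population = step(population, cost_fn)
            epoch += 1
        use_ga = not use_ga                    # billowy movement: GA expands, ICA contracts
    return min(population, key=cost_fn)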

III. PROBLEM STATEMENT

As mentioned in section I, the optimization method used to determine the weight adjustments has a large influence on the performance of a neural network. Using the new hybrid evolutionary algorithms ICA-GA and R-ICA-GA, and comparing their performance with the other evolutionary algorithms ICA and GA, makes it possible to find the best algorithm for weight optimization on a classification problem.

IV. EXPERIMENTAL STUDIES

Test results are presented in this part of the paper. Three groups of classification datasets are used for the simulations, and the ANN used in the simulations is an MLP network. First a simple dataset is used to train the ANN, second a complicated dataset, and in the third group other standard datasets are studied. All the datasets used in this paper are known datasets available for download from http://archive.ics.uci.edu/ml/datasets.html. These datasets are:

WINE, PIMA, WDBC, IRIS, SONAR and GLASS.

WINE is used as a simple dataset to test the algorithms used to train the ANN, PIMA is used as a complicated dataset, and the other datasets are used to assess the different learning methods used in this paper. All four algorithms ICA, GA, ICA-GA and R-ICA-GA are used to learn the named datasets and are compared to each other.

A. Using WINE as a Simple Dataset


WINE includes data from wine chemical analysis that belong to 3 classes. It contains 178 samples with 13 attributes. The dataset is divided into two subsets containing the train and test samples; the sizes of the train and test subsets are chosen as 124 and 54 respectively, and the samples are assigned to the train and test subsets randomly. The best topology of the MLP network for learning the WINE dataset is determined as (7,4,3) [9]. The cost function optimized by the evolutionary algorithms is the MSE of the MLP network. In order to prevent overfitting of the MLP network, the Early Stopping Method (ESM) is used in this simulation.

Fig. 9 depicts the convergence curves of training the ANN on the WINE dataset using the different evolutionary algorithms. These curves are selected among 20 training simulations; the simulations whose results match the average results are the ones shown. As seen in Fig. 9, the ICA is more powerful in the first epochs and the GA is more powerful in the last epochs. ICA-GA in this simulation has used the ICA method for the first 20 epochs and the GA method afterwards. N1, N2 and N3 for R-ICA-GA are chosen as 20, 5 and 5 respectively. ICA-GA and R-ICA-GA both behave like the ICA in the first epochs.

Figure 9. Convergence curves of training the ANN on the WINE dataset using the different evolutionary algorithms

TABLE I presents the average MSEs of the train and test results obtained while simulating the ANN. These values are calculated from 20 runs of training and testing the network using the four different evolutionary algorithms. Each algorithm is permitted to continue for up to 200 epochs. TABLE I shows that ICA-GA is the best algorithm, among the four algorithms used in this paper, for learning the WINE dataset on the MLP network. The "Early Stopping point detected" column shows the tendency of each algorithm to overfit the network during the 200-epoch training process. As seen in TABLE I, when comparing the tendency to overfit the network, ICA is the best choice.

TABLE I. TEST AND TRAINING RESULTS FOR WINE DATASET

One simulation sample, whose convergence curves are the ones depicted in Fig. 9, is studied here to compare other characteristics of the different methods. TABLE II presents the results of learning the MLP network with the evolutionary algorithms. As seen in TABLE II, the MLP networks trained by the R-ICA-GA and ICA-GA methods have the greatest ability to classify the test samples.

TABLE II. INITIAL PARAMETERS AND PERFORMANCE RESULTS OF COMPARED OPTIMIZATION ALGORITHMS ON WINE DATASET
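All four evolutionary trainers optimize the same cost: the MSE of the MLP on the training samples. The following minimal Python sketch shows one possible way to decode a flat weight vector into an MLP of a given topology and score it; the tanh activation, this particular weight encoding and the one-hot targets are our assumptions, not details given in the paper.

import numpy as np

def mlp_mse(weights, X, Y, sizes=(13, 7, 4, 3)):
    # `sizes` lists the input dimension followed by the layer sizes, e.g. 13 inputs
    # and the (7,4,3) topology used for WINE.  `weights` is a flat vector holding
    # every weight matrix and bias in order; Y holds one-hot target vectors.
    a, idx = X, 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = weights[idx:idx + n_in * n_out].reshape(n_in, n_out)
        idx += n_in * n_out
        b = weights[idx:idx + n_out]
        idx += n_out
        a = np.tanh(a @ W + b)                 # layer output
    return np.mean((a - Y) ** 2)               # cost minimized by ICA/GA/ICA-GA/R-ICA-GA

# example use as the cost function of any of the evolutionary trainers
# (X_train, Y_train are the training samples and their one-hot targets):
# cost_fn = lambda w: mlp_mse(w, X_train, Y_train)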


TABLE III. TRAINING RESULTS OF THE NETWORK VIA THE COMPARED OPTIMIZATION ALGORITHMS ON THE WINE DATASET

To show how each network classifies the samples, the confusion matrix is used in this paper. The columns of the confusion matrix are the correct classes of the samples and its rows are the classes determined by the network. TABLE III presents the confusion matrices of the test and train results for the networks.

B. Using PIMA as a Complicated Dataset

PIMA includes Pima Indians diabetes data belonging to the classes healthy and diabetic. It contains 768 samples with 8 attributes. The sizes of the train and test subsets are chosen as 537 and 231 respectively. The best topology of the MLP network for learning this dataset is determined as (12,5,2) [9]. TABLE IV presents the results of learning the MLP network using the evolutionary algorithms.
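A minimal sketch of how the correctly classified percentages and confusion matrices reported in the tables could be computed, assuming the class label is taken as the argmax of the network output and of a one-hot target vector (an assumption of ours):

import numpy as np

def confusion_matrix(outputs, targets, n_classes):
    # Layout as described above: columns are the correct classes,
    # rows are the classes the network assigned.
    pred = outputs.argmax(axis=1)
    true = targets.argmax(axis=1)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for p, t in zip(pred, true):
        cm[p, t] += 1
    return cm

def correctly_classified_percent(outputs, targets):
    # Percentage of samples whose predicted class matches the correct class.
    return 100.0 * np.mean(outputs.argmax(axis=1) == targets.argmax(axis=1))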

TABLE IV. INITIAL PARAMETERS AND PERFORMANCE RESULTS OF COMPARED OPTIMIZATION ALGORITHMS ON PIMA DATASET

TABLE V. TRAIN CORRECTLY CLASSIFIED (%)


C. Using Other Datasets

Four datasets, WDBC, IRIS, SONAR and GLASS, are studied in this section. The test results are presented in TABLE V; these results are used in section V to compare the performance of each algorithm.

V. COMPARING THE TEST RESULTS

In this part the average results of the simulated networks are compared, as seen in Fig. 10. It is obvious that ICA-GA is the best of these four algorithms for learning a network.

Figure 10. Average train and test results

VI. CONCLUSION

In this paper four evolutionary optimization algorithms, ICA, GA, ICA-GA and R-ICA-GA, were used to train a classification problem on an MLP neural network. All the named algorithms were compared to each other in terms of the number of epochs needed to train the networks, the train and test errors, the correctly classified percentages and the confusion matrices. Standard classification datasets were learned on the ANNs. The results showed that ICA-GA is the best algorithm among these four algorithms for training the network.

REFERENCES

[1] I. Tsoulos, D. Gavrilis, and E. Glavas, "Neural network construction and training using grammatical evolution," Neurocomputing, vol. 72, no. 1-3, pp. 269-277, 2008.
[2] Y. Shang and B. W. Wah, "Global optimization for neural network training," IEEE Computer, vol. 29, no. 3, pp. 45-54, 1996.
[3] A. Abraham, "Meta learning evolutionary artificial neural networks," Neurocomputing, vol. 56, pp. 1-38, 2004.
[4] P. Tang and Z. Xi, "The research on BP neural network model based on guaranteed convergence particle swarm optimization," in Second Intl. Symp. on Intelligent Information Technology Application, 2008, pp. 13-16.
[5] R. Reed, "Pruning algorithms - a survey," IEEE Trans. on Neural Networks, vol. 4, no. 5, pp. 740-747, 1993.
[6] M. C. P. de Souto, A. Yamazaki, and T. B. Ludermir, "Optimization of neural network weights and architecture for odor recognition using simulated annealing," in Intl. Joint Conf. on Neural Networks, 2002, pp. 547-552.
[7] C. Zhang, H. Shao, and Y. Li, "Particle swarm optimization for evolving artificial neural network," in Proc. IEEE Intl. Conf. on Systems, Man, and Cybernetics, 2000, pp. 2487-2490.
[8] X. Qu, J. Feng, and W. Sun, "Parallel genetic algorithm model based on AHP and neural networks for enterprise comprehensive business," in Intl. Conf. on Intelligent Information Hiding and Multimedia Signal Processing, 2008, pp. 897-900.
[9] M. Tayefeh Mahmoudi et al., "Artificial neural network weights optimization based on imperialist competitive algorithm," 2008.
[10] G. Wei, "Evolutionary neural network based on new ant colony algorithm," in Intl. Symp. on Computational Intelligence and Design (ISCID '08), Wuhan, 2008, pp. 318-321.
[11] M. Abdechiri, K. Faez, and H. Bahrami, "Neural network learning based on chaotic imperialist competitive algorithm," in 2nd Intl. Workshop on Intelligent Systems and Applications (ISA), Wuhan, China, 2010, pp. 1-5.
[12] E. Atashpaz-Gargari and C. Lucas, "Imperialist competitive algorithm: An algorithm for optimization inspired by imperialistic competition," in 2007 IEEE Congress on Evolutionary Computation (CEC 2007), 2007.
[13] V. Khorani, F. Razavi, and A. Ghoncheh, "A new hybrid evolutionary algorithm based on ICA and GA: Recursive-ICA-GA," in WORLDCOMP'2010, Las Vegas, Nevada, USA, 2010.

