Rohitash Chandra
School of Science and Technology
The University of Fiji
rohitashc@unifiji.ac.fj
Christian W. Omlin
The University of Western Cape
Neural networks are loosely modeled on the brain. They learn by training on past experience and make good generalizations on unseen instances. Neural networks are characterized into feedforward and recurrent neural networks. Feedforward networks are used in applications where the data does not contain time-variant information, while recurrent neural networks model time-series sequences and possess dynamical characteristics. Recurrent neural networks contain feedback connections. They have the ability to maintain information from past states for the computation of future state outputs. The output of a state neuron at time t is given by

S_i(t) = g\left( \sum_{k=1}^{K} V_{ik} S_k(t-1) + \sum_{j=1}^{J} W_{ij} I_j(t-1) \right)    (1)

where S_k(t) and I_j(t) represent the outputs of the state neurons and input neurons, respectively, V_{ik} and W_{ij} represent their corresponding weights, and g(.) is a sigmoidal discriminant function.
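As an illustration of equation (1), the following sketch (not from the paper; the dimensions, variable names and random weight ranges are assumptions) computes state updates with a sigmoidal discriminant function:

import numpy as np

def sigmoid(x):
    # Sigmoidal discriminant function g(.)
    return 1.0 / (1.0 + np.exp(-x))

def rnn_step(state, frame, V, W):
    # Equation (1): S_i(t) = g( sum_k V_ik S_k(t-1) + sum_j W_ij I_j(t-1) )
    return sigmoid(V @ state + W @ frame)

# Hypothetical dimensions: K state neurons, J = 12 input neurons (one speech frame).
K, J = 8, 12
rng = np.random.default_rng(0)
V = rng.uniform(-1.0, 1.0, (K, K))   # state-to-state weights V_ik
W = rng.uniform(-1.0, 1.0, (K, J))   # input-to-state weights W_ij
S = np.zeros(K)                      # initial state S(0)
for frame in rng.uniform(0.0, 1.0, (3, J)):   # three example input frames
    S = rnn_step(S, frame, V, W)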
4.3 Training Recurrent Neural Networks using Genetic Algorithms

We obtained the training and testing data sets for the phonemes 'b' and 'd' as discussed in Section 4.1. We used the following recurrent neural network topology: 12 neurons in the input layer, representing the speech frame input, and 2 neurons in the output layer, one representing each phoneme. We experimented with different numbers of neurons in the hidden layer.
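The paper does not give an implementation of this setup. As a rough sketch under the stated topology (12 input neurons, a variable number of hidden state neurons, 2 output neurons), the helpers below flatten the network's weights into a single real-valued chromosome so that a genetic algorithm can operate on them; the output weight matrix U and all function names are assumptions, not the authors' code.

import numpy as np

def init_weights(n_in=12, n_hidden=12, n_out=2, w_range=1.0, seed=0):
    # Random initialisation in [-w_range, w_range]; the paper uses the
    # ranges -1..1 and -5..5 for its two major experiments.
    rng = np.random.default_rng(seed)
    return {
        "V": rng.uniform(-w_range, w_range, (n_hidden, n_hidden)),  # state-to-state
        "W": rng.uniform(-w_range, w_range, (n_hidden, n_in)),      # input-to-state
        "U": rng.uniform(-w_range, w_range, (n_out, n_hidden)),     # state-to-output (assumed)
    }

def to_chromosome(weights):
    # Flatten all weight matrices into one real-valued gene string.
    return np.concatenate([w.ravel() for w in weights.values()])

def from_chromosome(chromosome, template):
    # Reshape a gene string back into weight matrices of the template's shapes.
    decoded, offset = {}, 0
    for name, w in template.items():
        decoded[name] = chromosome[offset:offset + w.size].reshape(w.shape)
        offset += w.size
    return decoded

# Example: one individual for the -5..5 experiment with 14 hidden neurons.
template = init_weights(n_hidden=14, w_range=5.0, seed=1)
chromosome = to_chromosome(template)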
We ran some sample experiments and found that a population size of 40, a crossover probability of 0.7 and a mutation probability of 0.1 gave good genetic training performance. Therefore, we used these values for all our experiments. We ran two major experiments with different weight initialisation prior to training; Tables 3 and 4 show illustrative results of genetic neural learning for the two sets of initial weights.

Table 3: Genetic neural learning (small random weights initialised in the range of -1 to 1)

  No. of Hidden Neurons   No. of Training Generations   Training Performance   Generalization Performance
  12                      2                             88%                    82.6%
  14                      4                             88%                    82.6%
  16                      7                             88%                    82.6%
  18                      3                             87.5%                  82.6%

Table 4: Genetic neural learning (large random weights initialised in the range of -5 to 5)

  No. of Hidden Neurons   No. of Training Generations   Training Performance   Generalization Performance
  12                      4                             88%                    82.6%
  14                      4                             88%                    82.6%
  16                      3                             88%                    82.6%
  18                      2                             88%                    82.6%

We apply the combined genetic and gradient descent learning for phoneme classification using recurrent neural networks. We classify the two phonemes from the features extracted in Section 4.1 and use the network topology discussed in Section 4.3. We applied genetic algorithms for training; once the network has learnt 88% of the samples, we terminate genetic training and apply gradient descent to further refine the knowledge learnt by genetic training. We trained for 100 training epochs. Tables 5 and 6 show illustrative results of the two major experiments initialised with two different sets of weights, respectively.

Table 5: Genetic and gradient descent learning (small random weights initialised in the range of -1 to 1)

  No. of Hidden Neurons   No. of Training Generations   Training Performance   Generalization Performance
  12                      100                           81.8%                  82.6%
  14                      100                           81.1%                  82.6%
  16                      100                           81.3%                  82.6%
  18                      100                           81.3%                  82.6%

Table 6: Genetic and gradient descent learning (large random weights initialised in the range of -5 to 5)

  No. of Hidden Neurons   No. of Training Generations   Training Performance   Generalization Performance
  12                      100                           78.1%                  82.6%
  14                      100                           85.6%                  82.6%
  16                      100                           81.3%                  82.6%
  18                      100                           81.5%                  82.6%

The results show that the generalization performance of combined genetic and gradient descent learning does not improve significantly when compared to the performance of genetic neural learning alone.
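As a minimal sketch of the training scheme described above (population size 40, crossover probability 0.7, mutation probability 0.1, genetic training stopped once 88% of the training samples are learnt, then gradient descent refinement), the code below evolves real-valued weight chromosomes and hands the best one to a user-supplied gradient descent step. The selection, crossover and mutation operators, the mutation noise, and the callbacks accuracy_fn and grad_step_fn are assumptions, not the authors' implementation.

import numpy as np

def genetic_then_gradient(accuracy_fn, grad_step_fn, chrom_len,
                          pop_size=40, p_cross=0.7, p_mut=0.1,
                          target_acc=0.88, max_generations=200,
                          refine_epochs=100, w_range=1.0, seed=0):
    # Stage 1: genetic training of the weight chromosomes.
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-w_range, w_range, (pop_size, chrom_len))
    best, best_acc = pop[0].copy(), 0.0
    for _ in range(max_generations):
        fitness = np.array([accuracy_fn(c) for c in pop])
        order = np.argsort(fitness)[::-1]
        best, best_acc = pop[order[0]].copy(), fitness[order[0]]
        if best_acc >= target_acc:                 # stop genetic training at 88%
            break
        parents = pop[order[:pop_size // 2]]       # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            child = a.copy()
            if rng.random() < p_cross:             # one-point crossover
                cut = rng.integers(1, chrom_len)
                child[cut:] = b[cut:]
            mask = rng.random(chrom_len) < p_mut   # gene-wise mutation
            child[mask] += rng.normal(0.0, 0.1, mask.sum())
            children.append(child)
        pop = np.array(children)
    # Stage 2: refine the best network with gradient descent (e.g. 100 epochs).
    for _ in range(refine_epochs):
        best = grad_step_fn(best)
    return best

With the chromosome helpers sketched after Section 4.3, accuracy_fn would decode a chromosome, run the recurrent network over every training sequence and return the fraction of phonemes classified correctly, while grad_step_fn would perform one epoch of gradient descent (e.g. backpropagation through time) on the decoded weights.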
5. Conclusions

We have discussed the popular recurrent neural network training paradigms and outlined their strengths and limitations. We discussed the application of recurrent neural networks to speech phoneme classification using Mel frequency cepstral coefficient feature extraction. We have successfully trained recurrent neural networks to classify the phonemes 'b' and 'd' extracted from the speech database using gradient descent and genetic training methods.