Perceptron Training Algorithm
The training methods used can be summarized as follows:
1. Apply an input pattern and calculate the output.
2. a) If the output is correct, go to step 1.
   b) If the output is incorrect, and is zero, add each input to its corresponding weight; or
   c) If the output is incorrect, and is one, subtract each input from its corresponding weight.
3. Go to step 1.
The three cases in step 2 correspond to the error δ = (T − A):
δ = 0 : step 2a
δ > 0 : step 2b
δ < 0 : step 2c

The weight correction is then
Δ_i = η δ x_i
w_i(n+1) = w_i(n) + Δ_i
where
Δ_i = the correction associated with the i-th input x_i
w_i(n+1) = the value of weight i after adjustment
w_i(n) = the value of weight i before adjustment
and η is the learning-rate coefficient.
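A minimal sketch of this training procedure in Python, assuming a hard-limit (step) output; the function name train_perceptron, the learning-rate value and the AND example are illustrative and not part of the original notes:

import numpy as np

def train_perceptron(patterns, targets, eta=0.5, epochs=100):
    # Perceptron delta rule: delta = (T - A); w_i(n+1) = w_i(n) + eta * delta * x_i
    w = np.zeros(patterns.shape[1])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(patterns, targets):
            a = 1 if x @ w + b > 0 else 0      # hard-limit (step) output
            delta = t - a                      # delta = 0 -> step 2a, > 0 -> 2b, < 0 -> 2c
            if delta != 0:
                w = w + eta * delta * x        # add inputs when delta > 0, subtract when delta < 0
                b = b + eta * delta            # bias treated as a weight on a constant input of 1
                mistakes += 1
        if mistakes == 0:                      # every pattern classified correctly: training done
            break
    return w, b

# Example: learning the logical AND of two inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, T)

Because AND is linearly separable, the loop is guaranteed to stop with all patterns correct after a finite number of passes.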
Module 2
Back Propagation: Training Algorithm - Applications - Network Configurations - Network Paralysis - Local Minima - Temporal Instability.
INTRODUCTION
The development of ANNs was in eclipse due to the lack of algorithms for training multilayer ANNs.
BACK PROPAGATION
Back propagation is a systematic method for training multilayer artificial neural networks.
[Figure: A three-layer feedforward network. The input pattern p1, p2, p3, ..., pR feeds the First Layer; each neuron n has a bias b, and each layer's outputs a1, a2, a3 feed the Second and Third Layers in turn.]
[Figure: An artificial neuron with inputs X1, X2, X3, X4 and weights W1,1, W1,2, W1,3, W1,4, computing NET = XW and passing it through the activation function to give OUT.]
Sigmoid Function
OUT = 1 / (1 + exp(−NET))
d(OUT)/d(NET) = OUT (1 − OUT)
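As a small numeric illustration of these two formulas (a sketch; the function name sigmoid is our own):

import numpy as np

def sigmoid(net):
    # OUT = 1 / (1 + exp(-NET))
    return 1.0 / (1.0 + np.exp(-net))

net = np.array([-2.0, 0.0, 2.0])
out = sigmoid(net)
d_out_d_net = out * (1 - out)   # d(OUT)/d(NET) = OUT (1 - OUT)
# The derivative peaks at NET = 0 (value 0.25) and shrinks as |NET| grows,
# which matters later when network paralysis is discussed.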
[Figure: A three-layer network during training. The input pattern p1, p2, p3, ..., pR produces outputs OUT1, OUT2, OUT3, which are compared with Target 1, Target 2 and Target 3.]
OBJECTIVE OF TRAINING
TRAINING PAIR
TRAINING SET
TRAINING STEPS
1. Select the next training pair from the training set and apply the input vector to the network input.
2. Calculate the output of the network.
3. Calculate the error, the difference between the network output and the desired output (the target).
4. Adjust the weights of the network in a way that minimizes the error.
5. Repeat steps 1 to 4 for each vector in the training set until the error for the entire set is acceptably low.
Forward Pass
Step 1 and Step 2 constitute the forward pass.
Signals propagate from input to output.
NET = XW.
OUT = F(XW)
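A minimal sketch of the forward pass for a two-layer slice of the network shown earlier, using the sigmoid for F (the weight values and shapes are illustrative):

import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

p = np.array([0.2, -0.5, 0.1])                 # input pattern X
W1, b1 = np.full((3, 4), 0.1), np.zeros(4)     # first-layer weights and biases
W2, b2 = np.full((4, 2), 0.1), np.zeros(2)     # second-layer weights and biases

net1 = p @ W1 + b1     # NET = XW (plus bias) for the first layer
a1 = sigmoid(net1)     # OUT = F(NET)
net2 = a1 @ W2 + b2    # the outputs of one layer feed the next
a2 = sigmoid(net2)     # network output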
Reverse Pass
Step 3 and Step 4 constitute the reverse pass.
Weights in the OUTPUT LAYER are adjusted with the modified delta rule.
Training is more complicated in the HIDDEN LAYERS, as their outputs have no targets for comparison.
[Figure: Adjusting a weight w_qp in the output layer: the error signal δ (obtained using the derivative of F) is multiplied by OUT of the source neuron and by the training rate η, and the resulting Δw_qp is added to w_qp(n) to give w_qp(n+1).]
Then δ is multiplied by the OUT from neuron j, the source neuron.
This product is multiplied by the learning rate η, typically taken as a value between 0.01 and 1.0.
This result is added to the weight.
An identical process is performed for each weight proceeding from a neuron in the hidden layer to a neuron in the output layer.
Δw_qp = η δ_q,k OUT_p,j
w_qp(n+1) = w_qp(n) + Δw_qp
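A minimal numeric sketch of this output-layer update for a single weight w_qp, assuming δ for the output neuron is computed as OUT(1 − OUT)(Target − OUT), as in the derivation later in this module (all values here are illustrative):

eta = 0.5                       # learning rate, typically 0.01 - 1.0
out_p = 0.8                     # OUT of source neuron p in the hidden layer
out_q, target_q = 0.6, 1.0      # OUT and target of output neuron q
delta_q = out_q * (1 - out_q) * (target_q - out_q)  # delta for the output neuron
dw_qp = eta * delta_q * out_p   # delta-w_qp = eta * delta_q,k * OUT_p,j
w_qp = 0.3                      # w_qp(n)
w_qp = w_qp + dw_qp             # w_qp(n+1) = w_qp(n) + delta-w_qp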
Back propagation trains the hidden layers by propagating the output error back through the network, layer by layer, adjusting the weights at each layer. The same adjustment equations are used:
Δw_qp = η δ_q,k OUT_p,j
w_qp(n+1) = w_qp(n) + Δw_qp
but for a hidden-layer neuron δ cannot be computed from a target, since the hidden layers have no targets; it is generated from the δ's of the following layer, as shown below.
[Figure: Propagating δ back to the hidden layer. The values δ_1,k, δ_2,k, ..., δ_q,k of the output-layer neurons are fed back through the weights w_1p, w_2p, ..., w_qp to a neuron p in the hidden layer j, which also receives inputs from the previous layer.]
The error to be minimised is
E = 0.5 (Target − OUT)²
E = 0.5 Σ_k (t_k − y_k)²

For a weight w_jk connecting hidden neuron j to output neuron k:
∂E/∂w_jk = −(t_k − y_k) ∂y_k/∂w_jk
= −(t_k − y_k) f′(y_in,k) ∂y_in,k/∂w_jk
= −(t_k − y_k) f′(y_in,k) z_j

Let
δ_k = (t_k − y_k) f′(y_in,k)

Since f′(y_in,k) = d(OUT)/d(NET) = OUT(1 − OUT),
δ_k = (t_k − y_k) OUT(1 − OUT).
For a weight w_ji connecting input i to hidden neuron j (whose output is z_j = f(z_in,j)):
∂E/∂w_ji = −Σ_k (t_k − y_k) f′(y_in,k) ∂y_in,k/∂w_ji
= −Σ_k δ_k (∂y_in,k/∂z_j)(∂z_j/∂z_in,j)(∂z_in,j/∂w_ji)
= −Σ_k δ_k w_jk f′(z_in,j) x_i

Let
δ_j = (Σ_k δ_k w_jk) f′(z_in,j)

For the first case (the output layer), the weight change is
Δw_jk = −η ∂E/∂w_jk = η δ_k z_j.

Now consider the second case (the hidden layer):
Δw_ji = −η ∂E/∂w_ji = η δ_j x_i.
The biases are adjusted in the same way, treating each bias as a weight on a constant input of +1:
b_m(k+1) = b_m(k) + η δ_m
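The derivation above can be collected into a compact sketch of one back-propagation training step for a single-hidden-layer network with sigmoid units (a sketch under the notation above; the array names, shapes and the learning-rate value are illustrative):

import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def backprop_step(x, t, W_hidden, b_hidden, W_out, b_out, eta=0.5):
    # Forward pass
    z_in = x @ W_hidden + b_hidden      # z_in,j = sum_i x_i w_ji + b_j
    z = sigmoid(z_in)                   # hidden outputs z_j
    y_in = z @ W_out + b_out            # y_in,k = sum_j z_j w_jk + b_k
    y = sigmoid(y_in)                   # network outputs y_k

    # Output-layer deltas: delta_k = (t_k - y_k) f'(y_in,k) = (t_k - y_k) y_k (1 - y_k)
    delta_k = (t - y) * y * (1 - y)
    # Hidden-layer deltas: delta_j = (sum_k delta_k w_jk) f'(z_in,j)
    delta_j = (delta_k @ W_out.T) * z * (1 - z)

    # Weight and bias updates
    W_out += eta * np.outer(z, delta_k)      # delta-w_jk = eta * delta_k * z_j
    b_out += eta * delta_k                   # bias treated as weight on input +1
    W_hidden += eta * np.outer(x, delta_j)   # delta-w_ji = eta * delta_j * x_i
    b_hidden += eta * delta_j
    return 0.5 * np.sum((t - y) ** 2)        # E = 0.5 * sum_k (t_k - y_k)^2

# Example usage with small random weights
rng = np.random.default_rng(0)
Wh, bh = rng.normal(scale=0.5, size=(2, 3)), np.zeros(3)
Wo, bo = rng.normal(scale=0.5, size=(3, 1)), np.zeros(1)
err = backprop_step(np.array([1.0, 0.0]), np.array([1.0]), Wh, bh, Wo, bo)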
Example: Find the equation for the change in weight under the back-propagation algorithm when the activation function used is the tan-sigmoid function.

OUT = a = (e^n − e^(−n)) / (e^n + e^(−n))
f′(n) = d(OUT)/dn = 1 − a²
Hence
δ_k = (t_k − a)(1 − a²)
Δw = η δ_k OUT_(k−1) = η (t_k − a)(1 − a²) OUT_(k−1)
where OUT_(k−1) is the output of the source neuron in the previous layer.
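A short numeric check of this tan-sigmoid case (the values n = 0.5, t = 1.0, OUT_(k−1) = 0.8 and η = 0.5 are illustrative):

import numpy as np

a = np.tanh(0.5)                  # OUT = a = (e^n - e^-n) / (e^n + e^-n)
f_prime = 1 - a ** 2              # f'(n) = 1 - a^2
t, out_prev, eta = 1.0, 0.8, 0.5  # target, source-neuron output, learning rate
delta_k = (t - a) * f_prime       # delta_k = (t_k - a)(1 - a^2)
dw = eta * delta_k * out_prev     # delta-w = eta * delta_k * OUT_(k-1)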
[Figure: Worked numerical example of a small network with neurons n1, n2 (biases b1, b2), weights w1, w2, units z2 and y1, and sample values 0.1, −0.2, 0.15, 0.2, 1.]
Network Paralysis
As training proceeds, the weights can become very large, driving the neuron outputs into the saturated regions of the sigmoid, where the derivative OUT(1 − OUT) is close to zero. Since the error signal sent back is proportional to this derivative, the weight adjustments are also small, and training can slow to a virtual standstill.
Local Minima
The back-propagation algorithm employs a type of gradient descent method.
The error surface of a complex network is highly convoluted, full of hills, valleys, folds, etc.
The network can get trapped in a local minimum (a shallow valley) even when there is a much deeper minimum nearby. (This problem is referred to as Local Minima.)
It can be avoided by statistical training methods.
Wasserman proposed a combined statistical and gradient-descent method.
Temporal Instability
Human memory does not abruptly lose what it has already learned when something new is learned. A back-propagation network, by contrast, can "forget" previously learned patterns while it is being trained on new ones, so it must be taught the entire training set together; if the training data keep changing, the weights may drift and never settle. This behaviour is referred to as temporal instability.