The Backpropagation Algorithm
1. Propagates inputs forward in the usual way, i.e., all outputs are computed using sigmoid thresholding of the inner product of the corresponding weight and input vectors. All outputs at stage $n$ are connected to all the inputs at stage $n+1$.
2. Propagates the errors backwards by apportioning them to each unit according to the amount of the error the unit is responsible for.

We now derive the stochastic backpropagation algorithm for the general case. The derivation is simple, but unfortunately the bookkeeping is a little messy.

- $\vec{x}_j$ = input vector for unit $j$ ($x_{ji}$ = the $i$th input to the $j$th unit)
- $\vec{w}_j$ = weight vector for unit $j$ ($w_{ji}$ = the weight on $x_{ji}$)
- $z_j = \vec{w}_j \cdot \vec{x}_j$, the weighted sum of inputs for unit $j$
- $o_j$ = output of unit $j$ ($o_j = \sigma(z_j)$, where $\sigma$ is the sigmoid function)
- $t_j$ = target for unit $j$
- $Downstream(j)$ = set of units whose immediate inputs include the output of $j$
- $Outputs$ = set of output units in the final layer

Since we update after each training example, we can simplify the notation somewhat by imagining that the training set consists of exactly one example, so the error can simply be denoted by $E$.

We want to calculate $\partial E/\partial w_{ji}$ for each input weight $w_{ji}$ of each output unit $j$. Note first that since $z_j$ is a function of $w_{ji}$ regardless of where in the network unit $j$ is located,

\[ \frac{\partial E}{\partial w_{ji}} = \frac{\partial E}{\partial z_j}\,\frac{\partial z_j}{\partial w_{ji}} = \frac{\partial E}{\partial z_j}\, x_{ji}. \]

Furthermore, $\partial E/\partial z_j$ is the same regardless of which input weight of unit $j$ we are trying to update, so we denote this quantity by $\delta_j$.

Consider the case when $j \in Outputs$. We know

\[ E = \frac{1}{2} \sum_{k \in Outputs} (t_k - o_k)^2. \]

Since the outputs of all units $k \ne j$ are independent of $w_{ji}$, we can drop the summation and consider just the contribution to $E$ by $j$:

\[ \delta_j = \frac{\partial E}{\partial z_j} = \frac{\partial}{\partial z_j}\,\frac{1}{2}(t_j - o_j)^2 = -(t_j - o_j)\,\frac{\partial o_j}{\partial z_j} = -(t_j - o_j)\, o_j (1 - o_j), \]

where the last step uses $\sigma'(z_j) = \sigma(z_j)(1 - \sigma(z_j)) = o_j(1 - o_j)$. Thus

\[ \frac{\partial E}{\partial w_{ji}} = -(t_j - o_j)\, o_j (1 - o_j)\, x_{ji}. \tag{17} \]
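The output-unit gradient just derived, $\partial E/\partial w_{ji} = -(t_j - o_j)\,o_j(1 - o_j)\,x_{ji}$, is easy to sanity-check numerically. The sketch below (Python; the inputs, weights, and target are illustrative values, not from the text) compares the analytic gradient for a single sigmoid output unit against central finite differences of $E$:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values (not from the text): one sigmoid output unit j
# with two inputs, its weights, and a target.
x = [1.0, 0.5]    # x_ji: inputs to unit j
w = [0.4, -0.3]   # w_ji: weights of unit j
t = 1.0           # t_j: target for unit j

def error(weights):
    z = sum(wi * xi for wi, xi in zip(weights, x))  # z_j = sum_i w_ji x_ji
    o = sigmoid(z)                                  # o_j = sigma(z_j)
    return 0.5 * (t - o) ** 2                       # E for this single output

# Analytic gradient: dE/dw_ji = -(t_j - o_j) o_j (1 - o_j) x_ji
o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
analytic = [-(t - o) * o * (1 - o) * xi for xi in x]

# Central finite differences of E with respect to each weight.
eps = 1e-6
numeric = []
for i in range(len(w)):
    wp, wm = list(w), list(w)
    wp[i] += eps
    wm[i] -= eps
    numeric.append((error(wp) - error(wm)) / (2 * eps))

assert all(abs(a - n) < 1e-8 for a, n in zip(analytic, numeric))
```

Here $t_j > o_j$, so both gradient components on the positive inputs are negative: decreasing $E$ means increasing those weights, as expected.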
Now consider the case when $j$ is a hidden unit. Like before, we make the following two important observations.

1. For each unit $k$ downstream from $j$, $z_k$ is a function of $z_j$.
2. The contribution to the error by all units in the same layer as $j$ is independent of $w_{ji}$.

We want to calculate $\partial E/\partial w_{ji}$ for each input weight $w_{ji}$ of each hidden unit $j$. Note that $w_{ji}$ influences just $z_j$, which influences $o_j$, which influences $z_k$ for each $k \in Downstream(j)$, each of which influences $E$. So we can write

\[ \frac{\partial E}{\partial w_{ji}} = \sum_{k \in Downstream(j)} \frac{\partial E}{\partial z_k}\,\frac{\partial z_k}{\partial o_j}\,\frac{\partial o_j}{\partial z_j}\,\frac{\partial z_j}{\partial w_{ji}}. \]

Again note that all the terms except $x_{ji}$ in the above product are the same regardless of which input weight of unit $j$ we are trying to update; like before, we denote this common quantity by $\delta_j$. Also note that $\partial E/\partial z_k = \delta_k$, $\partial z_k/\partial o_j = w_{kj}$, $\partial o_j/\partial z_j = o_j(1 - o_j)$, and $\partial z_j/\partial w_{ji} = x_{ji}$. Substituting,

\[ \delta_j = \frac{\partial E}{\partial z_j} = \sum_{k \in Downstream(j)} \delta_k\, w_{kj}\, o_j (1 - o_j) = o_j (1 - o_j) \sum_{k \in Downstream(j)} \delta_k\, w_{kj}. \]

Thus,

\[ \frac{\partial E}{\partial w_{ji}} = x_{ji}\, o_j (1 - o_j) \sum_{k \in Downstream(j)} \delta_k\, w_{kj}. \tag{18} \]
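The hidden-unit recursion $\delta_j = o_j(1 - o_j)\sum_k \delta_k w_{kj}$ can be checked the same way. The sketch below (Python; the tiny two-layer fragment and all its values are illustrative, not from the text) computes $\delta_j$ from the downstream $\delta_k$ and compares $\delta_j\, x_{ji}$ against finite differences of the total error:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative fragment (not from the text): two inputs feed one hidden
# unit j, whose output is the sole input of two output units k = 0, 1.
x = [1.0, -2.0]     # inputs x_ji of hidden unit j
w_j = [0.3, 0.1]    # weights w_ji of hidden unit j
w_kj = [0.7, -0.4]  # weight from j into each output unit k
t = [1.0, 0.0]      # targets t_k

def forward(weights):
    z_j = sum(w * xi for w, xi in zip(weights, x))
    o_j = sigmoid(z_j)
    o_k = [sigmoid(wk * o_j) for wk in w_kj]  # z_k = w_kj * o_j here
    return o_j, o_k

def error(weights):
    _, o_k = forward(weights)
    return 0.5 * sum((tk - ok) ** 2 for tk, ok in zip(t, o_k))

# delta_k for each output unit, then the hidden delta via the recursion
# delta_j = o_j (1 - o_j) * sum_k delta_k w_kj.
o_j, o_k = forward(w_j)
delta_k = [-(tk - ok) * ok * (1 - ok) for tk, ok in zip(t, o_k)]
delta_j = o_j * (1 - o_j) * sum(dk * wk for dk, wk in zip(delta_k, w_kj))
analytic = [delta_j * xi for xi in x]  # dE/dw_ji = delta_j * x_ji

# Finite-difference check of each hidden weight.
eps = 1e-6
numeric = []
for i in range(len(w_j)):
    wp, wm = list(w_j), list(w_j)
    wp[i] += eps
    wm[i] -= eps
    numeric.append((error(wp) - error(wm)) / (2 * eps))

assert all(abs(a - n) < 1e-8 for a, n in zip(analytic, numeric))
```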
We are now in a position to state the backpropagation algorithm formally.

Formal statement of the algorithm:

Stochastic-Backpropagation(training examples, $\eta$, $n_i$, $n_h$, $n_o$)

Each training example is of the form $\langle \vec{x}, \vec{t}\, \rangle$, where $\vec{x}$ is the input vector and $\vec{t}$ is the target vector. $\eta$ is the learning rate (e.g., 0.05). $n_i$, $n_h$ and $n_o$ are the number of input, hidden and output nodes respectively. Input from unit $i$ to unit $j$ is denoted $x_{ji}$, and its weight is denoted by $w_{ji}$.

- Create a feed-forward network with $n_i$ inputs, $n_h$ hidden units, and $n_o$ output units.
- Initialize all the weights to small random values (e.g., between $-0.05$ and $0.05$).
- Until the termination condition is met, Do
  - For each training example $\langle \vec{x}, \vec{t}\, \rangle$, Do
    1. Input the instance $\vec{x}$ and compute the output $o_u$ of every unit $u$.
    2. For each output unit $k$, calculate
       \[ \delta_k = o_k (1 - o_k)(t_k - o_k). \]
    3. For each hidden unit $h$, calculate
       \[ \delta_h = o_h (1 - o_h) \sum_{k \in Downstream(h)} \delta_k\, w_{kh}. \]
    4. Update each network weight $w_{ji}$ as follows:
       \[ w_{ji} \leftarrow w_{ji} + \Delta w_{ji}, \qquad \text{where } \Delta w_{ji} = \eta\, \delta_j\, x_{ji}. \]

Note that the $\delta$ used in the algorithm is the negative of the $\delta_j = \partial E/\partial z_j$ of the derivation, so the update $\Delta w_{ji} = \eta\, \delta_j\, x_{ji}$ moves each weight down the error gradient.
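The formal statement above can be sketched directly in code. This is a minimal illustration, not the notes' own implementation: it assumes a constant bias input of 1.0 appended to each unit's inputs (a common convention the statement leaves implicit), a single hidden layer, and a toy task (boolean AND) chosen only to show the updates at work.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def stochastic_backprop(examples, eta, n_i, n_h, n_o, epochs, seed=1):
    """Train by the algorithm stated above; returns a prediction function.

    examples: list of (input vector, target vector) pairs.
    Assumption not in the text: a bias input of 1.0 for every unit.
    """
    rng = random.Random(seed)
    # Initialize all weights to small random values in (-0.05, 0.05).
    w_h = [[rng.uniform(-0.05, 0.05) for _ in range(n_i + 1)] for _ in range(n_h)]
    w_o = [[rng.uniform(-0.05, 0.05) for _ in range(n_h + 1)] for _ in range(n_o)]

    def forward(x):
        xb = list(x) + [1.0]                                           # inputs + bias
        hb = [sigmoid(sum(w * v for w, v in zip(ws, xb))) for ws in w_h] + [1.0]
        o_o = [sigmoid(sum(w * v for w, v in zip(ws, hb))) for ws in w_o]
        return xb, hb, o_o

    for _ in range(epochs):                 # until termination (fixed epochs here)
        for x, t in examples:
            # 1. Compute the output of every unit.
            xb, hb, o_o = forward(x)
            # 2. Output units: delta_k = o_k (1 - o_k)(t_k - o_k).
            d_o = [o * (1 - o) * (tk - o) for o, tk in zip(o_o, t)]
            # 3. Hidden units: delta_h = o_h (1 - o_h) sum_k delta_k w_kh.
            d_h = [hb[h] * (1 - hb[h]) * sum(d_o[k] * w_o[k][h] for k in range(n_o))
                   for h in range(n_h)]
            # 4. Every weight: w_ji <- w_ji + eta * delta_j * x_ji.
            for k in range(n_o):
                for i in range(n_h + 1):
                    w_o[k][i] += eta * d_o[k] * hb[i]
            for h in range(n_h):
                for i in range(n_i + 1):
                    w_h[h][i] += eta * d_h[h] * xb[i]

    return lambda x: forward(x)[2]

# Toy demonstration: learn boolean AND (linearly separable, so a small
# network and a few thousand stochastic epochs suffice in practice).
data = [([0, 0], [0]), ([0, 1], [0]), ([1, 0], [0]), ([1, 1], [1])]
net = stochastic_backprop(data, eta=0.5, n_i=2, n_h=2, n_o=1, epochs=5000)
```

The loop body is exactly steps 1-4 of the statement; only the termination condition (a fixed epoch count) and the bias convention are choices made for the sketch.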
Anand Venkataraman 1999-09-16