LSTM Layer: Computes the output using LSTM units. I have added 100 units in the layer, but this number can be fine-tuned later.
Output Layer: Computes a probability for each possible next word; the most probable word is the prediction.
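As a rough illustration of what the output layer computes, the sketch below turns one score per vocabulary word into probabilities and picks the most likely next word. The use of a softmax, the three-word vocabulary, and the score values are all assumptions made for illustration; the text does not specify them.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and scores, for illustration only.
vocab = ["cat", "sat", "mat"]
scores = [1.0, 3.0, 0.5]

probs = softmax(scores)
next_word = vocab[probs.index(max(probs))]  # word with the highest probability
print(next_word)
```

In a real model the scores would come from a trained layer applied to the LSTM output, with one entry per word in the full vocabulary.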
Long Short-Term Memories
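1. $h_t = o_t \odot \tanh(c_t)$
2. $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$
3. $\tilde{c}_t = \tanh(x_t W_c + h_{t-1} U_c + b_c)$
4. $o_t = \sigma(x_t W_o + h_{t-1} U_o + b_o)$
5. $i_t = \sigma(x_t W_i + h_{t-1} U_i + b_i)$
6. $f_t = \sigma(x_t W_f + h_{t-1} U_f + b_f)$
Here $\sigma$ is the logistic sigmoid and $\odot$ is the elementwise product.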
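The six equations above can be sketched as a single time step in NumPy. This is a minimal sketch, not a reference implementation: the dimensions, the random initialization, and the parameter dictionary layout (`params`, with keys such as `"Wi"` and `"Ui"`) are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step, row-vector convention (x_t @ W)."""
    i_t = sigmoid(x_t @ p["Wi"] + h_prev @ p["Ui"] + p["bi"])      # input gate
    f_t = sigmoid(x_t @ p["Wf"] + h_prev @ p["Uf"] + p["bf"])      # forget gate
    o_t = sigmoid(x_t @ p["Wo"] + h_prev @ p["Uo"] + p["bo"])      # output gate
    c_tilde = np.tanh(x_t @ p["Wc"] + h_prev @ p["Uc"] + p["bc"])  # new memory
    c_t = f_t * c_prev + i_t * c_tilde   # elementwise products
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Assumed sizes and random weights, for illustration only.
input_dim, hidden_dim = 4, 3
rng = np.random.default_rng(0)
params = {}
for g in "ifoc":
    params["W" + g] = rng.normal(scale=0.1, size=(input_dim, hidden_dim))
    params["U" + g] = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
    params["b" + g] = np.zeros(hidden_dim)

x_t = rng.normal(size=input_dim)
h_t, c_t = lstm_step(x_t, np.zeros(hidden_dim), np.zeros(hidden_dim), params)
print(h_t.shape, c_t.shape)
```

Processing a sequence amounts to calling `lstm_step` once per input, feeding each returned `h_t` and `c_t` into the next call.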
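[Figure: LSTM cell. The inputs x_t and h_{t-1} feed the input gate i_t (weights W_i), the forget gate f_t (weights W_f), the output gate o_t (weights W_o), and the cell (weights W). The cell combines these with c_{t-1} to produce c_t and h_t.]
$f_t = \sigma\left(W_f \begin{pmatrix} x_t \\ h_{t-1} \end{pmatrix} + b_f\right)$, and similarly for $i_t$ and $o_t$.
$c_t = f_t \odot c_{t-1} + i_t \odot \tanh\left(W \begin{pmatrix} x_t \\ h_{t-1} \end{pmatrix}\right)$
$h_t = o_t \odot \tanh(c_t)$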
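The figure writes each gate with one weight matrix applied to the stacked vector of x_t and h_{t-1}, while the numbered equations use separate matrices for x_t and h_{t-1}. The check below suggests the two forms agree if the stacked matrix is read as the two separate matrices placed side by side. That reading, along with the sizes and random values, is an assumption for illustration; note that the symbol W_f then names different objects in the two notations.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3  # assumed sizes

# Separate matrices, as in the numbered equations (row-vector convention).
Wf = rng.normal(size=(input_dim, hidden_dim))
Uf = rng.normal(size=(hidden_dim, hidden_dim))
x_t = rng.normal(size=input_dim)
h_prev = rng.normal(size=hidden_dim)

split_form = x_t @ Wf + h_prev @ Uf

# One matrix applied to the stacked column vector, as in the figure.
W_stacked = np.hstack([Wf.T, Uf.T])  # shape (hidden_dim, input_dim + hidden_dim)
stacked_form = W_stacked @ np.concatenate([x_t, h_prev])

print(np.allclose(split_form, stacked_form))
```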
2. New Memory Generation: It uses the input word and the past hidden state to generate a new memory which includes aspects of the new word.
3. Forget Gate: It uses the input word and the past hidden state to assess whether the past memory cell is useful for computing the current memory cell.
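To make the roles of the new memory and the forget gate concrete, the scalar sketch below applies the cell update with the forget gate pushed close to 1 and then close to 0. The helper name `update_cell` and all numeric values are hypothetical, chosen only to show the two extremes.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def update_cell(c_prev, f_pre, i_pre, c_tilde_pre):
    """Scalar cell update computed from pre-activation values."""
    f_t = sigmoid(f_pre)              # forget gate: how much past memory to keep
    i_t = sigmoid(i_pre)              # input gate: how much new memory to admit
    c_tilde = math.tanh(c_tilde_pre)  # new memory
    return f_t * c_prev + i_t * c_tilde

c_prev = 2.0
# Forget gate near 1, input gate near 0: the past memory passes through.
kept = update_cell(c_prev, f_pre=10.0, i_pre=-10.0, c_tilde_pre=0.5)
# Forget gate near 0, input gate near 1: the new memory replaces the old one.
dropped = update_cell(c_prev, f_pre=-10.0, i_pre=10.0, c_tilde_pre=0.5)
print(round(kept, 3), round(dropped, 3))
```

In the first case the result stays close to the old cell value of 2.0; in the second it is close to tanh(0.5), the new memory alone.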