Sourav Mishra
Dept.: CDS
Course: MTech
SR. No.: 15700
Date of Submission: 28/03/2019
souravmishra@iisc.ac.in
The lines below are obtained from the machine translation task on the training and test sets respectively for the different models.

Training set:

Test set:
Input: b'<start> ich habe noch immer keine antwort auf diese fragen erhalten . <end>'
The correct translation is: <start> i have still not had a reply to these questions . <end>
Additive Attention translation: i have no reason for the council . <end>
Multiplicative Attention translation: i have no objection of the commissioner . <end>
Scaled Dot Product Attention translation: i have no objection to be a few reasons in this issue . <end>

[Figure 6: Learning Curve for Multiplicative Attention]
The figures below show the attention weights for the above sentences. Figures 8, 9, and 10 denote the attention weights for the sentences from the training set.
Attention type        Training time   Cost     BLEU (training set)   BLEU (test set)
Additive              7728.728 s      1.2272   0.722671              0.722237
Multiplicative        6517.002 s      1.0979   0.722671              0.722237
Scaled Dot Product    6718.238 s      1.1746   0.722671              0.722237
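For reference, the three attention score functions compared above can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the implementation used in the experiments; the parameters W, W1, W2, v and the dimension d are hypothetical stand-ins for the learned weights.

    import numpy as np

    def softmax(x):
        e = np.exp(x - np.max(x))
        return e / e.sum()

    d = 8                      # hidden (key) dimension
    s = np.random.randn(d)     # decoder state (query)
    H = np.random.randn(5, d)  # encoder states, one row per source token

    # Hypothetical learned parameters, randomly initialised here.
    W  = np.random.randn(d, d)  # multiplicative (Luong "general") weight
    W1 = np.random.randn(d, d)  # additive: projects encoder states
    W2 = np.random.randn(d, d)  # additive: projects decoder state
    v  = np.random.randn(d)    # additive: scoring vector

    # Additive (Bahdanau et al., 2014): v^T tanh(W1 h + W2 s)
    additive_scores = np.tanh(H @ W1.T + s @ W2.T) @ v

    # Multiplicative (Luong et al., 2015): h^T W s
    multiplicative_scores = H @ (W @ s)

    # Scaled dot product (Vaswani et al., 2017): (h . s) / sqrt(d)
    scaled_dot_scores = (H @ s) / np.sqrt(d)

    # Each score vector becomes attention weights via a softmax; the
    # context vector is the weighted sum of encoder states.
    weights = softmax(additive_scores)
    context = weights @ H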
[Figure 9: Attention weights for Multiplicative Attention]

5 Conclusion

The BLEU scores are not satisfactory and are the same for all attention models. This is due to the small number of training examples and training epochs, chosen to speed up training.
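As a point of reference for how such scores can be computed, the sketch below uses NLTK's sentence-level BLEU on the test example above; the tokenisation and smoothing shown here are assumptions, not necessarily those behind the reported numbers.

    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    reference = ['i have still not had a reply to these questions .'.split()]
    candidate = 'i have no reason for the council .'.split()

    # Smoothing avoids zero scores when higher-order n-grams do not
    # match, which is common for short or poor hypotheses.
    smooth = SmoothingFunction().method1
    score = sentence_bleu(reference, candidate, smoothing_function=smooth)
    print(f'BLEU: {score:.6f}')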
The attention-weight plots show which parts of the input sentence were crucial for the translation task. This shows that attention-based models learn how to align sequences on their own, in a manner somewhat similar to humans.
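Such plots can be reproduced with a simple heatmap. The sketch below assumes attention_weights is a (target length, source length) array collected from the decoder; the tokens and weights used here are hypothetical placeholders.

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical example: rows are target tokens, columns are source tokens.
    source = ['ich', 'habe', 'noch', 'immer', 'keine', 'antwort', '.']
    target = ['i', 'have', 'still', 'no', 'answer', '.']
    attention_weights = np.random.rand(len(target), len(source))
    attention_weights /= attention_weights.sum(axis=1, keepdims=True)

    fig, ax = plt.subplots()
    im = ax.imshow(attention_weights, cmap='viridis')
    ax.set_xticks(range(len(source)))
    ax.set_xticklabels(source, rotation=90)
    ax.set_yticks(range(len(target)))
    ax.set_yticklabels(target)
    fig.colorbar(im, ax=ax)
    plt.tight_layout()
    plt.show()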
Currently, only very small overlaps with the reference translations are obtained, as seen from the outputs above, but improvements are expected with a larger number of training epochs and more data.

[Figure 12: Attention weights for Multiplicative Attention]
References
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.

Philipp Koehn. 2005. Europarl: A Parallel Corpus for Statistical Machine Translation. In Conference Proceedings: the Tenth Machine Translation Summit, pages 79-86. AAMT.

Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 5998-6008. Curran Associates, Inc.