[PACLING2019] Improving Context-aware Neural Machine Translation with Target-...
Hayahide Yamagishi
These are the slides used in the oral presentation at PACLING2019.
(For Japanese speakers) This presentation is essentially the same as my master's thesis defense, so those who read Japanese may find the following slides easier to follow:
https://www.slideshare.net/HayahideYamagishi/ss-181147693
SlideShare has changed the original font (Hiragino Maru Gothic Pro W4) to a different one.
[ACL2017読み会] What do Neural Machine Translation Models Learn about Morphology?
Hayahide Yamagishi
1) The document examines what neural machine translation models learn about morphology through experiments analyzing the hidden states of NMT models (a probing setup, sketched after this list).
2) It finds that character-based word representations better capture morphological information than word-based representations, and that lower encoder layers learn more about a word's structure while higher layers improve translation.
3) The target language does not significantly impact how much the model learns about source language morphology, and decoder states do not capture rich morphological information.
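The experiments behind these findings follow a probing recipe: hidden states are extracted from a trained, frozen NMT model, and a separate classifier is trained to predict each word's morphological tag from its state; classifier accuracy is then read as a measure of how much morphology the representation encodes. Below is a minimal sketch of that idea, where `encode_tokens` and `tagged_corpus` are hypothetical stand-ins for the frozen encoder hook and the annotated data, not names from the paper.

```python
# Probing-classifier sketch: if a simple classifier can predict a word's
# morphological tag from a frozen NMT hidden state, that state encodes
# morphology. `encode_tokens` is a hypothetical hook into a trained model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def probe_morphology(encode_tokens, tagged_corpus, layer=0):
    """encode_tokens(tokens, layer) -> one hidden-state vector per token;
    tagged_corpus is a list of (tokens, morphological_tags) pairs."""
    X, y = [], []
    for tokens, tags in tagged_corpus:
        X.extend(encode_tokens(tokens, layer=layer))
        y.extend(tags)
    X_tr, X_te, y_tr, y_te = train_test_split(
        np.array(X), np.array(y), test_size=0.2, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))

# Comparing layers reproduces finding 2 qualitatively: probing accuracy
# for morphology is typically higher for lower encoder layers.
# acc_l0 = probe_morphology(encode_tokens, corpus, layer=0)
# acc_l1 = probe_morphology(encode_tokens, corpus, layer=1)
```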
The document discusses why neural machine translation (NMT) models are better than statistical machine translation (SMT) models at producing translations of appropriate length. It argues that SMT models tended to generate overly short translations because their tuning objective (BLEU) tolerates brevity, whereas NMT models are trained by maximum likelihood and so produce translations whose length matches the source text. The document then demonstrates this with a toy copying task, where an NMT model matches the length of input strings more accurately than SMT models (see the sketch below).
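The toy copying task is straightforward to reproduce: generate random symbol strings, train a seq2seq model to copy them, and count how often the output length equals the input length. A minimal sketch of the data generation and the length metric, with the trained model left as a hypothetical `translate` function:

```python
# Toy copying task sketch: random symbol strings in, same strings out.
# `translate` stands in for any trained seq2seq (NMT) model.
import random

VOCAB = list("abcd")

def make_copy_data(n=1000, min_len=1, max_len=20, seed=0):
    """Random symbol sequences; in a copy task, input == reference."""
    rng = random.Random(seed)
    return [[rng.choice(VOCAB) for _ in range(rng.randint(min_len, max_len))]
            for _ in range(n)]

def length_match_rate(translate, inputs):
    """Fraction of outputs whose length equals the input length."""
    return sum(len(translate(x)) == len(x) for x in inputs) / len(inputs)

# For a well-trained NMT model this rate approaches 1.0: likelihood
# training teaches the decoder when to emit end-of-sequence, so the
# output length tracks the input length.
# rate = length_match_rate(trained_model.translate, make_copy_data())
```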
A hierarchical neural autoencoder for paragraphs and documents
Hayahide Yamagishi
The document describes 3 hierarchical LSTM models for generating coherent multi-sentence text (the shared hierarchical encoder is sketched after this list):
1) Standard LSTM encodes/decodes a document as a single sequence.
2) Hierarchical LSTM encodes sentences then the document.
3) Hierarchical LSTM with attention encodes sentences then decodes with attention over encoded sentences.
The models were evaluated on hotel reviews and Wikipedia using ROUGE, BLEU, and a coherence metric called L-value. The hierarchical LSTMs outperformed the standard LSTM, and hotel reviews were easier to generate than Wikipedia text.
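As noted above, models 2 and 3 share a hierarchical encoder: a word-level LSTM turns each sentence into a vector, and a sentence-level LSTM runs over those vectors to produce the document representation, whose per-sentence outputs are what model 3 attends over during decoding. A minimal PyTorch sketch under assumed dimensions; the class and variable names are mine, not the paper's.

```python
# Hierarchical document encoder sketch (PyTorch): a word-level LSTM
# produces one vector per sentence; a sentence-level LSTM then encodes
# the sequence of sentence vectors into a single document vector.
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.sent_lstm = nn.LSTM(hid_dim, hid_dim, batch_first=True)

    def forward(self, sentences):
        # sentences: list of LongTensors, each of shape (num_words,)
        sent_vecs = []
        for sent in sentences:
            emb = self.embed(sent).unsqueeze(0)       # (1, words, emb_dim)
            _, (h, _) = self.word_lstm(emb)           # h: (1, 1, hid_dim)
            sent_vecs.append(h.squeeze(0))            # (1, hid_dim)
        sent_seq = torch.stack(sent_vecs, dim=1)      # (1, sents, hid_dim)
        outputs, (h, _) = self.sent_lstm(sent_seq)
        # `outputs` (one state per sentence) is what the attention
        # variant attends over; the final state summarizes the document.
        return outputs, h.squeeze(0).squeeze(0)       # doc vec: (hid_dim,)

# enc = HierarchicalEncoder(vocab_size=10000)
# outs, doc_vec = enc([torch.tensor([1, 5, 7]), torch.tensor([2, 3])])
```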
7. Example: orthographic variants (Google Translate Ja-En output, as of 10/17)

Input A (whale shark spelled ジンベエザメ): 美ら海水族館は、大水槽のジンベエザメが人気の観光地である。
Output A: "The Churaumi Aquarium is a popular tourist spot of whale shark in the large aquarium."

Input B (whale shark spelled ジンベイザメ): 美ら海水族館は、大水槽のジンベイザメが人気の観光地である。
Output B: "The Churaumi Aquarium is a popular tourist destination with a large whale whale shark."

Both inputs mean "Churaumi Aquarium is a popular tourist destination, known for the whale sharks in its large tank"; they differ only in the spelling of "whale shark" (ジンベエザメ vs. ジンベイザメ), yet the two translations diverge.