September 27, 2017 1
Hybrid Machine Translation
by Combining Multiple
Machine Translation Systems
Matīss Rikters
Supervisor: Dr. sc. comp., prof. Inguna Skadiņa
September 27, 2017 2
Contents
• Introduction
• Aim and objectives
• Background and related work
• Combining statistical machine translations
• Combining neural machine translations
• Practical implementations
• Conclusions
Introduction
• Machine translation (MT) is a sub-field of natural language
processing that investigates the use of computers to
translate text from one language to another
• Rule-based MT (RBMT) is based on linguistic information
covering the main semantic, morphological, and syntactic
regularities of source and target languages
• Statistical MT (SMT) consists of subcomponents that are
separately engineered to learn how to translate from vast
amounts of translated text
• Neural MT (NMT) consists of a large neural network in which
weights are trained jointly to maximize the translation
performance
September 27, 2017 3
Introduction
Automatic Evaluation of MT
• BLEU - one of the first metrics to report high
correlation with human judgments
• One of the most popular in the field
• The closer MT is to a professional human
translation, the better it is
• Scores a translation on a scale of 0 to 100
September 27, 2017 4
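For illustration, a hypothesis can be scored against a reference on this 0-100 scale as sketched below; the sacrebleu package used here is an assumed helper, not a tool from the presented work:

```python
import sacrebleu

hypotheses = ["Žurka sēdēja uz grīdas"]
references = [["Žurka sēdēja uz paklāja"]]  # one reference stream, same length as hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(round(bleu.score, 2))  # corpus-level BLEU on the 0-100 scale
```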
Aim and objectives
The aim is to research and develop methods
and tools that allow combining output from
multiple different machine translation systems
into one superior final translation.
The primary focus is on MT issues related to
Latvian, although the reviewed and introduced
methods are generally applicable to other
languages as well.
September 27, 2017 5
Aim and objectives
Objectives:
• Analyze RBMT, SMT and NMT methods as well
as existing HMT and multi-system MT (MSMT)
methods
• Experiment with different methods of combining
translations
• Evaluate quality of the resulting translations
• Investigate the applicability of the methods for Latvian
and other morphologically rich, less-resourced
languages
• Provide practical applications of MT combining
September 27, 2017 6
Background and
related work
September 27, 2017 7
Rule-based MT
September 27, 2017 8
Rule-based MT with
Grammatical Framework
September 27, 2017 9
Corpus-based MT
September 27, 2017 10
English Latvian
The cat sat on the mat Kaķis sēdēja uz paklāja
The rat sat on the mat Žurka sēdēja uz paklāja
Hybrid MT
Statistical rule generation
– Rules for RBMT systems are generated from training corpora
Multi-pass
– Process data through RBMT first, and then through SMT
Multi-System hybrid MT
– Multiple MT systems run in parallel
September 27, 2017 11
Neural MT
September 27, 2017 12
Combining statistical
MT output
September 27, 2017 13
Combining statistical
machine translation output
• Full sentence translations
• Simple sentence fragments
• Advanced sentence fragments
• Exhaustive search
• Neural network language models
September 27, 2017 14
Full sentence translations
September 27, 2017 15
Sentence tokenization
Translation with APIs
Google Translate Bing Translator LetsMT
Selection of the best
translation
Output
Full sentence translations
September 27, 2017 16
Probabilities are calculated based on the observed entry with the longest matching history w_f^n:

p(w_n \mid w_1^{n-1}) = p(w_n \mid w_f^{n-1}) \prod_{i=1}^{f-1} b(w_i^{n-1}),

where the probability p(w_n \mid w_f^{n-1}) and the backoff penalties b(w_i^{n-1}) are given by an already-estimated language model.

Perplexity is then calculated using this probability:

b^{-\frac{1}{N} \sum_{i=1}^{N} \log_b q(x_i)},

where, given an unknown probability distribution p and a proposed probability model q, the model is evaluated by determining how well it predicts a separate test sample x_1, x_2, ..., x_N drawn from p.
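A minimal sketch of this selection step, assuming the KenLM Python bindings and an illustrative model file name (neither is shown on the slides):

```python
import kenlm

lm = kenlm.Model("jrc_acquis_lv.arpa")  # hypothetical path to the 5-gram KenLM model

def pick_best(candidates):
    """candidates: dict mapping system name -> full-sentence translation.
    Returns the (system, translation) pair with the lowest LM perplexity."""
    return min(candidates.items(), key=lambda kv: lm.perplexity(kv[1]))
```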
Experiments
September 27, 2017 17
• En→Lv
• Language model for the target language (Lv)
– JRC Acquis corpus version 3.0 (1.4M sentences)
– 5-gram LM trained with KenLM
• Parallel test sets
– 1581 random sentences from the JRC Acquis 3.0
– ACCURAT balanced test corpus for under-resourced
languages (512 sentences)
Experiments
September 27, 2017 18
ACCURAT balanced test corpus
System BLEU
Google Translate 24.73
Bing Translator 22.07
LetsMT! 32.01
Hybrid Google + Bing 23.75
Hybrid Google + LetsMT! 28.94
Hybrid LetsMT! + Bing 27.44
Hybrid Google + Bing + LetsMT! 26.74
Experiments
September 27, 2017 19
JRC Acquis test corpus
System | BLEU | TER | WER | Selected: Google | Selected: Bing | Selected: LetsMT | Selected: Equal
Google Translate | 16.92 | 47.68 | 58.55 | 100 % | - | - | -
Bing Translator | 17.16 | 49.66 | 58.40 | - | 100 % | - | -
LetsMT | 28.27 | 36.19 | 42.89 | - | - | 100 % | -
Hybrid Google + Bing | 17.28 | 48.30 | 58.15 | 50.09 % | 45.03 % | - | 4.88 %
Hybrid Google + LetsMT | 22.89 | 41.38 | 50.31 | 46.17 % | - | 48.39 % | 5.44 %
Hybrid LetsMT + Bing | 22.83 | 42.92 | 50.62 | - | 45.35 % | 49.84 % | 4.81 %
Hybrid Google + Bing + LetsMT | 21.08 | 44.12 | 52.99 | 28.93 % | 34.31 % | 33.98 % | 2.78 %
Human evaluation
September 27, 2017 20
• 5 native Latvian speakers were given a random 2% sample of the test set (32 sentences)
• They were asked to mark which of the three MT outputs was the best, which was the worst, and which was OK
• Multiple answers could be selected for best, worst or OK
Human evaluation
September 27, 2017 21
System | User 1 | User 2 | User 3 | User 4 | User 5 | AVG user | Hybrid selection | BLEU
Bing | 21.88% | 53.13% | 28.13% | 25.00% | 31.25% | 31.88% | 28.93% | 16.92
Google | 28.13% | 25.00% | 25.00% | 28.13% | 46.88% | 30.63% | 34.31% | 17.16
LetsMT! | 50.00% | 21.88% | 46.88% | 46.88% | 21.88% | 37.50% | 33.98% | 28.27
Simple sentence fragments
September 27, 2017 22
Sentence tokenization
Syntactic parsing
Sentence chunking (decomposition)
Translation with APIs
Google Translate | Bing Translator | LetsMT
Selection of the best translated chunk
Sentence recomposition
Output
Simple sentence fragments
September 27, 2017 23
Input (English source):
3. the list referred to in paragraph 1 and all amendments thereto shall be published in the official journal of the european communities .

Parse (Berkeley Parser output):
( (S (NP (NP (CD 3.)) (SBAR (S (NP (DT the) (NN list)) (VP (VBD referred) (PP (TO to)) (PP (IN in) (NP (NP (NN paragraph) (CD 1)) (CC and) (NP (DT all) (NNS amendments) (NN thereto)))))))) (VP (MD shall) (VP (VB be) (VP (VBN published) (PP (IN in) (NP (NP (DT the) (JJ official) (NN journal)) (PP (IN of) (NP (DT the) (JJ european) (NNS communities)))))))) (. .)) )

Chunks (from the top-level subtrees):
• 3. the list referred to in paragraph 1 and all amendments thereto
• shall be published in the official journal of the european communities
• .

Candidate chunk translations (each chunk is translated by Google Translate, Bing Translator and LetsMT):
• 3. sarakstu, kas minēts 1. punktā, un visus tā grozījumus
• 3. punktā minēto sarakstu un visus grozījumus 1
• 3. sarakstu, kas minētas punktā 1 un visi grozījumi tajos
• ir publicēti Eiropas kopienu oficiālajā žurnālā.
• publicē oficiālajā vēstnesī Eiropas Kopienu
• publicē Eiropas Kopienu Oficiālajā Vēstnesī

Recomposed output:
3. sarakstu, kas minēts 1. punktā, un visus tā grozījumus ir publicēti Eiropas kopienu oficiālajā žurnālā.
Experiments
Syntactic analysis
Berkeley Parser
Sentences are split into chunks from the top-level subtrees of the syntax tree (see the sketch below)
Selection of the best chunk
The same as in the previous experiment
(5-gram LM with KenLM using JRC-Acquis)
Test data
The same as in the previous experiment
(1581 random sentences from JRC-Acquis)
September 27, 2017 24
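A minimal sketch of this chunking step, assuming NLTK is available for reading the bracketed Berkeley Parser output shown on the earlier slide:

```python
from nltk import Tree

def top_level_chunks(bracketed_parse):
    """Split a sentence into chunks corresponding to the top-level
    subtrees of its constituency parse."""
    # Berkeley Parser wraps the sentence in an unlabeled root bracket
    tree = Tree.fromstring(bracketed_parse, remove_empty_top_bracketing=True)
    chunks = []
    for child in tree:
        words = child.leaves() if isinstance(child, Tree) else [child]
        chunks.append(" ".join(words))
    return chunks
```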
Experiments
Individual systems (BLEU / NIST):
Google Translate: 18.09 / 8.37
Bing Translator: 18.87 / 8.09
LetsMT!: 30.28 / 9.45

Hybrid combinations (BLEU MHyT / BLEU SyMHyT / NIST MHyT / NIST SyMHyT):
Google + Bing: 18.73 / 21.27 / 7.76 / 8.30
Google + LetsMT: 24.50 / 26.24 / 9.60 / 9.09
LetsMT! + Bing: 24.66 / 26.63 / 9.47 / 8.97
Google + Bing + LetsMT!: 22.69 / 24.72 / 8.57 / 8.24
September 27, 2017 25
Experiments
September 27, 2017 26
Additional Experiments
September 27, 2017 27
Language Model Size (sentences) BLEU
5-gram JRC 1.4 million 24.72
12-gram JRC 1.4 million 24.70
12-gram DGT-TM 3.1 million 24.04
Experiments with different language models
Experiments with random chunks
Chunks BLEU
SyMHyT chunks 24.72
5-grams 11.85
Random 1-4 grams 7.33
Random 1-6 grams 10.25
Random 6-max grams 20.94
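The random-chunk baselines above can be produced roughly as sketched below; the helper is illustrative, not the exact implementation:

```python
import random

def random_chunks(words, min_len=1, max_len=4):
    """Split a token sequence into consecutive chunks of random length."""
    chunks, i = [], 0
    while i < len(words):
        step = random.randint(min_len, min(max_len, len(words) - i))
        chunks.append(" ".join(words[i:i + step]))
        i += step
    return chunks
```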
Human evaluation
September 27, 2017 28
System | Fluency AVG | Accuracy AVG | SyMHyT selection | BLEU
Google | 35.29% | 34.93% | 16.83% | 18.09
Bing | 23.53% | 23.97% | 17.94% | 18.87
LetsMT | 20.00% | 21.92% | 65.23% | 30.28
SyMHyT | 21.18% | 19.18% | - | 24.72
Advanced sentence
fragments
An advanced approach to chunking (a simplified sketch follows after this slide)
– Traverse the syntax tree bottom-up, from right to left
– Add a word to the current chunk if
• the current chunk is not too long (sentence word count / 4)
• the word is non-alphabetic or only one symbol long
• the word begins a genitive phrase («of»)
– Otherwise, initialize a new chunk with the word
– If chunking results in too many chunks, repeat the process, allowing more words per chunk (more than sentence word count / 4)
Changes in the MT API systems
– LetsMT! API temporarily replaced with Hugo.lv API
– Added Yandex API
September 27, 2017 29
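A simplified, word-level sketch of the chunking heuristic referenced above; the actual method walks the syntax tree bottom-up and right to left, so this only mirrors the stated conditions:

```python
def chunk_sentence(words, limit_factor=4):
    """Greedy word-level approximation of the ChunkMT chunking heuristic.
    If the result has too many chunks, re-run with a smaller limit_factor
    (which allows longer chunks)."""
    max_len = max(1, len(words) // limit_factor)
    chunks, current = [], []
    for word in words:
        extend = (
            len(current) < max_len          # current chunk is not too long yet
            or not word.isalpha()           # non-alphabetic token
            or len(word) == 1               # only one symbol long
            or word.lower() == "of"         # begins a genitive phrase
        )
        if current and extend:
            current.append(word)
        else:
            if current:
                chunks.append(" ".join(current))
            current = [word]
    if current:
        chunks.append(" ".join(current))
    return chunks
```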
Advanced sentence
fragments
September 27, 2017 30
Experiments
September 27, 2017 31
Selection of the best translation:
– 6-gram and 12-gram LMs trained with KenLM
– Training corpora: JRC-Acquis v. 3.0 and DGT-Translation Memory (3.1 million sentences)
– Sentences scored with the query program from KenLM
Test corpora
– 1581 random sentences from JRC-Acquis
– ACCURAT balanced evaluation corpus
Experiments
September 27, 2017 32
Sentence chunks with SyMHyT:
• Recently
• there
• has been an increased interest in the automated discovery of equivalent expressions in different languages .

Sentence chunks with ChunkMT:
• Recently there has been an increased interest
• in the automated discovery of equivalent expressions
• in different languages .
Experiments
September 27, 2017 33
Experiments
September 27, 2017 34
System | BLEU | Equal | Bing | Google | Hugo | Yandex
Individual system BLEU | - | - | 17.43 | 17.73 | 17.14 | 16.04
MSMT Google + Bing | 17.70 | 7.25% | 43.85% | 48.90% | - | -
MSMT Google + Bing + LetsMT | 17.63 | 3.55% | 33.71% | 30.76% | 31.98% | -
SyMHyT Google + Bing | 17.95 | 4.11% | 19.46% | 76.43% | - | -
SyMHyT Google + Bing + LetsMT | 17.30 | 3.88% | 15.23% | 19.48% | 61.41% | -
ChunkMT Google + Bing | 18.29 | 22.75% | 39.10% | 38.15% | - | -
ChunkMT all four | 19.21 | 7.36% | 30.01% | 19.47% | 32.25% | 10.91%
Exhaustive search
The main differences:
• the manner of scoring chunks with the LM and selecting the best translation
• utilisation of multi-threaded computing, which allows running the process on all available CPU cores in parallel
• very slow compared to the other approaches
September 27, 2017 35
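A minimal sketch of the full-search combination, assuming per-chunk candidate translations and a KenLM model; the real implementation additionally parallelises the search over CPU cores:

```python
from itertools import product

def full_search(chunk_candidates, lm):
    """chunk_candidates: list over chunks, each a list of candidate strings
    (one per MT system).  Scores every combination with the language model
    and keeps the variant with the lowest perplexity.  With k systems and
    n chunks this evaluates k**n variants, which is why the method is slow."""
    best_sentence, best_ppl = None, float("inf")
    for combo in product(*chunk_candidates):
        sentence = " ".join(combo)
        ppl = lm.perplexity(sentence)
        if ppl < best_ppl:
            best_sentence, best_ppl = sentence, ppl
    return best_sentence, best_ppl
```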
Corpora statistics
September 27, 2017 36
Chunks | Combinations | Count (Legal) | Count (General) | Percentage (Legal) | Percentage (General)
1 | 4 | 210 | 16 | 13.28% | 3.13%
2 | 16 | 178 | 78 | 11.26% | 15.23%
3 | 64 | 262 | 131 | 16.57% | 25.59%
4 | 256 | 273 | 127 | 17.27% | 24.80%
5 | 1024 | 275 | 94 | 17.39% | 18.36%
6 | 4096 | 201 | 47 | 12.71% | 9.18%
7 | 16384 | 96 | 11 | 6.07% | 2.15%
8 | 65536 | 49 | 6 | 3.10% | 1.17%
9 | 262144 | 37 | 2 | 2.34% | 0.39%
Corpora statistics
September 27, 2017 37
Legal domain General domain
Experiments
September 27, 2017 38
System | BLEU (Legal) | BLEU (General)
Full-search | 23.61 | 14.40
Linguistic chunks | 20.00 | 17.27
Bing | 16.99 | 17.43
Google | 16.19 | 17.72
Hugo | 20.27 | 17.13
Yandex | 19.75 | 16.03
Experiment Results
September 27, 2017 39
System | Sentence | Perplexity
Full-search | Šis lēmums stājas spēkā tā publicēšanas dienā oficiālajā vēstnesī . | 16.57
ChunkMT | šo lēmumu . stājas spēkā tās publicēšanas dienā , oficiālajā vēstnesī . | 132.14
Other possible variant | šo lēmumu lēmums stājas spēkā trešajā dienā pēc tās publicēšanas valsts oficiālajā vēstnesī. | 54.31
Other possible variant | Šis lēmums lēmums stājas spēkā trešajā dienā pēc tās publicēšanas valsts oficiālajā vēstnesī . | 68.82
Other possible variant | Šis lēmums stājas spēkā tās publicēšanas dienā Savienības Oficiālajā Vēstnesī . | 21.79
Experiment Results
September 27, 2017 40
System | Chunks | Chunk perplexities
Bing | Šis lēmums lēmums stājas spēkā trešajā dienā pēc tās publicēšanas Savienības Oficiālajā Vēstnesī. | 70.73, 33.21, 678.29
Google | šis lēmums stājas spēkā tā publicēšanas dienā oficiālajā vēstnesī. | 568.43, 64.58, 6858.23
Hugo | šo lēmumu . stājas spēkā tās publicēšanas dienā , valsts oficiālajā vēstnesī. | 48.04, 23.91, 951.49
Yandex | šo lēmumu stājas spēkā tās publicēšanas dienā oficiālajā vēstnesī . | 760.09, 61.66, 164.97
Neural network
language models
September 27, 2017 41
• RWTHLM
• CPU only
• Feed-forward, recurrent (RNN) and long short-term
memory (LSTM) NNs
• MemN2N
• CPU or GPU
• End-to-end memory network (RNN with attention)
• Char-RNN
• CPU or GPU
• RNNs, LSTMs and gated recurrent units (GRU)
• Character level
Best models
September 27, 2017 42
• RWTHLM
• one feed-forward input layer with a 3-word
history, followed by one linear layer of 200
neurons with sigmoid activation function
• MemN2N
• internal state dimension of 150, linear part of
the state 75 and number of hops set to six
• Char-RNN
• 2 LSTM layers with 1024 neurons each and
the dropout set to 0.5
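A hedged PyTorch sketch of a character-level LSTM language model with the listed hyper-parameters; the Char-RNN tool used in the experiments is a separate implementation:

```python
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    """Character-level LM: 2 LSTM layers, 1024 units each, dropout 0.5."""
    def __init__(self, vocab_size, hidden_size=1024, num_layers=2, dropout=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers,
                            dropout=dropout, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, char_ids, state=None):
        # char_ids: (batch, seq_len) integer character indices
        emb = self.embed(char_ids)
        hidden, state = self.lstm(emb, state)
        return self.out(hidden), state   # logits over the next character
```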
Experiment
Environment
September 27, 2017 43
Training
• Baseline KenLM and RWTHLM models
• 8-core CPU with 16GB of RAM
• MemN2N
• GeForce Titan X (12GB, 3,072 CUDA cores)
12-core CPU and 64GB RAM
• Char-RNN
• Radeon HD 7950 (3GB, 1,792 cores)
8-core CPU and 16GB RAM
Translation
• All models
• 4-core CPU with 16GB of RAM
Experiment Results
September 27, 2017 44
System | Perplexity | Training corpus size | Trained on | Training time | BLEU
KenLM | 34.67 | 3.1M | CPU | 1 hour | 19.23
RWTHLM | 136.47 | 3.1M | CPU | 7 days | 18.78
MemN2N | 25.77 | 3.1M | GPU | 4 days | 18.81
Char-RNN | 24.46 | 1.5M | GPU | 2 days | 19.53
Char-RNN
General Domain
September 27, 2017 45
[Chart: Char-RNN hybrid on the general domain. Perplexity (left axis, 15-50) and BLEU (right axis, 12-17) plotted over training epochs 0.11-1.77. Series: Perplexity, BLEU-HY, BLEU-BG, with linear trend lines for BLEU-HY and BLEU-BG.]
Char-RNN
Legal Domain
September 27, 2017 46
[Chart: Char-RNN hybrid on the legal domain. Perplexity (left axis, 15-50) and BLEU (right axis, 16-25) plotted over training epochs 0.11-1.77. Series: Perplexity, BLEU-BG, BLEU-HY, with linear trend lines for BLEU-BG and BLEU-HY.]
Combining neural
MT output
September 27, 2017 47
Combining neural machine
translation output
• Experimenting with NMT attention alignments
• Simple system combination using neural
network attention
• System combination by estimating confidence
from neural network attention
September 27, 2017 48
Experimenting with NMT
attention alignments
Goals
• Improve translation of multi-word expressions (MWEs)
• Keep track of changes in attention alignments
September 27, 2017 49
Workflow
September 27, 2017 50
Tag corpora with morphological taggers (UDPipe, LV Tagger)
Identify MWE candidates (MWE Toolkit)
Align identified MWE candidates (MPAligner)
Shuffle MWEs into training corpora; train NMT systems (Neural Monkey)
Identify changes
Data
Training
– En → Lv
• 4.5M parallel sentences
for the baseline
• 4.8M after adding
MWEs/MWE sentences
– En → Cs
• 49M parallel sentences
for the baseline
• 17M after adding
MWEs/MWE sentences
Evaluation
– En → Lv
• 2003 sentences in total
• 611 sentences with at
least one MWE
– En → Cs
• 6000 sentences in total
• 112 sentences with at
least one MWE
September 27, 2017 51
WMT17 News Translation Task
Data
En → Lv
En → Cs
September 27, 2017 52
[Chart: training data configurations. En → Lv: 0.5M, 1M 1xMWE, 1M 2xMWE, 2M 2xMWE; En → Cs: 5M, 2.5M 1xMWE, 2.5M 2xMWE, 5M 2xMWE]
NMT Systems
Neural Monkey
– Embedding size 350
– Encoder state size 350
– Decoder state size 350
– Max sentence length 50
– BPE merges 30000
September 27, 2017 53
Experiments
Two forms of presenting MWEs to the NMT system:
– Adding only the parallel MWEs themselves
(MWE phrases)
each pair forming a new “sentence pair” in the parallel corpus
– Adding full sentences that contain the identified MWEs
(MWE sentences)
September 27, 2017 54
Dataset | En → Cs Dev | En → Cs MWE | En → Lv Dev | En → Lv MWE
Baseline | 13.71 | 10.25 | 11.29 | 9.32
+MWE phrases | - | - | 11.94 | 10.31
+MWE sentences | 13.99 | 10.44 | - | -
Alignment Inspection
September 27, 2017 55
Alignment Inspection
September 27, 2017 56
Simple system combination using
neural network attention
September 27, 2017 57
Simple system combination using
neural network attention
September 27, 2017 58
Simple system combination using
neural network attention
Workflow
• Translate the same sentence with two different NMT
systems and one SMT system; save attention
alignment data from the NMT systems
• Choose output from the system that does not
• align most of its attention to a single token
• have only very strong one-to-one alignments
• Otherwise, back off to the output of the SMT system
September 27, 2017 59
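A simplified numpy sketch of this selection heuristic, assuming each NMT system returns its translation together with an attention matrix of shape (target length, source length); the thresholds are illustrative only:

```python
import numpy as np

def attention_penalty(attn, peak_threshold=0.9):
    """Higher values mean more suspicious attention (concentrated or too peaky)."""
    attn = np.asarray(attn, dtype=float)
    # share of all attention mass absorbed by the single most-attended source token
    column_share = attn.sum(axis=0).max() / attn.sum()
    # fraction of target positions whose attention is an almost pure one-to-one peak
    peaky_rows = (attn.max(axis=1) > peak_threshold).mean()
    return column_share + peaky_rows

def choose_output(nmt_outputs, smt_output, max_penalty=1.0):
    """nmt_outputs: list of (translation, attention_matrix) pairs.
    Returns the least suspicious NMT output, or the SMT output as a fallback."""
    best_translation, best_attn = min(nmt_outputs,
                                      key=lambda pair: attention_penalty(pair[1]))
    if attention_penalty(best_attn) > max_penalty:
        return smt_output
    return best_translation
```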
Experiments
September 27, 2017 60
System En->Lv Lv->En
Dataset Dev Test Dev Test
LetsMT! 19.8 12.9 24.3 13.4
Neural Monkey 16.7 13.5 15.7 14.3
Nematus 16.9 13.6 15.0 13.8
NM+NT+LMT - 13.6 - 14.3
Data – WMT17 News Translation Task
System combination by estimating
confidence from neural network attention
CDP = -\frac{1}{J} \sum_j \log\big(1 + (1 - \sum_i \alpha_{ji})^2\big)

AP_{out} = -\frac{1}{I} \sum_i \sum_j \alpha_{ji} \cdot \log \alpha_{ji}

AP_{in} = -\frac{1}{I} \sum_j \sum_i \alpha_{ij} \cdot \log \alpha_{ij}

confidence = CDP + AP_{out} + AP_{in}
September 27, 2017 61
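A numpy sketch of these confidence terms; the sign conventions are an assumption chosen so that all three terms are non-positive and less confident translations score lower, matching the example values on the next slides, and they may differ in detail from the published implementation:

```python
import numpy as np

def attention_confidence(attn, eps=1e-12):
    """attn: attention matrix of one translation, shape (output_len, input_len),
    with each output row summing to 1 (assumption)."""
    attn = np.asarray(attn, dtype=float)
    output_len, input_len = attn.shape

    # Coverage deviation penalty: every input token should receive
    # roughly one unit of attention in total.
    coverage = attn.sum(axis=0)
    cdp = -np.log1p((1.0 - coverage) ** 2).sum() / input_len

    # Absentmindedness penalties: dispersed attention is penalised.
    # Implemented here as negative average entropy of the attention rows
    # (per output token) and of the column-normalised matrix (per input token).
    ap_out = (attn * np.log(attn + eps)).sum() / output_len
    cols = attn / (coverage + eps)
    ap_in = (cols * np.log(cols + eps)).sum() / input_len

    return cdp + ap_out + ap_in, (cdp, ap_out, ap_in)
```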
System combination by estimating
confidence from neural network attention
September 27, 2017 62
Source Viņš bija labs cilvēks ar plašu sirdi.
Reference He was a kind spirit with a big heart.
Hypothesis He was a good man with a wide heart.
CDP -0.099
APout -1.077
APin -0.847
Confidence -2.024
System combination by estimating
confidence from neural network attention
September 27, 2017 63
Source
Aizvadītajā diennaktī Latvijā reģistrēts 71 ceļu satiksmes negadījumos, kuros cietuši
16 cilvēki.
Reference
71 traffic accidents in which 16 persons were injured have happened in Latvia
during the last 24 hours.
Hypothesis
The first day of the EU’European Parliament is the first of the three years of the European Union .
CDP -0.900
APout -2.809
APin -2.137
Confidence -5.846
Experiments
September 27, 2017 64
BLEU
System En->De De->En En->Lv Lv->En
Neural Monkey 18.89 26.07 13.74 11.09
Nematus 22.35 30.53 13.80 12.64
Hybrid 20.19 27.06 14.79 12.65
Human 23.86 34.26 15.12 13.24
Data – WMT17 News Translation Task
Human Evaluation
September 27, 2017 65
En->Lv Lv->En
LM-based overlap with human 58% 56%
Attention-based overlap with human 52% 60%
LM-based overlap with Attention-based 34% 22%
Language pair CDP APin APout Overall
En->Lv 0.099 0.074 0.123 0.086
Lv->En -0.012 -0.153 -0.2 -0.153
Practical
implementations
September 27, 2017 66
Practical
implementations
• Interactive multi-system machine translation
• Visualizing neural machine translation
attention and confidence
September 27, 2017 67
Interactive multi-system
machine translation
September 27, 2017 68
Start page
• Translate with online systems (input source sentence)
• Input translations to combine (input source sentence)
• Input translated chunks
Settings
Translation results
Interactive multi-system
machine translation
• Adding a user-friendly interface to ChunkMT
– Draws a syntax tree with chunks highlighted
– Indicates which chunks were chosen from which system
– Provides a confidence score for the choices
• Allows using online APIs or user-provided translations
• Comes with resources for translating between
English, French, German and Latvian
• Can be used in a web browser
September 27, 2017 69
Practical
implementations
September 27, 2017 70
Practical
implementations
September 27, 2017 71
Visualizing NMT
attention and confidence
Works with attention alignment data from
• Nematus
• Neural Monkey
• AmuNMT
• OpenNMT
• Sockeye
Visualise translations in
• Linux Terminal or Windows PowerShell
• Web browser
• Line form or matrix form
• Save as PNG
• Sort and navigate dataset by confidence scores
September 27, 2017 72
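A minimal matplotlib sketch of the matrix-form view (source tokens on one axis, target tokens on the other, cell intensity = attention weight); the tool described above is a separate, more featureful implementation:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_attention(attn, src_tokens, trg_tokens, path="attention.png"):
    """attn: attention weights with shape (len(trg_tokens), len(src_tokens))."""
    attn = np.asarray(attn, dtype=float)
    fig, ax = plt.subplots()
    ax.imshow(attn, cmap="Greys", aspect="auto")
    ax.set_xticks(range(len(src_tokens)))
    ax.set_xticklabels(src_tokens, rotation=90)
    ax.set_yticks(range(len(trg_tokens)))
    ax.set_yticklabels(trg_tokens)
    fig.tight_layout()
    fig.savefig(path)   # corresponds to the "Save as PNG" option
```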
Visualizing NMT
attention and confidence
September 27, 2017 73
Visualizing NMT
attention and confidence
September 27, 2017 74
Conclusions
September 27, 2017 75
Conclusions
• Exploration of a variety of methods for
combining multiple MT systems
• Mostly focused on translating from and to
Latvian, but also on other morphologically
complex languages like Czech and German
• All results evaluated using automatic
metrics; most of them also using manual
human evaluation
September 27, 2017 76
Conclusions
• Hybrid MT combination via chunking outperformed the individual systems in translating long sentences
• Hybrid combination for NMT via attention alignments fits the emerging neural network technology and can distinguish low-quality translations from high-quality ones
• The graphical tools support performing translations while inspecting which parts of the output come from which individual system, as well as reviewing already generated translations to quickly locate better or worse results
September 27, 2017 77
Conclusions
Since, in most evaluations of both the chunking method and the attention-based method, the author observed improvements in automatic as well as human evaluation, the proposed hypothesis, that it is possible to achieve higher-quality MT than produced by each component system individually by combining output from multiple different MT systems, can be considered proven.
September 27, 2017 78
Main results
• Methods for hybrid machine translation
combination via chunking;
• Methods for hybrid neural machine
translation combination via attention
alignments;
• Graphical tools for overviewing the
processes.
September 27, 2017 79
Publications
• 11 publications
• 3 indexed in Web of Science
• 2 indexed in Scopus
• 10 peer reviewed
• Presented in
• 8 conferences
• 2 workshops
September 27, 2017 80
Publications
• Rikters, M., Fishel, M., (2017, September). Confidence
Through Attention. In the proceedings of The 16th Machine
Translation Summit.
• Rikters, M., Bojar, O. (2017, September). Paying Attention
to Multi-word Expressions in Neural Machine Translation. In
the proceedings of The 16th Machine Translation Summit.
• Rikters, M., Amrhein, C., Del, M., Fishel, M. (2017b,
September). C-3MA: Tartu-Riga-Zurich Translation Systems
for WMT17. In the proceedings of The 2nd Conference on
Machine Translation.
September 27, 2017 81
Publications
• Rikters, M., Fishel, M., Bojar, O. (2017a, August).
Visualizing Neural Machine Translation Attention and
Confidence. In The Prague Bulletin for Mathematical
Linguistics issue 109.
• Rikters, M. (2016d, December). Neural Network Language
Models for Candidate Scoring in Hybrid Multi-System
Machine Translation. In CoLing 2016, 6th Workshop on
Hybrid Approaches to Translation (HyTra 6).
• Rikters, M. (2016c, October). Searching for the Best
Translation Combination Across All Possible Variants. In
The 7th Conference on Human Language Technologies -
the Baltic Perspective (Baltic HLT 2016) (pp. 92-96).
September 27, 2017 82
Publications
• Rikters, M. (2016b, September). Interactive multi-system
machine translation with neural language models. In
Frontiers in Artificial Intelligence and Applications.
• Rikters, M. (2016a, July). K-Translate-Interactive Multi-
System Machine Translation. In The 12th International Baltic
Conference on Databases and Information Systems (pp.
304-318). Springer International Publishing.
• Rikters, M., Skadiņa, I. (2016b, May). Syntax-based multi-system machine translation. In N. Calzolari (Conference Chair) et al. (Eds.), Proceedings of The 10th International Conference on Language Resources and Evaluation (LREC 2016). Paris, France: European Language Resources Association (ELRA).
September 27, 2017 83
Publications
• Rikters, M., Skadiņa, I. (2016a, April) Combining machine
translated sentence chunks from multiple MT systems. In
The 17th International Conference on Intelligent Text
Processing and Computational Linguistics (CICLing 2016).
• Rikters, M. (2015, July). Multi-system machine translation
using online APIs for English-Latvian. In ACL-IJCNLP 2015,
4th Workshop on Hybrid Approaches to Translation (HyTra
4).
September 27, 2017 84