KantanFest: Andy Way

The ADAPT Centre is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
Andy Way
ADAPT Centre @ Dublin City University
Separating the Hype from the Reality:
Neural Machine Translation

www.adaptcentre.ie
The Time for MT is

The hype begins …

The hype gathers pace …
Thanks to Sheila Castilho

And translators go (again) …
Thanks to Sheila Castilho

Leads to Us vs. Them Mentality (again)

Translators are quite well-disposed towards technology
LeBlanc 2013, 2017
Koskinen & Ruokonen 2017
Thanks to Dorothy Kenny

Translators are quite well-disposed towards technology
“The public thinks that technology takes
care of translation. Good technology is all
about making good translators better”
Jost Zetzsche (@jeromobot)
24th June 2017

So, should translators be afraid of NMT?

Labradoodles, or Fried Chicken?

Chihuahuas, or Muffins?

Philipp Koehn, Omniscien Webinar 2017
MT has been overhyped for years …

So what can we learn from the past?
• Why did we start doing SMT (and Hybrid MT)?
• We wrote a (I thought) good paper on EBMT, submitted it to ACL, and were rejected by all three reviewers. Why?
• Because we hadn't compared our results with 'state-of-the-art' SMT!

Phrase-Based SMT then came along in earnest
• “Let the data decide”!

• But what have we spent the last 10 years doing?
  – Smuggling in Syntax, Semantics, and (lately) Discourse features to break through the glass ceiling.


SMT & Linguistics
• SMT practitioners now know about the value of linguistic information
• cf. Alex Fraser's keynote at EAMT-16:
  – agreement phenomena (gender, person, number, case),
  – verbal inflection,
  – compounding,
  – terminology,
  – lexical/structural ambiguity,
  – pronouns ...

What’s happened since?
• Deep Learning came along and took off!
• “Let the data decide”!
• Recent (accepted) ACL 2016 paper on SMT: “you haven't compared your results with 'state-of-the-art' NMT”!

This isn’t rocket science!

What is the actual situation?
• Wins for NMT for numerous language pairs at IWSLT/WMT 2015 & WMT 2016
• Bentivogli et al. (2016; arXiv, EMNLP)
  – IWSLT 2015 English-German: NMT compared to 4 SMT systems
  – Automatic Evaluation:
    • NMT outperforms the SMT systems in every length bin, with statistically significant differences
  – Human Evaluation:
    • NMT makes at least 19% fewer morphology errors than SMT
    • NMT makes at least 17% fewer lexical errors than SMT
    • NMT translations require about 50% fewer shifts than SMT
    • NMT reduces verb order errors by 70% with respect to the best SMT system
    • NMT reduces noun order errors by 47% with respect to the best SMT system
    • NMT gains also for prepositions (-18%), negation particles (-17%) and articles (-4%)
    • NMT generates outputs that considerably lower the overall post-editing effort with respect to the best SMT system (-26%)

Other Use-Cases
• NMT for E-Commerce
• NMT for Patents
• NMT for MOOCs
[Castilho et al. 2017, EAMT]
• Five other human evaluations of NMT/SMT at
EAMT 2017 (inc. from ))

NMT for E-Commerce
• Translate product listings
• Systems (Calixto et al. 2017, EACL):
  – (1) a PBSMT baseline model built with the Moses SMT Toolkit
  – (2) a text-only NMT model (NMTt)
  – (3) a multi-modal NMT model (NMTm)
• English into German
• Data set: 24k parallel product listings + images
  – Validation/test data: 480/444 tuples
• 18 German native speakers
• Ranking
  – Translations from the 3 systems + product image
• Adequacy (Likert scale from 1, “All of it”, to 4, “None of it”)
  – Source + translation + product image

NMT for E-Commerce
• Automatic evaluation metrics (AEM):
  – PBSMT outperforms both NMT models (BLEU, METEOR and chrF3)
  – NMTm performs as well as PBSMT (TER)
• Adequacy
  – NMTm performs as well as PBSMT
• Ranking
  – PBSMT: 56.3% preferred system
  – NMTm: 24.8%
  – NMTt: 18.8%

NMT for Patents
• Compare performance of mature patent MT engines used in production with new NMT system
• Systems
  – PBSMT (a combination of elements of phrase-based, syntactic, and rule-driven MT, along with automatic post-editing)
  – NMT (baseline)
• English into Chinese
• Data set: ~1M sentence pairs of chemical abstracts, ~350K chemical titles, ~12M general patent, and ~2K glossaries
• 2 reviewers
• Ranking
• Error analysis
  – Punctuation, part of speech, omission, addition, wrong terminology, literal translation, and word form

NMT for Patents
• AEM:
  – SMT outperforms NMT for abstracts, NMT outperforms SMT for titles
• Ranking
  – General: PBSMT 54% vs. NMT 39%
  – Long sentences: PBSMT 58% vs. NMT 33%
  – Short sentences: PBSMT 84% vs. NMT 8%
  – Medium-length sentences: PBSMT 36% vs. NMT 57%
• Error analysis
  – SMT: sentence structure 35% (10% NMT)
  – NMT: 37% omission (8% SMT)
  – % segments with “no errors”: SMT 25% vs. NMT 2%

NMT for MOOCs
• Decide which system would provide better quality translations for the project domain
• Systems
  – PBSMT (Moses)
  – NMT (baseline)
• English into German, Greek, Portuguese and Russian
• Data set:
  – OFD: ~24M (DE), ~31M (EL), ~32M (PT), ~22M (RU)
  – In-domain: ~270K (DE), ~140K (EL), ~58K (PT), ~2M (RU)
• Ranking
• Post-editing
• Fluency and Adequacy (1-4 Likert scale)
• Error analysis: inflectional morphology, word order, omission, addition, and mistranslation

• AEM:
  – NMT outperforms SMT in terms of BLEU and METEOR
  – More post-editing (PE) for SMT
• Fluency and Adequacy
  – NMT is preferred across all languages for Fluency
  – Adequacy results a bit less consistent

• Post-editing
  – Technical effort improved for DE, but marginally for other languages
  – Temporal effort marginally improved
• Ranking
  – NMT is preferred across all languages (DE 80%, EL 56%, PT 61% and RU 63%)


Observations (from an old guy)

• MT is hard; it’s about as hard a problem as we’ve come up with.
• Just by adopting a new paradigm, the problems don’t become any easier.
• (Some) newcomers to the field will soon find that MT is too
hard for them and will disappear …
• The same thing happened with SMT – people came into the
field, published an ACL paper with their favourite statistical
method and ran off to their next field.
• For them, MT was just another application, whereas some of us
have been doing this for half our lives and more!

Concluding Remarks
• NMT results are really promising!
• But … human evaluations show that results are not yet so clear-cut
• Especially where data is scarce, NMT hopelessly underperforms compared to SMT
• Translation industry is eager for improved MT quality in order to minimise costs
• The hype around NMT must be treated cautiously; overselling a technology that is still in need of more research may cause more negativity about MT

Food for Thought?
• Imagine NMT really is better than SMT:
  – for all domains
  – for all language pairs
• Is the translation industry set up to provide this technology now?
• If not, what needs to happen? And by when? Who can help?
• Finally: training NMT engines typically takes weeks rather than days for SMT.
  – What’s the impact on the climate of all these GPU servers running 24/7?

Thanks for listening!
