KantanNeural™ from A to Z
1/3: To NMT or not to NMT?
Dimitar Shterionov
The Rise of MT
[Figure: Quality of MT over time. Relative quality (y-axis) plotted against time (x-axis), with marks at 1954, 1966, 1970, 1982, 1993, 2003, 2005, 2016 and 2020.]
31/07/2017 KantanFest, Dublin, Ireland 2
Breakthrough in Neural MT
Yet another MT paradigm?
Which technique is faster?
Which technique is better?
How can I integrate NMT in my pipeline?
How can I compare PBSMT and NMT?
How can I improve my NMT engine?
When to use PBSMT and when NMT?
Is NMT better than PBSMT???
Can NMT be better than PBSMT???
• Various empirical evaluations (since 2015)
…
Scientific Rigour – NMT vs PBSMT
• Experiment Setup
  ◦ Identical Training, Test and Tune Data
  ◦ NMT training limited to 4 days
• Evaluation:
  ◦ Automated Scores: F-Measure, TER, BLEU
  ◦ Ranking with KantanLQR™, A/B Testing
• Publications and Presentations
  ◦ EAMT 2017
  ◦ MT Summit 2017
  ◦ LocWorld34 NMT GALA Track
• A small parenthesis…
There are so many factors:
  ◦ Learning algorithm and rate
  ◦ Number of epochs
  ◦ ANN properties
  ◦ Data – preprocessing, segmentation
You need the right data!
Training: Identical Corpora

| Language Arc | Parallel Sentences | TWC | UWC | Domain(s) |
|---|---|---|---|---|
| English->German | 8,820,562 | 110,150,238 | 859,167 | Legal/Medical |
| English->Chinese(Simplified) | 6,522,064 | 84,426,931 | 956,864 | Legal/Technical |
| English->Japanese | 8,545,366 | 87,252,129 | 676,244 | Legal/Technical |
| English->Italian | 2,756,185 | 35,295,535 | 765,930 | Medical |
| English->Spanish | 3,681,332 | 44,917,538 | 952,089 | Legal |
Training: Automated Scores

| Language Arc | SMT F-Measure | SMT BLEU | SMT TER | SMT Time | NMT F-Measure | NMT BLEU | NMT TER | NMT Perplexity | NMT Time |
|---|---|---|---|---|---|---|---|---|---|
| English->German | 62.00% | 54.08% | 54.31% | 18h | 62.53% | 47.53% | 53.41% | 3.02 | 92h |
| English->Chinese(Simplified) | 77.16% | 45.36% | 46.85% | 6h | 71.85% | 39.39% | 47.01% | 2.00 | 10h |
| English->Japanese | 80.04% | 63.27% | 43.77% | 9h | 69.51% | 40.55% | 49.46% | 1.89 | 68h |
| English->Italian | 69.74% | 56.98% | 42.54% | 8h | 64.88% | 42.00% | 48.73% | 2.70 | 83h |
| English->Spanish | 71.53% | 54.78% | 41.87% | 9h | 69.41% | 49.24% | 44.89% | 2.59 | 71h |

“In information theory, perplexity is a measurement of how well a
probability distribution or probability model predicts a sample. It may be
used to compare probability models. A low perplexity indicates the
probability distribution is good at predicting the sample.”
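The definition quoted above can be made concrete with a few lines of Python. This is a minimal sketch of the textbook formula (the exponential of the average negative log-probability per token), not the computation used by the NMT toolkit behind the table; `perplexity` and its `token_probs` argument are hypothetical names chosen for illustration.

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """Perplexity of a model on a sample.

    token_probs holds the probability the model assigned to each token
    that actually occurred in the sample (an assumed input format).
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token, then exponentiate.
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# A model that gives every observed token probability 0.5 is, on average,
# as uncertain as a choice between two equally likely options.
print(perplexity([0.5, 0.5, 0.5, 0.5]))
```

Read this way, a perplexity near 2 means the model hesitates, on average, between about two equally likely next tokens, while a perplexity of 1 would mean it predicts every token with certainty.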
Training: Automated Scores
[Figure: bar chart (scale 0 to 90) of SMT-FM, SMT-BLEU, SMT-TER, NMT-FM, NMT-BLEU and NMT-TER for English->German, English->Chinese(S), English->Japanese, English->Italian and English->Spanish.]
Alternative translations (BLEU score of each MT output in parentheses)

EN→DE
Source: All dossiers must be individually analysed by the ministry responsible for the economy and scientific policy.
Reference: Jeder Antrag wird von den Dienststellen des zuständigen Ministers für Wirtschaft und Wissenschaftspolitik individuell geprüft.
PBSMT (BLEU 58%): Alle Unterlagen müssen einzeln analysiert werden von den Dienststellen des zuständigen Ministers für Wirtschaft und Wissenschaftspolitik.
NMT (BLEU 0%): Alle Unterlagen müssen von dem für die Volkswirtschaft und die wissenschaftliche Politik zuständigen Ministerium einzeln analysiert werden.

ES→EN
Source: En este punto muestro mi desacuerdo con el informe.
Reference: On this point, I am not in agreement with the report before us.
PBSMT (BLEU 72%): At this point, I am not in agreement with the report.
NMT (BLEU 7%): In this point I disagree with the report.

ES→EN
Source: Debemos apoyarles a todos para que alcancen este objetivo.
Reference: We must give them all our support to reach that goal.
PBSMT (BLEU 100%): We must give them all our support to reach that goal.
NMT (BLEU 0%): We have to support everyone to achieve this goal.
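The gap between the two systems on the last example follows directly from how BLEU counts n-gram overlap. The sketch below is a simplified, unsmoothed sentence-level BLEU written for illustration; it is an assumption, not the scorer used to produce the percentages on the slide, and `tokenize`, `ngram_counts` and `sentence_bleu` are hypothetical helper names.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase the text and split words from punctuation.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def ngram_counts(tokens: List[str], n: int) -> Counter:
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed BLEU for one hypothesis against one reference."""
    hyp, ref = tokenize(hypothesis), tokenize(reference)
    if not hyp:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp, n)
        # Clipped matches: an n-gram counts at most as often as in the reference.
        matches = sum((hyp_counts & ngram_counts(ref, n)).values())
        if matches == 0:
            return 0.0  # without smoothing, one empty n-gram order zeroes the score
        log_precision_sum += math.log(matches / sum(hyp_counts.values()))
    # Brevity penalty for hypotheses no longer than the reference.
    brevity_penalty = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return brevity_penalty * math.exp(log_precision_sum / max_n)


reference = "We must give them all our support to reach that goal."
pbsmt = "We must give them all our support to reach that goal."
nmt = "We have to support everyone to achieve this goal."

print(sentence_bleu(pbsmt, reference))  # 1.0: identical to the reference
print(sentence_bleu(nmt, reference))    # 0.0: no 3-gram is shared with the reference
```

The NMT output is a fluent paraphrase, yet it shares no three-word sequence with the reference, so an unsmoothed BLEU drops to zero. That is consistent with the 100% and 0% shown for this sentence pair.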
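Ranking

Average Scores from A/B Testing (in percent)

| Preferred | EN→ZH-CN | EN→JA | EN→DE | EN→IT | EN→ES | AVERAGE |
|---|---|---|---|---|---|---|
| Same | 37 | 21 | 13 | 24 | 10 | 21 |
| SMT | 24 | 21 | 34 | 19 | 28 | 25.2 |
| NMT | 39 | 58 | 53 | 56 | 62 | 53.6 |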
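The AVERAGE figures can be reproduced from the per-arc percentages. The short check below uses only the numbers shown on the slide; the variable names are illustrative.

```python
from statistics import mean

ARCS = ["EN→ZH-CN", "EN→JA", "EN→DE", "EN→IT", "EN→ES"]

# Share of A/B comparisons (in percent) per outcome, in the order of ARCS.
AB_RESULTS = {
    "Same": [37, 21, 13, 24, 10],
    "SMT": [24, 21, 34, 19, 28],
    "NMT": [39, 58, 53, 56, 62],
}

# Unweighted mean over the five language arcs.
averages = {label: round(mean(scores), 1) for label, scores in AB_RESULTS.items()}
print(averages)

# Sanity check: the three outcomes of each arc should add up to about 100.
totals = [sum(column) for column in zip(*AB_RESULTS.values())]
print(dict(zip(ARCS, totals)))
```

The means come out at 21, 25.2 and 53.6, matching the slide. The per-arc totals are 100 everywhere except EN→IT, which adds up to 99, presumably because the published percentages were rounded.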
BLEU underestimation of NMT
• Take the translations from the NMT engine considered better than their PBSMT counterparts.
• How many of those are scored by BLEU lower than their PBSMT counterparts?
• Do the same for the PBSMT translations.

Share of preferred translations that BLEU scored lower than the other system's output:

| System | EN→ZH-CN | EN→JA | EN→DE | EN→IT | EN→ES | Average |
|---|---|---|---|---|---|---|
| NMT | 40% | 59% | 55% | 34% | 53% | 48% |
| PBSMT | 12% | 0% | 9% | 9% | 0% | 6% |
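The procedure described above can be sketched in a few lines. This is an illustration under assumptions, not the code behind the reported figures: the `Segment` record, its field names and the toy values are all hypothetical, and the toy data is invented purely to show the calculation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    preferred: str     # human A/B judgement: "NMT", "PBSMT" or "Same"
    bleu_nmt: float    # sentence-level BLEU of the NMT output
    bleu_pbsmt: float  # sentence-level BLEU of the PBSMT output


def underestimation_rate(segments: Iterable[Segment], system: str) -> float:
    """Fraction of segments where `system` was preferred by evaluators
    but received a lower BLEU score than the competing system."""
    if system not in ("NMT", "PBSMT"):
        raise ValueError("system must be 'NMT' or 'PBSMT'")
    preferred = [s for s in segments if s.preferred == system]
    if not preferred:
        return 0.0
    if system == "NMT":
        lower = [s for s in preferred if s.bleu_nmt < s.bleu_pbsmt]
    else:
        lower = [s for s in preferred if s.bleu_pbsmt < s.bleu_nmt]
    return len(lower) / len(preferred)


# Invented toy data, NOT taken from the study.
toy = [
    Segment("NMT", bleu_nmt=0.10, bleu_pbsmt=0.60),
    Segment("NMT", bleu_nmt=0.55, bleu_pbsmt=0.40),
    Segment("PBSMT", bleu_nmt=0.30, bleu_pbsmt=0.70),
    Segment("Same", bleu_nmt=0.50, bleu_pbsmt=0.50),
]

print(underestimation_rate(toy, "NMT"))    # 0.5
print(underestimation_rate(toy, "PBSMT"))  # 0.0
```

Segments judged "Same" are left out of both rates, since the question only concerns translations that evaluators ranked as better.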
Take-away messages…
• NMT is a new efficient paradigm for MT
• NMT does not solve the problem of language … but it is getting there
• NMT can be much better than PBSMT
• Evaluating NMT:
  ◦ BLEU, TER, F-Measure may underestimate NMT when compared to PBSMT
  ◦ Using KantanLQR™ (A/B Testing) facilitates MT ranking
To NMT or not to NMT?
Quality Evaluation
Thank you…

More Related Content

Similar to Kantanfest: Dimitar Shterionov - Part 1

CLiC-it 2018 Presentation
CLiC-it 2018 PresentationCLiC-it 2018 Presentation
CLiC-it 2018 Presentation
Oronzo Antonelli
 
Self-charging, Highly Accurate Insole-Based Health Trackers for Medical Grade...
Self-charging, Highly Accurate Insole-Based Health Trackers for Medical Grade...Self-charging, Highly Accurate Insole-Based Health Trackers for Medical Grade...
Self-charging, Highly Accurate Insole-Based Health Trackers for Medical Grade...
INVIZA® HEALTH
 
Bagging-Clustering Methods to Forecast Time Series
Bagging-Clustering Methods to Forecast Time SeriesBagging-Clustering Methods to Forecast Time Series
Bagging-Clustering Methods to Forecast Time Series
Tiago Mendes Dantas
 
Tablet tools and micro tasks
Tablet tools and micro tasksTablet tools and micro tasks
Tablet tools and micro tasks
mhilde
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Association for Computational Linguistics
 
Decision Making Using The Analytic Hierarchy Process
Decision Making Using The Analytic Hierarchy ProcessDecision Making Using The Analytic Hierarchy Process
Decision Making Using The Analytic Hierarchy Process
Vaibhav Gaikwad
 
chapter 17.pdf
chapter 17.pdfchapter 17.pdf
chapter 17.pdf
Bagian Pembangunan
 
Supply Chain Performance at ETC Final Presentation
Supply Chain Performance at ETC Final PresentationSupply Chain Performance at ETC Final Presentation
Supply Chain Performance at ETC Final Presentation
Mark Cigich
 
Steven Lugard
Steven LugardSteven Lugard
Steven Lugard
Investnet
 
Fall 2014 Co-op Rotation Summary
Fall 2014 Co-op Rotation SummaryFall 2014 Co-op Rotation Summary
Fall 2014 Co-op Rotation Summary
Ash Abel
 
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
multimediaeval
 
Machine Learning using biased data
Machine Learning using biased dataMachine Learning using biased data
Machine Learning using biased data
Arnaud de Myttenaere
 
Log11 uitwerking opdrachten
Log11 uitwerking opdrachtenLog11 uitwerking opdrachten
Log11 uitwerking opdrachten
Arthur van der Molen
 
EPFL workshop on sparsity
EPFL workshop on sparsityEPFL workshop on sparsity
EPFL workshop on sparsity
Juri Ranieri
 
P 1-2+3-Marcel_Meijer
P 1-2+3-Marcel_MeijerP 1-2+3-Marcel_Meijer
P 1-2+3-Marcel_Meijer
Marcel Meijer
 
Convolutional Neural Network to Model Articulation Impairments in Patients wi...
Convolutional Neural Network to Model Articulation Impairments in Patients wi...Convolutional Neural Network to Model Articulation Impairments in Patients wi...
Convolutional Neural Network to Model Articulation Impairments in Patients wi...
Juan Camilo Vasquez
 
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Jinho Choi
 
Instrument Condition Based Monitoring.ppt
Instrument Condition Based Monitoring.pptInstrument Condition Based Monitoring.ppt
Instrument Condition Based Monitoring.ppt
muhamadzulhelmibinmo
 
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic ...
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic ...Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic ...
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic ...
Yusuke Oda
 
Infineon DPS310 Capacitive Pressure Sensor
Infineon DPS310 Capacitive Pressure SensorInfineon DPS310 Capacitive Pressure Sensor
Infineon DPS310 Capacitive Pressure Sensor
Yole Developpement
 

Similar to Kantanfest: Dimitar Shterionov - Part 1 (20)

CLiC-it 2018 Presentation
CLiC-it 2018 PresentationCLiC-it 2018 Presentation
CLiC-it 2018 Presentation
 
Self-charging, Highly Accurate Insole-Based Health Trackers for Medical Grade...
Self-charging, Highly Accurate Insole-Based Health Trackers for Medical Grade...Self-charging, Highly Accurate Insole-Based Health Trackers for Medical Grade...
Self-charging, Highly Accurate Insole-Based Health Trackers for Medical Grade...
 
Bagging-Clustering Methods to Forecast Time Series
Bagging-Clustering Methods to Forecast Time SeriesBagging-Clustering Methods to Forecast Time Series
Bagging-Clustering Methods to Forecast Time Series
 
Tablet tools and micro tasks
Tablet tools and micro tasksTablet tools and micro tasks
Tablet tools and micro tasks
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Decision Making Using The Analytic Hierarchy Process
Decision Making Using The Analytic Hierarchy ProcessDecision Making Using The Analytic Hierarchy Process
Decision Making Using The Analytic Hierarchy Process
 
chapter 17.pdf
chapter 17.pdfchapter 17.pdf
chapter 17.pdf
 
Supply Chain Performance at ETC Final Presentation
Supply Chain Performance at ETC Final PresentationSupply Chain Performance at ETC Final Presentation
Supply Chain Performance at ETC Final Presentation
 
Steven Lugard
Steven LugardSteven Lugard
Steven Lugard
 
Fall 2014 Co-op Rotation Summary
Fall 2014 Co-op Rotation SummaryFall 2014 Co-op Rotation Summary
Fall 2014 Co-op Rotation Summary
 
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
T he SPL - IT Query by Example Search on Speech system for MediaEval 2014
 
Machine Learning using biased data
Machine Learning using biased dataMachine Learning using biased data
Machine Learning using biased data
 
Log11 uitwerking opdrachten
Log11 uitwerking opdrachtenLog11 uitwerking opdrachten
Log11 uitwerking opdrachten
 
EPFL workshop on sparsity
EPFL workshop on sparsityEPFL workshop on sparsity
EPFL workshop on sparsity
 
P 1-2+3-Marcel_Meijer
P 1-2+3-Marcel_MeijerP 1-2+3-Marcel_Meijer
P 1-2+3-Marcel_Meijer
 
Convolutional Neural Network to Model Articulation Impairments in Patients wi...
Convolutional Neural Network to Model Articulation Impairments in Patients wi...Convolutional Neural Network to Model Articulation Impairments in Patients wi...
Convolutional Neural Network to Model Articulation Impairments in Patients wi...
 
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
Noise Pollution in Hospital Readmission Prediction: Long Document Classificat...
 
Instrument Condition Based Monitoring.ppt
Instrument Condition Based Monitoring.pptInstrument Condition Based Monitoring.ppt
Instrument Condition Based Monitoring.ppt
 
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic ...
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic ...Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic ...
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic ...
 
Infineon DPS310 Capacitive Pressure Sensor
Infineon DPS310 Capacitive Pressure SensorInfineon DPS310 Capacitive Pressure Sensor
Infineon DPS310 Capacitive Pressure Sensor
 

More from kantanmt

KantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas KazlauskasKantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas Kazlauskas
kantanmt
 
Kantanfest: Dimitar Shterionov - Part 2
Kantanfest: Dimitar Shterionov - Part 2Kantanfest: Dimitar Shterionov - Part 2
Kantanfest: Dimitar Shterionov - Part 2
kantanmt
 
Kantanfest: Laura Casanellas
Kantanfest: Laura CasanellasKantanfest: Laura Casanellas
Kantanfest: Laura Casanellas
kantanmt
 
KantanFest: Andy Way
KantanFest: Andy WayKantanFest: Andy Way
KantanFest: Andy Way
kantanmt
 
KantanFest: Tony O'Dowd
KantanFest: Tony O'DowdKantanFest: Tony O'Dowd
KantanFest: Tony O'Dowd
kantanmt
 
Get Started with KantanNeural
Get Started with KantanNeuralGet Started with KantanNeural
Get Started with KantanNeural
kantanmt
 
You Asked, We Will Answer
You Asked, We Will AnswerYou Asked, We Will Answer
You Asked, We Will Answer
kantanmt
 
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT SystemsATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
kantanmt
 
Cross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated TranslationCross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated Translation
kantanmt
 
Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...
kantanmt
 
Webinar automotive and engineering content 16.06.16
Webinar   automotive and engineering content 16.06.16Webinar   automotive and engineering content 16.06.16
Webinar automotive and engineering content 16.06.16
kantanmt
 
IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016
kantanmt
 
New Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine TranslationNew Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine Translation
kantanmt
 
Improving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine TranslationImproving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine Translation
kantanmt
 
How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...
kantanmt
 
How to Improve Translation Productivity
How to Improve Translation ProductivityHow to Improve Translation Productivity
How to Improve Translation Productivity
kantanmt
 
How to save 16 million euro for your start up business
How to save 16 million euro for your start up businessHow to save 16 million euro for your start up business
How to save 16 million euro for your start up business
kantanmt
 
What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?
kantanmt
 
Tips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine TranslationTips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine Translation
kantanmt
 
EAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMTEAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMT
kantanmt
 

More from kantanmt (20)

KantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas KazlauskasKantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas Kazlauskas
 
Kantanfest: Dimitar Shterionov - Part 2
Kantanfest: Dimitar Shterionov - Part 2Kantanfest: Dimitar Shterionov - Part 2
Kantanfest: Dimitar Shterionov - Part 2
 
Kantanfest: Laura Casanellas
Kantanfest: Laura CasanellasKantanfest: Laura Casanellas
Kantanfest: Laura Casanellas
 
KantanFest: Andy Way
KantanFest: Andy WayKantanFest: Andy Way
KantanFest: Andy Way
 
KantanFest: Tony O'Dowd
KantanFest: Tony O'DowdKantanFest: Tony O'Dowd
KantanFest: Tony O'Dowd
 
Get Started with KantanNeural
Get Started with KantanNeuralGet Started with KantanNeural
Get Started with KantanNeural
 
You Asked, We Will Answer
You Asked, We Will AnswerYou Asked, We Will Answer
You Asked, We Will Answer
 
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT SystemsATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
 
Cross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated TranslationCross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated Translation
 
Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...
 
Webinar automotive and engineering content 16.06.16
Webinar   automotive and engineering content 16.06.16Webinar   automotive and engineering content 16.06.16
Webinar automotive and engineering content 16.06.16
 
IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016
 
New Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine TranslationNew Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine Translation
 
Improving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine TranslationImproving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine Translation
 
How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...
 
How to Improve Translation Productivity
How to Improve Translation ProductivityHow to Improve Translation Productivity
How to Improve Translation Productivity
 
How to save 16 million euro for your start up business
How to save 16 million euro for your start up businessHow to save 16 million euro for your start up business
How to save 16 million euro for your start up business
 
What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?
 
Tips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine TranslationTips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine Translation
 
EAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMTEAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMT
 

Recently uploaded

Croatia vs Italy Modric's Last Dance Croatia's UEFA Euro 2024 Journey and Ita...
Croatia vs Italy Modric's Last Dance Croatia's UEFA Euro 2024 Journey and Ita...Croatia vs Italy Modric's Last Dance Croatia's UEFA Euro 2024 Journey and Ita...
Croatia vs Italy Modric's Last Dance Croatia's UEFA Euro 2024 Journey and Ita...
Eticketing.co
 
Poland vs Netherlands UEFA Euro 2024 Poland Battles Injuries Without Lewandow...
Poland vs Netherlands UEFA Euro 2024 Poland Battles Injuries Without Lewandow...Poland vs Netherlands UEFA Euro 2024 Poland Battles Injuries Without Lewandow...
Poland vs Netherlands UEFA Euro 2024 Poland Battles Injuries Without Lewandow...
Eticketing.co
 
The most promising young Players to watch during Euro 2024.docx
The most promising young Players to watch during Euro 2024.docxThe most promising young Players to watch during Euro 2024.docx
The most promising young Players to watch during Euro 2024.docx
Euro Cup 2024 Tickets
 
Euro Cup Group E Preview, Team Strategies, Key Players, and Tactical Insights...
Euro Cup Group E Preview, Team Strategies, Key Players, and Tactical Insights...Euro Cup Group E Preview, Team Strategies, Key Players, and Tactical Insights...
Euro Cup Group E Preview, Team Strategies, Key Players, and Tactical Insights...
Eticketing.co
 
Olympic 2024 Key Players and Teams to Watch in Men's and Women's Football at ...
Olympic 2024 Key Players and Teams to Watch in Men's and Women's Football at ...Olympic 2024 Key Players and Teams to Watch in Men's and Women's Football at ...
Olympic 2024 Key Players and Teams to Watch in Men's and Women's Football at ...
Eticketing.co
 
Luciano Spalletti Leads Italy's Transition at UEFA Euro 2024.docx
Luciano Spalletti Leads Italy's Transition at UEFA Euro 2024.docxLuciano Spalletti Leads Italy's Transition at UEFA Euro 2024.docx
Luciano Spalletti Leads Italy's Transition at UEFA Euro 2024.docx
Euro Cup 2024 Tickets
 
Spain vs Italy Spain Route to The Euro Cup 2024 Final Who La Roja Will Face I...
Spain vs Italy Spain Route to The Euro Cup 2024 Final Who La Roja Will Face I...Spain vs Italy Spain Route to The Euro Cup 2024 Final Who La Roja Will Face I...
Spain vs Italy Spain Route to The Euro Cup 2024 Final Who La Roja Will Face I...
Eticketing.co
 
一比一原版(Curtin毕业证)科廷大学毕业证如何办理
一比一原版(Curtin毕业证)科廷大学毕业证如何办理一比一原版(Curtin毕业证)科廷大学毕业证如何办理
一比一原版(Curtin毕业证)科廷大学毕业证如何办理
apobqx
 
Croatia's UEFA Euro 2024 Puzzle of Experience versus Youth.docx
Croatia's UEFA Euro 2024 Puzzle of Experience versus Youth.docxCroatia's UEFA Euro 2024 Puzzle of Experience versus Youth.docx
Croatia's UEFA Euro 2024 Puzzle of Experience versus Youth.docx
Euro Cup 2024 Tickets
 
Tennis rules and techniques with information
Tennis rules and techniques with informationTennis rules and techniques with information
Tennis rules and techniques with information
mohsintariq167876
 
Psaroudakis: Family and Football – The Psaroudakis Success Story
Psaroudakis: Family and Football – The Psaroudakis Success StoryPsaroudakis: Family and Football – The Psaroudakis Success Story
Psaroudakis: Family and Football – The Psaroudakis Success Story
Psaroudakis
 
Belgium vs Slovakia Belgium Euro 2024 Golden Generation Faces Euro Cup Final ...
Belgium vs Slovakia Belgium Euro 2024 Golden Generation Faces Euro Cup Final ...Belgium vs Slovakia Belgium Euro 2024 Golden Generation Faces Euro Cup Final ...
Belgium vs Slovakia Belgium Euro 2024 Golden Generation Faces Euro Cup Final ...
Eticketing.co
 
Euro 2024 Predictions - Group Stage Outcomes
Euro 2024 Predictions - Group Stage OutcomesEuro 2024 Predictions - Group Stage Outcomes
Euro 2024 Predictions - Group Stage Outcomes
Select Distinct Limited
 
JORNADA 11 LIGA MURO 2024BASQUETBOL1.pdf
JORNADA 11 LIGA MURO 2024BASQUETBOL1.pdfJORNADA 11 LIGA MURO 2024BASQUETBOL1.pdf
JORNADA 11 LIGA MURO 2024BASQUETBOL1.pdf
Arturo Pacheco Alvarez
 
Belgium vs Romania Ultimate Guide to Euro Cup 2024 Tactics, Ticketing, and Qu...
Belgium vs Romania Ultimate Guide to Euro Cup 2024 Tactics, Ticketing, and Qu...Belgium vs Romania Ultimate Guide to Euro Cup 2024 Tactics, Ticketing, and Qu...
Belgium vs Romania Ultimate Guide to Euro Cup 2024 Tactics, Ticketing, and Qu...
Eticketing.co
 
Sportr pitch deck for our saas based platform
Sportr pitch deck for our saas based platformSportr pitch deck for our saas based platform
Sportr pitch deck for our saas based platform
NathanielMDuncan
 
快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样
快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样
快速制作加拿大西蒙菲莎大学毕业证(sfu毕业证书)硕士学位证书原版一模一样
8z10jo1w
 
Paris 2024 History-making Matildas team selected for Olympic Games.pdf
Paris 2024 History-making Matildas team selected for Olympic Games.pdfParis 2024 History-making Matildas team selected for Olympic Games.pdf
Paris 2024 History-making Matildas team selected for Olympic Games.pdf
Eticketing.co
 
一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理
一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理
一比一原版(Columbia毕业证)哥伦比亚大学毕业证如何办理
asabad1
 
Turkey vs Georgia Tickets: Turkey's Provisional Squad for UEFA Euro 2024, Key...
Turkey vs Georgia Tickets: Turkey's Provisional Squad for UEFA Euro 2024, Key...Turkey vs Georgia Tickets: Turkey's Provisional Squad for UEFA Euro 2024, Key...
Turkey vs Georgia Tickets: Turkey's Provisional Squad for UEFA Euro 2024, Key...
Eticketing.co
 

Recently uploaded (20)

Croatia vs Italy Modric's Last Dance Croatia's UEFA Euro 2024 Journey and Ita...
Croatia vs Italy Modric's Last Dance Croatia's UEFA Euro 2024 Journey and Ita...Croatia vs Italy Modric's Last Dance Croatia's UEFA Euro 2024 Journey and Ita...
Croatia vs Italy Modric's Last Dance Croatia's UEFA Euro 2024 Journey and Ita...
 
Poland vs Netherlands UEFA Euro 2024 Poland Battles Injuries Without Lewandow...
Poland vs Netherlands UEFA Euro 2024 Poland Battles Injuries Without Lewandow...Poland vs Netherlands UEFA Euro 2024 Poland Battles Injuries Without Lewandow...
Poland vs Netherlands UEFA Euro 2024 Poland Battles Injuries Without Lewandow...
 
The most promising young Players to watch during Euro 2024.docx
The most promising young Players to watch during Euro 2024.docxThe most promising young Players to watch during Euro 2024.docx
The most promising young Players to watch during Euro 2024.docx
 
Euro Cup Group E Preview, Team Strategies, Key Players, and Tactical Insights...
Euro Cup Group E Preview, Team Strategies, Key Players, and Tactical Insights...Euro Cup Group E Preview, Team Strategies, Key Players, and Tactical Insights...
Euro Cup Group E Preview, Team Strategies, Key Players, and Tactical Insights...
 
Kantanfest: Dimitar Shterionov - Part 1

  • 1. KantanNeural™ from A to Z 1/3: To NMT or not to NMT? Dimitar Shterionov
  • 2. The Rise of MT. [Chart: quality of MT over time; x-axis: time, marked at 1954, 1966, 1970, 1982, 1993, 2003, 2005, 2016 and 2020; y-axis: relative quality] 31/07/2017 KantanFest, Dublin, Ireland
  • 3. Breakthrough in NeuralMT
  • 4. Yet another MT paradigm?
  • 5. Yet another MT paradigm? Which technique is faster? Which technique is better? How can I integrate NMT in my pipeline? How can I compare PBSMT and NMT? How can I improve my NMT engine? When to use PBSMT and when NMT?
  • 6. Yet another MT paradigm? (Same questions as slide 5.) Is NMT better than PBSMT???
  • 7. Yet another MT paradigm? (Same questions as slide 5.) Can NMT be better than PBSMT???
  • 8. Scientific Rigour – NMT vs PBSMT: Various empirical evaluations (since 2015) …
  • 9. Scientific Rigour – NMT vs PBSMT. Experiment Setup: identical Training, Test and Tune Data; NMT training limited to 4 days. Evaluation: Automated Scores (F-Measure, TER, BLEU); Ranking with KantanLQR™, A/B Testing. Publications and Presentations: EAMT 2017; MT Summit 2017; LocWorld34 NMT GALA Track.
  • 10. Scientific Rigour – NMT vs PBSMT. A small parenthesis… There are so many factors: learning algorithm and rate; number of epochs; ANN properties; data (preprocessing, segmentation). You need the right data!
  • 11. Training: Identical Corpora (TWC: total word count; UWC: unique word count)
      Language Arc                  Parallel Sentences  TWC          UWC      Domain(s)
      English->German               8,820,562           110,150,238  859,167  Legal/Medical
      English->Chinese(Simplified)  6,522,064           84,426,931   956,864  Legal/Technical
      English->Japanese             8,545,366           87,252,129   676,244  Legal/Technical
      English->Italian              2,756,185           35,295,535   765,930  Medical
      English->Spanish              3,681,332           44,917,538   952,089  Legal
  • 12. Training: Automated Scores
                                    SMT                              NMT
      Language Arc                  F-Measure  BLEU    TER     Time  F-Measure  BLEU    TER     Perplexity  Time
      English->German               62.00%     54.08%  54.31%  18h   62.53%     47.53%  53.41%  3.02        92h
      English->Chinese(Simplified)  77.16%     45.36%  46.85%  6h    71.85%     39.39%  47.01%  2.00        10h
      English->Japanese             80.04%     63.27%  43.77%  9h    69.51%     40.55%  49.46%  1.89        68h
      English->Italian              69.74%     56.98%  42.54%  8h    64.88%     42.00%  48.73%  2.70        83h
      English->Spanish              71.53%     54.78%  41.87%  9h    69.41%     49.24%  44.89%  2.59        71h
      “In information theory, perplexity is a measurement of how well a probability distribution or probability model predicts a sample. It may be used to compare probability models. A low perplexity indicates the probability distribution is good at predicting the sample.”
  • 13-14. Training: Automated Scores. [Bar chart: SMT-FM, SMT-BLEU, SMT-TER, NMT-FM, NMT-BLEU and NMT-TER for English->German, English->Chinese(S), English->Japanese, English->Italian and English->Spanish; y-axis from 0 to 90; shown next to the automated scores table from slide 12]
  • 15. Alternative translations (BLEU score of each output in parentheses)
      EN→DE
        Source: All dossiers must be individually analysed by the ministry responsible for the economy and scientific policy.
        Reference: Jeder Antrag wird von den Dienststellen des zuständigen Ministers für Wirtschaft und Wissenschaftspolitik individuell geprüft.
        PBSMT (BLEU 58%): Alle Unterlagen müssen einzeln analysiert werden von den Dienststellen des zuständigen Ministers für Wirtschaft und Wissenschaftspolitik.
        NMT (BLEU 0%): Alle Unterlagen müssen von dem für die Volkswirtschaft und die wissenschaftliche Politik zuständigen Ministerium einzeln analysiert werden.
      ES→EN
        Source: En este punto muestro mi desacuerdo con el informe.
        Reference: On this point, I am not in agreement with the report before us.
        PBSMT (BLEU 72%): At this point, I am not in agreement with the report.
        NMT (BLEU 7%): In this point I disagree with the report.
      ES→EN
        Source: Debemos apoyarles a todos para que alcancen este objetivo.
        Reference: We must give them all our support to reach that goal.
        PBSMT (BLEU 100%): We must give them all our support to reach that goal.
        NMT (BLEU 0%): We have to support everyone to achieve this goal.
  • 16-18. Ranking: Average Scores from A/B Testing (in percent). Built up over three slides, one series per slide in legend order (Same, SMT, NMT):
      System  EN→ZH-CN  EN→JA  EN→DE  EN→IT  EN→ES  AVERAGE
      Same    37        21     13     24     10     21
      SMT     24        21     34     19     28     25.2
      NMT     39        58     53     56     62     53.6
  • 19. BLEU underestimation of NMT. Take the translations from the NMT engine considered better than their PBSMT counterparts. How many of those are scored by BLEU lower than their PBSMT counterparts? Do the same for the PBSMT translations.
      System  EN→ZH-CN  EN→JP  EN→DE  EN→IT  EN→ES  Average
      NMT     40%       59%    55%    34%    53%    48%
      PBSMT   12%       0%     9%     9%     0%     6%
  • 20. Take-away messages… NMT is a new efficient paradigm for MT. NMT does not solve the problem of language. NMT can be much better than PBSMT. Evaluating NMT: BLEU, TER, F-Measure may underestimate NMT when compared to PBSMT; using KantanLQR™ (A/B Testing) facilitates MT ranking.
  • 21. Take-away messages… (As slide 20, with the second point now reading: NMT does not solve the problem of language … but it is getting there.) To NMT or not to NMT?
  • 22. Quality Evaluation. Thank you…
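
The perplexity column on slide 12 can be made concrete with a small sketch. It assumes the usual definition, the exponential of the average negative log-probability that a model assigns to each reference token; the slides do not say how KantanNeural™ computes its value. The function name `perplexity` and the token probabilities are invented for illustration.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Hypothetical probabilities a model assigns to each reference token.
confident = [0.5, 0.5, 0.5, 0.5]   # model is fairly sure of every token
uncertain = [0.1, 0.1, 0.1, 0.1]   # model spreads its probability mass widely

print(perplexity(confident))  # about 2
print(perplexity(uncertain))  # about 10
```

Under this definition, a perplexity of 2 means the model is, on average, as uncertain as if it were choosing between two equally likely tokens, which is why a lower value indicates a better fit to the sample.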

Editor's Notes

  1-13. According to the PBSMT paradigm, a sentence is translated phrase by phrase. The translation of each phrase is derived from a phrase table (i.e., a representation of a translation model). Then these phrase-level translations are combined in a sentence in a way that maximises the likelihood for a correct sentence in the target language (i.e., using a language model). Sometimes a third model is used to fix the casing.
  14-17. (Give 30 seconds for people to check, then ask which translation they prefer.)
  18. Next we aimed to investigate our hypothesis that BLEU underestimates NMT quality. To do so, we needed to find irregularities between human evaluation and BLEU scores. First, for each language pair, we took from the translations that the reviewers evaluated the set where all three reviewers marked NMT as better. Next, from this set we counted the number of translations with a BLEU score lower than that of their PBSMT counterparts. Third, we computed the ratio of the two counts. We did the same for PBSMT: get the set of better translations, count the ones with a BLEU score lower than their NMT counterparts, and calculate the ratio between the two numbers.
      It is clear from our results that BLEU is indeed not that reliable for NMT. Furthermore, these results indicate that BLEU underestimates the quality, thus confirming our hypothesis. Now, can we actually trust BLEU? Several remarks need to be made.
      First, the numbers shown in our table for each language pair are similar. This means that the effect of the BLEU underestimation is the same among the NMT engines; that is, we can compare NMT engines based on BLEU and still get a sense of their quality differences.
      Second, we notice the same tendency in the F-Measure score, which is also a metric based on n-grams. That indicates that the issues indeed arise from the underlying principles of PBSMT and NMT (recall the 2D picture with the points linked to the John/Mary sentences). This can push future research in quality estimation in a particular direction.
      Third, something not shown in a table or a graph. Remember that our engines are trained under a time restriction. Assume we let the training continue until the neural network reaches its full potential, that is, until it models the training data optimally. Given that the test data is very similar to the training data, the engine would then also model each test sentence very well, even on a phrase level. As such, the scores (BLEU, F-Measure and TER) would improve and get closer to, or even surpass, the PBSMT scores. This statement is supported by other research (e.g., Google's paper from November last year) that shows very good scores, but where each model is trained for almost two weeks.
  19-20. A translation production line nowadays typically combines an MT component with human post-editing. While the MT component is simply a means to get a raw translation of the original text, which in the next step is modified to meet certain translation quality standards, the choice of correct MT toolset impacts the efficiency of this pipeline.
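
The counting procedure described in the note on BLEU underestimation can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `underestimation_rate`, the record layout (a `votes` list with one entry per reviewer and a `bleu` score per system) and all numbers are assumptions made up for the example.

```python
def underestimation_rate(segments, preferred, other):
    """Share of segments that all reviewers judged better for `preferred`
    but that BLEU scored lower than the `other` system's output."""
    # Step 1: keep segments where every reviewer picked `preferred`.
    better = [s for s in segments
              if s["votes"] and all(v == preferred for v in s["votes"])]
    if not better:
        return 0.0
    # Step 2: among those, count the ones BLEU ranks below the other system.
    lower = [s for s in better if s["bleu"][preferred] < s["bleu"][other]]
    # Step 3: ratio of the two counts.
    return len(lower) / len(better)


# Invented toy data: three reviewers per segment, one BLEU score per system.
segments = [
    {"votes": ["NMT", "NMT", "NMT"], "bleu": {"NMT": 0.07, "PBSMT": 0.72}},
    {"votes": ["NMT", "NMT", "NMT"], "bleu": {"NMT": 0.60, "PBSMT": 0.40}},
    {"votes": ["PBSMT", "PBSMT", "PBSMT"], "bleu": {"NMT": 0.30, "PBSMT": 0.90}},
    {"votes": ["NMT", "PBSMT", "Same"], "bleu": {"NMT": 0.50, "PBSMT": 0.50}},
]

print(underestimation_rate(segments, "NMT", "PBSMT"))  # 0.5
print(underestimation_rate(segments, "PBSMT", "NMT"))  # 0.0
```

A high rate for one system and a low rate for the other, as in the table on slide 19, is the pattern read there as BLEU underestimating that system.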