State of the Machine Translation by Intento (stock engines, Jun 2019)

Konstantin Savenkov, co-Founder and CEO at Inten.to
STATE OF THE
MACHINE TRANSLATION
STOCK* MT MODELS
by Intento

June 2019
* commercially available pre-trained MT models
June 2019 © Intento, Inc.
DISCLAIMER
2
The MT systems used in this report were accessed from May 10 to June
10, 2019. They may have changed many times since then.
—
This report demonstrates the performance of those systems exclusively
on the datasets used for this report (see slide 13) using proximity scores.
The final MT decision requires Human LQA and depends on the use case.
—
We run multiple evaluations for our clients for various language pairs and
domains, observing different rankings of the MT systems.
—
There’s no “best” MT system. Performance depends on how similar your data
is to the data each vendor used to train its models, and on the vendor’s algorithms.
About
We have been evaluating models for Machine Translation since
May 2017 (Custom NMT as well).
—
As we show in this report, the Machine Translation landscape is
complex, with models from 8 different vendors required to get
the best quality across popular language pairs and a 300x
difference in price. And it changes often!
—
To evaluate on your own dataset, reach us at hello@inten.to
—
To conveniently use many MT engines at once, check out our
Enterprise MT Hub (next slide).
3
Intento Enterprise MT Hub
One place to evaluate and manage MT
Universal API to all MT engines
Single MT dashboard
Connects to many CAT, TMS and CMS
Works with files of any size
Get your API key at inten.to
4
Smart Routing with retries and failovers
MAY BE DEPLOYED ON PRIVATE CLOUD
Executive Summary
Overall MT Quality significantly improved for Finnish<>English,
German<>French, Romanian>English, Russian>English,
Chinese>English*
—
The best MT provider has changed for 19 language pairs since
January 2019. To get the best quality across 48 language pairs, one
needs 8 different engines (see slide 17).
—
Many engines increased their language coverage: Google,
Amazon, Kakao, Systran PNMT, SDL, PROMT, ModernMT, IBM (see
slide 23).
—
New pre-trained engines: Alibaba launched their MT internationally,
Tencent MT went from preview to production in China. More
providers added pre-trained engines: Tilde (EN-PL) and Cloud
Translation (EN-ZH Medical).
5
* Beyond fluctuations attributed to the datasets update.
Overview
1 TRANSLATION QUALITY
2 PRICING
3 LANGUAGE COVERAGE
4 HISTORICAL PROGRESS
5 CONCLUSIONS
48 Language Pairs
25 Machine Translation Engines
6
Machine Translation Engines*
with Pre-Trained General-Purpose Models
* We have evaluated general purpose Cloud Machine Translation services with pre-trained translation models, provided via API. Some vendors also
provide web-based, on-premise or custom MT engines, which may differ on all aspects from what we’ve evaluated.
Alibaba Cloud MT
Amazon Translate
Baidu Translate API
DeepL API
eBay Translation API
Google Cloud Translation API
GTCom YeeCloud MT
IBM Watson Language Translator
Kakao Developers Translation
Microsoft Translator Text API v3
ModernMT Enterprise API
Naver Cloud Papago NMT
NiuTrans Maverick Translation
PROMT Cloud API
SAP Translation Hub
SDL BeGlobal
SDL (SMT) Language Cloud
Sogou Deepi MT
SYSTRAN PNMT Enterprise Server
SYSTRAN REST Translation API
Tencent Cloud TMT API
Tilde Machine Translation
Yandex Translate API
Youdao Cloud Translation API
7
(MT systems highlighted in gray were unavailable for quantitative evaluation for different reasons)
1 Translation Quality
1.1 Evaluation Methodology
1.2 Available MT Quality
1.3 Top-Performing Engines
1.4 Best General-Purpose Engines
1.5 Optimal General-Purpose Engines
8
1.1 Evaluation methodology (I)
Translation quality is evaluated by computing the LEPOR score
between the reference translations and the MT output (Slide 11).
—
Currently, our goal is to evaluate the performance of translation
between the most popular languages (Slide 12).
—
We use public datasets from StatMT/WMT, CASMACAT News
Commentary and Tatoeba (Slide 13).
—
We have performed LEPOR metric convergence analysis to
identify the minimal viable number of segments in the dataset.
See Slide 14 for some details.
9
Evaluation methodology (II)
We consider MT services BEST for a language pair if
their hLEPOR scores are within 0.5% of the top hLEPOR score
available for this pair.
—
We consider MT services TOP for a language pair if their
hLEPOR scores are within 5% of the top hLEPOR score available
for this pair.
—
We consider MT services OPTIMAL for a language pair if
they are the cheapest among the services within 5% of the top
hLEPOR score available for this pair.
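The three labels can be sketched in a few lines of Python. This is an illustration only, not the report's tooling: the `Engine` record, the engine names, the scores and the prices are all hypothetical, and "within X%" is read here as relative to the leader's score, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    name: str
    hlepor: float             # corpus-level hLEPOR in 0..1 (hypothetical value)
    usd_per_1m_chars: float   # public base-tier price (hypothetical value)


def classify(engines, best_margin=0.005, top_margin=0.05):
    """Return (best, top, optimal) for one language pair.

    Assumption: an engine is "within X%" if its score is at least
    leader_score * (1 - X).
    """
    leader = max(e.hlepor for e in engines)
    best = [e.name for e in engines if e.hlepor >= leader * (1 - best_margin)]
    top_engines = [e for e in engines if e.hlepor >= leader * (1 - top_margin)]
    top = [e.name for e in top_engines]
    # OPTIMAL: the cheapest engine among the TOP group.
    optimal = min(top_engines, key=lambda e: e.usd_per_1m_chars).name
    return best, top, optimal


engines = [
    Engine("engine_a", 0.71, 20.0),
    Engine("engine_b", 0.69, 8.0),
    Engine("engine_c", 0.66, 15.0),
    Engine("engine_d", 0.65, 6.0),
]
best, top, optimal = classify(engines)
print(best, top, optimal)
```

With these made-up numbers the cheapest engine overall (engine_d) is not OPTIMAL, because it falls outside the 5% quality band.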
10
LEPOR score
LEPOR: automatic machine translation evaluation metric
considering the enhanced Length Penalty, n-gram Position
difference Penalty and Recall
—
In our evaluation, we used hLEPORA v.3.1:
—
(best metric at the ACL-WMT 2013 contest)
https://www.slideshare.net/AaronHanLiFeng/lepor-an-augmented-machine-translation-evaluation-metric-thesis-ppt
https://github.com/aaronlifenghan/aaron-project-lepor
LIKE BLEU,
BUT BETTER
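For orientation, the LEPOR family is usually written as follows in the public descriptions by the metric's author (linked above). This is a sketch from those descriptions, not the exact hLEPORA v.3.1 configuration used for this report; the weights $w$, $\alpha$ and $\beta$ are tunable and their values are not stated here.

```latex
\begin{aligned}
\mathrm{LEPOR} &= LP \times NPosPenal \times \mathrm{Harmonic}(\alpha R,\ \beta P) \\[4pt]
LP &=
\begin{cases}
e^{\,1 - r/c} & \text{if } c < r \\
1             & \text{if } c = r \\
e^{\,1 - c/r} & \text{if } c > r
\end{cases} \\[4pt]
NPosPenal &= e^{-NPD} \\[4pt]
HPR = \mathrm{Harmonic}(\alpha R,\ \beta P) &= \frac{\alpha + \beta}{\dfrac{\alpha}{R} + \dfrac{\beta}{P}} \\[4pt]
\mathrm{hLEPOR} &= \frac{w_{LP} + w_{NPosPenal} + w_{HPR}}
{\dfrac{w_{LP}}{LP} + \dfrac{w_{NPosPenal}}{NPosPenal} + \dfrac{w_{HPR}}{HPR}}
\end{aligned}
```

Here $c$ and $r$ are the candidate and reference lengths, $NPD$ is the normalized n-gram position difference, and $R$ and $P$ are recall and precision. In the published descriptions, the "A" variant of the system-level score is the arithmetic mean of the sentence-level hLEPOR scores.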
11
48 Language Pairs
* https://w3techs.com/technologies/overview/content_language/all
Language groups by
web popularity*:
P1 - ≥ 2.0% websites
P2 - 0.5%-2% websites
P3 - 0.1-0.3% websites
P4 - <0.1% websites
—
We focus on the en-P1, P1-en and (partially) P1-P1 pairs.
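Evaluated pairs (source language: target languages):
en: ru, ja, de, es, fr, pt, it, zh, cs, tr, fi, ro, ko, ar, nl
ru: en, de, es, fr, pt
ja: en, fr, zh
de: en, ru, ja, fr, it
es: en, zh
fr: en, ru, de, es
pt: en
it: en, de, pt
zh: en, ja, it
cs: en
tr: en
fi: en
ro: en
ko: en
ar: en
nl: en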
12
Datasets
WMT-2013 (translation task, news domain)
en-es, es-en
WMT-2015 (translation task, news domain)
fr-en, en-fr
WMT-2016 (translation task, news domain)
ro-en, en-ro
WMT-2018 (translation task, news domain)
tr-en, en-tr, cs-en
WMT-2019 (translation task, news domain)
zh-en, en-zh, en-cs, de-en, en-de, ru-en, en-ru, fi-en, en-fi, de-fr, fr-de
NewsCommentary-2011
en-ja, ja-en, en-pt, pt-en, en-it, it-en, ru-de, de-ru, ru-es, ru-fr, ru-pt, ja-fr, de-ja, es-zh, fr-ru, fr-es,
it-pt, zh-it, en-ar, ar-en, en-nl, nl-en, de-it, it-de, ja-zh, zh-ja
Tatoeba, JHE
en-ko, ko-en
13
LEPOR Convergence
We used 1,600 to 2,000 sentences per language pair. At that size the metric stabilizes,
and adding more sentences from the same domain won’t change the outcome.
[Figure: regularised hLEPOR scores vs. number of sentences, showing the aggregated mean and its confidence interval. Left panel: aggregated across all language pairs. Right panel: examples for individual language pairs.]
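A convergence check of this kind can be sketched as a running mean with a confidence interval over a growing number of sentences. The code below is a minimal illustration, not the analysis behind the figure: it uses synthetic sentence-level scores and a normal-approximation 95% interval, both of which are assumptions.

```python
import math
import random
import statistics


def running_mean_with_ci(scores, step=200, z=1.96):
    """Return (n, mean, half_width) rows for growing prefixes of the scores."""
    rows = []
    for n in range(step, len(scores) + 1, step):
        sample = scores[:n]
        mean = statistics.fmean(sample)
        # Normal-approximation confidence interval for the mean.
        half_width = z * statistics.stdev(sample) / math.sqrt(n)
        rows.append((n, mean, half_width))
    return rows


random.seed(0)
# Synthetic stand-in for sentence-level hLEPOR scores, clipped to [0, 1].
scores = [min(1.0, max(0.0, random.gauss(0.7, 0.15))) for _ in range(2000)]

rows = running_mean_with_ci(scores)
for n, mean, half_width in rows:
    print(f"{n:5d} sentences  mean={mean:.3f}  +/-{half_width:.3f}")
```

The interval narrows roughly with the square root of the sample size, so each additional batch of sentences moves the mean less than the previous one.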
14
Available MT Quality
[Matrix of language pairs showing the maximal available hLEPOR score, the minimal price for this quality, and the number of top-performing MT providers.]
Maximal available hLEPOR score (scale): >80 %, 70 %, 60 %, 50 %, 40 %, <40 %
Minimal price for this quality, per 1M char*: $$$$ ≥$100; $$$ $20-25; $$ $10-15; $ <$10
No. of top-performing MT providers** (source language: target language and count):
en: ru 5, ja 8, de 2, es 9, fr 7, pt 8, it 7, zh 4, cs 2, tr 3, fi 1, ro 4, ko 1, ar 5, nl 7
ru: en 3, de 5, es 5, fr 5, pt 3
ja: en 5, fr 4, zh 6
de: en 5, ru 5, ja 3, fr 2, it 5
es: en 9, zh 5
fr: en 9, ru 5, de 6, es 8
pt: en 9
it: en 6, de 6, pt 5
zh: en 3, ja 5, it 4
cs: en 5
tr: en 4
fi: en 2
ro: en 4
ko: en 1
ar: en 5
nl: en 9
* base pricing tier
** up to 5% worse than the leader, SMT and NMT counted separately
Check Appendix A for more detailed data.
Changed since last report: 25 pairs
Improved: fi-en, fr-de, ro-en, zh-en, en-fi, de-fr, ru-en
15
1.3 TOP Performing MT Providers
across 48 language pairs
[Chart: number of language pairs (axis 0 to 40) for which each engine is best, top 5% or optimal. Engines shown: deepl, google, amazon, modernmt, yandex, systran-pnmt, msft-nmt, ibm-nmt, tencent, baidu, sdl-beglobal, promt, gtcom, sdl-smt.]
best: Provides the best MT Quality for a language pair
top 5%: Within 5% of the best available MT Quality for a language pair
optimal: Provides the lowest price among the top 5% MT engines for a language pair
16
1.4 Best general-purpose MT engines
[Matrix: best engine for each evaluated language pair; source and target languages: en, ru, ja, de, es, fr, pt, it, zh, cs, tr, fi, ro, ko, ar, nl]
MT Engines: deepl, google, amazon, yandex, systran-pnmt, modernmt, ibm, promt, microsoft, tencent, baidu
In several cases, there’s no statistically significant difference between the top engines.
Check Appendix A for more detailed data.
Changed since Jan 2019: 19 pairs
17
Optimal* general-purpose MT engines
[Matrix: optimal engine for each evaluated language pair; source and target languages: en, ru, ja, de, es, fr, pt, it, zh, cs, tr, fi, ro, ko, ar, nl]
MT Engines: deepl, google, amazon, yandex, systran-pnmt, modernmt, ibm, promt, microsoft, tencent, baidu
* Cheapest with a performance within 5% of the best available engine for this language pair
Changed since Jan 2019: 12 pairs
18
Sample pair analysis: English-Chinese
based on WMT-19 dataset
LEPOR score   Providers                      Price range (per 1M characters)
71 %          Tencent (preview)
69 %          Baidu, GTCom, Google           $8-20
66 %          Amazon, Systran PNMT           $15-N/A
65 %          Microsoft, ModernMT, Yandex    $6-$1,000
BEST QUALITY: Tencent (preview)
TOP 5%: Tencent, Baidu, GTCom, Google
BEST PRICE IN TOP 5%: Baidu
19
Sample pair analysis: Finnish-English
[Chart: MT agreement vs. sentence difficulty]
source: Oulun poliisi onnistui tiistaina löytämään Raahessa kadonneen sienestäjän dronen avulla vain parissa minuutissa.
reference (hLEPOR 1): On Tuesday, using a drone, Oulu police found a missing mushroom picker in Raahe in only a few minutes.
best MT (hLEPOR 0.77): On Tuesday, the Oulu police managed to find a mushroom picker missing in Raahe with the help of a drone in just a few minutes.
other MT engines:
(hLEPOR 0.75) On Tuesday, the Oulu police managed to find a missing mushroom spider in Raahe in just a few minutes.
(hLEPOR 0.71) On Tuesday, the Oulu police managed to find a lost mushroom maker in Raahe in just a few minutes.
(hLEPOR 0.70) On Tuesday, the Oulu police managed to find a missing mushroom inhibitor with a drone in just a few minutes.
(hLEPOR 0.57) The police of Oulu managed to find a missing mushroom pilot in Raahe in just a couple of minutes.
(🤔) The police managed to find oulun Raahessa disappeared on tuesday dronen sienestäjän through on just one minute.
(🤦) Oulu police managed on Tuesday to find the Raahe missing mushroom pickers drone using just a couple of minutes.
(🙈) On Tuesday, the Oulu police managed to find a funeral in the Bible with the drone of the missing fungicides in just a few minutes.
To validate the MT choice, look at the sentences of median difficulty and high disagreement across different MT engines.
based on WMT-19 dataset
20
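One way to pick such sentences is sketched below. The slide does not define "difficulty" or "disagreement", so the definitions here are assumptions made for illustration: difficulty is taken as one minus the mean score across engines, and disagreement as the standard deviation across engines. The sentence ids and scores are hypothetical.

```python
import statistics


def pick_sentences_for_review(scores_by_sentence, band=0.1, k=3):
    """Pick up to k sentences near median difficulty with the highest disagreement.

    scores_by_sentence maps a sentence id to a list of per-engine scores.
    Assumed definitions (not taken from the report):
      difficulty   = 1 - mean score across engines
      disagreement = population standard deviation across engines
    """
    difficulty = {s: 1 - statistics.fmean(v) for s, v in scores_by_sentence.items()}
    disagreement = {s: statistics.pstdev(v) for s, v in scores_by_sentence.items()}
    median = statistics.median(difficulty.values())
    near_median = [s for s in difficulty if abs(difficulty[s] - median) <= band]
    return sorted(near_median, key=lambda s: disagreement[s], reverse=True)[:k]


scores = {
    "s1": [0.95, 0.94, 0.96],  # easy, engines agree
    "s2": [0.80, 0.55, 0.45],  # medium, engines disagree
    "s3": [0.62, 0.60, 0.61],  # medium, engines agree
    "s4": [0.20, 0.25, 0.15],  # hard, engines agree
    "s5": [0.90, 0.40, 0.50],  # medium, engines disagree strongly
}
picked = pick_sentences_for_review(scores, band=0.1, k=2)
print(picked)
```

Sentences on which all engines agree say little about which engine to choose, which is why the sketch ranks by disagreement after filtering on difficulty.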
2 Public pricing
USD per 1M symbols***
* volume estimation based on 4.79 symbols per word
** +20% for some language pairs
*** freemium volumes are not shown
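The footnote's 4.79 symbols per word makes it easy to convert a per-symbol price into a per-word budget. The sketch below shows the arithmetic; the $20 price and the 250,000-word volume are hypothetical inputs, and whether a vendor counts spaces as symbols is not covered here.

```python
SYMBOLS_PER_WORD = 4.79  # volume estimate from the footnote above


def usd_per_million_words(usd_per_million_symbols: float) -> float:
    """Convert a price per 1M symbols into a price per 1M words."""
    return usd_per_million_symbols * SYMBOLS_PER_WORD


def cost_usd(word_count: int, usd_per_million_symbols: float) -> float:
    """Estimate the cost of translating word_count words."""
    symbols = word_count * SYMBOLS_PER_WORD
    return symbols / 1_000_000 * usd_per_million_symbols


# Hypothetical example: $20 per 1M symbols, 250,000 words to translate.
print(round(usd_per_million_words(20.0), 2))
print(round(cost_usd(250_000, 20.0), 2))
```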
21
3 Language Coverage
3.1 Supported and Unique per Provider
3.2 Coverage by Language Popularity
22
3.1 Supported and Unique Language Pairs*
[Chart: total and unique language pairs per provider, log scale from 1 to 10,000. Providers in chart order: Niutrans, Google, Yandex, Microsoft v3, Sogou, Baidu, Amazon, Kakao, Systran, Tencent, SDL, PROMT, GTCom, SAP, DeepL, ModernMT, IBM Watson v3, Naver, Youdao, Alibaba, eBay, Tilde. Some providers are marked **.]
Unique language pairs: supported exclusively by one provider.
* where possible, we have checked via API if all language pairs advertised by the documentation are
supported and removed the pairs we were unable to locate in the API.
** as advertised (not validated via API)
23
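The "total" and "unique" counts can be derived from each provider's list of supported pairs, as in the sketch below. The provider names and pair lists are hypothetical; a real run would use the pairs confirmed via each provider's API.

```python
from collections import Counter


def coverage_stats(pairs_by_provider):
    """Count total and unique (source, target) pairs for each provider.

    A pair is unique if exactly one provider supports it.
    """
    counts = Counter(p for pairs in pairs_by_provider.values() for p in pairs)
    return {
        provider: {
            "total": len(pairs),
            "unique": sum(1 for p in pairs if counts[p] == 1),
        }
        for provider, pairs in pairs_by_provider.items()
    }


# Hypothetical catalog of supported language pairs.
catalog = {
    "provider_a": {("en", "de"), ("de", "en"), ("en", "fi")},
    "provider_b": {("en", "de"), ("de", "en")},
    "provider_c": {("en", "ko"), ("ko", "en"), ("en", "de")},
}
stats = coverage_stats(catalog)
for provider, row in stats.items():
    print(provider, row)
```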
Language popularity
Language groups by
web popularity*:
P1 - ≥ 2.0% websites
P2 - 0.5%-2% websites
P3 - 0.1-0.3% websites
P4 - <0.1% websites
* https://w3techs.com/technologies/overview/content_language/all
A total of 28,730 pairs possible; 14,136 are supported across all providers.
P1: en, ru, ja, de, es, fr, pt, it, zh
P2: pl, fa, tr, nl, ko, cs, ar, vi, el, sv, in, ro, hu
P3: da, sk, fi, th, bg, he, lt, uk, hr, no/nb, sr, ca, sl, lv, et
P4: hi, az, bs, ms, is, mk, bn, eu, ka, sq, gl, mn, kk, hy, se, uz, kr, ur, ta, nn, af, be, si, my, br, ne, sw, km, fil, ml, pa, …
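The totals above are consistent with counting ordered (source, target) pairs. The sketch below checks the arithmetic; the language count of 170 is not stated in the report and is inferred here from 170 × 169 = 28,730.

```python
def ordered_pairs(n_languages: int) -> int:
    """Number of directed language pairs among n languages (no self-pairs)."""
    return n_languages * (n_languages - 1)


POSSIBLE = 28_730   # pairs possible, from the slide
SUPPORTED = 14_136  # pairs supported across all providers, from the slide

# Inferred, not stated in the report: the language count that yields POSSIBLE.
n = next(k for k in range(2, 1000) if ordered_pairs(k) == POSSIBLE)
coverage = SUPPORTED / POSSIBLE
print(n, f"{coverage:.1%}")
```

The resulting share, a little over 49%, matches the coverage figure quoted for slide 25.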
24
3.2 Language coverage by popularity
49% of possible language pairs
[Matrix: share of supported language pairs for each combination of popularity groups P1 to P4; cell values range from 38% to 100%]
25
Language coverage
by service provider
NiuTrans Maverick Translation
Google Cloud Translation API
Yandex Translate API
Microsoft Translator Text API v3
Sogou Deepi MT
Baidu Translate API
Amazon Translate
Tencent Cloud TMT API (preview)
Youdao Cloud Translation API
Systran PNMT
SDL BeGlobal
PROMT Cloud API
SAP Translation Hub
DeepL API
IBM Watson Language Translator v3
ModernMT API
Naver Papago NMT
Alibaba Translate
GTCom YeeCloud MT
Kakao MT
eBay MT (preview)
Tilde MT
26
4 Historical Progress
4.1 Number of Cloud MT Vendors
4.2 MT Quality
4.3 Performance/Price Efficiency
27
4.1 Independent Cloud MT Vendors
with pre-trained models
Commercial: Alibaba, Amazon, Baidu, CloudTranslate, DeepL, Google, GTCom, IBM, Microsoft, ModernMT, Naver, Niutrans, PROMT, SAP, SDL, Sogou, Systran, Tilde, Tencent, Yandex, Youdao
Preview / Limited: eBay, Kakao
[Chart: number of Commercial and Preview vendors (axis 0 to 25) for Mar 18, Jul 18, Dec 18 and Jun 19]
28
4.2 Best available MT Quality
Stable growth of “almost perfect” pairs
[Chart: number of language pairs available at each level of LEPOR quality (axis 30 % to 80 %), out of the 48 pairs we evaluate since March 2018, for Mar 18, Jul 18, Dec 18 and Jun 19, with the best and worst pairs marked]
29
4.3 Best available Performance/Price Efficiency
Grows as NMT gets cheaper
Efficiency = (hLEPOR in %)² / (USD per 1M symbols)
—
Number of
language pairs
available at this level
of efficiency out of
48 pairs we
evaluated since
March 2018
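The efficiency formula above is simple enough to compute directly. The sketch below uses hypothetical scores and prices to show how the measure trades quality against cost; the function name is illustrative.

```python
def efficiency(hlepor_percent: float, usd_per_million_symbols: float) -> float:
    """Efficiency = (hLEPOR in %)^2 / (USD per 1M symbols)."""
    if usd_per_million_symbols <= 0:
        raise ValueError("price must be positive")
    return hlepor_percent ** 2 / usd_per_million_symbols


# Hypothetical engines: same quality at half the price doubles the efficiency.
print(efficiency(70.0, 10.0))
print(efficiency(70.0, 20.0))
# A higher score at the same price raises efficiency quadratically.
print(efficiency(80.0, 20.0))
```

Squaring the score weights quality more heavily than price, so a cheap engine with a clearly lower score does not automatically win.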
[Chart: number of language pairs at each efficiency level (axis 0 to 1,200) for Mar 18, Jul 18, Dec 18 and Jun 19, with the best and worst pairs marked]
30
5 Conclusions
MT Landscape continues to evolve, both in terms of
quality and price.
—
Language coverage is increasing faster than ever.
—
Even for the general domain, having the best quality
across 48 language pairs requires 8 engines used
simultaneously (and those are different from half a year
ago).
—
Re-evaluate your MT choice often to stay competitive.
31
Intento Professional Services
MT Evaluation and Integration
Training and statistically significant evaluation of NMT
engines, which may bring the most cost and time reduction
on the post-editing stage (see the example here).
—
Identifying a subset of MT results for fast and affordable
manual inspection (~200x reduction of LQA efforts).
—
LQA and HTER also available via our LSP partners.
—
MT Integration - SDK and connectors to open platforms and
in-house software.
—
Reach us at hello@inten.to
32
Intento Web-Tools
Human-Friendly UI
working directly with the
Intento API
—
Quick way to try every MT
engine and translate large
files without API
integration.
—
Available in preview at no
added cost to Intento API
34
SIGN UP at https://console.inten.to
Intento Plugins and Connectors
35
memoQ (private plugin)
—
SDL Trados (private plugin, also in SDL AppStore)
—
MateCat (private plugin)
—
YiCat (coming soon)
—
Also, many of the engines are available in
Smartcat.
—
Any Enterprise TMS via XLIFF connector.
—
Missing a connector? Reach us at hello@inten.to!
by Intento (https://inten.to)

June 2019
Intento, Inc.
hello@inten.to
2150 Shattuck Ave
Berkeley CA 94704
36
STATE OF THE
MACHINE TRANSLATION
STOCK* MT MODELS
* commercially available pre-trained MT models
Appendix A Average hLEPOR ranking
across all 48 language pairs
WARNING: This chart looks
cool but requires a high level
of color sensitivity. Also, there
are lots of overlapping circles.
Please look at slides 18 and
19 for more digestible data.
37
[Chart: average hLEPOR per engine]
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R... by ShapeBlue
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
ShapeBlue54 views
Five Things You SHOULD Know About Postman by Postman
Five Things You SHOULD Know About PostmanFive Things You SHOULD Know About Postman
Five Things You SHOULD Know About Postman
Postman40 views
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT by ShapeBlue
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITUpdates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
ShapeBlue91 views
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online by ShapeBlue
KVM Security Groups Under the Hood - Wido den Hollander - Your.OnlineKVM Security Groups Under the Hood - Wido den Hollander - Your.Online
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online
ShapeBlue102 views
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha... by ShapeBlue
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
ShapeBlue74 views
NTGapps NTG LowCode Platform by Mustafa Kuğu
NTGapps NTG LowCode Platform NTGapps NTG LowCode Platform
NTGapps NTG LowCode Platform
Mustafa Kuğu141 views
Business Analyst Series 2023 - Week 4 Session 7 by DianaGray10
Business Analyst Series 2023 -  Week 4 Session 7Business Analyst Series 2023 -  Week 4 Session 7
Business Analyst Series 2023 - Week 4 Session 7
DianaGray1080 views
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti... by ShapeBlue
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
ShapeBlue46 views
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N... by James Anderson
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...
GDG Cloud Southlake 28 Brad Taylor and Shawn Augenstein Old Problems in the N...
James Anderson133 views
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda... by ShapeBlue
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
ShapeBlue63 views

State of the Machine Translation by Intento (stock engines, Jun 2019)

  • 1. STATE OF THE MACHINE TRANSLATION STOCK* MT MODELS by Intento June 2019 * commercially available pre-trained MT models
  • 2. DISCLAIMER. The MT systems used in this report were accessed from May 10 to June 10, 2019. They may have changed many times since then. This report demonstrates the performance of those systems exclusively on the datasets used for this report (see slide 13) using proximity scores. The final MT decision requires human LQA and depends on the use case. We run multiple evaluations for our clients for various language pairs and domains, observing different rankings of the MT systems. There’s no “best” MT system. Performance depends on how similar your data is to the data the vendors used to train their models, and on their algorithms.
  • 3. About. We have been evaluating models for Machine Translation since May 2017 (Custom NMT as well). As we show in this report, the Machine Translation landscape is complex: models from 8 different vendors are required to get the best quality across popular language pairs, and prices differ by up to 300x. And it changes often! To evaluate on your own dataset, reach us at hello@inten.to. To conveniently use many MT engines at once, check out our Enterprise MT Hub (next slide).
  • 4. Intento Enterprise MT Hub: one place to evaluate and manage MT. Universal API to all MT engines; single MT dashboard; connects to many CAT, TMS and CMS; works with files of any size; smart routing with retries and failovers; may be deployed on private cloud. Get your API key at inten.to.
  • 5. Executive Summary. Overall MT quality significantly improved for Finnish<>English, German<>French, Romanian>English, Russian>English, Chinese>English*. The best MT provider has changed for 19 language pairs since January 2019. To get the best quality across 48 language pairs, one needs 8 different engines (see slide 17). Many engines increased their language coverage: Google, Amazon, Kakao, Systran PNMT, SDL, PROMT, ModernMT, IBM (see slide 23). New pre-trained engines: Alibaba launched their MT internationally, and Tencent MT went from preview to production in China. More providers added pre-trained engines: Tilde (EN-PL) and Cloud Translation (EN-ZH Medical). (* Beyond fluctuations attributed to the datasets update.)
  • 6. Overview: 1. Translation Quality; 2. Pricing; 3. Language Coverage; 4. Historical Progress; 5. Conclusions. 48 language pairs, 25 machine translation engines.
  • 7. Machine Translation Engines* with Pre-Trained General-Purpose Models: Alibaba Cloud MT, Amazon Translate, Baidu Translate API, DeepL API, eBay Translation API, Google Cloud Translation API, GTCom YeeCloud MT, IBM Watson Language Translator, Kakao Developers Translation, Microsoft Translator Text API v3, ModernMT Enterprise API, Naver Cloud Papago NMT, NiuTrans Maverick Translation, PROMT Cloud API, SAP Translation Hub, SDL BeGlobal, SDL (SMT) Language Cloud, Sogou Deepi MT, SYSTRAN PNMT Enterprise Server, SYSTRAN REST Translation API, Tencent Cloud TMT API, Tilde Machine Translation, Yandex Translate API, Youdao Cloud Translation API. (MT systems highlighted in gray on the slide were unavailable for quantitative evaluation for different reasons.) * We have evaluated general-purpose Cloud Machine Translation services with pre-trained translation models, provided via API. Some vendors also provide web-based, on-premise or custom MT engines, which may differ in all aspects from what we’ve evaluated.
  • 8. 1 Translation Quality: 1.1 Evaluation Methodology; 1.2 Available MT Quality; 1.3 Top-Performing Engines; 1.4 Best General-Purpose Engines; 1.5 Optimal General-Purpose Engines.
  • 9. 1.1 Evaluation methodology (I). Translation quality is evaluated by computing the LEPOR score between reference translations and the MT output (slide 11). Currently, our goal is to evaluate the performance of translation between the most popular languages (slide 12). We use public datasets from StatMT/WMT, CASMACAT News Commentary and Tatoeba (slide 13). We have performed a LEPOR metric convergence analysis to identify the minimal viable number of segments in the dataset. See slide 14 for some details.
  • 10. Evaluation methodology (II). We consider MT services BEST for a language pair if their hLEPOR scores are within the top 0.5% of the hLEPOR available for this pair. We consider MT services TOP for a language pair if their hLEPOR scores are within the top 5% of the hLEPOR available for this pair. We consider MT services OPTIMAL for a language pair if they are the cheapest among the services within the top 5% of the hLEPOR available for this pair.
  • 11. LEPOR score. LEPOR is an automatic machine translation evaluation metric considering the enhanced Length Penalty, n-gram Position difference Penalty and Recall. In our evaluation, we used hLEPORA v.3.1 (best metric at the ACL-WMT 2013 contest). Like BLEU, but better. https://www.slideshare.net/AaronHanLiFeng/lepor-an-augmented-machine-translation-evaluation-metric-thesis-ppt https://github.com/aaronlifenghan/aaron-project-lepor
  • 12. 48 Language Pairs. Language groups by web popularity*: P1 - ≥ 2.0% websites; P2 - 0.5%-2% websites; P3 - 0.1-0.3% websites; P4 - <0.1% websites. We focus on en-P1, P1-en and (partially) P1-P1. Evaluated pairs by source language: en → ru, ja, de, es, fr, pt, it, zh, cs, tr, fi, ro, ko, ar, nl; ru → en, de, es, fr, pt; ja → en, fr, zh; de → en, ru, ja, fr, it; es → en, zh; fr → en, ru, de, es; pt → en; it → en, de, pt; zh → en, ja, it; cs → en; tr → en; fi → en; ro → en; ko → en; ar → en; nl → en. * https://w3techs.com/technologies/overview/content_language/all
  • 13. Datasets. WMT-2013 (translation task, news domain): en-es, es-en. WMT-2015 (translation task, news domain): fr-en, en-fr. WMT-2016 (translation task, news domain): ro-en, en-ro. WMT-2018 (translation task, news domain): tr-en, en-tr, cs-en. WMT-2019 (translation task, news domain): zh-en, en-zh, en-cs, de-en, en-de, ru-en, en-ru, fi-en, en-fi, de-fr, fr-de. NewsCommentary-2011: en-ja, ja-en, en-pt, pt-en, en-it, it-en, ru-de, de-ru, ru-es, ru-fr, ru-pt, ja-fr, de-ja, es-zh, fr-ru, fr-es, it-pt, zh-it, en-ar, ar-en, en-nl, nl-en, de-it, it-de, ja-zh, zh-ja. Tatoeba, JHE: en-ko, ko-en.
  • 14. LEPOR Convergence. We used 1600-2000 sentences per language pair. The metric stabilizes, and adding more sentences from the same domain won’t change the outcome. [Charts: regularised hLEPOR scores vs. number of sentences, aggregated across all language pairs (aggregated mean with confidence interval) and examples for individual language pairs.]
  • 15. Available MT Quality. [Matrix of source language (rows) by target language (columns). Cell colour: maximal available hLEPOR score (>80%, 70%, 60%, 50%, 40%, <40%). Cell symbol: minimal price for this quality per 1M characters*: $$$$ ≥$100; $$$ $20-25; $$ $10-15; $ <$10. Cell number: number of top-performing MT providers**.] Number of top-performing providers per pair: en → ru 5, ja 8, de 2, es 9, fr 7, pt 8, it 7, zh 4, cs 2, tr 3, fi 1, ro 4, ko 1, ar 5, nl 7; ru → en 3, de 5, es 5, fr 5, pt 3; ja → en 5, fr 4, zh 6; de → en 5, ru 5, ja 3, fr 2, it 5; es → en 9, zh 5; fr → en 9, ru 5, de 6, es 8; pt → en 9; it → en 6, de 6, pt 5; zh → en 3, ja 5, it 4; cs → en 5; tr → en 4; fi → en 2; ro → en 4; ko → en 1; ar → en 5; nl → en 9. Changed since last report: 25 pairs; improved: fi-en, fr-de, ro-en, zh-en, en-fi, de-fr, ru-en. * Base pricing tier. ** Up to 5% worse than the leader; SMT and NMT counted separately. Check Appendix A for more detailed data.
  • 16. 1.3 TOP Performing MT Providers across 48 language pairs. [Bar chart: number of language pairs (0-40) per provider, in axis order: deepl, google, amazon, modernmt, yandex, systran-pnmt, msft-nmt, ibm-nmt, tencent, baidu, sdl-beglobal, promt, gtcom, sdl-smt.] Legend: best - provides the best MT quality for a language pair; top 5% - within 5% of the best available MT quality for a language pair; optimal - provides the lowest price among the top 5% MT engines for a language pair.
  • 17. 1.4 Best general-purpose MT engines. [Matrix of source by target language (en, ru, ja, de, es, fr, pt, it, zh, cs, tr, fi, ro, ko, ar, nl), each cell colour-coded by the best MT engine. Engines in the legend: deepl, google, amazon, yandex, systran-pnmt, modernmt, ibm, promt, microsoft, tencent, baidu.] Changed since Jan 2019: 19 pairs. In several cases, there’s no statistically significant difference between the top engines. Check Appendix A for more detailed data.
  • 18. Optimal* general-purpose MT engines. [Matrix of source by target language (en, ru, ja, de, es, fr, pt, it, zh, cs, tr, fi, ro, ko, ar, nl), each cell colour-coded by the optimal MT engine. Engines in the legend: deepl, google, amazon, yandex, systran-pnmt, modernmt, ibm, promt, microsoft, tencent, baidu.] Changed since Jan 2019: 12 pairs. * Cheapest with a performance within 5% of the best available engine for this language pair.
  • 19. Sample pair analysis: English-Chinese (based on WMT-19 dataset). LEPOR score, providers, price range per 1M characters: 71%: Tencent (preview), no price listed; 69%: Baidu, GTCom, Google, $8-20; 66%: Amazon, Systran PNMT, $15-N/A; 65%: Microsoft, ModernMT, Yandex, $6-$1,000. BEST QUALITY: Tencent (preview). TOP 5%: Tencent, Baidu, GTCom, Google. BEST PRICE IN TOP 5%: Baidu.
  • 20. Sample pair analysis: Finnish-English (based on WMT-19 dataset). To validate the MT choice, look at the sentences of median difficulty and high disagreement across different MT engines. [Chart: MT agreement vs. sentence difficulty.] Source: “Oulun poliisi onnistui tiistaina löytämään Raahessa kadonneen sienestäjän dronen avulla vain parissa minuutissa.” Reference (hLEPOR 1): “On Tuesday, using a drone, Oulu police found a missing mushroom picker in Raahe in only a few minutes.” Best MT (0.77): “On Tuesday, the Oulu police managed to find a mushroom picker missing in Raahe with the help of a drone in just a few minutes.” Other MT engines: (0.75) “On Tuesday, the Oulu police managed to find a missing mushroom spider in Raahe in just a few minutes.”; (0.71) “On Tuesday, the Oulu police managed to find a lost mushroom maker in Raahe in just a few minutes.”; (0.70) “On Tuesday, the Oulu police managed to find a missing mushroom inhibitor with a drone in just a few minutes.”; (0.57) “The police of Oulu managed to find a missing mushroom pilot in Raahe in just a couple of minutes.”; (🤔) “The police managed to find oulun Raahessa disappeared on tuesday dronen sienestäjän through on just one minute.”; (🤦) “Oulu police managed on Tuesday to find the Raahe missing mushroom pickers drone using just a couple of minutes.”; (🙈) “On Tuesday, the Oulu police managed to find a funeral in the Bible with the drone of the missing fungicides in just a few minutes.”
  • 21. 2 Public pricing, USD per 1M symbols***. [Chart: public price per provider.] * Volume estimation based on 4.79 symbols per word. ** +20% for some language pairs. *** Freemium volumes are not shown.
  • 22. 3 Language Coverage: 3.1 Supported and Unique per Provider; 3.2 Coverage by Language Popularity.
  • 23. 3.1 Supported and Unique Language Pairs*. [Bar chart, logarithmic scale (1 to 10000): total and unique language pairs per provider. Unique language pairs are those supported exclusively by one provider.] Total supported pairs, in chart order: Niutrans 13 340; Google 10 506; Yandex 7 482; Microsoft v3 3 782; Sogou 3 422; Baidu 756; Amazon 594; Kakao 342; Systran 139; Tencent 122; SDL 121; PROMT 111; GTCom 90; SAP 90; DeepL 72; ModernMT 56; IBM Watson v3 52; Naver 38; Youdao 24; Alibaba 20; eBay 2; Tilde 1. [The per-provider unique-pair counts and the ** markers are not legible in this transcript.] * Where possible, we have checked via API whether all language pairs advertised in the documentation are supported, and removed the pairs we were unable to locate in the API. ** As advertised (not validated via API).
  • 24. Language popularity. Language groups by web popularity*: P1 - ≥ 2.0% websites; P2 - 0.5%-2% websites; P3 - 0.1-0.3% websites; P4 - <0.1% websites. A total of 28730 pairs are possible; 14136 are supported across all providers. P1: en, ru, ja, de, es, fr, pt, it, zh. P2: pl, fa, tr, nl, ko, cs, ar, vi, el, sv, in, ro, hu. P3: da, sk, fi, th, bg, he, lt, uk, hr, no/nb, sr, ca, sl, lv, et. P4: hi, az, bs, ms, is, mk, bn, eu, ka, sq, gl, mn, kk, hy, se, uz, kr, ur, ta, nn, af, be, si, my, br, ne, sw, km, fil, ml, pa, … * https://w3techs.com/technologies/overview/content_language/all
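  • 25. 3.2 Language coverage by popularity: 49% of possible language pairs. [Matrix: share of supported language pairs between popularity groups P1-P4 (rows) and P1-P4 (columns). Cell values in extraction order: 100%, 100%, 63%, 38%, 63%, 100%, 100%, 100%, 63%, 100%, 100%, 100%, 63%, 63%, 63%, 99%.]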
  • 26. Language coverage by service provider. [One coverage chart per provider: NiuTrans Maverick Translation, Google Cloud Translation API, Yandex Translate API, Microsoft Translator Text API v3, Sogou Deepi MT, Baidu Translate API, Amazon Translate, Tencent Cloud TMT API (preview), Youdao Cloud Translation API, Systran PNMT, SDL BeGlobal, PROMT Cloud API, SAP Translation Hub, DeepL API, IBM Watson Language Translator v3, ModernMT API, Naver Papago NMT, Alibaba Translate, GTCom YeeCloud MT, Kakao MT, eBay MT (preview), Tilde MT.]
  • 27. 4 Historical Progress: 4.1 Number of Cloud MT Vendors; 4.2 MT Quality; 4.3 Performance/Price Efficiency.
  • 28. 4.1 Independent Cloud MT Vendors with pre-trained models. Commercial: Alibaba, Amazon, Baidu, CloudTranslate, DeepL, Google, GTCom, IBM, Microsoft, ModernMT, Naver, Niutrans, PROMT, SAP, SDL, Sogou, Systran, Tilde, Tencent, Yandex, Youdao. Preview / Limited: eBay, Kakao. [Chart: number of vendors (0-25), commercial and preview, for Mar 18, Jul 18, Dec 18 and Jun 19.]
  • 29. 4.2 Best available MT Quality: stable growth of “almost perfect” pairs. [Chart: number of language pairs available at each level of LEPOR quality (30% to 80%), out of the 48 pairs we have evaluated since March 2018, for Mar 18, Jul 18, Dec 18 and Jun 19, with the best and worst pairs marked.]
  • 30. 4.3 Best available Performance/Price Efficiency: grows as NMT gets cheaper. Efficiency = (hLEPOR in %)² / (USD per 1M symbols). [Chart: number of language pairs available at each level of efficiency (0 to 1 200), out of the 48 pairs we have evaluated since March 2018, for Mar 18, Jul 18, Dec 18 and Jun 19, with the best and worst pairs marked.]
  • 31. 5 Conclusions. The MT landscape continues to evolve, both in terms of quality and price. Language coverage is increasing faster than ever. Even for the general domain, having the best quality across 48 language pairs requires 8 engines used simultaneously (and those are different from half a year ago). Re-evaluate your MT choice often to stay competitive.
  • 32. Intento Professional Services: MT Evaluation and Integration. Training and statistically significant evaluation of NMT engines, which may bring the most cost and time reduction on the post-editing stage (see the example here). Identifying a subset of MT results for fast and affordable manual inspection (~200x reduction of LQA efforts). LQA and HTER are also available via our LSP partners. MT Integration: SDK and connectors to open platforms and in-house software. Reach us at hello@inten.to.
  • 33. Intento Enterprise MT Hub: one place to evaluate and manage MT. Universal API to all MT engines; single MT dashboard; connects to many CAT, TMS and CMS; works with files of any size; smart routing with retries and failovers; may be deployed on private cloud. Get your API key at inten.to.
  • 34. Intento Web-Tools. Human-friendly UI working directly with the Intento API. A quick way to try every MT engine and translate large files without API integration. Available in preview at no added cost to the Intento API. Sign up at https://console.inten.to
  • 35. Intento Plugins and Connectors. memoQ (private plugin); SDL Trados (private plugin, also in SDL AppStore); MateCat (private plugin); YiCat (coming soon). Also, many of the engines are available in Smartcat. Any Enterprise TMS via XLIFF connector. Missing a connector? Reach us at hello@inten.to!
  • 36. STATE OF THE MACHINE TRANSLATION: STOCK* MT MODELS (* commercially available pre-trained MT models), by Intento (https://inten.to), June 2019. Intento, Inc., hello@inten.to, 2150 Shattuck Ave, Berkeley CA 94704.
  • 37. Appendix A: Average hLEPOR ranking across all 48 language pairs. WARNING: This chart looks cool but requires a high level of color sensitivity. Also, there are lots of overlapping circles. Please look at slides 18 and 19 for more digestible data. [Chart: average hLEPOR per engine.]
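
The BEST / TOP / OPTIMAL definitions from slide 10 and the efficiency formula from slide 30 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Intento's actual tooling. It reads "within the top 5%" as a relative margin from the leader's score, which is consistent with the English-Chinese example on slide 19 (69% is in, 66% is out, with a leader at 71%). The scores are the rounded values from slide 19. The per-engine prices are hypothetical placeholders chosen inside the $8-20 range that slide gives, since the slide does not list them per engine. The names `Engine`, `within`, `classify` and `efficiency` are made up for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Engine:
    name: str
    hlepor: float           # hLEPOR score, in percent
    price: Optional[float]  # USD per 1M characters; None if unknown or not needed


def within(engines: List[Engine], margin_pct: float) -> List[Engine]:
    """Engines whose score is at most margin_pct (relative) below the leader."""
    leader = max(e.hlepor for e in engines)
    cutoff = leader * (1 - margin_pct / 100)
    return [e for e in engines if e.hlepor >= cutoff]


def classify(engines: List[Engine]) -> Tuple[List[Engine], List[Engine], Optional[Engine]]:
    """Return (BEST, TOP, OPTIMAL) following the definitions on slide 10."""
    best = within(engines, 0.5)   # within 0.5% of the leader
    top = within(engines, 5.0)    # within 5% of the leader
    priced = [e for e in top if e.price is not None]
    optimal = min(priced, key=lambda e: e.price) if priced else None
    return best, top, optimal


def efficiency(hlepor_pct: float, usd_per_million: float) -> float:
    """Efficiency = (hLEPOR in %)^2 / (USD per 1M symbols), as on slide 30."""
    return hlepor_pct ** 2 / usd_per_million


# English-Chinese scores from slide 19; all prices below are assumptions.
engines = [
    Engine("Tencent", 71.0, None),   # preview, no price listed on the slide
    Engine("Baidu", 69.0, 8.0),      # assumed: low end of the $8-20 range
    Engine("GTCom", 69.0, 10.0),     # hypothetical placeholder
    Engine("Google", 69.0, 20.0),    # assumed: high end of the $8-20 range
    Engine("Amazon", 66.0, None),    # prices left out: outside the top 5%
    Engine("Systran PNMT", 66.0, None),
    Engine("Microsoft", 65.0, None),
    Engine("ModernMT", 65.0, None),
    Engine("Yandex", 65.0, None),
]

best, top, optimal = classify(engines)
print("BEST:", [e.name for e in best])
print("TOP 5%:", [e.name for e in top])
print("OPTIMAL:", optimal.name if optimal else None)
print("Efficiency of the optimal engine:", efficiency(optimal.hlepor, optimal.price))
```

With these inputs the sketch reproduces the slide 19 summary (Tencent as best, Tencent, Baidu, GTCom and Google in the top 5%, Baidu as the best price in the top 5%). Swapping in scores and prices from your own evaluation is the intended use.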