KantanMT Framework: Analysing MT Quality Using Professional Translators

No Hardware. No Software. No Hassle MT.
#EUATC2015

.
What we aim to cover today
 About KantanMT.com
 Who are we and what we stand for?
 Who Measures MT Quality?
 Project Managers
 Engine Developers
 Translators
 KantanMT LQR Framework
 Objective Measurements of quality
that drive engine improvements
 Q&A
30 Minutes

What is KantanMT.com?
 Statistical MT System
 Cloud-based
 Highly scalable
 Inexpensive to operate
 Quick to deploy
 Our Vision
 To put Machine Translation
 Customization
 Improvement
 Deployment
 into your hands
Active KantanMT Engines: 7,105
Training Words Uploaded: 103,533,605,925
Member Words Translated: 751,291,925
Fully Operational: 12 Months

.
Technology Firsts
Timeline: Q1 2013, Q2 2013, Q3 2013, Q1 2014, Q3 2014, Q1 2015
 Adoption: Uploaded 10b training words and 200m words translated. KantanAPI launched
 www.kantanmt.com: 1st SMT Cloud Based Platform (TotalRecall)
 KantanAutoScale: Using the power of the cloud to maximise performance
 Kantan BuildAnalytics: Helping engineers build better MT
 Kantan Analytics: 1st Predictive Quality Estimation Technology
 Massive Adoption: 750m translated and 100b training words uploaded

.
Who Measures MT Quality?
 Project Managers: Determine scope of project
 Engine Developers: Path to Production Release
 Translators: Fluency, Adequacy, Consistency

.
Project Managers
 Predictive Quality Estimation
 Determine Project Scope
 Schedule
 Resources required
 Cost
 Billables

.
Engine Developers
 Different Criteria
 Objective: Get the Engine Production Ready as quickly as possible
 Automated Measurement Systems used to
smooth this path
 BLEU
 TER
 F-Measure
 METEOR
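
The automated metrics listed above all compare MT output against a reference translation. As an illustration only, here is a minimal sketch of a word-level F-Measure, assuming whitespace tokenisation and case-insensitive matching. The function name `unigram_f_measure` is hypothetical, and this is not a description of how KantanMT computes its scores.

```python
from collections import Counter


def unigram_f_measure(candidate: str, reference: str, beta: float = 1.0) -> float:
    """Word-level F-measure between MT output and a reference translation.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0

    # Multiset intersection: a word only matches as often as it occurs in both.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)  # share of MT words found in the reference
    recall = overlap / len(ref_tokens)      # share of reference words found in the MT
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    score = unigram_f_measure("the cat sat on the mat", "the cat is on the mat")
    print(f"F-Measure: {score:.3f}")
```

A score near 1 means the MT output shares most of its words with the reference. Word order is ignored here, which is one reason engine developers usually look at several metrics together.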

.
Translators
Fluency,
Adequacy,
Consistency
Translators

.
KantanMT LQR Framework
 Framework Characteristics
 Based on defined Error Typology
 Designed to provide objective feedback to
Engine Developers
 Designed to both qualify and quantify
language quality feedback
 An integral step in the
development of MT systems
 Honestly, I can’t imagine how you could develop an MT engine without this!

.
 Agree Error Typology
 Style: Wrong terminology, Wrong Spelling, Source not Translated/Omissions, Compliance with client specs, Literal translation, Text/Information added
 Syntax & Grammar: Capitalization, Wrong Word Form, Wrong Part of Speech, Punctuation, Sentence Structure
 Technical: Tags and Markup, Locale Adaptation, Spacing
 Overall: Adequacy Score, Fluency Score, Overall Quality Score

.
 Agree Error Weightings
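
To illustrate what agreed weightings make possible, here is a minimal sketch that turns a translator's logged errors into a single weighted total. The weights, the default weight, and the function name `weighted_error_points` are all hypothetical; the slides do not state the actual weightings or formula, which are agreed per project.

```python
from collections import Counter

# Hypothetical weightings for a few error types from the typology.
# Real values are agreed between translators and engine developers.
ERROR_WEIGHTS = {
    "Wrong terminology": 3,
    "Source not Translated/Omissions": 3,
    "Punctuation": 1,
    "Spacing": 1,
}


def weighted_error_points(errors, weights=ERROR_WEIGHTS, default_weight=1):
    """Sum the agreed weight of every logged error.

    Error types without an agreed weight fall back to default_weight
    (an assumption made for this sketch).
    """
    counts = Counter(errors)
    return sum(weights.get(kind, default_weight) * n for kind, n in counts.items())


if __name__ == "__main__":
    logged = ["Wrong terminology", "Punctuation", "Punctuation"]
    print(weighted_error_points(logged))
```

Weighting lets the errors with the biggest impact stand out, so a handful of terminology errors can outrank a larger number of spacing errors.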

.
 Adequacy Score (Range 1 – 5)
 5: Full Meaning
 All meaning expressed in the source segment appears in the translated segment
 4: Most Meaning
 Most of the source segment meaning is expressed in the translated segment
 3: Much Meaning
 Much of the source segment meaning is expressed in the translated segment
 2: Little Meaning
 Little of the source segment meaning is expressed in the translated segment
 1: No Meaning
 None of the meaning expressed in the source segment is expressed in the translated segment

.
 Fluency Score (Range 1 – 5)
 5: Native language fluency
 No grammar errors, excellent word selection and good syntax. No post-editing required.
 4: Near native fluency
 Few terminology/grammar errors. No impact on overall understanding of the meaning. Little post-editing required.
 3: Not very fluent
 About half of the translation contains errors and requires post-editing.
 2: Little fluency
 Wrong word choice, poor grammar and syntax. A lot of post-editing required.
 1: No fluency
 Absolutely ungrammatical and doesn’t make any sense. Re-translate from scratch.

.
 Data Analysis is automatic/immediate
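
As a rough illustration of the kind of analysis that can run as soon as reviewers submit their scores, here is a minimal sketch that averages the 1 to 5 Adequacy and Fluency scores across reviewed segments. The function name `summarise_scores` and the input shape are assumptions for this example, not the KantanMT implementation.

```python
from statistics import mean


def summarise_scores(reviews):
    """Average the reviewer scores for a batch of segments.

    reviews: list of (adequacy, fluency) tuples, each score an integer 1..5.
    """
    if not reviews:
        raise ValueError("no reviewed segments")
    for adequacy, fluency in reviews:
        if not (1 <= adequacy <= 5 and 1 <= fluency <= 5):
            raise ValueError(f"score out of range: {(adequacy, fluency)}")

    return {
        "segments": len(reviews),
        "mean_adequacy": mean(a for a, _ in reviews),
        "mean_fluency": mean(f for _, f in reviews),
    }


if __name__ == "__main__":
    print(summarise_scores([(5, 4), (3, 4), (4, 1)]))
```

Comparing these averages between engine builds gives developers a simple signal of whether a retraining cycle improved the output.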

.
 Scheduling LQR - essential

.
Conclusions
 Translator involvement is crucial
 Development efforts are more focused
 Errors with the biggest impact are addressed quickly and systematically
 Using a structured LQR Framework
 Ensures feedback is quantified and qualified
 Avoids subjective undertones, which can be unproductive
 But the biggest benefit
 Higher engagement with professional translators, which has a big impact on the quality of MT output

.
Tony O’Dowd
tonyod@kantanmt.com
