Measuring the performance of your KantanMT engine is where the art of Machine Translation meets the science of Machine Translation. It lets you prove the impact of your MT efforts, but only if you can measure the engine from its initial build through its multiple retrainings and production runs.
However, to finesse an MT engine, you need to get professional translators involved. Ultimately, the success and quality of the MT engine depend on their input. Tony O'Dowd, Founder and Chief Architect at KantanMT, will explain how they've successfully built some of the largest and most successful industry MT deployments using the input and engagement of professional translators. He'll demonstrate the KantanMT structured language quality review (LQR) framework, which is used to collate translator feedback and make rapid quality improvements on SMT engines.
3. What we aim to cover today
About KantanMT.com
Who are we and what we stand for?
Who Measures MT Quality?
Project Managers
Engine Developers
Translators
KantanMT LQR Framework
Objective Measurements of quality
that drive engine improvements
Q&A
30 Minutes
4. What is KantanMT.com?
Statistical MT System
Cloud-based
Highly scalable
Inexpensive to operate
Quick to deploy
Our Vision
To put Machine Translation
Customization
Improvement
Deployment
into your hands
Active KantanMT Engines
7,105
Training Words Uploaded
103,533,605,925
Member Words Translated
751,291,925
Fully Operational 12 Months
5. Technology Firsts
Q1 2013 Q2 2013 Q3 2013 Q3 2014
Adoption: Uploaded 10b training
words and 200m words
translated. KantanAPI launched
www.kantanmt.com:
1st SMT Cloud Based
Platform (TotalRecall)
KantanAutoScale: Using
the power of the cloud to
maximise performance
Kantan BuildAnalytics:
Helping engineers build
better MT
Q1 2014
Kantan Analytics: 1st
Predictive Quality
Estimation Technology
Massive Adoption: 750m
translated and 100b training
words uploaded
Q1 2015
6. Who Measures MT Quality?
Determine scope
of project
Project Managers
Fluency,
Adequacy,
Consistency
Translators
Path to
Production
Release
Engine Developers
9. Engine Developers
Different Criteria
Objective: Get Engine Production Ready as
quickly as humanly possible
Automated Measurement Systems used to
smooth this path
BLEU
TER
F-Measure
METEOR
Path to
Production
Release
Engine Developers
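To make the automated measurements listed on this slide more concrete, here is a minimal sketch of one of them, an F-Measure computed over word overlap between an MT output and a reference translation. It is an illustration only: the function name `unigram_f_measure`, the whitespace tokenisation and the lowercasing are assumptions made for this example, not a description of how KantanMT computes its scores.

```python
from collections import Counter


def unigram_f_measure(candidate: str, reference: str, beta: float = 1.0) -> float:
    """F-measure over whitespace tokens (hypothetical helper for illustration).

    Matches are clipped: a word counts at most as often as it
    appears in the reference translation.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Multiset intersection keeps the minimum count of each shared word.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand.values())  # share of MT words that are correct
    recall = overlap / sum(ref.values())      # share of reference words recovered
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    print(unigram_f_measure("the cat sat", "the cat sat on the mat"))
```

A score like this is quick to compute after every retraining, which is why engine developers lean on automated metrics, but it says nothing about why a segment is wrong.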
14. KantanMT LQR Framework
Path to
Production
Release
Engine Developers
Framework Characteristics
Based on defined Error Typology
Designed to provide objective feedback to
Engine Developers
Designed to both qualify and quantify
language quality feedback
An integral step in the
development of MT systems
Honestly, I can’t imagine how you could
develop an MT engine without this!
Fluency,
Adequacy,
Consistency
Translators
15. KantanMT LQR Framework
Style
Wrong terminology
Wrong Spelling
Source not Translated/Omissions
Compliance with client specs
Literal translation
Text/Information added
Syntax & Grammar
Capitalization
Wrong Word Form
Wrong Part of Speech
Punctuation
Sentence Structure
Technical
Tags and Markup
Locale Adaptation
Spacing
Overall Adequacy Score
Fluency Score
Overall Quality Score
Agree Error Typology
Fluency,
Adequacy,
Consistency
Translators
17. KantanMT LQR Framework
Adequacy Score (Range 1 – 5)
5: Full Meaning
All meaning expressed in the source segment appears in the translated segment
4: Most Meaning
Most of the source segment meaning is expressed in the translated segment
3: Much Meaning
Much of the source segment meaning is expressed in the translated segment
2: Little Meaning
Little of the source segment meaning is expressed in the translated segment
1: No Meaning
None of the meaning expressed in the source segment is expressed in the translated segment
18. KantanMT LQR Framework
Fluency Score (Range 1 – 5)
5: Native language fluency
No grammar errors, excellent word selection and good syntax. No post-editing required.
4: Near native fluency
Few terminology/grammar errors. No impact on overall understanding of the meaning. Little post-editing required.
3: Not very fluent
About half of translation contains errors and requires post-editing.
2: Little fluency
Wrong word choice, poor grammar and syntax. A lot of post-editing required.
1: No fluency
Absolutely ungrammatical and doesn’t make any sense. Re-translate from scratch.
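As a rough illustration of how translator feedback collected on these 1 to 5 scales could be quantified, the sketch below averages adequacy and fluency scores across reviewed segments and counts which error types occur most often. The `SegmentReview` class, the `summarise` function and the choice of a plain average are assumptions made for this example; the slides do not specify how KantanMT aggregates scores or derives its Overall Quality Score.

```python
from collections import Counter
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class SegmentReview:
    """One translator's review of one segment (hypothetical structure)."""
    adequacy: int  # 1 (No Meaning) to 5 (Full Meaning)
    fluency: int   # 1 (No fluency) to 5 (Native language fluency)
    errors: list = field(default_factory=list)  # labels from the agreed error typology


def summarise(reviews):
    """Average the scores and rank error types by frequency (assumed aggregation)."""
    for r in reviews:
        if not (1 <= r.adequacy <= 5 and 1 <= r.fluency <= 5):
            raise ValueError("scores must be in the range 1-5")

    error_counts = Counter(e for r in reviews for e in r.errors)
    return {
        "adequacy": mean(r.adequacy for r in reviews),
        "fluency": mean(r.fluency for r in reviews),
        "top_errors": error_counts.most_common(3),
    }


if __name__ == "__main__":
    reviews = [
        SegmentReview(5, 4, ["Punctuation"]),
        SegmentReview(3, 2, ["Punctuation", "Wrong terminology"]),
    ]
    print(summarise(reviews))
```

Ranking the error types is the part that feeds engine development: it points developers at the errors with the biggest impact first.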
23. Conclusions
Translator involvement is crucial
Development efforts are more focused
Errors with biggest impact are addressed quickly and
systematically
Using a structured LQR Framework
Ensures feedback is quantified and qualified
Avoids subjective undertones, which can be
unproductive
But the biggest benefit
Higher engagement with professional
translators, which makes a big impact on the quality
of MT outputs
No more expensive deployments
Monthly subscription plan
Customised subscription plan
No more complexity
KantanMT does all the heavy lifting
You focus on what you do best – grow and develop your business