Tony O’Dowd (KantanMT). KantanMT enables its community to generate meaningful business intelligence that helps them identify the scope of their customised machine translation projects. More importantly, it helps them schedule and scale those projects to achieve maximum translation productivity and a positive ROI.
Predictive Analysis in Machine Translation is Business Intelligence.
2. Tony O’Dowd
Founder & Chief Architect
tonyod@kantanmt.com
Predictive Analysis in Machine Translation is Business Intelligence
3. What we aim to cover today
What is KantanMT.com?
Types of Quality Estimation
Comparative Quality Estimations
Predictive Quality Estimations
Benefits to Industry
Product Scope Determination
Tiered Pricing Capabilities
Conclusions
4. What is KantanMT.com?
Statistical MT Platform
Cloud-based
Highly scalable
Inexpensive to operate
Fusion of TM & MT & rules
High speed, high quality translations
Our Vision
To put Machine Translation customisation, improvement and deployment into your hands
Active KantanMT Engines
7,783
Training Words Uploaded
143,078,042,293
Member Words Translated
4,259,399,846
www.kantanMT.com
5. Types of MT Quality Estimation
Comparative MT Quality Estimation
Uses a reference translation to calculate:
Word recall & precision
Text Similarities
Word Order correlations
Linguistic similarities
6. F-Measure Score
Recall & Precision calculation
Closely linked to the relevancy of word selection for MT systems
KantanBuildAnalytics™
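As a rough illustration of the recall and precision calculation behind an F-Measure score, the sketch below compares an MT output with a reference translation at word level. The function name `f_measure`, the whitespace tokenisation and the lowercasing are assumptions made for this example; the slides do not describe how KantanBuildAnalytics™ computes its score.

```python
from collections import Counter


def f_measure(candidate: str, reference: str, beta: float = 1.0) -> float:
    """Word-level F-measure of an MT output against one reference translation."""
    # Assumption: simple lowercased whitespace tokenisation.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Words shared by both sides, counted at most as often as they occur in each.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand.values())  # share of MT words that are relevant
    recall = overlap / sum(ref.values())      # share of reference words recovered
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


reference = "the cat sat on the mat"
print(f_measure("the cat sat on mat", reference))
print(f_measure("mat the on sat cat", reference))  # same words, scrambled order
```

Both calls return the same value, because this kind of score looks only at which words were selected and ignores the order in which they appear.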
7. BLEU Score
Improvement upon F-Measure
Takes word-order into consideration
Linked to a sense of translation ‘fluency’
KantanBuildAnalytics™
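To show how BLEU brings word order into the picture, here is a minimal single-reference, sentence-level sketch: clipped n-gram precisions for n = 1 to 4, combined by geometric mean and multiplied by a brevity penalty. It is unsmoothed and uses whitespace tokenisation, both assumptions of this example, and it is not presented as the implementation used in KantanBuildAnalytics™.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed sentence-level BLEU against a single reference."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count at its count in the reference.
        clipped = sum((cand_counts & ref_counts).values())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: one empty n-gram level zeroes the score
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty discourages outputs shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(bleu("the cat sat on the mat", reference))
print(bleu("mat the on sat cat the", reference))           # scrambled order
print(bleu("mat the on sat cat the", reference, max_n=1))  # unigrams only
```

The scrambled sentence contains exactly the right words, so it scores perfectly when only unigrams are counted, yet it drops to zero once longer n-grams are required. That sensitivity to word sequences is what ties BLEU to a sense of fluency.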
8. Types of MT Quality Estimation
TER Score
A method to help predict post-editing effort
TER is quick to use and correlates highly with actual post-editing effort
KantanBuildAnalytics™
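The idea behind TER can be sketched as the number of word-level edits needed to turn the MT output into the reference, divided by the reference length. The version below is a simplification: it counts insertions, deletions and substitutions only and leaves out the block-shift operation that full TER also allows. The name `simple_ter` is hypothetical.

```python
def simple_ter(hypothesis: str, reference: str) -> float:
    """Word-level edit distance divided by reference length (no shift edits)."""
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Standard dynamic-programming edit distance, kept to two rows.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop a hypothesis word
                            curr[j - 1] + 1,      # add a missing reference word
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1] / len(ref)


print(simple_ter("the cat sat on mat", "the cat sat on the mat"))
```

Lower is better here: a score of 0 means no edits were needed, and each missing, extra or wrong word adds to the estimated post-editing effort.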
9. Types of MT Quality Estimation
Useful for
Engine Development
Baseline measurements
Determination of ‘possible’ engine quality and relevancy
Reference set of comparative translations required
Does not work on unseen translations
Of limited use in determining post-editing (PE) effort, resources and costs
KantanBuildAnalytics™
10. Kantan TotalRecall – Advanced TM
% of TM hits in this job
KantanMT – automated translations
% of automated translations for this job
Range of QE Scores
QE range defined to match existing fuzzy match ranges used by the L10N industry
Quality Estimation Scores
Segment level QE scores – akin to fuzzy match scores
Word Counts – Project Stats
Can be used to develop a project timeline and a tiered pricing model for post-editing projects
Placeholder & Tag Counts
Used by PMs for complexity surcharges
KantanAnalytics™
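The way segment-level QE scores and word counts could feed a tiered pricing model can be sketched as follows. Everything specific here is hypothetical: the band boundaries, the rate multipliers, the `Segment` type and the function names are invented for illustration and are not taken from KantanAnalytics™ or from any published rate card.

```python
from dataclasses import dataclass

# Hypothetical QE bands laid out like fuzzy match ranges:
# (lowest score in band, label, share of the full per-word rate charged).
BANDS = [
    (95, "95-100%", 0.30),
    (85, "85-94%", 0.50),
    (75, "75-84%", 0.70),
    (0, "0-74%", 1.00),
]


@dataclass
class Segment:
    text: str
    qe_score: float  # predicted quality, 0 to 100 (assumed scale)


def band_for(score: float):
    """Return (label, multiplier) for the first band whose floor the score meets."""
    for floor, label, multiplier in BANDS:
        if score >= floor:
            return label, multiplier
    raise ValueError("QE score must be non-negative")


def price_job(segments, full_rate_per_word: float):
    """Total word counts per QE band and the post-editing cost of the job."""
    words_per_band = {label: 0 for _, label, _ in BANDS}
    total_cost = 0.0
    for seg in segments:
        label, multiplier = band_for(seg.qe_score)
        words = len(seg.text.split())
        words_per_band[label] += words
        total_cost += words * full_rate_per_word * multiplier
    return words_per_band, total_cost


job = [
    Segment("Click the Save button now", 96),
    Segment("Restart the device", 80),
    Segment("Warranty terms vary by region", 40),
]
print(price_job(job, full_rate_per_word=0.10))
```

Because the bands mirror fuzzy match ranges, the per-band word counts can slot into the same kind of project statistics that project managers already use for TM-based quotes and scheduling.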
11. Types of MT Quality Estimation
KantanAnalytics™
No reference set required
Predictive, not comparative
Benefits
Tiered Pricing Model
Prioritise PE activity
Schedule
Resources
Cost
Seamlessly integrated into all CAT tools
KantanAnalytics™ - a predictive quality estimation technology
No more expensive deployments
Monthly subscription plan
Customised subscription plan
No more complexity
KantanMT does all the heavy lifting
You focus on what you do best – grow and develop your business