MT Evaluation
Seeing the Wood for the Trees
John Tinsley
CEO and Co-founder
TAUS QE Summit. Dublin. 28th May 2015
We need to marry data that we know from operations with data
we produce during MT evaluations to create intelligence
Let’s look at how we can find that out and what it means…
Making the business case for MT
KNOWNS
•  Revenue from translation
•  Costs (internal, outsourced)
•  Variations of this information across content and languages
UNKNOWNS
•  MT performance
•  Cost of MT
•  Variations of this information across content and languages
Calculating potential ROI
Parameters
Per word rate (LSP)    Vendor Rate    Productivity Gain    Project Word Count    MT Cost
€0.10                  €0.08          ?                    5,000,000             ?

MT Weighted Word Count: ?

                       No Machine Translation    With Machine Translation
LSP Revenue            €500,000                  €500,000
Vendor Cost            €400,000                  ?
MT Cost                0                         ?
Gross Profit           €100,000                  ?
Gross Profit Margin    20.0%                     ?

Gross Profit Increase when using MT: ???%
**These numbers are for illustrative purposes only and not related to the case study
Problem
Large Chinese-to-English patent translation project. Challenging
content and language
Question
What efficiencies, if any, can machine translation add to the workflow of
RWS translators?
How we applied different types of MT evaluation at different stages
in the process, at various go/no-go stages, to help RWS assess whether
MT is viable for this project
Client Case Study – RWS
- UK headquartered public company
- Founded 1958
- 9th largest LSP (CSA 2013 report)
- Leader in specialist IP translations
Lots of different ways to do evaluation**
–  automatic scores
•  BLEU, METEOR, GTM, TER
–  fluency, adequacy, comparative ranking
–  task-based evaluation
•  error analysis, post-edit productivity
Different metrics, different intelligence
–  what does each type of metric tell us?
–  which ones are usable at which stage of evaluation?
e.g. can we really use automatic scores to assess productivity?
e.g. does productivity delta really tell us how good the output is?
MT Evaluation – where do we start!?
Can we improve our baseline engines through customisation?
Step 1: Baseline and Customisation
[Chart: BLEU and TER scores (scale 0 to 0.8), Iconic Baseline vs. Iconic Customised]
What next?
How good is the output relative to the task, i.e. post-editing?
- fluency/adequacy scores are not going to tell us
- let’s start with segment-level TER
-  Huge improvement
-  Intuitively, the scores reflect well but don’t really say anything
-  Let’s dig deeper
Translation Edit Rate: correlates well with practical evaluations
If we look deeper, what can we learn?
INTELLIGENCE
• Proportion of full matches (i.e. big savings)
• Proportion of close matches (i.e. faster than fuzzy matches)
• Proportion of poor matches
ACTIONABLE INFORMATION
• Type of sentence with high/low matches
• Weaknesses and gaps
• Segments to compare and analyse in translation memory
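
The breakdown into full, close, and poor matches can be sketched in a few lines of Python. This is a minimal sketch, not the tooling used in the case study. `simple_ter` is a simplified word-level edit rate (insertions, deletions, and substitutions only, whereas real TER also counts block shifts). The band thresholds (`0.0` for a full match, `0.3` for a close match) and the example segments are hypothetical, chosen only for illustration.

```python
from collections import Counter


def simple_ter(hypothesis: str, reference: str) -> float:
    """Word-level edit distance divided by reference length.

    Simplification: real TER also allows block shifts; this sketch
    counts only insertions, deletions and substitutions.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,           # delete hypothesis word
                            curr[j - 1] + 1,       # insert reference word
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1] / len(ref)


def bucket(score: float, close_threshold: float = 0.3) -> str:
    """Assign a segment to a match band (thresholds are assumptions)."""
    if score == 0.0:
        return "full"
    if score <= close_threshold:
        return "close"
    return "poor"


def match_proportions(pairs):
    """Share of full, close and poor matches for (MT output, reference) pairs."""
    counts = Counter(bucket(simple_ter(mt, ref)) for mt, ref in pairs)
    total = sum(counts.values())
    return {band: counts[band] / total for band in ("full", "close", "poor")}


# Hypothetical example segments, not data from the case study.
segments = [
    ("the device comprises a housing", "the device comprises a housing"),
    ("a method for to produce the compound", "a method for producing the compound"),
    ("said wherein claim of 1", "the apparatus according to claim 1"),
    ("the valve is connected to the pump", "the valve is coupled to the pump"),
]
print(match_proportions(segments))
```

Listing the segments that fall into the "poor" band is one way to find the sentence types and gaps worth comparing against the translation memory.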
Step 2: Segment-level automatic analysis

[Figure: distribution of segment-level TER scores]
This represents a 24% potential productivity gain**
With MT experience and previous MT integration, productivity
testing can be run in the production environment. In this case, we
used the Dynamic Quality Framework
Beware the variables**!
•  Translators: different experience, speed, perceptions of MT
–  24 translators: senior, staff, and interns
•  Test sets: not representative; particularly difficult
–  2 test sets, comprising 5 documents, and cross-fold validation
•  Environment and task: inexperience and unfamiliarity
–  Training materials, videos, and “dummy” segments
Step 3: Productivity testing
Findings and Learnings

Overall average
•  25% productivity gain
•  What it tells us: correlates with TER

By Translator Profile
•  Experienced: 22%
•  Staff: 23%
•  Interns: 30%
•  What it tells us: roll out with junior staff for more immediate impact on the bottom line?

By Test Set
•  Test set 1.1: 25%
•  Test set 1.2: 35%
•  Test set 2.1: 6%
•  Test set 2.2: 35%
•  What it tells us: don’t be overly concerned by outliers. Use data to facilitate source content profiling?
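
As a rough cross-check of the figures above, the per-profile and per-test-set gains can be averaged and scanned for outliers. This is only a sketch. The deck does not say how the overall 25% was derived or how the 24 translators were split across profiles, so the unweighted mean is an assumption, and the 15-point cut-off in `outliers` is a hypothetical value.

```python
from statistics import mean, median

# Productivity gains (%) as reported on the slide.
by_profile = {"Experienced": 22, "Staff": 23, "Interns": 30}
by_test_set = {"1.1": 25, "1.2": 35, "2.1": 6, "2.2": 35}


def outliers(gains: dict, max_deviation: float = 15.0) -> list:
    """Return keys whose gain differs from the median by more than
    max_deviation percentage points (a hypothetical cut-off)."""
    mid = median(gains.values())
    return [name for name, gain in gains.items() if abs(gain - mid) > max_deviation]


# Unweighted means: an assumption, since group sizes are not given.
print("Mean by profile:", mean(by_profile.values()))
print("Mean by test set:", mean(by_test_set.values()))
print("Outlying test sets:", outliers(by_test_set))
```

Both unweighted means land close to the reported overall figure, and only test set 2.1 is flagged, which is the kind of outlier worth profiling by source content.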
Look out for anomalies**
–  segments with long timings (above average ratio words/minute)
–  sentences that don’t change much from MT to post-edit*
–  segments with unusually short timings
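
The checks listed above could be automated along these lines. This is a hedged sketch: the segment fields (`mt`, `post_edit`, `seconds`), the z-score cut-off, and the similarity threshold are all hypothetical, and the example data is invented rather than taken from the case study.

```python
from difflib import SequenceMatcher
from statistics import mean, stdev


def words_per_minute(text: str, seconds: float) -> float:
    """Post-editing speed for one segment."""
    return len(text.split()) / (seconds / 60.0)


def flag_anomalies(segments, z_cutoff=2.0, unchanged_ratio=0.95):
    """Return one list of reasons per segment (empty if nothing looks odd).

    segments: list of dicts with hypothetical keys 'mt', 'post_edit', 'seconds'.
    z_cutoff and unchanged_ratio are illustrative thresholds, not recommendations.
    """
    speeds = [words_per_minute(s["post_edit"], s["seconds"]) for s in segments]
    avg, spread = mean(speeds), stdev(speeds)
    flags = []
    for seg, speed in zip(segments, speeds):
        reasons = []
        z = (speed - avg) / spread if spread else 0.0
        if z > z_cutoff:
            reasons.append("unusually short timing")
        elif z < -z_cutoff:
            reasons.append("unusually long timing")
        # Word-level similarity between raw MT and the post-edited text.
        similarity = SequenceMatcher(
            None, seg["mt"].split(), seg["post_edit"].split()
        ).ratio()
        if similarity >= unchanged_ratio:
            reasons.append("MT barely changed")
        flags.append(reasons)
    return flags


# Invented example: five ordinary segments and one accepted almost instantly.
segments = [
    {"mt": "a b c d e f g h i j", "post_edit": "a b c d e f g h i k", "seconds": 60}
    for _ in range(5)
]
segments.append(
    {"mt": "a b c d e f g h i j", "post_edit": "a b c d e f g h i j", "seconds": 5}
)
flags = flag_anomalies(segments, z_cutoff=1.5)
for index, reasons in enumerate(flags):
    print(index, reasons)
```

Note that `stdev` needs at least two segments, and with very small samples a z-score cut-off can never be reached, so the thresholds would need tuning on real timing data.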
In this case, the next step is production roll-out to validate these findings
in the actual translator workflow over an extended period.
Warnings, Tips, and Next Steps

Now would be the right time to do fluency/adequacy evaluation if you need to
verify that post-editing is producing output of at least similar quality
Calculating the ROI - revisited
Parameters
Per word rate (LSP)    Vendor Rate    Productivity Gain    Project Word Count    MT Cost
€0.10                  €0.08          ?                    5,000,000             ?

MT Weighted Word Count: ?

                       No Machine Translation    With Machine Translation
LSP Revenue            €500,000                  €500,000
Vendor Cost            €400,000                  ?
MT Cost                0                         ?
Gross Profit           €100,000                  ?
Gross Profit Margin    20.0%                     ?

Gross Profit Increase when using MT: ???%
**These numbers are for illustrative purposes only and not related to the case study
Calculating the ROI – plugging in the numbers
Parameters
Per word rate (LSP)    Vendor Rate    Productivity Gain    Project Word Count    MT Cost
€0.10                  €0.08          25%                  5,000,000             €0.008

MT Weighted Word Count: 3,750,000

                       No Machine Translation    With Machine Translation
LSP Revenue            €500,000                  €500,000
Vendor Cost            €400,000                  €300,000
MT Cost                0                         €40,000
Gross Profit           €100,000                  €160,000
Gross Profit Margin    20.0%                     32%

Gross Profit Increase when using MT: 60%
**These numbers are for illustrative purposes only and not related to the case study
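
The arithmetic behind the illustrative table can be reproduced with a short function. This is a sketch under stated assumptions. The formulas are inferred from the figures in the table (for example, 5,000,000 × (1 − 0.25) = 3,750,000 weighted words, and €0.008 × 5,000,000 = €40,000 MT cost) rather than spelled out in the deck, and the function and key names are hypothetical.

```python
def mt_roi(lsp_rate, vendor_rate, word_count, productivity_gain, mt_cost_per_word):
    """Compare gross profit with and without MT.

    Formulas are inferred from the illustrative table, not stated in the deck.
    """
    revenue = lsp_rate * word_count
    profit_no_mt = revenue - vendor_rate * word_count

    # Assumption: with MT, vendors are paid on a reduced (weighted) word count.
    weighted_words = word_count * (1 - productivity_gain)
    vendor_cost_mt = vendor_rate * weighted_words
    # Assumption: MT is charged per word on the full project word count.
    mt_cost = mt_cost_per_word * word_count
    profit_mt = revenue - vendor_cost_mt - mt_cost

    return {
        "weighted_words": weighted_words,
        "vendor_cost_with_mt": vendor_cost_mt,
        "mt_cost": mt_cost,
        "gross_profit_no_mt": profit_no_mt,
        "gross_profit_with_mt": profit_mt,
        "margin_no_mt": profit_no_mt / revenue,
        "margin_with_mt": profit_mt / revenue,
        "profit_increase": profit_mt / profit_no_mt - 1,
    }


result = mt_roi(lsp_rate=0.10, vendor_rate=0.08, word_count=5_000_000,
                productivity_gain=0.25, mt_cost_per_word=0.008)
for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

Swapping in the measured productivity gain per translator profile or content type shows how sensitive the margin is to each of the unknowns.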
3 take-home messages
•  Identify the gaps in your data
•  Understand the process to collect the right information
•  Continuous assessment
Thank You!
john@iptranslator.com
@IconicTrans
Iconic Translation Machines
•  Machine Translation with Subject Matter Expertise
•  Headquartered here in Dublin
•  Strong tradition of MT research and development
underpinning the company and its technologies
This presentation
•  MT evaluation: what, how, when, why?
–  What ways can we evaluate MT?
–  How do we carry out the evaluation?
–  When in the process do we carry out certain types of evaluation?
–  Why do we do certain evaluations and what do they tell us?
By way of introduction…
Step 2: Segment-level automatic analysis
[Figure: plot of TER scores by segment length, with the productivity threshold marked]
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

MT Evaluation Reveals 25% Productivity Gain

  • 1. MT Evaluation: Seeing the Wood for the Trees. John Tinsley, CEO and Co-founder. TAUS QE Summit, Dublin, 28th May 2015.
  • 2. Making the business case for MT. We need to marry data that we know from operations with data we produce during MT evaluations to create intelligence. KNOWNS: revenue from translation; costs (internal, outsourced); variations of this information across content and languages. UNKNOWNS: MT performance; cost of MT; variations of this information across content and languages. Let’s look at how we can find that out and what it means…
  • 3. Calculating potential ROI. Parameters: per word rate (LSP) €0.10; vendor rate €0.08; productivity gain (blank); project word count 5,000,000; MT cost (blank); MT weighted word count (blank). No Machine Translation: LSP revenue €500,000; vendor cost €400,000; MT cost 0; gross profit €100,000; gross profit margin 20.0%. With Machine Translation: LSP revenue €500,000; vendor cost, MT cost, gross profit, and gross profit margin (blank). Gross profit increase when using MT: ???%. **These numbers are for illustrative purposes only and not related to the case study.
  • 4. Client Case Study – RWS: UK headquartered public company; founded 1958; 9th largest LSP (CSA 2013 report); leader in specialist IP translations. Problem: large Chinese to English patent translation project. Challenging content and language. Question: what, if any, efficiencies can machine translation add to the workflow of RWS translators? How we applied different types of MT evaluation at different stages in the process, at various go/no-go stages, to help RWS assess whether MT is viable for this project.
  • 5. MT Evaluation – where do we start!? Lots of different ways to do evaluation**: automatic scores (BLEU, METEOR, GTM, TER); fluency, adequacy, comparative ranking; task-based evaluation (error analysis, post-edit productivity). Different metrics, different intelligence: what does each type of metric tell us? Which ones are usable at which stage of evaluation? E.g. can we really use automatic scores to assess productivity? E.g. does productivity delta really tell us how good the output is?
  • 6. Step 1: Baseline and Customisation. Can we improve our baseline engines through customisation? [Chart: BLEU and TER scores, Iconic Baseline vs. Iconic Customised.] Huge improvement. Intuitively, scores reflect well but don’t really say anything. Let’s dig deeper. What next? How good is the output relative to the task, i.e. post-editing? Fluency/adequacy not going to tell us; let’s start with segment-level TER.
  • 7. Translation Edit Rate: correlates well with practical evaluations. If we look deeper, what can we learn? INTELLIGENCE: proportion of full matches (i.e. big savings); proportion of close matches (i.e. faster than fuzzy matches); proportion of poor matches. ACTIONABLE INFORMATION: type of sentence with high/low matches; weaknesses and gaps; segments to compare and analyse in translation memory.
  • 8. Step 2: Segment-level automatic analysis. Distribution of segment-level TER scores. [Plot: TER score against segment length.] This represents a 24% potential productivity gain**
  • 9. Step 3: Productivity testing. With MT experience and previous MT integration, productivity testing can be run in the production environment. In this case, we used the Dynamic Quality Framework. Beware the variables**! Translators: different experience, speed, perceptions of MT (24 translators: senior, staff, and interns). Test sets: not representative; particularly difficult (2 test sets, comprising 5 documents, and cross-fold validation). Environment and task: inexperience and unfamiliarity (training materials, videos, and “dummy” segments).
  • 10. Findings and Learnings. Overall average: 25% productivity gain. By Translator Profile: Experienced 22%; Staff 23%; Interns 30%. By Test Set: Test set 1.1: 25%; Test set 1.2: 35%; Test set 2.1: 6%; Test set 2.2: 35%. What it tells us: Correlates with TER. Rollout with junior staff for more immediate impact on bottom line? Don’t be overly concerned by outliers. Use data to facilitate source content profiling?
  • 11. Warnings, Tips, and Next Steps. Look out for anomalies**: segments with long timings (above average ratio words/minute); sentences that don’t change much from MT to post-edit*; segments with unusually short timings. Now would be the right time to do fluency/adequacy if you need to verify that post-editing is producing, at least, similar quality output. In this case, the next step is production roll-out to validate these in the actual translator workflow over an extended period.
  • 12. Calculating the ROI - revisited. Parameters: per word rate (LSP) €0.10; vendor rate €0.08; productivity gain (blank); project word count 5,000,000; MT cost (blank); MT weighted word count (blank). No Machine Translation: LSP revenue €500,000; vendor cost €400,000; MT cost 0; gross profit €100,000; gross profit margin 20.0%. With Machine Translation: LSP revenue €500,000; vendor cost, MT cost, gross profit, and gross profit margin (blank). Gross profit increase when using MT: ???%. **These numbers are for illustrative purposes only and not related to the case study.
  • 13. Calculating the ROI – plugging in the numbers. Parameters: per word rate (LSP) €0.10; vendor rate €0.08; productivity gain 25%; project word count 5,000,000; MT cost €0.008; MT weighted word count 3,750,000. No Machine Translation: LSP revenue €500,000; vendor cost €400,000; MT cost 0; gross profit €100,000; gross profit margin 20.0%. With Machine Translation: LSP revenue €500,000; vendor cost €300,000; MT cost €40,000; gross profit €160,000; gross profit margin 32%. Gross profit increase when using MT: 60%. **These numbers are for illustrative purposes only and not related to the case study.
  • 14. 3 take home messages: Identify the gaps in your data. Understand the process to collect the right information. Continuous assessment.
  • 16. By way of introduction… Iconic Translation Machines: Machine Translation with Subject Matter Expertise; headquartered here in Dublin; strong tradition of MT research and development underpinning the company and its technologies. This presentation: MT evaluation: what, how, when, why? What ways can we evaluate MT? How do we carry out the evaluation? When in the process do we carry out certain types of evaluation? Why do we do certain evaluations and what do they tell us?
  • 17. Step 2: Segment-level automatic analysis. Plot of TER scores by length, with productivity threshold.
  • 18. Step 2: Segment-level automatic analysis Distribution of segment-level TER scores
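
The ROI arithmetic on slide 13 can be sketched as a small Python function. This is an illustrative reconstruction, not the presenter's own spreadsheet: the function name `mt_roi` and its parameter names are hypothetical. Two formulas are inferred from the slide's figures rather than stated on it: the MT weighted word count is taken as word count × (1 − productivity gain), and the MT cost is assumed to be charged per word on the full project word count (€0.008 × 5,000,000 = €40,000).

```python
def mt_roi(per_word_rate, vendor_rate, productivity_gain, word_count, mt_cost_per_word):
    """Compare gross profit with and without MT (illustrative sketch).

    Assumptions inferred from slide 13, not stated explicitly there:
    - vendors are paid for a word count reduced by the productivity gain;
    - MT is charged per word on the full project word count.
    """
    revenue = per_word_rate * word_count

    # Without MT: the vendor is paid for every word.
    base_vendor_cost = vendor_rate * word_count
    base_profit = revenue - base_vendor_cost

    # With MT: the vendor is paid for the weighted word count only.
    weighted_words = word_count * (1 - productivity_gain)
    mt_vendor_cost = vendor_rate * weighted_words
    mt_cost = mt_cost_per_word * word_count
    mt_profit = revenue - mt_vendor_cost - mt_cost

    return {
        "mt_weighted_word_count": weighted_words,
        "vendor_cost_with_mt": mt_vendor_cost,
        "mt_cost": mt_cost,
        "gross_profit_without_mt": base_profit,
        "gross_profit_with_mt": mt_profit,
        "gross_margin_without_mt": base_profit / revenue,
        "gross_margin_with_mt": mt_profit / revenue,
        "gross_profit_increase": (mt_profit - base_profit) / base_profit,
    }


# The illustrative parameters from slide 13.
result = mt_roi(
    per_word_rate=0.10,
    vendor_rate=0.08,
    productivity_gain=0.25,
    word_count=5_000_000,
    mt_cost_per_word=0.008,
)
for name, value in result.items():
    print(f"{name}: {value:,.3f}")
```

With the slide's parameters, this should reproduce the slide 13 figures up to floating-point rounding. Varying `productivity_gain` (for example, the per-profile or per-test-set gains from slide 10) shows how sensitive the margin is to the measured gain.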