Improving Translator Productivity with MT
a patent translation case study
John Tinsley
CEO and Co-founder
PSLT @ MT Summit. Miami. 30th October 2015
We provide Machine Translation
solutions with Subject Matter Expertise
MT solutions and services provider, specializing in
providing customised solutions with subject
matter expertise for specific technical sectors,
such as Patents/IP, life sciences, and financial.
Pre-processing Post-processing
Input Output
Training Data
Data Engineering
How does that work?
Chinese pre-ordering
rules
Statistical
Post-editing
Input
Output
Training Data
Spanish med-device
entity recognizer
Multi-output
Combination
Korean pharma
tokenizer
Patent input
classifier
Client TM/terminology (optional)
Japanese script
normalisation
German
Compounding rules
Moses
RBMT
Moses
Moses
Domain Adaptation and Data Selection
•  MML with Vocabulary Saturation
Filtering (VSF)
•  Language and translation model
interpolation (linear/log linear)
•  Terminology extraction using IR
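The two interpolation styles named above can be illustrated with a minimal sketch. It assumes two models (an in-domain and a general one) that each assign a probability to the same event; the function names and the weight `lam` are hypothetical and are not taken from the Iconic system.

```python
import math


def linear_interpolation(p_in: float, p_out: float, lam: float) -> float:
    """Weighted arithmetic mix: lam * p_in + (1 - lam) * p_out."""
    return lam * p_in + (1.0 - lam) * p_out


def log_linear_interpolation(p_in: float, p_out: float, lam: float) -> float:
    """Weighted geometric mix: p_in**lam * p_out**(1 - lam).

    The result is an unnormalised score, not a probability; a real system
    would renormalise or use it directly as a feature weight.
    """
    if p_in <= 0.0 or p_out <= 0.0:
        return 0.0
    return math.exp(lam * math.log(p_in) + (1.0 - lam) * math.log(p_out))
```

In practice the weight would be tuned on held-out in-domain data; the sketch only shows how the two combinations differ.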
Hybrid is a misnomer
•  Statistical MT
•  Syntax-based methods
•  Grammar rules
•  Example-based templates
On-the-fly system combination
Hierarchical models
Translation Memory Integration
Syntactic pre/post-ordering
Template-driven translation
Combining linguistics, statistics, and MT expertise
The Ensemble Architecture™
The Challenge of Patents
L is an organic group selected from -CH2-
(OCH2CH2)n-, -CO-NR'-, with R'=H or
C1-C4 alkyl group; n=0-8; Y=F, CF3 …
maximum stress of 1.2 to 3.5 N/mm<2>
and a maximum elongation of 700 to
1,300% at 0[deg.] C.
Long Sentences
Technical constructions
Largest single document: 249,322 words
Longest Sentence: 1,417 words
The Challenge of Patents
•  Very long sentences as standard
•  Grammatically incomplete using nominal and telegraphic style (!)
•  Passive forms are frequent
•  Frequent use of subordinate clauses, participles, implicit constructs
•  Inconsistent and incorrect spelling
•  High use of neologisms
•  Instances of synonymy and polysemy
•  Spurious use of punctuation
Authoring guide
for “to be
translated” text
Patents break
almost all of the
rules!
IPTranslator: Patent Translation by Iconic Translation Machines
MT for Information Purposes
MT Application Areas
MT for Post-editing Productivity
•  Development focuses on improving key information translation
•  Terminology is important
•  Evaluation driven by “usability”
•  Development focuses on reducing edits required
•  Feedback loop is crucial
•  Evaluation through practical translation tasks
Lots of different ways to do evaluation
–  automatic scores
•  BLEU, METEOR, GTM, TER
–  fluency, adequacy, comparative ranking
–  task-based evaluation
•  error analysis, post-edit productivity
Different metrics, different intelligence
–  what does each type of metric tell us?
–  which ones are usable at which stage of evaluation?
e.g. can we really use automatic scores to assess productivity?
e.g. does productivity delta really tell us how good the output is?
MT Evaluation – where do we start!?
Problem
Large Chinese to English patent translation project. Challenging
content and language
Question
What, if any, efficiencies can machine translation add to the workflow of
RWS translators?
How we applied different types of MT evaluation at different stages
in the process, at various go/no-go stages, to help RWS assess whether
MT is viable for this project
Client Case Study – RWS
- UK headquartered public company
- Founded 1958
- 9th largest LSP (CSA 2013 report)
- Leader in specialist IP translations
Can we improve our baseline engines through customisation?
Step 1: Baseline and Customisation
[Chart: BLEU and TER scores (axis 0 to 0.8), Iconic Baseline vs. Iconic Customised]
What next?
How good is the output relative to the task, i.e. post-editing?
- fluency/adequacy not going to tell us
- let’s start with segment level TER
-  Huge improvement
-  Intuitively, scores
reflect well but don’t
really say anything
-  Let’s dig deeper
Translation Edit Rate: correlates well with practical evaluations
If we look deeper, what can we learn?
INTELLIGENCE
• Proportion of full matches (i.e. big savings)
• Proportion of close matches (i.e. faster than fuzzy matches)
• Proportion of poor matches
ACTIONABLE INFORMATION
• Type of sentence with high/low matches
• Weaknesses and gaps
• Segments to compare and analyse in translation memory
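The match proportions listed above can be approximated with a short sketch. It assumes each segment is available as an (MT output, post-edited or reference text) pair, and it uses plain word-level edit distance divided by reference length. That is a simplification: full TER also counts block shifts as single edits. The 0.3 threshold for a "close" match is a hypothetical value chosen for illustration.

```python
from collections import Counter


def word_edit_distance(hyp: list, ref: list) -> int:
    """Word-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        curr = [i]
        for j, r in enumerate(ref, 1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def approx_ter(mt: str, reference: str) -> float:
    """Edits per reference word; no shift operation, unlike full TER."""
    hyp, ref = mt.split(), reference.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return word_edit_distance(hyp, ref) / len(ref)


def bucket(score: float, close_threshold: float = 0.3) -> str:
    """Label a segment; the threshold is an assumption, not from the slides."""
    if score == 0.0:
        return "full"
    if score <= close_threshold:
        return "close"
    return "poor"


def profile(pairs) -> Counter:
    """Count full, close and poor matches over (mt, reference) pairs."""
    return Counter(bucket(approx_ter(mt, ref)) for mt, ref in pairs)
```

Sorting segments by score within each bucket is one way to find the sentence types with high or low matches.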
Step 2: Segment-level automatic analysis

[Chart: distribution of segment-level TER scores; axes: TER score, segment length]
This represents a 24% potential
productivity gain
With MT experience and previous MT integration, productivity
testing can be run in the production environment. In this case, we
used the TAUS Dynamic Quality Framework.
Step 3: Productivity testing

Beware the variables!
•  Translators: different experience, speed, perceptions of MT
–  24 translators: senior, staff, and interns
•  Test sets: not representative; particularly difficult
–  2 test sets, comprising 5 documents, and cross-fold validation
•  Environment and task: inexperience and unfamiliarity
–  Training materials, videos, and “dummy” segments
Step 3: Productivity testing
Overall average
Findings and Learnings 
25% productivity gain
Experienced: 22%
Staff: 23%
Interns: 30%
Test set 1.1: 25%
Test set 1.2: 35%
Test set 2.1: 6%
Test set 2.2: 35%
Correlates with TER
Rollout with junior staff
for more immediate
impact on bottom line?
Don’t be overly concerned
by outliers.
Use data to facilitate
source content profiling?
What it tells us
By Translator Profile
By Test Set
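One common way to express a productivity gain is the relative change in throughput (words per minute) between translating from scratch and post-editing MT. The slides do not state the exact formula used in the study, so the sketch below is an assumption, and the function names are hypothetical.

```python
def words_per_minute(words: int, seconds: float) -> float:
    """Throughput for a timed task; seconds must be positive."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return words / (seconds / 60.0)


def productivity_gain(baseline_wpm: float, post_edit_wpm: float) -> float:
    """Relative throughput change, e.g. 0.25 means 25% faster with MT."""
    if baseline_wpm <= 0:
        raise ValueError("baseline_wpm must be positive")
    return (post_edit_wpm - baseline_wpm) / baseline_wpm
```

Computing the gain separately per translator profile and per test set, as in the breakdown above, only requires grouping the timings before calling these functions.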
Look out for anomalies
–  segments with long timings (above average ratio words/minute)
–  sentences that don’t change much from MT to post-edit
–  segments with unusually short timings
In this case, the next step is production roll-out to validate these
in the actual translator workflow over an extended period.
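The anomaly checks listed above can be sketched as a simple filter over timed segments. The record layout (`id`, `words`, `seconds`, `mt`, `post_edit`) and both speed factors are hypothetical, and "doesn't change much" is simplified here to "identical after trimming whitespace".

```python
from statistics import median


def flag_anomalies(segments, slow_factor=0.25, fast_factor=4.0):
    """Return (segment id, reasons) for segments worth a manual look.

    Assumes every segment has a positive "seconds" value. The factors
    compare each segment's speed with the median speed and are
    illustrative defaults, not values from the study.
    """
    speeds = [s["words"] / (s["seconds"] / 60.0) for s in segments]
    typical = median(speeds)
    flagged = []
    for seg, speed in zip(segments, speeds):
        reasons = []
        if speed < typical * slow_factor:
            reasons.append("long timing")
        if speed > typical * fast_factor:
            reasons.append("short timing")
        if seg["mt"].strip() == seg["post_edit"].strip():
            reasons.append("unchanged from MT")
        if reasons:
            flagged.append((seg["id"], reasons))
    return flagged
```

A flagged segment is not necessarily wrong: an unchanged segment may simply be a perfect MT output, so the flags are prompts for review rather than exclusions.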
Warnings, Tips, and Next Steps

Now would be the right time to do fluency/adequacy if you need to
verify that post-editing is producing, at least, similar quality output
“The biggest room in the world is the
room for improvement”
Thank You!
john@iconictranslation.com
@IconicTrans

More Related Content

What's hot

6. Entrepreneurship - Juan Jose Arevalillo Doval (Hermes)
6. Entrepreneurship - Juan Jose Arevalillo Doval (Hermes)6. Entrepreneurship - Juan Jose Arevalillo Doval (Hermes)
6. Entrepreneurship - Juan Jose Arevalillo Doval (Hermes)RIILP
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...SDL
 
9. Ethics - Juan Jose Arevalillo Doval (Hermes)
9. Ethics - Juan Jose Arevalillo Doval (Hermes)9. Ethics - Juan Jose Arevalillo Doval (Hermes)
9. Ethics - Juan Jose Arevalillo Doval (Hermes)RIILP
 
High Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons LearnedHigh Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons LearnedSDL
 
Seeing the Wood for the Trees - John Tinsley (Iconic Translation Machines)
Seeing the Wood for the Trees - John Tinsley (Iconic Translation Machines)Seeing the Wood for the Trees - John Tinsley (Iconic Translation Machines)
Seeing the Wood for the Trees - John Tinsley (Iconic Translation Machines)TAUS - The Language Data Network
 
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for TranslationRIILP
 
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translationRIILP
 
Translation assessment
Translation assessmentTranslation assessment
Translation assessmentapril aulia
 
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...RIILP
 
2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT IntroductionRIILP
 
Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Arle Lommel
 
18. Alessandro Cattelan (Translated) Terminology
18. Alessandro Cattelan (Translated) Terminology18. Alessandro Cattelan (Translated) Terminology
18. Alessandro Cattelan (Translated) TerminologyRIILP
 
KantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas KazlauskasKantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas Kazlauskaskantanmt
 
virtual Touch keypad 12-key Chinese text input benchmark test report
virtual Touch keypad 12-key Chinese text input benchmark test reportvirtual Touch keypad 12-key Chinese text input benchmark test report
virtual Touch keypad 12-key Chinese text input benchmark test reportJohn Chen, Jun
 
Succeeding with Functional-first Programming in Enterprise
Succeeding with Functional-first Programming in EnterpriseSucceeding with Functional-first Programming in Enterprise
Succeeding with Functional-first Programming in Enterprisedsyme
 
Lung-Hao Lee - 2015 - Overview of the NLP-TEA 2015 Shared Task for Chinese Gr...
Lung-Hao Lee - 2015 - Overview of the NLP-TEA 2015 Shared Task for Chinese Gr...Lung-Hao Lee - 2015 - Overview of the NLP-TEA 2015 Shared Task for Chinese Gr...
Lung-Hao Lee - 2015 - Overview of the NLP-TEA 2015 Shared Task for Chinese Gr...Association for Computational Linguistics
 

What's hot (20)

6. Entrepreneurship - Juan Jose Arevalillo Doval (Hermes)
6. Entrepreneurship - Juan Jose Arevalillo Doval (Hermes)6. Entrepreneurship - Juan Jose Arevalillo Doval (Hermes)
6. Entrepreneurship - Juan Jose Arevalillo Doval (Hermes)
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...
 
9. Ethics - Juan Jose Arevalillo Doval (Hermes)
9. Ethics - Juan Jose Arevalillo Doval (Hermes)9. Ethics - Juan Jose Arevalillo Doval (Hermes)
9. Ethics - Juan Jose Arevalillo Doval (Hermes)
 
High Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons LearnedHigh Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons Learned
 
Seeing the Wood for the Trees - John Tinsley (Iconic Translation Machines)
Seeing the Wood for the Trees - John Tinsley (Iconic Translation Machines)Seeing the Wood for the Trees - John Tinsley (Iconic Translation Machines)
Seeing the Wood for the Trees - John Tinsley (Iconic Translation Machines)
 
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
 
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
 
Lucia Specia - Estimativa de qualidade em TA
Lucia Specia - Estimativa de qualidade em TALucia Specia - Estimativa de qualidade em TA
Lucia Specia - Estimativa de qualidade em TA
 
TAUS MT Post-Editing Guidelines
TAUS MT Post-Editing GuidelinesTAUS MT Post-Editing Guidelines
TAUS MT Post-Editing Guidelines
 
Translation assessment
Translation assessmentTranslation assessment
Translation assessment
 
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
 
2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction
 
Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)
 
18. Alessandro Cattelan (Translated) Terminology
18. Alessandro Cattelan (Translated) Terminology18. Alessandro Cattelan (Translated) Terminology
18. Alessandro Cattelan (Translated) Terminology
 
KantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas KazlauskasKantanFest: Mindaugas Kazlauskas
KantanFest: Mindaugas Kazlauskas
 
virtual Touch keypad 12-key Chinese text input benchmark test report
virtual Touch keypad 12-key Chinese text input benchmark test reportvirtual Touch keypad 12-key Chinese text input benchmark test report
virtual Touch keypad 12-key Chinese text input benchmark test report
 
TOIN - TAUS Tokyo Forum 2015
TOIN - TAUS Tokyo Forum 2015TOIN - TAUS Tokyo Forum 2015
TOIN - TAUS Tokyo Forum 2015
 
Computer
ComputerComputer
Computer
 
Succeeding with Functional-first Programming in Enterprise
Succeeding with Functional-first Programming in EnterpriseSucceeding with Functional-first Programming in Enterprise
Succeeding with Functional-first Programming in Enterprise
 
Lung-Hao Lee - 2015 - Overview of the NLP-TEA 2015 Shared Task for Chinese Gr...
Lung-Hao Lee - 2015 - Overview of the NLP-TEA 2015 Shared Task for Chinese Gr...Lung-Hao Lee - 2015 - Overview of the NLP-TEA 2015 Shared Task for Chinese Gr...
Lung-Hao Lee - 2015 - Overview of the NLP-TEA 2015 Shared Task for Chinese Gr...
 

Viewers also liked

Data and Linguistics: Delivering Machine Translation with Subject Matter Expe...
Data and Linguistics: Delivering Machine Translation with Subject Matter Expe...Data and Linguistics: Delivering Machine Translation with Subject Matter Expe...
Data and Linguistics: Delivering Machine Translation with Subject Matter Expe...Iconic Translation Machines
 
ICIC 2013 Conference Proceedings Roland Knispel ChemAxon
ICIC 2013 Conference Proceedings Roland Knispel ChemAxonICIC 2013 Conference Proceedings Roland Knispel ChemAxon
ICIC 2013 Conference Proceedings Roland Knispel ChemAxonDr. Haxel Consult
 
Patlib presentation lighthouse_ip_group_will_hawkins_2012_is_on_demand_patent...
Patlib presentation lighthouse_ip_group_will_hawkins_2012_is_on_demand_patent...Patlib presentation lighthouse_ip_group_will_hawkins_2012_is_on_demand_patent...
Patlib presentation lighthouse_ip_group_will_hawkins_2012_is_on_demand_patent...Lighthouse IP Group
 
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...TAUS - The Language Data Network
 
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...Yusuke Oda
 
20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword Units20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword UnitsKanji Takahashi
 
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...Dr. Haxel Consult
 
Quality estimation: the Holy Grail in the MT scene (Gábor Bessenyei, CEO of M...
Quality estimation: the Holy Grail in the MT scene (Gábor Bessenyei, CEO of M...Quality estimation: the Holy Grail in the MT scene (Gábor Bessenyei, CEO of M...
Quality estimation: the Holy Grail in the MT scene (Gábor Bessenyei, CEO of M...TAUS - The Language Data Network
 
Predictive Analysis in Machine Translation is Business Intelligence.
Predictive Analysis in Machine Translation is Business Intelligence.Predictive Analysis in Machine Translation is Business Intelligence.
Predictive Analysis in Machine Translation is Business Intelligence.TAUS - The Language Data Network
 
شهاده خبره محمد جلال
شهاده خبره محمد جلالشهاده خبره محمد جلال
شهاده خبره محمد جلالMahmoud Aly
 
Why the Baltics are a prime region for driving innovation in language technol...
Why the Baltics are a prime region for driving innovation in language technol...Why the Baltics are a prime region for driving innovation in language technol...
Why the Baltics are a prime region for driving innovation in language technol...TAUS - The Language Data Network
 

Viewers also liked (16)

Machine Translation: The Neural Frontier
Machine Translation: The Neural FrontierMachine Translation: The Neural Frontier
Machine Translation: The Neural Frontier
 
Data and Linguistics: Delivering Machine Translation with Subject Matter Expe...
Data and Linguistics: Delivering Machine Translation with Subject Matter Expe...Data and Linguistics: Delivering Machine Translation with Subject Matter Expe...
Data and Linguistics: Delivering Machine Translation with Subject Matter Expe...
 
ICIC 2013 Conference Proceedings Roland Knispel ChemAxon
ICIC 2013 Conference Proceedings Roland Knispel ChemAxonICIC 2013 Conference Proceedings Roland Knispel ChemAxon
ICIC 2013 Conference Proceedings Roland Knispel ChemAxon
 
Patlib presentation lighthouse_ip_group_will_hawkins_2012_is_on_demand_patent...
Patlib presentation lighthouse_ip_group_will_hawkins_2012_is_on_demand_patent...Patlib presentation lighthouse_ip_group_will_hawkins_2012_is_on_demand_patent...
Patlib presentation lighthouse_ip_group_will_hawkins_2012_is_on_demand_patent...
 
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...
Beyond the Hype of Neural Machine Translation, Diego Bartolome (tauyou) and G...
 
Innovative Business and Pricing Models: for MT
Innovative Business and Pricing Models: for MTInnovative Business and Pricing Models: for MT
Innovative Business and Pricing Models: for MT
 
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
 
20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword Units20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword Units
 
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...
ICIC 2014 High volume, High Quality Patent Translation across Multiple Domain...
 
Plantilla hecha bien 2
Plantilla hecha bien 2Plantilla hecha bien 2
Plantilla hecha bien 2
 
Quality estimation: the Holy Grail in the MT scene (Gábor Bessenyei, CEO of M...
Quality estimation: the Holy Grail in the MT scene (Gábor Bessenyei, CEO of M...Quality estimation: the Holy Grail in the MT scene (Gábor Bessenyei, CEO of M...
Quality estimation: the Holy Grail in the MT scene (Gábor Bessenyei, CEO of M...
 
Predictive Analysis in Machine Translation is Business Intelligence.
Predictive Analysis in Machine Translation is Business Intelligence.Predictive Analysis in Machine Translation is Business Intelligence.
Predictive Analysis in Machine Translation is Business Intelligence.
 
شهاده خبره محمد جلال
شهاده خبره محمد جلالشهاده خبره محمد جلال
شهاده خبره محمد جلال
 
Why the Baltics are a prime region for driving innovation in language technol...
Why the Baltics are a prime region for driving innovation in language technol...Why the Baltics are a prime region for driving innovation in language technol...
Why the Baltics are a prime region for driving innovation in language technol...
 
Nietzsche
NietzscheNietzsche
Nietzsche
 
Notas ingles
Notas inglesNotas ingles
Notas ingles
 

Similar to Improving Translator Productivity with MT: A Patent Translation Case Study

Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize
 
Tech capabilities with_sa
Tech capabilities with_saTech capabilities with_sa
Tech capabilities with_saRobert Martin
 
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...TAUS - The Language Data Network
 
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargWelocalize
 
5 challenges of scaling l10n workflows KantanMT/bmmt webinar
5 challenges of scaling l10n workflows KantanMT/bmmt webinar5 challenges of scaling l10n workflows KantanMT/bmmt webinar
5 challenges of scaling l10n workflows KantanMT/bmmt webinarkantanmt
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...Welocalize
 
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS - The Language Data Network
 
Webinar: How to get localization and testing for medical devices done right
Webinar: How to get localization and testing for medical devices done right Webinar: How to get localization and testing for medical devices done right
Webinar: How to get localization and testing for medical devices done right Qualitest
 
From the Lab to the Market: Commercialising MT Research
From the Lab to the Market: Commercialising MT ResearchFrom the Lab to the Market: Commercialising MT Research
From the Lab to the Market: Commercialising MT ResearchIconic Translation Machines
 
Introducing language technology in the editing process: How to do things righ...
Introducing language technology in the editing process: How to do things righ...Introducing language technology in the editing process: How to do things righ...
Introducing language technology in the editing process: How to do things righ...Loctimize GmbH
 
Webinar automotive and engineering content 16.06.16
Webinar   automotive and engineering content 16.06.16Webinar   automotive and engineering content 16.06.16
Webinar automotive and engineering content 16.06.16kantanmt
 
The Automation Firehose: Be Strategic & Tactical With Your Mobile & Web Testing
The Automation Firehose: Be Strategic & Tactical With Your Mobile & Web TestingThe Automation Firehose: Be Strategic & Tactical With Your Mobile & Web Testing
The Automation Firehose: Be Strategic & Tactical With Your Mobile & Web TestingPerfecto by Perforce
 
Language Quality Management: Models, Measures, Methodologies
Language Quality Management: Models, Measures, Methodologies Language Quality Management: Models, Measures, Methodologies
Language Quality Management: Models, Measures, Methodologies Sajan
 
Managing Translation Memories for Engineering and Automotive Translation
Managing Translation Memories for Engineering and Automotive TranslationManaging Translation Memories for Engineering and Automotive Translation
Managing Translation Memories for Engineering and Automotive TranslationPoulomi Choudhury
 
Q Labs Webinar on Testcase Prioritization [Feb 20, 2009]
Q Labs Webinar on Testcase Prioritization [Feb 20, 2009]Q Labs Webinar on Testcase Prioritization [Feb 20, 2009]
Q Labs Webinar on Testcase Prioritization [Feb 20, 2009]Vipul Gupta
 
Machine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego BartolomeMachine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego Bartolometauyou
 
Neotys PAC 2018 - Gayatree Nalwadad
Neotys PAC 2018 - Gayatree NalwadadNeotys PAC 2018 - Gayatree Nalwadad
Neotys PAC 2018 - Gayatree NalwadadNeotys_Partner
 
Software Testing Services
Software Testing ServicesSoftware Testing Services
Software Testing ServicesScienceSoft
 

Similar to Improving Translator Productivity with MT: A Patent Translation Case Study (20)

Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
 
Tech capabilities with_sa
Tech capabilities with_saTech capabilities with_sa
Tech capabilities with_sa
 
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...
TAUS MT SHOWCASE, Creating Competitive Advantage with Rapid Customization & D...
 
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
 
5 challenges of scaling l10n workflows KantanMT/bmmt webinar
5 challenges of scaling l10n workflows KantanMT/bmmt webinar5 challenges of scaling l10n workflows KantanMT/bmmt webinar
5 challenges of scaling l10n workflows KantanMT/bmmt webinar
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
 
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
 
Webinar: How to get localization and testing for medical devices done right
Webinar: How to get localization and testing for medical devices done right Webinar: How to get localization and testing for medical devices done right
Webinar: How to get localization and testing for medical devices done right
 
From the Lab to the Market: Commercialising MT Research
From the Lab to the Market: Commercialising MT ResearchFrom the Lab to the Market: Commercialising MT Research
From the Lab to the Market: Commercialising MT Research
 
MT Evaluation: Seeing the Wood for the Trees
MT Evaluation: Seeing the Wood for the TreesMT Evaluation: Seeing the Wood for the Trees
MT Evaluation: Seeing the Wood for the Trees
 
Introducing language technology in the editing process: How to do things righ...



Improving Translator Productivity with MT: A Patent Translation Case Study

  • 1. Improving Translator Productivity with MT a patent translation case study John Tinsley CEO and Co-founder PSLT @ MT Summit. Miami. 30th October 2015
  • 2. We provide Machine Translation solutions with Subject Matter Expertise MT solutions and services provider, specializing in providing customised solutions with subject matter expertise for specific technical sectors, such as Patents/IP, life sciences, and financial.
  • 3.
  • 4. Pre-processing Post-processing Input Output Training Data Data Engineering How does that work?
  • 5. The Ensemble Architecture™: combining linguistics, statistics, and MT expertise. Pipeline components: Patent input classifier; Chinese pre-ordering rules; Japanese script normalisation; Korean pharma tokenizer; German compounding rules; Spanish med-device entity recognizer; Client TM/terminology (optional); Training Data; Moses and RBMT engines; Multi-output Combination; Statistical Post-editing; Input; Output. Domain Adaptation and Data Selection • MML with Vocabulary Saturation Filtering (VSF) • Language and translation model interpolation (linear/log linear) • Terminology extraction using IR. Hybrid is a misnomer • Statistical MT • Syntax-based methods • Grammar rules • Example-based templates. Also: On-the-fly system combination; Hierarchical models; Translation Memory Integration; Syntactic pre/post-ordering; Template-driven translation.
  • 6. The Challenge of Patents L is an organic group selected from -CH2- (OCH2CH2)n-, -CO-NR'-, with R'=H or C1-C4 alkyl group; n=0-8; Y=F, CF3 … maximum stress of 1.2 to 3.5 N/mm<2> and a maximum elongation of 700 to 1,300% at 0[deg.] C. Long Sentences Technical constructions Largest single document: 249,322 words Longest Sentence: 1,417 words
  • 7. The Challenge of Patents. Authoring guide for “to be translated” text: • Very long sentences as standard • Grammatically incomplete using nominal and telegraphic style (!) • Passive forms are frequent • Frequent use of subordinate clauses, participles, implicit constructs • Inconsistent and incorrect spelling • High use of neologisms • Instances of synonymy and polysemy • Spurious use of punctuation. Patents break almost all of the rules!
  • 8. IPTranslator: Patent Translation by Iconic Translation Machines
  • 9. MT Application Areas. MT for Information Purposes: • Development focuses on improving key information translation • Terminology is important • Evaluation driven by “usability”. MT for Post-editing Productivity: • Development focuses on reducing edits required • Feedback loop is crucial • Evaluation through practical translation tasks
  • 10. MT Evaluation – where do we start!? Lots of different ways to do evaluation – automatic scores • BLEU, METEOR, GTM, TER – fluency, adequacy, comparative ranking – task-based evaluation • error analysis, post-edit productivity. Different metrics, different intelligence – what does each type of metric tell us? – which ones are usable at which stage of evaluation? e.g. can we really use automatic scores to assess productivity? e.g. does productivity delta really tell us how good the output is?
  • 11. Client Case Study – RWS - UK headquartered public company - Founded 1958 - 9th largest LSP (CSA 2013 report) - Leader in specialist IP translations. Problem: Large Chinese to English patent translation project. Challenging content and language. Question: What, if any, efficiencies can machine translation add to the workflow of RWS translators? How we applied different types of MT evaluation at different stages in the process, at various go/no-go stages, to help RWS assess whether MT is viable for this project.
  • 12. Step 1: Baseline and Customisation. Can we improve our baseline engines through customisation? [Bar chart: BLEU and TER scores for Iconic Baseline vs. Iconic Customised; vertical axis 0 to 0.8] - Huge improvement - Intuitively, scores reflect well but don’t really say anything - Let’s dig deeper. What next? How good is the output relative to the task, i.e. post-editing? - fluency/adequacy not going to tell us - let’s start with segment level TER
  • 13. Translation Edit Rate: correlates well with practical evaluations. If we look deeper, what can we learn? INTELLIGENCE • Proportion of full matches (i.e. big savings) • Proportion of close matches (i.e. faster than fuzzy matches) • Proportion of poor matches. ACTIONABLE INFORMATION • Type of sentence with high/low matches • Weaknesses and gaps • Segments to compare and analyse in translation memory
  • 14. Step 2: Segment-level automatic analysis. [Chart: distribution of segment-level TER scores; axes: TER score, segment length] This represents a 24% potential productivity gain.
  • 15. Step 3: Productivity testing. With MT experience and previous MT integration, productivity testing can be run in the production environment. In this case, we used the TAUS Dynamic Quality Framework. [Image: Productivity Test]
  • 17. Step 3: Productivity testing. With MT experience and previous MT integration, productivity testing can be run in the production environment. In this case, we used the TAUS Dynamic Quality Framework. Beware the variables! • Translators: different experience, speed, perceptions of MT – 24 translators: senior, staff, and interns • Test sets: not representative; particularly difficult – 2 test sets, comprising 5 documents, and cross-fold validation • Environment and task: inexperience and unfamiliarity – Training materials, videos, and “dummy” segments
  • 18. Findings and Learnings. Overall average: 25% productivity gain. By Translator Profile: Experienced: 22%; Staff: 23%; Interns: 30%. By Test Set: Test set 1.1: 25%; Test set 1.2: 35%; Test set 2.1: 6%; Test set 2.2: 35%. What it tells us: Correlates with TER. Rollout with junior staff for more immediate impact on bottom line? Don’t be over concerned by outliers. Use data to facilitate source content profiling?
  • 19. Warnings, Tips, and Next Steps. Look out for anomalies – segments with long timings (above average ratio words/minute) – sentences that don’t change much from MT to post-edit – segments with unusually short timings. Now would be the right time to do fluency/adequacy if you need to verify that post-editing is producing, at least, similar quality output. In this case, the next step is production roll-out to validate these in the actual translator workflow over an extended period.
  • 20. “The biggest room in the world is the room for improvement”
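
The productivity figures on slide 18 and the anomaly checks on slide 19 can be illustrated with a small sketch. This is not the TAUS DQF tooling or the calculation used in the case study; it is a minimal Python example under stated assumptions: each timed segment is a hypothetical dict with `id`, `words`, `seconds`, `mt`, and `post_edit` fields, the gain is measured as the relative change in words per hour against translation from scratch, and the `slow_factor` and `fast_factor` thresholds are invented for illustration.

```python
from statistics import median


def words_per_hour(segments):
    """Throughput over a list of timed segments (hypothetical record format)."""
    words = sum(s["words"] for s in segments)
    seconds = sum(s["seconds"] for s in segments)
    return 3600.0 * words / seconds


def productivity_gain(post_edited, from_scratch):
    """Relative throughput change of post-editing versus translating from scratch."""
    baseline = words_per_hour(from_scratch)
    return (words_per_hour(post_edited) - baseline) / baseline


def flag_anomalies(segments, slow_factor=3.0, fast_factor=3.0):
    """Flag segments worth a manual look before trusting the averages.

    The thresholds are assumptions for illustration, not values from the study.
    Exact equality stands in for "doesn't change much from MT to post-edit".
    """
    paces = [s["seconds"] / s["words"] for s in segments]  # seconds per word
    typical = median(paces)
    flagged = []
    for seg, pace in zip(segments, paces):
        reasons = []
        if pace > slow_factor * typical:
            reasons.append("long timing")
        if pace < typical / fast_factor:
            reasons.append("short timing")
        if seg["mt"] == seg["post_edit"]:
            reasons.append("unchanged MT")
        if reasons:
            flagged.append((seg["id"], reasons))
    return flagged
```

Flagged segments are candidates for review rather than automatic exclusion: a very short timing on an unchanged segment may simply be a good MT output, which is exactly the saving the test is trying to measure.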

Editor's Notes

  1. The idea here is that, as with any translation job, you want someone with expertise in the area. Same goes for MT. I’ll talk about how this affects how we TRAIN AND DEVELOP engines and also how we EVALUATE engines, and I’ll wrap up with a relatively big case study on how we helped one particular LSP bring on-board MT to improve translation post-editing productivity. We are the MT partner of choice for some of the world’s largest translation companies, information providers, and government and enterprise organisations. For Translation Companies: We help translation companies to translate more content, more accurately for faster project turnaround, resulting in significant cost savings and increased revenue. For Enterprise Clients: We help enterprises to translate more content in less time, resulting in faster products to market and enhanced global reach. For Information Providers: We help information providers to translate knowledge, literature and documentary information faster and more accurately, resulting in broader knowledge offerings and faster time to market.
  2. Just to give you some background on the origins of the company… The Pluto project was an EU FP7 project that began in 2010 with the goal of adapting existing MT technology for the area of patent translation (high demand: Europe has lots of languages). It went very well and led to the development of a service called IPTranslator.com, which was aimed at facilitating patent searchers through multilingual MT that could be easily integrated across patent search tools like Espacenet, Patentscope, Patbase, Questel and others… After the project, given the technology in place, a company called Iconic was formed, which expanded the techniques and technology developed in IPTranslator to be adapted to other sectors and other markets such as information providers, enterprise companies, and LSPs. While IPTranslator.com still exists today (albeit in a modified format), the web service itself is not a core business; rather, building on it to provide highly customer-adapted systems is the main focus of Iconic. Though in doing this, a large portion of our business is in the area of patent translation.
  3. Ok, so just as an introduction to the topic of DOMAIN ADAPTATION from our perspective, how do Iconic translation machines work? Existing vendors or MT providers use the following process: if a client wants a machine translation system for a certain domain, say IT, they provide the vendor with training data and this gets churned through the various generic processes for each language required. The idea is that by pumping in data in the IT domain, an IT machine translation system comes out at the end. It’s true to a certain extent, but the reality is that the quality often doesn’t cut the mustard. The problem with the data engineering approach is that you need A LOT of data and many clients simply don’t have it. As a consequence, there’s a complete reliance on the data. If the MT output doesn’t meet the end-user requirements, the only solution is to say “we need more training data”. This shortcoming really comes to the fore when we’re dealing with complex languages and content types, like patents…
  4. ON-THE-FLY MODEL SELECTION. CLASSIFIER BASED. USER SELECTION (INTERFACE/REQUEST). QUALITY vs CONVENIENCE in Commission Scenario… NOT GOING TO GIVE AWAY TOO MANY TRADE SECRETS. Customised domain-specific MT. Grew from patent translations, expanding into technology that can be applied across technical areas. Mixture of statistical, rule-based, syntactic. Ensemble architecture. Domain adaptation and data selection: MML with Vocabulary Saturation Filtering (VSF); language/translation model interpolation (linear/log linear); IR based term extraction (ask Hala). Hybrid, what is hybrid? Misnomer. SMT + rules. Rule based + APE? Where does syntax fit in? Our hybrid architecture uses what’s most appropriate for a particular language, domain AND style combination. Specifically: on-the-fly system combination; hierarchical models; template driven, TM integrated; syntactic pre/post-ordering.
  5. And, given the current environment, a little look as to why this is particularly required for patents…
  6. Sometimes it’s hard to tell whether the translation is bad or that’s simply how the original patent was written!
  7. Using a certain set of configurations in the ensemble gives us IPTranslator, which is our suite of MT engines that have been specifically adapted for patents. These serve as “ready to go” tools or as a basis for customisation… So let’s look at how IPTranslator and other types of MT are being applied in the industry…
  8. There are a number of different ways in which MT is used in general. Most commonly for our solutions, we’re looking at 2 main use cases: MT for info and MT for post-editing. Examples of the things we offer in MT FOR INFO include development to specific evaluation criteria and how we want to fit it into a particular end-user scenario – e.g. EDISCOVERY, MT FOR SEARCH, WEB STORE INTEGRATION
  9. And, heading into our case studies, let’s look at the hardest part – evaluation… Different metrics tell us different things but, perhaps more importantly, there is what the metrics don’t tell us. There are lots of them out there; you need to know which ones to use and when. We’ve obviously got a lot of experience in this area given our background.
  10. I’ll talk about how we collected this information through MT evaluations via a case study with RWS. What I’ll focus on is WHAT MT evaluation we carried out and at what STAGES to give us the information we needed to know
  11. First step is: can we improve our engines through customisation? These automatic scores tell us CONCLUSIVELY. Yes. But they don’t really tell us anything about QUALITY, or SUITABILITY for the TASK. We need to dig deeper on a segment level and for this, we use TER. WHY?
  12. TER has correlated well with practical evaluations for us. It gives us practical information which we can correlate with the bottom line. It also gives us practicable (actionable) information which we can use to improve MT and do further analysis. **If you do this over a variety of test documents like we did with RWS, where we used 10, you’ll get a sense of what the MT can bring**
  13. For example, here we see FOR EACH SEGMENT, the TER range and how long the segments are within those ranges. This allows us to do some calculations, which I won’t detail now (we can discuss them in the breakout session), but it resulted in a 24% gain
  14. Experience is crucial here. Lots of variables and things to look out for, like TRANSLATORS, TEST SETS, and the ENVIRONMENT, as I’m sure people here can attest to. I won’t go into detail but here’s a high level look at what we did to try to find out different information.
  15. We know the European landscape, the stakeholders, and the requirements. Our machine translation expertise is second to none in the commercial landscape, and we’re helping to drive machine translation adoption and, in so doing, taking concepts from the lab to the market. We specialise in collaboration – commercial, public sector, government, and research institution (so we’re well attuned to adapting to shifting priorities). Iconic was born out of Europe and we’d be only too happy to give back in whatever way possible (for the right price)
  16. Experience is crucial here. Lots of variables and things to look out for, like TRANSLATORS, TEST SETS, and the ENVIRONMENT, as I’m sure people here can attest to. I won’t go into detail but here’s a high level look at what we did to try to find out different information.
  17. TECHNICAL DETAILS SPECIFIC TO POST-EDIT ANALYSES THAT WE LEARNED. In terms of analysing information, there are a number of things to look out for to make sure we’re getting more accurate results. Safe to say, now would be the right time to look at quality evaluation and make sure post-editing is not affecting things
  18. MT is now! “Domain” adaptation is more than just similar documents – it involves taking into account style and variations across languages. Patents are hard – plenty of room for improvement
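
The segment-level analysis described in notes 12 and 13 (proportions of full, close, and poor matches) can be sketched as follows. This is a simplified illustration, not the calculation behind the 24% figure, which the notes do not detail. Two assumptions are worth stating loudly: the score below is a word-level edit distance divided by reference length, which approximates TER but omits the block-shift operation that full TER includes, and the `close_threshold` value of 0.3 is an invented cut-off for illustration.

```python
from collections import Counter


def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insert, delete, substitute; no shifts)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def approx_ter(mt, reference):
    """Edits per reference word. Approximates TER; real TER also allows shifts."""
    hyp, ref = mt.split(), reference.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return word_edit_distance(hyp, ref) / len(ref)


def bucket(score, close_threshold=0.3):
    """Assign a segment to a match band; the threshold is an assumption."""
    if score == 0.0:
        return "full"
    if score <= close_threshold:
        return "close"
    return "poor"


def profile(pairs, close_threshold=0.3):
    """Proportion of full, close, and poor matches over (mt, reference) pairs."""
    counts = Counter(bucket(approx_ter(mt, ref), close_threshold) for mt, ref in pairs)
    total = sum(counts.values())
    if total == 0:
        return {"full": 0.0, "close": 0.0, "poor": 0.0}
    return {name: counts[name] / total for name in ("full", "close", "poor")}
```

Running `profile` over several test documents gives the three proportions listed on slide 13; how those proportions are then weighted into an estimated productivity gain is a separate modelling choice that this sketch does not attempt.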