Delivered at Machine Translation Summit during a special workshop on MT for patent and scientific literature.
October 30th 2015
Miami, Florida.
In this talk, we describe how we adapted machine translation for patents to help a translation company improve their productivity.
Improving Translator Productivity with MT: A Patent Translation Case Study
1. Improving Translator Productivity with MT
a patent translation case study
John Tinsley
CEO and Co-founder
PSLT @ MT Summit. Miami. 30th October 2015
2. We provide Machine Translation
solutions with Subject Matter Expertise
MT solutions and services provider, specializing in
providing customised solutions with subject
matter expertise for specific technical sectors,
such as Patents/IP, life sciences, and financial.
5. [Architecture diagram: a patent input classifier routes input (plus optional client TM/terminology) through language- and domain-specific modules (Chinese pre-ordering rules, Spanish med-device entity recognizer, Korean pharma tokenizer, Japanese script normalisation, German compounding rules) into engines (Moses, RBMT), with statistical post-editing and multi-output combination producing the output, all built on training data.]
Domain Adaptation and Data Selection
• MML with Vocabulary Saturation
Filtering (VSF)
• Language and translation model
interpolation (linear/log linear)
• Terminology extraction using IR
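The MML data selection listed above rests on the Moore-Lewis cross-entropy-difference idea. A minimal sketch using unigram models for brevity (real systems use higher-order LMs and layer VSF on top); all corpora and names here are illustrative:

```python
import math
from collections import Counter

def unigram_lm(corpus):
    """Build a unigram probability model with add-one smoothing."""
    counts = Counter(w for sent in corpus for w in sent.split())
    total, vocab = sum(counts.values()), len(counts) + 1
    return lambda w: (counts[w] + 1) / (total + vocab)

def cross_entropy(lm, sentence):
    """Per-word cross-entropy of a sentence under the model."""
    words = sentence.split()
    return -sum(math.log2(lm(w)) for w in words) / max(len(words), 1)

def moore_lewis_select(pool, in_domain, general, top_k):
    """Rank candidates by H_in(s) - H_gen(s); lower scores look
    more in-domain relative to the general corpus."""
    lm_in, lm_gen = unigram_lm(in_domain), unigram_lm(general)
    ranked = sorted(pool, key=lambda s: cross_entropy(lm_in, s)
                                        - cross_entropy(lm_gen, s))
    return ranked[:top_k]

in_domain = ["the claimed invention comprises a polymer layer",
             "said apparatus comprises a first electrode"]
general = ["the weather was nice today", "she went to the market"]
pool = ["the device comprises a second polymer electrode",
        "he walked to the market in nice weather"]
print(moore_lewis_select(pool, in_domain, general, top_k=1))
# → ['the device comprises a second polymer electrode']
```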
Hybrid is a misnomer
• Statistical MT
• Syntax-based methods
• Grammar rules
• Example-based templates
• On-the-fly system combination
• Hierarchical models
• Translation Memory Integration
• Syntactic pre/post-ordering
• Template-driven translation
Combining linguistics, statistics, and MT expertise
The Ensemble Architecture™
6. The Challenge of Patents
L is an organic group selected from -CH2-
(OCH2CH2)n-, -CO-NR'-, with R'=H or
C1-C4 alkyl group; n=0-8; Y=F, CF3 …
maximum stress of 1.2 to 3.5 N/mm<2>
and a maximum elongation of 700 to
1,300% at 0[deg.] C.
Long Sentences
Technical constructions
Largest single document: 249,322 words
Longest Sentence: 1,417 words
7. The Challenge of Patents
• Very long sentences as standard
• Grammatically incomplete, using nominal and telegraphic style (!)
• Passive forms are frequent
• Frequent use of subordinate clauses, participles, implicit constructs
• Inconsistent and incorrect spelling
• High use of neologisms
• Instances of synonymy and polysemy
• Spurious use of punctuation
Authoring guide for “to be translated” text: patents break almost all of the rules!
9. MT Application Areas
MT for Information Purposes:
• Development focuses on improving key information translation
• Terminology is important
• Evaluation driven by “usability”
MT for Post-editing Productivity:
• Development focuses on reducing edits required
• Feedback loop is crucial
• Evaluation through practical translation tasks
10. Lots of different ways to do evaluation
– automatic scores
• BLEU, METEOR, GTM, TER
– fluency, adequacy, comparative ranking
– task-based evaluation
• error analysis, post-edit productivity
Different metrics, different intelligence
– what does each type of metric tell us?
– which ones are usable at which stage of evaluation?
e.g. can we really use automatic scores to assess productivity?
e.g. does productivity delta really tell us how good the output is?
MT Evaluation – where do we start!?
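To make the automatic-score family concrete, here is a hedged sketch of smoothed sentence-level BLEU (add-one-smoothed modified n-gram precisions with a brevity penalty); production evaluation should use a standard implementation such as sacreBLEU rather than this toy:

```python
import math
from collections import Counter

def sentence_bleu(hypothesis, reference, max_n=4):
    """Smoothed sentence-level BLEU: geometric mean of add-one-smoothed
    modified n-gram precisions (n = 1..max_n), times a brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = sum(hyp_ngrams.values())
        # add-one smoothing keeps short segments from scoring zero
        log_prec += math.log((overlap + 1) / (total + 1)) / max_n
    brevity = min(1.0, math.exp(1 - len(ref) / max(len(hyp), 1)))
    return brevity * math.exp(log_prec)

score = sentence_bleu("the device comprises a polymer layer",
                      "the apparatus comprises a polymer layer")
```

The point of the slide stands even in the toy version: a single corpus-level number like this hides everything about individual segments.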
11. Problem
Large Chinese to English patent translation project. Challenging
content and language
Question
What, if any, efficiencies can machine translation add to the workflow of RWS translators?
How we applied different types of MT evaluation at different stages in the process, at various go/no-go points, to help RWS assess whether MT is viable for this project
Client Case Study – RWS
- UK headquartered public company
- Founded 1958
- 9th largest LSP (CSA 2013 report)
- Leader in specialist IP translations
12. Can we improve our baseline engines through customisation?
Step 1: Baseline and Customisation
[Bar chart: BLEU and TER scores (0–0.8 scale) comparing the Iconic Baseline and Iconic Customised engines]
What next?
How good is the output relative to the task, i.e. post-editing?
- fluency/adequacy not going to tell us
- let’s start with segment level TER
- Huge improvement
- Intuitively, scores reflect well but don’t really say anything
- Let’s dig deeper
13. Translation Edit Rate: correlates well with practical evaluations
If we look deeper, what can we learn?
INTELLIGENCE
• Proportion of full matches (i.e. big savings)
• Proportion of close matches (i.e. faster than fuzzy matches)
• Proportion of poor matches
ACTIONABLE INFORMATION
• Type of sentence with high/low matches
• Weaknesses and gaps
• Segments to compare and analyse in translation memory
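The match bands above can be sketched with a simplified TER: word-level edit distance over reference length (true TER also counts block shifts as single edits). The bucket thresholds below are assumed for illustration, not RWS's actual bands:

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance (single-row DP)."""
    dp = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, wb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (wa != wb))
    return dp[len(b)]

def ter(hypothesis, reference):
    """Simplified TER: edits / reference length (no block shifts)."""
    hyp, ref = hypothesis.split(), reference.split()
    return edit_distance(hyp, ref) / max(len(ref), 1)

def bucket(score):
    """Assumed, illustrative match bands."""
    if score == 0.0: return "full match"
    if score <= 0.3: return "close match"
    if score <= 0.7: return "usable"
    return "poor match"
```

Bucketing segment-level scores this way is what yields the distribution analysed in the next step.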
14. Step 2: Segment-level automatic analysis
[Chart: distribution of segment-level TER scores by segment length]
This represents a 24% potential productivity gain.
15. With MT experience and previous MT integration, productivity testing can be run in the production environment. In this case we used the TAUS Dynamic Quality Framework.
Step 3: Productivity testing
Productivity Test
17. Beware the variables!
• Translators: different experience, speed, perceptions of MT
– 24 translators: senior, staff, and interns
• Test sets: not representative; particularly difficult
– 2 test sets, comprising 5 documents, with cross-fold validation
• Environment and task: inexperience and unfamiliarity
– Training materials, videos, and “dummy” segments
Step 3: Productivity testing
18. Findings and Learnings
Overall average: 25% productivity gain
By Translator Profile:
- Experienced: 22%
- Staff: 23%
- Interns: 30%
By Test Set:
- Test set 1.1: 25%
- Test set 1.2: 35%
- Test set 2.1: 6%
- Test set 2.2: 35%
What it tells us:
- Correlates with TER
- Rollout with junior staff for more immediate impact on the bottom line?
- Don’t be overly concerned by outliers.
- Use data to facilitate source content profiling?
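The headline figures above reduce to a simple throughput calculation. Here is a minimal sketch, with purely illustrative words-per-minute numbers (not the RWS measurements):

```python
from statistics import mean

def productivity_gain(baseline_wpm, postedit_wpm):
    """Relative throughput gain of post-editing MT output
    over translating from scratch, in words per minute."""
    return (postedit_wpm - baseline_wpm) / baseline_wpm

def gains_by_group(records):
    """records: (group, baseline_wpm, postedit_wpm) tuples;
    returns the mean relative gain per translator group."""
    groups = {}
    for group, base, pe in records:
        groups.setdefault(group, []).append(productivity_gain(base, pe))
    return {g: mean(v) for g, v in groups.items()}

# Illustrative numbers only:
records = [("interns", 8.0, 10.4), ("staff", 10.0, 12.3)]
print({g: round(v, 2) for g, v in gains_by_group(records).items()})
# → {'interns': 0.3, 'staff': 0.23}
```

Grouping by translator profile, as in the slide, is what surfaces findings like interns benefiting most.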
19. Look out for anomalies
– segments with long timings (above average ratio words/minute)
– sentences that don’t change much from MT to post-edit
– segments with unusually short timings
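One way to operationalise the timing checks above is a simple outlier flag. A sketch with an assumed z-score threshold (IQR fences would work just as well):

```python
from statistics import mean, stdev

def flag_anomalies(timings, z=2.0):
    """timings: (segment_id, words_per_minute) pairs. Flags segments
    whose throughput sits more than z standard deviations from the
    mean: suspiciously fast may mean MT was accepted unread,
    suspiciously slow may mean a break or an unusually hard segment."""
    rates = [rate for _, rate in timings]
    mu, sigma = mean(rates), stdev(rates)
    return [sid for sid, rate in timings if abs(rate - mu) > z * sigma]

timings = [("s1", 10), ("s2", 11), ("s3", 9), ("s4", 10),
           ("s5", 10), ("s6", 11), ("s7", 9), ("s8", 50)]
print(flag_anomalies(timings))  # → ['s8']
```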
In this case, the next step is production roll-out to validate these
in the actual translator workflow over an extended period.
Warnings, Tips, and Next Steps
Now would be the right time to do fluency/adequacy if you need to
verify that post-editing is producing, at least, similar quality output
The idea here is that, as with any translation job, you want someone with expertise in the area. Same goes for MT. I’ll talk about how this affects how we TRAIN AND DEVELOP engines and also how we EVALUATE engines, and I’ll wrap up with a relatively big case study on how we helped one particular LSP bring on-board MT to improve translation post-editing productivity.
We are the MT partner of choice for some of the world’s largest translation companies, information providers, and government and enterprise organisations.
For Translation Companies: We help translation companies to translate more content, more accurately for faster project turnaround, resulting in significant cost savings and increased revenue.
For Enterprise Clients: We help enterprises to translate more content in less time, resulting in faster products to market and enhanced global reach.
For Information Providers: We help information providers to translate knowledge, literature and documentary information faster and more accurately, resulting in broader knowledge offerings and faster time to market.
Just to give you some background on the origins of the company…
The Pluto project was an EU FP7 project that began in 2010 with the goal of adapting existing MT technology to patent translation (high demand in Europe, with its many languages)
Went very well and led to the development of a service called IPTranslator.com, aimed at facilitating patent searchers through multilingual MT that was easily integrable across patent search tools like Espacenet, Patentscope, Patbase, Questel, and others…
After the project, given the technology in place, a company called Iconic was formed, which expanded the techniques and technology developed in IPTranslator to other sectors and markets such as information providers, enterprise companies, and LSPs.
While IPTranslator.com still exists today (albeit in a modified format), the web service itself is not a core business; rather, building on it to provide highly customer-adapted systems is the main focus of Iconic. Even so, a large portion of our business is in the area of patent translation.
Ok, so just as an introduction to the topic of DOMAIN ADAPTATION from our perspective, how do Iconic translation machines work?
Existing vendors or MT providers use the following process – if a client wants a machine translation system for a certain domain, say IT, they provide the vendor with training data and this gets churned through the various generic processes for each language required. The idea is that by pumping in data in the IT domain, an IT machine translation system comes out at the end. It’s true to a certain extent, but the reality is that the quality often doesn’t cut the mustard. The problem with the data-engineering approach is that you need A LOT of data, and many clients simply don’t have it.
As a consequence, there’s a complete reliance on the data. If the MT output doesn’t meet the end-user requirements, the only solution is to say “we need more training data”
This shortcoming really comes to the fore when we’re dealing with complex languages and content types, like patents…
ON-THE-FLY MODEL SELECTION. CLASSIFIER BASED. USER SELECTION (INTERFACE/REQUEST)
QUALITY vs CONVENIENCE in Commission Scenario…
NOT GOING TO GIVE AWAY TOO MANY TRADE SECRETS
Customised domain-specific MT
Grew from patent translations, expanding into technology that can be applied across technical areas
Mixture of statistical, rule-based, syntactic.
Ensemble architecture
Domain adaptation and data selection
MML with Vocabulary Saturation Filtering (VSF)
language/translation model interpolation (linear/log linear)
IR based term extraction (ask Hala)
Hybrid, what is hybrid? A misnomer.
SMT + rules? Rule-based + APE? Where does syntax fit in?
Our hybrid architecture uses what’s most appropriate for a particular language, domain AND style combination
Specifically
on the fly system combination
Hierarchical models
Template-driven, TM-integrated
Syntactic pre/post-ordering
And, given the current environment, a little look as to why this is particularly required for patents…
Sometimes it’s hard to tell whether the translation is bad or that’s simply how the original patent was written!
Using a certain set of configurations in the ensemble gives us IPTranslator, which is our suite of MT engines that have been specifically adapted for patents.
These serve as “ready to go” tools or as a basis for customisation…
So let’s look at how IPTranslator, and other types of MT are being applied in the industry…
There are a number of different ways in which MT is used in general.
Most commonly for our solutions, we’re looking at 2 main use cases: MT for info and MT for post-editing.
Examples of the things we offer in MT FOR INFO include development against specific evaluation criteria and fitting it into a particular end-user scenario – e.g. EDISCOVERY, MT FOR SEARCH, WEB STORE INTEGRATION
And, heading into our case studies, let’s look at the hardest part – evaluation…
Different metrics tell us different things, but perhaps more important is what the metrics don’t tell us
There are lots of them out there, you need to know which ones to use and when.
We’ve obviously got a lot of experience in this area given our background,
I’ll talk about how we collected this information through MT evaluations via a case study with RWS.
What I’ll focus on is WHAT MT evaluation we carried out and at what STAGES, to give us the information we needed to know
First step is can we improve our engines through customisation.
These automatic scores tell us CONCLUSIVELY. Yes.
But they don’t really tell us anything about QUALITY, or SUITABILITY for the TASK
We need to dig deeper on a segment level and for this, we use TER. WHY?
TER has correlated well with practical evaluations for us.
It gives us practical information which we can correlate with the bottom line
It also gives us practicable (actionable) information which we can use to improve MT and do further analysis
**If you do this over a variety of test documents like we did with RWS, where we used 10, you’ll get a sense of what the MT can bring**
For example, here we see FOR EACH SEGMENT, the TER range and how long the segments are within those ranges.
This allows us to do some calculations, which I won’t detail now (we can discuss them in the breakout session), but it resulted in a 24% gain
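The kind of calculation alluded to here might look like the following sketch, under an assumed effort model (post-editing effort proportional to segment TER, capped at from-scratch effort); this is not Iconic's actual formula, and the distribution is hypothetical:

```python
def potential_gain_from_ter(segments):
    """segments: (word_count, ter) pairs from the TER distribution.
    Assumed model: post-editing effort per segment ~ words * TER,
    capped at the effort of translating from scratch (TER >= 1)."""
    scratch = sum(words for words, _ in segments)
    postedit = sum(words * min(t, 1.0) for words, t in segments)
    return (scratch - postedit) / scratch

# Hypothetical distribution of (segment length, TER):
segments = [(20, 0.0), (15, 0.2), (30, 0.5), (10, 1.2)]
print(f"{potential_gain_from_ter(segments):.0%}")  # → 63%
```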
Experience is crucial here. Lots of variables and things to look out for, like TRANSLATORS, TEST SETS, and the ENVIRONMENT, as I’m sure people here can attest to.
I won’t go into detail but here’s a high level look at what we did to try to find out different information.
We know the European landscape, the stakeholders, and the requirements
Our machine translation expertise is second to none in the commercial landscape, and we’re helping to drive machine translation adoption, and in so doing, taking concepts from the lab to the market
We specialise in collaboration – commercial, public sector, government, and research institution (so we’re well attuned to adapting to shifting priorities)
Iconic was borne out of Europe and we’d be only too happy to give back in whatever way possible (for the right price)
TECHNICAL DETAILS SPECIFIC TO POST-EDIT ANALYSES THAT WE LEARNED
In terms of analysing information, there are a number of things to look out for to make sure we’re getting more accurate results.
Safe to say, now would be the right time to look at quality evaluation and make sure post-editing is not affecting things
MT is now!
“Domain” adaptation is more than just similar documents – it involves taking into account style and variations across languages
Patents are hard – plenty of room for improvement