Sharing efforts to get the most from MT+PE

February 23, 2018
Athens, Greece #eliatogether
Sharing Efforts to Get the Most from MT+PE
Luigi Muzii
sQuid

Introduction
© 2018 Luigi Muzii
Hello
My name is Luigi Muzii

In this industry since 1982

Working with MT since 1991

Working in telecom until 2002

Freelancing since 2002

University teacher until 2011

Business consultant since 2012

Outline
• Clearing the field
• Laying foundations
• Defining requirements
• Arranging the platform
• Running projects

Clearing the field
Target groups

Practical advice

Guiding principles
• To apply whether you are a freelancer, a project manager, or a translation buyer

Three scenarios
• In the making (freelancers)
• Downstream (customers)
• On constraints (LSPs)

Laying foundations
Devising strategies

Method?

SMT
Already in the past?

NMT?
Still SMT, but… a whole different kettle of fish

Not exactly child’s play
• Tools
• Data
• Knowledge

Foresight
• MT will proliferate
• Good translators will still be lacking

Join forces
Everyone’s needed, no one’s indispensable

3 tips for getting started
• Recap goals and expectations
• Check MT readiness
• Plan for assistance

Defining requirements
Simple and straightforward

Scope
Where available data is larger and quality is higher

Goals
• Reduce labor
• Boost productivity
• Keep consistency

Separate the wheat from the chaff
• Familiarize with technology
• Strengthen your expertise
• Tackle security issues
• Scrub your data
• Plan for support
• Revise your pricing model

Building a platform
Selection, set-up, training, testing

Givens
• Not all engines are created equal
• Raw output can vary across systems and language pairs
• Errors may not follow a consistent pattern
• Engine performance also varies

Set-up
• Data
  ◦ Maintenance
• Customized engine
  ◦ +100,000 segments
• Tool settings
  ◦ Sub-segment recall
  ◦ Fuzzy match repair

Engine
• Total cost of ownership
• Integration
• Expertise
• Security

Running projects
Best practices

Dos
• Know your data
• Master quality metrics
• Devise a post-editing fee scheme

Don’ts
• Mess with data
• DIY/Rely on vendors
• Expect miracles

In any case, remember:
Tell the customer you are using MT
So you won’t get sued

The fuel
Output is only as good as the data used

Good (effective) data
• Few reliable sources
• Single domain
• Current data
• Same encoding
• No empty segments
• No errors
• Terminologically consistent segments
• Same style
• Same-length segments

The output
Accept that output is unpredictable

Post-editing: expectations
• Fast
• Unchallenging
• Flowing

Post-editing: measures
• Edit time
  ◦ The time required to bring raw MT output to the desired standard
• Post-editing effort
  ◦ The percentage of edits to be applied to raw MT output to attain the desired standard
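
One way to put a number on post-editing effort, once the post-edited text is available, is to count word-level edits between the raw MT output and the final version. The sketch below is an illustration only, not the method used in this talk: it assumes whitespace tokenization and plain Levenshtein distance (insertions, deletions, substitutions, with no block shifts as in TER-style metrics), normalized by the longer of the two segments. The function names `edit_distance` and `post_editing_effort` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def post_editing_effort(raw_mt, post_edited):
    """Percentage of word-level edits between raw MT and its post-edited version.

    Assumption: tokens are whitespace-separated words and the distance is
    normalized by the longer segment, so the result stays within 0-100.
    """
    raw_tokens = raw_mt.split()
    pe_tokens = post_edited.split()
    if not raw_tokens and not pe_tokens:
        return 0.0
    distance = edit_distance(raw_tokens, pe_tokens)
    return 100.0 * distance / max(len(raw_tokens), len(pe_tokens))


if __name__ == "__main__":
    raw = "the cat sat on mat"
    final = "the cat sat on the mat"
    print(f"{post_editing_effort(raw, final):.1f}% of words edited")
```

A figure like this only describes how much the text changed; it says nothing about the time the post-editor spent, which has to be tracked separately.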

Edit time
Can only be computed downstream

Post-editing effort
• Probabilistic forecasts
  ◦ Based on automatic metrics
• Depending on
  ◦ Post-editing level
  ◦ Volume
  ◦ Turn-around time

Post-editing levels
• Gisting
  ◦ Volatile content
  ◦ Automatic scripts to fix mechanical/recurring errors
• Light
  ◦ Continuous delivery
  ◦ Fixing capitalization and punctuation, replacing unknown words, removing redundant words, ignoring stylistic issues
• Full
  ◦ Publishing and engine training
  ◦ Fixing meaning distortion, fixing grammar and syntax, translating untranslated terms (possibly new terms), adjusting fluency

Dos
• Test before operating
• Ask for MT samples for negotiation
• Negotiate throughput rates
• Ask for glossary (with DNT words)
• Ask for instructions
• Be open to feedback

Don’ts
• Use MT to sustain price competition
• Process poor MT outputs
• Treat post-editing as fuzzy matches

Post-editing instructions
• Tool selection
• Environment setup
• General references
• Conventions
• Project details
• Pricing model
• Operating instructions

Pricing and compensation
• Upstream
  ◦ Clear-cut predictive scheme
  ◦ No fuzzy match scheme: fuzzy matches over 85% are inherently correct, while MT segments may contain errors and inaccuracies
• Downstream
  ◦ Measurement of actual work

Negotiation grid
• General
  ◦ Engine
  ◦ Generic or trained
  ◦ Quality
  ◦ Raw output
  ◦ Expectations
  ◦ Formats and formatting
• Compensation
  ◦ Per-word rate
  ◦ Productivity rate
  ◦ Hourly rate
  ◦ Time tracking

When to say NO
• A considerably low pay rate, unrelated to language pair and MT output quality
• MT output quality lower than that of a generic free online service

Automatic processing
• Pre-processing
  ◦ Empty, untranslated, duplicated segments
  ◦ Normalization
  ◦ Punctuation, diacritics, extra spaces, noise
  ◦ Numbers, dates, weights, measures
  ◦ Terminology
  ◦ Spellcheck
• Post-processing
  ◦ Encoding
  ◦ Normalization
  ◦ Terminology
  ◦ Spellcheck
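
As a rough illustration of the pre-processing step, the sketch below filters a list of (source, target) segment pairs: it normalizes Unicode and whitespace, then drops empty, untranslated, and duplicated segments. It is a minimal example under stated assumptions, not a description of any particular tool: the names `normalize` and `clean_segments` are hypothetical, and "untranslated" is approximated crudely as "target identical to source", which would also discard legitimately identical segments such as product names or bare numbers.

```python
import re
import unicodedata


def normalize(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def clean_segments(pairs):
    """Return normalized (source, target) pairs without empty,
    untranslated, or duplicated segments, preserving input order."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        source, target = normalize(source), normalize(target)
        if not source or not target:
            continue  # empty segment on either side
        if source == target:
            continue  # assumption: identical text means untranslated
        if (source, target) in seen:
            continue  # exact duplicate after normalization
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


if __name__ == "__main__":
    corpus = [
        ("Hello  world", "Ciao mondo"),
        ("", "Testo senza sorgente"),
        ("Same", "Same"),
        ("Hello world", "Ciao mondo"),
        ("Bye", "Addio "),
    ]
    for src, tgt in clean_segments(corpus):
        print(src, "->", tgt)
```

Terminology checks, spellchecking, and locale-specific handling of numbers, dates, weights, and measures are left out here because they depend on the language pair and the glossary in use.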

Thank you
Don’t forget your download card
