Assumptions, Expectations
and Outliers in Post-Editing
Lena Marg, Laura Casanellas
Language Tools Team
@ EAMT Summit Dubrovnik, Croatia, June 2014
Background on MT Programs @ Welocalize
MT programs vary with regard to:
Scope
Locales
Maturity
System Setup & Ownership
MT Solution used
Key Objective of using MT
Final Quality Requirements
Source Content
MT Quality Evaluation @ Welocalize
1. Automatic Scores
• Provided by the MT system (typically BLEU)
• Provided by our internal scoring tool, weScore (range of metrics)
2. Human Evaluation
• Adequacy, scores 1-5
• Fluency, scores 1-5
3. Productivity Tests
• Post-Editing versus Human Translation in iOmegaT, validated through final Quality Assessments
The Database
Objective:
Establish correlations between these 3 evaluation approaches to
- draw conclusions on predicting productivity gains in advance
- see how & when to use the different metrics best
Contents:
- Content Type
- Language Pair (English into XX)
- MT engine provider & owner (i.e. who owns training & maintenance)
- Metrics (BLEU & PE Distance, Adequacy & Fluency, Productivity deltas)
- MT error analysis
- Final QA scores
- Level of experience of resource doing productivity test
Data from 2013
The Database: Data Used
27 locales in total, with varying amounts of available data
5 different MT systems (SMT / Hybrid
Assumptions: Results
General assumptions around best-performing languages
and content types were confirmed.
Assumptions: Results, II
Interesting results around the correlation between productivity
gained when translating and when post-editing:
Not all resources improve equally (or at all) when changing
activities from translation to post-editing.
Correlation: Results
Summary
Pearson's r  Variables                            Strength of Correlation            Tests (N)  Locales  Significance (p value <)
 0.82        Adequacy & Fluency                   Very strong positive relationship  182        22       0.0001
 0.77        Adequacy & P Delta                   Very strong positive relationship  23         9        0.0001
 0.71        Fluency & P Delta                    Very strong positive relationship  23         9        0.00015
 0.55        Cognitive Effort Rank & PE Distance  Strong positive relationship       16         10       0.027
 0.41        Fluency & BLEU                       Strong positive relationship       146        22       0.0001
 0.26        Adequacy & BLEU                      Weak positive relationship         146        22       0.0015
 0.24        BLEU & P Delta                       Weak positive relationship         106        26       0.012
 0.13        Numbers of Errors & PE Distance      No or negligible relationship      16         10       ns
-0.30        Predominant Error & BLEU             Moderate negative relationship     63         13       0.017
-0.32        Cognitive Effort Rank & PE Delta     Moderate negative relationship     20         10       ns
-0.41        Numbers of Errors & BLEU             Strong negative relationship       63         20       0.00085
-0.41        Adequacy & PE Distance               Strong negative relationship       38         13       0.011
-0.42        PE Distance & P Delta                Strong negative relationship       72         27       0.00024
-0.70        Fluency & PE Distance                Very strong negative relationship  38         13       0.0001
-0.81        BLEU & PE Distance                   Very strong negative relationship  75         27       0.0001
(ns = not significant)
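For readers who want to reproduce figures like those in the summary, here is a minimal sketch of how Pearson's r and a matching test statistic can be computed. The slides do not say which tool or significance test was used, so the t statistic below is an assumption, and the adequacy and fluency scores are invented purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r != 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical scores, invented for illustration (not taken from the database).
adequacy = [2.0, 3.0, 3.5, 4.0, 4.5]
fluency = [1.5, 2.5, 3.5, 3.5, 4.5]
r = pearson_r(adequacy, fluency)

# Statistic for the table row with r = 0.55 and N = 16 tests.
t = t_statistic(0.55, 16)
print(f"r = {r:.2f}, t = {t:.2f}")
```

Looking the t value up in a two-tailed t distribution with N - 2 degrees of freedom gives a p value. For small N, such as the rows with 16 or 20 tests, even a moderate r may not reach significance, which may explain the "ns" entries.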
Takeaways
The strongest correlations were found between:
• Adequacy & Fluency
• BLEU and PE Distance
• Adequacy & Productivity Delta
• Fluency & Productivity Delta
• Fluency & PE Distance
• The Human Evaluations come out as stronger indicators for potential post-editing productivity gains than Automatic metrics.
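The ranking above can be recovered from the summary table by sorting on the absolute value of r. The sketch below does that and also maps each r to a verbal strength label. The cut-offs (0.7, 0.4, 0.3, 0.2) are not stated on the slides; they are an assumption inferred from the labels in the table.

```python
# Rows copied from the summary table: (r, variable pair, label on the slide).
ROWS = [
    (0.82, "Adequacy & Fluency", "Very strong positive relationship"),
    (0.77, "Adequacy & P Delta", "Very strong positive relationship"),
    (0.71, "Fluency & P Delta", "Very strong positive relationship"),
    (0.55, "Cognitive Effort Rank & PE Distance", "Strong positive relationship"),
    (0.41, "Fluency & BLEU", "Strong positive relationship"),
    (0.26, "Adequacy & BLEU", "Weak positive relationship"),
    (0.24, "BLEU & P Delta", "Weak positive relationship"),
    (0.13, "Numbers of Errors & PE Distance", "No or negligible relationship"),
    (-0.30, "Predominant Error & BLEU", "Moderate negative relationship"),
    (-0.32, "Cognitive Effort Rank & PE Delta", "Moderate negative relationship"),
    (-0.41, "Numbers of Errors & BLEU", "Strong negative relationship"),
    (-0.41, "Adequacy & PE Distance", "Strong negative relationship"),
    (-0.42, "PE Distance & P Delta", "Strong negative relationship"),
    (-0.70, "Fluency & PE Distance", "Very strong negative relationship"),
    (-0.81, "BLEU & PE Distance", "Very strong negative relationship"),
]


def strength_label(r):
    """Map r to a verbal label. Cut-offs are assumed, inferred from the table."""
    size = abs(r)
    if size < 0.2:
        return "No or negligible relationship"
    direction = "positive" if r > 0 else "negative"
    if size >= 0.7:
        band = "Very strong"
    elif size >= 0.4:
        band = "Strong"
    elif size >= 0.3:
        band = "Moderate"
    else:
        band = "Weak"
    return f"{band} {direction} relationship"


# The five pairs with the largest |r|, regardless of sign.
top_five = [pair for _, pair, _ in
            sorted(ROWS, key=lambda row: abs(row[0]), reverse=True)[:5]]
print(top_five)
```

Sorting by absolute value matters here: BLEU & PE Distance has a negative r but is the second-strongest relationship in the table.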
CORRELATIONS
Looking at subsets
Adequacy and Fluency versus BLEU
[Figure: Adequacy, Fluency & BLEU Correlation – Select Locales. Pearson's r (axis from -1.00 to 1.00) for Adequacy & BLEU and Fluency & BLEU, by locale: da_DK, de_DE, es_ES, es_LA, fr_CA, fr_FR, it_IT, ja_JP, ko_KR, pt_BR, ru_RU, zh_CN.]
*Adequacy, Fluency and BLEU correlation for locales with 4 or more test sets
Although the test sets here are too small to be statistically relevant,
the correlations seem to vary significantly between locales.
Would this be maintained with more data, and what are the reasons
for the differences?
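A per-locale breakdown like the one charted above can be sketched as follows. The minimum of 4 test sets comes from the chart footnote; the function name, the record layout and all scores are hypothetical and only illustrate the grouping step.

```python
import math
from collections import defaultdict

MIN_TEST_SETS = 4  # threshold taken from the chart footnote


def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def per_locale_r(records, min_sets=MIN_TEST_SETS):
    """Return {locale: (r, n)}; r is None when a locale has too few test sets."""
    grouped = defaultdict(list)
    for locale, human_score, bleu in records:
        grouped[locale].append((human_score, bleu))
    result = {}
    for locale, pairs in grouped.items():
        n = len(pairs)
        if n < min_sets:
            # Too few test sets: report the count but no correlation.
            result[locale] = (None, n)
            continue
        xs, ys = zip(*pairs)
        result[locale] = (pearson_r(xs, ys), n)
    return result


# Hypothetical records (locale, fluency score, BLEU), invented for illustration.
records = [
    ("de_DE", 2.5, 30.0), ("de_DE", 3.0, 35.0),
    ("de_DE", 3.5, 33.0), ("de_DE", 4.0, 42.0),
    ("ja_JP", 2.0, 20.0), ("ja_JP", 3.0, 25.0),
]
summary = per_locale_r(records)
print(summary)
```

Returning the number of test sets alongside each r keeps the sample size visible, which matters because correlations on four or five points can swing widely.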
Looking at subsets, II
Adequacy and Fluency versus PE Distance
Fluency and PE distance across all locales have a cumulative Pearson’s r of -0.70, a very strong
negative relationship
Adequacy and PE distance across all locales have a cumulative Pearson’s r of -0.41, a strong
negative relationship
[Figure: Adequacy, Fluency and PE Distance Correlation. Pearson's r (axis from -1.00 to 1.00) for Adequacy & PE Distance and Fluency & PE Distance, by locale: de_DE, es_ES/LA, fr_FR/CA, it_IT, pt_BR.]
Looking at a few select locales with the highest numbers of tests, it looks
more varied again.
Outliers: Results
Based on some of the data shown previously, and from the
point of view of consistent results versus outliers:
• For Human Evaluations of raw MT output, the inter-annotator agreement was consistent in terms of scores (same test set and language)
• Metrics based on human effort (productivity delta) are less consistent and might include significant variations between individuals (same test set and language)
Further Questions
Based on the premise that there are significant variations
between different post-editors…
… and with the aim of learning from individual behaviors and
predicting future productivity gains, we ask ourselves two
questions:
• What circumstances or variables most reliably facilitate
good-quality, highly productive post-editing?
• Do conditions and parameters outside the post-editor’s
control facilitate or hamper his or her success?
Survey
Q1: What is your primary target language?
Q2: What is your background?
Q3: How many years of experience do you have?
Q4: How is your work environment?
Q5: Which of the following CAT tools have you worked with?
Q6: What is your level of proficiency on the CAT tool(s) you use?
Q7: What is your translation methodology?
Q8: How do you primarily enter text?
Q9: What are your quality assurance and automation processes?
Q10: What do you consider most important in your assignments?
5 languages (DE, FR, JP, PTBR, HU)
38 linguists (belonging to 14 different teams)
Q3: How many years of experience do you have?
Probably less surprising…
• Except for 1 respondent, all respondents have more experience with translation than with post-editing
• The overall correlation between translation experience and post-editing experience is “strong”
However, looking at correlations by locale:
German: very strong
French: weak
Japanese: weak
PTBR: strong
Hungarian: weak
• This suggests that for German and Brazilian Portuguese only, the overall experience as a professional translator (whether junior or senior) gives us insights into how much post-editing experience to expect. For the other 3 locales, profiles are more varied.
Q5: Which of the following CAT tools have you worked with? Please select all that apply
The choice of CAT tool is to some extent dependent on the client requirement, but the data shows that all locales and respondents use a broad range of CAT tools for their work.
On average, respondents use or are familiar with 6-8 different CAT tools.
There is a slight trend that junior translators use or are familiar with more CAT tools than senior translators.
All respondents claim to be proficient and/or expert in their most frequently used CAT tool.
6 out of 8 Hungarian respondents call themselves “Experts”
3 out of 8 Germans
4 out of 9 French
1 out of 7 PTBR
None of the Japanese respondents (despite having, on average, the most translation experience)
Q7: Please evaluate the following statements on translation methodology
• Of the 5 locales, the French respondents stand out as a very homogeneous group:
- Rarely making use of any pre-processing steps
- Never using free MT tools
- Never using internal MT tools
• The Japanese, Brazilian and Hungarian respondents are more likely to perform pre-processing steps
• Japanese translators appear to copy to Word more than any other locale
• Hungarian translators were the only group with almost half of the respondents never doing draft translations first, but working segment by segment
… Q7: Please evaluate the following statements on translation methodology
Looking at respondents who Always / Frequently perform any of the 5 proposed actions,
- There was no clear trend with regard to years of translation experience
- There was no clear trend with regard to background
- There was no clear trend between resources working in an office / at home etc.
With regard to text input methods,
• French and German translators seem to make more use of CAT tool shortcuts.
• Japanese requires the use of Input Method Editors.
Final Conclusions
• Romance languages are the best performers on MT.
• User Assistance is the most suitable content (apart from UGC).
• Translators do not improve homogeneously when moving to post-editing (some of them do not improve at all).
• It is more difficult to foresee post-editing effort than to assess the quality of raw MT. The human effort is still the most variable aspect.
• In some locales (Germany, Brazil) “senior translators” accept post-editing as much as junior translators might do.
• Our French linguists seem to use less automation in their processes.
Next Projects
White Papers: Two white papers elaborating on the approach and results of the analysis of the Database will be published in the near future.
More research: We continue adding data to our Database; we have also included the survey in our hand-off material when doing productivity tests, with the aim of gaining more insights into the post-editors' background.
www.welocalize.com
THANK YOU!
laura.casanellas@welocalize.com
lena.marg@welocalize.com

More Related Content

What's hot

Collective sensing
Collective sensingCollective sensing
Collective sensing
mahdikianirad1
 
COLING 2014: Joint Opinion Relation Detection Using One-Class Deep Neural Net...
COLING 2014: Joint Opinion Relation Detection Using One-Class Deep Neural Net...COLING 2014: Joint Opinion Relation Detection Using One-Class Deep Neural Net...
COLING 2014: Joint Opinion Relation Detection Using One-Class Deep Neural Net...
Peinan ZHANG
 
Weakly Supervised Machine Reading
Weakly Supervised Machine ReadingWeakly Supervised Machine Reading
Weakly Supervised Machine Reading
Isabelle Augenstein
 
Evaluation of hindi english mt systems, challenges and solutions
Evaluation of hindi english mt systems, challenges and solutionsEvaluation of hindi english mt systems, challenges and solutions
Evaluation of hindi english mt systems, challenges and solutions
Sajeed Mahaboob
 
Word embedding
Word embedding Word embedding
Word embedding
ShivaniChoudhary74
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
Dev Sahu
 
Open vocabulary problem
Open vocabulary problemOpen vocabulary problem
Open vocabulary problem
JaeHo Jang
 
Assessing Virtual Assistant Capabilities with Italian Dysarthric Speech
Assessing Virtual Assistant Capabilities with Italian Dysarthric SpeechAssessing Virtual Assistant Capabilities with Italian Dysarthric Speech
Assessing Virtual Assistant Capabilities with Italian Dysarthric Speech
Luigi De Russis
 
IRJET- Vernacular Language Spell Checker & Autocorrection
IRJET- Vernacular Language Spell Checker & AutocorrectionIRJET- Vernacular Language Spell Checker & Autocorrection
IRJET- Vernacular Language Spell Checker & Autocorrection
IRJET Journal
 
Machine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptMachine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.ppt
butest
 

What's hot (10)

Collective sensing
Collective sensingCollective sensing
Collective sensing
 
COLING 2014: Joint Opinion Relation Detection Using One-Class Deep Neural Net...
COLING 2014: Joint Opinion Relation Detection Using One-Class Deep Neural Net...COLING 2014: Joint Opinion Relation Detection Using One-Class Deep Neural Net...
COLING 2014: Joint Opinion Relation Detection Using One-Class Deep Neural Net...
 
Weakly Supervised Machine Reading
Weakly Supervised Machine ReadingWeakly Supervised Machine Reading
Weakly Supervised Machine Reading
 
Evaluation of hindi english mt systems, challenges and solutions
Evaluation of hindi english mt systems, challenges and solutionsEvaluation of hindi english mt systems, challenges and solutions
Evaluation of hindi english mt systems, challenges and solutions
 
Word embedding
Word embedding Word embedding
Word embedding
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
 
Open vocabulary problem
Open vocabulary problemOpen vocabulary problem
Open vocabulary problem
 
Assessing Virtual Assistant Capabilities with Italian Dysarthric Speech
Assessing Virtual Assistant Capabilities with Italian Dysarthric SpeechAssessing Virtual Assistant Capabilities with Italian Dysarthric Speech
Assessing Virtual Assistant Capabilities with Italian Dysarthric Speech
 
IRJET- Vernacular Language Spell Checker & Autocorrection
IRJET- Vernacular Language Spell Checker & AutocorrectionIRJET- Vernacular Language Spell Checker & Autocorrection
IRJET- Vernacular Language Spell Checker & Autocorrection
 
Machine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptMachine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.ppt
 

Viewers also liked

MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
Welocalize
 
Rating Evaluation Methods through Correlation MTE 2014 Workshop May 2014
Rating Evaluation Methods through Correlation MTE 2014 Workshop May 2014Rating Evaluation Methods through Correlation MTE 2014 Workshop May 2014
Rating Evaluation Methods through Correlation MTE 2014 Workshop May 2014
Welocalize
 
Safaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-Editing
Safaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-EditingSafaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-Editing
Safaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-Editing
Welocalize
 
2013 CHAT tcworld tekom Welocalize Teaminology
2013 CHAT tcworld tekom Welocalize Teaminology 2013 CHAT tcworld tekom Welocalize Teaminology
2013 CHAT tcworld tekom Welocalize Teaminology
Welocalize
 
Tools-Driven Content Curation & Engine Training ATMA 2014
Tools-Driven Content Curation & Engine Training ATMA 2014Tools-Driven Content Curation & Engine Training ATMA 2014
Tools-Driven Content Curation & Engine Training ATMA 2014
Welocalize
 
Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
 Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA... Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
Welocalize
 
How Much Cake to Eat: The Case for Targeted MT Engines
How Much Cake to Eat: The Case for Targeted MT EnginesHow Much Cake to Eat: The Case for Targeted MT Engines
How Much Cake to Eat: The Case for Targeted MT Engines
Welocalize
 
An MT Journey Intuit and Welocalize Localization World 2013
An MT Journey Intuit and Welocalize Localization World 2013An MT Journey Intuit and Welocalize Localization World 2013
An MT Journey Intuit and Welocalize Localization World 2013
Welocalize
 
MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014
Welocalize
 
EAMT Presentation by Welocalize Olga Beregovaya May 2015
EAMT Presentation by Welocalize Olga Beregovaya May 2015EAMT Presentation by Welocalize Olga Beregovaya May 2015
EAMT Presentation by Welocalize Olga Beregovaya May 2015
Welocalize
 
Better translations through automated source and post edit analysis
Better translations through automated source and post edit analysisBetter translations through automated source and post edit analysis
Better translations through automated source and post edit analysis
Welocalize
 
Stephane Domisse (John Deere) at the Industry Leaders Forum 2015
Stephane Domisse (John Deere) at the Industry Leaders Forum 2015Stephane Domisse (John Deere) at the Industry Leaders Forum 2015
Stephane Domisse (John Deere) at the Industry Leaders Forum 2015
TAUS - The Language Data Network
 
Localizing for Travel: Diverse Solutions for Diverse Needs by Laura Casanell...
Localizing for Travel: Diverse Solutions for Diverse Needs by Laura Casanell...Localizing for Travel: Diverse Solutions for Diverse Needs by Laura Casanell...
Localizing for Travel: Diverse Solutions for Diverse Needs by Laura Casanell...
Welocalize
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
Welocalize
 
MT Quality Evaluations: From Test Environment to Production
MT Quality Evaluations: From Test Environment to ProductionMT Quality Evaluations: From Test Environment to Production
MT Quality Evaluations: From Test Environment to Production
Welocalize
 

Viewers also liked (15)

MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
 
Rating Evaluation Methods through Correlation MTE 2014 Workshop May 2014
Rating Evaluation Methods through Correlation MTE 2014 Workshop May 2014Rating Evaluation Methods through Correlation MTE 2014 Workshop May 2014
Rating Evaluation Methods through Correlation MTE 2014 Workshop May 2014
 
Safaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-Editing
Safaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-EditingSafaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-Editing
Safaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-Editing
 
2013 CHAT tcworld tekom Welocalize Teaminology
2013 CHAT tcworld tekom Welocalize Teaminology 2013 CHAT tcworld tekom Welocalize Teaminology
2013 CHAT tcworld tekom Welocalize Teaminology
 
Tools-Driven Content Curation & Engine Training ATMA 2014
Tools-Driven Content Curation & Engine Training ATMA 2014Tools-Driven Content Curation & Engine Training ATMA 2014
Tools-Driven Content Curation & Engine Training ATMA 2014
 
Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
 Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA... Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
 
How Much Cake to Eat: The Case for Targeted MT Engines
How Much Cake to Eat: The Case for Targeted MT EnginesHow Much Cake to Eat: The Case for Targeted MT Engines
How Much Cake to Eat: The Case for Targeted MT Engines
 
An MT Journey Intuit and Welocalize Localization World 2013
An MT Journey Intuit and Welocalize Localization World 2013An MT Journey Intuit and Welocalize Localization World 2013
An MT Journey Intuit and Welocalize Localization World 2013
 
MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014
 
EAMT Presentation by Welocalize Olga Beregovaya May 2015
EAMT Presentation by Welocalize Olga Beregovaya May 2015EAMT Presentation by Welocalize Olga Beregovaya May 2015
EAMT Presentation by Welocalize Olga Beregovaya May 2015
 
Better translations through automated source and post edit analysis
Better translations through automated source and post edit analysisBetter translations through automated source and post edit analysis
Better translations through automated source and post edit analysis
 
Stephane Domisse (John Deere) at the Industry Leaders Forum 2015
Stephane Domisse (John Deere) at the Industry Leaders Forum 2015Stephane Domisse (John Deere) at the Industry Leaders Forum 2015
Stephane Domisse (John Deere) at the Industry Leaders Forum 2015
 
Localizing for Travel: Diverse Solutions for Diverse Needs by Laura Casanell...
Localizing for Travel: Diverse Solutions for Diverse Needs by Laura Casanell...Localizing for Travel: Diverse Solutions for Diverse Needs by Laura Casanell...
Localizing for Travel: Diverse Solutions for Diverse Needs by Laura Casanell...
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
 
MT Quality Evaluations: From Test Environment to Production
MT Quality Evaluations: From Test Environment to ProductionMT Quality Evaluations: From Test Environment to Production
MT Quality Evaluations: From Test Environment to Production
 

Similar to Welocalize EAMT 2014 Presentation Assumptions, Expectations and Outliers in Post-Editing

Auditing_COAs_TranslationSilviaZaragoza
Auditing_COAs_TranslationSilviaZaragozaAuditing_COAs_TranslationSilviaZaragoza
Auditing_COAs_TranslationSilviaZaragoza
Neuropsychological Research Organization s.l
 
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize
 
Tech capabilities with_sa
Tech capabilities with_saTech capabilities with_sa
Tech capabilities with_sa
Robert Martin
 
MT and Post Editing in master's level translation education
MT and Post Editing in master's level translation education MT and Post Editing in master's level translation education
MT and Post Editing in master's level translation education
Jakub Absolon
 
Communication and Technology © 2016 South University .docx
Communication and Technology © 2016 South University .docxCommunication and Technology © 2016 South University .docx
Communication and Technology © 2016 South University .docx
durantheseldine
 
5. bleu
5. bleu5. bleu
Relating language examinations to the common European reference levels of lan...
Relating language examinations to the common European reference levels of lan...Relating language examinations to the common European reference levels of lan...
Relating language examinations to the common European reference levels of lan...
Nelly Zafeiriades
 
Assessing the quality of doctor consultations using ML
Assessing the quality of doctor consultations using MLAssessing the quality of doctor consultations using ML
Assessing the quality of doctor consultations using ML
GDG Cloud Bengaluru
 
The Wisdom of the Few @SIGIR09
The Wisdom of the Few @SIGIR09The Wisdom of the Few @SIGIR09
The Wisdom of the Few @SIGIR09
Xavier Amatriain
 
Pedersen acl2011-business-meeting
Pedersen acl2011-business-meetingPedersen acl2011-business-meeting
Pedersen acl2011-business-meeting
University of Minnesota, Duluth
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happy
Iconic Translation Machines
 
CHI'07: Biases in Human Estimation of Interruptibility
CHI'07: Biases in Human Estimation of InterruptibilityCHI'07: Biases in Human Estimation of Interruptibility
CHI'07: Biases in Human Estimation of Interruptibility
cpt.positive
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
Lifeng (Aaron) Han
 
Lepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metricLepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metric
Lifeng (Aaron) Han
 
Unsupervised Quality Estimation Model for English to German Translation and I...
Unsupervised Quality Estimation Model for English to German Translation and I...Unsupervised Quality Estimation Model for English to German Translation and I...
Unsupervised Quality Estimation Model for English to German Translation and I...
Lifeng (Aaron) Han
 
Implementation of New Systems in Healthcare Service Providers Ppt.pdf
Implementation of New Systems in Healthcare Service Providers Ppt.pdfImplementation of New Systems in Healthcare Service Providers Ppt.pdf
Implementation of New Systems in Healthcare Service Providers Ppt.pdf
studywriters
 
actfl-opi-reliability-2012
actfl-opi-reliability-2012actfl-opi-reliability-2012
actfl-opi-reliability-2012
Hyder Abadin
 
oGIP - Quality Analisys
oGIP - Quality AnalisysoGIP - Quality Analisys
oGIP - Quality Analisys
aiesecincolombia
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...
SDL
 
Top Trans Survey Translation Issues
Top Trans Survey Translation IssuesTop Trans Survey Translation Issues
Top Trans Survey Translation Issues
Raya Wasser
 

Similar to Welocalize EAMT 2014 Presentation Assumptions, Expectations and Outliers in Post-Editing (20)

Auditing_COAs_TranslationSilviaZaragoza
Auditing_COAs_TranslationSilviaZaragozaAuditing_COAs_TranslationSilviaZaragoza
Auditing_COAs_TranslationSilviaZaragoza
 
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
 
Tech capabilities with_sa
Tech capabilities with_saTech capabilities with_sa
Tech capabilities with_sa
 
MT and Post Editing in master's level translation education
MT and Post Editing in master's level translation education MT and Post Editing in master's level translation education
MT and Post Editing in master's level translation education
 
Communication and Technology © 2016 South University .docx
Communication and Technology © 2016 South University .docxCommunication and Technology © 2016 South University .docx
Communication and Technology © 2016 South University .docx
 
5. bleu
5. bleu5. bleu
5. bleu
 
Relating language examinations to the common European reference levels of lan...
Relating language examinations to the common European reference levels of lan...Relating language examinations to the common European reference levels of lan...
Relating language examinations to the common European reference levels of lan...
 
Assessing the quality of doctor consultations using ML
Assessing the quality of doctor consultations using MLAssessing the quality of doctor consultations using ML
Assessing the quality of doctor consultations using ML
 
The Wisdom of the Few @SIGIR09
The Wisdom of the Few @SIGIR09The Wisdom of the Few @SIGIR09
The Wisdom of the Few @SIGIR09
 
Pedersen acl2011-business-meeting
Pedersen acl2011-business-meetingPedersen acl2011-business-meeting
Pedersen acl2011-business-meeting
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happy
 
CHI'07: Biases in Human Estimation of Interruptibility
CHI'07: Biases in Human Estimation of InterruptibilityCHI'07: Biases in Human Estimation of Interruptibility
CHI'07: Biases in Human Estimation of Interruptibility
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 
Lepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metricLepor: augmented automatic MT evaluation metric
Lepor: augmented automatic MT evaluation metric
 
Unsupervised Quality Estimation Model for English to German Translation and I...
Unsupervised Quality Estimation Model for English to German Translation and I...Unsupervised Quality Estimation Model for English to German Translation and I...
Unsupervised Quality Estimation Model for English to German Translation and I...
 
Implementation of New Systems in Healthcare Service Providers Ppt.pdf
Implementation of New Systems in Healthcare Service Providers Ppt.pdfImplementation of New Systems in Healthcare Service Providers Ppt.pdf
Implementation of New Systems in Healthcare Service Providers Ppt.pdf
 
actfl-opi-reliability-2012
actfl-opi-reliability-2012actfl-opi-reliability-2012
actfl-opi-reliability-2012
 
oGIP - Quality Analisys
oGIP - Quality AnalisysoGIP - Quality Analisys
oGIP - Quality Analisys
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...
 
Top Trans Survey Translation Issues
Top Trans Survey Translation IssuesTop Trans Survey Translation Issues
Top Trans Survey Translation Issues
 

Welocalize EAMT 2014 Presentation Assumptions, Expectations and Outliers in Post-Editing

  • 1. Assumptions, Expectations and Outliers in Post-Editing Lena Marg, Laura Casanellas Language Tools Team @ EAMT Summit Dubrovnik, Croatia, June 2014
  • 2. Background on MT Programs
      MT programs vary with regard to:
         - Scope
         - Locales
         - Maturity
         - System Setup & Ownership
         - MT Solution used
         - Key Objective of using MT
         - Final Quality Requirements
         - Source Content
  • 3. MT Quality Evaluation
      1. Automatic Scores
         - Provided by the MT system (typically BLEU)
         - Provided by our internal scoring tool, weScore (range of metrics)
      2. Human Evaluation
         - Adequacy, scores 1-5
         - Fluency, scores 1-5
      3. Productivity Tests
         - Post-Editing versus Human Translation in iOmegaT, validated through final Quality Assessments
  • 4. The Database
      Objective: Establish correlations between these 3 evaluation approaches to
         - draw conclusions on predicting productivity gains in advance
         - see how and when best to use the different metrics
      Contents:
         - Content Type
         - Language Pair (English into XX)
         - MT engine provider & owner (i.e. who owns training & maintenance)
         - Metrics (BLEU & PE Distance, Adequacy & Fluency, Productivity deltas)
         - MT error analysis
         - Final QA scores
         - Level of experience of resource doing productivity test
      Data from 2013
  • 5. The Database: Data Used
      27 locales in total, with varying amounts of available data
      5 different MT systems (SMT / Hybrid
  • 6. Assumptions: Results
      General assumptions around best performing languages and content types were confirmed.
  • 7. Assumptions: Results, II
      Interesting results around correlation between productivity gained when translating and post-editing: not all resources improve equally (or at all) when changing activities from translation to post-editing.
  • 8. Correlation Results: Summary

      Pearson's r  Variables                              Strength of Correlation            Tests (N)  Locales  Significance (p <)
       0.82        Adequacy & Fluency                     Very strong positive relationship  182        22       0.0001
       0.77        Adequacy & P Delta                     Very strong positive relationship  23         9        0.0001
       0.71        Fluency & P Delta                      Very strong positive relationship  23         9        0.00015
       0.55        Cognitive Effort Rank & PE Distance    Strong positive relationship       16         10       0.027
       0.41        Fluency & BLEU                         Strong positive relationship       146        22       0.0001
       0.26        Adequacy & BLEU                        Weak positive relationship         146        22       0.0015
       0.24        BLEU & P Delta                         Weak positive relationship         106        26       0.012
       0.13        Numbers of Errors & PE Distance        No or negligible relationship      16         10       ns
      -0.30        Predominant Error & BLEU               Moderate negative relationship     63         13       0.017
      -0.32        Cognitive Effort Rank & PE Delta       Moderate negative relationship     20         10       ns
      -0.41        Numbers of Errors & BLEU               Strong negative relationship       63         20       0.00085
      -0.41        Adequacy & PE Distance                 Strong negative relationship       38         13       0.011
      -0.42        PE Distance & P Delta                  Strong negative relationship       72         27       0.00024
      -0.70        Fluency & PE Distance                  Very strong negative relationship  38         13       0.0001
      -0.81        BLEU & PE Distance                     Very strong negative relationship  75         27       0.0001
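
      The summary on slide 8 reports Pearson's r, a strength label, and a significance level for each pair of metrics. The following is a minimal sketch of how such figures could be computed, not the authors' tooling. The function names are hypothetical, the strength cut-offs (0.70, 0.40, 0.30, 0.20) are assumptions inferred from the labels on the slide, and the sample scores are invented for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


def strength_label(r):
    """Map r to a verbal label.

    ASSUMPTION: these cut-offs are inferred from the labels on the slide;
    the source does not state the thresholds it used.
    """
    size = abs(r)
    if size < 0.20:
        return "no or negligible"
    if size >= 0.70:
        band = "very strong"
    elif size >= 0.40:
        band = "strong"
    elif size >= 0.30:
        band = "moderate"
    else:
        band = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{band} {direction}"


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))


# Invented example data: average adequacy scores and productivity deltas (%)
adequacy = [2.5, 3.0, 3.5, 4.0, 4.5]
p_delta = [5.0, 12.0, 10.0, 25.0, 30.0]
r = pearson_r(adequacy, p_delta)
print(round(r, 2), strength_label(r), round(t_statistic(r, len(adequacy)), 2))
```

      The t statistic is what a significance test on r is usually based on. Turning it into a p value needs the Student t distribution, which a statistics package would normally supply.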
  • 9. Takeaways: Correlations
      The strongest correlations were found between:
         - Adequacy & Fluency
         - BLEU & PE Distance
         - Adequacy & Productivity Delta
         - Fluency & Productivity Delta
         - Fluency & PE Distance
      The Human Evaluations come out as stronger indicators of potential post-editing productivity gains than Automatic metrics.
  • 10. Looking at subsets: Adequacy and Fluency versus BLEU
      [Chart: "Adequacy, Fluency & BLEU Correlation – Select Locales". Pearson's r (axis from -1.00 to 1.00) for Adequacy & BLEU and Fluency & BLEU, for locales with 4 or more test sets: da_DK, de_DE, es_ES, es_LA, fr_CA, fr_FR, it_IT, ja_JP, ko_KR, pt_BR, ru_RU, zh_CN.]
      Although the test sets here are too small to be statistically relevant, the correlations seem to vary significantly between locales. Would this be maintained with more data, and what are the reasons for the differences?
  • 11. Looking at subsets, II: Adequacy and Fluency versus PE Distance
      Fluency and PE Distance across all locales have a cumulative Pearson's r of -0.70, a very strong negative relationship.
      Adequacy and PE Distance across all locales have a cumulative Pearson's r of -0.41, a strong negative relationship.
      [Chart: "Adequacy, Fluency and PE Distance Correlation". Pearson's r (axis from -1.00 to 1.00) for Adequacy & PE Distance and Fluency & PE Distance, for de_DE, es_ES/LA, fr_FR/CA, it_IT, pt_BR.]
      Looking at a few select locales with the highest numbers of tests, the picture looks more varied again.
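
      The per-locale breakdowns on slides 10 and 11 can be sketched as grouping test results by locale and computing Pearson's r only where enough test sets exist. This is an illustrative sketch under stated assumptions: the record layout, the function names, and the sample values are hypothetical, and only the "4 or more test sets" cut-off comes from slide 10.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r, or None when one variable has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def correlation_by_locale(records, min_tests=4):
    """Group (locale, score_a, score_b) records and correlate per locale.

    Locales with fewer than min_tests records are skipped, mirroring the
    "4 or more test sets" filter mentioned on slide 10.
    """
    by_locale = defaultdict(list)
    for locale, score_a, score_b in records:
        by_locale[locale].append((score_a, score_b))

    result = {}
    for locale, pairs in by_locale.items():
        if len(pairs) < min_tests:
            continue
        xs, ys = zip(*pairs)
        result[locale] = pearson_r(xs, ys)
    return result


# Invented records: (locale, adequacy score, BLEU score)
records = [
    ("de_DE", 3.0, 20.0), ("de_DE", 3.5, 25.0),
    ("de_DE", 4.0, 30.0), ("de_DE", 4.5, 35.0),
    ("fr_FR", 3.0, 35.0), ("fr_FR", 3.5, 30.0),
    ("fr_FR", 4.0, 25.0), ("fr_FR", 4.5, 20.0),
    ("ja_JP", 2.5, 15.0), ("ja_JP", 3.0, 18.0),  # too few tests, skipped
]
per_locale = correlation_by_locale(records)
for locale, r in sorted(per_locale.items()):
    print(locale, r)
```

      With so few points per locale, any r value is fragile, which matches the caveat on slide 10 that the subsets are too small to be statistically relevant.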
  • 12. Outliers: Results
      Based on some of the data shown previously, and from the point of view of consistent results versus outliers:
         - For Human Evaluations of raw MT output, the inter-annotator agreement was consistent in terms of scores (same test set and language).
         - Metrics based on human effort (productivity delta) are less consistent and might include significant variations between individuals (same test set and language).
  • 13. Further Questions
      Based on the premise that there are significant variations between different post-editors, and with the aim of learning from individual behaviors and predicting future productivity gains, we ask ourselves two questions:
         - What circumstances or variables most reliably facilitate good-quality, highly productive post-editing?
         - Do conditions and parameters outside the post-editor's control facilitate or hamper his or her success?
  • 14. Survey
      Q1: What is your primary target language?
      Q2: What is your background?
      Q3: How many years' experience do you have?
      Q4: How is your work environment?
      Q5: Which of the following CAT tools have you worked with?
      Q6: What is your level of proficiency on the CAT tool(s) you use?
      Q7: What is your translation methodology?
      Q8: How do you primarily enter text?
      Q9: What are your quality assurance and automation processes?
      Q10: What do you consider most important in your assignments?
      5 languages (DE, FR, JP, PTBR, HU); 38 linguists (belonging to 14 different teams)
  • 15. Q3: How many years' experience do you have?
      Probably less surprising…
         - Except for 1 respondent, all respondents have more experience with translation than with post-editing.
         - The overall correlation between translation experience and post-editing experience is "strong".
      However, looking at correlations by locale:
         - German: very strong
         - French: weak
         - Japanese: weak
         - PTBR: strong
         - Hungarian: weak
      This suggests that for German and Brazilian Portuguese only, the overall experience as a professional translator (whether junior or senior) gives us insights into how much post-editing experience to expect. For the other 3 locales, profiles are more varied.
  • 16. Q5: Which of the following CAT tools have you worked with? Please select all that apply
      The choice of CAT tool is to some extent dependent on the client requirement, but the data shows that all locales and respondents use a broad range of CAT tools for their work.
      On average, respondents use or are familiar with 6-8 different CAT tools.
      There is a slight trend that junior translators use or are familiar with more CAT tools than senior translators.
      All respondents claim to be proficient and/or expert in their most frequently used CAT tool. Respondents calling themselves "Experts":
         - Hungarian: 6 out of 8
         - German: 3 out of 8
         - French: 4 out of 9
         - PTBR: 1 out of 7
         - Japanese: none (despite having the most translation experience on average)
  • 17. Q7: Please evaluate the following statements on translation methodology
         - Of the 5 locales, the French respondents stand out as a very homogeneous group:
              - rarely making use of any pre-processing steps
              - never using free MT tools
              - never using internal MT tools
         - The Japanese, Brazilian and Hungarian respondents are more likely to perform pre-processing steps.
         - Japanese translators appear to copy to Word more than any other locale.
         - Hungarian translators were the only group in which almost half of the respondents never do draft translations first, but work segment by segment.
  • 18. Q7: Please evaluate the following statements on translation methodology
      Looking at respondents who Always / Frequently perform any of the 5 proposed actions:
         - There was no clear trend with regard to years of translation experience.
         - There was no clear trend with regard to background.
         - There was no clear trend between resources working in an office, at home, etc.
      With regard to text input methods:
         - French and German translators seem to make more use of CAT tool shortcuts.
         - Japanese requires the use of Input Method Editors.
         - …
  • 19. Final Conclusions
         - Romance languages are the best performers on MT.
         - User Assistance is the most suitable content (apart from UGC).
         - Translators do not improve homogeneously when moving to post-editing (some of them do not improve at all).
         - It is more difficult to foresee post-editing effort than to assess the quality of raw MT. The human effort is still the most variable aspect.
         - In some locales (Germany, Brazil) "senior translators" accept post-editing as much as junior translators might do.
         - Our French linguists seem to use less automation in their processes.
  • 20. Next Projects
      White Papers: Two white papers elaborating on the approach and results of the analysis of the Database will be published in the near future. www.welocalize.com
      More research: We continue adding data to our Database; we have also included the survey in our hand-off material when doing productivity tests, with the aim of gaining more insights into the post-editors' background.