Enterprise MT Content Drift: 
Challenges, Impacts and Advanced 
Solutions 
Alon Lavie & Olga Beregovaya 
AMTA 2014 
October 25, 2014
Outline 
 Welocalize MT Program for eDell: workflow, processes, challenges 
 Safaba EMTGlobal MT 
 Enterprise Content Drift: the evidence 
 Identifying Content Drift: Indicators and their correlation 
 Overnight Retraining: the approach 
 Overnight Retraining: the Pilot Study and Results 
 EMTGlobal v4.0: Advanced Rapid Adaptation 
 Welocalize: Expected Impacts 
 Summary and Conclusions
Welocalize Approach to MT Program Deployment
Content Sent Through the TMT Process 
 eDell content types handled through the MT PE Process vary 
between different – mainly Marketing – content categories: 
 Partner Marketing 
 Global Support 
 Channel Support 
 Consumer Marketing Communication 
 Corporate HR 
 Customer Proposals 
 Product Launches 
 Global eDell Web content 
 The daily/weekly/monthly volumes per content type vary depending 
on the Dell business priorities
Translation with MT Post-Editing for Dell 
 Translation Setup: 
 Source document is pre-translated by translation memory matches augmented 
by Safaba MT 
 Translation Memory “fuzzy match” threshold typically 75-85% 
 Pre-translations are presented to human translator as starting point for editing; 
translators can use or ignore the suggested pre-translations 
 Currently 28 languages go through the TM/MT workflow 
 Post-Editing Productivity Assessment: 
 Contrastive translation projects that measure and compare translation team 
productivity with MT post-editing versus translation using just translation 
memories 
 Productivity measured by contrasting translated words per hour under both 
conditions: MT-PE throughput / HT throughput
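As a rough illustration of the throughput comparison above, the sketch below computes the MT-PE / HT throughput ratio and a percentage delta. The function names, the sample figures and the definition of the delta as "ratio minus one" are assumptions made for illustration; the slides only state that the two throughputs are contrasted.

```python
def words_per_hour(words: int, hours: float) -> float:
    """Throughput in translated words per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return words / hours


def productivity_ratio(mtpe_wph: float, ht_wph: float) -> float:
    """MT-PE throughput divided by HT throughput, as stated on the slide."""
    if ht_wph <= 0:
        raise ValueError("HT throughput must be positive")
    return mtpe_wph / ht_wph


def productivity_delta(mtpe_wph: float, ht_wph: float) -> float:
    """Assumed definition: relative gain over HT, i.e. ratio - 1."""
    return productivity_ratio(mtpe_wph, ht_wph) - 1.0


# Hypothetical figures, not taken from the Dell program.
ht = words_per_hour(words=4000, hours=10)   # 400 words/hour
mtpe = words_per_hour(words=4000, hours=8)  # 500 words/hour
print(f"ratio = {productivity_ratio(mtpe, ht):.2f}")
print(f"delta = {productivity_delta(mtpe, ht):+.1%}")
```

Under this assumed definition, a negative delta means that post-editing was slower than translating with translation memories alone.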
MT Post-Editing Productivity Assessment 
 Evaluated by Welocalize in the context of the Dell MT Program 
[Figure: chart with axes 0.00% to 300.00% and 0.00 to 90.00 BLEU; series: PE Distance, Productivity Delta]
Productivity Gains through Retraining 
LOCALE ID Initial engine Retrained Engine 
CH_TI -11.75% 4.7% 
CS-CZ 37.53% - 
DA-DK 88.67% - 
DE_DE 20.24% 31.2% 
EL_EL 18.36% 51.3% 
ES_ES 28.5% 
ES_LX 2.31% 99.3% 
FI-FI 102.80% 
FR_FR 21.73% 46.4% 
HE-IL 25.43% - 
HU-HU 32.53% - 
IT-IT 84.89% - 
JA-JP 20.62% - 
KO-KR 13.83% - 
NL-NL 27.85% - 
NO-NO 59.71% - 
PL-PL 33.83% - 
PT_BR 23.77% 31.3% 
PT_PT 30.24% 27.7% 
RU_RU 22.88% 36.6%
A Solid Engine Translates into Solid Gains 
[Figure: Productivity Delta and Fluency. Productivity Delta (-20% to 100%) against Human Evaluation Fluency Score (1-5)]
[Figure: Productivity Delta and Adequacy. Productivity Delta (-20% to 100%) against Human Evaluation Adequacy Score (1-5)]
[Figure: Adequacy, Fluency and PE Distance Correlation (-1.00 to 1.00) for de_DE, es_ES/LA, fr_FR/CA, it_IT, pt_BR; series: Adequacy & PE Distance, Fluency & PE Distance]
1.5 Years Later - Welocalize Post-Editing Adoption
Dell - “High-traffic” MT Program 
Q4/Q3/Q2/Q1 Volumes 
Quarterly MT throughput volumes allow Welocalize and Safaba to accumulate post-edits sufficient for far more frequent re-trainings than scheduled maintenance engine updates
Final Post-Edited Output Quality 
MT quality results are consistently above target – engine degradation will force 
translators to compensate with additional effort 
Continuously above target, monitoring trend 
[Figure: final post-edited output quality by week, Week 40 to Week 53; series: Result, Target; axis: 99.30% to 100.00%]
Safaba EMTGlobal 
Client-Specific MT Adaptation 
 The majority of the MT systems Safaba develops are specifically 
developed and optimized for specific client content types 
 Data Scenario: 
 Some amount of client-specific data: translation memories, terminology 
glossaries and monolingual data resources 
 Additional domain-specific and general background data resources: other 
client-specific content types, TAUS data, other general parallel and 
monolingual background data
Safaba EMTGlobal 
Client-Specific MT Adaptation 
 Safaba Suite of Adaptation Approaches: 
 Data selection, filtering and prioritization methods 
 Data mixture and interpolation methods 
 Model mixture and interpolation methods 
 Client-specific Automated Post-Editing (Language Optimization Engine) 
 Styling and Formatting post-processing modules 
 Terminology and DNT runtime overrides
Enterprise Content Drift 
 Client-specific Enterprise MT systems often degrade in performance over time 
for two main reasons: 
1. Client content, even in controlled-domains, gradually changes over time: 
new products, new terminology, new content developers 
2. The typical integrated setup of MT and translation memories: TMs are 
updated more frequently, so over time, only “harder” source segments are 
sent for translation to MT 
 Current Full MT system retraining is resource and time consuming: 
 MT systems are relatively static – they are fully retrained only periodically (typically 
only a couple of times per year) 
 The Result: MT accuracy for new projects declines over time, so post-editing productivity also declines over time 
 We see strong evidence of “content drift” over time with many of our clients, 
especially in post-editing setups
Evidence from Safaba EMTGlobal Systems for Dell MT Program: 
 BLEU scores before and after retraining on held out “recent” 
incremental data 
[Figure: BLEU scores (axis 0 to 70) on held-out recent incremental data; series: 2013, 2014]
Enterprise Content Drift
Enterprise Content Drift 
Evidence from a typical client-specific MT system: 
 EMTGlobal English-to-German Dell MT System: 
 February 2013 System: 565K client + 964K background segments 
 March 2014 System: 594K client + 6,795K background segments 
 Two test sets: 
 “Original” test set from February 2013 system build (1,200 segments) 
 “Incremental” test set extracted from incremental data (500 segments) 
 System Test Scores and Statistics: 
Lang  System      Gloss Inconsist.  Orig. BLEU  Orig. MET  Orig. TER  Orig. LEN  Orig. OOVs  Incr. BLEU  Incr. MET  Incr. TER  Incr. LEN  Incr. OOVs
DE    Feb. 2013   55.7%             51.0        63.4       38.2       101.2      63          41.7        56.6       45.0       101.2      107
DE    March 2014  24.8%             52.9        64.2       36.9       100.5      33          60.5        69.9       30.3       99.9       31
Enterprise Content Drift 
Analysis of Content Drift Over Time: 
 Three EMTGlobal MT systems for Dell: 
 English to Chinese, Spanish and German 
 Systems trained and deployed in February 2013 
 Test sets: 
 “Original” test set from February 2013 system build (1,200 
segments) 
 “Incremental” test set extracted from 2014 incremental data (500 
segments) 
 Data sets extracted from live Dell production projects in August 2013, December 2013 and March 2014, along with their post-edited references
Enterprise Content Drift 
Analysis of Content Drift Over Time: 
 BLEU Scores 
[Figure: BLEU scores (axis 0 to 70) for Chinese, Spanish and German; series: Feb, Aug, Dec, Mar, Inc-2013, Inc-2014]
Identifying Enterprise Content Drift 
Content Drift Indicators 
 Goal: Establish real-time quantifiable measures that are indicative of Enterprise 
Content Drift 
 Immediate: Available immediately at MT production time, prior to any post-editing of 
the MT output 
 Predictive: Strongly correlate with expected MT evaluation score and post-editing effort 
 Similar to real-time MT Quality Estimation scores, but specific to capturing content drift 
 Three Measures: 
 Core Out-of-Vocabulary (OOV) Type and Token fractions: 
 Fraction of source types (tokens) that were out-of-vocabulary in the core MT system (OOVs) 
 Source-side Unigram Coverage: 
 Fraction of source type (token) unigrams that were observed in the MT system training data 
 Source-side Trigram Coverage: 
 Fraction of source type (token) trigrams that were observed in the MT system training data
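The three measures above can be sketched in a few lines of Python. This is a minimal illustration, not Safaba's implementation: it assumes whitespace tokenization, treats the "core MT system" vocabulary as the set of unigrams seen in a toy training corpus, and uses hypothetical function names and example sentences.

```python
from typing import List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], n: int) -> List[NGram]:
    """All contiguous n-grams of a token list (empty if the list is too short)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def collect_ngrams(corpus: List[str], n: int) -> Set[NGram]:
    """Set of n-gram types observed in the (assumed) training data."""
    seen: Set[NGram] = set()
    for sentence in corpus:
        seen.update(ngrams(sentence.split(), n))
    return seen


def coverage(source: List[str], seen: Set[NGram], n: int) -> Tuple[float, float]:
    """Return (type fraction, token fraction) of source n-grams found in `seen`."""
    all_ngrams = [g for s in source for g in ngrams(s.split(), n)]
    if not all_ngrams:
        return 0.0, 0.0
    types = set(all_ngrams)
    type_frac = sum(g in seen for g in types) / len(types)
    token_frac = sum(g in seen for g in all_ngrams) / len(all_ngrams)
    return type_frac, token_frac


def oov_fractions(source: List[str], vocab: Set[NGram]) -> Tuple[float, float]:
    """OOV (type, token) fractions: the complement of unigram coverage against `vocab`."""
    type_cov, token_cov = coverage(source, vocab, 1)
    return 1.0 - type_cov, 1.0 - token_cov


# Toy data, invented for illustration only.
training = ["the new laptop ships today", "the laptop has a new battery"]
incoming = ["the new tablet ships today"]

unigrams_seen = collect_ngrams(training, 1)
trigrams_seen = collect_ngrams(training, 3)

print("OOV (type, token):", oov_fractions(incoming, unigrams_seen))
print("Unigram coverage (type, token):", coverage(incoming, unigrams_seen, 1))
print("Trigram coverage (type, token):", coverage(incoming, trigrams_seen, 3))
```

In this toy case a single unseen word ("tablet") leaves unigram coverage high but removes every trigram that contains it, which is one reason trigram coverage may react more sharply to new content than the OOV rate does.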
Identifying Enterprise Content Drift 
Content Drift Indicators 
Performance of Content Drift Indicators on Dell EMTGlobal Systems: 
 OOVs (Fraction of Tokens) 
[Figure: OOV token fraction (axis 0.00 to 6.00) for Chinese, Spanish and German; series: Feb, Aug, Dec, Mar, Inc-2013, Inc-2014]
Identifying Enterprise Content Drift 
Content Drift Indicators 
Performance of Content Drift Indicators on Dell EMTGlobal Systems: 
 Source Trigram Coverage 
[Figure: source trigram coverage (axis 0.00 to 80.00) for Chinese, Spanish and German; series: Feb, Aug, Dec, Mar, Inc-2013, Inc-2014]
“Overnight” Incremental Adaptation 
 Objective: Counter “content drift” and help maintain and accelerate post-editing 
productivity with fast and frequent incremental adaptation retraining 
 Setting: New additional post-edited client data is deposited and made available for 
adaptation in small incremental batches 
 Challenge: Full offline system retraining is slow and computationally intensive and 
can take several days 
 Safaba Solution: implement fast “light-weight” adaptations that can be executed, 
tested and deployed into production within hours (“overnight”) 
 Suffix-array variant of Moses supports rapid updating of indexed training data 
 Safaba Language Optimization Engine (automated post-editing module) supports rapid retraining 
 KenLM supports rapid rebuilding of language models 
 Currently in pilot testing with Welocalize and Dell
Safaba Overnight Retraining 
The Approach: 
Goal: Rapid MT System Adaptation using Incremental Data 
 Current Approach: Language Optimization Engine (LOE) Incremental Retraining 
 Safaba EMTGlobal MT systems include a core MT engine and a target-side Language Optimization 
Engine 
 Retraining the LOE component is fast – typically within a few hours 
 Not equivalent to full MT system retraining, but effective in closing the gap 
 New Approach: EMTGlobal v4.0 Advanced Adaptation Technology: 
 Supports significantly improved client-specific adaptation within the core MT engine 
 Supports rapid incremental retraining of core MT engines 
 Much closer to full MT system retraining, in a time frame similar to LOE retraining 
 Will be available in late Q4 of 2014
Safaba Overnight Retraining 
The Approach: 
 Full Solution: Overnight Retraining 
 Incremental data from post-edited MT projects is delivered to Safaba 
 Incremental system retraining is launched automatically, completed within hours 
 Newly-adapted version of the MT system is automatically tested and QAed for 
quality 
 Newly-adapted version of the MT system is deployed into production
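The four steps above can be pictured as a retrain, test, deploy loop with a quality gate. The sketch below is purely illustrative: the `Scores` type, the tolerance threshold and the injected `retrain`, `evaluate` and `deploy` callables are hypothetical stand-ins, and the slides do not describe how Safaba's automated testing and QA are actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

SegmentPair = Tuple[str, str]  # (source segment, post-edited target)


@dataclass(frozen=True)
class Scores:
    bleu: float  # higher is better
    ter: float   # lower is better


def passes_qa(baseline: Scores, candidate: Scores, tolerance: float = 0.5) -> bool:
    """Accept the candidate unless it regresses by more than `tolerance` points."""
    return (candidate.bleu >= baseline.bleu - tolerance
            and candidate.ter <= baseline.ter + tolerance)


def overnight_cycle(
    batch: List[SegmentPair],
    current_system: str,
    retrain: Callable[[str, List[SegmentPair]], str],
    evaluate: Callable[[str], Scores],
    deploy: Callable[[str], None],
) -> str:
    """Run one retrain -> test -> deploy cycle; return the system left in production."""
    if not batch:
        return current_system  # nothing delivered, nothing to do
    candidate = retrain(current_system, batch)
    if passes_qa(evaluate(current_system), evaluate(candidate)):
        deploy(candidate)
        return candidate
    return current_system  # keep the existing system if the candidate regresses


# Stubbed usage with invented scores.
fake_scores = {"v1": Scores(bleu=50.0, ter=40.0), "v1+batch": Scores(bleu=51.5, ter=39.0)}
deployed: List[str] = []
in_production = overnight_cycle(
    batch=[("source text", "post-edited text")],
    current_system="v1",
    retrain=lambda system, data: system + "+batch",
    evaluate=fake_scores.__getitem__,
    deploy=deployed.append,
)
print(in_production, deployed)
```

Passing the retraining, evaluation and deployment steps in as callables keeps the gate logic testable without access to a real MT system.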
Safaba Overnight Retraining 
The Pilot Project 
 Pilot project with Welocalize to assess impact of Overnight Retraining on Safaba 
EMTGlobal Dell MT systems, using samples of real post-edited translation project data 
 Setup: 
 Languages: English to Chinese, Spanish and German 
 Baseline Systems: 2014 retrained Dell EMTGlobal 3.0 MT systems 
 Incremental Data: Three batches of incremental data from live translation projects 
 Methodology: 
 Three versions of the MT systems: 
 Baseline 
 Baseline + Retrained on Data Set #1 
 Baseline + Retrained on Data Set #1 & #2 
 MT Evaluation: 
 Translate Data Set #3 (unseen) with the three versions of the MT system 
 Assess impact on translation performance using automated MT evaluation metrics 
 Additional analysis using Safaba “Content Drift Indicators”
Safaba Overnight Retraining 
Data 
            Original number of segments    Number of segments post-filtering
Pair        Set 1   Set 2   Set 3          Set 1   Set 2   Set 3
ENUS-ESXL   1108    4553    704            926     2411    528
ENUS-ZHCN   3191    2181    1328           1143    1084    714
ENUS-DEDE   3043    1220    2270           2325    977     1466
Pilot Results: Automated Metric Scores 
 English-to-Chinese: 
 Incremental Adaptation of Language Optimization Engine (LOE) 
 Incrementally retraining on Data Sets #1 & #2 results in gain of +3.0 BLEU points on Data Set 
#3 
[Figure: BLEU, METEOR and TER scores (axis 30 to 70) on Data Set #3; series: 2013 System, 2014 Baseline, Baseline+DS1, Baseline+DS1&2]
Safaba Overnight Retraining
Safaba Overnight Retraining 
Pilot Results: Content Drift Indicator Statistics 
 English-to-Chinese: 
 Incremental Adaptation of Language Optimization Engine (LOE) 
 Adding Data Sets #1 & #2 reduces Data Set #3 OOVs by 0.3%, improves unigram coverage 
by 0.36% and improves trigram coverage by 14.22% 
[Figure: OOV Tokens (axis 0.00% to 7.00%), Unigrams Covered and Trigrams Covered (axis 0.4 to 1); series: 2013 System, 2014 Baseline, Baseline+DS1, Baseline+DS1&2]
Preliminary Results: Advanced Adaptation with EMTGlobal v4.0 
 English-to-Chinese: 
 Incremental Adaptation with EMTGlobal v4.0 
 Incrementally retraining on Data Sets #1 & #2 results in gain of +6.8 BLEU points on Data 
Set #3 
[Figure: BLEU, METEOR and TER scores (axis 30 to 75) on Data Set #3; series: 2014 Baseline, Baseline+DS1&2]
Safaba Overnight Retraining
Safaba Overnight Retraining 
Summary of Pilot Results 
 Excellent results for English-to-Chinese! 
 Spanish and German results show no gain or loss in MT accuracy as a result of LOE 
incremental retraining with the available data sets 
 Performance on Data Set #3 remains completely flat with both retrainings according 
to all automated metrics 
 Data analysis with Content Drift Indicators reveals that Data Sets #1 & #2 for these 
two language pairs did not contain novel translations sufficient for improving MT 
performance on Data Set #3 
 No significant reduction in Data Set #3 OOVs 
 No significant improvement in coverage of source-side n-grams
Overnight Retraining Pilot Evaluation Setup 
Translators were asked to compare each engine iteration using the same source strings 
Day 1: read the MT output first. Then read the source text (ST). Then score the segment for Adequacy and Fluency 
Adequacy 
On a 4-point scale, rate how much of the meaning is rendered in the translation: 
4 Everything 
3 Most 
2 Little 
1 None 
Fluency 
On a 4-point scale, rate the extent to which the translation is grammatically well-formed, contains correct spellings, adheres to common use of terms, titles and names, is intuitively acceptable and can be sensibly interpreted by a native speaker: 
4 Flawless 
3 Good 
2 Disfluent 
1 Incomprehensible 
*Based on TAUS Adequacy/Fluency Guidelines 
Comparing the iterations: Compare the NEW MT output to that of the previous week and indicate with an X in the corresponding column whether it is better / worse / equal. 
If it is better or worse, indicate in the error categories & comment column what has improved or regressed.
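One way to tabulate such judgments is sketched below. The record layout, field names and sample scores are hypothetical; the slides do not say how Welocalize aggregated the evaluation sheets.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

VERDICTS = {"better", "worse", "equal"}


@dataclass(frozen=True)
class Judgment:
    adequacy: int     # 1 (None) to 4 (Everything)
    fluency: int      # 1 (Incomprehensible) to 4 (Flawless)
    vs_previous: str  # "better", "worse" or "equal"


def summarize(judgments: List[Judgment]) -> Dict[str, float]:
    """Mean adequacy/fluency plus counts of better/worse/equal verdicts."""
    if not judgments:
        raise ValueError("no judgments to summarize")
    for j in judgments:
        if not (1 <= j.adequacy <= 4 and 1 <= j.fluency <= 4):
            raise ValueError(f"score out of the 1-4 range: {j}")
        if j.vs_previous not in VERDICTS:
            raise ValueError(f"unknown verdict: {j.vs_previous!r}")
    counts = Counter(j.vs_previous for j in judgments)
    return {
        "mean_adequacy": mean(j.adequacy for j in judgments),
        "mean_fluency": mean(j.fluency for j in judgments),
        "better": counts["better"],
        "worse": counts["worse"],
        "equal": counts["equal"],
    }


# Invented sample sheet for one engine iteration.
sheet = [
    Judgment(adequacy=4, fluency=3, vs_previous="better"),
    Judgment(adequacy=3, fluency=3, vs_previous="equal"),
    Judgment(adequacy=2, fluency=3, vs_previous="equal"),
]
print(summarize(sheet))
```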
Interim Chinese Pilot Results - Welocalize 
• The human evaluation results we have at our disposal are a work in progress, based on evaluating a small subset of translated data and just one iteration of “overnight retraining” 
• The improvements observed by automated metrics are not yet reflected in the human assessment 
• Human evaluation results are consistent between Baseline+DS1 and DS2: no degradation is introduced, but from the translator perspective no significant change in quality is captured; this possibly requires a larger evaluation set or a different approach to evaluation string selection 
Translator feedback: improvement in fluency but no improvement in capturing the meaning of the whole sentence; punctuation has improved, but the translation still needs improvement; part of the sentence is now more fluent
Welocalize “Wins” from “Overnight Retraining” 
 We need to be more granular than “Quality” and look at 
“Relevance” (coverage and fluency will increase based on Safaba 
findings) 
 Our expected benefits from this approach, which needs to be in sync with sufficient daily volumes: 
 No need to wait for scheduled retrainings 
 Two things will happen – the translator gets more used to post-editing, 
and the MT engines catch up with the changes in the 
source content in the “live” mode 
 Benefit for the client – once the actual ongoing engine relevance 
statistics have been captured, we’ll be able to predict higher 
throughputs and offer better discounts
Summary and Conclusions 
 Enterprise Content Drift is a natural and frequent phenomenon in large-scale 
commercial MT implementation projects 
 Enterprise MT systems need to constantly adapt or else are likely to 
significantly degrade in translation accuracy and value over time 
 Safaba’s Content Drift Indicators can identify and quantify content drift 
and can be effectively used to predict the impact of incremental MT 
system retraining 
 The indicators are being incorporated into Safaba’s new EMTGlobal MT Monitoring Portal 
 Safaba’s “Overnight Retraining” incremental adaptation is effective in 
combating content drift and maintaining/improving MT system 
performance over time and maintaining translator productivity levels 
 Safaba’s upcoming EMTGlobal v4.0 will dramatically enhance these 
capabilities!

Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
 
Content Marketing World 2014 Language Fun Fact Challenge by welocalize
Content Marketing World 2014 Language Fun Fact Challenge by welocalizeContent Marketing World 2014 Language Fun Fact Challenge by welocalize
Content Marketing World 2014 Language Fun Fact Challenge by welocalize
 
Welocalize EAMT 2014 Presentation Assumptions, Expectations and Outliers in P...
Welocalize EAMT 2014 Presentation Assumptions, Expectations and Outliers in P...Welocalize EAMT 2014 Presentation Assumptions, Expectations and Outliers in P...
Welocalize EAMT 2014 Presentation Assumptions, Expectations and Outliers in P...
 
Welocalize Cisco CNGL Partnership Shared at Localization World Dublin 2014
Welocalize Cisco CNGL Partnership Shared at Localization World Dublin 2014Welocalize Cisco CNGL Partnership Shared at Localization World Dublin 2014
Welocalize Cisco CNGL Partnership Shared at Localization World Dublin 2014
 
TAUS Quality Summit Dublin Welocalize Presentation by Olga Beregovaya and Len...
TAUS Quality Summit Dublin Welocalize Presentation by Olga Beregovaya and Len...TAUS Quality Summit Dublin Welocalize Presentation by Olga Beregovaya and Len...
TAUS Quality Summit Dublin Welocalize Presentation by Olga Beregovaya and Len...
 
Beyond Disruption: Make Way for Return on Content by Welocalize Olga Beregovaya
Beyond Disruption: Make Way for Return on Content by Welocalize Olga BeregovayaBeyond Disruption: Make Way for Return on Content by Welocalize Olga Beregovaya
Beyond Disruption: Make Way for Return on Content by Welocalize Olga Beregovaya
 
2013 CHAT tcworld tekom Welocalize Teaminology
2013 CHAT tcworld tekom Welocalize Teaminology 2013 CHAT tcworld tekom Welocalize Teaminology
2013 CHAT tcworld tekom Welocalize Teaminology
 
Overcoming “Old Fears” in the “New Marketing” World by Informatica and Weloca...
Overcoming “Old Fears” in the “New Marketing” World by Informatica and Weloca...Overcoming “Old Fears” in the “New Marketing” World by Informatica and Weloca...
Overcoming “Old Fears” in the “New Marketing” World by Informatica and Weloca...
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
 
An MT Journey Intuit and Welocalize Localization World 2013
An MT Journey Intuit and Welocalize Localization World 2013An MT Journey Intuit and Welocalize Localization World 2013
An MT Journey Intuit and Welocalize Localization World 2013
 
Safaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-Editing
Safaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-EditingSafaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-Editing
Safaba Welocalize MT Summit 2013 Analyzing MT Utility and Post-Editing
 
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
 

Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA 2014 Welocalize and Safaba

  • 1. Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions
      Alon Lavie & Olga Beregovaya, AMTA 2014, October 25, 2014
  • 2. Outline
      - Welocalize MT Program for eDell: workflow, processes, challenges
      - Safaba EMTGlobal MT
      - Enterprise Content Drift: the evidence
      - Identifying Content Drift: Indicators and their correlation
      - Overnight Retraining: the approach
      - Overnight Retraining: the Pilot Study and Results
      - EMTGlobal v4.0: Advanced Rapid Adaptation
      - Welocalize: Expected Impacts
      - Summary and Conclusions
  • 3. Welocalize Approach to MT Program Deployment
  • 4. Content Sent Through the TMT Process
      - eDell content handled through the MT PE process falls into several content categories, mainly Marketing:
          - Partner Marketing
          - Global Support
          - Channel Support
          - Consumer Marketing Communication
          - Corporate HR
          - Customer Proposals
          - Product Launches
          - Global eDell Web content
      - The daily/weekly/monthly volumes per content type vary depending on Dell business priorities
  • 5. Translation with MT Post-Editing for Dell
      - Translation Setup:
          - Source document is pre-translated by translation memory matches augmented by Safaba MT
          - Translation Memory “fuzzy match” threshold is typically 75-85%
          - Pre-translations are presented to the human translator as a starting point for editing; translators can use or ignore the suggested pre-translations
          - Currently 28 languages go through the TM/MT workflow
      - Post-Editing Productivity Assessment:
          - Contrastive translation projects measure and compare translation team productivity with MT post-editing versus translation using just translation memories
          - Productivity is measured by contrasting translated words per hour under both conditions: MT-PE throughput / HT throughput
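The throughput ratio named on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the sample word counts are hypothetical, and treating the "productivity delta" as the ratio minus one is an assumed reading of the percentages reported on later slides, not something the deck states.

```python
def words_per_hour(words_translated: float, hours_worked: float) -> float:
    """Throughput for one condition (MT post-editing or TM-only translation)."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return words_translated / hours_worked


def productivity_ratio(mtpe_wph: float, ht_wph: float) -> float:
    """MT-PE throughput divided by HT throughput, as described on the slide."""
    if ht_wph <= 0:
        raise ValueError("ht_wph must be positive")
    return mtpe_wph / ht_wph


def productivity_delta(mtpe_wph: float, ht_wph: float) -> float:
    """Assumed definition: relative gain over the HT baseline (ratio - 1)."""
    return productivity_ratio(mtpe_wph, ht_wph) - 1.0


# Hypothetical sample: same team, same 8 hours, two conditions.
mtpe = words_per_hour(4800, 8)  # MT post-editing
ht = words_per_hour(4000, 8)    # translation memories only
print(f"ratio={productivity_ratio(mtpe, ht):.2f}, delta={productivity_delta(mtpe, ht):+.1%}")
```

A ratio above 1 (a positive delta) means the team translated more words per hour when post-editing MT output than when working from translation memories alone.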
  • 6. MT Post-Editing Productivity Assessment
      - Evaluated by Welocalize in the context of the Dell MT Program
      - [Chart: BLEU, PE Distance and Productivity Delta; axes 0.00-90.00 and 0.00%-300.00%]
  • 7. Productivity Gains through Retraining
      LOCALE ID | Initial engine | Retrained engine
      CH_TI | -11.75% | 4.7%
      CS-CZ | 37.53% | -
      DA-DK | 88.67% | -
      DE_DE | 20.24% | 31.2%
      EL_EL | 18.36% | 51.3%
      ES_ES | 28.5%
      ES_LX | 2.31% | 99.3%
      FI-FI | 102.80%
      FR_FR | 21.73% | 46.4%
      HE-IL | 25.43% | -
      HU-HU | 32.53% | -
      IT-IT | 84.89% | -
      JA-JP | 20.62% | -
      KO-KR | 13.83% | -
      NL-NL | 27.85% | -
      NO-NO | 59.71% | -
      PL-PL | 33.83% | -
      PT_BR | 23.77% | 31.3%
      PT_PT | 30.24% | 27.7%
      RU_RU | 22.88% | 36.6%
  • 8. A Solid Engine Translates into Solid Gains
      - [Chart: Productivity Delta and Fluency; Productivity Delta (-20% to 100%) against Human Evaluation Fluency Score (1-5)]
      - [Chart: Productivity Delta and Adequacy; Productivity Delta (-20% to 100%) against Human Evaluation Adequacy Score (1-5)]
      - [Chart: Adequacy, Fluency and PE Distance Correlation (-1.00 to 1.00) for de_DE, es_ES/LA, fr_FR/CA, it_IT, pt_BR; series: Adequacy & PE Distance, Fluency & PE Distance]
  • 9. 1.5 Years Later: Welocalize Post-Editing Adoption
  • 10. Dell: “High-traffic” MT Program
      - [Chart: Q4/Q3/Q2/Q1 Volumes]
      - Quarterly MT throughput volumes allow Welocalize and Safaba to accumulate enough post-edits for far more frequent retrainings than the scheduled maintenance engine updates
  • 11. Final Post-Edited Output Quality
      - MT quality results are consistently above target; engine degradation will force translators to compensate with additional effort
      - Continuously above target, monitoring trend
      - [Chart: Result vs. Target by week, Week 40 to Week 53; y-axis 99.30% to 100.00%]
  • 12. Safaba EMTGlobal Client-Specific MT Adaptation
      - The majority of the MT systems Safaba develops are developed and optimized for specific client content types
      - Data Scenario:
          - Some amount of client-specific data: translation memories, terminology glossaries and monolingual data resources
          - Additional domain-specific and general background data resources: other client-specific content types, TAUS data, other general parallel and monolingual background data
  • 13. Safaba EMTGlobal Client-Specific MT Adaptation
      - Safaba Suite of Adaptation Approaches:
          - Data selection, filtering and prioritization methods
          - Data mixture and interpolation methods
          - Model mixture and interpolation methods
          - Client-specific Automated Post-Editing (Language Optimization Engine)
          - Styling and Formatting post-processing modules
          - Terminology and DNT runtime overrides
  • 14. Enterprise Content Drift
      - Client-specific Enterprise MT systems often degrade in performance over time for two main reasons:
          1. Client content, even in controlled domains, gradually changes over time: new products, new terminology, new content developers
          2. The typical integrated setup of MT and translation memories: TMs are updated more frequently, so over time only “harder” source segments are sent to MT for translation
      - Full MT system retraining is currently resource- and time-consuming:
          - MT systems are relatively static: they are fully retrained only periodically (typically only a couple of times per year)
      - The result: MT accuracy for new projects declines over time, and post-editing productivity declines with it
      - We see strong evidence of “content drift” over time with many of our clients, especially in post-editing setups
  • 15. Enterprise Content Drift
      - Evidence from Safaba EMTGlobal Systems for Dell MT Program:
          - BLEU scores before and after retraining on held-out “recent” incremental data
      - [Chart: BLEU scores (0-70), 2013 vs. 2014]
  • 16. Enterprise Content Drift
      - Evidence from a typical client-specific MT system:
          - EMTGlobal English-to-German Dell MT System:
              - February 2013 System: 565K client + 964K background segments
              - March 2014 System: 594K client + 6,795K background segments
          - Two test sets:
              - “Original” test set from February 2013 system build (1,200 segments)
              - “Incremental” test set extracted from incremental data (500 segments)
      - System Test Scores and Statistics (Lang: DE):
      Measure | Feb. 2013 system | March 2014 system
      Gloss Inconsist. | 55.7% | 24.8%
      Orig. BLEU | 51.0 | 52.9
      Orig. MET | 63.4 | 64.2
      Orig. TER | 38.2 | 36.9
      Orig. LEN | 101.2 | 100.5
      Orig. OOVs | 63 | 33
      Incr. BLEU | 41.7 | 60.5
      Incr. MET | 56.6 | 69.9
      Incr. TER | 45.0 | 30.3
      Incr. LEN | 101.2 | 99.9
      Incr. OOVs | 107 | 31
  • 22. Enterprise Content Drift
      - Analysis of Content Drift Over Time:
          - Three EMTGlobal MT systems for Dell:
              - English to Chinese, Spanish and German
              - Systems trained and deployed in February 2013
          - Test sets:
              - “Original” test set from February 2013 system build (1,200 segments)
              - “Incremental” test set extracted from 2014 incremental data (500 segments)
              - Data sets extracted from live Dell production projects in August 2013, December 2013 and March 2014, along with their post-edited references
  • 23. Enterprise Content Drift
      - Analysis of Content Drift Over Time:
          - [Chart: BLEU scores (0-70) for Chinese, Spanish and German; series: Feb, Aug, Dec, Mar, Inc-2013, Inc-2014]
  • 24. Identifying Enterprise Content Drift: Content Drift Indicators
      - Goal: Establish real-time quantifiable measures that are indicative of Enterprise Content Drift
          - Immediate: available immediately at MT production time, prior to any post-editing of the MT output
          - Predictive: strongly correlated with expected MT evaluation score and post-editing effort
          - Similar to real-time MT Quality Estimation scores, but specific to capturing content drift
      - Three Measures:
          - Core Out-of-Vocabulary (OOV) type and token fractions: fraction of source types (tokens) that were out-of-vocabulary in the core MT system
          - Source-side unigram coverage: fraction of source type (token) unigrams that were observed in the MT system training data
          - Source-side trigram coverage: fraction of source type (token) trigrams that were observed in the MT system training data
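The three measures above can be approximated with simple n-gram counting. The following is a minimal sketch, not Safaba's implementation: it assumes whitespace tokenization, and it assumes the core system's vocabulary is just the set of unigrams in the training data, which makes the OOV fractions the complement of unigram coverage here (in a real system the core vocabulary may differ from the raw training data). All function names and sample sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def collect_ngrams(sentences, n):
    """Set of n-gram types observed in the (training) sentences."""
    seen = set()
    for sentence in sentences:
        seen.update(ngrams(sentence.split(), n))
    return seen


def coverage(source_sentences, training_ngrams, n):
    """Return (type coverage, token coverage) of source n-grams."""
    counts = Counter()
    for sentence in source_sentences:
        counts.update(ngrams(sentence.split(), n))
    if not counts:
        return 0.0, 0.0
    covered_types = sum(1 for gram in counts if gram in training_ngrams)
    covered_tokens = sum(c for gram, c in counts.items() if gram in training_ngrams)
    return covered_types / len(counts), covered_tokens / sum(counts.values())


def oov_fractions(source_sentences, training_unigrams):
    """Return (OOV type fraction, OOV token fraction) under the assumption above."""
    type_cov, token_cov = coverage(source_sentences, training_unigrams, 1)
    return 1.0 - type_cov, 1.0 - token_cov


# Hypothetical data: a new product term ("tablet") appears in incoming content.
training = ["the new laptop ships today", "the laptop is fast"]
incoming = ["the new tablet ships today", "the tablet is fast"]

oov_types, oov_tokens = oov_fractions(incoming, collect_ngrams(training, 1))
tri_types, tri_tokens = coverage(incoming, collect_ngrams(training, 3), 3)
print(f"OOV types={oov_types:.2%} tokens={oov_tokens:.2%}; trigram coverage={tri_types:.2%}")
```

In this toy case a single unseen word leaves most unigrams covered but breaks every trigram that contains it, which suggests why trigram coverage can move much more than the OOV rate when content changes.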
  • 25. Identifying Enterprise Content Drift: Content Drift Indicators
      - Performance of Content Drift Indicators on Dell EMTGlobal Systems:
          - [Chart: OOVs (fraction of tokens, 0.00-6.00) for Chinese, Spanish and German; series: Feb, Aug, Dec, Mar, Inc-2013, Inc-2014]
  • 26. Identifying Enterprise Content Drift: Content Drift Indicators
      - Performance of Content Drift Indicators on Dell EMTGlobal Systems:
          - [Chart: Source Trigram Coverage (0.00-80.00) for Chinese, Spanish and German; series: Feb, Aug, Dec, Mar, Inc-2013, Inc-2014]
  • 27. “Overnight” Incremental Adaptation
      - Objective: Counter “content drift” and help maintain and accelerate post-editing productivity with fast and frequent incremental adaptation retraining
      - Setting: New post-edited client data is deposited and made available for adaptation in small incremental batches
      - Challenge: Full offline system retraining is slow and computationally intensive and can take several days
      - Safaba Solution: Implement fast “light-weight” adaptations that can be executed, tested and deployed into production within hours (“overnight”)
          - Suffix-array variant of Moses supports rapid updating of indexed training data
          - Safaba Language Optimization Engine (automated post-editing module) supports rapid retraining
          - KenLM supports rapid rebuilding of language models
      - Currently in pilot testing with Welocalize and Dell
  • 28. Safaba Overnight Retraining: The Approach
      - Goal: Rapid MT System Adaptation using Incremental Data
      - Current Approach: Language Optimization Engine (LOE) Incremental Retraining
          - Safaba EMTGlobal MT systems include a core MT engine and a target-side Language Optimization Engine
          - Retraining the LOE component is fast, typically within a few hours
          - Not equivalent to full MT system retraining, but effective in closing the gap
      - New Approach: EMTGlobal v4.0 Advanced Adaptation Technology
          - Supports significantly improved client-specific adaptation within the core MT engine
          - Supports rapid incremental retraining of core MT engines
          - Much closer to full MT system retraining, in a similar time frame to LOE retraining
          - Will be available in late Q4 of 2014
  • 29. Safaba Overnight Retraining: The Approach
      - Full Solution: Overnight Retraining
          - Incremental data from post-edited MT projects is delivered to Safaba
          - Incremental system retraining is launched automatically and completed within hours
          - Newly adapted version of the MT system is automatically tested and QAed for quality
          - Newly adapted version of the MT system is deployed into production
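The "automatically tested and QAed" step could be approximated by a simple regression gate on automated metric scores. The sketch below is purely illustrative: the deck does not describe how Safaba's QA step works, so the gate logic, the function names and the `tolerance` parameter are all assumptions. The sample scores are the English-to-German "Incremental" test set figures from the System Test Scores and Statistics table, reading the MET column as METEOR.

```python
# Direction of each metric: BLEU and METEOR improve upward, TER improves downward.
HIGHER_IS_BETTER = {"BLEU": True, "METEOR": True, "TER": False}


def metric_deltas(baseline, candidate):
    """Signed improvement per metric; positive means the candidate is better."""
    deltas = {}
    for name, higher_is_better in HIGHER_IS_BETTER.items():
        diff = candidate[name] - baseline[name]
        deltas[name] = diff if higher_is_better else -diff
    return deltas


def should_deploy(baseline, candidate, tolerance=0.5):
    """Hypothetical gate: deploy unless any metric regresses by more than `tolerance` points."""
    return all(delta >= -tolerance for delta in metric_deltas(baseline, candidate).values())


# Incremental test set scores for the two English-to-German systems.
feb_2013 = {"BLEU": 41.7, "METEOR": 56.6, "TER": 45.0}
march_2014 = {"BLEU": 60.5, "METEOR": 69.9, "TER": 30.3}

for name, delta in metric_deltas(feb_2013, march_2014).items():
    print(f"{name}: {delta:+.1f}")
print("deploy:", should_deploy(feb_2013, march_2014))
```

A production gate would presumably also look at human spot checks and the content drift indicators; this sketch only shows the shape of an automated pass/fail decision.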
  • 30. Safaba Overnight Retraining: The Pilot Project
      - Pilot project with Welocalize to assess the impact of Overnight Retraining on Safaba EMTGlobal Dell MT systems, using samples of real post-edited translation project data
      - Setup:
          - Languages: English to Chinese, Spanish and German
          - Baseline Systems: 2014 retrained Dell EMTGlobal 3.0 MT systems
          - Incremental Data: Three batches of incremental data from live translation projects
      - Methodology:
          - Three versions of the MT systems:
              - Baseline
              - Baseline + Retrained on Data Set #1
              - Baseline + Retrained on Data Set #1 & #2
          - MT Evaluation:
              - Translate Data Set #3 (unseen) with the three versions of the MT system
              - Assess impact on translation performance using automated MT evaluation metrics
              - Additional analysis using Safaba “Content Drift Indicators”
  • 31. Safaba Overnight Retraining: Data
      Data | Original segments: Set 1 | Set 2 | Set 3 | Post-filtering segments: Set 1 | Set 2 | Set 3
      ENUS-ESXL | 1108 | 4553 | 704 | 926 | 2411 | 528
      ENUS-ZHCN | 3191 | 2181 | 1328 | 1143 | 1084 | 714
      ENUS-DEDE | 3043 | 1220 | 2270 | 2325 | 977 | 1466
  • 32. Safaba Overnight Retraining: Pilot Results: Automated Metric Scores
      - English-to-Chinese:
          - Incremental Adaptation of Language Optimization Engine (LOE)
          - Incrementally retraining on Data Sets #1 & #2 results in a gain of +3.0 BLEU points on Data Set #3
      - [Chart: BLEU, METEOR and TER (30-70); series: 2013 System, 2014 Baseline, Baseline+DS1, Baseline+DS1&2]
  • 33. Safaba Overnight Retraining: Pilot Results: Content Drift Indicator Statistics
      - English-to-Chinese:
          - Incremental Adaptation of Language Optimization Engine (LOE)
          - Adding Data Sets #1 & #2 reduces Data Set #3 OOVs by 0.3%, improves unigram coverage by 0.36% and improves trigram coverage by 14.22%
      - [Chart: OOV Tokens (0.00%-7.00%); series: 2013 System, 2014 Baseline, Baseline+DS1, Baseline+DS1&2]
      - [Chart: Unigrams Covered and Trigrams Covered (0.4-1); series: 2013 System, 2014 Baseline, Baseline+DS1, Baseline+DS1&2]
  • 34. Safaba Overnight Retraining: Preliminary Results: Advanced Adaptation with EMTGlobal v4.0
      - English-to-Chinese:
          - Incremental Adaptation with EMTGlobal v4.0
          - Incrementally retraining on Data Sets #1 & #2 results in a gain of +6.8 BLEU points on Data Set #3
      - [Chart: BLEU, METEOR and TER (30-75); series: 2014 Baseline, Baseline+DS1&2]
  • 35. Safaba Overnight Retraining: Summary of Pilot Results
      - Excellent results for English-to-Chinese!
      - Spanish and German results show no gain or loss in MT accuracy as a result of LOE incremental retraining with the available data sets
          - Performance on Data Set #3 remains completely flat with both retrainings according to all automated metrics
      - Data analysis with Content Drift Indicators reveals that Data Sets #1 & #2 for these two language pairs did not contain novel translations sufficient for improving MT performance on Data Set #3
          - No significant reduction in Data Set #3 OOVs
          - No significant improvement in coverage of source-side n-grams
  • 36. Overnight Retraining Pilot Evaluation Setup
      - Translators were asked to compare each engine iteration using the same source strings
      - Day 1: Read the MT output first. Then read the source text (ST). Then score the segment for Adequacy and Fluency
      - Adequacy: On a 4-point scale, rate how much of the meaning is rendered in the translation:
          - 4 Everything
          - 3 Most
          - 2 Little
          - 1 None
      - Fluency: On a 4-point scale, rate the extent to which the translation is well-formed grammatically, contains correct spellings, adheres to common use of terms, titles and names, is intuitively acceptable and can be sensibly interpreted by a native speaker:
          - 4 Flawless
          - 3 Good
          - 2 Disfluent
          - 1 Incomprehensible
      - *Based on TAUS Adequacy/Fluency Guidelines
      - Comparing the iterations: Compare the new MT output to that of the previous week and indicate with an X in the corresponding column whether it is better / worse / equal. If it is better or worse, indicate in the error categories & comment column what has improved or regressed
  • 37. Interim Chinese Pilot Results - Welocalize • The human evaluation results we have an our disposal are work in progress - based on evaluating a small subset of translated data and just one iteration of “overnight retraining” • The improvements observed by automated metrics are not yet reflected in the human assessment • Human evaluation results consistent between Baseline+ DS 1 and DS2 - no degradation is introduced but from translator perspective no significant change in quality is captured, possibly requires a larger evaluation set or a different approach to evaluation string selection Translator feedback: improvement in fluency but no improvement in capturing the meaning of whole sentence; punctuation has improved, but the translation stil needs improvement; part of the sentence is now more fluent
  • 38. Welocalize “Wins” from “Overnight Retraining” • We need to be more granular than “Quality” and look at “Relevance” (coverage and fluency will increase based on Safaba findings) • Our expected benefits from this approach need to be in sync with sufficient daily volumes • No need to wait for scheduled retrainings • Two things will happen: the translator gets more used to post-editing, and the MT engines catch up with the changes in the source content in “live” mode • Benefit for the client: once the actual ongoing engine relevance statistics have been captured, we’ll be able to predict higher throughputs and offer better discounts
  • 39. Summary and Conclusions • Enterprise Content Drift is a natural and frequent phenomenon in large-scale commercial MT implementation projects • Enterprise MT systems need to constantly adapt or else are likely to significantly degrade in translation accuracy and value over time • Safaba’s Content Drift Indicators can identify and quantify content drift and can be effectively used to predict the impact of incremental MT system retraining • The indicators are being incorporated into Safaba’s new EMTGlobal MT Monitoring Portal • Safaba’s “Overnight Retraining” incremental adaptation is effective in combating content drift, maintaining or improving MT system performance over time, and maintaining translator productivity levels • Safaba’s upcoming EMTGlobal v4.0 will dramatically enhance these capabilities!

Editor's Notes

  1. Beyond the quality standards
  2. Engine retraining and additional productivity gains
  3. A solid engine translates into solid gains
  4. In translators’ words – 1.5 years into the program
  5. Meeting high quality standards while reducing turnaround times requires high-quality up-to-date MT output
  6. Overnight retraining evaluation methodology
  7. Chinese engine evaluation results (next 2 slides)
  8. Chinese engine evaluation results (next 2 slides)