MT Benchmarking and Business Intelligence - Tom Shaw (Capita TI)

TAUS Quality Evaluation Summit - 28 May 2015, Dublin

Machine translation (MT) has been a hot topic in the localisation industry for some time. Buyers of localisation know they should be maximising the use of MT in their workflows, but it can be difficult to decide how, when and where to use it. Building an MT infrastructure and deploying it as part of a workflow comes at a cost, and the ROI can be very difficult to calculate. Quite often, investing in MT requires a leap of faith. In this presentation, Tom Shaw of Capita Translation and Interpreting (Capita TI) shows how the TAUS DQF tools can be used to evaluate MT output from two different systems. The results of the productivity and quality tests can be used to benchmark MT output, to decide which engine to use (if any), and to estimate the associated ROI.

  2. Language Service Provider, Machine translation, SmartMATE, External
  3. The Challenge - When to invest in MT?
       • Customising an MT engine is an investment
       • How can it be used / suitability to my needs?
       • How much quicker?
       • What cost savings to expect?
       • Impact on Quality?
       • Translate more?
       • Automated Metrics (e.g. BLEU) not enough
  4. Capita TI – Evaluating MT
     Capita Benchmarking MT Engine Types DQF Project Type MT Engine ranking Capita Industry MT Capita Customer MT External MT Comparison Quick / Rank Industry Engine roll-out Client MT - ROI Capita Industry MT Capita Industry MT Capita Client MT Productivity HT vs PEMT PEMT Gist MT Engine Tuning Capita Industry MT Capita Client MT Capita Client MT Quality Evaluation Adequacy Fluency Typology Errors
  5. Client MT – Calculating ROI
     Objectives
       • Compare productivity of Post-Editing Industry vs Client MT Engine
       • Calculate ROI of replacing an Industry engine with a Client-specific engine
     Diagram labels: No MT, Industry MT, Client MT; Required Quality; Human Effort; Capita Investment; Client Investment
  6. Productivity Evaluation – Process Overview
     OUTPUT 1: Capita Industry MT (Approx. 18m words / 250 terms)
       • MT produced by SmartMATE
       • Export in TMX format
     OUTPUT 2: Client Specific MT (Approx. 4.5m words / 125 terms)
       • MT produced by SmartMATE
       • Export in TMX format
     Process
       1. Identify 3 suitable Post Editors
       2. Generate 2 productivity evaluation projects (Output 1 and Output 2)
       3. Brief Linguists on use of DQF Evaluation environment
       4. Send both outputs to Post Editors through DQF
     Expected quality: Human Translation
     SOURCE CONTENT
       • Customer: Client X
       • Subject: Engineering - OMM
       • Size: 50 segments (1010 words)
       • Language Pair: EN - ES
  7. Productivity Evaluation - Reporting
     MEASUREMENTS
       • Edit/translate time (average number of words processed in given timespan)
       • Edit/translate distance (average % of character changes applied to MT output)
     Both OUTPUTS should generate reports that will allow us to reach conclusions
     Expected result: Customised MT output (MT2) should have higher performance
  8. Results - Snapshot
     Productivity
       Category       Unit                  Output 1 (Generic MT)  Output 2 (Customised MT)  % Increase (Approx.)
       Edit Time      Average WPH           474                    685                       44%
       Edit Distance  Average % similarity  56                     64                        14%
     Quality
       Category             Unit                                Output 1 (Generic MT)  Output 2 (Customised MT)
       Acceptable Fluency   # of segments (Flawless or Good)    39                     45
       Acceptable Adequacy  # of segments (Everything or Most)  38                     44
  9. What have we achieved?
     Columns: Rating = Productivity Rating; Corpora = Corpora Size (m words); BLEU = BLEU Score; PE Rate = PE Rate (WPH)
                                  Free MT   Capita Domain Engine                Capita Customised Engine
     Domain        Language Pair  Rating    Corpora  BLEU  PE Rate  Rating      Corpora  BLEU  PE Rate  Rating
     Engineering   EN-ES          66        18       60    474      68          4.5      65    685      72
     Engineering   EN-FR          64        16       55    410      64          3        65    590      61
     IT            EN-IT          74        24       75    570      72          5        75    702      76
     Etc.
     Data for Benchmarking MT: Historical data helps calculate ROI
  10. What’s next?
       • Predicting MT Quality
       • Calculate actual effort on live jobs
  11. Questions?
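
The two productivity measurements on the Reporting slide, and the "% Increase" column on the Results slide, can be illustrated with a short Python sketch. This is not how the TAUS DQF tools compute their figures, which the slides do not describe. The function names are hypothetical, and the character-level similarity uses Python's difflib as a stand-in for whatever edit-distance measure DQF applies.

```python
from difflib import SequenceMatcher


def words_per_hour(word_count: int, seconds: float) -> float:
    """Throughput in words per hour (WPH) for one post-editing session."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return word_count * 3600.0 / seconds


def percent_similarity(mt_output: str, post_edited: str) -> float:
    """Character-level similarity between raw MT and its post-edited form.

    Assumption: difflib's ratio is used here for illustration only;
    it is not claimed to match the DQF edit-distance figure.
    """
    return 100.0 * SequenceMatcher(None, mt_output, post_edited).ratio()


def percent_increase(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (improved - baseline) / baseline


if __name__ == "__main__":
    # Figures taken from the Results slide (Output 1 vs Output 2).
    print(round(percent_increase(474, 685), 1))  # 44.5, reported as approx. 44%
    print(round(percent_increase(56, 64), 1))    # 14.3, reported as approx. 14%
```

Applied to the reported averages, the relative gains work out to about 44.5% for edit time and 14.3% for edit distance, in line with the approximate 44% and 14% shown on the slide.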