Machine Translation: Latest Innovations and their Impact on Commercial Translation


Our SDL Language presentation from the Customer Success Summit Montreal 2015 on the latest innovations in language translation.

Published in: Technology
  1. Machine Translation: Latest Innovations and their Impact on Commercial Translation
     SDL Customer Success Summit Montreal
     Rodrigo Fuentes Corradi, MT Business Consultant
     June 2015
     SDL Proprietary and Confidential
  2. Agenda
     ○ Evolution of MT
     ○ Common MT Use-Cases
     ○ Engine Training
     ○ Introducing SDL XMT
     ○ How to Deploy MT
     ○ MT and the Post-Editor
  3. Evolution of MT
  4. Brief history of Machine Translation
     Years marked on the timeline: 1950s, 2002, 2010, 2011, 2014, 2015
     ○ War-time cryptography requirements, with subsequent experiments & investment in automated translation
     ○ Statistical Machine Translation (SMT)
     ○ SDL acquires RBMT engine…establishes MT group dedicated to improving quality for enterprise applications
     ○ SDL acquires Language Weaver / BeGlobal
     ○ First SDL Post-Editing projects using SMT go into production
     ○ Post-Editing booms: 4-fold increase
     ○ SDL launches PE Certification Program
     ○ SDL launches XMT next-generation MT platform
  5. Overview: The SDL MT Team
     Who we are
     First to commercialize Statistical Machine Translation
     ○ 50+ Professionals
     ○ Over 10 Nationalities
     ○ Across 5 Time Zones
     ○ 8 Locations
     Widespread team of language lovers:
     ○ Ex-translators
     ○ Computational Linguists
     ○ Project Managers
     ○ Data Specialists
     ○ Post-Editors
     ○ Architects
     …all gathered from the four corners of SDL!
     What we do
     Drive MT Adoption: Educate, promote and support MT usage in existing SDL accounts & new opportunities
     Custom Engine Builds:
     ○ Design
     ○ Create
     ○ Test
     ○ Implement
     ○ Monitor
     …custom Statistical Machine Translation engines
     Linguistic Projects: Semantic annotation projects for US Government bodies & academic institutes
     How we do it
     Two Research Labs:
     ○ Los Angeles, CA
     ○ Cambridge, UK
     ○ 100s of Scientific Publications
     ○ Over 50 Patents Approved or Filed
     We’re Evangelists…about Machine Translation, using automation to accelerate productivity
  6. Common MT Use-Cases
  7. ○ Communication Channels
     ○ Consumer Preferences
     ○ Increased Global Competition
     ○ Export Market Growth
  8. Right translation method, right price, right time
     Axes: Quality, Volume
     Methods: Human Translation, Post-Edit, Machine Translation
     Content types: Blogs, User Forums, Reviews, Chat, Email, Support, FAQ, Websites, Wikis, Knowledge Base, Alerts/Notifications, Help, User Guides, Documentation, Newsletters, Advertising Content, Legal
  9. Translator productivity
     Description:
     ○ Direct access to machine translation from SDL Trados Studio
     Benefits:
     ○ Improve the efficiency of translators by providing results of machine translation to them for segments that do not match entries in translation memory
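The behaviour described on this slide (offer machine translation only for segments with no usable translation-memory match) can be sketched roughly as follows. This is a minimal illustration, not the SDL Trados Studio API: the names `best_tm_match`, `suggest`, the `mt` callable and the 0.75 similarity threshold are all assumptions made for the example, and the fuzzy score uses Python's generic `difflib` ratio rather than any real TM matching algorithm.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Optional, Tuple


def best_tm_match(segment: str, tm: Dict[str, str]) -> Tuple[Optional[str], float]:
    """Return the stored target of the most similar TM source and its score (0..1)."""
    best_target: Optional[str] = None
    best_score = 0.0
    for source, target in tm.items():
        # Generic string similarity; a stand-in for a real fuzzy-match score.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_target, best_score = target, score
    return best_target, best_score


def suggest(
    segment: str,
    tm: Dict[str, str],
    mt: Callable[[str], str],
    threshold: float = 0.75,  # hypothetical cut-off for a usable TM match
) -> Tuple[str, str]:
    """Return (suggestion, origin), where origin is "TM" or "MT"."""
    target, score = best_tm_match(segment, tm)
    if target is not None and score >= threshold:
        return target, "TM"
    # No usable TM match: fall back to machine translation.
    return mt(segment), "MT"
```

Here `tm` is a plain dictionary of source to target segments and `mt` is any function that returns a translation, so the same routing logic could sit in front of whichever engine is actually deployed.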
  10. Live chat translation
      Description:
      ○ Real-time translation of web-based chat conversations
      Benefits:
      ○ Reduces the cost of staffing support and sales operations, since multilingual agents are not needed
      ○ Customer acquisition rates and satisfaction are much higher if you engage the customer in chat
  11. Community forum translation
      Description:
      ○ Translation of user-generated content in web-based community forums
      Benefits:
      ○ Enable interactions between customers who speak different languages
      ○ Leverage community expertise across languages instead of only within the language of community experts
  12. Knowledge base content translation
      Description:
      ○ Translation of knowledge base content for local language customers of technical solutions
      Benefits:
      ○ Reduces customer support costs and activity level by allowing remote language customers to directly access solutions
      ○ Increases customer satisfaction by providing solutions in their native language
  13. Case study: MT for online customer reviews
      Requirements:
      ○ Share customer reviews with international audiences
      ○ Automate the translation of customer reviews into 13 languages
      Results:
      ○ Reduced bounce rate from 70% to 25%
      ○ Increased user dwell times and page views
      ○ Economically translate 2 billion words/month
  14. Case study: MT for instant MS Office translation [a large global retail client]
      Requirements:
      ○ Improve communication among geographically scattered company employees
      ○ Fast, low-cost translation of MS Outlook emails & MS Office business documents
      Results:
      ○ BeGlobal Machine Translation integrated via API with MS Office apps
      ○ Any employee can instantly translate emails or attachments with a simple double-click
  15. Engine training: Making MT smarter
      ○ Customized engines
      ○ Domain verticals
      ○ Baselines
  16. Baselines
      ○ Core generic MT engines for each language pair
      ○ Data mined from reliable sources available in the public domain, covering various subjects
      ○ Work well for general & varied content
      ○ Can be used as backup for verticals & customized engines
      ○ Contain hundreds of millions of words of bilingual data (100Ms+)
  17. Domain verticals
      ○ Statistical engines trained exclusively for one domain
      ○ Data selected from sources within a domain or industry
      ○ MT output more likely to follow technical terminology
      ○ Solution used when client-specific data is not available or is not sufficient for a customization
  18. Customized engines
      ○ Optimize the MT output for specific client projects
      ○ Training based on client-specific bilingual data
      ○ More data usually has a positive effect on the MT output
      ○ Quality & consistency of data is as important as quantity
      ○ Adherence to client-specific terminology & style
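The three engine tiers above imply a simple preference order: a customized engine when the client has one, a domain vertical when client data is missing or insufficient, and the baseline as backup. A minimal sketch of that selection logic is shown below. The `Registry` layout, the `pick_engine` function and the engine identifiers are hypothetical names chosen for the example; they do not reflect how SDL's MT server stores or routes engines.

```python
from typing import Dict, Optional, Tuple

# Hypothetical registry: (tier, language pair, client-or-domain key) -> engine id.
Registry = Dict[Tuple[str, str, Optional[str]], str]


def pick_engine(
    registry: Registry,
    lang_pair: str,
    client: Optional[str] = None,
    domain: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (tier, engine id), preferring customized, then vertical, then baseline."""
    candidates = [("customized", client), ("vertical", domain), ("baseline", None)]
    for tier, key in candidates:
        if tier != "baseline" and key is None:
            continue  # no client or domain given, so this tier cannot apply
        engine = registry.get((tier, lang_pair, key))
        if engine is not None:
            return tier, engine
    raise LookupError(f"no engine available for language pair {lang_pair!r}")
```

For example, a request for a client without a customized engine but with a known domain would resolve to the vertical, and a request with neither would resolve to the baseline for that language pair.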
  19. How SDL trains an MT engine
      Phases: Content Evaluation, MT Customization, Production, QA
      Steps shown in the diagram:
      ○ Training Data Prep & Engine Customization
      ○ Prep of Testing Material
      ○ Evaluate MT Output
      ○ Refine Training or Deploy for Production
      ○ Integrate MT on Translation Process
      ○ Source Content
      ○ Apply Translation Memory
      ○ Machine Translation
      ○ Post-Edit
      ○ Quality Assessment & Translation Delivery
      ○ Update Translation Memory
      Components: SDL MT Server, Translation Memory
  20. Continuous improvement
      ○ SDL MT Group developers are constantly researching ways to improve Generic, Vertical, and Customized MT Engines
      ○ SDL Research Scientists are continuously improving the Statistical Machine Translation algorithms (e.g. Language Models, Translation Models, Reordering Models, Syntax, Transliteration, Rule-Based Components, etc.)
      ○ SDL Data Engineers are continuously mining large amounts of good data used by the statistical algorithms
  21. Introducing SDL XMT…
      A NEW, modular & flexible technology that will power the “next generation” of SDL MT
      Generations shown: Syntax-based Machine Translation, Phrase-based Machine Translation, Word-based Machine Translation, XMT
      Years marked on the timeline: 2002, 2003, 2008, 2015
  22. Legacy MT (Monolithic Phrase-based)
      Foreign Language → Legacy MT → Your Language
  23. SDL XMT: Next generation technology, higher quality
      Foreign Language → XMT → Your Language
      Modular components: Neural Networks, Compound Splitting, Phrase-Based, Finite State Automata, String to Tree, Rule-Based, Tree to String, Pre-Ordering, Transliteration, Hidden Markov Model, Hyper Graphs, …
      ○ Modular & Flexible
      ○ “State-of-the-Art” Machine Learning
      ○ Better Translation Quality
      ○ Rapid Research Transition
  24. Language Learning in XMT
      Continuous improvement by learning from Post-Editing.
      Machine Translation:
      ○ The machine learns how to translate from source to target during the training process
      ○ The machine does not learn during the translation process
      Machine Translation + Language Learning:
      ○ The machine learns how to translate from source to target during the training process
      ○ The machine learns & improves seamlessly, continuously, and in real-time from user feedback during the translation process
      ○ See it in action: SDL XMT
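The contrast on this slide, an engine that is fixed after training versus one that also improves from post-editor feedback while translating, can be illustrated with a toy wrapper. The slides do not describe how XMT actually learns, so the sketch below substitutes the simplest possible stand-in: it remembers post-edited segments and reuses them on later requests. The class name `AdaptiveTranslator` and its methods are hypothetical.

```python
from typing import Callable, Dict


class AdaptiveTranslator:
    """Toy stand-in for 'MT + language learning': a static engine plus remembered post-edits."""

    def __init__(self, base_mt: Callable[[str], str]) -> None:
        self._base_mt = base_mt  # engine trained offline; never changes here
        self._corrections: Dict[str, str] = {}  # source segment -> post-edited target

    def translate(self, source: str) -> str:
        # Prefer a correction learned from earlier feedback, if one exists.
        if source in self._corrections:
            return self._corrections[source]
        return self._base_mt(source)

    def feedback(self, source: str, post_edited: str) -> None:
        # Called when a post-editor confirms a segment; takes effect immediately.
        self._corrections[source] = post_edited
```

Without calls to `feedback`, the wrapper behaves exactly like the static engine, which corresponds to the plain "Machine Translation" column; with feedback, later requests for the same segment return the post-edited version.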
  25. How to Deploy MT
      Post-Edit
  26. Post-Editing experience in Montreal
      Quality delivered & owned by SDL, therefore commitment to quality remains our number one priority!
      ○ Cost reductions of up to 40% vs. conventional translation
      ○ File formats received: TXT, XL, and XML
      ○ Unique client-specific process developed in collaboration with engineering & IT teams from SDL & customers
      Chart: SDL Canada Post-Editing, words post-edited for Large Retail Customers e-Commerce Sites; years 2013, 2014, 2015 (Forecasted); volumes 25M, 10M, 15M words; 40%
  27. Quality in MT
      ○ Building blocks are there as a lot of content is pulled from the engines
      ○ Allows the linguist to focus on refining the output
      ○ Custom engines pull in client terminology & style
      ○ Fewer resources equals greater consistency
      ○ Trained, certified linguists well-versed in handling MT output
  28. Post-Editing quality requirements
      When post-editing to publishable quality, the following basic principles still apply:
      ○ The same references must be used as for conventional translation (project-specific guidelines, TMs, glossaries, termbases, etc.)
      ○ Grammar, spelling and punctuation must be correct
      ○ Appropriate style & correct terminology must be used consistently
      ○ The translation must read well and be suitable for its intended purpose
      Customer User Guide
  29. Features to watch out for in SMT output…
      ○ Incorrect Formatting
      ○ Additional or Missing Words
      ○ Words Not Localized or Wrong Flavor
      ○ Gender, Number, Agreement or Verb Inflection Issues
      ○ Articles & Prepositions
      ○ Syntax & Word Order Issues
      ○ Wrong Punctuation
      ○ Inconsistent or Non-compliant Terminology
      ○ Mistranslations
  30. Post-Editing Machine Translation certification
      ○ The demand for MT solutions is growing quickly & Post-Editing is becoming a mainstream skill for translators
      ○ In response, SDL have created Post-Editing Certification, released in June 2014
      ○ 85% of in-house staff completed the Certification in 2014
      ○ 2,500+ freelancers signed up for the course
      ○ The Certification covers the theory behind Machine Translation as well as practical approaches to Post-Editing
      ○ Our Certification is for anyone impacted by Post-Editing; certified translators can offer an extended skill set
  31. SDL iMT: Key steps in the process
      ○ Evaluate content and translation assets
      ○ Train MT engines for your content or use existing solution
      ○ Configure the trained MT engines with SDL’s translation environment (TMS, WS, Studio)
      ○ Post-edit the MT output to full publishable quality
      ○ SDL infrastructure to support these steps
      Process: Evaluate → Train MT → Configure → Post-Edit, supported by SDL Infrastructure
  32. Copyright © 2008-2015 SDL plc. All rights reserved. All company names, brand names, trademarks, service marks, images and logos are the property of their respective owners. This presentation and its content are SDL confidential unless otherwise specified, and may not be copied, used or distributed except as authorised by SDL.
      Global Customer Experience Management