New Breakthroughs in Machine Translation Technology



Tony O’Dowd takes us through some of the most innovative technologies offered on the KantanMT platform, which are helping a growing community of KantanMT users to develop and self-manage custom Machine Translation engines in the cloud.
Maxim Khalilov then illustrates bmmt’s journey with Machine Translation on KantanMT. He discusses what they have achieved so far in terms of MT engine development and showcases the value that his team is bringing to their growing international client base through the use of Machine Translation.

Published in: Technology, Business

New Breakthroughs in Machine Translation Technology

  1. No Hardware. No Software. No Hassle MT. New Breakthroughs in Machine Translation Technology in association with#KantanWebinar
  2. KantanMT.Com: No Hardware. No Software. No Hassle MT. Tony O’Dowd, Founder & Chief Architect. New Breakthroughs in Machine Translation Technology
  3. What we aim to cover today
     • What is
     • Challenges of the L10N Industry
     • Making the right Project Management decisions
     • Going beyond the baseline of MT quality
     • Conclusions
     • 15 minutes
  4. What is
     • Statistical MT System
     • Cloud-based =
       • Highly scalable
       • Inexpensive to operate
       • Quick to deploy
     Our Vision
     • To put Machine Translation customization, improvement and deployment into your hands
     Platform statistics
     • Active KantanMT Engines: 6,191
     • Training Words Uploaded: 28,243,234,615
     • Member Words Translated: 427,526,741
     • Fully Operational: 15 months
  5. Challenge #1: How can Project Managers ‘manage’ Post-Editing Projects?
     Initial steps of any project are:
     • Determine scope
       • How long will it take?
       • How much will it cost?
       • What is my margin?
     • Determine resources
       • How many Translators will I need?
     Introducing KantanAnalytics™
     • …think Fuzzy-Match report and you’ve got it in one!
  6. KantanAnalytics™
     • Kantan TotalRecall (Advanced TM): % of TM hits in this job
     • KantanMT (automated translations): % of automated translations for this job
     • Range of QE Scores: QE range defined to match the existing fuzzy match ranges used by the L10N industry
     • Quality Estimation Scores: segment-level QE scores, akin to fuzzy match scores
     • Word Counts (Project Stats): can be used to develop a Project Timeline and Tiered Pricing Model for Post-Editing Projects
     • Placeholder & Tag Counts: used by PMs for complexity surcharges
     • KantanAnalytics embeds QE scores into TRADOS Studio, MemoQ and XLIFF
  7. KantanAnalytics™: Helping PMs make the right business decisions!
  8. KantanAnalytics™: Helping PMs make the right decisions
  9. Challenge #2: Going beyond the baseline and developing production-ready MT!
     Easy to build 1st baseline engine
     • Aggregate Training Data: TM, Mono, Stock, Terminology
     • Use Cloud-based platform, like
     Real Challenge:
     • How do these platforms go beyond the baseline engine and achieve higher levels of production quality?
     Introducing Kantan BuildAnalytics
     • Data analytics and visualisation providing insights into the customisation of SMT engines.
  10. Kantan BuildAnalytics™: Rapidly develop production-ready engines
      • Summary Report
      • Training Rejects Reports
      • F-Measure Analysis
      • BLEU Analysis
      • TER Analysis
      • GAP Analysis
      • Timeline Report
      • Deep Tuning
  11. Kantan BuildAnalytics™
      • F-Measure Score: measures word recall & precision of KantanMT engines
      • Distributions: provides the distribution of F-Measure scores across all reference translations
      • Kantan Insight™: holistic analysis of the score and advice on how to improve it for KantanMT engines
      • Detailed Analysis: segment-level F-Measure analysis to help SMT developers improve training material
  12. Kantan BuildAnalytics™: Detailed Reports for F-Measure, BLEU and TER
  13. Kantan BuildAnalytics™: Gap Analysis – quickest way of improving fluency
  14. Kantan BuildAnalytics™: Training Rejects Report – Improve training data rapidly
  15. Kantan BuildAnalytics™: Timeline – Tracks history of KantanMT engines
  16. Kantan BuildAnalytics™: Rapid MT Customisation
  17. bmmt GmbH and KantanMT: The Real-World Use of Machine Translation. Maxim Khalilov, Technical Lead, bmmt GmbH. KantanMT webinar, April 10, 2014
  18. MT in industry: context and rationale. The combination of these two technologies, well-established TM and cutting-edge MT, plus post-editing, allows the creation of a high-quality translation that reads just as well as a “classically” produced translation.
  19. MT in industry: what about cost? The cost structure changes when machine translation is integrated into the translation pipeline. When machine translation is adopted, the data preparation and quality assurance (editing) costs rise, whereas translation costs fall to as low as zero. Most importantly, the total cost of translation is reduced dramatically, as illustrated.
  20. MT case study
      • Customer: big German machine manufacturer
      • Project: 51,000 words, technical documentation. English into German.
      • Approach: hybrid MT/TM.
      • Settings: the files were processed through Trados Studio 2011.
      • Implementation: KantanMT
      • Description: Roughly 7,000 words came from TM as high matches. The remainder went through MT-based pretranslation, followed by a post-editing cycle, with the overall goal of producing the same level of quality as in an all-human translation.
      • Training material: Our customer had not worked in this language combination before, so there was no TM to go on. But we knew that the English authors based their work on material that the customer had previously translated from German into English. Thus we reversed the language direction of the TM and trained a customer-specific engine with this TM.
      • Results: 44,000 words were post-edited to a final quality level that the customer was very happy with.
      • Cost savings > 30%.
  21. MT: benefits of KantanMT solution
      • Fully automated system training
      • One-click system customization
      • Automatic data pre-processing
      • Fully automated translation
      • Automatic pre- and post-processing
      • Quality assessment
      • KantanWatch
      • Gap Analysis
      • Reject Report
      • No worry about maintenance and infrastructure
  22. MT: benefits of KantanMT solution
      • Transparent file format conversion
      • Training material conversion: TM conversion, monolingual material
      • Documents to translate: TMS format into MTable format
      • SDLXliff
      • Smooth terminology integration
      • Consistent terminology
      • Tag handling and mark-up transfer
      Source: <x id="16480"/>SWord1 SWord2 SWord3 SWord4 <g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>
      Target: <x id="16480"/>TWord1 TWord2 TWord3 TWord4 <g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>
  23. bmmt GmbH
      • Founded in 2013 by a group of language industry experts who wanted to offer innovative translation technology solutions
      • Three operations centers in Germany: Munich, Berlin and Stuttgart
      • bmmt GmbH has relied heavily on KantanMT services since 2013
      • Primary industries: Automotive and Trucks, Machine Engineering, Telecommunications, Construction, IT
      • Types of documents: workshop texts, product catalogues & other highly repetitive information documents
      • Primary source language: German
      • Integration: SDL Trados, SDL WorldServer and others
      • Find more:
  24. bmmt GmbH
      • Berlin: Alt-Moabit 92, 10559 Berlin. Phone: +49 30-3117505-15. Fax: +49 30-3117505-20
      • Munich: Bernhard-Wicki-Straße 5, 80636 Munich. Phone: +49 89 2000037-17. Fax: +49 89 2000037-11
      • Stuttgart: Ruppmannstraße 33b, 70565 Stuttgart. Phone: +49 711 16646-66. Fax: +49 711 16646-50
      Thank you
  25. No Hardware. No Software. No Hassle MT. New Breakthroughs in Machine Translation Technology in association with#KantanWebinar. Speakers: Tony O’Dowd, Maxim Khalilov
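
Note on slide 11: the slide describes the F-Measure score as a measure of word recall and precision. The slides do not show how KantanMT computes it, so the following is only a minimal sketch of a generic word-level F-measure between an MT output and a reference translation. The function name `word_f_measure`, the whitespace tokenisation and the sample sentences are all assumptions made for illustration.

```python
from collections import Counter


def word_f_measure(candidate: str, reference: str) -> tuple[float, float, float]:
    """Return (precision, recall, F1) over lower-cased, whitespace-split words.

    Generic illustration only; not KantanMT's actual scoring code.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Multiset intersection: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0

    precision = overlap / sum(cand_counts.values())  # share of MT words that are correct
    recall = overlap / sum(ref_counts.values())      # share of reference words recovered
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example segment pair.
p, r, f = word_f_measure(
    "the pump must be switched off",
    "the pump must be turned off before service",
)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Scoring each segment this way and collecting the results gives the kind of per-segment distribution the slide refers to. A production metric would normally use a proper tokeniser rather than whitespace splitting.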
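
Note on slide 20: the case study reverses the language direction of an existing German-into-English TM to train an English-into-German engine. The slides do not show how this was done, so the following is a minimal sketch under stated assumptions. The `TranslationUnit` structure, the `reverse_direction` helper and the sample segments are hypothetical; a real TM would be read from a format such as TMX. The word counts are the ones reported on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationUnit:
    """One aligned segment pair from a translation memory (hypothetical structure)."""
    source_lang: str
    target_lang: str
    source: str
    target: str


def reverse_direction(units):
    """Swap source and target so a de->en TM can serve as en->de training data."""
    return [
        TranslationUnit(u.target_lang, u.source_lang, u.target, u.source)
        for u in units
    ]


# Invented sample segments, for illustration only.
german_to_english = [
    TranslationUnit("de", "en", "Pumpe ausschalten.", "Switch off the pump."),
    TranslationUnit("de", "en", "Ventil schließen.", "Close the valve."),
]
english_to_german = reverse_direction(german_to_english)

# Word counts reported on slide 20 (the TM figure is given as "roughly 7,000").
TOTAL_WORDS = 51_000
TM_HIGH_MATCH_WORDS = 7_000
mt_post_edited_words = TOTAL_WORDS - TM_HIGH_MATCH_WORDS

print(english_to_german[0])
print(mt_post_edited_words)  # 44000, the post-edited volume stated on the slide
```

Reversed data of this kind has a human-written target side that was originally the source text, which is one reason the case study still pairs the engine with a post-editing cycle.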
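
Note on slide 22: the tag handling example shows inline `<x/>` and `<g>` tags that are reordered in the target while every tag id is preserved. A simple consistency check for this property can be sketched as follows. The regular expression and the function names `inline_tags` and `tags_preserved` are assumptions for illustration, not part of KantanMT. The source and target strings are copied from the slide.

```python
import re
from collections import Counter

# Matches opening <g id="..."> and standalone <x id="..."/> tags as they
# appear on the slide; closing </g> tags are ignored.
TAG_PATTERN = re.compile(r'<(x|g)\s+id="([^"]+)"\s*/?>')


def inline_tags(segment: str) -> Counter:
    """Count (tag name, id) pairs found in a segment."""
    return Counter(TAG_PATTERN.findall(segment))


def tags_preserved(source: str, target: str) -> bool:
    """True if the target carries the same tags and ids as the source, in any order."""
    return inline_tags(source) == inline_tags(target)


source = ('<x id="16480"/>SWord1 SWord2 SWord3 SWord4 '
          '<g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>')
target = ('<x id="16480"/>TWord1 TWord2 TWord3 TWord4 '
          '<g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>')

print(tags_preserved(source, target))  # True: tags were reordered, none lost
```

The check compares multisets of tags, so it accepts the reordering that word-order differences between languages require but flags a dropped or duplicated tag.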