TAUS MT SHOWCASE, Microsoft Translator, Chris Wendt, Microsoft, 10 October 2013



This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit. 

MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme. 

For the latest updates go to
or follow us on Twitter - #MosesCore

Published in: Technology, Business


  1. 1. TAUS MACHINE TRANSLATION SHOWCASE: Microsoft Translator. 09:30 – 09:50, Thursday, 10 October 2013. Chris Wendt, Microsoft
  2. 2. Microsoft Translator Chris Wendt TAUS MT Showcase October 10, 2013 - Santa Clara, California
  3. 3. Why MT? The purpose. The Crude: Extent of localization; Data Mining & Business Intelligence; Globalized NLP; Triage for human translation. Research: Machine Learning; Statistical Linguistics; Same-language translation. The Good: Breaking down language barriers; Text, Speech, Images & Video; Language Preservation. NOT: Spend less money; Take the job of human translators; Perform miracles.
  4. 4. Microsoft Translator – Quick Facts: Linguistically informed statistical MT system; 41 languages, from any language to any other language; Runs in Microsoft Datacenter; Simple web service API: SOAP, REST, AJAX, OData, web site widget; 2 million characters/month free; Available in the Enterprise Agreement, as a monthly subscription; For extreme confidentiality situations, available on-premise; Highly customizable: Collaborative Translations (involve community, coworkers and customers) and Hub (custom engine training via an easy-to-use UI); Web Scale: powers translations in Bing, Microsoft Office, Microsoft SharePoint, Internet Explorer, Yammer, and in Facebook, Twitter, eBay, and many other government and enterprise sites.
  5. 5. Microsoft Translator at a Glance. World-class Statistical Machine Translation: Built on over a decade of work at Microsoft Research. Big Data Powered: Trained with billions of “parallel” sentences (Bing index & licensed). General Purpose System: Powers Bing Translator, supports 40+ languages, any-to-any. Unprecedented Customization Capability: Hub train before translation + CTF edit after translation. Powerful Cloud API: Rich, secure API enabling integrations, 99.9% availability.
  6. 6. Enabling Translation in Many Products. Fully integrated across the stack, Translator extends the value of the Microsoft platform and of your solutions built on it, including consumer-facing applications such as Bing Translator, Bing Toolbar, Bing Dictionary, and Windows Phone App. A few of our customers and partners… +80,000 more.
  7. 7. Powerful Tools and Customization. Our machine learning & big-data based translation technology brings the power of instant translations to break down language barriers for users, developers, webmasters, translators and businesses. Robust, industry-leading tools such as the Hub and CTF allow for unprecedented customization of the translation experience. Powerful API: Instant translation and language services in web, desktop and mobile applications. Highly scalable and robust cloud-based machine-translation service from Microsoft. Supports SOAP, REST, AJAX, OData, and the Translator web site translation widget. Extensibility for development on SharePoint, Office, Windows Phone, and more. Widget: Instant translations of web pages without the need to write any code. Use the AJAX API to roll your own widget. Free with any level of Translator subscription (including the free tier). Hub: Custom translation portal to build, train, and deploy customized automatic language translation systems. Combine your data with Bing big data to tune the translation output to best fit your content. Import the edits back into Hub for further training. CTF: Override, modify or vote for the translated output to best fit the content. Provide the end-user alternative translations. Use the integrated “Collaborative Translations” (CTF) functionality to tap into your community.
  8. 8. Integrates with your TM tool: Top translation tools support Microsoft Translator
  9. 9. Give these a try! (Demo): Bing Translator; Lync Conversation Translator; Translator Widget for Webpages; Word Web App; Contextual Thesaurus
  10. 10. Price: Competitively priced. Monthly subscription; Free for up to 2 million characters per month; Base price: $10 per million characters; Discounted for higher volumes; Paid by credit card or via Microsoft Enterprise Agreement.
  11. 11. Extent of localization: Methods of applying MT. Post-Editing: Goal: human translation quality; Increase human translator’s productivity; In practice: 0% to 25% productivity increase, varies by content, style and language. Raw publishing: Goals: good enough for the purpose, speed, cost; Publish the output of the MT system directly to end user; Best with bilingual UI; Good results with technical audiences.
  12. 12. Extent of localization: Methods of applying MT. Post-Editing: Goal: human translation quality; Increase human translator’s productivity; In practice: 0% to 25% productivity increase, varies by content, style and language. Post-Publish Post-Editing (“P3”): Know what you are human translating, and why; Make use of community: domain experts, enthusiasts, employees, professional translators; Best of both worlds: fast, better than raw, always current. Raw publishing: Goals: good enough for the purpose, speed, cost; Publish the output of the MT system directly to end user; Best with bilingual UI; Good results with technical audiences.
  13. 13. The Triangle: Price, Quality, Speed. You can have only two. Not anymore! P3: Post-Publishing Post-Edit
  14. 14. The cost/quality curve: Optimize for the knee. [Figure: user satisfaction plotted against cost ($).] Cost levels: No cost: no translation; Low cost: MT + TM + community; High cost: fully qualified HT; Very high cost: expert-reviewed translation/transcreation. Curve labels: Highly visible marketing content; Low pageview supporting content; Good enough for the intended purpose.
  15. 15. Collaboration: MT + Your community. [Diagram labels: Translation Request; Response; Your community; Your Web Site; Your App; Microsoft Translator API; Microsoft Translator Collaborative TM; Enormous language knowledge; Match first; Translate if no match.] Collaborative TM entries: Rating 1 to 4: Unapproved; Rating 5 to 10: Approved; Rating -10 to -1: Rejected. 1 to many is possible. What makes this possible: fully integrated 100% matching TM.
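
The "match first, translate if no match" flow on slide 15 can be sketched as follows. This is a minimal illustration, not the Translator API: the in-memory dictionary, the function names `rating_status` and `translate`, and the `mt_fallback` callable are all hypothetical. Serving only approved entries, and picking the highest-rated one, is likewise an assumption; the slide gives only the rating ranges.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory stand-in for the Collaborative TM:
# (source text, target language) -> list of (translation, rating).
# One source may have many candidate translations ("1 to many").
CollabTM = Dict[Tuple[str, str], List[Tuple[str, int]]]


def rating_status(rating: int) -> str:
    """Map a rating to the status ranges listed on the slide."""
    if 5 <= rating <= 10:
        return "approved"
    if 1 <= rating <= 4:
        return "unapproved"
    if -10 <= rating <= -1:
        return "rejected"
    raise ValueError(f"rating outside the ranges on the slide: {rating}")


def translate(text: str, to_lang: str, tm: CollabTM,
              mt_fallback: Callable[[str, str], str]) -> str:
    """Match first: return an approved 100% match; otherwise call MT."""
    approved = [(translation, rating)
                for translation, rating in tm.get((text, to_lang), [])
                if rating_status(rating) == "approved"]
    if approved:
        # Assumption: prefer the highest-rated approved entry.
        return max(approved, key=lambda entry: entry[1])[0]
    # Translate if no match.
    return mt_fallback(text, to_lang)
```

For example, with `tm = {("Hello", "de"): [("Hallo", 6), ("Guten Tag", 3)]}`, a request for `"Hello"` returns the approved entry `"Hallo"`, while any text without an approved entry goes to the fallback.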
  16. 16. Making it easier for the approver – Pending edits highlight
  17. 17. Making it easier for the approver – Managing authorized users
  18. 18. Making it easier for the approver – Bulk approvals
  19. 19. What is Important? In this order: Quality; Access; Coverage
  20. 20. Measuring Quality: Human Evaluations. Knowledge powered by people. Absolute: 3 to 5 independent human evaluators are asked to rate translation quality for 250 sentences on a scale of 1 to 4, comparing to a human-translated sentence; no source language knowledge is required. Scale: 4 = Ideal (grammatically correct, all information included); 3 = Acceptable (not perfect, but definitely comprehensible, and with accurate transfer of all important information); 2 = Possibly Acceptable (may be interpretable given context/time, some information transferred accurately); 1 = Unacceptable (absolutely not comprehensible and/or little or no information transferred accurately). Also: relative evaluations, against a competitor or a previous version of ourselves.
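
One way to turn the evaluators' 1-to-4 ratings from slide 20 into a single number is sketched below. The slide does not say how scores are aggregated; averaging per sentence and then across sentences is an assumption, and the function name `absolute_score` is hypothetical.

```python
from statistics import mean
from typing import List

# Scale labels as given on the slide.
LABELS = {4: "Ideal", 3: "Acceptable", 2: "Possibly Acceptable", 1: "Unacceptable"}


def absolute_score(scores_by_evaluator: List[List[int]]) -> float:
    """Average 1-4 ratings per sentence, then across all sentences.

    scores_by_evaluator holds one list per evaluator, each with one
    rating per sentence, in the same sentence order.
    """
    lengths = {len(scores) for scores in scores_by_evaluator}
    if len(lengths) != 1:
        raise ValueError("every evaluator must rate the same sentences")
    for scores in scores_by_evaluator:
        if any(score not in LABELS for score in scores):
            raise ValueError("ratings must be 1, 2, 3 or 4")
    # zip(*...) regroups the ratings sentence by sentence.
    per_sentence = [mean(ratings) for ratings in zip(*scores_by_evaluator)]
    return mean(per_sentence)
```

In the setup the slide describes, the input would hold 3 to 5 lists of 250 ratings each.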
  21. 21. Measuring Quality: BLEU*. Cheap and effective, but be aware of the limits. A fully automated MT evaluation metric: modified n-gram precision, comparing a test sentence to reference sentences. Standard in the MT community: immediate, simple to administer; correlates with human judgments. Automatic and cheap: runs daily and for every change. Not suitable for cross-engine or cross-language evaluations. Results are always relative to the test set. * BLEU: BiLingual Evaluation Understudy
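
The "modified n-gram precision" that slide 21 names can be sketched in a few lines. This shows only that one component: full BLEU also combines the precisions for several n-gram orders and applies a brevity penalty, both omitted here. The function names are illustrative, and tokenization is assumed to be plain whitespace splitting.

```python
from collections import Counter
from typing import List, Sequence


def ngram_counts(tokens: Sequence[str], n: int) -> Counter:
    """Count the n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate: Sequence[str],
                       references: List[Sequence[str]], n: int) -> float:
    """Clipped n-gram precision of a candidate against reference sentences."""
    cand = ngram_counts(candidate, n)
    if not cand:
        return 0.0
    # For each n-gram, the most times it occurs in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngram_counts(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    # Clip each candidate count so repeated words cannot inflate the score.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())
```

With the candidate `"the the the the the the the".split()` and the references `"the cat is on the mat".split()` and `"there is a cat on the mat".split()`, the unigram result is 2/7: the word "the" appears seven times in the candidate but at most twice in any one reference, so only two occurrences are credited.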
  22. 22. Measuring Quality In Context: Real-world data. Instrumentation to observe user’s behavior; A/B testing; Polling. In-Context gives you the most useful results.
  23. 23. Knowledge Base (since 2003)
  25. 25. Knowledge base feedback
  26. 26. Knowledge Base Resolve Rate. [Chart series: Human Translation; Machine Translation.] Source: Martine Smets, Microsoft Customer Support. Microsoft is using a customized version of Microsoft Translator.
  27. 27. Statistical MT - The Simple View. [Diagram labels. Data sources: Government data; Microsoft manuals; Dictionaries; Phrasebooks; Publisher data; Web mined data. Training: Collect and store parallel and target language data; Train statistical models; High-Performance Computing Cluster; Translation Engine. Runtime: User Input (text, web pages, chat, etc.); Translation APIs and UX; Translation Engine; Distributed Runtime; Translated Output.]
  28. 28. Collaboration: MT + Your community. [Diagram labels: Your community; Your Web Site; Your App; Microsoft Translator API; Microsoft Translator Collaborative TM; Enormous language knowledge.] Remember the collaborative TM? There is more.
  29. 29. Collaboration: You, your community, and Microsoft. You, your community and Microsoft working together to create the optimal MT system for your terminology and style. [Diagram labels: Your community; Your Web Site; Your App; Your collaborators; Your TMs; Your previously translated documents; Microsoft Translator API; Microsoft Translator Collaborative TM; Microsoft Translator Hub; Your custom MT system; Enormous language knowledge.]
  31. 31. Just visit to do it yourself
  32. 32. Office 2013 Beta Send-a-smile program: 107 languages; 234M words translated; $22B revenue, > 60% outside U.S.; > 100,000 Send-a-smiles received; > 500 bugs fixed. Example of Business Intelligence use.
  33. 33. Contacts: Web site; Licensing & Pricing Questions; General & Customer Questions