
TAUS MT Showcase, Sovee Smart Engine 2.0, A Leap Beyond Base Moses Technology, Scott Gaskill, Sovee

This month marks the advent of a new generation in Machine Translation. With the release of Sovee Smart Engine 2.0, it is now possible to process virtually unlimited simultaneous transactions without the limitations originally inherent to the base Moses technology. Sovee's latest development delivers an unprecedented 500 language engines, which will expand to thousands of languages in the next few years. This workshop will demonstrate the automated language tuning and training capabilities of Sovee Smart Engine 2.0. It will highlight the deep cascading framework that delivers the highest level of accuracy ever imagined for machine translation, and a new combined process for SMT and post-editing.
This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit.

MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
For the latest updates go to
or follow us on Twitter - #MosesCore



  1. Wednesday, 4 June. Sovee Smart Engine 2.0: A Leap Beyond Base Moses Technology. Scott Gaskill, Sovee. TAUS Machine Translation Showcase 2014, Dublin (Ireland). The research within the project MosesCore leading to these results has received funding from the European Union 7th Framework Programme, grant agreement no 288487.
  2. Presented by: Scott Gaskill, Christopher Klapp. June 4, 2014. MT Showcase
  3. "I skate to where the puck is going to be, not where it has been." Wayne Gretzky, Hockey Star
  4. Where is the world going? By 2016 the world will have internet connectivity. By the end of this decade everyone in the world will be on the Web, with mobile access growing as the preferred interface. In Kenya, 99% of Internet connections are mobile. Sources: CNNTech, "Google boss: Entire world will be online by 2020," April 2013, http://-schmidt-internet; Kenya stat from ITU, 2-13. Photo used by permission of Deseret News.
  5. "We are entering the Convergence era: translation will be a utility embedded in every app, device and screen. Businesses will prosper by finding new customers in new markets…. Consumers will become world-wise, communicating as if language barriers never existed." Jaap van der Meer, Director of TAUS, 2013
  6. Translation Memory: Is More Better? If we simply add an additional 1,000 TM lines to a database of 40-60 billion, will we see better translations? Knowing how to use the data is key.
  7. Challenges: Technology, approach & process
     Progress in first 60 years | Progress needed by 2016
     Engines for < 150 languages | Engines for > 6000 languages
     < 3% of the world's content translated | All content translated
     Cloud-based speed providing more servers for translation | 92 billion servers
     Statistical translation introduced, but "fuzzy logic" does not deliver quality businesses need | Quality improvement to standards required to meet world commerce demand
  8. Technology Challenge: Minimum Server Requirements, 6800 languages, 4 × (n(n-1) / 2)
     MT Assets (cascades) | Minimum servers
     Generic SMT | 92 million
     Generic SMT, Domain | 9.2 billion (based on 100 businesses)
     Generic SMT, Domain, Customer | 92 billion (based on 1000 customers)
     Generic SMT, Domain, Customer, Project | Not valued as practical (infinite servers required)
  9. Accuracy Challenge. Relevant Segments. General Corpus. Adequacy. Accuracy. General MT (30-40%), TM (40-60%), Post Editing (up to 100%)
  10. Post Editing. Learning Engine. SMT Workflow.
     Preparing new project / import TM / CAT; Leverage Exact Fuzzy Match; Post Edit; Review; Deliver to customer.
     Gather past TM; Package and send TM to SMT provider; Clean, tokenize data (prepare data); Train, Tune, Test (3Ts); Repeat until viewed as acceptable (repeat with customer data each time).
     Segments are not just a string of text: they are living, learning entities.
     Process: Real-time Automation and Integration. Sovee Smart Engine 2.0
  11. Smart Engine Advantages: Language from Scratch; Seamless integration to Post Editing workflow; Training / Learning. Efficiency Gains (what we have seen): Post editing, 50%+ improvement; TM / MT management and training, 100% improvement. Update MT on the fly. Watch it learn before your eyes. Never leave the post editing environment.
  12. Learned Translations. Relevant Segments. General Corpus. Adequacy. Accuracy. General MT. Domain. Organization. Customer. Project. Asset Tags. Cascading Assets. Sovee Smart Engine MT. Learned Segments. Segment output.
  13. Asset Synchrony (CAT Tools). Post editing interface. Smart Engine. Asset Push (Past TM). Real-time progressive translation cycle (Sovee MT, save / push post edits).
  14. Demo
  15. Seamless Integration. "Convergence Era": Apps, Websites, eCommerce, elearning, Videos, Podcasts, Software, Live chat, Text Messages, email
  16. Sovee Smart Engine Translation. Yukiko (Japan): ホールインワンを決めたよ! Robert (USA): I just scored a hole-in-one! Original: ホールインワンを決めたよ! SNAG
  17. Jack Nicklaus Learning Leagues. Languages: Spanish and Japanese. In Process: 10 more languages. Video and Training Materials for Golf Instruction
  18. R.E. Michel
  19. Questions?
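
The server figures on slide 8 can be reproduced with a short calculation. The Python sketch below is only a check of the slide's arithmetic, not Sovee's sizing method. It assumes that n(n-1)/2 counts unordered language pairs among the 6800 languages, and that the larger figures are the generic count multiplied by the 100 businesses and 1000 customers named on the slide. The factor of 4 is taken from the slide as-is; the slide does not say what it represents. The function and variable names are illustrative.

```python
from math import comb

LANGUAGES = 6800  # language count shown on slide 8
FACTOR = 4        # multiplier shown on slide 8; its meaning is not stated there


def generic_smt_servers(n_languages: int, factor: int = FACTOR) -> int:
    """Evaluate the slide's expression factor * n(n-1)/2."""
    # comb(n, 2) equals n(n-1)/2, the number of unordered language pairs.
    return factor * comb(n_languages, 2)


generic = generic_smt_servers(LANGUAGES)
per_domain = generic * 100      # assumption: "based on 100 businesses"
per_customer = generic * 1000   # assumption: "based on 1000 customers"

print(f"Generic SMT:      {generic:,}")
print(f"Domain cascade:   {per_domain:,}")
print(f"Customer cascade: {per_customer:,}")
```

Under these assumptions the results are about 92 million, 9.2 billion and 92 billion, which matches the rounded figures on the slide. The growth is quadratic in the number of languages, which is why each added cascade level multiplies an already very large base.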