TAUS MT SHOWCASE, Moses and Other Open Resources, Maxim Khalilov, TAUS Labs, 12 June 2013



This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit.

MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.

For the latest updates, follow us on Twitter - #MosesCore

Published in: Technology


  1. TAUS Machine Translation Showcase
     Moses and Other Open Resources
     09:40 – 10:00, Wednesday, 12 June 2013
     Maxim Khalilov, TAUS Labs
  2. MT open resources: Tools, Data, Education and trainings, Support
  3. MT open resources: Tools, Data, Education and trainings, Support
  4. MT open resources: tools
     o  Open source MT toolkits:
        o  Moses (University of Edinburgh, UK + others)
        o  Joshua (JHU, USA)
        o  Cdec (CMU, USA)
        o  NiuTrans (Northeastern University, China)
        o  Apertium (University of Alicante, Spain)
        o  etc.
     o  Free MT support tools:
        o  Word alignment (GIZA++, MGIZA++, Berkeley Aligner, etc.)
        o  Language-dependent tools (tokenizers, segmenters, parsers, ...)
        o  MT evaluation tools (BLEU, TER, METEOR, etc.)
        o  Many more…
  5. MT open resources: tools
     TAUS Tracker: http://www.taustracker.com/
  6. MT open resources: Tools, Data, Education and trainings, Support
  7. MT open resources: data
     Governmental resources:

     | Name       | Description                         | Domain                 | Aligned data (average) | Languages                                          |
     |------------|-------------------------------------|------------------------|------------------------|----------------------------------------------------|
     | Europarl   | European Parliament Proceedings     | Legal/"General Domain" | 1.8 million sentences  | 11 European languages                              |
     | JRC-Acquis | EU laws                             | Legal                  | 270 000 paragraphs     | 22 European languages                              |
     | Hansards   | Canadian Parliament Proceedings     | Legal/"General Domain" | 1.3 million sentences  | North American English, French                     |
     | UN         | Resolutions of the general assembly | Legal                  | 3 million words        | English, French, Spanish, Russian, Chinese, Arabic |
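  8. MT open resources: data
     Academic resources:

     | Name | Description                              | Domain                       | Languages                    |
     |------|------------------------------------------|------------------------------|------------------------------|
     | OPUS | Free corpora collected by Jörg Tiedemann | IT, movie subtitles, medical | European, non-European for IT |
     | LDC  | Linguistic Data Consortium (US)          | News                         | English, Chinese, Arabic, …  |
     | ELRA | European Language Resources Association  |                              | European                     |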
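  9. MT open resources: data
     Industrial resources:

     | Name       | Description                                                            | Domain                                    | Languages           |
     |------------|------------------------------------------------------------------------|-------------------------------------------|---------------------|
     | TAUS Data* | TAUS Data Repository                                                   | Several, with a slant to IT               | All major languages |
     | TMs        | Translation Memories: your own, from your customer, from your supplier | Project-specific (great for v2.0 or later) |                     |

     * Open for the participants of the TAUS Developing Talent project.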
  10. MT open resources: data
      o  2,200 language pairs
      o  17 industry categories
      o  more than 54 billion words
  11. MT open resources: Tools, Data, Education and trainings, Support
  12. MT open resources: education and trainings
      o  TAUS MT and Moses tutorial
      o  Online courses (Coursera.org, Stanford NLP course)
      o  TAUS Developing Talent project
      o  Machine Translation Marathons
      o  Other online resources (JHU MT class, UPC practical tutorial, UEdin MT class)
  13. MT open resources: TAUS MT and Moses Tutorial
      o  https://tauslabs.com/open-source-mt/mosescore/50-moses-tutorial-guest
      o  Online tutorial
      o  Narrated presentations
      o  Step-by-step screencasts
      o  Technical audience
      o  Learn about statistical MT and its practical application, using Moses as the example
  14. MT open resources: TAUS MT and Moses Tutorial

      | Topic                                       | Moses-specific | Presentation/Demo |
      |---------------------------------------------|----------------|-------------------|
      | Principles of Machine Translation           | No             | Presentation      |
      | Training Data                               |                |                   |
      | Data Types and Sources                      | No             | Presentation      |
      | Data Conversion and Corpus Preparation      | No             | Demo              |
      | Data Cleaning and Tokenization              | No             | Presentation      |
      | Data Cleaning and Tokenization Demo         | No             | Demo              |
      | Training Moses MT Systems                   |                |                   |
      | Moses Introduction                          | Yes            | Presentation      |
      | Training a Moses MT System                  | Yes            | Demo              |
      | Bulk Translation and MT System Optimization | Yes            | Demo              |
  15. MT open resources: TAUS MT and Moses Tutorial

      | Topic                                             | Moses-specific | Presentation/Demo |
      |---------------------------------------------------|----------------|-------------------|
      | Evaluating MT Systems                             |                |                   |
      | Automatic Metrics                                 | No             | Presentation      |
      | Human Evaluation                                  | No             | Presentation      |
      | Integration                                       |                |                   |
      | Document Translation and Integration Scenarios    | Yes            | Presentation      |
      | Document Translation and Web API Demo             | Yes            | Demo              |

      o  More to come:
         o  Demos
         o  In-depth Info
         o  Commercial Vendor Presentations
  16. MT open resources: TAUS MT and Moses Tutorial
  17. MT open resources: TAUS MT and Moses Tutorial
  18. MT open resources: TAUS MT and Moses Tutorial
  19. MT open resources: TAUS MT and Moses Tutorial
  20. MT open resources: TAUS Developing Talent
  21. MT open resources: TAUS Developing Talent
  22. MT open resources: TAUS Developing Talent
  23. MT open resources: Tools, Data, Education and trainings, Support
  24. MT open resources: Support
      o  Moses support list
         o  http://mailman.mit.edu/mailman/listinfo/moses-support
      o  EAMT MT list
         o  http://www.eamt.org/mt-list.php
      o  Corpora list
         o  http://www.hit.uib.no/corpora/
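
The tutorial outline on slide 14 lists "Data Cleaning and Tokenization" as a step before training a Moses system. The slides do not show what that cleaning involves, so the following is only a minimal sketch of one common form of it: dropping sentence pairs that are empty, too long, or badly mismatched in length. The function name `clean_parallel_corpus` and the thresholds (1 to 80 tokens, length ratio of at most 9) are assumptions chosen for illustration. Moses ships its own scripts for this step, and this sketch is not a reproduction of them.

```python
def clean_parallel_corpus(pairs, min_len=1, max_len=80, max_ratio=9.0):
    """Filter (source, target) sentence pairs by token count.

    Hypothetical helper for illustration; the thresholds are assumptions,
    not values taken from the slides.
    """
    kept = []
    for src, tgt in pairs:
        # Whitespace tokenization; str.split() also collapses repeated spaces.
        src_tokens = src.split()
        tgt_tokens = tgt.split()
        n_src, n_tgt = len(src_tokens), len(tgt_tokens)

        # Drop pairs where either side is too short or too long.
        if not (min_len <= n_src <= max_len and min_len <= n_tgt <= max_len):
            continue

        # Drop pairs whose lengths differ by more than max_ratio.
        # Written as a multiplication to avoid dividing by zero.
        if max(n_src, n_tgt) > max_ratio * min(n_src, n_tgt):
            continue

        kept.append((" ".join(src_tokens), " ".join(tgt_tokens)))
    return kept


if __name__ == "__main__":
    sample = [
        ("Hello  world .", "Bonjour le monde ."),  # kept, spacing normalized
        ("", "Texte sans source ."),               # dropped: empty source
        ("Yes", "Oui " * 20),                      # dropped: 1 vs 20 tokens
    ]
    for src, tgt in clean_parallel_corpus(sample):
        print(src, "|||", tgt)
    # prints: Hello world . ||| Bonjour le monde .
```

Filtering on token counts alone is deliberately crude. It removes obvious misalignments cheaply, but it does not detect pairs that are the right length and still not translations of each other.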