Moses on Amazon EC2


This presentation describes the advantages of running the open-source Moses Statistical Machine Translation System on Amazon EC2.

Published in: Technology

  1. Moses on Amazon EC2
      • Achim Ruopp, Digital Silk Road
  2. Moses Statistical Machine Translation + Digital Silk Road
  3. Who this Presentation is for
      • Moses users looking to save infrastructure costs
      • Small/medium size companies
          • Just starting to use SMT
          • Without the infrastructure investment
      • Large users of SMT
          • Looking to scale their infrastructure to meet changing needs
  4. Amazon Web Services
      • Infrastructure web services platform in the cloud
      • Pay only for what you use
      • Inherently scalable
      • Secure & Private
  5. AWS Management Console
  6. Ingredients for Successful Statistical Machine Translation
  7. Parallel Corpora/Text
      • Criteria
          • Sentence-aligned
          • The more, the better
          • The cleaner, the better
      • Sources
          • Freely available (Europarl, Hansards, Opus)
          • Linguistic Data Consortium
          • TAUS Data Association
          • Your own Translation Memories (cleaned up)
          • Tailor-made from the web via
  8. Moses Statistical Machine Translation System
      • Academic tool, but stable & well supported
      • Open source under LGPL license
      • Required additional tools
          • GIZA++ aligner
          • IRSTLM language model toolkit (open source alternative to SRILM)
          • Miscellaneous other tools (e.g. NIST BLEU tool)
      • No setup needed!
          • Part of free Amazon EC2 machine image
  9. Moses on Amazon EC2
      • Training on a large EC2 instance
          • Multi-core with 7.5GB of memory
          • Necessary for fast training with large corpora
          • Cost: $0.34/hour from Nov 1st
          • Provision as many as needed in minutes
          • Try variations of training data/parameters at the same time
      • Decoding/Translating
          • On a large or small EC2 instance
          • On your own hardware
              • Linux/Unix/Mac/(Windows)
  10. Moses +
      • [Architecture diagram; labels: The Web, Source Text, Parallel Corpus, Search Index, Search UI, S3, Trained MT System, S3, EC2, Corpora, Moses]
  11. Translating KDE Documentation from English to German
      • Tuning and test data from K Desktop Environment documentation
  12. Training Infrastructure Cost Comparison
      • Excludes costs for corpus gathering, corpus cleaning, operation and decoding
  13. Using Moses on Amazon Web Services for Machine Translation
      • Pros
          • No need for hardware setup and maintenance
          • No need for software setup and maintenance
          • No need for in-depth Unix expertise
          • Pay only variable costs
          • Less expensive than owned hardware in most use cases
          • Evaluate many training data combinations/parameter settings fast
      • Cons
          • Need for Amazon Web Services expertise
          • Unsuitable for language pairs with few resources
              • True for all statistical MT
  14. More Information
      • Digital Silk Road Services for Moses on EC2
          • Support
          • Training
          • Custom MT Engines with your Data
          • Integration Consulting
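
As a rough illustration of the pay-per-use pricing mentioned in the slides, the following Python sketch estimates the instance cost of running several training variations in parallel at the quoted rate of $0.34/hour for a large instance. The function name `training_cost`, its parameters, and the per-started-hour billing rule are assumptions made for this example, not part of the presentation; storage and data transfer charges are left out.

```python
import math

# Hourly rate for a large EC2 instance as quoted in the slides; rates change over time.
LARGE_INSTANCE_USD_PER_HOUR = 0.34


def training_cost(hours_per_run, parallel_runs=1, rate=LARGE_INSTANCE_USD_PER_HOUR):
    """Estimate the instance cost in USD for training runs executed in parallel.

    Assumption: each instance is billed per started hour, so partial
    hours are rounded up. Storage and data transfer are not included.
    """
    if hours_per_run < 0 or parallel_runs < 1:
        raise ValueError("hours_per_run must be >= 0 and parallel_runs >= 1")
    billed_hours = math.ceil(hours_per_run)  # round partial hours up
    return round(billed_hours * parallel_runs * rate, 2)


if __name__ == "__main__":
    # Hypothetical scenario: four training variations, each running about 10 hours.
    print(training_cost(hours_per_run=10, parallel_runs=4))
```

Because each variation runs on its own instance, trying four configurations at once costs four times as much per hour but finishes in the time of a single run, which is the trade-off the "provision as many as needed" point relies on.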