TAUS USER CONFERENCE 2010, Getting started with Moses: An Adobe case study

Jeff Rueppel, Translation Technologies Engineer, Adobe

Jeff will present the process for getting up and running with Moses, detailing the requirements in terms of man-hours, skill sets, computing resources, and data. He will then outline the results of a recent comparative exercise between three commercial offerings and Adobe’s homegrown Moses engines.

Published in: Technology

  1. TAUS USER CONFERENCE 2010: LANGUAGE BUSINESS INNOVATION. 4 – 6 OCTOBER / PORTLAND (OR), USA. TUESDAY 5 OCTOBER / 15.15. GETTING STARTED WITH MOSES: AN ADOBE CASE STUDY. Jeff Rueppel, Adobe
  2. Getting Into Moses – An Adventure Story. The big unanswered questions: What can Moses do? What can we do with Moses? And... how much effort and how much time will it take to get Moses to do what we want?
  3. Getting Into Moses – An Adventure Story. Human Resources / Machine Resources. Time Line: • 1-2 Weeks: Installation / Technical Fluency • 3-4 Weeks: TMX Conversion / Corpus Cleaning • 3-4 Weeks: Tools And Process Refinement
  4. 3 Phases To The Moses Story. Phase 1: Tame Your Corpus • Large Corpus Size • 24 Million Words • 36 Languages • Quantity Vs. Quality • Large File Handling • Cleaning • Tooling
  5. Phase 1: Tame Your Corpus – Moses Corpus Tool. TMX -> Moses Corpus Tool. Moses Functionality: • Tokenizing • Casing (Lower, Upper) • Cleaning Long Segments. Adobe Functionality: • Placeholder Handling • URL Handling • Number Cleaning • Duplicate Line Cleaning • Clean Weird Aligned Pairs
  6. 3 Phases To The Moses Story. Phase 1: Tame Your Corpus. Phase 2: Train Your Engines
  7. Phase 2 – Train Your Engines. Human Resources / Hardware Resources. 6 Million Words, 4 GHz Machine, 4 GB: 2 Hours Training & 12 Hours Tuning =
  8. Phase 3: Deploy Your Engines • World Server Integration • Web Service Integration • Engine Hosting • Training Server Harness • Evaluation Harness
  9. 3 Phases To The Moses Story. Phase 1: Tame Your Corpus. Phase 2: Train Your Engines. Phase 3: Deploy Your Engines
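
The "TMX -> Moses Corpus Tool" step on slide 5 is not shown in the deck, so the following is only a minimal sketch of what a TMX conversion could look like, using Python's standard `xml.etree.ElementTree`. The function name `tmx_to_pairs`, the sample data, and the decision to flatten inline markup into plain text are all assumptions made for illustration; they are not Adobe's tool.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the xml:lang attribute to this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: one complete translation unit, one with no French side.
SAMPLE_TMX = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en-US"><seg>Open file</seg></tuv>
    <tuv xml:lang="fr-FR"><seg>Ouvrir le fichier</seg></tuv></tu>
<tu><tuv xml:lang="en-US"><seg>Save</seg></tuv></tu>
</body></tmx>"""


def tmx_to_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment pairs for one language pair."""
    src_lang, tgt_lang = src_lang.lower(), tgt_lang.lower()
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            # Some TMX files use a plain "lang" attribute instead of xml:lang.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Simplification: inline elements are flattened to their text,
                # and runs of whitespace are collapsed to single spaces.
                segs[lang] = " ".join("".join(seg.itertext()).split())
        # Keep only units that have a non-empty segment on both sides.
        if segs.get(src_lang) and segs.get(tgt_lang):
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


if __name__ == "__main__":
    for src, tgt in tmx_to_pairs(SAMPLE_TMX, "en-US", "fr-FR"):
        print(src, "|||", tgt)
```

Moses training reads line-aligned parallel files, so in practice each pair would be written as one line to a source-language file and the matching line to a target-language file. For a corpus of the size mentioned on slide 4, a streaming parser such as `ET.iterparse` would likely be a better fit than loading the whole document at once.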

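The cleaning steps listed on slide 5 (placeholder handling, URL handling, number cleaning, duplicate line cleaning, long-segment removal, and dropping oddly aligned pairs) can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the regular expressions, the `__URL__`/`__PH__`/`__NUM__` tokens, and the `max_tokens` and `max_ratio` thresholds are hypothetical and are not taken from Adobe's tool or from Moses. Tokenizing and casing are not covered.

```python
import re

# Hypothetical patterns; a real corpus would need project-specific ones.
URL_RE = re.compile(r"https?://\S+")
PLACEHOLDER_RE = re.compile(r"\{\d+\}|%[sd]")
NUM_RE = re.compile(r"\d+(?:[.,]\d+)*")

# Hypothetical sample pairs (source, target).
SAMPLE_PAIRS = [
    ("Open file {0} at http://example.com/a1",
     "Ouvrir le fichier {0} sur http://example.com/a1"),
    ("Open file {0} at http://example.com/a1",
     "Ouvrir le fichier {0} sur http://example.com/a1"),  # exact duplicate
    ("Save", "Enregistrer le document en cours maintenant"),  # lengths differ a lot
    ("Page 12 of 30", "Page 12 sur 30"),
    ("", "vide"),  # empty source side
]


def normalize(segment):
    """Replace URLs, placeholders, and numbers with generic tokens."""
    segment = URL_RE.sub("__URL__", segment)
    # Placeholders go before numbers so "{0}" is not split into "{__NUM__}".
    segment = PLACEHOLDER_RE.sub("__PH__", segment)
    segment = NUM_RE.sub("__NUM__", segment)
    return " ".join(segment.split())


def clean_corpus(pairs, max_tokens=80, max_ratio=3.0):
    """Filter a list of (source, target) pairs; thresholds are illustrative."""
    seen = set()
    cleaned = []
    for src, tgt in pairs:
        src, tgt = normalize(src), normalize(tgt)
        src_len, tgt_len = len(src.split()), len(tgt.split())
        if src_len == 0 or tgt_len == 0:
            continue  # one side is empty
        if src_len > max_tokens or tgt_len > max_tokens:
            continue  # long segment
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_ratio:
            continue  # length ratio suggests a bad alignment
        if (src, tgt) in seen:
            continue  # duplicate pair
        seen.add((src, tgt))
        cleaned.append((src, tgt))
    return cleaned


if __name__ == "__main__":
    for src, tgt in clean_corpus(SAMPLE_PAIRS):
        print(src, "|||", tgt)
```

A token-count ratio is only one possible test for a badly aligned pair; character-length ratios or language identification are common alternatives, and the deck does not say which approach Adobe used.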