MT Experiences At Sybase  Kerstin Bier  Manuel Herranz PangeaMT

kerstin bier, localization world barcelona, manuel herranz, mt, pangeanic, sybase


Co-presentation by Kerstin Bier and Manuel Herranz at Localization World Barcelona 2011 on the progress achieved with a customized PangeaMT engine at Sybase. It covers the initial machine translation implementation, machine translation customization for Sybase, the use of the client's data for training, and productivity results.

Published in: Technology


  1. MT Experiences at Sybase. Kerstin Bier, Manuel Herranz (PangeaMT)
  2. MT at Pangeanic: From Trial to Service
     Periods: 2007/08, 2009/10, 2011/12
     • DIY SMT
     • Empower users
     • Glossary
     • Automated re-training
     • Transfer architecture and know-how to users
     • Compatibility with commercial formats (ttx, sdlxliff, itd)
     2007 and before
     • RB tests with commercial software
     • Insufficiently good output
     • Only internal production
     • EU Post-Editing Award
     • V1: Small data sets (2-5M words), automotive & electronics (ES), then Fr/It/De in other fields
     • Division born
     • 00's of engine trials and language combinations
     • Open-source to commercial
     • TMX / XLIFF workflows
  3. MT and PE at Sybase: From Trial to Production
     • 2009/2010: Trial project with PangeaMT (EN-DE). Engine: 2.5 million words, narrow domain (one product). Results: surprisingly good (BLEU: 49, PE productivity > 70%).
     • 2010: Project 1, MT and post-editing (EN-DE). Engine: 5 million words. Major new release, lots of new content: 400,000 "new" words post-edited.
     • 2010/2011: Project 2, retraining, MT + PE. Retraining with post-edited and cleaned-up TMs. Small product update: 80,000 "no match" words, MT + PE.
  4. Expectations
     • MT output gets better over time
     • Continuous PE productivity increase
     • Shorter turnaround times
     • Cost savings go up over time
     [Chart: expected output (% of MT words) for the initial system, Retraining 1, Retraining 2, …]
  5. Main Challenges
  6. Results: From the "human" perspective
  7. Results: The MT perspective
     [Chart labels: Project 1, Project 2, Retraining. METEOR score ranges: 100-70, 50-69, 40-49, 30-39, 0-29]
  8. Examples: MT output and PE effort
     [Example categories: minimal PE effort, small PE effort, medium PE effort]
  9. Lessons Learned
     • Small in-domain MT engine = excellent starting point
        • For future projects: faster turnaround, lower costs
        • Other product lines
        • Experiences help with other languages
     • MT output better than expected
        • Often better than translators said
        • Improved after retraining
     • We think that improving translator acceptance will improve productivity
        • Idea: filtering out poor translations (confidence scores)
        • Retraining, retraining, retraining
  10. Predictions (PangeaMT)
     [Chart: user empowerment by year, 2009 to 2018. Year 2016: 000's of customized MT systems]
     • Tech. not the realm of a few providers
  12. Thank you! Kerstin Bier (Sybase, An SAP Company), Manuel Herranz (PangeaMT)
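
The slides group METEOR scores into ranges (slide 7) and float the idea of filtering out poor translations by confidence score (slide 9), but show neither procedure. The following is only a minimal sketch of how both could be done, assuming per-segment scores on a 0-100 scale and a per-segment confidence value; the function names, the threshold, and the data layout are hypothetical and not taken from the presentation or from PangeaMT.

```python
from collections import Counter

# Score bands as listed on slide 7 (the top band is written "100-70" there).
# Each entry is (label, inclusive lower bound). Assumption: scores are on a
# 0-100 scale; a fractional score such as 69.5 falls into the "50-69" band.
BANDS = [("70-100", 70), ("50-69", 50), ("40-49", 40), ("30-39", 30), ("0-29", 0)]


def band_for(score):
    """Return the label of the band that a 0-100 score falls into."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for label, lower in BANDS:
        if score >= lower:
            return label


def band_shares(scores):
    """Return the percentage of segments that fall into each band."""
    counts = Counter(band_for(s) for s in scores)
    total = len(scores)
    return {
        label: (100.0 * counts[label] / total if total else 0.0)
        for label, _ in BANDS
    }


def split_by_confidence(segments, threshold):
    """Split (mt_text, confidence) pairs into (keep, discard) lists.

    Segments below the (hypothetical) threshold would be withheld from
    post-editors and translated from scratch instead.
    """
    keep = [seg for seg in segments if seg[1] >= threshold]
    discard = [seg for seg in segments if seg[1] < threshold]
    return keep, discard
```

Comparing the output of `band_shares` for two projects would give a per-band breakdown of the kind slide 7 appears to chart; where a useful confidence threshold lies would have to be determined empirically.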
