
2012 MosesCore GALA Monaco: Friendly Machine Translation



Presentation by tauyou language technology at the annual GALA conference, in the event organized by TAUS for the MosesCore project.

Published in: Technology, Business


  1. Friendly machine translation. Diego Bartolomé, CEO. © 2012
  2. Outline: before starting with machine translation; what happens when you go live; how to minimize the risks; practical hints + some numbers.
  3. Is machine translation for us? <LSP>: translation memories, previous documents, websites of clients, language-specific rules, TAUS data. <tauyou>: open-source corpora, documentation alignment, public information, programming of rules, terminology extraction. <some issues>: minimum amount of data, need for data classification, language pairs.
  4. For sure it is! <data cleaning + selection>: translation tables and language models; data and parameters for tuning; test measures. <engines creation>: several + pruning afterwards. <engine validation>: by professional translators. <continuous improvement>: new files, new corpora, new rules, etc.
  5. The production process (I), up to statistical MT decoding: convert file format, segment text, NLP tasks, tokenize, rewrite source, lowercase.
  6. The production process (II), from statistical MT decoding to the translated file: reformat, detokenize, rewrite target, uppercase, evaluate.
  7. Risk minimization. <tauyou>: quality metrics computation. <LSP>: time and cost analysis. <LSP> + <tauyou>: track the evolution over time.
  8. Practical hints: bigger clients; languages; with highest translation volumes; with similar structure; with specific terminology/needs; MT-friendly translators; start moving.
  9. Some numbers: more than 1,500 million words per month in Latin languages (ES, FR, PT, CA, GA, IT, RO); EN as source or target is the star (ES, FR, DE, PT, IT, DA, SV, ZH, AR, JP...); LSPs are translating +3 million words per month; investment pays off if you translate +50,000 words per month.
  10. Thanks! Diego Bartolomé, PhD. <address> C/ Les Planes 39 – 08201 Sabadell – Spain. <phone> +34 93 711 29 96. <cell> +34 670 331 225. <email> <www>
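
The production process on slides 5 and 6 wraps the statistical MT decoder in text pre-processing and post-processing steps. The Python sketch below is only an illustration of that idea, not tauyou's implementation. The function names, the dictionary-based rewrite rules, the regex tokenizer, and the exact order of the post-processing steps are all assumptions. The decoder is a placeholder that returns its input unchanged, where a real setup would call an MT engine such as Moses. File conversion, segmentation, other NLP tasks, and evaluation are left out.

```python
import re


def tokenize(text):
    # Split words from punctuation with a simple regex
    # (a stand-in for a real tokenizer; an assumption of this sketch).
    return re.findall(r"\w+|[^\w\s]", text)


def rewrite(tokens, rules):
    # Apply hypothetical token-level rewrite rules given as a dict.
    return [rules.get(token, token) for token in tokens]


def lowercase(tokens):
    return [token.lower() for token in tokens]


def placeholder_decode(tokens):
    # Placeholder for statistical MT decoding: returns the input unchanged.
    return list(tokens)


def uppercase(tokens):
    # Naive recasing: capitalize only the first token of the segment.
    if not tokens:
        return []
    first = tokens[0][:1].upper() + tokens[0][1:]
    return [first] + list(tokens[1:])


def detokenize(tokens):
    # Join tokens and remove the space before common punctuation marks.
    text = " ".join(tokens)
    return re.sub(r"\s+([.,;:!?])", r"\1", text)


def translate_segment(segment, source_rules, target_rules,
                      decoder=placeholder_decode):
    # Pre-processing: tokenize, rewrite source, lowercase.
    tokens = lowercase(rewrite(tokenize(segment), source_rules))
    # Decoding step (placeholder by default).
    output = decoder(tokens)
    # Post-processing (order assumed): uppercase, rewrite target, detokenize.
    return detokenize(rewrite(uppercase(output), target_rules))


if __name__ == "__main__":
    print(translate_segment("The colour, please!", {"colour": "color"}, {}))
```

Passing the decoder as a parameter keeps the pre-processing and post-processing steps independent of the engine, so the same wrapper could be reused across engines or language pairs.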