2011 Tekom Wiesbaden: Implementation of a machine translation engine at CPSL


Presentation by CPSL and tauyou at the tekom annual conference. It presents a case study of a successful machine translation implementation at a mid-size language service provider.

Published in: Technology, Business

  1. Implementation of a Machine Translation Engine at CPSL. Speaker: Belén García-Ochoa (CPSL). Co-speaker: Diego Bartolomé (tauyou <language technology>).
  2. The speaker: Belén García-Ochoa, Localization Director at CPSL. CPSL has been a multilingual service provider since 1963. Headquarters in Barcelona, Spain. Other offices in Madrid (Spain), Germany, and the UK. CPSL staff includes over 50 people.
  3. The co-speaker: Diego Bartolomé, CEO of tauyou <language technology>. tauyou has provided language technologies for the localization industry since 2006. Main clients: medium-sized LSPs. Headquarters in Barcelona.
  4. CPSL and Machine Translation: post-editing services provided to a software company for a huge project; lots of translated words in a tight timeframe.
  5. Main difficulties found: lots of clients; different subject matters; different language combinations.
  6. Workaround. Lots of clients: a list of the most appropriate clients for using the engine was created. Based on this list, we established the different subject matters and the different language combinations.
  7. Human post-editing vs. human translation: a translator's standard output is 2,500 words per day. A reviewer of human translation can typically cover 12,000 words per day. On average, 8,000 words can be post-edited per day.
  8. The tauyou solution: dedicated hybrid machine translation engine that is continuously customized; corpus-based with rules for pre- and post-processing; data confidentiality is guaranteed; translation speed.
  9. Main characteristics: any type of document; glossary prioritization; fast domain creation/update; fully customizable; quality metrics computation; terminology extraction.
  10. Optimum domain creation: gather in-domain data; train the translation solution; enrich solution with related text; terminology prioritization; update the translation solution; add rules to enhance quality; weekly updates.
  11. CPSL workflow 1: optimize translation quality for a client. Gather client data; train the translation solution; add rules to enhance quality; continuous improvement.
  12. CPSL workflow 2: general-purpose translator. Gather clients' data; add generic texts to provide a good sample; train the translation solution; add rules to enhance quality; periodic improvement.
  13. Other use cases: data creation and enhancement; user defined; unaligned translated documents; generic translations; optimum corpus/memories creation; rule-based extension/filtering.
  14. tauyou interface: tabs can be customized.
  15. Quality metrics: detailed analysis of translated documents. Several customized parameters, including word error rate, number of word edits, tag differences, etc. Useful in machine translation but also in the normal quality process.
  16. Terminology extraction: unilingual and bilingual terminology lists. Customized according to position in the sentence, word type, number of words, etc. Feed the MT engine or a tool for human translators.
  17. The future: increase usage of translation memories; automatic domain classification; source text enhancement (spelling, grammar, structure, terminology ...); special words detection; new domains/language pairs creation.
  18. Questions? bgarcia-ochoa@cpsl.com, www.cpsl.com, dbc@tauyou.com, www.tauyou.com
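
The daily throughput figures on slide 7 can be turned into a rough effort estimate. The sketch below uses only the three figures from that slide. The 100,000-word project size is a hypothetical value chosen for illustration, and the names `WORDS_PER_DAY` and `person_days` are not from the presentation.

```python
import math

# Words per person per day, as stated on slide 7.
WORDS_PER_DAY = {
    "translation": 2_500,
    "review": 12_000,
    "post-editing": 8_000,
}


def person_days(word_count: int, task: str) -> int:
    """Return the whole person-days needed for a task, rounding up."""
    return math.ceil(word_count / WORDS_PER_DAY[task])


if __name__ == "__main__":
    project_words = 100_000  # hypothetical project size, not from the slides
    for task in WORDS_PER_DAY:
        print(f"{task}: {person_days(project_words, task)} person-days")
```

For that hypothetical volume, the slide's figures give 40 person-days of translation against 13 of post-editing. The slides do not state whether a separate review pass follows post-editing, so the sketch reports each task on its own.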
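
Slide 15 names word error rate and the number of word edits among the quality metrics. The slides do not say how tauyou computes them. The sketch below assumes the common definition: the word-level Levenshtein distance between the machine output and the post-edited text, divided by the length of the post-edited text. The function names and the sample sentences are illustrative only.

```python
def word_edits(hypothesis: str, reference: str) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    hyp, ref = hypothesis.split(), reference.split()
    # previous[j] holds the distance between the hyp words seen so far and ref[:j].
    previous = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        current = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            current.append(min(
                previous[j] + 1,         # delete hyp_word
                current[j - 1] + 1,      # insert ref_word
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


def word_error_rate(hypothesis: str, reference: str) -> float:
    """Word edits divided by the number of words in the reference."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return word_edits(hypothesis, reference) / ref_len


if __name__ == "__main__":
    machine_output = "press the button for start the engine"  # made-up sample
    post_edited = "press the button to start the engine"      # made-up sample
    print(word_edits(machine_output, post_edited))
    print(round(word_error_rate(machine_output, post_edited), 3))
```

Here the post-edited text serves as the reference, so the score reflects how much the post-editor changed. Splitting on whitespace is a simplification: a production metric would need proper tokenization and separate handling of inline tags, which the slide lists as its own parameter.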