
2011 TAUS Executive Forum Barcelona: Language Technology for optimum localization

Presentation by Diego Bartolomé, CEO of tauyou, at the 2011 TAUS Executive Forum in Barcelona. It covers how to handle the specific data and linguistic issues that arise when building machine translation solutions for Language Service Providers.

Published in: Technology, Business

  1. Language technology for optimum localization. Diego Bartolomé, CEO. © 2011
  2. Optimum workflow: gather in-domain data; train the translation solution; enrich solution with related text; terminology prioritization; update the translation solution; add rules to enhance quality; weekly updates
  3. Data issues 1 <large volume of heterogeneous data>: training with all the data; semantic classification for domain selection; fine tuning for each client; glossary prioritization; continuous machine learning
  4. Data issues 2 <scarce data>: add dictionaries into corpora; complementary segments from memories; balance client data with generic texts; in-domain adaptation of generic system; increase the number of sentences with rules
  5. Data issues 3 <dirty data>: remove multiple translations; eliminate text in other languages; correct spelling; select sentences with correct grammar; automatic alignment with client terminology; filter out other undesired segments
  6. Data issues 4 <data creation and enhancement>: final client defined; unaligned translated documents; generic translations; optimum corpus/memories creation; rule-based extension/filtering
  7. Linguistic issues 1: <untranslated words> dictionary creation; <grammatical errors> post-processing rules; <blind quality filtering> do not translate sentences below threshold
  8. Linguistic issues 2: <source text cleaning> spelling and grammar; sentence simplification; terminology homogenization. <special words detection> people, places, organizations; alphanumeric codes
  9. Use case: <recurrent small volumes> frequent translations; clients from different domains. <workflow> gather as much data as possible; receive a new file for translation; create an ad hoc domain for that file; train the translation solution + basic rules. <output> optimum adaptation for a file in around 4 hours
  10. <tauyou_text> snapshot: <fully customizable> look and feel + functionality
  11. Thanks! Diego Bartolomé, PhD. <address> C/ Les Planes 39 – 08201 Sabadell – Spain; <phone> +34 93 711 29 96; <cell> +34 670 331 225; <email> dbc@tauyou.com; <www> tauyou.com
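
The "dirty data" slide lists cleaning steps without saying how they are implemented. As a rough illustration only, the Python sketch below covers two of them: "remove multiple translations" and "filter out other undesired segments". It is not the presenter's method. The function name `clean_segments`, the length-ratio heuristic, and the `max_length_ratio` default are all assumptions made for this example, as is the reading of "remove multiple translations" as discarding any source sentence that has conflicting targets.

```python
from collections import defaultdict


def clean_segments(pairs, max_length_ratio=3.0):
    """Return cleaned (source, target) pairs from a translation memory.

    Hypothetical helper: the name, the heuristic, and the default
    threshold are illustrative assumptions, not taken from the slides.
    """
    # Group the distinct targets seen for each source sentence.
    targets_by_source = defaultdict(set)
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if source and target:  # skip pairs with an empty side
            targets_by_source[source].add(target)

    cleaned = []
    for source, targets in targets_by_source.items():
        # Assumed reading of "remove multiple translations": drop every
        # source sentence that has more than one distinct target.
        if len(targets) != 1:
            continue
        target = next(iter(targets))
        # Assumed "undesired segment" heuristic: a very unbalanced
        # character-length ratio suggests a misaligned pair.
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_length_ratio:
            continue
        cleaned.append((source, target))
    return cleaned
```

A real pipeline would more likely keep the most frequent or most recent translation than discard the source sentence outright, and would add the language identification and spelling steps the slide also lists.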
