Simplification and Explicitation Universals

Published in: Technology

  1. 1. Translation Studies Simplification and Explicitation Universals Claudiu Mihăilă Faculty of Computer Science, "Alexandru Ioan Cuza" University of Iași 21 April 2010
  2. 2. Outline Introduction Motivation Translation studies Simplification Definition Simplification pros Simplification cons Explicitation Definition Explicitation pros Explicitation cons Conclusions 2 of 13
  4. 4. Motivation • The questions ◦ Is there a difference between original and translated language? ◦ If so, is it automatically detectable? ◦ And if so, does it improve NLP quality? • The answers ◦ Yes! ◦ Yes: with up to 97.62% accuracy for simplification ◦ Yes, in: • Human translator (self-)assessment • Statistical machine translation • Multilingual plagiarism detection 3 of 13
  8. 8. Translation studies • Specific lexico-grammatical and syntactic characteristics • Translationese - Gellerstam (1986) ◦ "Fingerprints" left behind by the translation process • Translation laws - Toury (1983) ◦ Standardisation, Interference • Translation universals - Baker (1993) ◦ Simplification, Explicitation, Convergence, Normalisation 4 of 13
  10. 10. Simplification • Tendency to produce simpler and easier-to-follow texts • Laviosa (2002) ◦ Study on small corpus ◦ Features for simplification ◦ Insufficient evidence 5 of 13
  13. 13. Simplification pros • Baroni (2006) ◦ Detect originals and translations in an Italian corpus ◦ Uni-, bi-, tri-grams, word forms, lemmas, and POS tags ◦ Supervised learning system ◦ Accuracy up to 87% • Corpas (2008a) ◦ English-into-Spanish and Spanish medical and technical texts ◦ Validated for lexical richness ◦ Contradicted for complex sentences, sentence length, ambiguity, information load, depth of syntactic trees • Corpas (2008b) ◦ Validated for lexical richness and density, number of discourse markers, complex sentences, sentence length ◦ More visible for technical domain 6 of 13
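
The uni-, bi-, and tri-gram features mentioned for Baroni (2006) can be illustrated with a minimal sketch. It is not the authors' system: the function names `ngrams` and `ngram_features` and the sample sentence are assumptions made for illustration, and the resulting counts would still have to be passed to a supervised classifier of one's choice.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous run of n items in `tokens` as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(tokens, max_n=3):
    """Count all n-grams for n = 1..max_n (uni-, bi-, and tri-grams by default)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Hypothetical sample sentence; whitespace tokenisation is a simplification.
tokens = "the patient was given the drug".split()
features = ngram_features(tokens)
```

Because the functions only look at sequences, the same code applies unchanged to lists of lemmas or POS tags instead of word forms.
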
  14. 14. Simplification pros • Ilisei (2010) ◦ 21 language-independent features ◦ Supervised machine learning - 8 classifiers ◦ Accuracy of 97.62% ◦ Most salient features - InfoGain, ChiSquare • Lexical richness • Sentence length • Proportions of pronouns, conjunctions, grammatical and lexical words 7 of 13
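
To make the features named for Ilisei (2010) more concrete, the sketch below computes a few of them and ranks a discretised feature by information gain. Several things here are assumptions rather than the study's method: lexical richness is taken as the type/token ratio, the pronoun and conjunction lists are tiny hypothetical English samples, and the tokenisation is a plain regular expression.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny closed-class lists (English only).
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}
CONJUNCTIONS = {"and", "or", "but", "because", "although"}


def simplification_features(text):
    """Compute a few surface features often linked to simplification."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens or not sentences:
        raise ValueError("text contains no tokens")
    n = len(tokens)
    return {
        "lexical_richness": len(set(tokens)) / n,  # type/token ratio
        "mean_sentence_length": n / len(sentences),
        "pronoun_ratio": sum(t in PRONOUNS for t in tokens) / n,
        "conjunction_ratio": sum(t in CONJUNCTIONS for t in tokens) / n,
    }


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting `labels` on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for v, lab in zip(feature_values, labels) if v == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder
```

A feature whose values separate translations from originals perfectly has an information gain equal to the entropy of the labels, while a feature unrelated to the labels scores close to zero. Continuous features such as sentence length would need to be binned before calling `information_gain`.
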
  16. 16. Simplification cons • Jantunen (2001) ◦ Boosters in Finnish translations - hyvin, kovin, oikein ◦ typical lexical combinations in most cases • Jantunen (2004) ◦ Boosters in Finnish translations - hyvin, kovin, oikein ◦ untypical lexical combinations in translations ◦ similar colligations in originals and translations 8 of 13
  18. 18. Explicitation • Introducing overt information into the translation that is implicit in the source language • Classification - Pym (2005) ◦ Obligatory explicitation • Forced by language specificity or grammar ◦ Voluntary explicitation • Optional information to avoid misinterpretations 9 of 13
  21. 21. Explicitation pros • Burnett (1999) ◦ BNC vs. TEC ◦ suggest, admit, claim, think, believe, hope, know • Olohan (2000) ◦ BNC vs. TEC ◦ say / tell + that / zero connective • Olohan (2001) ◦ BNC vs. TEC ◦ promise + that / zero connective 10 of 13
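
The "that / zero connective" comparison can be approximated with a rough counting sketch. This is an assumption-laden heuristic, not the procedure used in the cited studies: the verb forms, the optional pronoun object, and the function name `count_that_vs_zero` are all illustrative, and any use of say or tell that is not followed by "that" is counted as "zero", including uses that do not introduce a reported clause. A real study would need parsed or manually checked concordance lines.

```python
import re

# Inflected forms of the two reporting verbs (illustrative list).
VERB_FORMS = r"(?:say|says|said|saying|tell|tells|told|telling)"

# Verb, an optional pronoun object ("told me"), then an optional "that".
PATTERN = re.compile(
    rf"\b{VERB_FORMS}\b(?:\s+(?:me|you|him|her|us|them)\b)?\s+(that\b)?",
    re.IGNORECASE,
)


def count_that_vs_zero(text):
    """Count reporting-verb uses followed by 'that' versus no connective."""
    counts = {"that": 0, "zero": 0}
    for match in PATTERN.finditer(text):
        counts["that" if match.group(1) else "zero"] += 1
    return counts


# Hypothetical sample text.
sample = "He said that it was late. She told me she was tired. They say that it works."
result = count_that_vs_zero(sample)
```

Running the same counter over a corpus of originals and a corpus of translations and comparing the two proportions of "that" gives a first, coarse indication of whether translations make the connective explicit more often.
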
  22. 22. Explicitation cons • Cheong (2006) ◦ Explicitation vs. implicitation ◦ English-into-Korean translations ◦ The phenomena appear equally ◦ The direction of translation influences their behaviour 11 of 13
  25. 25. Conclusions • Simplification ◦ Many studies supporting it ◦ Many studies contradicting it ◦ Not yet clearly confirmed • Explicitation ◦ Occurring often to avoid misinterpretations ◦ Implicitation needs to be considered as well • Usefulness ◦ SMT ◦ Multilingual plagiarism detection ◦ (Self-)assessment of translators' work 12 of 13
  26. 26. Thank you! • Questions? 13 of 13
