Simplification and Explicitation Universals

Usage Rights: CC Attribution-NonCommercial-ShareAlike License

Presentation Transcript

  • Translation Studies: Simplification and Explicitation Universals
    Claudiu Mihăilă
    Faculty of Computer Science, "Alexandru Ioan Cuza" University of Iași
    21 April 2010
  • Outline
    • Introduction
      ◦ Motivation
      ◦ Translation studies
    • Simplification
      ◦ Definition
      ◦ Simplification pros
      ◦ Simplification cons
    • Explicitation
      ◦ Definition
      ◦ Explicitation pros
      ◦ Explicitation cons
    • Conclusions
  • Motivation
    • The questions
      ◦ Is there a difference between original and translated language?
      ◦ If so, is it automatically detectable?
      ◦ And if so, does it improve NLP quality?
    • The answers
      ◦ Yes!
      ◦ Yes: up to 97.62% for simplification
      ◦ Yes:
        • Human translator (self-)assessment
        • Statistical machine translation
        • Multilingual plagiarism detection
  • Translation studies
    • Specific lexico-grammatical and syntactic characteristics
    • Translationese - Gellerstam (1986)
      ◦ "Fingerprints" left behind by the translation process
    • Translation laws - Toury (1983)
      ◦ Standardisation, Interference
    • Translation universals - Baker (1993)
      ◦ Simplification, Explicitation, Convergence, Normalisation
  • Simplification
    • Tendency to produce simpler and easier-to-follow texts
    • Laviosa (2002)
      ◦ Study on a small corpus
      ◦ Features for simplification
      ◦ Insufficient evidence
  • Simplification pros
    • Baroni (2006)
      ◦ Detect originals and translations in an Italian corpus
      ◦ Uni-, bi-, tri-grams, word forms, lemmas, and POS tags
      ◦ Supervised learning system
      ◦ Accuracy up to 87%
    • Corpas (2008a)
      ◦ English-into-Spanish and Spanish medical and technical texts
      ◦ Validated for lexical richness
      ◦ Contradicted for complex sentences, sentence length, ambiguity, information load, depth of syntactic trees
    • Corpas (2008b)
      ◦ Validated for lexical richness and density, number of discourse markers, complex sentences, sentence length
      ◦ More visible for technical domain
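The uni-, bi-, and tri-gram features mentioned for Baroni (2006) can be illustrated with a minimal sketch. This is not the study's actual pipeline: the function names `ngrams` and `ngram_features` are hypothetical, the tokens are plain whitespace-split word forms, and the same counting would apply to lemma or POS-tag sequences if a tagger supplied them.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(tokens, max_n=3):
    """Count uni-, bi- and tri-grams; each key is a tuple of tokens."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Illustrative input only; a real experiment would use corpus documents.
tokens = "the cat sat on the mat".split()
features = ngram_features(tokens)
```

Counts of this kind would typically be turned into one feature vector per document before being passed to a supervised classifier.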
  • Simplification pros
    • Ilisei (2010)
      ◦ 21 language-independent features
      ◦ Supervised machine learning - 8 classifiers
      ◦ Accuracy of 97.62%
      ◦ Most salient features - InfoGain, ChiSquare
        • Lexical richness
        • Sentence length
        • Proportions of pronouns, conjunctions, grammatical and lexical words
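A rough sketch of how a few of the salient features listed for Ilisei (2010) might be computed follows. It is only an approximation under stated assumptions: lexical richness is taken here to be the type/token ratio, `GRAMMATICAL_WORDS` is a tiny hypothetical closed-class word list standing in for a POS tagger, and the regex-based sentence splitter and tokenizer are simplifications. None of these choices come from the study itself.

```python
import re

# Hypothetical, deliberately tiny closed-class word list (assumption);
# a real study would identify grammatical words with a POS tagger.
GRAMMATICAL_WORDS = {
    "the", "a", "an", "and", "or", "but", "of", "in", "on", "to",
    "it", "he", "she", "they", "is", "was",
}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


def simplification_features(text):
    """Return a dict of simple surface features for one document."""
    sentences = split_sentences(text)
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0 or not sentences:
        raise ValueError("text contains no tokens")
    grammatical = sum(1 for t in tokens if t in GRAMMATICAL_WORDS)
    return {
        "lexical_richness": len(set(tokens)) / n,    # type/token ratio
        "mean_sentence_length": n / len(sentences),  # tokens per sentence
        "grammatical_ratio": grammatical / n,
        "lexical_ratio": (n - grammatical) / n,
    }


sample = "The cat sat on the mat. It was a cat."
feats = simplification_features(sample)
```

Under the simplification hypothesis, translated texts would be expected to show lower lexical richness and shorter sentences than originals, which is what a classifier trained on such features would try to exploit.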
  • Simplification cons
    • Jantunen (2001)
      ◦ Boosters in Finnish translations - hyvin, kovin, oikein
      ◦ Typical lexical combinations in most cases
    • Jantunen (2004)
      ◦ Boosters in Finnish translations - hyvin, kovin, oikein
      ◦ Untypical lexical combinations in translations
      ◦ Similar colligations in originals and translations
  • Explicitation
    • Introducing overt information into the translation that is implicit in the source language
    • Classification - Pym (2005)
      ◦ Obligatory explicitation
        • Forced by language specificity or grammar
      ◦ Voluntary explicitation
        • Optional information to avoid misinterpretations
  • Explicitation pros
    • Burnett (1999)
      ◦ BNC vs. TEC
      ◦ suggest, admit, claim, think, believe, hope, know
    • Olohan (2000)
      ◦ BNC vs. TEC
      ◦ say / tell + that / zero connective
    • Olohan (2001)
      ◦ BNC vs. TEC
      ◦ promise + that / zero connective
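The "that / zero connective" comparison for reporting verbs can be approximated with a crude pattern-matching sketch. This is an illustration only, not the method of the cited studies: it covers forms of "say" alone, treats a following subject pronoun as a proxy for a zero connective, and will miscount cases such as demonstrative "that" or clauses with full noun-phrase subjects. The names `PATTERN` and `count_connectives` are hypothetical.

```python
import re
from collections import Counter

# Assumption: only forms of "say" are covered, and a subject pronoun right
# after the verb is taken as evidence of a zero connective.
SAY_FORMS = r"(?:say|says|said|saying)"
SUBJECT_PRONOUNS = r"(?:i|you|he|she|it|we|they)"
PATTERN = re.compile(
    rf"\b{SAY_FORMS}\s+(that|{SUBJECT_PRONOUNS})\b", re.IGNORECASE
)


def count_connectives(text):
    """Count 'say + that' versus 'say + zero connective' (heuristic)."""
    counts = Counter({"that": 0, "zero": 0})
    for match in PATTERN.finditer(text):
        key = "that" if match.group(1).lower() == "that" else "zero"
        counts[key] += 1
    return counts


sample = "She said that he left. He said she was tired. They say that it works."
counts = count_connectives(sample)
```

Running the same count over an original-language corpus and a translated corpus, and comparing the share of explicit "that", would be one simple way to look for the explicitation tendency described above.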
  • Explicitation cons
    • Cheong (2006)
      ◦ Explicitation vs. implicitation
      ◦ English-into-Korean translations
      ◦ The phenomena appear equally
      ◦ The direction of translation influences their behaviour
  • Conclusions
    • Simplification
      ◦ Many studies supporting it
      ◦ Many studies contradicting it
      ◦ Not yet clearly confirmed
    • Explicitation
      ◦ Occurring often to avoid misinterpretations
      ◦ Implicitation needs to be considered as well
    • Usefulness
      ◦ SMT
      ◦ Multilingual plagiarism detection
      ◦ (Self-)assessment of translators' work
  • Thank you! Questions?