The Translation Game

    The Translation Game - Presentation Transcript

    1. The translation game: Machine translation evaluation without prejudice
       Sections: Introduction; The actors; The scenarios; 5 counterobjections; Conclusions
       Federico Gobbo (federico.gobbo@uninsubria.it), University of Insubria, Varese, Italy
       CC Some rights reserved. ECAP09, UAB, Barcelona, July 2009
    5. Why is machine translation evaluation interesting?
       Machine Translation (MT): a coherent chain of grammatical sentences written in a given natural language (NL), rendered by a machine from a source language (Ls) into a reliable text written in a target language (Lt).
       Remarks:
       • only asynchronous written texts (otherwise, interpretation);
       • without any human aid (otherwise, Computer-Aided Translation).
       MT can be seen as a corpus-based test of the appropriateness of the models we have of the language faculty, if the evaluation is performed in the appropriate setting.
    6. The problem of MT evaluation
       In the development process of an MT engine, automatic evaluators are used: they implement algorithms that measure the distance of the target text (Tt) from a gold standard reference corpus, trained through machine learning techniques.
       Nonetheless, it is well known among specialists that automated evaluation is not enough, especially in the production phase of the life cycle of the MT software, where informants are used.
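The "distance from a reference" idea on this slide can be illustrated with a minimal sketch. It assumes a word-level edit distance against a single reference translation; the slides do not name a specific metric, and the function names `word_edit_distance` and `normalized_distance` are hypothetical. Evaluators used in practice are considerably more elaborate.

```python
def word_edit_distance(hypothesis, reference):
    """Minimum number of word insertions, deletions and substitutions
    needed to turn the MT output into the reference translation."""
    hyp = hypothesis.split()
    ref = reference.split()
    # prev[j] = distance between the hypothesis words seen so far and ref[:j]
    prev = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref, start=1):
            substitution_cost = 0 if hyp_word == ref_word else 1
            curr.append(min(
                prev[j] + 1,                      # drop a hypothesis word
                curr[j - 1] + 1,                  # add a reference word
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def normalized_distance(hypothesis, reference):
    """Edit distance divided by reference length (an assumed normalization)."""
    ref_len = len(reference.split())
    if ref_len == 0:
        return 0.0 if not hypothesis.split() else 1.0
    return word_edit_distance(hypothesis, reference) / ref_len
```

A score of 0.0 means the output matches the reference word for word. Such a score says nothing about whether a different but equally good translation exists, which is one reason human informants are still needed.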
    9. Why use a Gedankenexperiment?
       I argue that the setting where the MT evaluation is performed is not free from psychological and epistemological prejudice on the part of informants, which invalidates their judgement a priori.
       Therefore, I compare two MT evaluation settings as a Gedankenexperiment, à la Turing (Imitation Game, 1950) or Searle (Chinese Room, 1980).
       • The first setting is called the default scenario.
       • The second setting is called the neutral scenario.
    10. A. & B.: the sender and the receiver
       Alice is a native speaker of Spanish (Ls) and she wants to translate a newspaper article into Tamil (Lt).
       Bob is a native speaker of Tamil (Lt) and he wants to read Alice's newspaper article in his own language.
    11. C. & D.: MT design & evaluation
       Charles is the designer of the Spanish-Tamil MT system. He is a software engineer specialized in MT, not a translator (nor a linguist).
       Dave is a Spanish-Tamil bilingual and a professional translator, i.e., he is skilled at evaluating translationese, the set of linguistic indicators that a text is a translation.
    12. The default scenario: the sender and the receiver
       Alice is expected to write the original text in Spanish (Ts), while Bob will read the MT in Tamil (Tt).
    13. The default scenario: the evaluation process
       To evaluate the reliability of the translation, Charles wants Dave to read the original text in Spanish (Ts) and the MT in Tamil (Tt).
       Charles asks Dave the following question: Is this translation reliable?
    14. How to avoid the psychological fallacy
       If Dave does not know that the translation is a machine translation, he is free from the psychological bias against MT: "MT is stupid, only professional translators can be real translators", etc.
    15. The default scenario cannot avoid the epistemic fallacy
       In fact, Dave is evaluating:
       • the text in Spanish (Ts) by a human agent;
       • the text in Tamil (Tt) by an artificial agent.
       In other words, the epistemic fallacy still holds: in the default scenario MT can only mimic human translation, therefore it cannot be truly evaluated per se.
    18. The neutral scenario: the Entry Language
       Let's suppose that Charles asks Alice not to write directly in Spanish (Ls), but in a special controlled language, i.e., a Quasi-Natural Language (QNL, Lyons 2006):
       • it is double articulated (i.e., phonemes vs. morphemes);
       • its semantics is not domain-specific;
       • it is highly regular in morphology (low homophony degree);
       • POS-tagging is easy (low allomorphy degree).
       Let us call this QNL the "Entry Language" (Le).
    19. What is a QNL?
       "For example, QNL has a subclass Quasi-English, one of whose members is like English in all respects except that it is inflectionally regular, all plurals of nouns being formed with the -s suffix (childs, sheeps, gooses, etc.), all past-tense forms of verbs with -ed (goed, runned, beed, etc.), and so on. This is a language part of which children construct of themselves (and then in part deconstruct, if I may so express it) at a certain stage in the normal (natural3) process of acquiring English. It is also the language into which English would presumably have developed under particular environmental conditions which maximized the effect of what is traditionally referred to as analogy."
       Lyons, Natural language and universal grammar, 2006:69-70.
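The inflectional regularity Lyons describes can be sketched as two exceptionless rules. This is only an illustration built from the examples in the quotation; the function names are hypothetical, and the consonant-doubling rule is an assumption added so that "run" yields "runned" as in the quoted examples. It is a spelling heuristic, not a claim about Quasi-English phonology.

```python
VOWELS = set("aeiou")


def regular_plural(noun):
    """Every plural is formed with the -s suffix, with no exceptions."""
    return noun + "s"


def regular_past(verb):
    """Every past tense is formed with -ed, with no exceptions.

    Assumption: a final consonant after a single vowel is doubled
    (run -> runned). Stem-final vowels are kept as they are (be -> beed).
    """
    ends_consonant_vowel_consonant = (
        len(verb) >= 3
        and verb[-1] not in VOWELS
        and verb[-1] not in "wxy"
        and verb[-2] in VOWELS
        and verb[-3] not in VOWELS
    )
    if ends_consonant_vowel_consonant:
        return verb + verb[-1] + "ed"
    return verb + "ed"
```

Because no lexical exceptions have to be stored, a parser for such a language only needs to invert these rules to recover the stem, which is the kind of regularity the Entry Language relies on.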
    20. The MT interface for writing in the neutral scenario
       When Alice writes the text in the Entry Language (Te), the MT system automatically generates the translation in Spanish (Ts) and in Tamil (Tt), and she can adjust her writing to obtain a better result by checking the Ts output.
       In practice, two cognitive processes are performed:
       • a writing process, in the Entry Language input;
       • an evaluation process, through the reading of the Spanish output.
    21. The receiver isn't aware of the new scenario
       While Alice is expected to write differently, Bob will still read the MT in Tamil (Tt) as in the default scenario.
       The text in the Entry Language is only part of the writing interface for Alice: only Charles is aware of it, unlike Bob and Dave.
    22. The neutral scenario does avoid the epistemic fallacy
       For Dave, apparently nothing has changed, but in reality he is evaluating:
       • the text in Spanish (Ts) by an artificial agent;
       • the text in Tamil (Tt) by an artificial agent.
       I argue that in this way the epistemic fallacy is avoided.
    24. 1. The Chinese Room argument
       "Whatever linguistic model you put into the machine, we cannot consider it really cognition, as the meaning of the linguistic model is only in the brain of Charles, the system designer".
       Counterobjection:
       • based on the Chinese Room argument;
       • the system is made explicit by the computer program;
       • once formulated, a theorem should also be proved;
       • analogously, the MT engine is like a proof (not the same knowledge!);
       • software is a "cognitive mediator" (Magnani, 2007).
    26. 2. The engineer's reaction
       "Machine translation is not a testbed of any linguistic theory or anything else. What we need is something practical, i.e., commercially valuable, as it is useful in some domains where we want fast translation of large amounts of data."
       Counterobjection:
       • seldom said openly;
       • having no linguistic theory is a linguistic choice;
       • computational brute force is not enough for achieving good results;
       • machine learning techniques should be grounded in annotated corpora.
    29. 3. The desperantist's argument
       "A QNL is an artificial language, like Esperanto, Interlingua, Lojban, or Klingon. Artificial languages are living dead languages, unfit for your purposes!"
       Counterobjection:
       • at least Esperanto has proved to work and evolve like any NL;
       • non-naturalness is not unnaturalness (such as for C, Java or Prolog);
       • true: is there anyone out there willing to concretely test this Gedankenexperiment? Probably not.
    31. 4. The typologist's argument
       "Whatever QNL you choose, it will be typologically determined, according to the native tongue of Alice (e.g., you will use a Quasi-Natural Spanish). Your scenario may work with English, French or Spanish, but not with non-European languages, such as Chinese, Arabic, or Tamil."
       Counterobjection:
       • to be verified: this is a very serious variable in implementing the MT engine;
       • but please, give us a chance!
    33. 5. The human interface argument
       "Your assumption is too strong. You force Alice not to use her mother tongue, i.e., Spanish, and you ask her to learn Charles' system too. Furthermore, as your approach implies strong supervision, I think that it will be easier, faster and cheaper to have professionals translate between source and target language instead of using your system."
       Counterobjection:
       • a pragmatic argument, drawn from economics;
       • the comparison between Te and Ts should compensate for Alice's additional effort;
       • only a monolingual parser is needed (for Le);
       • translation memories can be stored, so that the MT system becomes more and more precise the more it is used.
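The last point, storing translation memories so that the system improves with use, can be sketched as a lookup table consulted before the MT engine. Everything here is a hypothetical illustration: the class `TranslationMemory`, the `translate` helper and the `engine` callable are assumptions, since the slides do not describe an implementation.

```python
class TranslationMemory:
    """Stores segment pairs that Alice has already validated."""

    def __init__(self):
        self._segments = {}

    @staticmethod
    def _normalize(segment):
        # Assumed matching policy: ignore case and repeated whitespace.
        return " ".join(segment.lower().split())

    def store(self, entry_segment, target_segment):
        self._segments[self._normalize(entry_segment)] = target_segment

    def lookup(self, entry_segment):
        # Returns None when the segment has not been validated before.
        return self._segments.get(self._normalize(entry_segment))


def translate(entry_segment, memory, engine):
    """Prefer a validated translation; fall back to the MT engine.

    `engine` is any callable mapping an Entry Language segment to a
    target-language string (a stand-in for Charles's system).
    """
    remembered = memory.lookup(entry_segment)
    if remembered is not None:
        return remembered
    return engine(entry_segment)
```

Each time Alice accepts an output after comparing Te with Ts, the pair could be stored, so the share of segments answered from validated material grows with use.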
    35. Paraphrasing Turing...
       Only a machine can really appreciate a machine translation. Perhaps.
    36. Thanks. Any questions?
       Download these slides here: http://www.slideshare.net/goberiko/
       CC BY-NC-SA Federico Gobbo 2009. Published in Italy. Attribution-NonCommercial-ShareAlike 2.5

    Federico Gobbo, 5 months ago
