Introduction The actors The scenarios 5 counterobjections Conclusions
The translation game
Machine translation evaluation without prejudice
Federico Gobbo
federico.gobbo@uninsubria.it
University of Insubria, Varese, Italy
CC Some rights reserved.
ECAP09, UAB, Barcelona, July 2009
Why is machine translation evaluation interesting?
Machine Translation (MT): a coherent chain of grammatical
sentences written in a given natural language (NL), rendered by a
machine from a source language (Ls) into a reliable text written in
a target language (Lt).
Remarks:
only asynchronous written texts (otherwise, it is interpretation);
without any human aid (otherwise, it is Computer-Aided
Translation).
MT can be seen as a corpus-based test of the appropriateness of
the models we have of the language faculty, if the evaluation is
performed in the appropriate setting.
The problem of MT evaluation
In the development process of an MT engine, automatic evaluators
are used: they implement algorithms, trained through machine
learning techniques, that measure the distance of the target text
(Tt) from a gold-standard reference corpus.
Nonetheless, it is well known among specialists that automated
evaluation is not enough, especially in the production phase of the
life cycle of the MT software, where informants are used.
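To make "distance from a reference" concrete, here is a minimal sketch of one possible distance: a word-level edit distance between an MT output and a single reference translation, normalized by reference length. The slides do not name a specific metric, so this choice and the function names `word_edit_distance` and `normalized_distance` are illustrative assumptions, not the evaluators actually used in MT development.

```python
def word_edit_distance(hypothesis: str, reference: str) -> int:
    """Levenshtein distance counted in words rather than characters."""
    hyp, ref = hypothesis.split(), reference.split()
    # previous[j]: distance between the hypothesis words seen so far
    # (before the current one) and the first j reference words.
    previous = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        current = [i]
        for j, ref_word in enumerate(ref, start=1):
            substitution = previous[j - 1] + (hyp_word != ref_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def normalized_distance(hypothesis: str, reference: str) -> float:
    """Edit distance divided by reference length; 0.0 means identical."""
    reference_length = len(reference.split())
    if reference_length == 0:
        # Assumption: an empty reference only matches an empty hypothesis.
        return 0.0 if not hypothesis.split() else 1.0
    return word_edit_distance(hypothesis, reference) / reference_length
```

A score of this kind only says how far the output is from one particular reference; it says nothing about whether a human reader would find the text reliable, which is why informants are still needed.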
Why use a Gedankenexperiment?
I argue that the setting where the MT evaluation is performed is
not free from psychological and epistemological prejudice on the
part of the informants, which invalidates their judgement a priori.
Therefore, I compare two MT evaluation settings as a
Gedankenexperiment, à la Turing (Imitation Game, 1950) or Searle
(Chinese Room, 1980).
The first setting is called the default scenario.
The second setting is called the neutral scenario.
A. & B.: the sender and the receiver
Alice is a native speaker of Spanish (Ls) and she wants
to translate a newspaper article into Tamil (Lt).
Bob is a native speaker of Tamil (Lt) and he wants to
read Alice’s newspaper article in his own language.
C. & D.: MT design & evaluation
Charles is the designer of the Spanish-Tamil MT
system. He is a software engineer specialized in MT, not a
translator (nor a linguist).
Dave is a Spanish-Tamil bilingual and a professional
translator, i.e., he is skilled at evaluating translationese: the set of
linguistic indicators that a text is a translation.
The default scenario: the sender and the receiver
Alice is expected to write the original text in Spanish (Ts), while
Bob will read the MT in Tamil (Tt).
The default scenario: the evaluation process
To evaluate the reliability of the translation, Charles wants Dave to
read the original text in Spanish (Ts) and the MT in Tamil (Tt).
Charles asks Dave the following question:
Is this translation reliable?
How to avoid the psychological fallacy
If Dave does not know that the translation is a machine
translation, he is free from the psychological bias against MT:
“MT is stupid, only professional translators can be real
translators”, etc.
The default scenario cannot avoid the epistemic fallacy
In fact, Dave is evaluating:
the text in Spanish (Ts) by a human agent;
the text in Tamil (Tt) by an artificial agent.
In other words, the epistemic fallacy still holds: in the default
scenario MT can only mimic human translation, therefore it
cannot be truly evaluated per se.
The neutral scenario: the Entry Language
Let’s suppose that Charles asks Alice not to write directly in
Spanish (Ls), but in a special controlled language, i.e., a
Quasi-Natural Language (QNL, Lyons 2006):
it is double articulated (i.e., phonemes vs. morphemes);
its semantics is not domain-specific;
it is highly regular in morphology (low homophony degree);
POS-tagging is easy (low allomorphy degree).
Let us call this QNL the “Entry Language” (Le).
What is a QNL?
For example, QNL has a subclass Quasi-English, one of whose
members is like English in all respects except that it is
inflectionally regular, all plurals of nouns being formed with the
-s suffix (childs, sheeps, gooses, etc.), all past-tense forms of
verbs with -ed (goed, runned, beed, etc.), and so on. This is a
language part of which children construct of themselves (and
then in part deconstruct – if I may so express it) at a certain
stage in the normal (natural3) process of acquiring English. It
is also the language into which English would presumably have
developed under particular environmental conditions which
maximized the effect of what is traditionally referred to as
analogy.
Lyons, Natural language and universal grammar, 2006:69–70.
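The inflectional regularity described in the quotation can be sketched in a few lines of Python. This is only an illustration of the idea that every plural and past tense follows one rule. The function names are hypothetical, and the consonant-doubling spelling rule (needed to obtain "runned" rather than "runed") is a rough heuristic assumed here, not something stated by Lyons.

```python
VOWELS = set("aeiou")


def regular_plural(noun: str) -> str:
    """Every plural takes -s, with no exceptions."""
    return noun + "s"


def regular_past(verb: str) -> str:
    """Every past tense takes -ed, with no exceptions."""
    # Assumed spelling heuristic: double the final consonant of a
    # consonant-vowel-consonant ending, as in "run" -> "runned".
    ends_in_cvc = (
        len(verb) >= 3
        and verb[-1] not in VOWELS
        and verb[-1] not in "wxy"
        and verb[-2] in VOWELS
        and verb[-3] not in VOWELS
    )
    if ends_in_cvc:
        return verb + verb[-1] + "ed"
    return verb + "ed"


if __name__ == "__main__":
    print([regular_plural(n) for n in ("child", "sheep", "goose")])
    print([regular_past(v) for v in ("go", "run", "be")])
```

Because no lexical exception table is needed, a morphological analyser for such a language reduces to a handful of rules, which is the property the neutral scenario relies on.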
The MT interface for writing in the neutral scenario
When Alice writes the text in the Entry Language (Te), the MT
system automatically generates the translations in Spanish (Ts)
and in Tamil (Tt), and she can adjust her writing to obtain a better
result by checking the Ts output. In practice, two cognitive
processes are performed:
a writing process, on the Entry Language input;
an evaluation process, through the reading of the Spanish
output.
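The write-and-check loop described on this slide can be sketched as follows. Everything here is hypothetical: `Engine` stands for any function that renders an Entry Language text into a given target language, and `accepts_spanish` stands for Alice's own judgement of the Spanish output. No real MT system or API is implied.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical engine signature: (entry_text, target_language) -> text.
Engine = Callable[[str, str], str]


def render(entry_text: str, engine: Engine) -> Tuple[str, str]:
    """Generate Ts (Spanish) and Tt (Tamil) from the same Te."""
    return engine(entry_text, "es"), engine(entry_text, "ta")


def writing_session(
    drafts: Iterable[str],
    engine: Engine,
    accepts_spanish: Callable[[str], bool],
) -> Optional[Tuple[str, str, str]]:
    """Alice revises Te until she accepts Ts; returns (Te, Ts, Tt)."""
    for entry_text in drafts:
        spanish, tamil = render(entry_text, engine)
        # Alice only reads the Spanish output; Tamil is never shown to her.
        if accepts_spanish(spanish):
            return entry_text, spanish, tamil
    return None  # no draft produced an acceptable Spanish rendering


if __name__ == "__main__":
    def toy_engine(text: str, lang: str) -> str:
        """Stand-in for a real engine: it only tags the text."""
        return f"[{lang}] {text}"

    result = writing_session(
        ["draft one", "draft two"], toy_engine, lambda s: "two" in s
    )
    print(result)
```

The point of the sketch is that both Ts and Tt come from the same call path, so the Spanish text Dave later reads is machine output just like the Tamil one.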
The receiver isn’t aware of the new scenario
While Alice is expected to write differently, Bob will still read the
MT in Tamil (Tt) as in the default scenario.
The text in the Entry Language is only part of the writing interface
for Alice: besides her, only Charles is aware of it; Bob and Dave are not.
The neutral scenario does avoid the epistemic fallacy
For Dave, apparently nothing has changed, but in reality he
is evaluating:
the text in Spanish (Ts) by an artificial agent;
the text in Tamil (Tt) by an artificial agent.
I argue that in this way, the epistemic fallacy is avoided.
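The difference between the two scenarios can be stated as a small data model. This is only a sketch of the argument, with hypothetical names: it records who produced each text that Dave reads and checks whether he is comparing outputs of the same kind of agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Text:
    language: str  # "es" for Ts, "ta" for Tt
    producer: str  # "human" or "machine"


def same_kind_of_producer(source: Text, target: Text) -> bool:
    """True when Dave compares texts produced by the same kind of agent."""
    return source.producer == target.producer


# Default scenario: Alice writes Ts herself, the engine produces Tt.
default_scenario = (Text("es", "human"), Text("ta", "machine"))
# Neutral scenario: the engine produces both Ts and Tt from Te.
neutral_scenario = (Text("es", "machine"), Text("ta", "machine"))

if __name__ == "__main__":
    print(same_kind_of_producer(*default_scenario))
    print(same_kind_of_producer(*neutral_scenario))
```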
1. The Chinese Room argument
“Whatever linguistic model you put into the machine, we
cannot consider it really cognition, as the meaning of the
linguistic model is only in the brain of Charles, the
system designer”.
Counterobjection:
based on the Chinese Room argument;
the system is made explicit by the computer program;
once formulated, a theorem should also be proved;
analogously, the MT engine is like a proof (not the same
knowledge!);
software is a “cognitive mediator” (Magnani, 2007).
2. The engineer’s reaction
“Machine translation is not a testbed of any linguistic
theory or anything else. What we need is something
practical, i.e., commercially valuable, being useful in some
domains where we want fast translation of large amounts
of data.”
Counterobjection:
seldom said openly;
having no linguistic theory is a linguistic choice;
computational brute force is not enough for achieving good
results;
machine learning techniques should be grounded in annotated
corpora.
3. The desperantist’s argument
“A QNL is an artificial language, like Esperanto,
Interlingua, Lojban, or Klingon. Artificial languages are
living dead languages, unfit for your purposes!”
Counterobjection:
at least Esperanto proved to work and evolve like any NL;
non-naturalness is not unnaturalness (such as for C, Java or
Prolog);
true: is there anyone out there willing to test this
Gedankenexperiment concretely?
Probably not.
4. The typologist’s argument
“Whatever QNL you choose, it will be typologically
determined, according to the native tongue of Alice –
e.g., you will use a Quasi-Natural Spanish. Your scenario
may work with English, French or Spanish, but not with
non-European languages, such as Chinese, Arabic, or
Tamil.”
Counterobjection:
to be verified: this is a very serious variable in implementing
the MT engine; but please, give us a chance!
5. The human interface argument
“Your assumption is too strong. You force Alice not to
use her mother tongue, i.e., Spanish, and you ask her to
learn Charles’ system too. Furthermore, as your approach
implies strong supervision, I think that it will be easier,
faster and cheaper to have professionals translate from source
to target language instead of using your system.”
Counterobjection:
a pragmatic argument, motivated by economics;
the comparison between Te and Ts should compensate for Alice’s
additional effort;
only a monolingual parser is needed (for Le );
translation memories can be stored so that the MT system
becomes more and more precise according to its use.
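The last point, about translation memories, can be sketched as a lookup table placed in front of the engine. The class and method names are hypothetical and the engine is again any function from (Te segment, target language) to text. Real translation-memory tools typically also use fuzzy matching, which is left out here.

```python
from typing import Callable, Dict, Tuple

# Hypothetical engine signature: (entry_segment, target_language) -> text.
Engine = Callable[[str, str], str]


class TranslationMemory:
    """Stores approved renderings of Te segments and reuses them."""

    def __init__(self, engine: Engine) -> None:
        self._engine = engine
        self._approved: Dict[Tuple[str, str], str] = {}

    def approve(self, entry_segment: str, target: str, translation: str) -> None:
        """Record a rendering that was checked and accepted."""
        self._approved[(entry_segment, target)] = translation

    def translate(self, entry_segment: str, target: str) -> str:
        """Prefer a stored rendering; fall back to the engine otherwise."""
        stored = self._approved.get((entry_segment, target))
        if stored is not None:
            return stored
        return self._engine(entry_segment, target)
```

Each approved segment is an exact-match entry, so the more the system is used, the larger the share of output that has already been checked once.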
Paraphrasing Turing...
Only a machine can really appreciate a
machine translation.
Perhaps.
Thanks. Any questions?
Download these slides here:
http://www.slideshare.net/goberiko/
CC BY-NC-SA: Federico Gobbo 2009. Published in Italy.
Attribution-NonCommercial-ShareAlike 2.5