This document summarizes research on selecting the best machine translation (MT) system for a given input sentence without knowledge of the internal workings of each system. It presents a methodology that extracts features from the input sentence (phrase-structure features, probabilistic features, and dependency-based features) and uses them to train a machine learning classifier to predict which system will produce the highest-quality translation. Experiments were conducted on English-to-Bangla translation systems using different datasets and classifiers. The results showed that the IB1 classifier achieved the best performance when using the proposed features for automatically selecting the most appropriate translation system.
How to Know Best Machine Translation System in Advance before Translating a Sentence?
1. How to Know Best Machine Translation System in Advance before Translating a Sentence?
Bibekananda Kundu and Sanjay Kumar Choudhury
Centre for Development of Advanced Computing
December 19, 2014
{bibekananda.kundu, sanjay.choudhury}@cdac.in
2. 2/19
Contents
* Research Problem
* Methodology and Contributions
* Feature Set for Selecting Best MT System
* Experiments
* Results and Discussion
* Conclusions
3. 3/19
* Research Problem
Was my camera repaired already?
Candidate Bangla translations, of varying quality, produced by MT1, MT2, MT3, ...:
. আমার ক ােমরা িক ইিতমে মরামত করা হে িছল ?
. আমার ক ােমরা িক ইিতমে মরামত করা িছেলা ?
. িক আমার ক ােমরা ইতঃ েব মরামত করা হেয়িছল ?
. আমার ক ােমরা ইিতমে মরামত করা ?
. আমার ক ােমরা ইিতমে িছল মরামত ?
How can we identify, from a set of MT systems and in advance, the system capable of producing the most appropriate translation for a source sentence, without any knowledge of the internal workings of these MT systems?
8. 8/19
* Feature Set for Selecting Best MT System
* Phrase-structure features: represent the structural
complexity of a sentence.
* Dependency-based features: represent how words in
a sentence depend on each other, even over long distances.
* Probabilistic features: represent complexity in
terms of out-of-vocabulary (OOV) words, the likelihood of a
sentence, the likelihood of a dependency relation, and the
tendency of a source word to map to multiple target words or
vice versa.
9. 9/19
* Feature Set for Selecting Best MT System
* Phrase-structure features :
. Number of Unique POS Tags (NUPT)
. POS Tag Density (PTD)
. Maximum and Mean Depth
. Number of Internal Nodes
. Maximum and Mean Number of Child Nodes for each Node
Figure : Phrase-structure parse of the example sentence:
(S1 (SQ (AUX Was) (NP (PRP$ my) (NN camera)) (VP (VBN repaired) (ADVP (RB already))) (. ?)))
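As an illustration, the phrase-structure features listed above can be computed from a bracketed parse such as the one on this slide. The tiny parser and the exact feature definitions below are our own sketch (e.g. depth is measured at the leaf words), not the authors' implementation:

```python
# Sketch: computing phrase-structure features from a bracketed parse.
# Feature names follow the slide; the precise computations are assumptions.

def parse(s):
    """Parse a bracketed tree string into (label, children) tuples."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    def helper(i):
        assert tokens[i] == "("
        label = tokens[i + 1]
        i += 2
        children = []
        while tokens[i] != ")":
            if tokens[i] == "(":
                child, i = helper(i)
                children.append(child)
            else:                          # a leaf word
                children.append((tokens[i], []))
                i += 1
        return (label, children), i + 1
    tree, _ = helper(0)
    return tree

def features(tree):
    pos_tags, depths, child_counts = [], [], []
    internal, words = 0, 0
    def walk(node, depth):
        nonlocal internal, words
        label, children = node
        if not children:                   # leaf: a word
            words += 1
            depths.append(depth)
            return
        if len(children) == 1 and not children[0][1]:
            pos_tags.append(label)         # preterminal = POS tag
        else:
            internal += 1                  # true internal node
        child_counts.append(len(children))
        for c in children:
            walk(c, depth + 1)
    walk(tree, 0)
    return {
        "NUPT": len(set(pos_tags)),                      # unique POS tags
        "PTD": len(pos_tags) / max(words, 1),            # tags per word
        "max_depth": max(depths),
        "mean_depth": sum(depths) / len(depths),
        "internal_nodes": internal,
        "max_children": max(child_counts),
        "mean_children": sum(child_counts) / len(child_counts),
    }

tree = parse("(S1 (SQ (AUX Was) (NP (PRP$ my) (NN camera)) "
             "(VP (VBN repaired) (ADVP (RB already))) (. ?)))")
print(features(tree))   # NUPT=6: {AUX, PRP$, NN, VBN, RB, .}
```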
10. 10/19
* Feature Set for Selecting Best MT System
* Probabilistic features :
.
Joint Probability of Input Sentence (JPIS): We
have approximated JPIS using trigram sequences.
P(S = w1 w2 w3 · · · wn) = P(w1) × P(w2|w1) × P(w3|w1 w2) × · · · × P(wn|wn−2 wn−1)
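A minimal sketch of how JPIS could be estimated is a trigram model over a corpus. The toy corpus and the add-alpha smoothing below are illustrative assumptions; in the experiments a language model built with the Moses toolkit would supply the real probabilities:

```python
# Sketch: approximating the joint probability of a sentence (JPIS)
# with a smoothed trigram model. Corpus and smoothing are toy choices.
import math
from collections import defaultdict

def train_trigram(corpus):
    """Collect trigram counts and their bigram-context counts, with <s> padding."""
    tri, bi = defaultdict(int), defaultdict(int)
    for sent in corpus:
        toks = ["<s>", "<s>"] + sent + ["</s>"]
        for i in range(2, len(toks)):
            tri[tuple(toks[i - 2:i + 1])] += 1
            bi[tuple(toks[i - 2:i])] += 1
    return tri, bi

def log_jpis(sent, tri, bi, vocab_size, alpha=1.0):
    """log P(w1 .. wn): sum of add-alpha smoothed trigram log-probabilities."""
    toks = ["<s>", "<s>"] + sent + ["</s>"]
    logp = 0.0
    for i in range(2, len(toks)):
        num = tri[tuple(toks[i - 2:i + 1])] + alpha
        den = bi[tuple(toks[i - 2:i])] + alpha * vocab_size
        logp += math.log(num / den)
    return logp

corpus = [["was", "my", "camera", "repaired", "already", "?"],
          ["my", "camera", "was", "repaired", "?"]]
tri, bi = train_trigram(corpus)
vocab = {w for s in corpus for w in s} | {"</s>"}
# A fluent (seen) word order scores higher than a scrambled one.
print(log_jpis(["my", "camera", "was", "repaired", "?"], tri, bi, len(vocab)))
print(log_jpis(["repaired", "?", "my", "was", "camera"], tri, bi, len(vocab)))
```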
11. 11/19
* Feature Set for Selecting Best MT System
* Probabilistic features :
Joint Probability Using N-gram Dependency (JPUND):
a dependency-based language model is described in
(Shen et al. 2008). JPUND for the dependency tree below is
calculated as:
JPUND = PT(repaired)
× PL(camera | repaired_head)
× PL(my | camera_head)
× PL(was | my, camera_head)
× PR(already | repaired_head)
× PR(? | already, repaired_head)
Figure : A dependency tree of "Was my camera repaired already ?", rooted at "repaired", with relations nsubj, poss, cop, advmod and punct.
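The JPUND factorisation above can be evaluated directly once the root probability PT and the left/right attachment probabilities PL, PR are available. The probability tables below are invented toy values purely for illustration; only the factorisation itself comes from the slide:

```python
# Sketch: evaluating the slide's JPUND expansion (Shen et al. 2008 style)
# with made-up probability tables. "X-head" marks the conditioning head word.
import math

PT = {"repaired": 0.2}                          # P(word is the root)
PL = {                                          # left-attachment factors
    ("camera", ("repaired-head",)): 0.3,
    ("my", ("camera-head",)): 0.4,
    ("was", ("my", "camera-head")): 0.25,
}
PR = {                                          # right-attachment factors
    ("already", ("repaired-head",)): 0.35,
    ("?", ("already", "repaired-head")): 0.5,
}

def log_jpund():
    """Sum the log of each factor in the slide's JPUND expansion."""
    factors = [
        PT["repaired"],
        PL[("camera", ("repaired-head",))],
        PL[("my", ("camera-head",))],
        PL[("was", ("my", "camera-head"))],
        PR[("already", ("repaired-head",))],
        PR[("?", ("already", "repaired-head"))],
    ]
    return sum(math.log(p) for p in factors)

print(math.exp(log_jpund()))   # product of the six toy factors
```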
12. 12/19
* Feature Set for Selecting Best MT System
* Dependency based features :
. Number of Dependency Links (NDL)
. Maximum Dependency Distance (MDD)
. Maximum amongst the Number of Dependents of a Word (MNDW)
Figure : The same dependency tree of "Was my camera repaired already ?" (repeated from the previous slide).
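A sketch of the three dependency-based features, computed from the tree's arcs represented as (head, dependent, relation) triples with 1-based token positions. The exact definitions (e.g. measuring dependency distance as the difference in token positions, and excluding the ROOT arc) are our assumptions:

```python
# Sketch: dependency-based features NDL, MDD and MNDW for the example
# sentence "Was my camera repaired already ?" (head index 0 = ROOT).
from collections import Counter

arcs = [
    (0, 4, "root"),      # ROOT -> repaired
    (4, 3, "nsubj"),     # repaired -> camera
    (3, 2, "poss"),      # camera  -> my
    (4, 1, "cop"),       # repaired -> Was
    (4, 5, "advmod"),    # repaired -> already
    (4, 6, "punct"),     # repaired -> ?
]

def dependency_features(arcs):
    links = [(h, d) for h, d, _ in arcs if h != 0]        # drop the ROOT arc
    return {
        "NDL": len(links),                                 # dependency links
        "MDD": max(abs(h - d) for h, d in links),          # longest distance
        "MNDW": max(Counter(h for h, _ in links).values()) # busiest head
    }

print(dependency_features(arcs))
```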
13. 13/19
* Experiments
* Questions to answer:
. Can features extracted from source sentences predict the quality of an MT system?
. Which machine learning algorithm is most appropriate for this classification task?
. How does the selection of different types of features influence the performance of the classifiers?
14. 14/19
* Experiments
* English-Bangla MT Systems
. AnglaMT: http://tdil-dc.in
. GoogleMT: https://translate.google.co.in
* Data Preparation
. 20K Basic Travel Expression Corpus (BTEC)
. 50K ILCI corpus: http://www.tdil-dc.in/
* Tools used in these experiments
. WEKA: http://www.cs.waikato.ac.nz/ml/weka
. Charniak parser: http://cs.brown.edu/~ec/
. Malt parser: http://www.maltparser.org/
. Moses toolkit: http://www.statmt.org/moses/
16. 16/19
* Conclusions
. A machine learning approach for selecting the MT system
that will produce the most appropriate translation, before
translating the input sentence.
. Our approach uses phrase-structure, probabilistic and
dependency features.
. The features used in this paper can also be applied to
similar NLP tasks where measuring the confidence of a
system is required.
. Experiments show that the IB1 classifier provides the best
performance when compared to other classifiers.
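IB1 (Aha et al., 1991) is essentially a 1-nearest-neighbour classifier. The pure-Python sketch below, over invented feature vectors and labels, illustrates how the extracted features could drive the final system choice; the feature values are not from the paper:

```python
# Sketch: IB1-style 1-nearest-neighbour selection of an MT system.
# Training instances pair toy feature vectors (e.g. NUPT, PTD, MDD)
# with the system that produced the best translation for that sentence.
import math

def ib1_predict(train, query):
    """Return the label of the training instance closest to `query`."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    _, label = min(train, key=lambda t: dist(t[0], query))
    return label

train = [
    ((6, 1.0, 3), "MT1"),
    ((9, 1.2, 6), "MT2"),
    ((12, 1.5, 9), "MT3"),
]
print(ib1_predict(train, (7, 1.1, 4)))   # nearest to the first instance: MT1
```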