Post-editese: Translation Characteristics Exacerbated by Machine Translation

Post-editese: an Exacerbated Translationese
Antonio Toral
MT Summit, Dublin, 23rd August 2019

Table of contents
1. Intro and Motivation
2. Datasets
3. Experiments
4. Conclusions and Future
1

Abbreviations
• Translation types: HT (human from scratch), PE (post-editing), MT
(machine translation)
• Languages: ISO-2 codes, e.g. DE, EN, ES, ZH ...
• MT types: RBMT (rule-based), SMT (statistical), NMT (neural)
2

The Reader
Does PE aﬀect the reading experience?
3

The Reader
Does PE aﬀect the reading experience?
→ Are PE translations = HT?
3

PE vs HT. Theory and Practice
In theory, PE=HT
1. Translator is primed by MT output while post-editing (Green et al., 2013)
2. PE should contain the footprint of MT
3. HT should be preferred over PE
4

In practice, quality of PE
• comparable to that of HT. E.g., Garcia (2010)
• or even better. E.g. Plitt and Masselot (2010)
5

In practice, quality of PE
• comparable to that of HT. E.g., Garcia (2010)
• or even better. E.g. Plitt and Masselot (2010)
But... Quality is typically measured as number of errors (Koponen, 2016)
5

PE vs HT. Beyond Number of Errors
Characteristics of PE vs HT
• Czulo and Nitzke (2016) Terminology in PE closer to MT than HT
• Daems et al. (2017) Discrimination between PE and HT not possible
• Farrell (2018) lexical variability in PE<HT
6

PE vs HT. Translationese
Research has proven the existence of translationese: HT=original text
• Normalisation
• Simpliﬁcation
• Interference
• Explicitation
7

PE vs HT. Translationese
Research has proven the existence of translationese: HT=original text
• Normalisation
• Simpliﬁcation
• Interference
• Explicitation
This paper: quantitative analysis of PE vs HT in terms of translationese principles
7

PE vs HT. Datasets
Dataset MT systems Direction PE type Domain # Sent. pairs
Tarax¨u 2 SMT, 2 RBMT en→de Light News 272
(2011) de→en 240
es→de 101
IWSLT 4 NMT, 4 SMT en→de Light Subtitles 600
(2016) 2 NMT, 3 SMT en→fr
MS 1 NMT zh→en Full News 1,000
(2018)
8

PE vs HT. Datasets
Dataset MT systems Direction PE type Domain # Sent. pairs
Tarax¨u 2 SMT, 2 RBMT en→de Light News 272
(2011) de→en 240
es→de 101
IWSLT 4 NMT, 4 SMT en→de Light Subtitles 600
(2016) 2 NMT, 3 SMT en→fr
MS 1 NMT zh→en Full News 1,000
(2018)
Competence missmatch. PE=prof, HT=anyone
8

Lexical Variety
type-token ratio =
number of types
number of tokens
(1)
9

Lexical Variety
type-token ratio =
number of types
number of tokens
(1)
→ Simpliﬁcation principle
9

Lexical Variety Results (Microsoft)
zhen
0.1760
0.1780
0.1800
0.1820
0.1840
0.1860
0.1880
ht
nmt1
nmt2
penmt
type-tokenratio
10

Lexical Variety Results (Microsoft)
zhen
0.1760
0.1780
0.1800
0.1820
0.1840
0.1860
0.1880
ht
nmt1
nmt2
penmt
type-tokenratio
HT > PE > MT
10

Lexical Variety Results (all)
Translation Dataset and translation direction
type Tarax¨u IWSLT MS
de→en en→de es→de en→de en→fr zh→en
HT 0.26 0.27 0.31 0.20 0.16 0.14
PE -2.05% -1.81% -1.27% -3.86% -1.17% -4.76%
MT -2.94% -3.62% -5.91% -10.93% -6.04% -6.96%
11

HT 0.26 0.27 0.31 0.20 0.16 0.14
PE -2.05% -1.81% -1.27% -3.86% -1.17% -4.76%
MT -2.94% -3.62% -5.91% -10.93% -6.04% -6.96%
PE-NMT -4.21% -1.88% -4.76%
PE-SMT -1.59% -1.31% -1.03% -3.50% -0.70%
PE-RBMT -2.79% -2.04% -3.05%
11

HT 0.26 0.27 0.31 0.20 0.16 0.14
PE -2.05% -1.81% -1.27% -3.86% -1.17% -4.76%
MT -2.94% -3.62% -5.91% -10.93% -6.04% -6.96%
PE-NMT -4.21% -1.88% -4.76%
PE-SMT -1.59% -1.31% -1.03% -3.50% -0.70%
PE-RBMT -2.79% -2.04% -3.05%
NMT -12.22% -8.18% -7.33%
SMT -2.36% -2.36% -6.42% -9.63% -4.61%
RBMT -3.08% -4.26% -7.78%
11

Lexical Density
lexical density =
number of content words
number of total words
(2)
Content words: adverbs, adjectives, nouns and verbs (UDPipe)
12

Lexical Density
lexical density =
number of content words
number of total words
(2)
Content words: adverbs, adjectives, nouns and verbs (UDPipe)
→ Simpliﬁcation principle
12

Lexical Density Results (Taraxu)
ende deen esde
0.4800
0.4900
0.5000
0.5100
0.5200
0.5300
0.5400
0.5500
0.5600
HT
MT
PE
LexicalDensity
13

Lexical Density Results (Taraxu)
ende deen esde
0.4800
0.4900
0.5000
0.5100
0.5200
0.5300
0.5400
0.5500
0.5600
HT
MT
PE
LexicalDensity
HT > PE MT
13

Lexical Density Results (all)
Type Tarax¨u IWSLT MS
HT 0.55 0.53 0.53 0.48 0.46 0.59
PE -1.00% -2.48% -4.31% -3.46% -1.24% -0.46%
MT -0.81% -0.69% -4.53% -5.14% -0.94% -2.37%
14

HT 0.55 0.53 0.53 0.48 0.46 0.59
PE -1.00% -2.48% -4.31% -3.46% -1.24% -0.46%
MT -0.81% -0.69% -4.53% -5.14% -0.94% -2.37%
PE-NMT -3.88% -1.47% -0.46%
PE-SMT -0.54% -2.87% -4.78% -3.04% -1.09%
PE-RBMT -1.46% -2.09% -3.84%
14

HT 0.55 0.53 0.53 0.48 0.46 0.59
PE -1.00% -2.48% -4.31% -3.46% -1.24% -0.46%
MT -0.81% -0.69% -4.53% -5.14% -0.94% -2.37%
PE-NMT -3.88% -1.47% -0.46%
PE-SMT -0.54% -2.87% -4.78% -3.04% -1.09%
PE-RBMT -1.46% -2.09% -3.84%
NMT -6.31% -3.14% -2.37%
SMT -0.80% 0.14% -3.45% -3.98% 0.53%
RBMT -0.83% -1.51% -5.61%
14

Length Ratio
length ratio =
|lengthST − lengthTT |
lengthST
(3)
15

Length Ratio
length ratio =
lengthST
(3)
Hypothesis: compared to HT, PE translations are closer in length to source text
15

Length Ratio
length ratio =
lengthST
(3)
Hypothesis: compared to HT, PE translations are closer in length to source text
→ Normalisation principle
15

Length Ratio Results (Taraxu)
ende deen esde dees
0.000
0.050
0.100
0.150
0.200
0.250
ht
pesmt1
pesmt2
perbmt1
perbmt2
lengthratio
16

Length Ratio Results (Taraxu)
ende deen esde dees
0.000
0.050
0.100
0.150
0.200
0.250
ht
pesmt1
pesmt2
perbmt1
perbmt2
lengthratio
HT > PESMT ≥ PERBMT
16

Length Ratio Results (all)
Dataset Direction
Length ratio
HT PE MT
Tarax¨u
de→en 0.16 -38.5% -36.9%
en→de 0.22 -33.4% -38.5%
es→de 0.17 -25.2% -21.0%
IWSLT
en→de 0.17 -3.4% -18.8%
en→fr 0.18 6.7% -10.9%
MS zh→en 1.41 -9.9% -9.1%
17

Length Ratio Results (all)
Dataset Direction
Length ratio
HT PE MT
Tarax¨u
de→en 0.16 -38.5% -36.9%
en→de 0.22 -33.4% -38.5%
es→de 0.17 -25.2% -21.0%
IWSLT
en→de 0.17 -3.4% -18.8%
en→fr 0.18 6.7% -10.9%
MS zh→en 1.41 -9.9% -9.1%
Competence missmatch. PE=prof, HT=anyone
17

Perplexity on PoS Sequences
Process:
1. PoS tag monolingual corpora (Universal Dependencies tag set) for source
and target languages
2. Build language models on PoS tagged data
3. PoS tag each translation (MT, PE and HT) and calculate:
18

Process:
PP diﬀ = PP(translation, LMsource) − PP(translation, LMtarget) (4)
18

Process:
Hypothesis: PP diﬀPE < PP diﬀHT
18

Process:
Hypothesis: PP diﬀPE < PP diﬀHT
→ Interference principle
18

Perplexity Results (Taraxu)
ende deen
0.00
1.00
2.00
3.00
4.00
5.00
6.00
ht
smt1
pesmt1
smt2
pesmt2
rbmt1
perbmt1
rbmt2
perbmt2
Perplexitydifference
19

Perplexity Results (all)
HT 5.12 5.09 9.41 5.01 2.47 17.23
PE -13.84% -11.29% -8.58% -6.26% -2.03% -3.26%
MT -33.65% -32.25% -20.71% -18.66% -11.07% -3.1%
20

HT 5.12 5.09 9.41 5.01 2.47 17.23
PE -13.84% -11.29% -8.58% -6.26% -2.03% -3.26%
MT -33.65% -32.25% -20.71% -18.66% -11.07% -3.1%
PE-NMT -3.41% -1.40% -3.26%
PE-SMT -11.72% -13.37% -10.48% -9.10% -2.46%
PE-RBMT -15.95% -9.20% -6.68%
20

HT 5.12 5.09 9.41 5.01 2.47 17.23
PE -13.84% -11.29% -8.58% -6.26% -2.03% -3.26%
MT -33.65% -32.25% -20.71% -18.66% -11.07% -3.1%
PE-NMT -3.41% -1.40% -3.26%
PE-SMT -11.72% -13.37% -10.48% -9.10% -2.46%
PE-RBMT -15.95% -9.20% -6.68%
NMT -5.89% -2.58% -3.10%
SMT -30.07% -41.71% -26.30% -31.43% -7.95%
RBMT -37.24% -22.80% -15.13%
20

Conclusions
PE=HT. PEs:
• Are simpler (lexical variety and density)
• Are more normalised (length ratio)
• Have more interference from the source language (PoS sequences)
21

Conclusions
PE=HT. PEs:
• Are simpler (lexical variety and density)
• Are more normalised (length ratio)
• Have more interference from the source language (PoS sequences)
MT paradigms
• (PE)SMT better than (PE)NMT in lexical variety and density
• (PE)NMT has less interference than (PE)SMT
21

Discussion
1. Does PE contribute to the impoverishment of the target language?
22

Discussion
2. In this study HT better than PE. But number of errors HT≥PE
22

Discussion
• PE may be better suited than HT for some domains, e.g. technical
22

Discussion
3. It’s not the fault of the post-editing process per se... but of MT
22

Discussion
3. It’s not the fault of the post-editing process per se... but of MT
• PEs should be better if MT is. E.g. interference in PE-NMT<PE-SMT
because interference in NMT<SMT
22

Future
• Eﬀect of PE guidelines, translator’s expertise, etc.
• Measures with deeper linguistic information
• Automatic discrimination between PE and HT
• More data (industry?)
Data and code available: https://bit.ly/2zeKf0b
23

Thanks: L. Bentivogli, S. Castilho, J. Daems, M. Farrell, L. Macken, L. Marg
and M. Popovi´c
23

Thanks: L. Bentivogli, S. Castilho, J. Daems, M. Farrell, L. Macken, L. Marg
and M. Popovi´c
Go raibh maith agaibh!
Ceisteanna?
Antonio Toral
@ atoral
23

References i
References
L. Bowker and J. Buitrago Ciro. Investigating the usefulness of machine
translation for newcomers at the public library. Translation and Interpreting
Studies, 10(2):165–186, 2015. ISSN 1932-2798. doi: 10.1075/tis.10.2.01bow.
URL http://www.jbe-platform.com/content/journals/10.1075/tis.
10.2.01bow.

References ii
O. Czulo and J. Nitzke. Patterns of terminological variation in post-editing and
of cognate use in machine translation in contrast to human translation. In
Proceedings of the 19th Annual Conference of the European Association for
Machine Translation, EAMT 2017, Riga, Latvia, May 30 - June 1, 2016, pages
106–114. European Association for Machine Translation, 2016. URL
https://aclanthology.info/papers/W16-3401/w16-3401.
J. Daems, O. De Clercq, and L. Macken. Translationese and post-editese : how
comparable is comparable quality? LINGUISTICA ANTVERPIENSIA NEW
SERIES-THEMES IN TRANSLATION STUDIES, 16:89–103, 2017. ISSN
0304-2294. URL https://lans-tts.uantwerpen.be/index.php/
LANS-TTS/article/view/434/409.

References iii
M. Farrell. Machine Translation Markers in Post-Edited Machine Translation
Output. In Proceedings of the 40th Conference Translating and the Computer,
pages 50–59, 2018.
R. Fiederer and S. O’Brien. Quality and machine translation: A realistic
objective. The Journal of Specialised Translation, 11:52–74, 2009.
I. Garcia. Is machine translation ready yet? Target. International Journal of
Translation Studies, 22(1):7–21, 2010.
S. Green, J. Heer, and C. D. Manning. The eﬃcacy of human post-editing for
language translation. Chi 2013, pages 439–448, 2013. doi:
10.1145/2470654.2470718. URL
http://vis.stanford.edu/papers/post-editing.

References iv
M. Koponen. Is machine translation post-editing worth the eﬀort? A survey of
research into post-editing and eﬀort. Journal of Specialised Translation, 25
(25):131–148, 2016. ISSN 0169-2607. URL
https://sites.google.com/site/wptp2015/.
M. Plitt and F. Masselot. A productivity test of statistical machine translation
post-editing in a typical localisation context. The Prague bulletin of
mathematical linguistics, 93:7–16, 2010.

Post-editese: Translation Characteristics Exacerbated by Machine Translation

Recommended

Recommended

More Related Content

What's hot

What's hot (7)

Similar to Post-editese: Translation Characteristics Exacerbated by Machine Translation

Similar to Post-editese: Translation Characteristics Exacerbated by Machine Translation (20)

Recently uploaded

Recently uploaded (20)

Post-editese: Translation Characteristics Exacerbated by Machine Translation