A Polynomial-Time Dynamic Programming
Algorithm for Phrase-Based Decoding with a
Fixed Distortion Limit
Yin-Wen Chang 1
(Joint work with Michael Collins 1,2)
1Google, New York
2Columbia University
July 31, 2017
Introduction
Background:
Phrase-based decoding without further constraints is NP-hard
Proof: reduction from the travelling salesman problem (TSP) [Knight (1999)]
A hard distortion limit is commonly imposed in phrase-based machine translation (PBMT) systems
Question:
Is phrase-based decoding with a fixed distortion limit NP-hard or not?
Introduction
A related problem: bandwidth-limited TSP
[Figure: cities 1, 2, . . . , i, . . . , j on a line; an edge between cities i and j is allowed only if |i − j| ≤ d]
This work: a new decoding algorithm
Process the source words from left to right
Maintain multiple "tapes" on the target side
Run time: O(n · d! · l · h^{d+1})
n: source sentence length
d: distortion limit
Overview of the proposed decoding algorithm
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
Process the source words from left to right
Maintain multiple "tapes" in the target side
Steps, one per source phrase:
π1 ← π1 · (1, 2, this must)
π2 ← π2 · (3, 4, our concern)
π1 ← π1 · (5, 5, also)
π1 ← π1 · (6, 6, be) · π2
Final derivation:
π1 = (1, 2, this must)(5, 5, also)(6, 6, be)(3, 4, our concern)
Outline
Introduction to the phrase-based decoding problem
Target-side left-to-right: the usual decoding algorithm
Source-side left-to-right: the proposed algorithm
Time complexity of the proposed algorithm
Conclusion and future work
Phrase-based decoding problem
das muss unsere sorge gleichermaßen sein
this must also be our concern
Segment the German sentence into non-overlapping phrases
Find an English translation for each German phrase
Reorder the English phrases to get a better English sentence
Derivation: complete translation with phrase mappings
Sub-derivation: partial translation
Score a derivation
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
this must also be our concern
Phrase translation score: score(das muss, this must) + · · ·
Language model score:
score(<s> this must also be our concern </s>)
= score(this | <s>) + score(must | this) + · · ·
Reordering score: η · |2 + 1 − 5| + · · ·
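The three score components above can be sketched as one function. This is an illustrative sketch, not the paper's implementation: the function name, the bigram factorization, and the all-zero placeholder score tables are my own simplifications of what the slide shows.

```python
from collections import defaultdict

# Sketch of the derivation score from these slides: phrase translation score
# + bigram language model score + reordering score.  The function name and the
# all-zero placeholder score tables are mine, for illustration only.

def derivation_score(derivation, phrase_score, lm_score, eta):
    total, prev_word, prev_end = 0.0, "<s>", 0
    for s, t, words in derivation:
        total += phrase_score[(s, t, words)]        # phrase translation score
        for w in words.split():                     # language model score
            total += lm_score[(prev_word, w)]
            prev_word = w
        total += eta * abs(prev_end + 1 - s)        # reordering score
        prev_end = t
    total += lm_score[(prev_word, "</s>")]          # end-of-sentence LM term
    return total

derivation = [(1, 2, "this must"), (5, 5, "also"), (6, 6, "be"), (3, 4, "our concern")]
phrase = defaultdict(float)   # placeholder: every phrase score is 0
lm = defaultdict(float)       # placeholder: every bigram score is 0
print(derivation_score(derivation, phrase, lm, eta=-1.0))  # only reordering remains: -6.0
```

With the model scores zeroed out, only the reordering penalty η · Σ |t(prev) + 1 − s(next)| survives, which for this derivation is −1 · (0 + 2 + 0 + 4).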
Fixed distortion limit: distortion distance ≤ d
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
this must also be our concern
Distortion distance: |2 + 1 − 5| = 2
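The distortion distance and the hard limit can be checked mechanically. A minimal sketch: the function names are mine, and I assume (following common phrase-based conventions) that the first jump is measured from an imaginary phrase ending at position 0.

```python
# Distortion distance between consecutive phrases: |t(prev) + 1 - s(next)|,
# as on this slide.  Sketch only; the function names are mine, and measuring
# the first jump from position 0 is an assumption.

def jump_distances(derivation):
    prev_end, out = 0, []
    for s, t, _ in derivation:
        out.append(abs(prev_end + 1 - s))
        prev_end = t
    return out

def distortion_ok(derivation, d):
    return all(dist <= d for dist in jump_distances(derivation))

derivation = [(1, 2, "this must"), (5, 5, "also"), (6, 6, "be"), (3, 4, "our concern")]
print(jump_distances(derivation))    # [0, 2, 0, 4]; the slide's |2 + 1 - 5| = 2 is the second jump
print(distortion_ok(derivation, 4))  # True
print(distortion_ok(derivation, 2))  # False: the jump back to "unsere sorge" has distance 4
```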
Target-side left-to-right: the usual decoding algorithm
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
Build the translation left to right on the target side:
this must
this must also
this must also be
this must also be our concern
Target-side left-to-right: dynamic programming algorithm
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
Sub-derivation:
(1, 2, this must)(5, 5, also)(6, 6, be)(3, 4, our concern)
DP states after each phrase:
(must, 2, 110000)
(also, 5, 110010)
(be, 6, 110011)
(concern, 4, 111111)
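The target-side DP state is (last target word, end position of the phrase just translated, coverage bit-string over source positions). A sketch of how that state evolves over the example derivation; the helper name and tuple layout are mine.

```python
# Target-side DP state: (last target word, end position of the phrase just
# translated, bit-string marking translated source positions).  Sketch only;
# the helper name and tuple layout are mine.

def advance(state, phrase):
    _, _, coverage = state
    s, t, words = phrase
    bits = list(coverage)
    for i in range(s - 1, t):               # mark source positions s..t
        assert bits[i] == "0", "source position translated twice"
        bits[i] = "1"
    return (words.split()[-1], t, "".join(bits))

state = ("<s>", 0, "000000")
for phrase in [(1, 2, "this must"), (5, 5, "also"), (6, 6, "be"), (3, 4, "our concern")]:
    state = advance(state, phrase)
    print(state)
# Prints (must, 2, 110000), (also, 5, 110010), (be, 6, 110011),
# (concern, 4, 111111) -- the states shown on the slides.
```

The bit-string component is what makes this state space exponential in the sentence length without further constraints: there are 2^n possible coverage vectors.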
Source-side left-to-right: the proposed algorithm
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
Build the translation left to right on the source side, keeping disconnected target segments:
this must
this must | our concern
this must also | our concern
this must also be our concern
Source-side left-to-right: dynamic programming state
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
Sub-derivation:
(1, 2, this must)(5, 5, also)(6, 6, be)(3, 4, our concern)
DP states as the source words are processed:
j = 2, σ1 = (1, this, 2, must)
j = 4, σ1 = (1, this, 2, must), σ2 = (3, our, 4, concern)
j = 5, σ1 = (1, this, 5, also), σ2 = (3, our, 4, concern)
j = 6, σ1 = (1, this, 4, concern)
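Each "tape" record σ keeps only a segment's endpoints. A sketch of that projection from a full segment; `segment_signature` is my name for it.

```python
# Source-side DP record per target segment: sigma = (s, w_s, t, w_t) -- the
# first source index and first target word, and the last source index and
# last target word of the segment.  Sketch; segment_signature is my name.

def segment_signature(segment):
    s_first, _, first_words = segment[0]
    _, t_last, last_words = segment[-1]
    return (s_first, first_words.split()[0], t_last, last_words.split()[-1])

# State after processing source words 1..5 in the running example:
pi1 = [(1, 2, "this must"), (5, 5, "also")]
pi2 = [(3, 4, "our concern")]
print(segment_signature(pi1))   # (1, 'this', 5, 'also')
print(segment_signature(pi2))   # (3, 'our', 4, 'concern')
```

Only these endpoints matter for extending the state: the boundary words feed the language model and the boundary indices feed the distortion check, so the interior of each segment can be forgotten.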
The number of DP states (fixed distortion limit d)
State: j, {σ1, σ2, . . . , σr}    r: number of "tapes"
j ∈ {1, . . . , n}    n: source sentence length → O(n)
σ = (s(σ), ws(σ), t(σ), wt(σ))    Ex: σ = (1, this, 5, also) → O(g(d) · h^{d+1})
s, t: source word indices
ws, wt: translated target words
[Figure: source positions 1, 2, . . . , j − d, . . . , j, j + 1, . . . ; positions up to j are translated; the next phrase starts at j + 1; s and t can only occur in the window around j]
s(σ1) = 1
s(σi) ∈ {j − d + 2, . . . , j} ∀i ∈ {2, . . . , r}
t(σi) ∈ {j − d, . . . , j} ∀i ∈ {1, . . . , r}
r is bounded by d + 1.
Number of states: O(n · g(d) · h^{d+1})
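The constraints above can be expressed as a validity check on a state. A sketch; `state_valid` is my name, and σ1 is taken to be the segment whose source span starts at word 1, as on the slide.

```python
# Validity constraints on a source-side state (j, {sigma_1 .. sigma_r}) as
# listed on this slide.  Sketch only; state_valid is my name, and sigma_1 is
# assumed to be the segment whose source span starts at word 1.

def state_valid(j, sigmas, d):
    if not sigmas or sigmas[0][0] != 1:              # s(sigma_1) = 1
        return False
    if len(sigmas) > d + 1:                          # r is bounded by d + 1
        return False
    for i, (s, _, t, _) in enumerate(sigmas):
        if i > 0 and not (j - d + 2 <= s <= j):      # s(sigma_i) in {j-d+2 .. j}
            return False
        if not (j - d <= t <= j):                    # t(sigma_i) in {j-d .. j}
            return False
    return True

# The j = 5 state from the running example, with distortion limit d = 4:
print(state_valid(5, [(1, "this", 5, "also"), (3, "our", 4, "concern")], d=4))  # True
```

Because every s and t is confined to a window of about d + 1 positions around j, the number of distinct states is polynomial in n for fixed d, which is where the O(n · g(d) · h^{d+1}) count comes from.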
Extend a sub-derivation by four operations
Current sub-derivation: j, π1, π2, . . . , πr
Consider a new phrase p starting at source position j + 1 → O(l)
New segment: πr+1 = p
Append: πi = πi , p
Prepend: πi = p , πi
Concatenate: πi = πi , p , πi′ → O(r^2) = O(d^2)
Example with p = (5, 5, also):
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
Current: π1 = (1, 2, this must), π2 = (3, 4, our concern)
New segment: π3 = (5, 5, also)
Append: π1 = (1, 2, this must)(5, 5, also) or π2 = (3, 4, our concern)(5, 5, also)
Prepend: π2 = (5, 5, also)(3, 4, our concern)
Concatenate: π1 = (1, 2, this must)(5, 5, also)(3, 4, our concern)
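The four operations can be sketched over tuples of segments, where each segment is a list of phrases (s, t, translation); the function names are mine.

```python
# The four ways to extend a sub-derivation with a new phrase p, sketched over
# tuples of segments.  Function names are mine.

def new_segment(segments, p):
    return segments + ([p],)

def append(segments, i, p):
    return tuple(seg + [p] if k == i else seg for k, seg in enumerate(segments))

def prepend(segments, i, p):
    return tuple([p] + seg if k == i else seg for k, seg in enumerate(segments))

def concatenate(segments, i, k, p):
    """Merge segments i and k into one segment with p in the middle."""
    joined = segments[i] + [p] + segments[k]
    kept = tuple(seg for idx, seg in enumerate(segments) if idx not in (i, k))
    return kept + (joined,)

pi = ([(1, 2, "this must")], [(3, 4, "our concern")])
p = (5, 5, "also")
print(append(pi, 0, p))          # p joins segment 1: "this must also"
print(prepend(pi, 1, p))         # p precedes segment 2: "also our concern"
print(concatenate(pi, 0, 1, p))  # one segment: "this must also our concern"
print(new_segment(pi, p))        # a third segment containing only p
```

Choosing which segment to extend (or which pair of segments to merge for concatenate) is what gives the O(r^2) = O(d^2) factor in the transition count.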
Bound on running time: O(n · d! · l · h^{d+1})
# DP states: O(n · g(d) · h^{d+1})
# transitions: O(d^2 · l)
n: source sentence length
d: distortion limit
l: bound on the number of phrases starting at any position
h: bound on the maximum number of target translations for any source word
Summary
Problem: phrase-based decoding with a fixed distortion limit
A new decoding algorithm with O(n · d! · l · h^{d+1}) time
Operate from left to right on the source side
Maintain multiple "tapes" on the target side
Follow-up paper with experimental results, to appear in EMNLP 2017:
"Source-side left-to-right or target-side left-to-right?
An empirical comparison of two phrase-based decoding algorithms"
Beam search with a trigram language model
Constraints on the number of "tapes"
Achieves efficiency and accuracy similar to Moses
Future work
Finite-state transducer (FST) formulation
Ex: state j = 4, σ1 = (1, this, 2, must), σ2 = (3, our, 4, concern)
→ state j = 5, σ1 = (1, this, 5, also), σ2 = (3, our, 4, concern)
via transition 1 · (5, 5, also), 2 (appending (5, 5, also) to segment 1)
Neural machine translation
An NMT system using this kind of approach?
Replace the attention model by absorbing source words strictly left-to-right?
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
 
Elior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
Elior Sulem - 2018 - Semantic Structural Evaluation for Text SimplificationElior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
Elior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
 
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future DirectionsDaniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
Daniel Gildea - 2018 - The ACL Anthology: Current State and Future Directions
 
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
Wenqiang Lei - 2018 - Sequicity: Simplifying Task-oriented Dialogue Systems w...
 
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
 

Overview of the proposed decoding algorithm
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
π1 ← π1 · (6, 6, be) · π2
π1 = (1, 2, this must)(5, 5, also)(6, 6, be)(3, 4, our concern)
π2 = (3, 4, our concern)
Process the source word from left-to-right
Maintain multiple “tapes” in the target side
Outline
Introduction of the phrase-based decoding problem
Target-side left-to-right: the usual decoding algorithm
Source-side left-to-right: the proposed algorithm
Time complexity of the proposed algorithm
Conclusion and future work
Phrase-based decoding problem
das muss unsere sorge gleichermaßen sein
this must also be our concern
Segment the German sentence into non-overlapping phrases
Find an English translation for each German phrase
Reorder the English phrases to get a better English sentence
Derivation: complete translation with phrase mappings
Sub-derivation: partial translation
Score a derivation
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
this must also be our concern
Phrase translation score: score(das muss, this must) + · · ·
Language model score: score(<s> this must also be our concern </s>) = score(this|<s>) + score(must|this) + · · ·
Reordering score: η · |2 + 1 − 5|
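The three score components above can be sketched in code. This is a minimal illustration, not the paper's implementation: the score tables, the value of η, and all function names are toy assumptions chosen to match the running example.

```python
# Toy phrase translation scores (log-scale); values are illustrative only.
PHRASE_SCORES = {
    ("das muss", "this must"): -0.5,
    ("gleichermaßen", "also"): -0.7,
    ("sein", "be"): -0.3,
    ("unsere sorge", "our concern"): -0.6,
}

# Toy bigram language model scores; values are illustrative only.
LM_SCORES = {
    ("<s>", "this"): -0.2, ("this", "must"): -0.1, ("must", "also"): -0.4,
    ("also", "be"): -0.3, ("be", "our"): -0.5, ("our", "concern"): -0.2,
    ("concern", "</s>"): -0.3,
}

ETA = -1.0  # reordering weight η (assumed value)

def score_derivation(phrases, eta=ETA):
    """phrases: list of (start, end, source_str, target_str) in target order."""
    # Phrase translation score: sum over all phrase pairs.
    total = sum(PHRASE_SCORES[(src, tgt)] for _, _, src, tgt in phrases)
    # Language model score over the concatenated target sentence.
    words = ["<s>"] + " ".join(t for _, _, _, t in phrases).split() + ["</s>"]
    total += sum(LM_SCORES[(a, b)] for a, b in zip(words, words[1:]))
    # Reordering score: eta * |t(prev) + 1 - s(next)| for adjacent phrases.
    for (_, t1, _, _), (s2, _, _, _) in zip(phrases, phrases[1:]):
        total += eta * abs(t1 + 1 - s2)
    return total

derivation = [
    (1, 2, "das muss", "this must"),
    (5, 5, "gleichermaßen", "also"),
    (6, 6, "sein", "be"),
    (3, 4, "unsere sorge", "our concern"),
]
print(score_derivation(derivation))  # phrase + LM + reordering ≈ -10.1
```

The reordering term charges |2 + 1 − 5| = 2 for the jump from "this must" to "also", exactly as on the slide.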
Fixed distortion limit: distortion distance ≤ d
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
this must also be our concern
Distortion distance: |2 + 1 − 5| = 2
Target-side left-to-right: the usual decoding algorithm
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
The translation is built left-to-right on the target side:
this must
this must also
this must also be
this must also be our concern
Target-side left-to-right: dynamic programming algorithm
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
Sub-derivation: (1, 2, this must)(5, 5, also)(6, 6, be)(3, 4, our concern)
DP states (last target word, end of last source phrase, coverage bit-string) as the translation grows:
(must, 2, 110000)
(also, 5, 110010)
(be, 6, 110011)
(concern, 4, 111111)
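The target-side DP state and its extension step can be sketched as follows. This is a minimal sketch under our own representation choices (tuples for states, a `(start, end, target)` triple for phrases); the helper `extend` and the default limit `d=4` are assumptions to make the running example go through.

```python
def extend(state, phrase, d=4):
    """Extend a target-side DP state (last_word, last_pos, coverage) with a
    phrase (s, t, target). Returns the new state, or None if the phrase
    overlaps already-covered source words or violates the distortion limit d."""
    last_word, last_pos, coverage = state
    s, t, target = phrase
    if any(coverage[i - 1] for i in range(s, t + 1)):
        return None  # some source word in [s, t] is already translated
    if abs(last_pos + 1 - s) > d:
        return None  # distortion distance exceeds the limit
    new_cov = tuple(int(c or (s - 1 <= i <= t - 1))
                    for i, c in enumerate(coverage))
    return (target.split()[-1], t, new_cov)

# Replay the slide's sequence of states, starting from (must, 2, 110000).
state = ("must", 2, (1, 1, 0, 0, 0, 0))
state = extend(state, (5, 5, "also"))         # (also, 5, 110010)
state = extend(state, (6, 6, "be"))           # (be, 6, 110011)
state = extend(state, (3, 4, "our concern"))  # (concern, 4, 111111)
print(state)
```

Note that the final step has distortion |6 + 1 − 3| = 4, so it is only legal when d ≥ 4; with d = 3 `extend` would return None there.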
Source-side left-to-right: the proposed algorithm
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
The source is processed left-to-right, with the target built on multiple tapes:
this must
this must … our concern
this must also … our concern
this must also be our concern
Source-side left-to-right: dynamic programming state
1 2 3 4 5 6
das muss unsere sorge gleichermaßen sein
Final derivation: (1, 2, this must)(5, 5, also)(6, 6, be)(3, 4, our concern)
Each tape is summarized by a signature σ = (s, ws, t, wt): start index and first target word, end index and last target word.
DP states as the source is processed left-to-right:
j = 2, σ1 = (1, this, 2, must)
j = 4, σ1 = (1, this, 2, must), σ2 = (3, our, 4, concern)
j = 5, σ1 = (1, this, 5, also), σ2 = (3, our, 4, concern)
j = 6, σ1 = (1, this, 4, concern)
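The source-side DP state can be sketched by collapsing each tape to its signature. This is a minimal sketch, not the paper's code: the `(s, t, target)` phrase format and the helper names `signature` and `dp_state` are our own assumptions.

```python
def signature(tape):
    """Collapse a tape (a list of phrases in target order) to (s, ws, t, wt):
    start index and first word of the first phrase, end index and last word
    of the last phrase -- the only information the language model and the
    distortion checks need about the tape."""
    s, _, first = tape[0]
    _, t, last = tape[-1]
    return (s, first.split()[0], t, last.split()[-1])

def dp_state(j, tapes):
    """A DP state: the current source position j plus one signature per tape."""
    return (j, tuple(signature(tape) for tape in tapes))

# The state reached after processing source word 5 in the running example:
tapes = [[(1, 2, "this must"), (5, 5, "also")], [(3, 4, "our concern")]]
print(dp_state(5, tapes))  # (5, ((1, 'this', 5, 'also'), (3, 'our', 4, 'concern')))
```

This matches the slide's state for j = 5: the first tape spans source words 1..5 with boundary words this/also, the second spans 3..4 with our/concern.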
The number of DP states (fixed distortion limit d)
State: j, {σ1, σ2 . . . σr }, where r is the number of “tapes”
j ∈ {1, . . . , n}, n: source sentence length → O(n) choices
σ = (s(σ), ws(σ), t(σ), wt(σ)), e.g. σ = (1, this, 5, also)
s, t: source word indices
ws, wt: translated target words
1 2 3 4 . . . j − d . . . j j + 1 . . .
Source words up to j are translated and the next phrase starts at j + 1, so s and t can only occur in the window {j − d, . . . , j}:
s(σ1) = 1
s(σi ) ∈ {j − d + 2 . . . j} ∀i ∈ {2 . . . r}
t(σi ) ∈ {j − d . . . j} ∀i ∈ {1 . . . r}
r is bounded by d + 1.
→ O(g(d) · h^(d+1)) choices for the signatures
Number of states: O(n · g(d) · h^(d+1))
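The window constraints above can be sketched as a validity check on a state. This is a small illustration in our own notation (states as `(j, sigmas)` with each σ a `(s, ws, t, wt)` tuple); it only encodes the conditions listed on the slide, not the full reachability argument.

```python
def is_valid_state(j, sigmas, d):
    """Check the constraints a reachable state (j, {sigma_1..sigma_r}) must
    satisfy under distortion limit d, per the slide."""
    r = len(sigmas)
    if not (1 <= r <= d + 1):                  # r is bounded by d + 1
        return False
    if sigmas[0][0] != 1:                      # s(sigma_1) = 1
        return False
    for i, (s, _, t, _) in enumerate(sigmas):
        if i > 0 and not (j - d + 2 <= s <= j):  # later tapes start near j
            return False
        if not (j - d <= t <= j):                # every tape ends near j
            return False
    return True

# The j = 5 state from the running example, with d = 4:
print(is_valid_state(5, [(1, "this", 5, "also"), (3, "our", 4, "concern")], d=4))
```

Because s and t are confined to a window of size O(d) and r ≤ d + 1, only a function of d (not of n) many signature combinations exist, which is what makes the overall state count polynomial in n.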
  • 56-65. Extend a sub-derivation by four operations
    Current sub-derivation: (j, π1, π2, ..., πr)
    Consider a new phrase p starting at source position j + 1 → O(l)
    New segment: πr+1 = p
    Append: π′i = πi, p
    Prepend: π′i = p, πi
    Concatenate: π′i = πi, p, πk
    → O(r^2) = O(d^2)
    Example (source: das muss unsere sorge gleichermaßen sein), with sub-derivation
    π1 = (1, 2, this must), π2 = (3, 4, our concern) and new phrase p = (5, 5, also):
    New segment: π3 = (5, 5, also)
    Append: π2 = (3, 4, our concern)(5, 5, also)
    Prepend: π2 = (5, 5, also)(3, 4, our concern)
    Concatenate: π1 = (1, 2, this must)(5, 5, also)(3, 4, our concern)
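The four extension operations can be sketched on a list of tapes, each tape a sequence of (start, end, target) phrases as written on the slides. Function names and the list-of-lists representation are illustrative assumptions, not the paper's implementation.

```python
from typing import List, Tuple

Phrase = Tuple[int, int, str]            # (start, end, target string)
Tapes = List[List[Phrase]]

def new_segment(tapes: Tapes, p: Phrase) -> Tapes:
    """pi_{r+1} = p: open a fresh tape holding only p."""
    return tapes + [[p]]

def append(tapes: Tapes, i: int, p: Phrase) -> Tapes:
    """pi'_i = pi_i, p: add p at the end of tape i."""
    out = [list(t) for t in tapes]
    out[i].append(p)
    return out

def prepend(tapes: Tapes, i: int, p: Phrase) -> Tapes:
    """pi'_i = p, pi_i: add p at the front of tape i."""
    out = [list(t) for t in tapes]
    out[i] = [p] + out[i]
    return out

def concatenate(tapes: Tapes, i: int, k: int, p: Phrase) -> Tapes:
    """pi'_i = pi_i, p, pi_k: join tapes i and k through p (assumes i < k).
    The O(r^2) = O(d^2) transition count comes from the choice of (i, k)."""
    out = [list(t) for t in tapes]
    out[i] = out[i] + [p] + out[k]
    del out[k]
    return out
```

For the slide's example, with `tapes = [[(1, 2, "this must")], [(3, 4, "our concern")]]` and `p = (5, 5, "also")`, `concatenate(tapes, 0, 1, p)` yields the single tape `(1, 2, this must)(5, 5, also)(3, 4, our concern)`.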
  • 66. Bound on running time: O(n · d! · l · h^(d+1))
    # DP states: O(n · g(d) · h^(d+1))
    # transitions per state: O(d^2 · l)
    n: source sentence length
    d: distortion limit
    l: bound on the number of phrases starting at any position
    h: bound on the maximum number of target translations for any source word
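The headline bound multiplies the state count by the transitions per state. A small sketch makes the key point concrete: for a fixed distortion limit d, the bound is linear in the sentence length n, which is the sense in which the algorithm is polynomial-time. Folding g(d) · d^2 into d! is an illustrative simplification here, following the slide's summary form.

```python
from math import factorial

def runtime_bound(n: int, d: int, l: int, h: int) -> int:
    """Illustrative evaluation of the O(n * d! * l * h^(d+1)) bound.

    States O(n * g(d) * h^(d+1)) times transitions O(d^2 * l) per state,
    with the d-dependent factors summarized as d! (constants omitted).
    """
    return n * factorial(d) * l * h ** (d + 1)
```

Doubling n doubles the bound, while increasing d inflates it through both the factorial and the exponent: the algorithm is polynomial in n but not in d.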
  • 67. Summary
    Problem: phrase-based decoding with a fixed distortion limit
    A new decoding algorithm with O(n · d! · l · h^(d+1)) time
    Operates from left to right on the source side
    Maintains multiple “tapes” on the target side
  • 68. Follow-up paper in EMNLP discussing experimental results
    To appear in EMNLP 2017: “Source-side left-to-right or target-side left-to-right? An empirical comparison of two phrase-based decoding algorithms”
    Beam search with a trigram language model
    Constraints on the number of “tapes”
    Achieves efficiency and accuracy similar to Moses
  • 69. Future work
    Finite state transducer (FST) formulation: DP states become FST states; e.g. the state (j = 4; σ1 = (1, this, 2, must); σ2 = (3, our, 4, concern)) has a transition labeled “1 · (5, 5, also), 2” to the state (j = 5; σ1 = (1, this, 5, also); σ2 = (3, our, 4, concern))
    Neural machine translation: an NMT system using this kind of approach?
    Replace the attention model by absorbing source words strictly left-to-right?