Hybrid Solutions for
Translation: Going hybrid
Qun Liu (DCU)
Dr. Manuel Herranz (Pangeanic)
12 November 2013, Birmingham, UK
PART A
Qun Liu (DCU)
qliu@computing.dcu.ie
Outline
 Why Hybrid MT?
 An overview of Hybrid MT
 Typical Hybrid MT Approaches
 Conclusion

Winter School 2013, Birmingham
MT Approaches
 RBMT: Rule-based Machine Translation
 EBMT: Example-based Machine Translation
 TM: Translation Memory
 SMT: Statistical Machine Translation

RBMT: Vauquois’ Triangle
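[Diagram: Vauquois' triangle. Analysis climbs the left side from the Source Language and Generation descends the right side to the Target Language. The levels, from base to apex, are Direct translation, Syntactic Transfer, Semantic Transfer, and Interlingua.]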
RBMT: Rules for Components
Analysis
 Morphological Analysis: Source Morphological Rules
 Syntactic Analysis (Parsing): Source Grammar
 Semantic Analysis: Source Semantic Rules
Transfer
 Lexical Transfer: Bilingual Lexicon
 Syntactic Transfer: Syntactic Mapping Rules
 Semantic Transfer: Semantic Mapping Rules
Generation
 Semantic Generation: Target Semantic Rules
 Syntactic Generation: Target Grammar
 Morphological Generation: Target Morphological Rules
RBMT: an Example

RBMT
 RBMT uses human-encoded linguistic
rules for translation
 Developing an RBMT system is very
expensive: it needs a great deal of human
labour and takes a long time (years)
RBMT
 RBMT systems can reach good translation
quality in a given domain after years of
development.
 Well-developed RBMT systems tend to
capture large-scale sentence structure
better than SMT systems but handle short
expressions worse.
EBMT
 An EBMT system translates sentences by
analogy with existing translation examples
 EBMT does not need deep analysis of the
source text and may generate high-quality
translations when similar examples are
found
EBMT
 The quality of EBMT increases as we
get more examples.
 A problem of EBMT is the coverage of
the examples, especially for long
sentences.

TM
 A Translation Memory directly outputs an
existing target sentence when a very
similar source sentence is found in the
memory; otherwise it outputs nothing.
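
The TM behaviour described on this slide can be sketched as a fuzzy lookup with a cut-off. This is only an illustration: the similarity measure (token-level `difflib.SequenceMatcher` ratio), the threshold values and the toy English-French memory are assumptions, not taken from the slides.

```python
from difflib import SequenceMatcher

def tm_lookup(source, memory, threshold=0.85):
    """Return the stored target of the most similar source, or None below the threshold."""
    best_score, best_target = 0.0, None
    for tm_source, tm_target in memory:
        # Token-level similarity in [0, 1]; 1.0 means an exact match.
        score = SequenceMatcher(None, source.split(), tm_source.split()).ratio()
        if score > best_score:
            best_score, best_target = score, tm_target
    return best_target if best_score >= threshold else None

# Hypothetical two-entry memory, for illustration only.
memory = [
    ("press the red button", "appuyez sur le bouton rouge"),
    ("open the file menu", "ouvrez le menu fichier"),
]

exact = tm_lookup("press the red button", memory)
fuzzy = tm_lookup("press the green button", memory, threshold=0.7)
miss = tm_lookup("close all windows now", memory)
```

An exact match returns the stored target, a near match is returned only if the threshold is lowered, and an unrelated sentence returns nothing.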

SMT
 SMT builds statistical models to predict the
probability of a target sentence being the
translation of a given source sentence.
 To translate a given source sentence is just
to search for a target sentence with the
highest translation probability.
SMT
 A large number of translation pairs (a parallel
corpus) is needed to estimate the model
parameters.
 To predict the translation, sentence pairs are
broken into smaller translation equivalents
at the word level, the phrase level, or the
syntax-rule level.
Word-based SMT
Source          Target      Probability
Bushi (布什)    Bush        0.7
                President   0.2
                US          0.1
yu (与)         and         0.6
                with        0.4
juxing (举行)   hold        0.7
                had         0.3
le (了)         hold        0.01
                ...         ...
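
As a rough illustration of how such a lexical table is used, the sketch below picks the most probable target word for each source word independently. The probabilities are copied from the table on this slide; the function name is hypothetical, and a real word-based system would also use a language model and a reordering model, which are omitted here.

```python
# Lexical translation table copied from the slide. The "le" row is left out
# because the slide elides most of its entries.
LEXICON = {
    "Bushi": {"Bush": 0.7, "President": 0.2, "US": 0.1},
    "yu": {"and": 0.6, "with": 0.4},
    "juxing": {"hold": 0.7, "had": 0.3},
}

def translate_word_by_word(source_words):
    """Pick the most probable target word for each source word independently."""
    return [max(LEXICON[w], key=LEXICON[w].get) for w in source_words]

result = translate_word_by_word(["Bushi", "yu", "juxing"])
```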

Phrase-based SMT
Source                         Target              Probability
Bushi (布什)                   Bush                0.5
                               president Bush      0.3
                               the US president    0.2
Bushi yu (布什与)              Bush and            0.8
                               the president and   0.2
yu Shalong (与沙龙)            and Shalong         0.6
                               with Shalong        0.4
juxing le huitan (举行了会谈)  hold a meeting      0.7
                               had a meeting       0.3

Hierarchical Phrase-based SMT
Source                         Target            Probability
juxing le huitan (举行了会谈)  hold a meeting    0.6
                               had a meeting     0.3
X huitan (X会谈)               X a meeting       0.8
                               X a talk          0.2
juxing le X (举行了X)          hold a X          0.5
                               had a X           0.5
Bushi yu Shalong (布什与沙龙)  Bush and Sharon   0.8
Bushi X (布什X)                Bush X            0.7
X yu Y (X与Y)                  X and Y           0.9

Syntax-based SMT
Source                                             Target           Probability
VPB(VS(juxing) AS(le) NPB(huitan)) (举行了会谈)    hold a meeting   0.6
                                                   have a meeting   0.3
                                                   have a talk      0.1
VPB(VS(juxing) AS(le) x1:NPB) (举行了x1)           hold a x1        0.5
                                                   have a x1        0.5
VP(PP(P(yu) x1:NPB) x2:VPB) (与 x1 x2)             x2 with x1       0.9
IP(x1:NPB VP(x2:PP x3:VPB))                        x1 x3 x2         0.7

SMT
 SMT is cheap
 SMT systems can be developed in a
short time
 SMT needs a large parallel
corpus

SMT
 SMT gets good quality translations if we
have plenty of in-domain data
 SMT quality drops dramatically for out-of-domain data
 SMT output is fluent in short phrases but
handles large-scale sentence structure poorly
(esp. for distant languages)
Why Hybrid MT?
 Each MT approach has its pros and cons.
 We want to take advantage of different MT
approaches
 We do not want to waste our investments
on existing MT systems
Outline
 Why Hybrid MT?
 An overview of Hybrid MT
 Typical Hybrid MT Approaches
 Conclusion

An overview of Hybrid MT
 Selective MT: loose coupling
 Pipelined MT: medium coupling
 Mixture MT: close coupling

Selective MT
 Given translations generated by
different approaches, Selective MT
tries to select the best one, or to select
the best parts of different translations
and combine them into a new one.
Selective MT
[Diagram: the Source is translated by MT1, MT2 and MT3 in parallel; a Select step chooses among their outputs to produce the Target.]
Selective MT
 Typical Selective MT:
 System Recommendation
 System Combination
 Sentence-level combination
 word-level combination

Pipelined MT
 Pipelined MT adopts one approach as
the main approach and uses another
approach for monolingual pre-processing or post-processing.

Pipelined MT

Pre-Processing → Main Approach → Post-Processing
Pipelined MT
 Typical Pipelined MT:
 Statistical Post-Editing for RBMT
 Rule-based Pre-reordering for SMT

Mixture MT
 Mixture MT adopts one approach as
the main approach but utilizes one or
more different approaches in some
components.

Mixture MT
 Typical Mixture MT:
 Statistical Parsing in RBMT
 Rule-based Named Entity Translation
in SMT
  Human-Acquired Rules in SMT
 SMT Decoding with TM Phrases
Outline
 Why Hybrid MT?
 An overview of Hybrid MT
 Typical Hybrid MT Approaches
 Conclusion

Typical Hybrid MT Approaches
 Selective MT
 System Recommendation

System Combination
 Pipelined MT
 Mixture MT
System Recommendation
 Yifan He, Yanjun Ma, Josef van Genabith and Andy
Way, Bridging SMT and TM with System
Recommendation, Proceedings of the 48th Annual
Meeting of the Association for Computational
Linguistics (ACL2010), pages 622–630, Uppsala,
Sweden, 11-16 July 2010.
System Recommendation
 Intuition:
 When the translation memory is large enough,
the output of an SMT system trained on it is
comparable with TM output in translation quality,
so we must choose between the two.
 System recommendation recommends SMT
outputs to a TM user when it predicts that SMT
outputs are more suitable for post-editing than
the hits provided by the TM
System Recommendation
[Diagram: a Parallel Corpus feeds both the TM and the SMT system; System Recommendation chooses between their outputs.]
System Recommendation
 An SVM binary classifier is adopted
 The classifier is trained on human-annotated data
 A confidence score is given for the
recommendation

System Recommendation
 SMT System Features: features used in the SMT system
 TM Feature: Fuzzy Match Cost
 System Independent Features:
 Source-Side Language Model Score and Perplexity
 Target-Side Language Model Perplexity
 The Pseudo-Source Fuzzy Match Score
 The IBM Model 1 Score.
System Recommendation
 Evaluation Metrics:

Where A is the set of recommended MT
outputs, and B is the set of MT outputs that
have lower TER than TM hits.
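
The formula on this slide did not survive extraction. Assuming it is the usual precision and recall over the two sets defined above (precision = |A ∩ B| / |A|, recall = |A ∩ B| / |B|), the computation can be sketched as follows. The function name and the sample sentence IDs are hypothetical.

```python
def precision_recall(recommended, better_than_tm):
    """Precision and recall of recommendations.

    recommended     -- A: IDs of sentences whose MT output was recommended
    better_than_tm  -- B: IDs of sentences whose MT output has lower TER than the TM hit
    """
    a, b = set(recommended), set(better_than_tm)
    overlap = len(a & b)
    precision = overlap / len(a) if a else 0.0
    recall = overlap / len(b) if b else 0.0
    return precision, recall

# Hypothetical sentence IDs, for illustration only.
p, r = precision_recall({1, 2, 3, 4}, {2, 3, 5})
```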
Typical Hybrid MT Approaches
 Selective MT
System Recommendation
 System Combination

 Pipelined MT
 Mixture MT
System Combination
 Rosti, A. V. I., Ayan, N. F., Xiang, B., Matsoukas,
S., Schwartz, R. M., & Dorr, B. J. (2007, April).
Combining Outputs from Multiple Machine
Translation Systems. In HLT-NAACL (pp. 228-235).

System Combination
 Rosti, A. V. I., Matsoukas, S., & Schwartz, R. (2007,
June). Improved word-level system combination for
machine translation. In Annual Meeting of the Association for Computational
Linguistics (Vol. 45, No. 1, p. 312).

System Combination
 He, X., Yang, M., Gao, J., Nguyen, P., & Moore, R.
2008. Indirect-HMM-based hypothesis alignment for
combining outputs from machine translation systems.
In Proceedings of the Conference on Empirical
Methods in Natural Language Processing (pp. 98-107).
Association for Computational Linguistics.

System Combination
 Feng, Y., Liu, Y., Mi, H., Liu, Q., & Lü, Y. 2009. Lattice-based system combination for statistical machine
translation. In Proceedings of the 2009 Conference on
Empirical Methods in Natural Language Processing:
Volume 3 (pp. 1105-1113). Association for
Computational Linguistics.

Sentence-Level
System Combination
 Kumar, S., & Byrne, W. J. (2004, May).
Minimum Bayes-Risk Decoding for
Statistical Machine Translation. In
HLT-NAACL (pp. 169-176).

Sentence-Level
System Combination
 Suppose we have several MT systems
 For a given source text F, each MT system
outputs an n-best list of target texts
 If possible, each MT system gives each target
text a probability P(E|F); otherwise we may
treat the n-best target texts as equally
probable.
Sentence-Level
System Combination
 Minimum Bayes-Risk (MBR):
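
The formula on this slide did not survive extraction. In general terms, MBR selection picks the candidate E' that minimises the expected loss, sum over E of L(E, E') · P(E|F), over the pooled n-best lists. The sketch below is a minimal illustration under stated assumptions: the loss is a stand-in (1 minus Jaccard overlap of token sets) rather than the translation-metric-based loss a real system would use, and the candidates and probabilities are hypothetical.

```python
def loss(hyp, ref):
    """Stand-in loss: 1 minus Jaccard overlap of token sets (0 for identical sets)."""
    h, r = set(hyp.split()), set(ref.split())
    return 1.0 - len(h & r) / len(h | r)

def mbr_select(candidates):
    """candidates: list of (sentence, probability). Return the minimum-risk sentence."""
    def risk(hyp):
        # Expected loss of hyp, treating every candidate in turn as the reference.
        return sum(p * loss(hyp, ref) for ref, p in candidates)
    return min((sent for sent, _ in candidates), key=risk)

# Hypothetical pooled n-best list with probabilities P(E|F).
pooled = [
    ("please show me on the map", 0.4),
    ("please show me in the map", 0.3),
    ("show it on a map please", 0.3),
]

best = mbr_select(pooled)
```

When the systems give no probabilities, the same function can be called with equal weights for every candidate.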

Word-Level
System Combination
 Select a translation candidate as a skeleton
(backbone) by Minimum Bayes Risk
 Construct a confusion network by aligning
all the words in the other translation candidates
to the words in the skeleton
 Select the best path through the confusion
network and generate a new translation
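
The last step above can be sketched as slot-by-slot voting. This is a simplified illustration: it assumes the hypotheses have already been aligned to the skeleton, so that every hypothesis has exactly one entry per slot (an empty string where it has no word), and it uses plain weighted voting rather than the tuned features of a real system. The sentences are hypothetical.

```python
from collections import Counter

EPS = ""  # empty arc: the hypothesis has no word in this slot

def best_path(aligned_hypotheses, weights=None):
    """Vote slot by slot over hypotheses already aligned to the skeleton."""
    weights = weights or [1.0] * len(aligned_hypotheses)
    output = []
    for slot in zip(*aligned_hypotheses):
        votes = Counter()
        for word, weight in zip(slot, weights):
            votes[word] += weight
        winner = max(votes, key=votes.get)
        if winner != EPS:  # an empty winner contributes no word
            output.append(winner)
    return " ".join(output)

# Hypothetical hypotheses, already aligned; the first row is the skeleton.
aligned = [
    ["please", "show", "me", "on", "the", "map"],
    ["please", "show", "me", "in", "the", "map"],
    ["",       "show", "it", "on", "a",   "map"],
    ["please", "tell", "me", "on", "the", "map"],
]

combined = best_path(aligned)
```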
Translation Candidate
Skeleton

Word Alignment
against the Skeleton

Skeleton

Confusion Network

Final output:
Please show me on the map.
Word-Level
System Combination
 System combination has proved to be very
effective
 In the NIST Open MT Evaluation Chinese-English task, MSR-NRC-SRI ranked no. 1
by using system combination technologies
 In later NIST evaluations, separate tracks
were defined for participants using and not using
system combination technologies.
Typical Hybrid MT Approaches
 Selective MT
 Pipelined MT
Statistical Post-Editing for RBMT
Rule-based Pre-reordering for SMT
 Mixture MT
Statistical Post-Editing for RBMT

 Dugast, L., Senellart, J., & Koehn, P. (2007, June).
Statistical post-editing on SYSTRAN's rule-based
translation system. In Proceedings of the Second
Workshop on Statistical Machine Translation (pp.
220-223). Association for Computational
Linguistics.

Statistical Post-Editing for RBMT

 Simard, M., Ueffing, N., Isabelle, P., & Kuhn, R.
(2007). Rule-based Translation With Statistical
Phrase-based Post-editing. Second Workshop on
Statistical Machine Translation. Prague, Czech
Republic. June 23, 2007. pp. 203–206.

Statistical Post-Editing
 When we have:
 A very good RBMT system
 A large parallel corpus which can be
used for SMT training

 Both RBMT and SMT have advantages and
disadvantages
 Can we benefit from both methods?
Statistical Post-Editing
A Statistical Post-Editing (SPE) system is a
monolingual SMT system which takes the result of an
RBMT system as input and generates an improved
target output.
Source Text → RBMT → RBMT Result → SPE → SPE Result

Statistical Post Edit: Training

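[Diagram: the Source side of the parallel corpus is translated by the RBMT system; the RBMT output (RBMT Target) is paired with the reference Target for SPE Training, which produces the SPE system.]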

Statistical Post Edit: Training
 RBMT usually generates a better word
order while SMT can make better
lexical selection.
 RBMT+SPE outperforms the original
RBMT and SMT systems.

Typical Hybrid MT Approaches
 Selective MT
 Pipelined MT
Statistical Post-Editing for RBMT
Rule-based Pre-reordering for SMT
 Mixture MT
Rule-based Pre-reordering for SMT
 Elia Yuste, Manuel Herranz, Alexandra Helle and
Hirokazu Suzuki, Go Hybrid: Pangeanic's and Toshiba's
First Steps Towards ENJP MT Hybridization, AAMT
Journal, No.50, December 2011 (Part B for this tutorial)

Rule-based Pre-reordering for SMT
 Xia, F., & McCord, M. (2004, August). Improving a
statistical MT system with automatically learned rewrite
patterns. In Proceedings of the 20th international
conference on Computational Linguistics (p. 508).
Association for Computational Linguistics.

Rule-based Pre-reordering for SMT

 A phrase-based SMT (PBSMT) system
makes good lexical choices but is not
good at long-distance reordering without
linguistic knowledge
 Rule-based word reordering is applied to the
source side to bring the word order of
the source text much closer to the
word order of the target side.
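
A toy illustration of source-side pre-reordering: a single hand-written rule that moves verbs to the end of a clause, as one might do before translating English (subject-verb-object) into a verb-final language. The rule, the tag set and the example sentence are assumptions made for this sketch; the systems cited here apply richer rules over parse trees.

```python
def preorder_svo_to_sov(tagged):
    """Move every verb to the end of the clause.

    tagged -- list of (word, part_of_speech) pairs for one clause
    """
    verbs = [tok for tok in tagged if tok[1] == "VERB"]
    others = [tok for tok in tagged if tok[1] != "VERB"]
    return [word for word, _ in others + verbs]

# Hypothetical POS-tagged clause.
clause = [("John", "NOUN"), ("reads", "VERB"), ("the", "DET"), ("book", "NOUN")]
reordered = preorder_svo_to_sov(clause)
```

The reordered source is then passed to the PBSMT system, which has less long-distance reordering left to do.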
Rule-based Pre-reordering for SMT

Source Text → Pre-reordering → Reordered Source Text → PBSMT → Target Text

PBSMT: Training

[Diagram: the Source side of the parallel corpus is pre-reordered; the Reordered Source is paired with the Target for PBSMT Training, which produces the PBSMT system.]

Pre-reordering: Training
 The rules for pre-reordering can be
automatically acquired from the parallel
corpus with automatic word alignment
and parse trees on both sides.

Pre-reordering: Training
 Parse the source sentence
 Parse the target sentence
 Align the words and the phrases on
both sides
 Extract the rewrite rules
Parsing Trees and Alignments

Rule Extraction

Rule Organization and Filtering

Applying Rewrite Rules

Typical Hybrid MT Approaches
 Selective MT
 Pipelined MT

 Mixture MT
 Statistical Parsing in RBMT
 Rule-based Named Entity Translation in SMT
 Human-Acquired Rules in SMT
 SMT Decoding with TM Phrases
Statistical Parsing in RBMT
 Statistical parsing outperforms rule-based parsing if we have a large-scale
treebank.
 It is reasonable to use a statistical
algorithm in the parsing component of
an RBMT system.
Rule-based Named Entity Translation
in SMT
 Ney, H. (2013). Statistical MT Systems Revisited:
How much Hybridity do they have? Proceedings of
the Second Workshop on Hybrid Approaches to
Translation, page 7, Sofia, Bulgaria, August 8,
2013.

Numerical Expression Translation

English:
3,501,749 → 3 million 501 thousand and 749
Chinese:
3501749 → 350,1749 → 350 wan 1749
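
The regrouping on this slide (thousands-based in English, ten-thousands-based in Chinese) is the kind of deterministic conversion that suits a rule. A minimal sketch, assuming non-negative integers below 10**8 and ignoring details such as the placeholder zero that Chinese inserts when the remainder has leading zeros; the function name is hypothetical.

```python
def to_wan_grouping(n):
    """Render a non-negative integer below 10**8 using the unit wan (10,000)."""
    if not 0 <= n < 10**8:
        raise ValueError("this sketch only handles 0 <= n < 10**8")
    wan, rest = divmod(n, 10_000)
    if wan == 0:
        return str(rest)
    if rest == 0:
        return f"{wan} wan"
    return f"{wan} wan {rest}"

slide_example = to_wan_grouping(3501749)
```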

Human-Acquired Rules in SMT
 Li, X., Lü, Y., Meng, Y., Liu, Q., & Yu, H.
Feedback Selecting of Manually Acquired
Rules Using Automatic Evaluation.
Proceedings of the 4th Workshop on Patent
Translation, pages 52-59, MT Summit XIII,
Xiamen, China, September 2011

Human-Acquired Rules in SMT

These rules are used in the decoding process
together with the Hierarchical Phrases in an
SMT system
SMT Decoding with TM Phrases

 Philipp Koehn and Jean Senellart. 2010. Convergence of
translation memory and statistical machine translation. In
AMTA Workshop on MT Research and the Translation
Industry, pages 21–31.
 Wang, K., Zong, C., & Su, K. Y. Integrating Translation
Memory into Phrase-Based Machine Translation during
Decoding. Proceedings of the 51st Annual Meeting of the
Association for Computational Linguistics, pages 11–21,
Sofia, Bulgaria, August 4-9 2013
SMT Decoding with TM Phrases
 Yanjun Ma, Yifan He, Andy Way and Josef van Genabith.
2011. Consistent translation using discriminative learning: a
translation memory-inspired approach. In Proceedings of the
49th Annual Meeting of the Association for Computational
Linguistics, pages 1239–1248, Portland, Oregon.
 Yifan He, Yanjun Ma, Andy Way and Josef van Genabith.
2011. Rich linguistic features for translation memory-inspired
consistent translation. In Proceedings of the Thirteenth
Machine Translation Summit, pages 456–463.
SMT Decoding with TM Phrases
 Extract TM phrases from similar
sentences in the translation memory
and use them in the decoding process
at runtime.

Outline
 Why Hybrid MT?
 An overview of Hybrid MT
 Typical Hybrid MT Approaches
 Conclusion

Conclusion
 Different MT approaches have advantages and
disadvantages, which are usually complementary.
 Hybrid MT can benefit from different MT
approaches
 Three categories of Hybrid MT were introduced:
Selective, Pipelined and Mixture.
 In fact, almost all real MT systems are hybrid
systems.
Thank you!
Q&A

Winter School 2013, Birmingham

More Related Content

What's hot

Machine translation
Machine translationMachine translation
Machine translation
mohamed hassan
 
Context-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick ViewContext-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick View
YONG ZHENG
 
What is machine translation
What is machine translationWhat is machine translation
What is machine translation
Stephen Peacock
 
Natural language processing
Natural language processing Natural language processing
Natural language processing
Md.Sumon Sarder
 
Machine Tanslation
Machine TanslationMachine Tanslation
Machine Tanslation
Mahsa Mohaghegh
 
Introduction to computer graphics
Introduction to computer graphicsIntroduction to computer graphics
Introduction to computer graphics
Partnered Health
 
Web usage mining
Web usage miningWeb usage mining
Web usage mining
Monu Chaudhary
 
Temporal based Recommendation System
Temporal based Recommendation SystemTemporal based Recommendation System
Temporal based Recommendation System
Nurfadhlina Mohd Sharef
 
Natural lanaguage processing
Natural lanaguage processingNatural lanaguage processing
Natural lanaguage processing
gulshan kumar
 
typography
 typography typography
typography
Rahul Gupta
 
Fundamental of translation
Fundamental of translationFundamental of translation
Fundamental of translation
Ajoy Singh
 
Text classification
Text classificationText classification
Text classification
James Wong
 
A tutorial on Machine Translation
A tutorial on Machine TranslationA tutorial on Machine Translation
A tutorial on Machine Translation
Jaganadh Gopinadhan
 
Graphic Design Workshop, 2018
Graphic Design Workshop, 2018Graphic Design Workshop, 2018
Graphic Design Workshop, 2018
Clearly Blue Digital Pvt. Ltd.
 
Natural Language Processing in AI
Natural Language Processing in AINatural Language Processing in AI
Natural Language Processing in AI
Saurav Shrestha
 
Multimedia System & Design Ch 4 Audio
Multimedia System & Design Ch 4 AudioMultimedia System & Design Ch 4 Audio
Multimedia System & Design Ch 4 Audio
Badar Waseer
 
Features of translation 2 (1)
Features of translation 2 (1)Features of translation 2 (1)
Features of translation 2 (1)
Arie Listiani
 
Online Advertisements and the AdWords Problem
Online Advertisements and the AdWords ProblemOnline Advertisements and the AdWords Problem
Online Advertisements and the AdWords Problem
Rajesh Piryani
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
saurabhnarhe
 
Historical background of Interpreting
Historical background of Interpreting Historical background of Interpreting
Historical background of Interpreting
hanie_dirgantara
 

What's hot (20)

Machine translation
Machine translationMachine translation
Machine translation
 
Context-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick ViewContext-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick View
 
What is machine translation
What is machine translationWhat is machine translation
What is machine translation
 
Natural language processing
Natural language processing Natural language processing
Natural language processing
 
Machine Tanslation
Machine TanslationMachine Tanslation
Machine Tanslation
 
Introduction to computer graphics
Introduction to computer graphicsIntroduction to computer graphics
Introduction to computer graphics
 
Web usage mining
Web usage miningWeb usage mining
Web usage mining
 
Temporal based Recommendation System
Temporal based Recommendation SystemTemporal based Recommendation System
Temporal based Recommendation System
 
Natural lanaguage processing
Natural lanaguage processingNatural lanaguage processing
Natural lanaguage processing
 
typography
 typography typography
typography
 
Fundamental of translation
Fundamental of translationFundamental of translation
Fundamental of translation
 
Text classification
Text classificationText classification
Text classification
 
A tutorial on Machine Translation
A tutorial on Machine TranslationA tutorial on Machine Translation
A tutorial on Machine Translation
 
Graphic Design Workshop, 2018
Graphic Design Workshop, 2018Graphic Design Workshop, 2018
Graphic Design Workshop, 2018
 
Natural Language Processing in AI
Natural Language Processing in AINatural Language Processing in AI
Natural Language Processing in AI
 
Multimedia System & Design Ch 4 Audio
Multimedia System & Design Ch 4 AudioMultimedia System & Design Ch 4 Audio
Multimedia System & Design Ch 4 Audio
 
Features of translation 2 (1)
Features of translation 2 (1)Features of translation 2 (1)
Features of translation 2 (1)
 
Online Advertisements and the AdWords Problem
Online Advertisements and the AdWords ProblemOnline Advertisements and the AdWords Problem
Online Advertisements and the AdWords Problem
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Historical background of Interpreting
Historical background of Interpreting Historical background of Interpreting
Historical background of Interpreting
 

Viewers also liked

17. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 217. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 2
RIILP
 
1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions
RIILP
 
9. Ethics - Juan Jose Arevalillo Doval (Hermes)
9. Ethics - Juan Jose Arevalillo Doval (Hermes)9. Ethics - Juan Jose Arevalillo Doval (Hermes)
9. Ethics - Juan Jose Arevalillo Doval (Hermes)
RIILP
 
18. Alessandro Cattelan (Translated) Terminology
18. Alessandro Cattelan (Translated) Terminology18. Alessandro Cattelan (Translated) Terminology
18. Alessandro Cattelan (Translated) Terminology
RIILP
 
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
RIILP
 
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
RIILP
 
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
RIILP
 
14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation
RIILP
 
3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT Introduction3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT Introduction
RIILP
 
2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction
RIILP
 
7. Trevor Cohn (usfd) Statistical Machine Translation
7. Trevor Cohn (usfd) Statistical Machine Translation7. Trevor Cohn (usfd) Statistical Machine Translation
7. Trevor Cohn (usfd) Statistical Machine Translation
RIILP
 
16. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 116. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 1
RIILP
 
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
RIILP
 
10. Lucia Specia (USFD) Evaluation of Machine Translation
10. Lucia Specia (USFD) Evaluation of Machine Translation10. Lucia Specia (USFD) Evaluation of Machine Translation
10. Lucia Specia (USFD) Evaluation of Machine Translation
RIILP
 
6. Khalil Sima'an (UVA) Statistical Machine Translation
6. Khalil Sima'an (UVA) Statistical Machine Translation6. Khalil Sima'an (UVA) Statistical Machine Translation
6. Khalil Sima'an (UVA) Statistical Machine Translation
RIILP
 
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
RIILP
 
13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation
RIILP
 

Viewers also liked (17)

17. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 217. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 2
 
1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions
 
9. Ethics - Juan Jose Arevalillo Doval (Hermes)
9. Ethics - Juan Jose Arevalillo Doval (Hermes)9. Ethics - Juan Jose Arevalillo Doval (Hermes)
9. Ethics - Juan Jose Arevalillo Doval (Hermes)
 
18. Alessandro Cattelan (Translated) Terminology
18. Alessandro Cattelan (Translated) Terminology18. Alessandro Cattelan (Translated) Terminology
18. Alessandro Cattelan (Translated) Terminology
 
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
 
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
11. manuel leiva & juanjo arevalillo (hermes) evaluation of machine translation
 
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
 
14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation
 
3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT Introduction3. Natalia Konstantinova (UoW) EXPERT Introduction
8. Qun Liu (DCU) Hybrid Solutions for Translation

  • 1. Hybrid Solutions for Translation: Going hybrid Qun Liu (DCU) Dr. Manuel Herranz (Pangeanic) 12 November 2013, Birmingham, UK
  • 2. PART A Qun Liu (DCU) qliu@computing.dcu.ie
  • 3. Outline
      - Why Hybrid MT?
      - An overview of Hybrid MT
      - Typical Hybrid MT Approaches
      - Conclusion
      Winter School 2013, Birmingham
  • 4. MT Approaches
      - RBMT: Rule-based Machine Translation
      - EBMT: Example-based Machine Translation
      - TM: Translation Memory
      - SMT: Statistical Machine Translation
  • 5. RBMT: Vauquois’ Triangle
      - Levels, from base to apex: Direct, Syntactic Transfer, Semantic Transfer, Interlingua
      - Source Language side: Analysis
      - Target Language side: Generation
  • 6. RBMT: Rules for Components
      - Analysis: Morphological Analysis (Source Morphological Rules); Syntactic Analysis (Parsing) (Source Grammar); Semantic Analysis (Source Semantic Rules)
      - Transfer: Lexical Transfer (Bilingual Lexicon); Syntactic Transfer (Syntactic Mapping Rules); Semantic Transfer (Semantic Mapping Rules)
      - Generation: Semantic Generation (Target Semantic Rules); Syntactic Generation (Target Grammar); Morphological Generation (Target Morphological Rules)
  • 7. RBMT: an Example
  • 8. RBMT: an Example
  • 9. RBMT: an Example
  • 10. RBMT: an Example
  • 11. RBMT: an Example
  • 12. RBMT
      - RBMT uses human-encoded linguistic rules for translation.
      - Developing an RBMT system is very expensive: it needs a great deal of human labour and takes a long time (years).
  • 13. RBMT
      - RBMT systems can reach good translation quality in a given domain after years of development.
      - Well-developed RBMT systems tend to capture large-scale sentence structure better than SMT systems, but perform worse on short expressions.
  • 14. EBMT
      - An EBMT system translates sentences by analogy with existing translation examples.
      - EBMT does not need deep analysis of the source text and may generate high-quality translations when similar examples are found.
  • 16. EBMT
      - The quality of EBMT increases as more examples become available.
      - A problem for EBMT is the coverage of the examples, especially for long sentences.
  • 17. TM
      - A Translation Memory directly outputs the existing target sentence when a very similar source sentence is found in the memory; otherwise it outputs nothing.
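The translation memory behaviour described on this slide can be sketched in a few lines. This is a minimal illustration, not any particular TM product: the similarity measure (token-level `difflib.SequenceMatcher` ratio), the `threshold` value, and the English-French memory entries are all assumptions made for the example.

```python
from difflib import SequenceMatcher

def tm_lookup(source, memory, threshold=0.8):
    """Return the stored target of the most similar source sentence, or None.

    `memory` is a list of (source, target) pairs. The threshold is a
    hypothetical cut-off for "very similar"; real TM tools define their own.
    """
    src_tokens = source.split()
    best_score, best_target = 0.0, None
    for tm_source, tm_target in memory:
        # Token-level similarity in [0, 1]; 1.0 means an exact match.
        score = SequenceMatcher(None, src_tokens, tm_source.split()).ratio()
        if score > best_score:
            best_score, best_target = score, tm_target
    # Below the threshold the TM stays silent instead of guessing.
    return best_target if best_score >= threshold else None

# Hypothetical memory contents, for illustration only.
memory = [
    ("press the red button", "appuyez sur le bouton rouge"),
    ("close the door", "fermez la porte"),
]

print(tm_lookup("press the red button", memory))
print(tm_lookup("open the window", memory))
```

An exact match returns the stored target unchanged, while an unrelated sentence returns `None`, which mirrors the "or it outputs nothing" behaviour.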
  • 18. SMT
      - SMT builds statistical models to predict the probability that a target sentence is the translation of a given source sentence.
      - Translating a given source sentence then amounts to searching for the target sentence with the highest translation probability.
  • 19. SMT
      - A large number of translation pairs (a parallel corpus) is needed to estimate the model parameters.
      - To predict a translation, sentence pairs are broken into smaller translation equivalences at the word level, the phrase level, or the syntax-rule level.
  • 20. Word-based SMT
  • 22. Phrase-based SMT
  • 23. Phrase-based SMT
      Source | Target | Probability
      Bushi (布什) | Bush | 0.5
      Bushi (布什) | president Bush | 0.3
      Bushi (布什) | the US president | 0.2
      Bushi yu (布什与) | Bush and | 0.8
      Bushi yu (布什与) | the president and | 0.2
      yu Shalong (与沙龙) | and Shalong | 0.6
      yu Shalong (与沙龙) | with Shalong | 0.4
      juxing le huitan (举行了会谈) | hold a meeting | 0.7
      juxing le huitan (举行了会谈) | had a meeting | 0.3
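To make the phrase table concrete, the sketch below searches for the highest-probability way to cover a source sentence with table entries. It is a deliberately simplified assumption: translation is monotone (no reordering), there is no language model, and the score is just the product of phrase probabilities, so it is not a full phrase-based decoder. The function name is hypothetical, and the pinyin for 会谈 is written as `huitan`.

```python
from functools import lru_cache

# Phrase table entries taken from the slide: source phrase -> [(target, probability)].
PHRASE_TABLE = {
    ("Bushi",): [("Bush", 0.5), ("president Bush", 0.3), ("the US president", 0.2)],
    ("Bushi", "yu"): [("Bush and", 0.8), ("the president and", 0.2)],
    ("yu", "Shalong"): [("and Shalong", 0.6), ("with Shalong", 0.4)],
    ("juxing", "le", "huitan"): [("hold a meeting", 0.7), ("had a meeting", 0.3)],
}

def best_monotone_translation(words):
    """Return (translation, score) for the best left-to-right segmentation.

    Returns (None, 0.0) when the table cannot cover the whole sentence.
    """
    words = tuple(words)

    @lru_cache(maxsize=None)
    def best_from(i):
        # Best (score, target phrases) for the suffix words[i:].
        if i == len(words):
            return 1.0, ()
        best = (0.0, None)
        for j in range(i + 1, len(words) + 1):
            options = PHRASE_TABLE.get(words[i:j])
            if not options:
                continue
            # Without a language model, the most probable target is always best.
            target, prob = max(options, key=lambda o: o[1])
            rest_score, rest = best_from(j)
            if rest is None:
                continue  # the remainder of the sentence cannot be covered
            score = prob * rest_score
            if score > best[0]:
                best = (score, (target,) + rest)
        return best

    score, phrases = best_from(0)
    if phrases is None:
        return None, 0.0
    return " ".join(phrases), score

translation, score = best_monotone_translation("Bushi yu Shalong juxing le huitan".split())
print(translation, round(score, 2))
```

With this table, the only complete cover is `Bushi` + `yu Shalong` + `juxing le huitan`, because the segmentation starting with `Bushi yu` leaves `Shalong` without an entry.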
  • 24. Hierarchical Phrase-based SMT
  • 25. Hierarchical Phrase-based SMT
      Source | Target | Probability
      juxing le huitan (举行了会谈) | hold a meeting | 0.6
      juxing le huitan (举行了会谈) | had a meeting | 0.3
      X huitan (X会谈) | X a meeting | 0.8
      X huitan (X会谈) | X a talk | 0.2
      juxing le X (举行了X) | hold a X | 0.5
      juxing le X (举行了X) | had a X | 0.5
      Bushi yu Shalong (布什与沙龙) | Bush and Sharon | 0.8
      Bushi X (布什X) | Bush X | 0.7
      X yu Y (X与Y) | X and Y | 0.9
  • 26. Syntax-based SMT
  • 27. Syntax-based SMT
      Source | Target | Probability
      VPB(VS(juxing) AS(le) NPB(huitan)) (举行了会谈) | hold a meeting | 0.6
      VPB(VS(juxing) AS(le) NPB(huitan)) (举行了会谈) | have a meeting | 0.3
      VPB(VS(juxing) AS(le) NPB(huitan)) (举行了会谈) | have a talk | 0.1
      VPB(VS(juxing) AS(le) x1:NPB) (举行了x1) | hold a x1 | 0.5
      VPB(VS(juxing) AS(le) x1:NPB) (举行了x1) | have a x1 | 0.5
      VP(PP(P(yu) x1:NPB) x2:VPB) (与 x1 x2) | x2 with x1 | 0.9
      IP(x1:NPB VP(x2:PP x3:VPB)) | x1 x3 x2 | 0.7
  • 28. SMT
      - SMT is cheap.
      - SMT systems can be developed in a short time.
      - SMT needs a large parallel corpus.
  • 29. SMT
      - SMT produces good-quality translations when plenty of in-domain data is available.
      - SMT quality drops dramatically on out-of-domain data.
      - SMT output is fluent within short phrases but handles large-scale sentence structure poorly (especially for distant language pairs).
  • 30. Why Hybrid MT?
      - Each MT approach has its pros and cons.
      - We want to take advantage of the different MT approaches.
      - We do not want to waste our investments in existing MT systems.
  • 31. Outline
      - Why Hybrid MT?
      - An overview of Hybrid MT
      - Typical Hybrid MT Approaches
      - Conclusion
  • 32. An overview of Hybrid MT
      - Selective MT: loose coupling
      - Pipelined MT: medium coupling
      - Mixture MT: close coupling
  • 33. Selective MT
      - Given translations generated by different approaches, Selective MT tries to select the best one, or to select the best parts of different translations and combine them into a new one.
  • 36. Selective MT
      - Typical Selective MT:
          - System Recommendation
          - System Combination
              - Sentence-level combination
              - Word-level combination
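As a rough illustration of sentence-level selection, the sketch below picks, from the outputs of several systems, the hypothesis that agrees most with the others. This consensus heuristic is an assumption chosen for brevity; it is not the selection method of any system cited in these slides, and the function name and example hypotheses are hypothetical.

```python
from difflib import SequenceMatcher

def select_by_consensus(hypotheses):
    """Return the hypothesis with the highest average similarity to the others.

    Expects at least two hypotheses (one per MT system).
    """
    def similarity(a, b):
        # Token-level similarity in [0, 1].
        return SequenceMatcher(None, a.split(), b.split()).ratio()

    def average_agreement(i):
        others = [h for j, h in enumerate(hypotheses) if j != i]
        return sum(similarity(hypotheses[i], o) for o in others) / len(others)

    best_index = max(range(len(hypotheses)), key=average_agreement)
    return hypotheses[best_index]

# Hypothetical outputs from three different systems.
hypotheses = [
    "Bush held a meeting with Sharon",
    "Bush had a meeting with Sharon",
    "Bush and Sharon meeting hold",
]
print(select_by_consensus(hypotheses))
```

A real selector would normally score hypotheses with trained features rather than raw string overlap; word-level combination goes further and merges parts of several hypotheses instead of choosing a whole sentence.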
  • 37. Pipelined MT
      - Pipelined MT adopts one approach as the main approach and uses another approach for monolingual pre-processing or post-processing.
  • 39. Pipelined MT
      - Typical Pipelined MT:
          - Statistical Post-Editing for RBMT
          - Rule-based Pre-reordering for SMT
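The pipelined arrangement can be pictured as simple function composition. The sketch below shows rule-based pre-reordering feeding a main translation step. Everything in it is a toy assumption: the single reordering rule, the verb list, the lexicon, and the word-by-word translator that stands in for a real SMT system.

```python
# Hypothetical resources for a toy verb-final source language.
VERBS = {"tabeta"}
LEXICON = {"kare": "he", "ringo": "apples", "tabeta": "ate"}

def pre_reorder(tokens):
    """Rule-based step: move a sentence-final verb to second position (SOV to SVO)."""
    if len(tokens) > 2 and tokens[-1] in VERBS:
        return [tokens[0], tokens[-1]] + tokens[1:-1]
    return list(tokens)

def monotone_translate(tokens):
    """Stand-in for the main SMT system: word-by-word, no reordering of its own."""
    return [LEXICON.get(token, token) for token in tokens]

def pipelined_translate(sentence):
    """Run the monolingual pre-processing step, then the main translation step."""
    return " ".join(monotone_translate(pre_reorder(sentence.split())))

print(pipelined_translate("kare ringo tabeta"))
```

The point of the design is that the main system sees source text already arranged in target-like order, so it only needs to handle local choices. Statistical post-editing has the same shape, with the statistical step placed after the rule-based system.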
  • 40. Mixture MT
      - Mixture MT adopts one approach as the main approach but utilizes one or more different approaches in some components.
  • 41. Mixture MT
  • 42. Mixture MT
      - Typical Mixture MT:
          - Statistical Parsing in RBMT
          - Rule-based Named Entity Translation in SMT
          - Human-Encoded Rules in SMT
          - SMT Decoding with TM Phrases
  • 43. Outline
      - Why Hybrid MT?
      - An overview of Hybrid MT
      - Typical Hybrid MT Approaches
      - Conclusion
  • 44. Typical Hybrid MT Approaches
      - Selective MT
          - System Recommendation
          - System Combination
      - Pipelined MT
      - Mixture MT
  • 45. System Recommendation
      - Yifan He, Yanjun Ma, Josef van Genabith and Andy Way, Bridging SMT and TM with System Recommendation, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), pages 622–630, Uppsala, Sweden, 11-16 July 2010.
  • 46. System Recommendation
      - Intuition:
          - In some cases, when the translation memory is large enough, the trained SMT system is comparable to the TM in translation quality, which raises the problem of selecting between the two.
          - System recommendation recommends SMT outputs to a TM user when it predicts that the SMT outputs are more suitable for post-editing than the hits provided by the TM.
  • 48. System Recommendation
      - An SVM binary classifier is adopted.
      - The classifier is trained on human-annotated data.
      - A confidence score is given for the recommendation.
  • 49. System Recommendation
      - SMT System Features: features used in the SMT system
      - TM Feature: Fuzzy Match Cost
      - System-Independent Features:
          - Source-Side Language Model Score and Perplexity
          - Target-Side Language Model Perplexity
          - The Pseudo-Source Fuzzy Match Score
          - The IBM Model 1 Score
  • 50. System Recommendation
      - Evaluation Metrics (the formula is not preserved in this transcript), where A is the set of recommended MT outputs and B is the set of MT outputs that have a lower TER than the TM hits.
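The metric formula itself is missing from this transcript, but the definitions of A and B fit set-based precision and recall. The following is a minimal sketch under that assumption: precision = |A ∩ B| / |A| and recall = |A ∩ B| / |B|. These formulas, the function name, and the segment ids are illustrative assumptions, not taken from the slide.

```python
def recommendation_scores(recommended, better_than_tm):
    """Precision, recall and F1 over sets of segment ids.

    recommended    -- A: segments for which the MT output was recommended
    better_than_tm -- B: segments whose MT output has lower TER than the TM hit
    (Assumed metric definitions; the slide's own formula is not available.)
    """
    a, b = set(recommended), set(better_than_tm)
    hits = len(a & b)
    precision = hits / len(a) if a else 0.0
    recall = hits / len(b) if b else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical segment ids, for illustration only.
precision, recall, f1 = recommendation_scores({1, 2, 3, 4}, {2, 3, 5})
```

High precision matters most in this setting, since a wrong recommendation makes the TM user post-edit a worse starting point.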
  • 53. Typical Hybrid MT Approaches
      - Selective MT
          - System Recommendation
          - System Combination
      - Pipelined MT
      - Mixture MT
  • 54. System Combination
      - Rosti, A. V. I., Ayan, N. F., Xiang, B., Matsoukas, S., Schwartz, R. M., & Dorr, B. J. (2007, April). Combining Outputs from Multiple Machine Translation Systems. In HLT-NAACL (pp. 228-235).
  • 55. System Combination
      - Rosti, A. V. I., Matsoukas, S., & Schwartz, R. (2007, June). Improved word-level system combination for machine translation. In Annual Meeting of the Association for Computational Linguistics (Vol. 45, No. 1, p. 312).
  • 56. System Combination
      - He, X., Yang, M., Gao, J., Nguyen, P., & Moore, R. (2008). Indirect-HMM-based hypothesis alignment for combining outputs from machine translation systems. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (pp. 98-107). Association for Computational Linguistics.
  • 57. System Combination
      - Feng, Y., Liu, Y., Mi, H., Liu, Q., & Lü, Y. (2009). Lattice-based system combination for statistical machine translation. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 (pp. 1105-1113). Association for Computational Linguistics.
  • 58. Sentence-Level System Combination
      - Kumar, S., & Byrne, W. J. (2004, May). Minimum Bayes-Risk Decoding for Statistical Machine Translation. In HLT-NAACL (pp. 169-176).
  • 59. Sentence-Level System Combination
      - Suppose we have several MT systems.
      - For a given source text F, each MT system outputs an n-best list of target texts.
      - If possible, each MT system assigns every target text E a probability P(E|F); otherwise, the n-best target texts can be treated as equally probable.
  • 60. Sentence-Level System Combination
      - Minimum Bayes-Risk (MBR): (the formula is not preserved in this transcript)
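As an illustration of MBR selection over pooled n-best lists, here is a minimal sketch. It assumes the usual decision rule of choosing the candidate E' that minimises the expected loss Σ_E L(E', E) · P(E|F). Word-level edit distance stands in for the loss function, and the function names and example sentences are hypothetical, not taken from the slides.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance, used here as a stand-in loss L(E', E)."""
    h, r = hyp.split(), ref.split()
    prev = list(range(len(r) + 1))
    for i, hw in enumerate(h, 1):
        cur = [i]
        for j, rw in enumerate(r, 1):
            cur.append(min(prev[j] + 1,                # delete hw
                           cur[j - 1] + 1,             # insert rw
                           prev[j - 1] + (hw != rw)))  # match or substitute
        prev = cur
    return prev[-1]


def mbr_select(candidates, loss=word_edit_distance):
    """candidates: list of (target_text, probability) pooled from all systems.

    Returns the text with the lowest expected loss against all candidates.
    Pass equal probabilities when the systems do not provide P(E|F).
    """
    total = sum(p for _, p in candidates)

    def risk(e_prime):
        return sum(p / total * loss(e_prime, e) for e, p in candidates)

    return min((text for text, _ in candidates), key=risk)


# Hypothetical n-best entries with assumed probabilities.
pool = [("please show me on the map", 0.4),
        ("please show me on map", 0.3),
        ("show me the map please", 0.3)]
best = mbr_select(pool)
```

The selected sentence is the one closest, on average, to everything else in the pool, which is why MBR tends to favour a consensus translation over an outlier with a high model score.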
  • 61. Word-Level System Combination
      - Select a translation candidate as the skeleton (backbone) using Minimum Bayes Risk.
      - Construct a confusion network by aligning all the words in the other translation candidates to the words in the skeleton.
      - Select the best path through the confusion network and generate a new translation.
  • 63. Word Alignment against the Skeleton
  • 64. Confusion Network
      - Final output: Please show me on the map.
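The three steps above can be sketched in a much simplified form. This sketch assumes the skeleton is already chosen, uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for a proper hypothesis aligner, drops words that have no slot in the skeleton, and picks the best path by simple majority vote per slot. The function names and example hypotheses are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def build_confusion_network(skeleton, hypotheses):
    """Return one Counter of word votes per skeleton position.

    skeleton and each hypothesis are lists of words. The empty string ""
    is the vote for 'no word here' (a deletion relative to the skeleton).
    """
    slots = [Counter({word: 1}) for word in skeleton]
    for hyp in hypotheses:
        matcher = SequenceMatcher(a=skeleton, b=hyp, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "insert":
                continue  # simplification: extra hypothesis words are dropped
            for k in range(i1, i2):
                j = j1 + (k - i1)
                has_word = tag != "delete" and j < j2
                slots[k][hyp[j] if has_word else ""] += 1
    return slots


def best_path(slots):
    """Pick the most voted word in each slot and skip empty votes."""
    winners = (max(slot, key=slot.get) for slot in slots)
    return [word for word in winners if word]


skeleton = "please show me in the map".split()
others = ["please show me on the map".split(),
          "show me on the map".split(),
          "please show me on map".split()]
output = " ".join(best_path(build_confusion_network(skeleton, others)))
```

In this toy case the other hypotheses outvote the skeleton's "in" with "on" while the skeleton's word order is kept. Real systems weight the votes by system confidence and score paths with a language model rather than using a plain majority.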
  • 65. Word-Level System Combination
      - System combination has proved to be very effective.
      - In the NIST Open MT Evaluation Chinese-English task, MSR-NRC-SRI ranked no. 1 by using system combination technologies.
      - In later NIST evaluations, separate tracks were defined for participants using and not using system combination technologies.
  • 66. Typical Hybrid MT Approaches
      - Selective MT
      - Pipelined MT
          - Statistical Post-Editing for RBMT
          - Rule-based Pre-reordering for SMT
      - Mixture MT
  • 67. Statistical Post-Editing for RBMT
      - Dugast, L., Senellart, J., & Koehn, P. (2007, June). Statistical post-editing on SYSTRAN's rule-based translation system. In Proceedings of the Second Workshop on Statistical Machine Translation (pp. 220-223). Association for Computational Linguistics.
  • 68. Statistical Post-Editing for RBMT
      - Simard, M., Ueffing, N., Isabelle, P., & Kuhn, R. (2007). Rule-based Translation With Statistical Phrase-based Post-editing. Second Workshop on Statistical Machine Translation, Prague, Czech Republic, June 23, 2007, pp. 203–206.
  • 69. Statistical Post-Editing
      - When we have:
          - A very good RBMT system
          - A large parallel corpus that can be used for SMT training
      - Both RBMT and SMT have advantages and disadvantages.
      - Can we benefit from both methods?
  • 70. Statistical Post-Editing
      - A Statistical Post-Editing (SPE) system is a monolingual SMT system that takes the output of an RBMT system as input and generates an improved target output.
      - Diagram: Source Text → RBMT → RBMT Result → SPE → SPE Result
  • 71. Statistical Post Edit: Training
      - Diagram labels: Source, Target, RBMT, RBMT Target, SPE Training, SPE Target
  • 72. Statistical Post Edit: Training
      - RBMT usually generates better word order, while SMT can make better lexical selections.
      - RBMT+SPE outperforms the original RBMT and SMT systems.
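The SPE data flow can be illustrated with a toy sketch. It assumes that the SPE training pairs are (RBMT output, human reference) and replaces the monolingual phrase-based SMT model with a trivial word-for-word correction table so that the example stays self-contained. All function names, the fake RBMT system, and the data are hypothetical.

```python
from collections import Counter, defaultdict


def build_spe_training_pairs(parallel_corpus, rbmt_translate):
    """Turn (source, reference) pairs into (RBMT output, reference) pairs."""
    return [(rbmt_translate(src), ref) for src, ref in parallel_corpus]


def train_toy_spe(pairs):
    """Toy stand-in for SMT training: learn the most frequent word-for-word
    correction from equal-length (RBMT output, reference) pairs."""
    counts = defaultdict(Counter)
    for rbmt_out, reference in pairs:
        r_words, t_words = rbmt_out.split(), reference.split()
        if len(r_words) != len(t_words):
            continue  # the toy model cannot align unequal lengths
        for rw, tw in zip(r_words, t_words):
            counts[rw][tw] += 1
    table = {rw: c.most_common(1)[0][0] for rw, c in counts.items()}
    return lambda text: " ".join(table.get(w, w) for w in text.split())


def translate_with_spe(source, rbmt_translate, spe_translate):
    """Pipeline: Source Text -> RBMT -> RBMT Result -> SPE -> SPE Result."""
    return spe_translate(rbmt_translate(source))


# Hypothetical RBMT system, faked with a lookup table of its outputs.
fake_rbmt = {"src1": "he drinks powerful tea",
             "src2": "she likes powerful coffee",
             "src3": "we drink powerful tea"}.get
corpus = [("src1", "he drinks strong tea"), ("src2", "she likes strong coffee")]
spe = train_toy_spe(build_spe_training_pairs(corpus, fake_rbmt))
result = translate_with_spe("src3", fake_rbmt, spe)
```

The toy model only fixes a lexical choice and leaves the RBMT word order untouched, which mirrors the division of labour described above.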
  • 73. Typical Hybrid MT Approaches
      - Selective MT
      - Pipelined MT
          - Statistical Post-Editing for RBMT
          - Rule-based Pre-reordering for SMT
      - Mixture MT
  • 74. Rule-based Pre-reordering for SMT
      - Elia Yuste, Manuel Herranz, Alexandra Helle and Hirokazu Suzuki. Go Hybrid: Pangeanic's and Toshiba's First Steps Towards ENJP MT Hybridization. AAMT Journal, No. 50, December 2011. (Part B of this tutorial)
  • 75. Rule-based Pre-reordering for SMT
      - Xia, F., & McCord, M. (2004, August). Improving a statistical MT system with automatically learned rewrite patterns. In Proceedings of the 20th International Conference on Computational Linguistics (p. 508). Association for Computational Linguistics.
  • 76. Rule-based Pre-reordering for SMT
      - A phrase-based SMT (PBSMT) system makes good lexical choices but, without linguistic knowledge, is not good at long-distance reordering.
      - Rule-based word reordering is applied on the source side to make the word order of the source text much more similar to the word order of the target side.
  • 77. Rule-based Pre-reordering for SMT
      - Diagram: Source Text → Pre-Reordering → Reordered Source Text → PBSMT → Target Text
  • 79. Pre-reordering: Training
      - The rules for pre-reordering can be acquired automatically from a parallel corpus with automatic word alignments and parse trees on both sides.
  • 80. Pre-reordering: Training
      - Parse the source sentence
      - Parse the target sentence
      - Align the words and phrases on both sides
      - Extract the rewrite rules
  • 81. Parsing Trees and Alignments
  • 82. Rule Extraction
  • 83. Rule Organization and Filtering
  • 84. Applying Rewrite Rules
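Applying rewrite rules to a source parse tree can be sketched as follows. This is a toy illustration only: the tree encoding (nested tuples), the rule format (a permutation of child positions keyed by the parent label and child labels), and the single English-to-SOV rule are all hypothetical and not taken from the cited papers.

```python
def is_leaf(tree):
    """A leaf is (POS, word); an inner node is (label, child, child, ...)."""
    return len(tree) == 2 and isinstance(tree[1], str)


def reorder(tree, rules):
    """Apply rewrite rules bottom-up and return the reordered tree."""
    if is_leaf(tree):
        return tree
    label, *children = tree
    children = [reorder(child, rules) for child in children]
    permutation = rules.get((label, tuple(child[0] for child in children)))
    if permutation is not None:
        children = [children[i] for i in permutation]
    return (label, *children)


def leaves(tree):
    """Read the words off the tree from left to right."""
    if is_leaf(tree):
        return [tree[1]]
    return [word for child in tree[1:] for word in leaves(child)]


# Hypothetical rule: move the verb after its object (verb-final order).
rules = {("VP", ("VBZ", "NP")): (1, 0)}
source_tree = ("S",
               ("NP", ("NNP", "John")),
               ("VP", ("VBZ", "reads"),
                      ("NP", ("DT", "the"), ("NN", "book"))))
reordered_source = " ".join(leaves(reorder(source_tree, rules)))
```

The reordered source text is then passed to the PBSMT system, which only needs to handle the remaining local reordering.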
  • 85. Rule-based Pre-reordering for SMT
  • 86. Typical Hybrid MT Approaches
      - Selective MT
      - Pipelined MT
      - Mixture MT
          - Statistical Parsing in RBMT
          - Rule-based Named Entity Translation in SMT
          - Human-Acquired Rules in SMT
          - SMT Decoding with TM Phrases
  • 87. Statistical Parsing in RBMT
      - Statistical parsing outperforms rule-based parsing when a large-scale treebank is available.
      - It is therefore reasonable to use a statistical algorithm in the parsing component of an RBMT system.
  • 88. Rule-based Named Entity Translation in SMT
      - Ney, H. (2013). Statistical MT Systems Revisited: How much Hybridity do they have? Proceedings of the Second Workshop on Hybrid Approaches to Translation, page 7, Sofia, Bulgaria, August 8, 2013.
  • 89. Numerical Expression Translation
      - English: 3,501,749; 3 million 501 thousand and 749; 3501749; 350,1749
      - Chinese: 350 wan 1749
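The example above shows why numbers suit a rule-based component: English groups digits by thousands, while Chinese groups them by ten-thousands (wan). Here is a minimal sketch of such a rule. The function names and the romanised unit labels are assumptions made for illustration, and details such as zero padding inside groups are ignored.

```python
import re


def parse_grouped_number(text):
    """Read a digit string such as '3,501,749' or '3501749' as an integer."""
    digits = re.sub(r"[,\s]", "", text)
    if not digits.isdigit():
        raise ValueError(f"not a plain number: {text!r}")
    return int(digits)


def to_wan_grouping(n):
    """Regroup a non-negative integer by powers of 10^4: '', wan, yi."""
    units = ["", " wan", " yi"]  # 10^0, 10^4, 10^8 (romanised labels)
    if n < 0 or n >= 10 ** 12:
        raise ValueError("toy rule covers 0 <= n < 10^12 only")
    if n == 0:
        return "0"
    parts = []
    index = 0
    while n > 0:
        n, group = divmod(n, 10_000)
        if group:
            parts.append(f"{group}{units[index]}")
        index += 1
    return " ".join(reversed(parts))


chinese_style = to_wan_grouping(parse_grouped_number("3,501,749"))
```

A rule like this is deterministic and covers every number in range, whereas a purely statistical system has to have seen each number, or its parts, in the training data.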
  • 90. Human-Acquired Rules in SMT
      - Li, X., Lü, Y., Meng, Y., Liu, Q., & Yu, H. Feedback Selecting of Manually Acquired Rules Using Automatic Evaluation. Proceedings of the 4th Workshop on Patent Translation, pages 52-59, MT Summit XIII, Xiamen, China, September 2011.
  • 91. Human-Acquired Rules in SMT
      - These rules are used in the decoding process together with the hierarchical phrases in an SMT system.
  • 92. SMT Decoding with TM Phrases
      - Philipp Koehn and Jean Senellart. 2010. Convergence of translation memory and statistical machine translation. In AMTA Workshop on MT Research and the Translation Industry, pages 21–31.
      - Wang, K., Zong, C., & Su, K. Y. Integrating Translation Memory into Phrase-Based Machine Translation during Decoding. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pages 11–21, Sofia, Bulgaria, August 4-9, 2013.
  • 93. SMT Decoding with TM Phrases
      - Yanjun Ma, Yifan He, Andy Way and Josef van Genabith. 2011. Consistent translation using discriminative learning: a translation memory-inspired approach. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 1239–1248, Portland, Oregon.
      - Yifan He, Yanjun Ma, Andy Way and Josef van Genabith. 2011. Rich linguistic features for translation memory-inspired consistent translation. In Proceedings of the Thirteenth Machine Translation Summit, pages 456–463.
  • 94. SMT Decoding with TM Phrases
      - Extract TM phrases from similar sentences in the translation memory and use them in the decoding process at runtime.
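The first half of this step, finding similar sentences in the translation memory, can be sketched as follows. The sketch assumes a common word-level fuzzy match score, 1 − edit distance / length of the longer sentence, and an arbitrary threshold of 0.7. The function names, the threshold, and the TM entries are hypothetical. Extracting phrases from the retrieved hits would additionally need word alignments and is not shown.

```python
def edit_distance(a, b):
    """Levenshtein distance between two word lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def fuzzy_match_score(source, tm_source):
    """Assumed definition: 1 - edit distance / length of the longer sentence."""
    s, t = source.split(), tm_source.split()
    if not s and not t:
        return 1.0
    return 1.0 - edit_distance(s, t) / max(len(s), len(t))


def retrieve_tm_hits(source, tm, threshold=0.7):
    """Return (score, tm_source, tm_target) hits at or above the threshold,
    best match first. tm is a list of (source, target) pairs."""
    scored = [(fuzzy_match_score(source, src), src, tgt) for src, tgt in tm]
    return sorted((hit for hit in scored if hit[0] >= threshold), reverse=True)


# Hypothetical translation memory entries.
tm = [("please show me on the map", "montrez-moi sur la carte s'il vous plaît"),
      ("where is the station", "où est la gare")]
hits = retrieve_tm_hits("please show me on a map", tm)
```

Phrase pairs taken from such hits can then be offered to the decoder alongside the regular phrase table, so that matching parts of the sentence are translated consistently with the TM.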
  • 95. Outline
      - Why Hybrid MT?
      - An overview of Hybrid MT
      - Typical Hybrid MT Approaches
      - Conclusion
  • 96. Conclusion
      - Different MT approaches have advantages and disadvantages, which are usually complementary.
      - Hybrid MT can benefit from different MT approaches.
      - Three categories of Hybrid MT were introduced: Selective, Pipelined and Mixture.
      - In practice, almost all real MT systems are hybrid systems.
  • 97. Thank you! Q&A