8. Relearnt RBMT



  1. Summary of "Can we relearn an RBMT system?" Hiroshi Matsumoto, Nagaoka University of Technology, EEI Dept. March 5, 2013
  2. Outline: 1 About this paper, 2 Introduction, 3 Systems, 4 Models, 5 Results
  3. About this paper. Title: Can we relearn an RBMT system? Authors: Dugast, Loïc; Senellart, Jean; Koehn, Philipp. Booktitle: Proceedings of the Third Workshop on Statistical Machine Translation. Pages: 175-178. Year: 2008. Organization: Association for Computational Linguistics
  4. Introduction. Two major research directions: (1) rule-based systems, built from manually written rules combined with bilingual dictionaries; (2) statistical machine translation, a statistical framework trained on large amounts of monolingual and parallel corpora. Aims of this research: finding efficient combination setups, and discriminating the strengths and weaknesses of rule-based and statistical systems.
  5. Systems. SYSTRAN: a pure rule-based system. SYSTRAN Relearnt: a statistical model of the rule-based engine; Relearnt uses a real English language model. SYSTRAN Relearnt-0: a plain statistical model of SYSTRAN. MOSES.
  6. Models: training without human reference translations. Problem: the reliance of statistical models on parallel corpora is problematic. Existing solutions include domain adaptation and statistical post-editing; here the authors propose a new one.
  7. Models: training without human reference translations. Submitted system: the SL side of the parallel corpus was translated with the rule-based translation engine to produce the target side of the training data; the LM was trained on real TL data. Non-submitted system: each corpus was built from newspaper text; the SL corpus was translated by the rule-based system to produce the parallel training data, while the TL corpus was used to train an LM.
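The data-building step on this slide can be sketched as below. The `rbmt_translate` function and its toy lexicon are hypothetical stand-ins for the SYSTRAN engine, not the paper's implementation; the point is only that the target side of the SMT training data is machine-generated, while the language model comes from real target-language text.

```python
def rbmt_translate(sentence):
    """Stub for the rule-based engine (hypothetical toy word-for-word lexicon)."""
    lexicon = {"das": "the", "haus": "house", "ist": "is", "klein": "small"}
    return " ".join(lexicon.get(w, w) for w in sentence.split())

def build_synthetic_parallel(sl_corpus):
    """Pair each source sentence with its rule-based translation,
    yielding the synthetic parallel corpus used to train the SMT model."""
    return [(s, rbmt_translate(s)) for s in sl_corpus]

sl_corpus = ["das haus ist klein"]
parallel = build_synthetic_parallel(sl_corpus)
# An SMT system (e.g. Moses) would be trained on `parallel`,
# while the language model is trained on real English text,
# not on the RBMT output.
print(parallel)
```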
  8. Models: training without human reference translations (figure slide).
  9. Results #1. Baseline vs. Relearnt-0: the Relearnt-0 model scores slightly lower than the rule-based original. Relearnt vs. Relearnt-0: 5 BLEU points more for Relearnt, which adds a real English language model and a tuning set.
  10. Results #2. To discriminate between the statistical nature of a translation system and the fact that it was trained on the relevant domain, the authors defined 11 error types and counted their occurrences over 100 randomly picked sentences.
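The counting protocol on this slide can be sketched as follows; the function names and the four example labels are illustrative assumptions (the paper defines 11 error types), and the annotations themselves would come from manual inspection.

```python
import random
from collections import Counter

def sample_sentences(corpus, n=100, seed=0):
    """Pick n sentences at random (seeded here for reproducibility)."""
    rng = random.Random(seed)
    return rng.sample(corpus, min(n, len(corpus)))

def count_errors(annotations):
    """Tally error-type labels; `annotations` maps each sampled
    sentence to the list of error labels a human assigned to it."""
    counts = Counter()
    for labels in annotations.values():
        counts.update(labels)
    return counts

# Hypothetical annotations for two sampled sentences:
demo = {
    "sentence 1": ["missing word"],
    "sentence 2": ["missing word", "extra word"],
}
print(count_errors(demo))
```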
  11. Results #2. Missing words: a typical statistical error, but no evidence of it here. Extra words: a rule-based tendency to produce something extra. Unknown words: words not in the rule-based system's dictionaries. Translation choice: a statistical strength.
  12. Results #3 (figure slide).
