8. Relearnt RBMT

  1. Summary of "Can we relearn an RBMT system?"
     Hiroshi Matsumoto
     Nagaoka University of Technology, EEI Dept.
     March 5, 2013
  2. Outline
     1 About this paper
     2 Introduction
     3 Systems
     4 Models
     5 Results
  3. About this paper
     Title: Can we relearn an RBMT system?
     Authors: Dugast, Loïc; Senellart, Jean; Koehn, Philipp
     Booktitle: Proceedings of the Third Workshop on Statistical Machine Translation
     Pages: 175-178
     Year: 2008
     Organization: Association for Computational Linguistics
  4. Introduction
     Two major lines of research:
     1 Rule-based systems: manually written rules combined with bilingual dictionaries
     2 Statistical machine translation: a statistical framework built on large amounts of monolingual and parallel corpora
     Aims of this research:
     - finding efficient combination setups
     - discriminating the strengths/weaknesses of rule-based and statistical systems
  5. Systems
     SYSTRAN: a pure rule-based system
     SYSTRAN Relearnt: a statistical model of the rule-based engine; Relearnt uses a real English language model
     SYSTRAN Relearnt-0: a plain statistical model of SYSTRAN
     MOSES
  6. Training w/o human reference translations
     Problem: the reliance of statistical models on parallel corpora is problematic.
     Existing solutions include domain adaptation and statistical post-editing.
     Here, the authors propose a new solution.
  7. Training w/o human reference translations
     Submitted system: the SL side of the parallel corpus was translated with the rule-based engine to produce the target side of the training data; the LM was trained on the real TL data.
     Non-submitted system: each corpus was built from newspaper text; the SL corpus was translated by the rule-based system to produce the parallel training data, while the TL corpus was used to train an LM.
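The training-data construction on this slide can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the toy `RULES` dictionary and the function names are invented stand-ins for the real SYSTRAN engine, which translates the SL corpus to synthesize the target side of the parallel data.

```python
# Toy stand-in for the rule-based engine: word-by-word dictionary lookup.
# (Illustrative only; the paper uses the full SYSTRAN rule-based system.)
RULES = {"the": "le", "cat": "chat", "sleeps": "dort"}

def rbmt_translate(sentence: str) -> str:
    """Translate an SL sentence with the toy rule-based lexicon."""
    return " ".join(RULES.get(word, word) for word in sentence.split())

def build_relearnt_corpus(sl_corpus):
    """Pair each SL sentence with its rule-based translation,
    yielding the synthetic parallel corpus used to train the SMT model."""
    return [(s, rbmt_translate(s)) for s in sl_corpus]

parallel = build_relearnt_corpus(["the cat sleeps"])
print(parallel)  # [('the cat sleeps', 'le chat dort')]
```

A real English LM trained on genuine TL text (as in the submitted system) is what distinguishes Relearnt from Relearnt-0, which is purely a statistical model of the engine's own output.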
  8. Training w/o human reference translations
  9. Results #1
     Baseline vs. Relearnt-0: the Relearnt-0 model scores slightly lower than the original rule-based system.
     Relearnt vs. Relearnt-0: adding a real English language model and a tuning set gains about 5 BLEU points over Relearnt-0.
  10. Results #2
      To discriminate between the statistical nature of a translation system and the fact that it was trained on the relevant domain, the authors defined 11 error types and counted their occurrences over 100 randomly picked sentences.
  11. Results #2
      Missing words: a typical statistical error, but no evidence of it here
      Extra words: a rule-based trait of producing something extra
      Unknown words: words absent from the rule-based dictionaries
      Translation choice: a statistical strength
  12. Results #3
