TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Beijing, Chengqing Zong, Casia, 23 April 2012


In this talk, Chengqing presents work at CASIA on developing statistical machine translation (MT) systems based on the open source toolkit Moses. In recent years, CASIA has developed several MT systems, including Chinese-to-English, English-to-Chinese, Japanese-to-Chinese, Arabic-to-Chinese, Uigur-to-Chinese and Tibetan-to-Chinese systems. Moses is a basic translation engine in these systems. Chengqing shows the audience how CASIA uses and extends Moses to develop its multilingual MT systems.
This presentation is part of the MosesCore project, which encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit.
MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
Latest news on Twitter - #MosesCore

Published in: Technology, Business

Transcript

  • 1. How We Use Moses to Develop Our Multi-lingual Machine Translation Systems?
      Chengqing ZONG (宗成庆)
      Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
      Address: 95 Zhongguancun East Road, Haidian District, Beijing 100190
      Email: cqzong@nlpr.ia.ac.cn
      Web: http://www.nlpr.ia.ac.cn/cip/cqzong.htm
      Tel: +86-10-6255 4263
  • 2. Outline
      1. Brief Introduction to Our Work
      2. Main Features of Moses
      3. How We Use Moses?
      4. Our Feeling
  • 3. 1. Brief Introduction to Our Work
      Our group works on machine translation (MT) research and system development in the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA).
      - 6 staff members
      - 8 Ph.D. candidates, 1 Master's student
      - 5 visiting scholars
  • 4. 1.  Brief Introduction to Our Work
  • 5. 1. Brief Introduction to Our Work
      Multilingual text-to-text translation system [figure not transcribed; visible labels: Japanese, Chinese]
  • 6. 1. Brief Introduction to Our Work
      In the spoken language translation (SLT) evaluation organized by IWSLT 2007, the Chinese-to-English (CE) clean-text translation of our system performed best according to the human rankings.
  • 7. 1. Brief Introduction to Our Work
      In IWSLT 2008 [figure not transcribed; visible label: CASIA]
  • 8. 1. Brief Introduction to Our Work
      In IWSLT 2009 [figure not transcribed; visible label: CASIA]
  • 9. 1. Brief Introduction to Our Work
      [figure not transcribed; visible label: CASIA]
  • 10. 1. Brief Introduction to Our Work
      [figure not transcribed; visible label: NLPR]
  • 11. 1. Brief Introduction to Our Work
      In the MT evaluation organized by the China Workshop on Machine Translation (CWMT) 2011 (Sept. 23-24), our system participated in all tasks:
      1. Chinese to English (News domain, progress)
      2. English to Chinese (News domain, progress)
      3. English to Chinese (News domain, current)
      4. English to Chinese (Science domain)
      5. Japanese to Chinese (News domain)
      6. Tibetan to Chinese (Government documents)
      7. Mongolian to Chinese (Daily)
      8. Uigur to Chinese (News domain)
      9. Kazakh to Chinese (News domain)
      10. Kyrgyz to Chinese (News domain)
      In total, 19 units and 165 systems participated in this evaluation.
  • 12. 1. Brief Introduction to Our Work
      According to BLEU scores, our system ranked first in the following 5 tasks:
      - English to Chinese (News domain, progress)
      - Japanese to Chinese (News domain)
      - Tibetan to Chinese (Government documents)
      - Mongolian to Chinese (Daily)
      - Kyrgyz to Chinese (News domain)
      It ranked second in the following 4 tasks:
      - Chinese to English (News domain, progress)
      - English to Chinese (News domain, current)
      - Uigur to Chinese (News domain)
      - Kazakh to Chinese (News domain)
  • 13. Outline
      1. Brief Introduction to Our Work
      2. Main Features of Moses
      3. How We Use Moses?
      4. Our Feeling
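  • 14. 2. Main Features of Moses
      The basic idea of statistical machine translation (SMT) can be formulated in principle as
      $$e_{\text{best}} = \arg\max_{e} \; p(f \mid e) \times p_{\text{LM}}(e) \times \omega^{\text{length}(e)}$$
      It is now usually implemented as a log-linear model, in which each feature is multiplied by a weight. [formula not transcribed; visible labels: weight, feature]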
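The log-linear formula on this slide did not survive transcription. As an assumption, the standard form from the SMT literature is sketched here; the symbols $\lambda_i$ (weight) and $h_i$ (feature) are assumed notation, not recovered from the slide.

```latex
e_{\text{best}} = \arg\max_{e} \sum_{i=1}^{n} \lambda_i \, h_i(e, f)
```

Under this reading, choosing $h_1 = \log p(f \mid e)$, $h_2 = \log p_{\text{LM}}(e)$ and $h_3 = \text{length}(e)$, with $\lambda_1 = \lambda_2 = 1$ and $\lambda_3 = \log \omega$, gives back the product form, because taking the logarithm does not change the argmax.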
  • 15. 2. Main Features of Moses
      Some useful features include:
      - Phrase translation probability
      - Lexical phrase translation probability
      - Inverse phrase translation probability
      - Inverse lexical phrase translation probability
      - n-gram English language model
      - English sentence length penalty
      - Chinese phrase count penalty
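To make the role of such features concrete, here is a minimal sketch of how a log-linear model could score candidate translations as a weighted sum of feature values. The feature names, values and weights are hypothetical, chosen only for illustration; this is not Moses code.

```python
import math


def loglinear_score(features, weights):
    """Return the weighted feature sum, sum_i weight_i * feature_i."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical weights; in a real system they are tuned on development data.
weights = {"phrase_tm": 1.0, "language_model": 0.5, "length_penalty": -0.1}

# Hypothetical feature values (log-probabilities and a word count)
# for two candidate translations of the same source sentence.
candidates = {
    "parts of Europe hit by floods": {
        "phrase_tm": math.log(0.3),
        "language_model": -8.0,
        "length_penalty": 6,
    },
    "Europe parts of hit by floods": {
        "phrase_tm": math.log(0.3),
        "language_model": -12.0,
        "length_penalty": 6,
    },
}

# The decoder's choice is the candidate with the highest weighted score.
best = max(candidates, key=lambda e: loglinear_score(candidates[e], weights))
print(best)
```

In this toy setting the two candidates differ only in their language model value, so the more fluent word order wins.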
  • 16. 2. Main Features of Moses
      A phrase-based example: 欧洲 部分 地区 遭受 洪水 袭击
      (1) 欧洲 部分 地区 遭受 洪水 袭击
      (2) Europe parts of hit by floods
      (3) parts of Europe hit by floods
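The three steps of this example can be read as segmentation, phrase translation and reordering. The sketch below replays them in Python. The phrase boundaries were lost in the transcript, so the segmentation, the phrase table and the reordering are all assumptions made for illustration.

```python
# Step 1: assumed segmentation of the source sentence into phrases.
segments = ["欧洲", "部分 地区", "遭受 洪水 袭击"]

# Hypothetical phrase table covering only this sentence.
phrase_table = {
    "欧洲": "Europe",
    "部分 地区": "parts of",
    "遭受 洪水 袭击": "hit by floods",
}

# Step 2: translate each phrase independently, keeping source order.
translated = [phrase_table[segment] for segment in segments]

# Step 3: reorder the target phrases; this permutation is assumed,
# a real decoder searches over many possible orders.
order = [1, 0, 2]
reordered = [translated[i] for i in order]

print(" ".join(translated))
print(" ".join(reordered))
```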
  • 17. 2. Main Features of Moses
      The framework [diagram not transcribed; recoverable labels]: parallel data and development data, Moses training, translation model, test data, Moses decoder, target translation, Moses evaluation, good or bad
  • 18. 2. Main Features of Moses
      - Offers two types of translation models: phrase-based and tree-based
      - Supports factored translation models
      - Allows the decoding of different kinds of input: sentences, confusion networks and word lattices
  • 19. 2. Main Features of Moses
      - Supports n-best translation output besides the best one:
          - This is a good conference.
          - This was a great conference.
          - It is a good meeting.
          - ……
      - Provides an experimental management system
      - Translates fast with good translation quality
  • 20. 2. Main Features of Moses
      Balance between speed and quality:
      - If we want translation speed, Moses provides many options to accelerate the translation process, such as the beam size and the granularity of translation rules.
      - If we pursue translation quality, Moses also allows us to enlarge the translation search space to have a better chance of obtaining a better translation.
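The effect of the beam size can be illustrated with a toy monotone decoder. This is a minimal sketch, not how Moses is implemented: the translation options, the bigram scores and the function name `beam_search` are all hypothetical. A beam of 1 searches fastest but misses the better translation that a beam of 2 finds, and the surviving hypotheses double as an n-best list.

```python
import heapq

# Hypothetical translation options per source position: (word, log-score).
OPTIONS = [
    [("this", -0.1), ("it", -0.3)],
    [("is", -0.1), ("was", -0.4)],
    [("good", -0.2)],
]

# Hypothetical bigram scores standing in for a language model.
BIGRAM = {
    ("this", "is"): -0.5, ("this", "was"): -0.5,
    ("it", "is"): -0.1, ("it", "was"): -0.5,
    ("is", "good"): -0.1, ("was", "good"): -0.1,
}


def beam_search(options, bigram, beam_size, n_best=1):
    """Expand hypotheses left to right, keeping the beam_size best each step."""
    beam = [(0.0, [])]
    for choices in options:
        expanded = []
        for score, words in beam:
            for word, word_score in choices:
                # Unseen bigrams get a fixed penalty; the first word has none.
                lm = bigram.get((words[-1], word), -1.0) if words else 0.0
                expanded.append((score + word_score + lm, words + [word]))
        # Pruning: a smaller beam is faster but may drop the best path.
        beam = heapq.nlargest(beam_size, expanded, key=lambda h: h[0])
    return beam[:n_best]


print(beam_search(OPTIONS, BIGRAM, beam_size=1))
print(beam_search(OPTIONS, BIGRAM, beam_size=2, n_best=2))
```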
  • 21. 2. Main Features of Moses
      - It now includes more and better translation models:
          - Hierarchical Phrase-based Translation Model (HPB)
          - Tree-to-Tree/String-to-Tree Translation Models
      - It provides new features, such as faster language modeling, multi-threaded decoding and client-server translation
      It keeps improving ……
  • 22. 2. Main Features of Moses
      - Moses provides good documentation and a friendly interface
      - We can upgrade the components if we need to
      - We can develop hybrid translation methods within the Moses framework
      It allows extension ……
  • 23. Outline
      1. Brief Introduction to Our Work
      2. Main Features of Moses
      3. How We Use Moses?
      4. Our Feeling
  • 24. 3. How We Use Moses?
      Moses facilitates our research work:
      - For beginners in SMT
      - For researchers familiar with SMT
      - For engineers building an SMT system
  • 25. 3. How We Use Moses?
      For beginners in SMT:
      - For most SMT beginners, Moses is the freshest and most vivid tutorial, giving them an intuitive feeling for SMT
      - Its detailed guidance is very easy for beginners to follow
      - It provides a preliminary understanding of the modules involved in an SMT system
      - It helps beginners quickly find the research topics in SMT that interest them
  • 26. 3. How We Use Moses?
      We use Moses as a tutorial tool.
  • 27. 3. How We Use Moses?
      For researchers familiar with SMT:
      - Moses provides the whole toolkit for building a translation system: data preparation, word alignment, translation rule extraction, parameter tuning, decoding, and evaluation
      - We just need to study the sub-models we are interested in, propose new algorithms, and finally verify their effectiveness using Moses
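As an illustration of the translation rule extraction step in this pipeline, the sketch below extracts phrase pairs that are consistent with a word alignment, the standard criterion for phrase-based models. It is a simplified version that does not extend phrases over unaligned words. The sentence pair is borrowed from the phrase-based example slide, and the alignment and the function name `extract_phrases` are hypothetical.

```python
def extract_phrases(src, tgt, alignment, max_len=3):
    """Extract phrase pairs consistent with the word alignment.

    alignment is a set of (source index, target index) links. A pair is
    consistent when no word inside it is linked to a word outside it.
    """
    pairs = set()
    for s_start in range(len(src)):
        for s_end in range(s_start, min(s_start + max_len, len(src))):
            # Target positions linked to the source span.
            linked = [t for s, t in alignment if s_start <= s <= s_end]
            if not linked:
                continue
            t_start, t_end = min(linked), max(linked)
            if t_end - t_start >= max_len:
                continue
            # Reject if a target word in the span links outside the source span.
            if any(t_start <= t <= t_end and not s_start <= s <= s_end
                   for s, t in alignment):
                continue
            pairs.add((" ".join(src[s_start:s_end + 1]),
                       " ".join(tgt[t_start:t_end + 1])))
    return pairs


src = "欧洲 部分 地区 遭受 洪水 袭击".split()
tgt = "parts of Europe hit by floods".split()
# Hypothetical hand-made alignment; "by" is left unaligned.
alignment = {(0, 2), (1, 0), (2, 1), (3, 3), (4, 5), (5, 3)}

for pair in sorted(extract_phrases(src, tgt, alignment)):
    print(pair)
```

Note that 遭受 alone is not extracted: its target word "hit" is also linked to 袭击, so only the full span 遭受 洪水 袭击 is consistent.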
  • 28. 3. How We Use Moses?
      - For example, we proposed a new algorithm for word alignment and translation rule extraction
      - Moses helps us verify the effectiveness of the proposed methods in just a few days, which accelerates our research work a lot
      - Most importantly for MT researchers, Moses has become a de facto standard baseline against which to test their own models
  • 29. 3. How We Use Moses?
      We develop new models to compare with Moses and propose new algorithms to implement on the Moses platform.
  • 30. 3. How We Use Moses?
      [diagram not transcribed: translation pyramid between source language and target language; recoverable levels, bottom to top: word-based model; phrases (phrase-based); formalism grammar (hierarchical phrase-based); syntax (string-to-tree, tree-to-string, tree-to-tree); semantic; interlingua]
  • 31. 3. How We Use Moses?
      For engineers building an SMT system:
      - They do not need to care about the principles of how Moses works
      - They just need to provide training data, development data, and test data
      - They do some pre-processing to clean the data
      - They do some post-processing to convert the output
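The pre-processing step for parallel data might look like the following sketch, which normalizes whitespace and drops sentence pairs that are empty, too long, or badly mismatched in length. The function name `clean_parallel` and the thresholds are hypothetical, not taken from the talk.

```python
def clean_parallel(pairs, max_len=80, max_ratio=3.0):
    """Keep only sentence pairs that look usable for training."""
    cleaned = []
    for src, tgt in pairs:
        src_tokens, tgt_tokens = src.split(), tgt.split()
        # Drop pairs where either side is empty.
        if not src_tokens or not tgt_tokens:
            continue
        # Drop overly long sentences.
        if len(src_tokens) > max_len or len(tgt_tokens) > max_len:
            continue
        # Drop pairs whose lengths differ too much to be translations.
        longer = max(len(src_tokens), len(tgt_tokens))
        shorter = min(len(src_tokens), len(tgt_tokens))
        if longer / shorter > max_ratio:
            continue
        # Re-join tokens with single spaces.
        cleaned.append((" ".join(src_tokens), " ".join(tgt_tokens)))
    return cleaned


# Hypothetical tokenized sentence pairs.
pairs = [
    ("This  is a good conference .", "这 是 一 个 很 好 的 会议 。"),
    ("", "空"),
    ("hello", "一 二 三 四 五"),
]
print(clean_parallel(pairs))
```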
  • 32. [diagram not transcribed; recoverable flow]: source sentence; pre-processing; MT engine 1 (Moses), MT engine 2, …, MT engine 6; one n-best list per engine; merged n-best list; MBR decoder; word aligning (references for alignment); merging alignments; decoder based on confusion network (C.N.); translation. NLPR, CAS-IA, 4/23/12
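The MBR decoder in this pipeline selects, from the merged n-best list, the hypothesis that agrees most with all the others. A minimal sketch follows, assuming a simple unigram overlap as the similarity measure in place of BLEU. The probabilities and function names are hypothetical, and the sentences are borrowed from the n-best slide.

```python
from collections import Counter


def overlap(hyp, ref):
    """Unigram F1 between two token lists, a simple stand-in for BLEU."""
    common = sum((Counter(hyp) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(hyp), common / len(ref)
    return 2 * precision * recall / (precision + recall)


def mbr_select(nbest):
    """Pick the hypothesis with the highest expected similarity.

    nbest is a list of (sentence, posterior probability) pairs.
    """
    def expected_gain(candidate):
        tokens = candidate.lower().split()
        return sum(prob * overlap(tokens, other.lower().split())
                   for other, prob in nbest)

    return max((sentence for sentence, _ in nbest), key=expected_gain)


# Hypothetical merged n-best list with made-up posterior probabilities.
nbest = [
    ("This is a good conference", 0.3),
    ("This was a great conference", 0.3),
    ("It is a good meeting", 0.4),
]
print(mbr_select(nbest))
```

Here the most probable single hypothesis is "It is a good meeting", but MBR prefers "This is a good conference" because it shares more words with the rest of the list.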
  • 33. 3. How We Use Moses?
      We also use Moses as a tool to evaluate the quality of collected parallel corpora: we can build an MT system on a corpus in two or three days and evaluate its translation quality, which in turn reflects the quality of the corpus.
  • 34. 3. How We Use Moses?
      For example:
      1-1 merkezdiki dölet apparatliri bilen jaylardiki dölet apparatlirining xizmet hoquqi merkezning bir tutash rehberlikide jaylarning teshebbuskarliqi we aktipliqini toluq jari qildurush prinsipi boyiche ayrilidu.
      1-2 中央和地方的国家机构职权的划分,遵循在中央的统一领导下,充分发挥地方的主动性、积极性的原则。
          [English: The division of functions and powers between central and local state organs follows the principle of giving full play to the initiative and enthusiasm of local authorities under the unified leadership of the central authorities.]
      2-1 madda jungxua xelq jumhuriyitide hemme millet bapbarawer.
      2-2 中华人民共和国各民族一律平等。
          [English: All ethnic groups in the People's Republic of China are equal.]
      ……
  • 35. 3. How We Use Moses?
      Many systems participating in MT evaluations around the world employ Moses, for example in the NIST, WMT, IWSLT and CWMT evaluations.
  • 36. 3. How We Use Moses?
      7 of the 11 systems in the SLT evaluation of IWSLT 2011 employed Moses.
      System | Uses Moses?
      DCU | √
      DFKI | √
      FBK | √
      KIT | √
      LIG | √
      LIMSI | -
      LIUM | √
      MIT | -
      MSR | -
      NICT | √
      RWTH | -
  • 37. 3. How We Use Moses?
      16 of the 19 systems in the MT evaluation of CWMT 2011 employed Moses.
      System | Uses Moses?
      DCU | √
      NTT | √
      Systran | √
      ICT-CAS | √
      IA-CAS | √
      IS-CAS | √
      NEU | -
      XAUT | √
      ISTIC | -
      XJIPC | √
      HIT | √
      IMNU | √
      FRDC | √
      BUAA | √
      XMU | √
      IIM | √
      NJU | -
      BJTU | √
      XJU | √
  • 38. Outline
      1. Brief Introduction to Our Work
      2. Main Features of Moses
      3. How We Use Moses?
      4. Our Feeling
  • 39. 4. Our Feeling
      Moses is our friend:
      - It is a good helper and saves us a lot of labor
      - It is a good mirror that reflects the quality of our MT systems
      - It is a booster for MT research
      We love our friend!
  • 40. 4. Our Feeling
      Moses is our competitor:
      - As MT researchers, we hope to develop new translation models that surpass Moses
      - Competition drives our progress
      We love our competitor! We love Moses!
  • 41. Thanks!
