Machine translation course program (in English)


This is the English version of my Machine Translation course program for the following course slides (in Russian):


Published in: Technology, Education

Brief description of the course:

There are two fundamental approaches to machine translation: the rule-based approach (based on formal models of natural languages, e.g. dependency grammars) and the statistical approach (based on parallel streams of data). Both have their advantages: the rule-based approach is formal and structured, while the statistical approach makes it possible to build and scale a system without studying the properties of a natural language in depth. Both also have their problem areas: the rule-based approach is bound to a given language or family of languages, while the statistical approach does not allow control over subtle structures and properties of a natural language, such as the generation of prepositions.

Recently, combining these two approaches has been of special interest to researchers. The entire machine translation pipeline, from formalization of the source language to word reordering on the target-language side, can be seen as a training ground for combining rule-based and statistical methods.

This course introduces students to all sub-tasks of building a machine translation system using both approaches: formalization of natural language, translation dictionaries, phrase translation, machine translation models, decoding and word reordering. The course also presents formal semantic models of natural languages and their place in the topic. Machine learning methods (such as structured prediction) are also in the focus of the course.

The course material assumes knowledge of general higher mathematics and knowledge of, or interest in, natural language processing. There will be some hands-on and take-away knowledge sessions, which assume familiarity with formats, NLP algorithms and libraries.

Course topics
1. Introduction to MT. Motivation for its existence
2. Short history of MT, main phases. ALPAC report
3. MT systems triangle. Direct and indirect MT. Examples of MT systems
4. Current MT systems existing in the industry, main players
5. Existing software packages for natural language processing and building an MT system
6. Two fundamental approaches to MT: statistical and rule-based (classical)
7. Methods of MT
8. Direct MT system, its features, pros and cons
9. Transfer MT system, types of transfer methods, features
10. Notion of interlingua. Features of MT based on interlingua, its comparison with transfer
11. Statistical MT and its components
12. Example-based MT systems
13. Theory of statistical MT systems. Fundamental equation (Bayes' theorem). Notion of statistical language model. MT model
14. Model of machine translation in statistical MT
15. Task of word alignment
16. Features of MT systems
17. Existing software components of statistical MT systems
18. Evaluation of MT systems: human evaluation and automatic metrics
19. BLEU score
20. METEOR score
21. NIST score
22. Round-trip evaluation method
23. Hybrid MT systems
24. Task of word reordering in a sentence on the target side. Rule-based and statistical approaches
25. Computer semantics of a natural language. MT system based on it
26. Pragmatics and context analysis on cross-sentence level
27. Practical details of software packages: GIZA++, SRILM, Moses
28. Method of structured prediction for learning machine translation models

Seminar topics
1. Mathematics of statistical MT, paper [1]
2. Hierarchical model of statistical MT, paper [2]
3. Phrase-based statistical MT, paper [3]
4. Rule-based MT systems, papers [4,5]
5. Hybrid MT systems, based on examples, paper [6]
6. BLEU score in detail, paper [8]
7. Robust large-scale MT systems, based on examples, paper [9]

Bibliography
[1] Brown P., Della Pietra S., Della Pietra V., Mercer R.: The Mathematics of Statistical Machine Translation: Parameter Estimation, 1993
[2] Chiang D.: A Hierarchical Phrase-Based Model for Statistical Machine Translation, 2005
[3] Koehn P., Och F., Marcu D.: Statistical Phrase-Based Machine Translation, 2003
[4] Kaplan R., Netter K., Wedekind J., Zaenen A.: Translation By Structural Correspondences, 1989
[5] Landsbergen J.: The Rosetta Project, 1989
[6] Groves D., Way A.: Hybrid Example-Based SMT: the Best of Both Worlds?
[7] Athanaselis T., Bakamidis S., Dologlou I.: Words Reordering based on Statistical Language Model, 2006
[8] Papineni K., Roukos S., Ward T., Zhu W.-J.: BLEU: a Method for Automatic Evaluation of Machine Translation, 2002
[9] Gough N., Way A.: Robust Large-Scale EBMT with Marker-Based Segmentation, 2004
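
Illustration for the BLEU score topic (paper [8]): the course program only names the metric, so the following is a minimal sketch rather than course material. It assumes pre-tokenized input, computes the score for a single sentence (the paper defines BLEU over a whole test corpus), and applies no smoothing, so any zero n-gram precision yields a score of 0. The function names ngrams, modified_precision and sentence_bleu are hypothetical and chosen for this example only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram count is capped by
    the largest count of that n-gram in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def sentence_bleu(candidate, references, max_n=4):
    """Sentence-level BLEU sketch: brevity penalty times the geometric mean
    of the modified precisions for n = 1..max_n (uniform weights)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # assumption: no smoothing in this sketch
    c = len(candidate)
    # Reference length closest to the candidate length (ties: the shorter one).
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


if __name__ == "__main__":
    refs = ["the cat is on the mat".split(),
            "there is a cat on the mat".split()]
    print(modified_precision("the the the the the the the".split(), refs, 1))
    print(sentence_bleu("the cat is on the mat".split(), refs))
```

The clipping step is what keeps a degenerate candidate such as "the the the the the the the" from scoring well on unigram precision, and the brevity penalty keeps very short candidates from scoring well on precision alone.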