Tenth MT Marathon 2015
Prague, Czech Republic
September 7-12, 2015
MTM 2015

The 10th Machine Translation Marathon

Published in: Technology

  1. Tenth MT Marathon 2015
     Prague, Czech Republic
     September 7-12, 2015
  2. Lectures
  3. MT Evaluation
     Yvette Graham (DCU)
     http://ufal.mff.cuni.cz/mtm15/files/01-mt-evaluation-yvette-graham.pdf
  7. Introduction to Machine Translation and Phrase-Based Machine Translation
     Aleš Tamchyna (UFAL)
     http://ufal.mff.cuni.cz/mtm15/files/04-pbmt-introduction-ales-tamchyna.pdf
  8. MT Talks
     Ondřej Bojar
     • http://mttalks.ufal.cz/
     • Mini-lectures on MT
     • Coding exercises that complement the lectures
  9. MT Talks
     Ondřej Bojar
     • Intro: Why is MT difficult, approaches to MT.
     • MT that Deceives: Serious translation errors even for short and simple inputs.
     • Pre-processing: Normalization and other technical tricks bound to help your MT system.
     • MT Evaluation in General: Techniques of judging MT quality, dimensions of translation quality, number of possible translations.
     • Automatic MT Evaluation: Two common automatic MT evaluation methods: PER and BLEU.
     • Data Acquisition: The need and possible sources of training data for MT, the diminishing utility of the new data additions due to Zipf's law.
     • Sentence Alignment: An introduction to the Gale & Church sentence alignment algorithm.
     • Word Alignment: Cutting the chicken-egg problem.
     • Phrase-based Model: Copy if you can.
     • Constituency Trees: Divide and conquer.
     • Dependency Trees: Trees with gaps.
     • Rich Vocabulary: Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz.
     • Scoring and Optimization: Features your model features.
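Slide 9 names PER (position-independent error rate) as one of two common automatic metrics. The sketch below shows one common formulation of PER, which compares hypothesis and reference as bags of words and ignores word order. It is an illustration only, not code from the MT Talks exercises; the function name `per` and the whitespace tokenization are assumptions.

```python
from collections import Counter


def per(hypothesis: str, reference: str) -> float:
    """Position-independent error rate against a single reference.

    Assumption: tokens are obtained by whitespace splitting.
    """
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one token")
    # Bag-of-words overlap: each reference token can be matched at most once.
    matches = sum((Counter(hyp) & Counter(ref)).values())
    # Unmatched tokens on the longer side count as errors,
    # normalized by the reference length.
    return (max(len(hyp), len(ref)) - matches) / len(ref)
```

Because word order is ignored, a hypothesis that scrambles the reference still gets a PER of zero, which is why PER is usually read alongside order-sensitive metrics such as BLEU.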
  10. Language Modelling
      Kenneth Heafield (University of Edinburgh)
      http://ufal.mff.cuni.cz/mtm15/files/09-language-modelling-kenneth-heafield.pdf
      • LM = 50% of CPU
  11. Discriminative Training
      Miloš Stanojević (ILLC, University of Amsterdam)
      http://ufal.mff.cuni.cz/mtm15/files/10-discriminative-training-milos-stanojevic.pdf
  12. Deep Syntactic MT and TectoMT
      Martin Popel (UFAL)
      http://ufal.mff.cuni.cz/mtm15/files/12-deep-syntactic-mt-and-tectomt-martin-popel.pdf
  13. Deep Syntactic MT and TectoMT
      Martin Popel (UFAL)
      http://ufal.mff.cuni.cz/mtm15/files/12-deep-syntactic-mt-and-tectomt-martin-popel.pdf
      • 1.2s per sentence
      • Worst in WMT 2015
      • Depfix can detect & fix negation, mostly tries to fix morphological agreement
      • Originally CS-EN but within QTLeap adapted to CS-EN, EN-ES, EN-NL, EN-PT, EN-EU
      • 67% errors from transfer, 30% from analysis
  14. Syntax-Based Models and Decoding
      Hieu Hoang (New York University, Abu Dhabi)
      http://ufal.mff.cuni.cz/mtm15/files/19a-syntax-based-models-hieu-hoang.pdf
      http://ufal.mff.cuni.cz/mtm15/files/19b-cyk-hieu-hoang.pdf
  15. Keynotes
  16. Real-World Application of a Machine Translation Workflow
      • Cost is not the most important driver; speed (shorter turnaround time) is.
      • Pricing is part of the business relationship. MT usage is only one of many driving factors.
  18. Neural Network Models and Google Translate
      Keith Stevens (Google)
      http://ufal.mff.cuni.cz/mtm15/files/11-neural-network-models-and-google-translate-keith-stevens.pdf
  26. Text Representations for NLP and MT
      Hinrich Schütze
      • Reduce sparseness with morphological analysis for better machine translation
      • MarMoT, a fast and accurate morphological tagger: http://cistern.cis.lmu.de/marmot/
  27. Text Representations for NLP and MT
      Hinrich Schütze
      • Use embeddings for lemmata, not for word forms
      • Embeddings and morphological resources provide complementary information: use both!
  30. Text Representations for NLP and MT
      Hinrich Schütze
      • Use lemmata for MT
      • Use embeddings for MT
      • Use linguistic morphological resources for MT
      • Don’t represent sentences as vectors for MT
      • Deep learning will not replace other MT work ...
      • ... but will be a powerful component of MT systems.
  31. Labs
  32. translate5
      http://www.translate5.net/login
      http://ufal.mff.cuni.cz/mtm15/files/03-translate5-lab-marc-mittag.pdf
      • Column-based approach on data
  33. eman
      https://ufal.mff.cuni.cz/eman/
      What is eman?
      • A tool for managing pipelines of steps.
      • Purpose independent, but bundled with an ecosystem of tools for machine translation.
      • Written in Perl 5, runs on Linux (and probably other Unices).
      • Tasks submitted locally or using SGE cluster.
      Key Features
      • Create complex experiment pipelines.
      • Clone whole experiments or individual steps.
      • Re-use existing steps when possible.
      • Automatically resolve complex step dependencies.
      • Seamlessly share steps with others.
      • Generate tables of results based on customizable rules.
      • Easily scriptable and hacking friendly.
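The eman slide lists dependency resolution and re-use of existing steps among its key features. The sketch below illustrates that general idea with a topological sort over a step graph. It is not eman's implementation or interface (eman is written in Perl); the function `plan`, the step names, and the rule that a finished step is simply skipped are all hypothetical simplifications.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def plan(steps: dict[str, set[str]], done: set[str]) -> list[str]:
    """Return the steps that still need to run, prerequisites first.

    `steps` maps each step name to the set of steps it depends on.
    `done` holds steps whose results already exist and can be re-used.
    """
    order = TopologicalSorter(steps).static_order()
    return [step for step in order if step not in done]


# Hypothetical MT experiment: step names are made up for illustration.
pipeline = {
    "corpus": set(),
    "align": {"corpus"},
    "lm": {"corpus"},
    "tm": {"align"},
    "tune": {"tm", "lm"},
}
remaining = plan(pipeline, done={"corpus", "align"})
```

Here `remaining` contains only `lm`, `tm`, and `tune`, with `tune` last because it depends on the other two. A real experiment manager would also need to decide when a finished step is stale, which this sketch does not attempt.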
  34. Box: Moses Suite on Amazon EC2
      http://www.boxresear.ch/
      Here's what Box v2015-05-25 beta (current release) includes:
      • cdec/ Popular SMT framework: http://www.cdec-decoder.org
      • cmph/ Hashing library (for compact phrase tables in Moses): http://cmph.sourceforge.net
      • ducttape/ Experiment management system (for cdec): https://github.com/jhclark/ducttape
      • eigen3/ Linear algebra library (for cdec): http://eigen.tuxfamily.org/index.php?title=Main_Page
      • fast_align/ Word alignment tool: https://github.com/clab/fast_align
      • giza-pp/ Word alignment package (for Moses): http://www.statmt.org/moses/giza/GIZA++.html
      • kenlm/ Language modeling toolkit: http://kheafield.com/code/kenlm/
      • mgiza/ Multi-threaded Giza++: http://www.kyloo.net/software/doku.php/mgiza:overview
      • mosesdecoder/ Popular SMT framework: http://www.statmt.org/moses/
      • multeval/ MT evaluation tool: https://github.com/jhclark/multeval
      • rnnlm/ Neural network language modeling toolkit: http://rnnlm.org
      • salm/ Suffix-array toolkit for NLP (for Moses): https://github.com/moses-smt/salm
      • scala/ Programming language (for cdec): http://www.scala-lang.org
      • vowpal_wabbit/ Machine learning toolkit compatible with Moses: http://hunch.net/~vw/
      • word2vec/ Continuous word representations: https://code.google.com/p/word2vec/
  35. Treex
      http://ufal.mff.cuni.cz/treex
  36. Papers
  37. MT-ComparEval
      Martin Popel
      • Graphical evaluation interface for Machine Translation development
      • Web-based tool for MT developers
      • Check progress of a system over time or compare several MT systems
      • Focus on analyzing system differences
      • API for uploading translations
      • Try it now: http://wmt.ufal.cz
      • Install it: https://github.com/choko/MT-ComparEval/
  38. MT-ComparEval
      Martin Popel
      • Online A = Bing?
      • Online B = Google?
      • systems = tasks
      • Newest version of BLEU
      • Some sentence level smoothing
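Slide 38 mentions BLEU with "some sentence level smoothing" but does not say which scheme MT-ComparEval uses. As an illustration of why smoothing matters for single sentences, the sketch below uses add-one smoothing on the higher-order n-gram precisions, a well-known option chosen here as an assumption. The function names, whitespace tokenization, and single-reference setup are also assumptions.

```python
import math
from collections import Counter


def ngram_counts(tokens: list[str], n: int) -> Counter:
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def smoothed_sentence_bleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    """Sentence-level BLEU with add-one smoothing for n > 1 (single reference)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp, n)
        # Clipped matches: an n-gram counts at most as often as it occurs in the reference.
        matched = sum((hyp_counts & ngram_counts(ref, n)).values())
        total = sum(hyp_counts.values())
        if n == 1:
            if matched == 0:
                return 0.0  # no word overlap at all
            precision = matched / total
        else:
            # Add-one smoothing keeps a missing higher-order match from zeroing the score.
            precision = (matched + 1) / (total + 1)
        log_precision += math.log(precision) / max_n
    # Brevity penalty for hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(log_precision)
```

Without smoothing, any sentence lacking a matching 4-gram would score exactly zero, which makes per-sentence comparisons between systems uninformative.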
  39. Joshua 6
      • (New!) Phrase-based decoder (no OSM or lexical distortion)
      • (New!) Language packs
  40. CloudLM: a Cloud-based Language Model for Machine Translation
  41. Evaluating MT systems with BEER
  42. Sampling Phrase Tables for the Moses Statistical Machine Translation System
  43. Projects
  44. Docker
  45. PDF 2 Bitext
  46. MT4NLTK
  47. Appraise++
      http://appraise.cf/
      An open-source system for manual evaluation of MT output. It supports collaborative collection of human feedback for MT evaluation. It implements tasks such as Translation Quality Checking, Ranking and Error Classification, and Manual Post-Editing.
  48. LM prefetch
  49. Segmentation-Aware Language Model
  50. Deep Machine Translation Workshop 2015
      Prague, Czech Republic
      September 3-4, 2015
  51. TectoMT Seminar 2015
      Prague, Czech Republic
      September 3-4, 2015
      100% Acceptance Rate!
  52. Charles University in Prague
      Faculty of Mathematics and Physics
      Institute of Formal and Applied Linguistics
  53. Trdelník
  54. Vltava
  55. Národní technické muzeum
  56. Thank you!
