A brief introduction to Statistical Machine Translation, up to IBM Model 5. Given on 29/5/2015 at the weekly seminar of the postgraduate program of the AUEB Department of Informatics.
Statistical MT Presentation 20150529
1. Machine Translation (old)
Statistical Machine Translation
Machine Translation
The Statistical Paradigm
Dimitris Mavroeidis
Department of Embodied Language Processing
Cognitive Systems Research Institute
Athens, 29/05/2015
Dimitris Mavroeidis Machine Translation
Rule-based
Advantages
Only lexica and rules are needed; no parallel documents.
Any domain, any genre.
High quality translation!
Rule-based
Challenges
Good lexica are scarce.
Linguistic information must be added manually.
Handles ambiguity and idioms poorly.
Very slow to deploy.
Needs many resources.
Very expensive!
Statistical Machine Translation
Technical stuff
IBM Model 3
Fertility model: the number of target-language words generated by each source-language word.
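The fertility idea can be illustrated with a minimal sketch. It assumes a word alignment is already given as a list that maps each target position to the source position that generated it; the function name `fertilities` and the toy sentence pair are hypothetical, chosen only for illustration. The real Model 3 learns a probability distribution over these counts from data, whereas the sketch only counts them for one aligned sentence pair.

```python
from collections import Counter


def fertilities(source_words, alignment):
    """Count how many target words each source word generates.

    alignment[j] is the index of the source word that generated target
    word j, or None if target word j came from the empty (NULL) word.
    """
    counts = Counter(i for i in alignment if i is not None)
    return [counts.get(i, 0) for i in range(len(source_words))]


# Hypothetical toy example (not from the talk).
source = ["I", "do", "not", "sing"]
target = ["je", "ne", "chante", "pas"]
alignment = [0, 2, 3, 2]  # je<-I, ne<-not, chante<-sing, pas<-not

for word, phi in zip(source, fertilities(source, alignment)):
    print(f"{word}: fertility {phi}")
```

In this toy pair "not" has fertility 2 (it generates both "ne" and "pas") and "do" has fertility 0 (it generates nothing).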
Statistical Machine Translation
Technical stuff
IBM Model 5
Vacant-position tracking model: fixes the deficiency of Models 3 and 4:
Assigns no probability to impossible translations.
Rules out placing more than one word in the same position.
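Why tracking vacant positions matters can be shown with a small numerical sketch. It makes a strong simplifying assumption: every placement choice is uniform, which is not how the IBM models are actually parameterized. The two function names are hypothetical. The sketch compares how much probability mass lands on collision-free placements when words pick positions independently versus when each word may only pick a position that is still vacant.

```python
from itertools import product


def valid_mass_independent(n):
    """Mass on collision-free placements when each of n words picks one of
    n positions independently and uniformly (two words may collide)."""
    total = 0.0
    for placement in product(range(n), repeat=n):
        if len(set(placement)) == n:  # no position used twice
            total += (1.0 / n) ** n
    return total


def valid_mass_vacancy_aware(n):
    """Same quantity when each word picks uniformly among the positions
    that are still vacant, so collisions cannot happen."""
    def recurse(vacant, prob):
        if not vacant:
            return prob
        step = 1.0 / len(vacant)
        return sum(recurse(vacant - {j}, prob * step) for j in vacant)

    return recurse(frozenset(range(n)), 1.0)


for n in (2, 3, 4):
    print(n, round(valid_mass_independent(n), 4),
          round(valid_mass_vacancy_aware(n), 4))
```

Under the independent scheme part of the probability mass is wasted on placements that put several words in one position; restricting each choice to vacant positions keeps all of the mass on well-formed outputs.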
Outline
Machine Translation (old)
Rule-based
Example-based
Statistical Machine Translation
Main Ideas
IBM Models
Advanced SMT Models
Statistical Machine Translation
Advanced models
Phrase-based
Translate sequences of words (phrases) instead of single words.
Current state of the art.
Better quality than the IBM Models.
Results depend heavily on the training data.
Tree-based
Adds syntactic information to Statistical Machine Translation.
Factored
Adds any kind of information (syntactic, linguistic, morphological, etc.).
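For the phrase-based approach above, a common way to obtain phrase pairs is to extract them from word-aligned sentence pairs. The following is a simplified sketch of that idea, not the procedure used by any particular toolkit: it keeps a source span and its minimal target span only when no alignment link crosses the boundary of the pair, and it does not extend spans over unaligned words. The function name `extract_phrase_pairs`, the `max_len` parameter, and the toy sentence pair are assumptions made for illustration.

```python
def extract_phrase_pairs(src, tgt, alignment, max_len=3):
    """Return (source phrase, target phrase) pairs consistent with the
    alignment, given as a list of (source index, target index) links."""
    pairs = []
    for s1 in range(len(src)):
        for s2 in range(s1, min(s1 + max_len, len(src))):
            # Target positions linked to the source span [s1, s2].
            linked = [t for s, t in alignment if s1 <= s <= s2]
            if not linked:
                continue
            t1, t2 = min(linked), max(linked)
            if t2 - t1 + 1 > max_len:
                continue
            # Consistency: every link into the target span must come
            # from inside the source span.
            if all(s1 <= s <= s2 for s, t in alignment if t1 <= t <= t2):
                pairs.append((" ".join(src[s1:s2 + 1]),
                              " ".join(tgt[t1:t2 + 1])))
    return pairs


# Hypothetical toy example (not from the talk).
src = ["the", "green", "house"]
tgt = ["la", "maison", "verte"]
alignment = [(0, 0), (1, 2), (2, 1)]

for pair in extract_phrase_pairs(src, tgt, alignment):
    print(pair)
```

The pair ("green house", "maison verte") is extracted as a unit, which captures the local reordering that a word-by-word model has to handle separately. The span "the green" yields no pair, because its target span would also cover "maison", which is linked to a word outside the span.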
Statistical Machine Translation
Hands-On
Hands-on with the "GIZA++" and "Moses" statistical translation systems