Machine translation approaches.


Published on

Machine translation is about automating the task of translating between two or more languages. This covers several approaches and their behaviors in the area.

Published in: Education
  • Be the first to comment

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Machine translation approaches.

  1. 1. By H.N.Gunasinghe – AS2010379 Machine Translation CSC 364 1.5 Seminar II Department of Computer Science and Statistics , USJP
  2. 2.  Introduction  Machine Translation Approaches  Applications  Summery Content 2
  3. 3.  Machine translation (MT) refers to the use of computers to automate some of the tasks or the entire task of translating between human languages.  Ex:-  Electronic resources and tools are limited. Introduction Language 1 MT System Language 2 I’m speaking MT System මමමමම මමමම 3
  4. 4.  Linguistically rich area  India has 18 constitutional languages  Sri Lanka has Sinhala & Tamil  Different Scripts  Indian languages with 10 different scripts  Sri Lankan languages with 2 different scripts  Foreign language widely used in media  English English  Sinhala Sinhala  Tamil Usage of MT 4
  5. 5. MT System Bilingual Unidirectional Bidirectional Multilingual Bidirectional MT Design SL TL SL TL 5
  6. 6. MT approaches MT System Human translation with machine support Machine translation with human support Fully automated translation Rule Based MT Direct MT Transfer MT Interlingua Knowledge based MT Principle based MT Empirical MT Statistical MT Word based Translation Phrase based Translation Hierarchical phrase based translation Example Based MT Online interactive MT Hybrid MT 6
  7. 7. Rule based approach Direct Transfer Interlingua Rule-based Approach Grammar Rules Lexicon Software Program 7
  8. 8.  In the direct translation method, the SL text is analyzed structurally up to the morphological level.  It is designed for a specific source and target language pair.  Performance depends on  The quality and quantity of the source-target language dictionaries  Morphological analysis  Text processing software  Word-by-word translation with minor grammatical adjustments on word order  Morphology. 1. Direct translation 8
  9. 9. SL Interlingua TL 2. Interlingua Based Translation 9 advantages The analyzer of the parser for the SL is independent of the generator for the TL. disadvantages • difficulty in defining the interlingua. • Interlingua does not take the advantage of similarities between languages
  10. 10. 3. Transfer Based Translation Generation a TL morphological analyzer is used to generate the final TL texts Transfer the result is converted into equivalent TL-oriented representations Analysis the SL parser is used to produce the syntactic representation of a SL sentence 10
  11. 11.  One of Empirical Machine Translation (EMT) systems - rely on large parallel aligned corpora.  A data-oriented statistical framework based on the knowledge and statistical models extracted from bilingual corpora. Statistical-based Approach Decoding Select best translation for input sentence Training Corpora ML algorithm Statistical table 11
  12. 12. Translate word by word Arrange to get the target sentence Word Based Statistical-based Approach (ctd) Phrase Based Hierarchical Phrase Based  A more accurate.  Each source and target sentence is divided into separate phrases.  The alignment between the phrases in the input and output sentences normally follows certain patterns  The advantage of this approach is that hierarchical phrases have recursive structures instead of simple phrases. 12
  13. 13. Hybrid-based Translation SBT RBT Hybrid based translation Method I SL input RBTS SBTS RBTS TL output Preprocessing Translate Postprocessing 13
  14. 14. Method II Input sentence SL RBTS Output sentence 1 TL1 Output sentence 2 TL2 Output sentence 3 TL3 SBTS Output sentence TL Translate Correct 14 Hybrid-based Translation (ctd)
  15. 15.  Example-based translation is essentially translation by analogy.  An EBMT system is given a set of sentences in the SL and their corresponding translations in the TL, and uses those examples to translate other, similar source-language sentences into the TL.  The basic premise is that, if a previously translated sentence occurs again, the same translation is likely to be correct again.  EBMT systems are attractive in that they require a minimum of prior knowledge; therefore, they are quickly adaptable to many language pairs.  A restricted form of example-based translation is available commercially, known as a translation memory.  In a translation memory, as the user translates text, the translations are added to a database, and when the same sentence occurs again, the previous translation is inserted into the translated document. Example-based translation 15
  16. 16.  In this interactive translation system, the user is allowed to suggest the correct translation to the translator online.  Useful in a situation  where the context of a word is unclear  there exists many possible meanings for a particular word.  In such cases, the structural ambiguity can be solved with the interpretation of the user. Online Interactive Systems 16
  17. 17. - Rest of this presentation was done by anther person. Including Applications and Summery - Thank you! 17