Methods for Handling Terminology in Machine Translation

The talk presents the options, advantages, and disadvantages of the various MT solutions in the SDL Language Cloud. Of particular interest is Adaptive MT, a special type of MT system that learns from continuous corrections, that is, user-specific adjustments to MT suggestions: the user's post-edits are used to optimize the engine. This technique will also play an important role in neural machine translation at SDL.

Event: ETUG 2017, Nuremberg

Published in: Business

  1. Methods for Handling Terminology in Machine Translation. Christiane Mieth and Christian Eisold
  2. Lean processes and systems for multilingual content. Independent of software and translation vendors. Best system solutions for our clients.
  3. Agenda: (1) Brief Introduction to Machine Translation and Domain Adaptation; (2) Machine Translation in the SDL Tool Environment; (3) Demo: Industry Engines vs. Baseline Engines, Training the Engine to Use Specific Terminology with AdaptiveMT, Using Dictionaries in SDL Language Cloud
  4. SMT Basics. Diagram labels: MT system; engine with translation model and language model; bilingual aligned corpus (source, target); Adaptive MT corrections. Sample text: "A seminar for media on 'A Fair Share in Europe - Creating a Common Asylum System' is organised by the EP press service with leading MEPs in the field, including EP rapporteur for a reformed EU asylum policy Cecilia Wikström and Migration Commissioner Dimitris Avramopoulos. The seminar will end with an exchange of views with President Antonio Tajani."
  5. MT - Domain Adaptation & Terminology. What is domain adaptation? Adapting a translation system to the specific content (jargon and terminology) of technical corpora by training the system with text from that domain. 1 domain, 1 engine. Domains: Chemistry, Software, News, Engineering. Sub-domains: Sales, Marketing, Development, User-generated content.
  6. Domain Adaptation in SMT. Each SMT engine is trained with domain-specific corpora. Diagram labels: IT, Life Science, Automotive, Electronics; single-domain corpus; MT system; Electronics.
  7. MT - Terminology & Pre-Processing of In-Domain Texts. Term extraction: search for term variants in the training documents. Normalization: no 1:n, n:1 or n:n relations in training and source corpora. One form, one meaning: no synonyms, no homographs. Examples: Computer, Rechner, PC → Rechner; Software, Anwendung, Programm → Anwendung. Relation types to avoid: dator, persondator, pc – Computer, Rechner, PC (n:n); dator – Computer, Rechner, PC (1:n); dator, persondator, pc – Rechner (n:1). Tools: authoring tools + grammar checkers; termbase. Variants flagged with "!": Antischlupfregelung, Antriebsschlupfregelung, Automatische Stabilitäts Control, Traktionskontrolle, TCS, TRACS.
  8. MT - Domain Adaptation & Terminology: benefits and risks. Benefit: Accuracy. Training with in-domain data fits your in-domain texts better than training with unfiltered texts. → Risk: Overfitting. Training only on your in-domain data results in overfitting. Benefit: Less effort with pre-trained engines. How well a pre-trained in-domain engine fits your data depends on its level of specificity. → Risk: No customizing with pre-trained engines. They will not cover customized terms. Benefit: Consistency, speed. Using engines together with term bases enables a fast (lexical) switch between different terminologies. → Risk: Possible shortcomings. Term inflection; text quality has to be as good as possible.
  9. Terminology Integration in MT Workflows. Diagram labels: term extraction; term coordination; term admission & translation; term base; authoring memory; texts (source language); editing; terminology management; CAT tool; translations (target language); translation engine; MT system; post-editing & update; Adaptive MT; runtime integration.
  10. Agenda (same as slide 3)
  11. MT in the Translation Process Chain (Editing, Translation, Publishing). Generate MT with the MT engine, then post-edit. Two options: import MT while pre-translating, or use MT in your editor while translating.
  12. MT in the Translation Process Chain (Editing, Translation, Publishing). MT engine and post-editing: look-up for engine matches; pre-translation with found matches; select engine in project settings; post-editing.
  13. MT in the Translation Process Chain (Editing, Translation, Publishing). MT engine: look for matches while translating; matches displayed in TM results; select engine afterwards; engine as additional translation source.
  14. MT Systems in the SDL Environment
  15. SDL Statistical MT: BeGlobal, Language Cloud, ETS. Diagram labels: BeGlobal Community; BeGlobal Enterprise; SDL BeGlobal; SDL Language Cloud MT; AdaptiveMT; Baseline Engines; Translator; SDL ETS; On Premise.
  16. MT Functions in the SDL Environment: SDL Trados Studio 2017 (✓ = supported, ≈ = partly supported)
      MT use in pre-translation (import): ✓ Project settings
      MT use in translation (editor): ✓ Translation results
      MT segment status: ≈ No status, only Origin
      Penalties for MT definable: ✓ Project settings
      Clear differentiation of MT and TM matches: ≈ In editor: yes, in TM: no
      Formatting adoption: ≈ Depends on engine
      Display of MT matches in reports: ≈ Only Language Cloud (AdaptiveMT)
      Use of custom MT engines: ✓ Based on baseline engines
      Use of terminology for MT training: ≈ Only in Trados Studio 2014/15
      Use of existing industry engines: ✓ 6 SDL industry engines (LC)
  17. Industry Engines in Trados Studio: Automotive, Travel, Printer, IT, Electronics, Life Science
  18. Adaptive Engines in Trados Studio (source language → target languages): Dutch → English; English → Dutch, French, German, Italian, Spanish; French → English; Italian → English
  19. Choose your Engine in Trados Studio
  20. Agenda (same as slide 3)
  21. Thank you very much! @blcTeam, +49 (0) 211 22 06 77 0, info@berns-language-consulting.de, www.berns-language-consulting.de, www.facebook.com/bernslanguageconsulting
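
Note on slide 4: the slide names a translation model and a language model as the components of an SMT engine. The slides do not say how the two are combined. In the standard textbook (noisy-channel) formulation of SMT, which is assumed here and not taken from the deck, the best translation of a source sentence f is the candidate e that maximizes the product of the two model scores:

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f)
        \;=\; \arg\max_{e}\;
              \underbrace{P(f \mid e)}_{\text{translation model}}
              \;\cdot\;
              \underbrace{P(e)}_{\text{language model}}
```

Under this reading, the translation model is estimated from the bilingual aligned corpus, and the language model scores how plausible e is as target-language text. Training on in-domain data therefore influences both which term equivalents are preferred and which target wording is considered fluent.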

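Note on slide 7: the normalization rule "one form, one meaning" can be illustrated with a small sketch. The code below is not SDL functionality and does not come from the slides; `TERMBASE`, `build_variant_index`, and `normalize` are hypothetical names, and the direction of the mapping (variant to preferred term) is an assumption based on the slide's German examples. It replaces known variants with the preferred term before the text is used for training, and it rejects a termbase in which one variant points to two preferred terms.

```python
import re

# Hypothetical termbase: preferred term -> variants to be replaced.
# The German terms are taken from slide 7; the structure is an assumption.
TERMBASE = {
    "Rechner": ["Computer", "PC"],
    "Anwendung": ["Software", "Programm"],
}


def build_variant_index(termbase):
    """Map each lowercased variant to its preferred term.

    Raises ValueError if a variant is assigned to more than one
    preferred term, since that would violate "one form, one meaning".
    """
    index = {}
    for preferred, variants in termbase.items():
        for variant in variants:
            key = variant.lower()
            if key in index and index[key] != preferred:
                raise ValueError(
                    f"variant {variant!r} maps to more than one preferred term"
                )
            index[key] = preferred
    return index


def normalize(text, termbase):
    """Replace whole-word occurrences of known variants (case-insensitive)."""
    index = build_variant_index(termbase)
    if not index:
        return text
    # Longest variants first, so multi-word variants win over shorter ones.
    alternatives = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: index[m.group(1).lower()], text)


if __name__ == "__main__":
    print(normalize("Computer und Software", TERMBASE))  # Rechner und Anwendung
```

The sketch matches exact surface forms only. An inflected form such as "Computers" is left untouched, and replacing a noun does not adjust articles or adjective endings, which is the kind of limitation slide 8 lists under "term inflection".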