A subset of the presentation that I use for my "Introduction to translation technologies" course at Lessius Hogeschool, Antwerp (Belgium).

Introduction To Translation Technologies

  Introduction to translation technologies Gerrit Sanders Computer-Assisted Translation
  Computer-assisted translation (CAT) or computer-aided translation is a translation process in which a human translator uses software to obtain a higher degree of precision and efficiency.
  Typical components of a CAT solution include:
  • Translation memory (TM)
  • Translation editor
  • Data mining tools: alignment and term extraction
  • Termbase
  • Quality assurance
  • Translation management system (TMS)
  Translation memory (TM)
  A translation memory (TM) is a database that stores sentences and their translations for reuse in new translation projects. Example: "This is a sentence." is stored together with its translation "Ceci est une phrase."
  A record in the translation memory is called a translation unit (TU). Example:
  • Source segment: This is a sentence.
  • Target segment: Ceci est une phrase.
  • Information fields: Created on: 18/09/2006; Created by: Gerrit; Customer: ACME; Project: Training
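  The translation unit described above can be sketched as a small data structure. This is a minimal illustration only: the class names `TranslationUnit` and `TranslationMemory` and the field keys are hypothetical, and real tools store many more details and use a real database.

```python
from dataclasses import dataclass, field


@dataclass
class TranslationUnit:
    """One TM record: source segment, target segment, information fields."""
    source: str
    target: str
    fields: dict = field(default_factory=dict)


class TranslationMemory:
    """A toy in-memory TM that only supports lookups of identical segments."""

    def __init__(self):
        self._units = {}

    def add(self, unit):
        # Index by source segment; a later unit with the same source overwrites.
        self._units[unit.source] = unit

    def lookup(self, source):
        # Return the stored unit for an identical source segment, or None.
        return self._units.get(source)


tm = TranslationMemory()
tm.add(TranslationUnit(
    source="This is a sentence.",
    target="Ceci est une phrase.",
    fields={"created_on": "18/09/2006", "created_by": "Gerrit",
            "customer": "ACME", "project": "Training"},
))

hit = tm.lookup("This is a sentence.")
print(hit.target if hit else "no match")
```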
  Segmentation is the process of splitting the new source text into logical, reusable units. Segmentation can be either sentence-based or paragraph-based.
  Paragraph-based segmentation:
  1 Welcome to Brussels
  2 Brussels is the capital of Belgium. It is officially bilingual.
  Sentence-based segmentation:
  1 Welcome to Brussels
  2 Brussels is the capital of Belgium.
  3 It is officially bilingual.
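  The difference between the two segmentation modes can be sketched in a few lines. This is a deliberately naive illustration: the function names are hypothetical, paragraphs are assumed to be separated by line breaks, and the sentence rule (break after ".", "!" or "?" followed by whitespace) would wrongly split after abbreviations such as "e.g.", which real tools handle with exception rules.

```python
import re


def segment_paragraphs(text):
    """Paragraph-based segmentation: one segment per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def segment_sentences(text):
    """Sentence-based segmentation: split each paragraph into sentences."""
    segments = []
    for paragraph in segment_paragraphs(text):
        # Break after sentence-final punctuation that is followed by whitespace.
        segments.extend(re.split(r"(?<=[.!?])\s+", paragraph))
    return segments


text = ("Welcome to Brussels\n"
        "Brussels is the capital of Belgium. It is officially bilingual.")

for number, segment in enumerate(segment_sentences(text), start=1):
    print(number, segment)
```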
  Translation memory (TM) match types:
  • 0%, no match: the new source segment is not found in the TM.
  • 99% or lower, fuzzy match: the new source segment is similar (but not identical) to a source segment found in the TM.
  • 100%, exact match: the new source segment is identical to a source segment found in the TM.
  • 101% ??, context match: the new source segment is identical to a source segment found in the TM and they both have the same context.
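  The match categories above can be illustrated with a rough similarity score. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in; commercial tools use their own scoring, so the percentages will differ. The `threshold` of 70 below which a segment counts as "no match" is an assumption for illustration, and context matches are not modelled.

```python
from difflib import SequenceMatcher


def match_score(new_segment, tm_segment):
    """Similarity as a whole percentage (0-100), based on difflib's ratio."""
    return int(SequenceMatcher(None, new_segment, tm_segment).ratio() * 100)


def classify(new_segment, tm_segments, threshold=70):
    """Return (label, best score) for a new segment against the TM sources."""
    if new_segment in tm_segments:
        return "exact match", 100
    # Best score over all stored source segments; 0 if the TM is empty.
    best = max((match_score(new_segment, s) for s in tm_segments), default=0)
    best = min(best, 99)  # only identical segments may score 100
    if best >= threshold:
        return "fuzzy match", best
    return "no match", best


tm_sources = ["This is a sentence."]
for candidate in ["This is a sentence.", "This is a sentence!", "xyz"]:
    print(candidate, "->", classify(candidate, tm_sources))
```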
  • Most translation memory tools support TMX (Translation Memory eXchange), an XML-based open standard for the exchange of translation memory data.
  • TMX is developed and maintained by LISA.
  • TMX does not ensure 100% compatibility between different translation tools: e.g. segmentation or formatting may be handled in different ways.
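  To give an idea of what TMX data looks like, here is a minimal sketch that reads a small TMX document with Python's standard `xml.etree.ElementTree`. The sample follows the basic TMX 1.4 layout (`tu`, `tuv`, `seg`); the header attribute values and the helper name `read_tmx` are made up for the example, and inline formatting tags are simply flattened to text.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="example-tool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="example" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>This is a sentence.</seg></tuv>
      <tuv xml:lang="fr"><seg>Ceci est une phrase.</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_tmx(tmx_text):
    """Return one {language: segment text} dict per translation unit."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            # itertext() also collects text that sits around inline markup.
            unit[tuv.get(XML_LANG)] = "".join(seg.itertext())
        units.append(unit)
    return units


print(read_tmx(TMX_SAMPLE))
```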
  • SRX (Segmentation Rules eXchange) is an XML-based open standard for the exchange of segmentation rules.
  • Without SRX, TMX leverage may be lower than expected.
  • SRX is developed and maintained by LISA.
  • SRX is currently not supported by SDL Trados.
  Translation editor
  • A translation editor is the translator's working environment, offering easy access to source and target segments.
  • Translation editors typically include spelling checkers in a wide variety of languages, and may enable the user to add comments or status indications to each translation.
  • File filters convert the source document to a translatable (or localizable) format, such as XLIFF.
  [Diagram: source document → file filters → translation editor (XLIFF) → file filters → target document. Formats shown for both source and target documents: HTML, DLL, EXE, PowerPoint, InDesign, PHP, SGML, FrameMaker, DOCX, PDF, RTF, QuarkXPress, OpenOffice, Excel, TXT, XML, DITA, PageMaker.]
  • XLIFF (XML Localization Interchange File Format) is an XML-based open standard for translatable (or localizable) files.
  • XLIFF is developed and maintained by OASIS.
  • There are various "flavours" of XLIFF (e.g. SDLXLIFF), which in practice complicates the interchange of XLIFF data between different tools.
  [Diagram: an XLIFF file holds the localization data (source and target), alongside a skeleton that holds the other data.]
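  As an illustration of the source/target/skeleton split, the sketch below reads a tiny XLIFF 1.2 document with Python's standard `xml.etree.ElementTree`. The file names (`hello.html`, `hello.html.skl`) and the helper name `read_xliff` are invented for the example; tool-specific flavours add their own elements and attributes, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 elements live in this namespace.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

XLIFF_SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="hello.html" source-language="en" target-language="fr" datatype="html">
    <header>
      <skl><external-file href="hello.html.skl"/></skl>
    </header>
    <body>
      <trans-unit id="1">
        <source>This is a sentence.</source>
        <target>Ceci est une phrase.</target>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def read_xliff(xliff_text):
    """Return (id, source text, target text) for every trans-unit."""
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.iterfind(".//x:trans-unit", NS):
        source = unit.find("x:source", NS)
        target = unit.find("x:target", NS)
        units.append((
            unit.get("id"),
            "".join(source.itertext()),
            # A segment that is not translated yet may have no <target>.
            "".join(target.itertext()) if target is not None else "",
        ))
    return units


root = ET.fromstring(XLIFF_SAMPLE)
skeleton_href = root.find(".//x:external-file", NS).get("href")
print(read_xliff(XLIFF_SAMPLE), skeleton_href)
```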
  Alignment
  Alignment is the process in which specialized software compares a source text with its translation, matching equivalent segments, e.g. for the purpose of creating a translation memory. In a semi-automatic alignment process, the alignment results are reviewed and misalignments are corrected by a human linguist.
  [Diagram: legacy documents (source file and target file) → segmentation + alignment → revision → export → TMX → import → translation memory.]
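  A very simplified picture of semi-automatic alignment: pair the segments of both files by position and flag suspicious pairs for the human reviewer. This is not how professional aligners work (they can also merge or split segments); the function name `align_one_to_one`, the French sample sentences and the `max_ratio` of 2.0 are assumptions made for this sketch.

```python
def align_one_to_one(source_segments, target_segments, max_ratio=2.0):
    """Pair segments by position and flag pairs whose lengths differ a lot."""
    if len(source_segments) != len(target_segments):
        raise ValueError("segment counts differ; 1:1 alignment is not possible")
    aligned = []
    for src, tgt in zip(source_segments, target_segments):
        longer = max(len(src), len(tgt))
        shorter = max(1, min(len(src), len(tgt)))  # avoid division by zero
        needs_review = longer / shorter > max_ratio
        aligned.append((src, tgt, needs_review))
    return aligned


source = ["Welcome to Brussels",
          "Brussels is the capital of Belgium.",
          "It is officially bilingual."]
target = ["Bienvenue à Bruxelles",
          "Bruxelles est la capitale de la Belgique.",
          "Oui."]

for src, tgt, needs_review in align_one_to_one(source, target):
    print("REVIEW" if needs_review else "ok", "|", src, "|", tgt)
```

  Pairs that pass the check would then be exported (e.g. as TMX) and imported into the translation memory; flagged pairs go to the linguist first.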
  Termbase
  Termbase entry structure:
  • Entry level: Subject, Note
  • Language level (English): Definition, Source
    • Term level: Term, Gender, Source
    • Term level: Term, Gender, Source
  • Language level (French): Definition, Source
    • Term level: Term, Gender, Source
  Your concept may look like this: [image]. All terms and synonyms referring to the same concept should be stored in the same entry: car, motorcar, automobile, voiture, bagnole, ... This will ensure that each language in your termbase can be used as source or target language.
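  The benefit of storing all terms for one concept in a single entry can be shown with a small sketch. The dictionary layout, the `subject` value and the `lookup` helper are hypothetical; the point is only that the same entry answers queries in either direction.

```python
# One concept-oriented entry: all terms and synonyms for the same concept.
car_entry = {
    "subject": "automotive",  # hypothetical subject field
    "languages": {
        "en": ["car", "motorcar", "automobile"],
        "fr": ["voiture", "bagnole"],
    },
}


def lookup(termbase, term, source_lang, target_lang):
    """Return the target-language terms of every entry that contains `term`."""
    results = []
    for entry in termbase:
        if term in entry["languages"].get(source_lang, []):
            results.extend(entry["languages"].get(target_lang, []))
    return results


termbase = [car_entry]
print(lookup(termbase, "car", "en", "fr"))      # English as source language
print(lookup(termbase, "bagnole", "fr", "en"))  # French as source language
```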
  • TBX (TermBase eXchange) is an XML-based open standard for exchanging structured terminological data.
  • The TBX standard is developed by LISA and has also been published as an ISO standard.
  Term extraction (or terminology extraction) is the process of extracting mono- or bilingual lists of potentially interesting terms from a selection of electronic texts.
  Linguistic term extraction:
  • uses grammatical information to identify term candidates (and their translations)
  • is language dependent
  Statistical term extraction:
  • looks for repeated sequences of lexical items
  • is language independent
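  The statistical approach can be sketched as counting repeated word sequences. This is a bare-bones illustration: the function name and the parameters `min_len`, `max_len` and `min_count` are assumptions, no stop-word list is used (that would make it language dependent), and the simple tokenizer only suits languages that separate words with spaces.

```python
import re
from collections import Counter


def extract_candidates(text, min_len=2, max_len=3, min_count=2):
    """Return word sequences of min_len..max_len words seen at least min_count times."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return {gram: count for gram, count in counts.items() if count >= min_count}


sample = ("The translation memory stores segments. "
          "A translation memory is a database.")
print(extract_candidates(sample))
```

  The output is a list of term candidates only; a terminologist still has to decide which of them are actual terms.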
  Quality assurance (QA)
  Quality assurance (QA) tools detect formal errors in translations and/or translation memories, and enable their correction. Traceable errors include omissions, inconsistent translations, punctuation differences, formatting problems, terminology errors, etc. Note that QA tools do NOT guarantee a flawless translation!
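  Three of the formal checks listed above (omissions, inconsistent translations, punctuation differences) can be sketched as follows. The function name `qa_check`, the warning texts and the sample pairs are made up for illustration; the checks are purely formal and say nothing about whether a translation is actually correct.

```python
END_PUNCTUATION = set(".!?:;")


def qa_check(pairs):
    """Return (index, warning) tuples for a list of (source, target) pairs."""
    warnings = []
    first_target = {}  # first translation seen for each source segment
    for index, (source, target) in enumerate(pairs):
        if not target.strip():
            warnings.append((index, "omission"))
            continue
        if first_target.setdefault(source, target) != target:
            warnings.append((index, "inconsistent translation"))
        source_end = source.strip()[-1:]
        target_end = target.strip()[-1:]
        # Only compare when the source ends in a punctuation mark.
        if source_end in END_PUNCTUATION and source_end != target_end:
            warnings.append((index, "punctuation difference"))
    return warnings


pairs = [
    ("This is a sentence.", "Ceci est une phrase."),
    ("This is a sentence.", "Voici une phrase."),
    ("Welcome to Brussels!", "Bienvenue à Bruxelles"),
    ("It is officially bilingual.", ""),
]
for index, warning in qa_check(pairs):
    print(index, warning)
```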
