5. Manuel Arcedillo & Juanjo Arevalillo (Hermes): Translation memories
Presentation Transcript

  • Translation memories (Hermes Traducciones y Servicios Lingüísticos)
  • A brief history…
  • Processes have changed… but not the ultimate goal.
  • Productivity
  • Found in Translation, Nataly Kelly & Jost Zetzsche (2012)
  • LAN (diagram labels): Translation Memory, LAN Server, Project Managers, Translators, Engineering, Revisers
  • WAN (diagram labels): Translator, Reviser, DTPer and Project Manager connected over the Internet to the Translation Memory, LAN Server, Project Managers, Translators, Engineering, Revisers
  • Clouding, crowdsourcing, MT, TEnTs, CAT, SaaS (diagram labels): Translation Memory, LAN Server, Project Managers, Engineering, Revisers, Translators
  • Internals of a translation memory
  • Translation Memory Exchange
      • OSCAR (Open Standards for Container/Content Allowing Re-use).
      • TMX standard (Translation Memory eXchange).
      • Leveraging of translation memories regardless of the tool or platform.
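
    To make the exchange idea concrete, here is a minimal sketch of reading a TMX file with Python's standard library. The sample document and the `read_tmx` helper are illustrative assumptions, not part of the slides; the sketch only relies on the basic TMX layout of `tu` (translation unit), `tuv` (one language variant) and `seg` (segment text).

```python
import xml.etree.ElementTree as ET

# xml:lang belongs to the reserved XML namespace; ElementTree exposes it under this key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written sample in TMX 1.4 layout (illustrative content, not from the slides).
SAMPLE_TMX = """
<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="es"><seg>Guarde el archivo.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Close the window.</seg></tuv>
      <tuv xml:lang="es"><seg>Cierre la ventana.</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_tmx(text, src_lang, tgt_lang):
    """Return (source, target) pairs for the requested language codes."""
    pairs = []
    root = ET.fromstring(text)
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                segs[tuv.get(XML_LANG, "").lower()] = "".join(seg.itertext())
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


pairs = read_tmx(SAMPLE_TMX, "en", "es")
for source, target in pairs:
    print(source, "->", target)
```

    Because the translation units are plain XML, any tool that understands this layout can load them, which is the point of a tool-independent exchange format.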
  • The ancestors of CAT tools… XL8, a DOS tool used in a workflow known as XLN
  • IBM TranslationManager (screenshot labels): translation proposal, exact match, source text, proposed terms in dictionary
  • Trados Workbench
  • Déjà-Vu
  • Star Transit (no memory!)
  • WordFast
  • SDLx
  • memoQ
  • OmegaT (free!)
  • Workflow tools: Across
  • Across
  • SDL Idiom World Server
  • Specialised tools: Catalyst
  • Specialised tools: Passolo
  • Basic TM features in CAT tools
      • Leverage of previous translations.
      • Analysis for quoting, planning and keeping track of progress.
      • Concordance for sub-segment searches.
      • Maintenance to perform global changes, import/export content, etc.
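
    As a rough illustration of the concordance feature, the sketch below searches the source side of a small in-memory TM for a phrase that is shorter than a full segment. The `concordance` helper and the sample TM are hypothetical; real tools typically use indexes and fuzzy sub-segment matching rather than a plain substring test.

```python
def concordance(phrase, tm_pairs):
    """Return every (source, target) pair whose source text contains the phrase."""
    needle = phrase.lower()
    return [(src, tgt) for src, tgt in tm_pairs if needle in src.lower()]


# Illustrative TM content (not from the slides).
TM = [
    ("Click the Save button.", "Haga clic en el botón Guardar."),
    ("Close the window.", "Cierre la ventana."),
    ("The Save button is disabled.", "El botón Guardar está desactivado."),
]

for src, tgt in concordance("save button", TM):
    print(src, "->", tgt)
```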
  • Leveraging TMs: CAT tools provide answers to these questions:
      • What is the fuzzy match of the segment?
      • What parts of the text are different?
      • Where is the match coming from?
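
    The three questions above can be sketched with Python's `difflib`. This is only an illustration under stated assumptions: CAT tools use their own fuzzy matching algorithms, and the word-level `SequenceMatcher` ratio, the `fuzzy_match` helper and the sample TM entries are stand-ins chosen for this example.

```python
from difflib import SequenceMatcher


def fuzzy_match(new_segment, tm_entries):
    """Find the best TM entry for a segment.

    tm_entries holds (origin, source, target) tuples. Returns a dict with the
    match score (0-100), the word-level differences and the origin, or None.
    """
    new_words = new_segment.split()
    best = None
    for origin, source, target in tm_entries:
        old_words = source.split()
        matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
        score = round(matcher.ratio() * 100)
        if best is None or score > best["score"]:
            # Keep only the spans that differ between the TM source and the new text.
            diffs = [
                (tag, old_words[i1:i2], new_words[j1:j2])
                for tag, i1, i2, j1, j2 in matcher.get_opcodes()
                if tag != "equal"
            ]
            best = {"score": score, "diffs": diffs, "origin": origin,
                    "source": source, "target": target}
    return best


# Illustrative TM entries (not from the slides); the origin names are made up.
TM = [
    ("manual_v1.tmx", "Click the Save button to store the current file.",
     "Haga clic en el botón Guardar para almacenar el archivo actual."),
    ("ui_strings.tmx", "Close the window.", "Cierre la ventana."),
]

match = fuzzy_match("Click the Open button to store the current file.", TM)
print(match["score"], match["diffs"], match["origin"])
```

    The score answers the first question, the list of differing spans the second, and the origin label the third.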
  • Fuzzy match display
  • Fuzzy match display (II)
  • Fuzzy match display (III)
  • Fuzzy match display (IV)
  • Analysis feature: every word in each segment is assigned to one match band:
      • 101%
      • 100%
      • 99-95%
      • 94-85%
      • 84-75%
      • New words
      • Repetitions
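
    A minimal sketch of how such an analysis could tally words per band. Two assumptions are made here that the slides do not state: a 101% match is treated as a 100% match whose context also matches, and a segment counts as a repetition when it has no exact TM match and has already appeared earlier in the file. Actual tools differ on both points.

```python
BANDS = ["101%", "100%", "99-95%", "94-85%", "84-75%", "New words", "Repetitions"]


def band_for(score, context_match=False):
    """Map a fuzzy match score (0-100) to a match band."""
    if score >= 100:
        return "101%" if context_match else "100%"
    if score >= 95:
        return "99-95%"
    if score >= 85:
        return "94-85%"
    if score >= 75:
        return "84-75%"
    return "New words"


def analyse(segments):
    """Tally word counts per band for (text, score, context_match) tuples."""
    totals = {band: 0 for band in BANDS}
    seen = set()
    for text, score, context_match in segments:
        words = len(text.split())
        if score < 100 and text in seen:
            band = "Repetitions"  # assumption: repeats of untranslated segments
        else:
            band = band_for(score, context_match)
        seen.add(text)
        totals[band] += words
    return totals


# Illustrative segments with made-up scores.
SEGMENTS = [
    ("Save the file.", 100, True),
    ("Open the file now.", 88, False),
    ("Delete everything at once.", 40, False),
    ("Delete everything at once.", 40, False),
]

for band, words in analyse(SEGMENTS).items():
    print(f"{band:<12}{words:>6}")
```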
  • Analysis results
  • Different tools, different word counts

    Band           CAT Tool 1    CAT Tool 2
    101%               41,352        29,782
    100%                4,194        16,002
    99-95%              3,698         6,038
    94-85%              2,077         2,633
    84-75%              5,270         1,369
    New words           5,241         6,150
    Repetitions         2,068         5,451
    Total              63,900        58,425
  • Different word counts
      • There is no standard fuzzy matching algorithm.
      • CAT tools may have different auto-substitution elements: numbers, dates, acronyms, variables, etc.
      • Different approaches to 101% matches.
      • Cross-file repetitions and internal fuzzy leverage.
      • Different file format filters.
      • Different segmentation rules. SRX is the standard for segmentation rules.
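
    To see why segmentation rules alone can change an analysis, the sketch below splits the same text with and without a no-break exception. It loosely mimics the break/no-break idea behind SRX but does not read SRX files; the `segment` helper, its regular expression and the sample sentence are assumptions made for illustration.

```python
import re


def segment(text, no_break_after=()):
    """Split text after ., ! or ? followed by whitespace.

    no_break_after lists endings (for example abbreviations) after which
    a break is suppressed.
    """
    segments, start = [], 0
    for match in re.finditer(r"[.!?]\s+", text):
        candidate = text[start:match.end()].strip()
        if candidate.endswith(tuple(no_break_after)):
            continue  # exception rule: do not break here
        segments.append(candidate)
        start = match.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


TEXT = "See e.g. the manual. Press OK."

print(segment(TEXT))                            # naive rules
print(segment(TEXT, no_break_after=("e.g.",)))  # with an abbreviation exception
```

    The two rule sets produce a different number of segments from identical text, so every segment-based figure (matches, repetitions, band totals) can shift between tools.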
  • Weighted word count: each band is assigned a percentage of the full word rate according to a weighting scheme (negotiable per client). For example:

    Band           % of full rate
    101%                 0%
    100%                20%
    99-95%              30%
    94-85%              40%
    84-75%              50%
    New words          100%
    Repetitions         20%
  • Different tools, different word counts (II)

    CAT Tool 1
    Band            Words    Weight    Weighted words
    101%           41,352    x 0%                   0
    100%            4,194    x 20%                839
    99-95%          3,698    x 30%              1,109
    94-85%          2,077    x 40%                831
    84-75%          5,270    x 50%              2,635
    New words       5,241    x 100%             5,241
    Repetitions     2,068    x 20%                414
    Total          63,900                      11,069

    CAT Tool 2
    Band            Words    Weight    Weighted words
    101%           29,782    x 0%                   0
    100%           16,002    x 20%              3,200
    99-95%          6,038    x 30%              1,811
    94-85%          2,633    x 40%              1,053
    84-75%          1,369    x 50%                684
    New words       6,150    x 100%             6,150
    Repetitions     5,451    x 20%              1,090
    Total          58,425                      14,989
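
    The weighted total is a plain sum of band word counts multiplied by the band percentages. As a sketch, the snippet below recomputes the CAT Tool 1 figure using the example weighting scheme and word counts from the slides; the function and variable names are made up for this example.

```python
# Example weighting scheme from the slides, as percentages of the full word rate.
WEIGHTS_PCT = {
    "101%": 0, "100%": 20, "99-95%": 30, "94-85%": 40,
    "84-75%": 50, "New words": 100, "Repetitions": 20,
}

# CAT Tool 1 word counts per band, as given in the slides.
TOOL_1 = {
    "101%": 41352, "100%": 4194, "99-95%": 3698, "94-85%": 2077,
    "84-75%": 5270, "New words": 5241, "Repetitions": 2068,
}


def weighted_words(counts, weights_pct=WEIGHTS_PCT):
    """Sum each band's word count multiplied by its percentage of the full rate."""
    return sum(counts[band] * weights_pct[band] for band in counts) / 100


print(sum(TOOL_1.values()))           # raw word count
print(round(weighted_words(TOOL_1)))  # weighted word count
```

    Multiplying by whole percentages and dividing once at the end avoids accumulating rounding differences band by band.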
  • Weighted word count tools
  • TMs and statistical analysis
      • If big enough, TMs provide the bilingual corpus necessary to build SMT engines.
      • Some CAT tools can scan the TM in search of correlation between words in source and target.
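
    As a toy illustration of scanning a TM for source/target word correlation, the sketch below scores word pairs with the Dice coefficient over translation units. This is an assumption chosen for simplicity: it is not how any particular CAT tool or SMT engine works, and the helper names and three-entry TM are invented for the example.

```python
from collections import Counter
from itertools import product


def cooccurrence_scores(tm_pairs):
    """Score (source word, target word) pairs by Dice coefficient across TM units."""
    src_freq, tgt_freq, pair_freq = Counter(), Counter(), Counter()
    for source, target in tm_pairs:
        src_words = set(source.lower().split())
        tgt_words = set(target.lower().split())
        src_freq.update(src_words)
        tgt_freq.update(tgt_words)
        pair_freq.update(product(src_words, tgt_words))
    return {
        (s, t): 2 * n / (src_freq[s] + tgt_freq[t])
        for (s, t), n in pair_freq.items()
    }


def best_translation(word, scores):
    """Return the target word most strongly correlated with a source word."""
    candidates = {t: v for (s, t), v in scores.items() if s == word}
    return max(candidates, key=candidates.get) if candidates else None


# Illustrative TM content (not from the slides).
TM = [
    ("red file", "archivo rojo"),
    ("red window", "ventana roja"),
    ("new file", "archivo nuevo"),
]

scores = cooccurrence_scores(TM)
print(best_translation("file", scores))
print(best_translation("new", scores))
```

    With only three units the signal is weak, which matches the slide's caveat that the TM has to be big enough before this kind of statistic becomes useful.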