Better Translation Technology
Andrzej Zydroń, CTO XTM International
DITA Localization
In the beginning
Technical documentation was without form, and darkness was upon the face of the page:
– Manual typesetting
– RTF
– WordPerfect
– MS Word
– FrameMaker
– Ventura Publisher
– Pagemaker
– SGML
In the beginning
Lack of standards
•Proprietary solutions
•Problems with character encoding
•Expensive to design
•Expensive to build
•Expensive to maintain
•Expensive to localize
Along came XML
Let there be light:
– XML 1.0 became a W3C Recommendation in 1998, derived from SGML
– Review of lessons learned from SGML
– Easier to implement
– Removed unnecessary complexity
– Standard character encoding: Unicode
DITA
Standards, Standards, Standards
DITA:
The advent of standards for technical documentation
DITA is not perfect!
DITA - the good
Extremely well thought out XML document architecture:
– modularity
– fine level of granularity
– reuse
– bookmap
– standardized elements
– Write once, translate once, reuse many times
– Multiple output formats, multiple places, multiple docs:
• PDF, HTML, mobile, web, paper etc.
DITA Localization
Practical considerations:
– Controlled Authoring:
• Consistency
• Terminology
– Delivery for localization:
• All at once in one big heap
• JIT - individual topics when ready
– Translation Consistency:
• Translation Memory
• Terminology
DITA Localization - the good
Modularity:
– Translate a topic once
– Reuse many times!
• No need to retranslate
– Just in time translation
• Translate as soon as source is ready
• Dramatic improvement in time to market
• All documentation in all languages is ready concurrently
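The "translate once, reuse many times" point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the map content and file names below are invented, and a real ditamap or bookmap has more structure (key definitions, nested maps) than this handles. The idea is simply that a topic referenced several times in the map needs to be sent for translation once.

```python
import xml.etree.ElementTree as ET

# Hypothetical map for illustration only; the file names are invented.
DITAMAP = """<map>
  <title>Owner manual</title>
  <topicref href="safety.dita"/>
  <topicref href="engine.dita">
    <topicref href="oil_change.dita"/>
    <topicref href="safety.dita"/>
  </topicref>
  <topicref href="warranty.dita"/>
</map>"""


def topics_to_translate(map_xml):
    """Return each referenced topic once, in first-seen map order."""
    root = ET.fromstring(map_xml)
    seen = []
    for ref in root.iter("topicref"):
        href = ref.get("href")
        if href and href not in seen:
            seen.append(href)
    return seen


def reference_count(map_xml):
    """Count every topicref that points at a file, including repeats."""
    root = ET.fromstring(map_xml)
    return sum(1 for ref in root.iter("topicref") if ref.get("href"))


if __name__ == "__main__":
    print(reference_count(DITAMAP), "references in the map")
    print(len(topics_to_translate(DITAMAP)), "topics to translate")
```

Here the safety topic appears twice in the map but only once in the translation list; the more a topic is reused, the larger that gap becomes.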
DITA Localization - the good
• Decide how you want to translate:
– Whole document as one using bookmap
– Individual topics navigated according to bookmap
– Individual topics as and when ready
• Handling last minute engineering changes
– JIT translation
– Many TMS products are not good at handling this
– Automatically update already-translated segments
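How a last-minute change might be handled can be sketched as an exact-match translation memory lookup. This is a simplified assumption about how such an update works, not a description of any particular TMS: real systems also use fuzzy matching and segment context. The function name `leverage` and the sample segments are invented for illustration.

```python
# Hypothetical translation memory: source segment -> existing French translation.
MEMORY = {
    "Stop the engine.": "Arrêtez le moteur.",
    "Open the bonnet.": "Ouvrez le capot.",
}

# The topic after a last-minute engineering change: one new segment was added.
REVISED_TOPIC = [
    "Stop the engine.",
    "Open the bonnet.",
    "Remove the oil filler cap.",
]


def leverage(segments, memory):
    """Split segments into (reused, pending) using exact-match lookup.

    reused  maps each already-translated segment to its stored translation.
    pending lists the segments that still need a translator.
    """
    reused = {}
    pending = []
    for seg in segments:
        if seg in memory:
            reused[seg] = memory[seg]
        else:
            pending.append(seg)
    return reused, pending


if __name__ == "__main__":
    reused, pending = leverage(REVISED_TOPIC, MEMORY)
    print("Reused:", len(reused))
    print("Needs translation:", pending)
```

Only the changed segment goes back to the translator; everything else is filled in from memory, which is what makes just-in-time delivery of revised topics practical.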
DITA Localization - the <bad/><ugly/>
The bad and downright ugly (the three villains!):
– Word Substitution
• CONREF
• KEYREF
• DITAVAL
– Specialization
– Conditional processing
DITA: square peg, round hole
• Do not try to force DITA to do what it is not designed for!
• DITA = Modular technical documentation
• Small, discrete topics
• No more than one page of text per topic
• Use the Open Toolkit
• Do not get overambitious with substitutions
– What works for English and Mandarin will not work for other languages
DITA: Object Oriented Documentation
• DITA is an attempt to use OO design for XML documentation
• Very tempting for computer scientists
• We did it for computer programming
• Why not documentation?
• Problems arise with the nature of documentation
• Problems arise with the nature of human language
Language – why humans mess things up!
What language is this?
What is he saying?
Understanding the nature of English
• Why is English different from most other languages?
• English is a fusion language: a creole
– 60% Old Chaucerian English + 40% French
• Other Creoles with a high number of speakers:
– French (Vulgar Latin + Frankish)
– Swahili (Bantu + Arabic)
– Urdu (Hindi + Arabic)
– Mandarin
• (Many Sino-Tibetan languages)
Understanding the nature of English
• Primitive morphology
– Nouns:
• Singular, plural, possessive
– ship, ships, ship’s, ships’
– No Gender
• a ship, the ship, the ships
– No adjectival agreement
• green ship, green ships
• We can substitute nouns and noun phrases without causing grammatical errors
• This is not true of most other languages
• English does not work like most other languages
• Your documentation WILL be translated sooner or later
DITA Localization
Avoid word substitution (CONREF, KEYREF, DITAVAL):
– Linguistic issues
– Adjectival agreement
– Grammatical case
• Presenting the new Ford <keyword keyref="model"/> for 2014.
– very bad idea!
• Focus, Fiesta, Mondeo
• Nowy Focus, Nowa Fiesta, Nowe Mondeo
• Akin to saying ‘Presenting the Ford new Focus’
• Nowym Focus’em, Nową Fiestą, Nowym Mondeo
– May work for alphanumeric words
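The agreement problem can be made concrete with a small Python sketch. It uses only the nominative forms listed above (Nowy Focus, Nowa Fiesta, Nowe Mondeo); the function names are invented for illustration. The assumption is that the translator delivers one Polish string with a slot for the model name, which is what a keyref in mid-sentence effectively forces.

```python
# Nominative adjective forms taken from the examples above.
AGREEING_ADJECTIVE = {"Focus": "Nowy", "Fiesta": "Nowa", "Mondeo": "Nowe"}


def naive_substitution(model):
    """One translated string with a slot: the adjective is fixed once."""
    return "Nowy {}".format(model)


def agreeing_phrase(model):
    """What the translator would actually write for this model."""
    return "{} {}".format(AGREEING_ADJECTIVE[model], model)


def broken_models():
    """Models for which the single translated string is ungrammatical."""
    return [
        model
        for model in AGREEING_ADJECTIVE
        if naive_substitution(model) != agreeing_phrase(model)
    ]


if __name__ == "__main__":
    for model in AGREEING_ADJECTIVE:
        print(naive_substitution(model), "vs", agreeing_phrase(model))
    print("Ungrammatical for:", broken_models())
```

The single string is right for one model out of three, and that is before grammatical case is taken into account, which changes the forms again.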
DITA Localization
Only use substitution for linguistically complete sentences
– Warnings
– Cautions
– Notes
Avoid substitution for individual words or noun phrases
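One way to enforce this rule before content goes to translation is a simple check over the source topics. The sketch below is an assumption about how such a check could look, not an existing tool: it flags `conref`, `keyref` and `conkeyref` on inline elements (`keyword`, `ph`, `term`) and leaves block-level reuse such as a whole `note` alone. The sample topic and the referenced file name are invented.

```python
import xml.etree.ElementTree as ET

# Inline elements where substitution lands in the middle of a sentence.
INLINE = {"keyword", "ph", "term"}
REUSE_ATTRS = ("conref", "keyref", "conkeyref")

# Hypothetical topic for illustration only.
TOPIC = """<topic id="intro">
  <title>Introduction</title>
  <body>
    <p>Presenting the new Ford <keyword keyref="model"/> for 2014.</p>
    <note conref="shared.dita#shared/hot_engine"/>
  </body>
</topic>"""


def risky_substitutions(topic_xml):
    """List (element, attribute, value) for every inline substitution."""
    root = ET.fromstring(topic_xml)
    findings = []
    for elem in root.iter():
        if elem.tag not in INLINE:
            continue
        for attr in REUSE_ATTRS:
            value = elem.get(attr)
            if value is not None:
                findings.append((elem.tag, attr, value))
    return findings


if __name__ == "__main__":
    for tag, attr, value in risky_substitutions(TOPIC):
        print("Inline substitution: <{} {}={!r}>".format(tag, attr, value))
```

The reused note passes because it is a complete, self-contained unit; the keyword inside the sentence is reported.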
Specialization
• Specialize at your peril!
– A double edged sword
• Exponentially increases the difficulty of:
– Authoring
– Publishing
– Localization
• New elements/attributes
– How are they to be treated?
– For localization: completely new document type
DITA and OAXAL
• OAXAL - Open Architecture for XML Authoring and Localization
• DITA Authoring and Localization in a Standards context:
– DITA is an Open Standard
– Why use proprietary software for Authoring and Localization of DITA?
OAXAL
http://wiki.oasis-open.org/oaxal/FrontPage
OAXAL Stack
OAXAL Interaction
OAXAL Source Lifecycle
OAXAL Translation Lifecycle
DITA Localization - considerations
• Choosing the right TMS/CAT System
– Can it handle XML properly?
• Entity references e.g. ‘&amp;’
• Encoding
• Validation
– Does it understand DITA?
– Does it understand ditamap/bookmap?
– Can you navigate using the bookmap?
– Can it handle specialization?
– Does it handle JIT?
– Can it handle last minute changes?
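What "handling XML properly" means for entity references can be shown with a short round-trip test in Python. This is a minimal sketch using the standard library parser, not a test suite for any specific TMS; the sample sentence and function names are invented. The translator should see the plain character, and the file written back must contain the escaped form again.

```python
import xml.etree.ElementTree as ET

# Hypothetical source paragraph containing two escaped characters.
SOURCE = "<p>Contact R&amp;D for parts &lt; 5 mm.</p>"


def extract_text(xml):
    """Text as the translator should see it: entities resolved."""
    return ET.fromstring(xml).text


def rebuild(tag, text):
    """Write text back into an element; the serializer re-escapes it."""
    elem = ET.Element(tag)
    elem.text = text
    return ET.tostring(elem, encoding="unicode")


def is_well_formed(xml):
    """True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(xml)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    text = extract_text(SOURCE)
    print(text)                       # plain '&' and '<' for the translator
    print(rebuild("p", text))         # escaped again on output
    print(is_well_formed("<p>R&D</p>"))  # a raw '&' breaks the file
```

A tool that writes the raw `&` back into the target file produces a document that no DITA processor will accept, so this is a cheap check to run on any candidate system.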
How to reduce your translation costs
• Write less!
– Ford of Europe reduced translation costs by 50% in 2005
– It costs as much to translate into one language as it does to write the original
• Use more graphics
– Integrate with CAD/CAM systems
– But beware text in graphics – use callouts
• People may actually start using your documentation
• KISS
• Manage your own translation assets: e.g. invest in your own TMS
– Save an additional 20% on average on cost and 50% on turnaround
Less is More
Contact Details
• Postal address:
– PO Box 2167
– Gerrards Cross
– Bucks SL9 8XF
– United Kingdom
• Phone: +44 1753 480 467
• Fax: +44 1753 480 465
• Andrzej Zydroń – azydron@xtm-intl.com
