
Preparing an Open Source Documentation Repository for Translations

As part of the 2018 HPCC Systems Summit Community Day event:

On display first is a poster from Robert Kennedy, Florida Atlantic University, on Distributed Deep Learning on TensorFlow.

Following that, Jim DeFabia presents his breakout session in the Documentation & Training Track.

Translating a manual once is not a terribly difficult task. However, our manuals are always evolving, so we needed a plan to update translations on a regular basis. This requires a process that is maintainable, repeatable, and robust. In this case study of our forays into documentation internationalization, you can learn from our successes and laugh at some of our missteps along the way.

Jim DeFabia is the Documentation Lead for HPCC Systems®. He entered the field of technical documentation in 1993 at Clarion Software/TopSpeed, where he helped revolutionize the industry by creating manuals that people could read and actually understand. It was at TopSpeed that he first met and worked with many of the HPCC Systems “Mavericks.” So it was like coming home when he reunited with these colleagues in 2001 as they were initially developing the HPCC Systems platform, which was then released to the Open Source community in 2011.

Published in: Data & Analytics

  1. October 9, 2018. Jim DeFabia (@drdephobia). Preparing an Open Source Documentation Repository for Translations. 2018 HPCC Systems® Community Day
  2. Preparing an Open Source Documentation Repository for Translations
  3. Why Internationalize? Out of the world’s approximately 7.5 billion inhabitants, 1.5 billion speak English: that’s [only] 20% of the Earth’s population. Source: Babbel.com and ASIST
  4. Why Internationalize? [Chart: Portal Visitors by country. Labels: 55% English Speaking; 11%; 6%; 4%; 3%; 3%; 3%; 2%. Countries: United States, India, Brazil, Philippines, China, France, Germany, Peru, Canada, Ireland, Japan, Russia, South Korea, Netherlands, Australia, Italy, Vietnam, Spain.] Our HPCC Systems® Portal visitors are more English-centric. But if we support more languages, that could change. Source: HPCC Systems®
  5. Why Internationalize? Our HPCC Systems® Portal visitors are more English-centric. But if we support more languages, that could change. Source: HPCC Systems®
  6. Why Internationalize? Should we lose half of our potential audience? (* No actual camels were harmed in the creation of this slide.)
  7. Why Internationalize? Should we lose half of our potential audience?
  8. Why Internationalize? Should we lose half of our potential audience?
  9. How to Internationalize? “The Babel fish is small yellow and leech-like, and probably the oddest thing in the Universe. It feeds on brainwave energy received … and then excretes into the mind of its carrier a telepathic matrix formed by combining the unconscious thought frequencies with nerve signals picked up from the speech centres of the brain which has supplied them. The practical upshot of this is that if you stick a Babel fish in your ear you can instantly understand anything said to you in any form of language. The speech patterns you actually hear decode the brainwave matrix which has been fed into your mind by your Babel fish.”
  10. How to Internationalize? In 2013, Comrise, an HPCC Systems partner, gave us a wonderful gift: a Chinese translation of several of our manuals! The ECL Programmers Guide (version 4.2.0.3) is still available on the HPCC Systems® Portal. The other books were incorporated into our online training.
  11. How to Internationalize? Thanks, Comrise!
  12. How to Internationalize? This one-off translation was nice to have, but:
      • Static (version 4.2)
      • Not easily maintainable
      • Has become outdated
      So what SHOULD we do?
  13. How to Internationalize? Use the Source, Luke!
      • Our documentation source is already in XML (text) format
      • Easy to compare differences
        • Git diff
        • Other tools
      • Minor differences handled in house
      • Major differences (e.g., adding a new 200-page book) would be sent to a translation company
  14. How to Internationalize? Open Source to the rescue! Since HPCC Systems® went Open Source, all of our documentation source files are in DocBook XML. Keep it Simple!
      • Translate sources from a checkpoint
      • Later, only translate the differences (delta)
      By translating ONLY the delta, translation costs are dramatically reduced!
  15. Git Diff [Images: Before and After]
  16. Git Diff [Images: Before and After]
  17. Git Diff [Images: PDF and DocBook]
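
The "translate only the delta" idea from slides 13 to 17 can be sketched in a few lines. This is only an illustration: the talk relies on Git diff, while the sketch below swaps in Python's standard `difflib` so that it runs on its own. The function name `lines_needing_translation` and the sample DocBook fragments are hypothetical, not taken from the HPCC Systems repository.

```python
import difflib


def lines_needing_translation(checkpoint: str, current: str) -> list[str]:
    """Return lines of `current` that are new or changed since `checkpoint`.

    Hypothetical helper: a stand-in for reviewing `git diff` output by hand.
    """
    old = checkpoint.splitlines()
    new = current.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" are the opcodes that introduce new text;
        # "equal" and "delete" leave nothing new to translate.
        if tag in ("replace", "insert"):
            changed.extend(line for line in new[j1:j2] if line.strip())
    return changed


# Made-up sample content for illustration only.
checkpoint_xml = "<para>Hello</para>\n<para>World</para>"
current_xml = "<para>Hello</para>\n<para>Big World</para>\n<para>New</para>"
delta = lines_needing_translation(checkpoint_xml, current_xml)
```

Only the lines in `delta` would go to a translator; unchanged lines keep their existing translation from the checkpoint.
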
  18. How to Internationalize. First Steps: Where are we gonna put it all?
  19. How to Internationalize. First Steps: Where are we gonna put it all?
  20. We reorganized folders using the IETF tags to support each language:
      • EN-US
      • PT-BR
      • ZH-CN
      • Etc.
      We chose to use a two-part IETF code:
      • Primary code that identifies the language (e.g., “EN”)
      • Sub-code that specifies the national variety (e.g., “GB” or “US”)
      This allows us to support variants of languages, if the need ever arises.
  21. Hello World in Australian English (EN-AU): OUTPUT('G\'day, Mate!');
  22. [No slide text]
  23. Naming Conventions. Name that file… DOCBOOK_TO_PDF( ${FO_XSL} ECLR-includer.xml "ECLLanguageReference_${DOC_LANG}" "ECLR_mods") This produces a filename of: ECLLanguageReference_EN_US-7.0.0-1.pdf
  24. Naming Convention. A rose by any other name…
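
The folder tags from slide 20 and the file name from slide 23 can be tied together with a small sketch. It assumes the `DOC_LANG` value is simply the folder tag with the hyphen replaced by an underscore; that mapping is inferred from the one example file name on the slide and is not confirmed by the talk. The names `LANG_TAG` and `pdf_name` are hypothetical.

```python
import re

# Two-part tag as described on slide 20: language code, hyphen, national variety.
LANG_TAG = re.compile(r"^[A-Z]{2}-[A-Z]{2}$")


def pdf_name(book: str, lang_tag: str, version: str) -> str:
    """Build an output file name such as ECLLanguageReference_EN_US-7.0.0-1.pdf.

    Assumption: the hyphen in the folder tag becomes an underscore in the
    file name, as the single example on the slide suggests.
    """
    if not LANG_TAG.match(lang_tag):
        raise ValueError(f"expected a tag like EN-US, got {lang_tag!r}")
    return f"{book}_{lang_tag.replace('-', '_')}-{version}.pdf"


english = pdf_name("ECLLanguageReference", "EN-US", "7.0.0-1")
portuguese = pdf_name("ECLLanguageReference", "PT-BR", "7.0.0-1")
```

Validating the tag up front keeps a mistyped folder name from silently producing an oddly named PDF.
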
  25. Common Elements
      • Some Images
      • Logos
      • Warning
      • Tip
      • Icons
      • Version.xml (used locally)
  26. Not So Common Elements: Creative Commons License. This document is licensed under the Creative Commons License CC BY-ND 3.0 applicable to the jurisdiction of the principal location of the user, as available; otherwise, the CC BY-ND 3.0 Unported. https://creativecommons.org/licenses/by-nd/3.0/
  27. Language-specific Images: ECL Watch in Portuguese
  28. Do Not Translate Tags
  29. Do Not Translate Tags
      • Tools, tools, tools
  30. NLP++ to the rescue
  31. NLP++ to the rescue: <informaltable colsep="1" frame="all" rowsep="1"> <tgroup cols="3"> <colspec colwidth="147.60pt" /> <colspec colwidth="147.60pt" /> <colspec colwidth="147.60pt" /> <thead> <row> <entry align="left"><!-- DNT-Start -->Field Name<!-- DNT-End --></entry> <entry align="left">Type</entry> <entry align="left">Description</entry> </row> </thead> <tbody> <row> <entry><!-- DNT-Start -->FirstName<!-- DNT-End --></entry> <entry>15 Character String</entry> <entry>First Name</entry> </row> <row> <entry><!-- DNT-Start -->LastName<!-- DNT-End --></entry> <entry>25 Character String</entry> <entry>Last name</entry> </row> . . .
  32. Do Not Translate
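
The `DNT-Start` / `DNT-End` comment markers on slide 31 lend themselves to a simple protect-and-restore step around translation. The talk does not say how NLP++ or the translators actually consumed these markers, so the sketch below is an assumption: it masks each marked span with a hypothetical `[[DNTn]]` placeholder before translation and puts the original span back afterwards.

```python
import re

# Matches one marked span, including the comment markers themselves.
DNT_SPAN = re.compile(r"<!-- DNT-Start -->.*?<!-- DNT-End -->", re.DOTALL)


def protect(text: str) -> tuple[str, list[str]]:
    """Replace each Do Not Translate span with a numbered placeholder."""
    spans: list[str] = []

    def stash(match: re.Match) -> str:
        spans.append(match.group(0))
        return f"[[DNT{len(spans) - 1}]]"

    return DNT_SPAN.sub(stash, text), spans


def restore(text: str, spans: list[str]) -> str:
    """Put the original spans back in place of their placeholders."""
    for index, span in enumerate(spans):
        text = text.replace(f"[[DNT{index}]]", span)
    return text


row = (
    "<entry><!-- DNT-Start -->FirstName<!-- DNT-End --></entry>"
    "<entry>First Name</entry>"
)
masked, saved = protect(row)
round_trip = restore(masked, saved)
```

In this scheme, only the text outside the placeholders (here, "First Name") would be handed to the translator, while the field name `FirstName` survives untouched.
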
  33. Beneficial Side Effects (Thank You, AutoCorrect). We found bugs in our source files:
      • Special characters had been introduced during initial import/conversion
        • -- em dashes
        • ... ellipses
      • Smart quotes aren’t so smart
        • “Smart” vs
        • "Dumb"
      • In total, we found 1,450 errors introduced by autocorrect!
  34. Beneficial Side Effects
      • We learned that we need more code reuse
      • Included content only needs translation once
      • Easier to maintain in any language
      • More efficient
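
A scan for the autocorrect characters named on slide 33 might look like the sketch below. The talk does not describe the tool the team used to find its 1,450 errors, so everything here is hypothetical: the `SUSPECT` table, the function names, and the ASCII replacements are illustrative choices.

```python
# Autocorrect characters and the plain-ASCII text assumed to be intended.
SUSPECT = {
    "\u201c": '"',    # left smart double quote
    "\u201d": '"',    # right smart double quote
    "\u2018": "'",    # left smart single quote
    "\u2019": "'",    # right smart single quote
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis character
}


def find_suspects(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, character) for each suspect character, 1-based."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, char in enumerate(line, start=1):
            if char in SUSPECT:
                hits.append((lineno, col, char))
    return hits


def normalize(text: str) -> str:
    """Replace every suspect character with its ASCII stand-in."""
    return text.translate(str.maketrans(SUSPECT))


sample = "plain line\n\u201cSmart\u201d quotes\u2026"
report = find_suspects(sample)
cleaned = normalize(sample)
```

Reporting positions first, rather than rewriting blindly, leaves room to review cases where a typographic character is intentional.
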
  35. Side Effects
  36. Side Effects
      • Automated Builds and Independent Builds
        • Include PT-BR in automated build process
        • Other languages can follow
      • Independent Doc Builds
        • English
        • PT-BR
        • All
  37. Translation
      English: <para>This tutorial assumes:</para> <itemizedlist> <listitem> <para> You have a running HPCC. This can be a VM Edition or a single or multinode HPCC platform </para> </listitem> </itemizedlist> <para>You have the ECL IDE <footnote><para> The ECL IDE (Integrated Development Environment) is the tool used to create queries into your data and ECL files with which to build your queries. </para> </footnote> installed and configured</para>
      Portuguese: <para>Este tutorial presume que:</para> <itemizedlist> <listitem> <para> Você tem um HPCC em execução. Ele pode ser a VM Edition ou uma plataforma HPCC com um ou mais nós </para> </listitem> </itemizedlist> <para>Você tem o ECL IDE <footnote><para> O ECL IDE (Ambiente de desenvolvimento integrado) é uma ferramenta usada para criar consultas em seus dados e arquivos ECL com os quais suas consultas serão compiladas. </para> </footnote> instalado e configurado</para>
  38. Translation
  39. What’s Next? Today Brazil, Tomorrow the WORLD!
  40. To Do List
      • Translate to more languages
      • Add more books
      • Build translated CHM files
      • Screen shots from translated ECL Watch
      • Further engage the open source community
  41. It’s a small, small, small, small world.
  42. Shameless Plug: The HPCC Systems® Cookbook
  43. I want to contribute to documentation but… What if my idea isn’t good enough?
  44. What if my idea isn’t good enough? Someone once said in a meeting: “Hey, let’s make a movie about a tornado full of sharks!”
  45. I want to contribute to documentation but…
      • I don’t know DocBook XML
      • I don’t know Git, GitHub, or your procedures for pull requests
      Well, let me present…
  46. The HPCC Systems® Cookbook by HPCC Systems®
  47. The HPCC Systems® Cookbook: A collection of recipes and tips. Written by the best chefs in town: YOU!
      • ECL How To Section
        • How to create a phonetic search
        • How to use superfiles/superkeys and consolidate them periodically
        • Modify a Jobname
        • Specify a Workunit Scope
      • Tools, Tips, and Techniques Section
        • How to use Git within the IDE
      • System Admin How To Section
        • Create a jailed SFTP site
  48. I want to contribute to documentation but…
      • I have an idea, but I don’t have time to flesh it all out
      • I’m not good at writing
  49. Here’s WIKI!
  50. There are many ways to contribute
      • Write directly in the Wiki
      • Write it, but submit for review / editing
      • Send in an idea
      • Let us know about a forum question you found interesting
      • Submit a code example that a colleague wrote (get permission first)
  51. Why should I contribute?
      • Fame and Fortune
      • Work with professional editors
      • Help the community
      • CONTEST!!!
  52. ♪ ♫ Goodbye, Farewell, Auf Wiedersehen, Adieu ♪ ♫
  53. Questions? james.defabia@lexisnexisrisk.com https://twitter.com/DrDePhobia https://www.linkedin.com/in/james-defabia/ https://github.com/JamesDeFabia