Dynamic chunking of component-authored information


Automatically chunk topics to provide varied user experiences of documentation.

Presented at Information Development World 2015.

  1. Dynamic Chunking of Component-Authored Information. Ben Colborn, Manager, Technical Publications; Owen Richter, Web Application Architect
  2. Converged compute and storage; All intelligence in software; Distributed everything; Self-healing system; Web-scale converged infrastructure; Automation and rich analytics
  3. Technical publications responsibilities › Software documentation › Release documentation › Hardware documentation › Support knowledge base › Education collaboration › Localization
  4. Problem: Ben didn’t like any available options for publishing documentation
  5. Monolithic documentation
  6. Fragmented documentation
  7. Advantages. Monolithic: • Easy to produce • Familiar for audience • Portable. Fragmented: • Easy to link • Short page load time • Familiar for authors
  8. Opportunity: Growing company; development of new support portal
  9. Every page is page one › Every page is a potential entry point › Sometimes hierarchy and sequence are relevant › Often hierarchy and sequence are not relevant › Multiplicity of navigation options is required
  10. Information foraging behavior › Information scent: Users estimate a given hunt’s likely success from … assessing whether their path exhibits cues related to the desired outcome. › Informavores will keep clicking as long as they sense that they're “getting warmer”; the scent must keep getting stronger and stronger, or people give up. › Progress must seem rapid enough to be worth the predicted effort required to reach the destination. › As users drill down the site, … provide feedback about the current location and how it relates to users' tasks.
  11. Documentation use cases 1. A new user may want to browse a complete high level document. 2. A developing user may want an intermediate-sized chunk that has subject/sequence affinity. 3. An experienced user may want a small chunk with a particular item of information. 4. A support technician may need to provide a chunk scoped at an intermediate level to a customer so they are not overloaded with too much information, but also not given too little.
  12. Document levels: Document, Part, Chapter, Section, Topic
  13. DITA gets us halfway there › Authoring and management are done at the topic level › Chunking exists as an approach, but chunking control is manual and chunks are static
  14. Ben’s magical solution: If I had an infinite number of monkeys, I could chunk all topics in all possible combinations
  15. Cross-disciplinary thinking to the rescue › We need a recursive document! › A document is: 1. A title 2. A globally unique key (document name + sub-document ID) 3. A locally unique key (sub-document ID) 4. A list of tags 5. A (recursive) list of documents › DITA is recursive, but none of the existing presentation mechanisms are recursive. › JSON is a natural way to represent a recursive document. › XSLT is a natural way to generate such a JSON document.
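
The five-part definition of a document on this slide can be sketched as a small recursive data structure. This is only an illustration in Python, not the presenters' implementation (the slides say the real JSON is generated with XSLT). The class name `Document`, the field names, the `/` separator in the global key, and the sample IDs are all assumptions.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Document:
    """One node of a recursive document (all field names are hypothetical)."""
    title: str
    local_key: str                      # sub-document ID, unique within the document
    global_key: str                     # document name + sub-document ID
    tags: List[str] = field(default_factory=list)
    documents: List["Document"] = field(default_factory=list)  # recursive children


def make_node(doc_name, sub_id, title, tags=(), children=()):
    """Build a node, deriving the global key from the document name and sub-document ID."""
    return Document(
        title=title,
        local_key=sub_id,
        global_key=f"{doc_name}/{sub_id}",  # separator is an assumption
        tags=list(tags),
        documents=list(children),
    )


# A chapter containing a section containing a topic (sample IDs are invented).
topic = make_node("admin-guide", "t-2-1-1", "Topic")
section = make_node("admin-guide", "s-2-1", "Section", children=[topic])
chapter = make_node("admin-guide", "c-2", "Chapter", tags=["storage"], children=[section])

# asdict() recurses into nested dataclasses, so the whole tree serializes as nested JSON.
print(json.dumps(asdict(chapter), indent=2))
```

Because every level has the same shape, any node can be handed to the same rendering code, whether it is a whole document or a single topic.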
  16. JSON generation process: DITA Source → HTML → JSON
  17. Theoretical document: Complete Document
        1. Chapter
          1.1 Section
        2. Chapter
          2.1 Section
            2.1.1 Topic
          2.2 Section
            2.2.1 Topic
        3. Chapter
  18. Theoretical document: Chunks
        1. Chapter (with 1.1 Section)
        2. Chapter (with 2.1 Section, 2.1.1 Topic, 2.2 Section, 2.2.1 Topic)
        3. Chapter
        2.1 Section (with 2.1.1 Topic)
        2.2 Section (with 2.2.1 Topic)
        2.1.1 Topic
        2.2.1 Topic
        1.1 Section
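
One way to read the chunks slide is that every node, together with everything beneath it, is a chunk. The sketch below enumerates those chunks for the theoretical document. It is a hedged illustration only: the nested-dictionary layout and the `label` and `documents` key names are assumptions, not the presenters' schema.

```python
def iter_chunks(node):
    """Yield every node in the tree; each node plus its descendants is one chunk."""
    yield node
    for child in node.get("documents", []):
        yield from iter_chunks(child)


def labels(node):
    """Return the labels a chunk covers, in reading order."""
    return [n["label"] for n in iter_chunks(node)]


# The theoretical document from the slides, as nested dictionaries.
document = {"label": "Document", "documents": [
    {"label": "1. Chapter", "documents": [{"label": "1.1 Section"}]},
    {"label": "2. Chapter", "documents": [
        {"label": "2.1 Section", "documents": [{"label": "2.1.1 Topic"}]},
        {"label": "2.2 Section", "documents": [{"label": "2.2.1 Topic"}]},
    ]},
    {"label": "3. Chapter"},
]}

# Print one line per chunk, listing what that chunk contains.
for chunk in iter_chunks(document):
    print(" | ".join(labels(chunk)))
```

For this tree the loop visits nine nodes: the complete document plus the eight sub-document chunks. No chunk has to be defined by hand, which is the point of the recursive structure.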
  19. DITA to JSON 1: DITAMAP. Document Properties; Topic References
  20. DITA to JSON 2: HTML index. Document Properties; Topic References
  21. DITA to JSON 3: JSON. Document Properties; Topic; Topic
  22. DITA to JSON 4: Sub-document
        Field        Source
        Title        Topic title
        ID           Topic filename
        Unique key   Top-level document filename + topic filename
        Ancestors    List of ancestor topics at all levels
        Summary*     Topic shortdesc
        Body         Topic body
        HREF         Topic path + topic filename
        Documents*   List of sub-documents
  23. Document Loading Process › Flatten each node › Create unique ID › Establish ancestry › Convert relative image and cross references to absolute links › Create a standalone document of each node › Load to DB › Load to search index
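
The first steps of the loading process (flatten, unique ID, ancestry, absolute links, standalone records) might look roughly like the following. This is a minimal sketch under stated assumptions, not the presenters' loader: `BASE_URL`, the record field names, the `/` key separator, and the regex-based link rewriting are all hypothetical, and the database and search-index loads are left out.

```python
import re
from urllib.parse import urljoin

BASE_URL = "https://docs.example.com/admin-guide/"  # hypothetical publishing location


def absolutize(html, base_url):
    """Rewrite relative src/href attribute values as absolute URLs."""
    def repl(match):
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        return f"{attr}={quote}{urljoin(base_url, url)}{quote}"
    return re.sub(r'\b(src|href)=(["\'])(.*?)\2', repl, html)


def flatten(node, doc_name, ancestors=()):
    """Return one standalone record per node, depth first."""
    record = {
        "key": f"{doc_name}/{node['id']}",          # unique ID: document name + node ID
        "title": node["title"],
        "ancestors": list(ancestors),               # IDs of all enclosing nodes, outermost first
        "children": [c["id"] for c in node.get("documents", [])],
        "body": absolutize(node.get("body", ""), BASE_URL),
    }
    records = [record]
    for child in node.get("documents", []):
        records.extend(flatten(child, doc_name, ancestors + (node["id"],)))
    return records


tree = {"id": "index", "title": "Admin Guide", "documents": [
    {"id": "c-2", "title": "Chapter", "documents": [
        {"id": "t-2-1-1", "title": "Topic",
         "body": '<img src="images/disk.png"> <a href="t-2-2-1.html">next</a>'},
    ]},
]}

records = flatten(tree, "admin-guide")
for r in records:
    print(r["key"], r["ancestors"])
```

Each record carries its own ancestry and absolute links, so it can be stored and indexed without reference to the tree it came from.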
  24. Search
  25. Task Topic
  26. Chapter
  27. Document
  28. TOC
  29. Multi-modality
  30. DITA output targets 1. PDF: monolithic 2. ePUB: monolithic 3. HTML: fragmented 4. JSON: dynamically chunked
  31. Conventions › Images: All image paths need to be converted to absolute paths. Keeping all images in a flat folder called “images” is one easy way to accomplish this. › Cross references: Cross-reference links within the JSON are all relative. Like images, they need to be converted to absolute links. › JSON tag recursion: It is tedious to add tags to all levels of the JSON document, so most tags are programmatically pulled through to all sub-documents. Tags can be overridden in children if desired. › Permissions: Can be set in source. › Anchors not supported: We currently have a single-page app, which makes anchors difficult but also somewhat irrelevant, since each level is available as an independent link.
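
The tag recursion convention (tags pulled through to all sub-documents, with optional overrides in children) can be illustrated as follows. This is a sketch only. It assumes tags are name/value pairs so that a child can override a single value, whereas the slides describe tags only as "a list". The function name `propagate_tags` and the sample tag names are hypothetical.

```python
def propagate_tags(node, inherited=None):
    """Return a copy of the tree in which every node carries its effective tags.

    A node's own tags win over inherited ones; everything else is pulled through.
    """
    effective = {**(inherited or {}), **node.get("tags", {})}
    return {
        **node,
        "tags": effective,
        "documents": [propagate_tags(child, effective)
                      for child in node.get("documents", [])],
    }


# Tags are set once at the top; one topic overrides a single value.
tree = {
    "title": "Admin Guide",
    "tags": {"product": "example-product", "audience": "admin"},
    "documents": [
        {"title": "Chapter", "documents": [
            {"title": "Topic", "tags": {"audience": "support"}},
        ]},
    ],
}

tagged = propagate_tags(tree)
chapter = tagged["documents"][0]
topic = chapter["documents"][0]
print(chapter["tags"])
print(topic["tags"])
```

The chapter inherits both tags unchanged, while the topic keeps the inherited `product` value and replaces `audience` with its own.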
  32. What’s next? › More publishing automation: Publishing is currently a two-step process, JSON publication followed by document loading. It would be better to provide a one-step process controlled by the document publisher. › Holistic approach › Search cultivation › Search analytics › Chat › Case deflection analysis driving documentation › Tag-based navigation
  33. Ben is less dissatisfied. Problems solved: • Apparently dynamic presentation • Satisfactory context-sensitive help targets • CMS/search loading. Problems not solved: • Static transformations. Problems created: • Content removal • Proofing • Custom software