In topic-based information development environments such as DITA, the perennial question is “How large should a topic be?” While the topic-oriented approach has been shown to be suitable for authoring, only two levels of presentation are common: an entire set of content as a book (which may run into hundreds of pages), or component-by-component as authored.
The first (monolithic) presentation type is deficient because readers are not able to easily locate the information relevant to their situation. The second (fragmented) presentation type is deficient because it is not aligned with information-seeking behavior.
We have implemented a document structure of recursively nested information components that allows a web application to dynamically chunk information according to the reader’s behavior. At publication time, the source components are translated to granular chunks with an index, then further compiled into a recursive format that describes nesting and ancestry.
As with other presentation types, the information can be viewed in a single long document or addressed at the base component level. In addition, intermediate levels of presentation are possible. When the user selects an item from the table of contents, the component and all children components are displayed as a single chunk. Parent chunks are accessible through TOC and breadcrumb navigation that is defined in the document. After moving to a parent chunk, the starting point chunk is included as a child.
The varying levels of granularity allow search results to present a document accurately scoped to the user’s needs.
In addition, dynamic chunking supports well-documented “information foraging” behavior much better than the granular presentation approach, while preserving addressability that is lacking in monolithic presentation.
After exploring the advantages of dynamic chunking, Ben and Owen describe the publishing processes that transform the source DITA content to the dynamic presentation.
This presentation was given at Information Development World on October 1, 2015.
9.
Every page is page one
› Every page is a potential entry point
› Sometimes hierarchy and sequence are relevant
› Often hierarchy and sequence are not relevant
› Multiplicity of navigation options is required
10.
Information foraging behavior
› Information scent: Users estimate a given hunt’s likely
success from … assessing whether their path
exhibits cues related to the desired outcome.
› Informavores will keep clicking as long as they sense that
they're “getting warmer”—the scent must keep getting
stronger and stronger, or people give up.
› Progress must seem rapid enough to be worth the
predicted effort required to reach the destination.
› As users drill down the site, … provide feedback about
the current location and how it relates to users' tasks.
11.
Documentation use cases
1. A new user may want to browse a complete high level
document.
2. A developing user may want an intermediate-sized chunk
that has subject/sequence affinity.
3. An experienced user may want a small chunk with a
particular item of information.
4. A support technician may need to provide a chunk scoped
at an intermediate level to a customer so they are not
overloaded with too much information, but also not given
too little.
13.
DITA gets us halfway there
Authoring and management are done at the
topic level
Chunking exists as an approach
but
Chunking control is manual
Chunks are static
14.
Ben’s magical solution
If I had an infinite number of monkeys, I could
chunk all topics in all possible combinations
15.
Cross-disciplinary thinking to the rescue
› We need a recursive document!
› A document is:
1. A title
2. A globally unique key (document name + sub document ID)
3. A locally unique key (sub document ID)
4. A list of tags
5. A (recursive) list of documents
› DITA is recursive but none of the existing presentation
mechanisms are recursive.
› JSON is a natural way to represent a recursive document.
› XSLT is a natural way to generate such a JSON document.
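Following the slide’s five-part definition, a recursive document is straightforward to express as nested JSON. A minimal Python sketch, assuming illustrative field names (`title`, `key`, `id`, `tags`, `documents`) and document names that are not from the actual deck:

```python
import json

# A recursive document per the definition above: a title, a globally
# unique key (document name + sub-document ID), a locally unique key
# (sub-document ID), a list of tags, and a recursive list of documents.
# All names here are hypothetical examples, not the authors' schema.
doc = {
    "title": "Administration Guide",
    "key": "admin_guide+root",   # document name + sub-document ID
    "id": "root",                # locally unique key
    "tags": ["admin"],
    "documents": [
        {
            "title": "Cluster Setup",
            "key": "admin_guide+cluster_setup",
            "id": "cluster_setup",
            "tags": ["admin"],
            "documents": [],     # recursion continues here
        }
    ],
}

print(json.dumps(doc, indent=2))
```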
22.
DITA to JSON 4: Sub-document
Field       Source
Title       Topic title
ID          Topic filename
Unique key  Top-level document filename + topic filename
Ancestors   List of ancestor topics at all levels
Summary*    Topic shortdesc
Body        Topic body
HREF        Topic path + topic filename
Documents*  List of sub-documents
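The field mapping above can be sketched as a small constructor. This is a hedged illustration only: the key names (`title`, `filename`, `path`, etc.) and the `+` separator in the unique key are assumptions, not the authors’ actual implementation.

```python
def make_subdocument(top_filename, topic, ancestors):
    """Build a sub-document record following the field/source mapping:
    unique key = top-level document filename + topic filename, with
    ancestry carried in for breadcrumbs. Field names are assumed."""
    return {
        "title": topic["title"],                        # topic title
        "id": topic["filename"],                        # topic filename
        "key": top_filename + "+" + topic["filename"],  # globally unique key
        "ancestors": ancestors,                         # ancestors at all levels
        "summary": topic.get("shortdesc"),              # optional shortdesc
        "body": topic["body"],                          # topic body
        "href": topic["path"] + "/" + topic["filename"],
        "documents": [],                                # optional sub-documents
    }

# Hypothetical usage:
topic = {"title": "Cluster Setup", "filename": "cluster_setup.dita",
         "body": "<body/>", "path": "docs/admin"}
sub = make_subdocument("admin_guide", topic, ["root"])
```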
23.
Document Loading Process
1. Flatten each node
2. Create a unique ID
3. Establish ancestry
4. Convert relative image and cross references to absolute links
5. Create a standalone document for each node
6. Load to DB
7. Load to search index
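The flatten/ancestry steps of the loading process can be sketched as a recursive walk. A minimal sketch, assuming the recursive JSON structure shown earlier (a `documents` list on each node and an `id` per node); the real loader also rewrites links and pushes records to the DB and search index:

```python
def flatten(doc, ancestors=None):
    """Walk the recursive document, yielding one flat record per node.
    Each record carries its full ancestry, which the web application
    can use for breadcrumb navigation. Field names are assumptions."""
    ancestors = ancestors or []
    node = {k: v for k, v in doc.items() if k != "documents"}
    node["ancestors"] = ancestors
    yield node
    for child in doc.get("documents", []):
        yield from flatten(child, ancestors + [doc["id"]])

# Hypothetical usage on a three-level document:
guide = {"id": "root", "title": "Guide", "documents": [
    {"id": "a", "title": "A", "documents": [
        {"id": "a1", "title": "A1", "documents": []}]}]}
records = list(flatten(guide))
```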
31.
Conventions
› Images
› All image paths need to be converted to absolute paths. Having all of them in a
flat folder called “images” is one easy way to accomplish this.
› Cross References
› Cross reference links within the JSON are all relative. Like images, they need to
be converted to absolute links.
› JSON Tag Recursion
› It is tedious to add tags to all levels of the JSON Document, so most tags are
programmatically pulled through to all sub documents. Tags can be overridden
in children if desired.
› Permissions – can be set in source
› Anchors not supported
We currently have a single-page app, which makes anchors difficult but also somewhat
irrelevant, since each level is available as an independent link.
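The JSON tag-recursion convention (tags pulled through to all sub-documents, overridable in children) can be sketched as follows. This is an illustrative assumption about the mechanism, not the authors’ code; it assumes a `tags` key on each node of the recursive JSON document:

```python
def inherit_tags(doc, parent_tags=()):
    """Pull tags down to every sub-document. A child that declares its
    own non-empty tag list overrides the inherited tags; a child with
    no tags (or an empty list) inherits its parent's tags."""
    tags = doc.get("tags") or list(parent_tags)
    doc["tags"] = tags
    for child in doc.get("documents", []):
        inherit_tags(child, tags)
    return doc

# Hypothetical usage: one child inherits, one overrides.
tree = inherit_tags({"id": "root", "tags": ["admin"], "documents": [
    {"id": "c", "documents": []},
    {"id": "d", "tags": ["networking"], "documents": []}]})
```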
32.
What’s next?
› More publishing automation
Publishing is currently a two-step process: JSON publication followed by document loading.
It would be better to provide a one-step process controlled by the document publisher.
› Holistic approach
› Search cultivation
› Search analytics
› Chat
› Case Deflection Analysis driving documentation.
› Tag-based navigation
33.
Ben is less dissatisfied
Problems solved
• Apparently dynamic presentation
• Satisfactory context-sensitive help targets
• CMS/search loading
Problems not solved
• Static transformations
Problems created
• Content removal
• Proofing
• Custom software
Editor's Notes
Key Points:
At its core, Nutanix eliminates complexity in the datacenter
One of the root causes of complexity is the data storage architecture, specifically the storage network
The Nutanix Virtual Computing Platform gets rid of the SAN and brings compute and storage together for virtualized environments
This approach eliminates network bottlenecks and simplifies the architecture. This is particularly important with flash storage because the network can become a chokepoint for the system
With a Nutanix solution, customers can easily add additional compute and storage by adding nodes on the go
Software documentation
Feature and task
Text, image, video
Context-sensitive help
Release documentation
Release notes
Upgrade instructions
Hardware documentation
Replacement procedures
System specifications
Text, image, video
We were publishing in PDF, which was bad for findability.
Then also publishing in WebHelp, which created silos per document.
It was difficult to use a web CMS (e.g., Drupal) as a publishing endpoint; import and update were complicated.
High page count
Deep nesting and poor scoping of pages
Mismatch between page (8.5x11) and topic (standalone piece of information, variable length)
Alignment between page and topic
Small pieces without a clear scope of relationships; related only in a TOC with the same deep nesting
From Mark Baker
From Nielsen Norman Group
http://www.nngroup.com/articles/information-scent/
Information foraging uses the analogy of wild animals gathering food to analyze how humans collect information online.
Information foraging's most famous concept is information scent: users estimate a given hunt's likely success from the spoor: assessing whether their path exhibits cues related to the desired outcome. Informavores will keep clicking as long as they sense (to mix metaphors) that they're "getting warmer" -- the scent must keep getting stronger and stronger, or people give up. Progress must seem rapid enough to be worth the predicted effort required to reach the destination.
Secondly, as users drill down the site, each page should clearly indicate that they're still on the path to the food. In other words, provide feedback about the current location and how it relates to users' tasks.
Would like to be able to present a page at any of these levels. With the standard tools, only document (monolithic) and topic (fragmented) levels are possible.
Want to keep the granular authoring and management
Manual chunking (using @chunk) is of limited value
Chunking is static
It’s possible to envision how to have multiple chunk outputs but not how to handle them.
Over to Owen.
Is using XSLT too hard? No, the OT already uses it for all output types. Under 300 lines to read HTML2 output and create a single JSON file.
New XSLT for each doc type? No, processing is generic.
Publish JSON, PDF, ePUB
Analyze into 8 pages
Process all possible chunk combinations
A single JSON document is loaded into a DB and a Search Index.
The recursive list of subdocuments is flattened
A single monolithic document is created for each sub-document.
Each recursive node contains ancestry information to create breadcrumbs
Table of Contents
The table of contents is created only for the top level document, not scoped for each subdocument.
Because siblings are shown in scope, a TOC becomes less relevant.
On mobile devices, we can look at TOC or content, saving space.
Links and Images
The JSON document is published with relative links.
The loading process converts these into absolute links.
Your automated loader is your infinite number of monkeys.
Demo hierarchy.ditamap
CSH: The target linked to isn’t just the obvious topic; it provides more context
Content removal: inconsistency between search results and available docs
Productize?