Confidential & Proprietary www.dclab.com
Developing and Implementing a QA Plan
When Converting Your Legacy Data
Naveh Greenberg,
Director, U.S. Defense Development,
Data Conversion Laboratory
Valuable Content Transformed
• Document Digitization
• XML and HTML Conversion
• eBook Production
• Hosted Solutions
• Big Data Automation
• Conversion Management
• Editorial Services
• Harmonizer
Experience the DCL Difference
DCL blends years of conversion experience with cutting-edge technology and the
infrastructure to make the process easy and efficient.
• World-Class Services
• Leading-Edge Technology
• Unparalleled Infrastructure
• US-Based Management
• Complex-Content Expertise
• 24/7 Online Project Tracking
• Automated Quality Control
• Global Capabilities
We Serve a Very Broad Client Base Spanning All Industries
• Aerospace
• Associations
• Defense
• Distribution
• Education
• Financial
• Government
• Libraries
• Life Sciences
• Manufacturing
• Medical
• Museums
• Periodicals
• Professional
• Publishing
• Reference
• Research
• Societies
• Software
• STM
• Technology
• Telecommunications
• Universities
• Utilities
Agenda
• What makes conversion difficult?
• Planning for a good conversion experience
• Implementing your plan
• Examples
• Q&A
What Makes Conversion Difficult
• The usual conversion issues
– Accuracy of the transferred text
– Tables
– Math & Special Characters
– Determining the correct hierarchy
– Pages & most formatting are not in the XML/SGML
– Irrelevant Cross-References
– Identifying reusable content
– Writer Creativity in Source Material
• And the people issues
– Getting used to a new “document” paradigm
– Agreeing on conversion rules
– Involving all stakeholders
Most Importantly – Plan!!!
• Ask the important initial questions
– Who are the stakeholders? Who is the final client/user?
– What is the estimated volume and deadline?
– What is the standard?
– What CMS or rendering tools will be used?
– What are we starting with? Not all source data are created equal.
– What is the budget?
• Learn from others
– Join discussion groups.
– Review case studies and lessons learned.
• Prepare for the next step
– Get your hands on the source data, schemas, and a sample of converted data.
– Build a solid team.
– Develop a solid process.
“If I had eight hours to chop down a tree, I'd spend six sharpening my ax.”
- Abraham Lincoln
DCL’s Project Start-up Methodology
Inventory & Assessment
• Log the batches received into a production control system.
• By logging and tracking each unit, you gather information that can be used to:
– Project delivery schedules
– Confirm that processes are working properly
– Track each unit and show which step of the production process it is in
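As a rough illustration of unit-level tracking, the sketch below logs each unit and reports how many units sit in each production step. The step names, the `Unit` class, and the sample identifiers are hypothetical; a real production control system defines its own workflow and fields.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date

# Hypothetical production steps; a real workflow defines its own.
STEPS = ["received", "conversion", "qc", "delivered"]


@dataclass
class Unit:
    unit_id: str
    batch_id: str
    received_on: date
    step: str = "received"
    history: list = field(default_factory=list)

    def advance(self, when: date) -> None:
        """Move the unit to the next step and record when it left the old one."""
        idx = STEPS.index(self.step)
        if idx + 1 >= len(STEPS):
            raise ValueError(f"{self.unit_id} is already delivered")
        self.history.append((self.step, when))
        self.step = STEPS[idx + 1]


def units_per_step(units):
    """Count the units in each step, a quick check that nothing is stuck."""
    counts = Counter(u.step for u in units)
    return {step: counts.get(step, 0) for step in STEPS}


# Illustrative data only.
log = [
    Unit("TM-001", "batch-A", date(2024, 1, 8)),
    Unit("TM-002", "batch-A", date(2024, 1, 8)),
    Unit("TM-003", "batch-B", date(2024, 1, 15)),
]
log[0].advance(date(2024, 1, 10))
log[0].advance(date(2024, 1, 12))
log[1].advance(date(2024, 1, 11))

status = units_per_step(log)
print(status)
```

The per-unit `history` of step transitions is the kind of data that could later feed delivery projections.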
Inventory & Assessment: What to Convert, and in What Order
• Categorizing
– Active documents in good shape
– Active documents that need a lot of work
– Somewhat inactive documents that will likely be retired
– Archival materials
• Prioritizing
– Documents that are most used
– Documents that are customer favorites
– Documents with longest product life
– Start with the most recent documents and work backward
• Identifying the process
– Can be converted as is
– Can be converted with some work
– Needs to be rewritten
– Don’t convert – just keep archival copies
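The categorizing, prioritizing, and process decisions above could be captured in a small triage script. This is a minimal sketch under assumed rules: the `Doc` fields, the condition labels, and the sort order (most used first, then most recent) are illustrative choices, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    active: bool
    condition: str      # hypothetical labels: "good", "poor", "unusable"
    monthly_views: int  # stand-in for "most used"
    year: int           # last revision year


def conversion_path(doc):
    """Map a document's category to one of the four process options."""
    if not doc.active:
        return "keep archival copy only"
    if doc.condition == "good":
        return "convert as is"
    if doc.condition == "poor":
        return "convert with some work"
    return "rewrite"


def priority_order(docs):
    """Order the documents to be converted: most used first, then most recent."""
    candidates = [d for d in docs if d.active]
    return sorted(candidates, key=lambda d: (-d.monthly_views, -d.year))


# Illustrative inventory only.
docs = [
    Doc("Engine Manual", True, "good", 900, 2021),
    Doc("Wiring Guide", True, "poor", 1200, 2019),
    Doc("Legacy Parts List", False, "poor", 15, 2003),
]

paths = {d.title: conversion_path(d) for d in docs}
order = [d.title for d in priority_order(docs)]
print(paths)
print(order)
```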
Why Is Reuse Analysis Important?
• Increased consistency
• Reduced development time
• Lower maintenance costs
• Rapid reconfiguration
• Find typos and applicability differences
• Divide and conquer
Content Reuse Analysis Reports
• Finding exact or similar text will help you when mapping to Data Modules
• It will also help to detect applicability and inconsistencies
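One simple way to approximate such a report is to compare paragraphs pairwise and flag exact and near matches. The sketch below uses Python's standard `difflib.SequenceMatcher`; the 0.85 threshold, the paragraph identifiers, and the sample text are assumptions for illustration, and a production reuse analysis would likely use a more scalable technique.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return " ".join(text.lower().split())


def reuse_report(paragraphs, threshold=0.85):
    """Return (id_a, id_b, kind, ratio) for exact or near-duplicate pairs."""
    report = []
    for (id_a, a), (id_b, b) in combinations(paragraphs.items(), 2):
        na, nb = normalize(a), normalize(b)
        if na == nb:
            report.append((id_a, id_b, "exact", 1.0))
            continue
        ratio = SequenceMatcher(None, na, nb).ratio()
        if ratio >= threshold:
            report.append((id_a, id_b, "similar", round(ratio, 2)))
    return report


# Illustrative paragraphs; the third contains a typo ("acess").
paragraphs = {
    "manual-A/p12": "Remove the four bolts from the access panel.",
    "manual-B/p07": "Remove the four bolts from the access panel.",
    "manual-C/p31": "Remove the four bolts from the acess panel.",
    "manual-C/p44": "Drain the hydraulic fluid before servicing.",
}

matches = reuse_report(paragraphs)
for match in matches:
    print(match)
```

Near matches such as the misspelled paragraph are the cases worth a manual look: they may be typos, or they may be deliberate variations that signal applicability.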
Document Analysis – Text extraction
The Conversion Specification
Normalizing Your Data
<para>1. Clean the Engine.</para>
<step1><para>Clean the Engine.</para></step1>
<seqlist><item>Clean the Engine.</item></seqlist>
<entry>1.</entry><entry>Clean the Engine. </entry>
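The four fragments above encode the same step in four different ways. A minimal sketch of normalizing them to a single form follows; choosing `<step1><para>` as the target markup is an assumption made for illustration, and the regular expression only handles simple leading labels such as "1.".

```python
import re
import xml.etree.ElementTree as ET

# The four source encodings of the same step.
VARIANTS = [
    "<para>1. Clean the Engine.</para>",
    "<step1><para>Clean the Engine.</para></step1>",
    "<seqlist><item>Clean the Engine.</item></seqlist>",
    "<entry>1.</entry><entry>Clean the Engine. </entry>",
]

# Leading hand-typed step number such as "1." (simple labels only).
LABEL = re.compile(r"^\s*\d+\.\s*")


def step_text(fragment):
    """Extract the step wording, dropping markup and any manual numbering."""
    # Wrap in a dummy root so fragments with sibling elements still parse.
    root = ET.fromstring(f"<root>{fragment}</root>")
    text = " ".join("".join(root.itertext()).split())
    return LABEL.sub("", text)


def normalize(fragment):
    """Re-emit the step in the assumed target form: <step1><para>...</para></step1>."""
    step = ET.Element("step1")
    ET.SubElement(step, "para").text = step_text(fragment)
    return ET.tostring(step, encoding="unicode")


normalized = {normalize(v) for v in VARIANTS}
print(normalized)
```

Collapsing the variants into a set shows whether they truly normalize to one form; more than one element in the set means a source pattern the rules do not yet cover.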
Viewing Your Converted Data During QC
Q&A
Naveh Greenberg
Director, U.S. Defense Development,
Data Conversion Laboratory
(718) 307-5758
ngreenberg@dclab.com
@dclaboratory

Developing and Implementing a QA Plan During Your Legacy Data to S1000D