DCL provides document conversion services using a hybrid approach. They blend years of experience with cutting-edge technology and infrastructure to make the conversion process easy and efficient for clients in various industries. DCL summarized several case studies of their work, including converting a large scientific journal collection, converting training materials for a technology company, and auditing converted documents for an engineering company supplying the US Air Force. They recommend clients consider which parts of the conversion process are their core business and which risks they want to take on to determine the best option of outsourcing, insourcing, or partnering for document conversion needs.
Mark Gross CEO Data Conversion Lab
1. Mark Gross, Founder and CEO, Data Conversion Laboratory
Creating a Hybrid Approach to Legacy Conversion
16 May, 2014
2. Valuable Content Transformed
• Document Digitization
• XML and HTML Conversion
• eBook Production
• Hosted Solutions
• Big Data Automation
• Conversion Management
• Editorial Services
• Harmonizer
3. Experience the DCL Difference
DCL blends years of conversion experience with cutting-edge technology and
the infrastructure to make the process easy and efficient.
• World-Class Services
• Leading-Edge Technology
• Unparalleled Infrastructure
• US-Based Management
• Complex-Content Expertise
• 24/7 Online Project Tracking
• Automated Quality Control
• Global Capabilities
7. Conversion Setup Components in Detail
Inventory & Assessment
• Identify materials that are candidates for conversion
• Assess the material’s importance, how it might be used
• Classify and prioritize
Reuse Analysis
• Analyze documents to identify potentially redundant materials
• Normalize documents to maximize reusability
Document Analysis
• Evaluate document sources to determine the relative ease & accuracy of content extraction
• Identify metadata sources
• Identify the types of information in the documents and the appropriate level of tagging
• Identify processes for various materials
• Identify a suitable DTD or Schema
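The reuse-analysis step (identifying potentially redundant materials) can be illustrated with a minimal sketch. It assumes plain-text extracts are already available and treats two documents as redundant only when their text matches after lowercasing and whitespace normalization. The function names and sample file names are hypothetical and are not taken from DCL's tooling; a production process would likely also need near-duplicate matching.

```python
import hashlib
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_redundant(docs: dict[str, str]) -> list[list[str]]:
    """Group document names whose normalized text is identical."""
    groups = defaultdict(list)
    for name, text in docs.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        groups[digest].append(name)
    # Only groups with more than one member are redundancy candidates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical sample inventory (assumption, for illustration only).
docs = {
    "manual_v1.txt": "Remove the  panel.\nInspect the seal.",
    "manual_v2.txt": "remove the panel. inspect the seal.",
    "guide.txt": "Replace the filter.",
}
duplicates = find_redundant(docs)
print(duplicates)
```

Exact matching after normalization is the simplest possible choice; it flags only content that is safe to consolidate without review.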
8. Conversion Setup Components in Detail (cont’d)
Conversion Specification
• Detailed analysis of documents by type
• Review enough documents to understand the potential variations
• Develop tagging instructions
• Prepare specification
Architecture Design & Configuration
• Load balancing
• Capacity requirements
• Hardware requirements
Design & Develop Conversion SW
• Identify conversion SW requirement
• Evaluate tools
• Identify manual conversion needs
• Develop or modify conversion software per conversion specification
9. Conversion Setup Components in Detail (cont’d)
Design & Develop Automation & Workflow SW
• Identify the various steps and plan a workflow
• Evaluate control and QA mechanisms that will be needed
• Design workflow process to route documents appropriately
Conversion Software Testing
• Prepare a test plan
• Develop a document test baseline
• Create process to test documents coming through conversion flow
• Create process for:
− random testing
− testing new material types
− software changes
Training
• XML training
• Company standards training
• How to write for XML
10. Conversion Production Components in Detail
Organizing Content for Conversion
• Pulling content together from the various locations
• Delivering to the processing group
• Logging content into the workflow system
Hosting & Running Conversion SW
• Maintaining facility to run software and keep it updated
• Monitor performance and operations
• Sample materials on a continual basis
Hosting & Running Automation & Workflow SW
• Maintain facility to route materials between software and manual operations
• Monitor performance and keep software and process updated
11. Conversion Production Components in Detail (cont’d)
Scanning & OCR
• Paper preparation
• Scanning & zoning
• OCR processing
Image Processing
• Image extraction
• Resizing and image correction
• Image conversion
Proofreading
• Proofread to required level of accuracy
• How much can automation do?
12. Conversion Production Components in Detail (cont’d)
Pre-Conversion Document Preparation
• Export text to normalized form
• Automated & Manual pre-tagging
• Pre-conversion review
• Styling QC
• SME (subject matter expert) support
Conversion
• Automated conversion
• Tagged output
Parse/View
• Parse document
• Review error logs and correct until validated
• Render document for viewing with images
• View document and correct errors
• Image review
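The parse step (parse each document, then review the error log and correct until it validates) can be sketched as follows. This is a minimal illustration using Python's standard library, which checks only XML well-formedness; validation against a DTD or Schema would need a validating parser, which is not shown. The function name and the sample batch are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_errors(documents: dict[str, str]) -> dict[str, str]:
    """Return {name: error message} for documents that fail to parse."""
    errors = {}
    for name, xml_text in documents.items():
        try:
            ET.fromstring(xml_text)
        except ET.ParseError as exc:
            # The message includes the line and column of the failure.
            errors[name] = str(exc)
    return errors


# Hypothetical converted output (assumption, for illustration only).
batch = {
    "article-001.xml": "<article><title>Ok</title></article>",
    "article-002.xml": "<article><title>Broken</article>",
}
error_log = parse_errors(batch)
for name, message in error_log.items():
    print(f"{name}: {message}")
```

Rerunning the check after each round of corrections, until the error log is empty, mirrors the "correct until validated" loop on the slide.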
13. Conversion Production Components in Detail (cont’d)
Quality Control
• Execute test plans
• Automated and Manual QC
• Fix errors or provide feedback
• Random sampling
• Continuous improvement
Reporting, Audit and Reconciliation
• Management reporting
• Process monitoring
• Exception reporting
• Audit and reconciliation of production throughput
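Two of the items above, random sampling and reconciliation of production throughput, lend themselves to a short sketch. It assumes each document has a unique identifier logged on receipt and on delivery. The function names, the 10% sampling rate, and the identifiers are hypothetical choices for illustration, not DCL's actual procedure.

```python
import random


def reconcile(received: set[str], delivered: set[str]) -> dict[str, list[str]]:
    """Compare what came in against what went out."""
    return {
        "missing": sorted(received - delivered),      # logged in, never delivered
        "unexpected": sorted(delivered - received),   # delivered, never logged in
    }


def qc_sample(delivered: set[str], rate: float, seed: int = 0) -> list[str]:
    """Pick a reproducible random subset of delivered documents for manual QC."""
    ordered = sorted(delivered)
    if not ordered:
        return []
    size = max(1, round(len(ordered) * rate))
    return sorted(random.Random(seed).sample(ordered, size))


# Hypothetical identifiers (assumption, for illustration only).
received = {f"doc-{i:03d}" for i in range(1, 21)}
delivered = received - {"doc-007"}

report = reconcile(received, delivered)
sample = qc_sample(delivered, rate=0.10)
print(report)
print(sample)
```

A fixed seed keeps the sample reproducible for audit purposes; a production process would more likely record the seed per batch than hard-code it.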
14. Various Skills You May Need on Board
• Consultant/Strategist
• Architecture Developer/Specialization Expert
• Trainers
• XML/Content Experts
• Subject Matter Experts (SMEs)
• Project/Program Management
• Conversion Operators
• Production Tracking
• Software Developers
• Filter Developers
• IT
• QA Experts
• Editors/Writers/Authors
15. Consider Your Options …
• Outsource it all
• Convert in-house
• Partner with an expert
• All of the above
16. Case Study 1: Converting a Large Content Repository
• Client Situation
- Build a database of scientific journals – 750,000 pages spanning almost 100 years
- Complex materials with lots of math, tables, and images
- Multiple formats and types needed to be normalized to a manageable database to produce
new products, and support future products not yet conceived
- The organization wanted to keep its limited personnel resources focused on their expertise
• Approach
- Flexibility - The size and breadth of the collection made it impractical to develop full
specifications in advance.
- Develop an overall specification, with allowance for change as new scenarios are discovered
− Software development sprints to incorporate changes
− Close collaboration between vendor and client to manage new situations
− The organization leveraged its knowledge of its materials to identify potential problems in
advance, sequence the materials, and actively review materials as they were produced
− Frequent review meetings to assess nuances in new materials as they came up
• Results
− This is a three-year project, due to be completed this summer
− On schedule and on budget, with several new products already developed and out on the
market
− The close collaboration and involvement of the client shaved 6-8 months off the project
schedule, and created a product that met all goals.
18. Case Study 2: International Technology Hardware and Software Company
• Client Situation
- Company has developed many thousands of hours of instructional materials it wants to
centralize and convert to XML using a SCORM-based Schema
- Materials included slides, video and taped lectures, written materials in various forms
- Goal was to identify the re-usable assets and to normalize these materials so that this
library of reusable assets can be reused for training its own engineers and other personnel
- Some materials would be offered for external training
- The materials were very specialized and subject matter expertise (SME) input was needed to
review all materials
• Approach
- DCL integrated as part of the client’s team
- DCL prepared transcripts of all oral materials with timings keyed to PowerPoint Slides
- DCL copyedited transcripts and PowerPoint slides and normalized style for both
- Client provided SME and legal review of transcripts
- Client re-recorded any needed voice-overs
- Client created Flash format for web publishing
- DCL created integrated XML products for loading into the client educational database
• Results
- Full integration of client and DCL teams allowed for a rapid ramp-up to produce a pilot and move
into larger production
- Client was able to use its own personnel who knew the product well for SME support
- The client also contracted with another engineering company to provide additional SME
support for those products that could be supported by outside engineers
20. Case Study 3: Engineering Company Supplying the US Air Force
• Client Situation
- Materials were to be converted from SGML and delivered in S1000D
- Company had created a fully automated conversion; Air Force wanted an independent audit
of the converted documents
• Approach
- Client had developed the conversion specification, and converted the documents to S1000D
- DCL validated that the final XML met S1000D requirements
- DCL developed a conversion plan and tools to perform the audit
- DCL performed both automated and manual analysis and review of the conversion processes
and converted documents checking for inventory accuracy, tagging accuracy, and text
accuracy of tags and tag values
- DCL performed 100% audit of all materials and reported results, along with suggestions to
the client and to the Air Force
• Results
- Client was able to utilize DCL’s S1000D expertise and take advantage of DCL’s automated audit and
QA tools
- The client produced a better product as a result of feedback DCL was able to provide
- Air Force received a fully audited document set that satisfied their independent review requirement
22. Case Study 4: Large Journal Publisher with Facilities in China and India
• Client Situation
- Ongoing publishing operations with good understanding of its work flow and requirements
- Growing very quickly and needing to ramp up its capacity to convert author-written articles from
Word and PDF into XML
- Has in-place facilities in China to handle process management and labor-intensive tasks
- Had been building its own software capability, but it was taking longer than expected
- Wanted to take advantage of DCL’s infrastructure for conversion and workflow while maintaining its
own facilities for the human processing tasks
• Approach
- DCL configured its workflow and conversion software to the client’s requirements
- But instead of using DCL’s facilities, all preliminary work and all manual work was routed by the
workflow system directly to the client’s facility.
• Results
- Process made use of DCL’s existing infrastructure and software, which were quickly reconfigured to
the client’s specification; the client was able to improve the automation of its process quickly and at lower cost
- Client was able to take advantage of the efficient facilities and infrastructure it had put into place
- DCL would monitor software and provide enhancements and updates as needed
- DCL would provide backup capability for overflow surges
24. The Model That Maximizes Results and Minimizes Risk is Best for Your Organization
Ask yourself these questions to help make the determination ...
• Which parts of the process are your core business?
• Will this be a permanent process, or a limited time project?
• Do you have the needed in-house expertise?
• Do you want to build the staff and infrastructure?
• What are the risks?
• What combination will be best for your business?
… the good news – it’s not “one size fits all” anymore
“You don’t have to go it alone.”