Confidential & Proprietary www.dclab.com
Preparing Your Legacy Data for Automation
in S1000D
Naveh Greenberg,
Director, U.S. Defense Development,
Data Conversion Laboratory
Valuable Content Transformed
• Document Digitization
• XML and HTML Conversion
• eBook Production
• Hosted Solutions
• Big Data Automation
• Conversion Management
• Editorial Services
• Harmonizer
Experience the DCL Difference
DCL blends years of conversion experience with cutting-edge technology and the
infrastructure to make the process easy and efficient.
• World-Class Services
• Leading-Edge Technology
• Unparalleled Infrastructure
• US-Based Management
• Complex-Content Expertise
• 24/7 Online Project Tracking
• Automated Quality Control
• Global Capabilities
We Serve a Very Broad Client Base . . .
. . . Spanning All Industries
• Aerospace
• Associations
• Defense
• Distribution
• Education
• Financial
• Government
• Libraries
• Life Sciences
• Manufacturing
• Medical
• Museums
• Periodicals
• Professional
• Publishing
• Reference
• Research
• Societies
• Software
• STM
• Technology
• Telecommunications
• Universities
• Utilities
What Makes S1000D Conversion Difficult
• S1000D is a conceptual departure from linear information – and
is difficult for many to get used to
• Turns the traditional book into a collection of data modules (DMs)
– Introductory material that applies to numerous DMs
– Placement of Warnings, Cautions and Notes
– Writer creativity
• Data module codes (DMCs) and business rules
– Assigning DMCs and information control numbers (ICNs)
– Hierarchy in Map Files (Publication Module)
– Data can fit more than one information code
• …but your documents probably weren’t designed to do this.
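To make the DMC bullet more concrete, here is a minimal Python sketch that splits a data module code into named parts. The field layout is an assumption based on the general S1000D Issue 4.x pattern. The regular expression, the `parse_dmc` helper, and the sample code are illustrative only, and a project's business rules may restrict the fields further.

```python
import re

# Assumed DMC layout (S1000D Issue 4.x style). Check the issue and the
# business rules of your own project before relying on these field lengths.
DMC_PATTERN = re.compile(
    r"^DMC-"
    r"(?P<model_ident>[A-Z0-9]{2,14})-"
    r"(?P<system_diff>[A-Z0-9]{1,4})-"
    r"(?P<system>[A-Z0-9]{2,3})-"
    r"(?P<sub_system>[A-Z0-9])(?P<sub_sub_system>[A-Z0-9])-"
    r"(?P<assy>[A-Z0-9]{4}|[A-Z0-9]{2})-"
    r"(?P<disassy>[A-Z0-9]{2})(?P<disassy_variant>[A-Z0-9]{1,3})-"
    r"(?P<info_code>[A-Z0-9]{3})(?P<info_code_variant>[A-Z0-9])-"
    r"(?P<item_location>[A-Z0-9])$"
)


def parse_dmc(dmc: str) -> dict:
    """Split a DMC string into its fields, or raise ValueError if it does not fit."""
    match = DMC_PATTERN.match(dmc.strip().upper())
    if match is None:
        raise ValueError(f"not a recognizable DMC: {dmc!r}")
    return match.groupdict()


# Hypothetical sample code used only for illustration.
fields = parse_dmc("DMC-S1000DBIKE-AAA-D00-00-00-00AA-041A-A")
for name, value in fields.items():
    print(f"{name:18} {value}")
```

Because the same legacy paragraph can fit more than one information code, a check like this only confirms that an assigned DMC is well formed. Choosing the right code is still a business-rules decision.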
Structuring a Book into Data Modules in S1000D
[Diagram: source books (PDF book, 38784 book, ATA book, early S1000D publication), organized by paragraphs, appendices, pageblocks, tasks, and subtasks, are split into Descriptive, Procedural, IPD, Wiring, Crew, Process, Maintenance, and Fault DMs in the S1000D Common Source Database. Publications 1, 2, and 3 are each assembled from a selection of those DMs.]
Further Complications in S1000D Conversion
• There are the usual conversion issues
– Accuracy of the transferred text
– Tables
– Math or odd-looking text
– Special characters
• There are also the structuring issues
– Identifying DMs
– Identifying reusable content
– Identifying Applicability
• And the people issues
– Getting rugged individualists to collaborate more
– Deciding what needs re-authoring
– Getting used to a new “document” paradigm
Most Importantly – Plan!!!
• Ask the important initial questions
˗ Who are the stakeholders? Who is the final client/user?
˗ What is the estimated volume and deadline?
˗ Source format. Not all source data are created equal.
˗ What version of S1000D?
˗ Do we know what CMS or rendering tools will be used?
˗ Budget?
• Ask around or join discussion groups.
• Get your hands on the source data, business rules, and schemas.
• Begin looking for the right people. You don’t need to be S1000D-savvy,
but you do need, at a minimum, to understand the concept.
Ask Questions
“If I had eight hours to chop
down a tree, I'd spend six
sharpening my ax.”
- Abraham Lincoln
DCL’s Project Start-up Methodology
What Does a Conversion Project Look Like?
Conversion Setup Components
• Inventory & Assessment
• Reuse Analysis
• Document Analysis
• Conversion Specification
• Architecture Design & Configuration
• Design & Develop Conversion SW
• Design & Develop Automation & Workflow SW
• Conversion SW Testing
• Training
Conversion Production Components
• Organizing Content for Conversion
• Hosting & Running Conversion SW
• Hosting & Running Automation & Workflow SW
• Scanning & OCR
• Image Processing
• Proofreading
• Pre-Conversion Document Preparation
• Conversion
• Parse/View Quality Control
• Reporting, Audit & Reconciliation
Inventory & Assessment
• Log the batches received into a production control system.
• By logging and tracking each unit you can gather information
that can be used to:
– Project delivery schedules
– Confirm that processes are working properly
– Track each unit and show which step of the production
process it is in.
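The logging and tracking described above can be sketched as a small in-memory production log. This is only an illustration. The step names, the `Unit` and `ProductionLog` classes, and the unit identifiers are hypothetical, and a real production control system would persist its data and support rework loops.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical step names, loosely based on the production components
# named in this deck.
STEPS = ["received", "scanning_ocr", "conversion", "quality_control", "delivered"]


@dataclass
class Unit:
    unit_id: str
    batch_id: str
    history: list = field(default_factory=list)  # (step, timestamp) pairs

    @property
    def current_step(self) -> str:
        return self.history[-1][0]


class ProductionLog:
    def __init__(self):
        self.units = {}

    def log_received(self, unit_id: str, batch_id: str) -> None:
        """Register a unit when its batch arrives."""
        unit = Unit(unit_id, batch_id)
        unit.history.append((STEPS[0], datetime.now(timezone.utc)))
        self.units[unit_id] = unit

    def advance(self, unit_id: str) -> None:
        """Move a unit to the next step and timestamp the move."""
        unit = self.units[unit_id]
        position = STEPS.index(unit.current_step)
        if position + 1 >= len(STEPS):
            raise ValueError(f"{unit_id} has already been delivered")
        unit.history.append((STEPS[position + 1], datetime.now(timezone.utc)))

    def where_is(self, unit_id: str) -> str:
        return self.units[unit_id].current_step

    def units_per_step(self) -> Counter:
        """Count units in each step, a starting point for schedule projections."""
        return Counter(unit.current_step for unit in self.units.values())


log = ProductionLog()
log.log_received("TM-001", "batch-01")
log.log_received("TM-002", "batch-01")
log.advance("TM-001")
print(log.where_is("TM-001"), dict(log.units_per_step()))
```

The timestamps kept in each unit's history are what would feed delivery projections and process checks: comparing time spent per step across units shows where the process is slowing down.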
Inventory & Assessment: What to Convert, and in What Order
• Categorizing
– Active documents in good shape
– Active documents that need a lot of work
– Somewhat inactive documents that will likely be retired
– Archival materials
• Prioritizing
– Documents that are most used
– Documents that are customer favorites
– Documents with longest product life
– Start with most recent documents and go back
• Identifying the process
– Can be converted as is
– Can be converted with some work
– Needs to be rewritten
– Don’t convert – just keep archival copies
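The categorizing, prioritizing, and process choices above could be captured in a simple triage script. The sketch below is one possible encoding, not a rule from the presentation. The `LegacyDoc` fields, the `rework_level` scale, and the decision to treat inactive documents as archival are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LegacyDoc:
    title: str
    active: bool
    archival: bool
    rework_level: str  # hypothetical scale: "none", "some", or "heavy"
    monthly_uses: int
    last_revised: date


def conversion_process(doc: LegacyDoc) -> str:
    """Pick a process for one document (assumed policy, adjust to your project)."""
    if doc.archival or not doc.active:
        return "keep archival copy"
    return {
        "none": "convert as is",
        "some": "convert with some work",
        "heavy": "rewrite",
    }[doc.rework_level]


def conversion_order(docs: list) -> list:
    """Most-used documents first; ties go to the most recently revised."""
    to_convert = [d for d in docs if conversion_process(d) != "keep archival copy"]
    return sorted(
        to_convert,
        key=lambda d: (-d.monthly_uses, -d.last_revised.toordinal()),
    )


docs = [
    LegacyDoc("Engine manual", True, False, "none", 120, date(2015, 3, 1)),
    LegacyDoc("Wiring manual", True, False, "heavy", 120, date(2016, 6, 1)),
    LegacyDoc("Retired radio set", False, False, "some", 2, date(2001, 1, 1)),
    LegacyDoc("Fault isolation guide", True, False, "some", 45, date(2014, 9, 1)),
]
for doc in conversion_order(docs):
    print(doc.title, "->", conversion_process(doc))
```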
Why Is Reuse Analysis Important?
• Increased consistency
• Reduced development time
• Lower maintenance costs
• Rapid reconfiguration
• Divide and conquer
Content Reuse Analysis Reports
• Finding exact or similar text will help you when mapping to Data Modules
• It will also help to detect applicability and inconsistencies
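A rough idea of how exact and similar text can be found is sketched below with Python's standard `difflib`. This is not the tool behind the reports in this deck. The `reuse_report` function, the 0.85 threshold, and the sample paragraphs are assumptions, and pairwise comparison like this only scales to small samples.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text: str) -> str:
    """Collapse whitespace and case so formatting noise does not hide reuse."""
    return re.sub(r"\s+", " ", text).strip().lower()


def reuse_report(paragraphs: dict, threshold: float = 0.85) -> list:
    """Return (key_a, key_b, kind, ratio) for paragraph pairs at or above threshold."""
    items = [(key, normalize(text)) for key, text in paragraphs.items()]
    report = []
    for (key_a, text_a), (key_b, text_b) in combinations(items, 2):
        if text_a == text_b:
            report.append((key_a, key_b, "exact", 1.0))
            continue
        ratio = SequenceMatcher(None, text_a, text_b).ratio()
        if ratio >= threshold:
            report.append((key_a, key_b, "similar", round(ratio, 2)))
    return report


# Hypothetical paragraphs keyed by source location.
sample = {
    "book1/para-1-2": "Remove the four bolts.",
    "book2/para-3-1": "remove  the four   bolts.",
    "book1/para-2-1": "Disconnect the battery before removing the access panel.",
    "book3/para-1-4": "Disconnect the battery before removing the rear access panel.",
    "book3/para-2-2": "Torque the nut to 25 ft-lb.",
}
for row in reuse_report(sample):
    print(row)
```

Exact matches are natural candidates for a single shared data module. Similar-but-not-identical pairs are the ones worth a human look, since the difference may be a genuine applicability variation or simply an inconsistency between authors.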
Document Analysis & Conversion Specification
• Evaluate document sources to determine the
relative ease & accuracy of content extraction
• Identify metadata sources
• Identify the types of information in the documents
and the appropriate level of tagging
• Identify processes for various materials
• Detailed analysis of documents by type
• Review enough documents to understand the
potential variations
• Develop tagging instructions
• Prepare specification
• Normalize your data
Document Analysis – Text extraction
[Figure: sample document text shown next to its OCR output]
The Conversion Specification (DMRL & specific rules)
Normalizing Your Data
Q&A
Naveh Greenberg
Director, U.S. Defense Development,
Data Conversion Laboratory
(718) 307-5758
ngreenberg@dclab.com
@dclaboratory

More Related Content

What's hot

What are the Strengths and Weaknesses of DITA Adoption?
What are the Strengths and Weaknesses of DITA Adoption?What are the Strengths and Weaknesses of DITA Adoption?
What are the Strengths and Weaknesses of DITA Adoption?dclsocialmedia
 
Managing Deliverable-Specific Link Anchors: New Suggested Best Practice for Keys
Managing Deliverable-Specific Link Anchors: New Suggested Best Practice for KeysManaging Deliverable-Specific Link Anchors: New Suggested Best Practice for Keys
Managing Deliverable-Specific Link Anchors: New Suggested Best Practice for Keysdclsocialmedia
 
DITA's New Thang: Going Mapless!
DITA's New Thang: Going Mapless!DITA's New Thang: Going Mapless!
DITA's New Thang: Going Mapless!dclsocialmedia
 
Anticipating Lightweight DITA
Anticipating Lightweight DITAAnticipating Lightweight DITA
Anticipating Lightweight DITAdclsocialmedia
 
Converting and Integrating Content When Implementing a New CMS
Converting and Integrating Content When Implementing a New CMSConverting and Integrating Content When Implementing a New CMS
Converting and Integrating Content When Implementing a New CMSdclsocialmedia
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Tugdual Grall
 
Content Conversion Done Right Saves More Than Money
Content Conversion Done Right Saves More Than MoneyContent Conversion Done Right Saves More Than Money
Content Conversion Done Right Saves More Than Moneydclsocialmedia
 
10 Million Dita Topics Can't Be Wrong
10 Million Dita Topics Can't Be Wrong10 Million Dita Topics Can't Be Wrong
10 Million Dita Topics Can't Be WrongIXIASOFT
 
Localization and DITA: What you Need to Know - LocWorld32
Localization and DITA: What you Need to Know - LocWorld32Localization and DITA: What you Need to Know - LocWorld32
Localization and DITA: What you Need to Know - LocWorld32IXIASOFT
 
Tackle your Documentation Challenges with the IXIASOFT DITA CMS
Tackle your Documentation Challenges with the IXIASOFT DITA CMSTackle your Documentation Challenges with the IXIASOFT DITA CMS
Tackle your Documentation Challenges with the IXIASOFT DITA CMSIXIASOFT
 
DataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesDataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesLars Albertsson
 
Sprinting to Success: Why Agile and DITA Work So Well Together
Sprinting to Success: Why Agile and DITA Work So Well TogetherSprinting to Success: Why Agile and DITA Work So Well Together
Sprinting to Success: Why Agile and DITA Work So Well TogetherIXIASOFT
 
M|18 How We Made the Move to MariaDB at FNI
M|18 How We Made the Move to MariaDB at FNIM|18 How We Made the Move to MariaDB at FNI
M|18 How We Made the Move to MariaDB at FNIMariaDB plc
 
Using a Fast Operational Database to Build Real-time Streaming Aggregations
Using a Fast Operational Database to Build Real-time Streaming AggregationsUsing a Fast Operational Database to Build Real-time Streaming Aggregations
Using a Fast Operational Database to Build Real-time Streaming AggregationsVoltDB
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013Mark Rittman
 
DITA for Small Teams Workshop (Tekom 2017)
DITA for Small Teams Workshop (Tekom 2017)DITA for Small Teams Workshop (Tekom 2017)
DITA for Small Teams Workshop (Tekom 2017)Contrext Solutions
 
4D Pubs - Distributed Dynamic Document Dsplay
4D Pubs - Distributed Dynamic Document Dsplay4D Pubs - Distributed Dynamic Document Dsplay
4D Pubs - Distributed Dynamic Document DsplayChris Despopoulos
 
The lean principles of data ops
The lean principles of data opsThe lean principles of data ops
The lean principles of data opsLars Albertsson
 
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a Time
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a TimeWebinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a Time
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a TimeMongoDB
 
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL DatabaseNuoDB
 

What's hot (20)

What are the Strengths and Weaknesses of DITA Adoption?
What are the Strengths and Weaknesses of DITA Adoption?What are the Strengths and Weaknesses of DITA Adoption?
What are the Strengths and Weaknesses of DITA Adoption?
 
Managing Deliverable-Specific Link Anchors: New Suggested Best Practice for Keys
Managing Deliverable-Specific Link Anchors: New Suggested Best Practice for KeysManaging Deliverable-Specific Link Anchors: New Suggested Best Practice for Keys
Managing Deliverable-Specific Link Anchors: New Suggested Best Practice for Keys
 
DITA's New Thang: Going Mapless!
DITA's New Thang: Going Mapless!DITA's New Thang: Going Mapless!
DITA's New Thang: Going Mapless!
 
Anticipating Lightweight DITA
Anticipating Lightweight DITAAnticipating Lightweight DITA
Anticipating Lightweight DITA
 
Converting and Integrating Content When Implementing a New CMS
Converting and Integrating Content When Implementing a New CMSConverting and Integrating Content When Implementing a New CMS
Converting and Integrating Content When Implementing a New CMS
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications
 
Content Conversion Done Right Saves More Than Money
Content Conversion Done Right Saves More Than MoneyContent Conversion Done Right Saves More Than Money
Content Conversion Done Right Saves More Than Money
 
10 Million Dita Topics Can't Be Wrong
10 Million Dita Topics Can't Be Wrong10 Million Dita Topics Can't Be Wrong
10 Million Dita Topics Can't Be Wrong
 
Localization and DITA: What you Need to Know - LocWorld32
Localization and DITA: What you Need to Know - LocWorld32Localization and DITA: What you Need to Know - LocWorld32
Localization and DITA: What you Need to Know - LocWorld32
 
Tackle your Documentation Challenges with the IXIASOFT DITA CMS
Tackle your Documentation Challenges with the IXIASOFT DITA CMSTackle your Documentation Challenges with the IXIASOFT DITA CMS
Tackle your Documentation Challenges with the IXIASOFT DITA CMS
 
DataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesDataOps - Lean principles and lean practices
DataOps - Lean principles and lean practices
 
Sprinting to Success: Why Agile and DITA Work So Well Together
Sprinting to Success: Why Agile and DITA Work So Well TogetherSprinting to Success: Why Agile and DITA Work So Well Together
Sprinting to Success: Why Agile and DITA Work So Well Together
 
M|18 How We Made the Move to MariaDB at FNI
M|18 How We Made the Move to MariaDB at FNIM|18 How We Made the Move to MariaDB at FNI
M|18 How We Made the Move to MariaDB at FNI
 
Using a Fast Operational Database to Build Real-time Streaming Aggregations
Using a Fast Operational Database to Build Real-time Streaming AggregationsUsing a Fast Operational Database to Build Real-time Streaming Aggregations
Using a Fast Operational Database to Build Real-time Streaming Aggregations
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013
 
DITA for Small Teams Workshop (Tekom 2017)
DITA for Small Teams Workshop (Tekom 2017)DITA for Small Teams Workshop (Tekom 2017)
DITA for Small Teams Workshop (Tekom 2017)
 
4D Pubs - Distributed Dynamic Document Dsplay
4D Pubs - Distributed Dynamic Document Dsplay4D Pubs - Distributed Dynamic Document Dsplay
4D Pubs - Distributed Dynamic Document Dsplay
 
The lean principles of data ops
The lean principles of data opsThe lean principles of data ops
The lean principles of data ops
 
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a Time
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a TimeWebinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a Time
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a Time
 
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
 

Viewers also liked

Minimalism Revisited — Let’s Stop Developing Content that No One Wants
Minimalism Revisited — Let’s Stop Developing Content that No One WantsMinimalism Revisited — Let’s Stop Developing Content that No One Wants
Minimalism Revisited — Let’s Stop Developing Content that No One Wantsdclsocialmedia
 
DITA for Small Teams: An Open Source Approach to DITA Content Management
DITA for Small Teams: An Open Source Approach to DITA Content ManagementDITA for Small Teams: An Open Source Approach to DITA Content Management
DITA for Small Teams: An Open Source Approach to DITA Content Managementdclsocialmedia
 
Content Engineering and The Internet of “Smart” Things
Content Engineering and The Internet of “Smart” ThingsContent Engineering and The Internet of “Smart” Things
Content Engineering and The Internet of “Smart” Thingsdclsocialmedia
 
Introduction to Structured Authoring
Introduction to Structured AuthoringIntroduction to Structured Authoring
Introduction to Structured Authoringdclsocialmedia
 
Optimizing the DITA Authoring Experience
Optimizing the DITA Authoring ExperienceOptimizing the DITA Authoring Experience
Optimizing the DITA Authoring Experiencedclsocialmedia
 
There's Gold in Them Thar Data
There's Gold in Them Thar DataThere's Gold in Them Thar Data
There's Gold in Them Thar Datadclsocialmedia
 
New Directions 2015 – Changes in Content Best Practices
New Directions 2015 – Changes in Content Best PracticesNew Directions 2015 – Changes in Content Best Practices
New Directions 2015 – Changes in Content Best Practicesdclsocialmedia
 
Precision Content™ Tools, Techniques, and Technology
Precision Content™ Tools, Techniques, and TechnologyPrecision Content™ Tools, Techniques, and Technology
Precision Content™ Tools, Techniques, and Technologydclsocialmedia
 
Using HTML5 to Deliver and Monetize Your Mobile Content
Using HTML5 to Deliver and Monetize Your Mobile ContentUsing HTML5 to Deliver and Monetize Your Mobile Content
Using HTML5 to Deliver and Monetize Your Mobile Contentdclsocialmedia
 
10 Mistakes When Moving to Topic-Based Authoring
10 Mistakes When Moving to Topic-Based Authoring10 Mistakes When Moving to Topic-Based Authoring
10 Mistakes When Moving to Topic-Based Authoringdclsocialmedia
 
DITA, EPUB, and HTML5: An Update for 2015
DITA, EPUB, and HTML5: An Update for 2015DITA, EPUB, and HTML5: An Update for 2015
DITA, EPUB, and HTML5: An Update for 2015dclsocialmedia
 
Demystifying SPL for Medical Devices
Demystifying SPL for Medical DevicesDemystifying SPL for Medical Devices
Demystifying SPL for Medical Devicesdclsocialmedia
 
Marketing and Strategy and Bears... oh my!
Marketing and Strategy and Bears... oh my!Marketing and Strategy and Bears... oh my!
Marketing and Strategy and Bears... oh my!dclsocialmedia
 

Viewers also liked (14)

Minimalism Revisited — Let’s Stop Developing Content that No One Wants
Minimalism Revisited — Let’s Stop Developing Content that No One WantsMinimalism Revisited — Let’s Stop Developing Content that No One Wants
Minimalism Revisited — Let’s Stop Developing Content that No One Wants
 
DITA for Small Teams: An Open Source Approach to DITA Content Management
DITA for Small Teams: An Open Source Approach to DITA Content ManagementDITA for Small Teams: An Open Source Approach to DITA Content Management
DITA for Small Teams: An Open Source Approach to DITA Content Management
 
Content Engineering and The Internet of “Smart” Things
Content Engineering and The Internet of “Smart” ThingsContent Engineering and The Internet of “Smart” Things
Content Engineering and The Internet of “Smart” Things
 
Introduction to Structured Authoring
Introduction to Structured AuthoringIntroduction to Structured Authoring
Introduction to Structured Authoring
 
Optimizing the DITA Authoring Experience
Optimizing the DITA Authoring ExperienceOptimizing the DITA Authoring Experience
Optimizing the DITA Authoring Experience
 
There's Gold in Them Thar Data
There's Gold in Them Thar DataThere's Gold in Them Thar Data
There's Gold in Them Thar Data
 
Metadata Matters
Metadata MattersMetadata Matters
Metadata Matters
 
New Directions 2015 – Changes in Content Best Practices
New Directions 2015 – Changes in Content Best PracticesNew Directions 2015 – Changes in Content Best Practices
New Directions 2015 – Changes in Content Best Practices
 
Precision Content™ Tools, Techniques, and Technology
Precision Content™ Tools, Techniques, and TechnologyPrecision Content™ Tools, Techniques, and Technology
Precision Content™ Tools, Techniques, and Technology
 
Using HTML5 to Deliver and Monetize Your Mobile Content
Using HTML5 to Deliver and Monetize Your Mobile ContentUsing HTML5 to Deliver and Monetize Your Mobile Content
Using HTML5 to Deliver and Monetize Your Mobile Content
 
10 Mistakes When Moving to Topic-Based Authoring
10 Mistakes When Moving to Topic-Based Authoring10 Mistakes When Moving to Topic-Based Authoring
10 Mistakes When Moving to Topic-Based Authoring
 
DITA, EPUB, and HTML5: An Update for 2015
DITA, EPUB, and HTML5: An Update for 2015DITA, EPUB, and HTML5: An Update for 2015
DITA, EPUB, and HTML5: An Update for 2015
 
Demystifying SPL for Medical Devices
Demystifying SPL for Medical DevicesDemystifying SPL for Medical Devices
Demystifying SPL for Medical Devices
 
Marketing and Strategy and Bears... oh my!
Marketing and Strategy and Bears... oh my!Marketing and Strategy and Bears... oh my!
Marketing and Strategy and Bears... oh my!
 

Similar to Preparing Legacy Data for S1000D Automation

Creating a Hybrid Approach to Legacy Conversion
Creating a Hybrid Approach to Legacy ConversionCreating a Hybrid Approach to Legacy Conversion
Creating a Hybrid Approach to Legacy Conversiondclsocialmedia
 
DesignMind SQL Server 2008 Migration
DesignMind SQL Server 2008 MigrationDesignMind SQL Server 2008 Migration
DesignMind SQL Server 2008 MigrationMark Ginnebaugh
 
Presentation application change management and data masking strategies for ...
Presentation   application change management and data masking strategies for ...Presentation   application change management and data masking strategies for ...
Presentation application change management and data masking strategies for ...xKinAnx
 
Automating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittAutomating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittDatabricks
 
6. real time integration with odi 11g & golden gate 11g & dq 11g 20101103 -...
6. real time integration with odi 11g & golden gate 11g & dq 11g   20101103 -...6. real time integration with odi 11g & golden gate 11g & dq 11g   20101103 -...
6. real time integration with odi 11g & golden gate 11g & dq 11g 20101103 -...Doina Draganescu
 
The Great Lakes: How to Approach a Big Data Implementation
The Great Lakes: How to Approach a Big Data ImplementationThe Great Lakes: How to Approach a Big Data Implementation
The Great Lakes: How to Approach a Big Data ImplementationInside Analysis
 
SQL Server 2008 Migration
SQL Server 2008 MigrationSQL Server 2008 Migration
SQL Server 2008 MigrationMark Ginnebaugh
 
IBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lakeIBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lakeTorsten Steinbach
 
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at DatabricksLessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at DatabricksDatabricks
 
What's New in Syncsort's Trillium Line of Data Quality Software - TSS Enterpr...
What's New in Syncsort's Trillium Line of Data Quality Software - TSS Enterpr...What's New in Syncsort's Trillium Line of Data Quality Software - TSS Enterpr...
What's New in Syncsort's Trillium Line of Data Quality Software - TSS Enterpr...Precisely
 
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...Lucas Jellema
 
SecureKloud_Corporate Deck.pdf
SecureKloud_Corporate Deck.pdfSecureKloud_Corporate Deck.pdf
SecureKloud_Corporate Deck.pdfSrinivasMahankali3
 
Hi I need security-related job points for the software develope.docx
Hi I need security-related job points for the software develope.docxHi I need security-related job points for the software develope.docx
Hi I need security-related job points for the software develope.docxfideladallimore
 
Engineering Collaboration Webinar One
Engineering Collaboration Webinar OneEngineering Collaboration Webinar One
Engineering Collaboration Webinar OneStephen Porter
 
Ms net work-sharepoint 2013-applied architecture from the field v4
Ms net work-sharepoint 2013-applied architecture from the field v4Ms net work-sharepoint 2013-applied architecture from the field v4
Ms net work-sharepoint 2013-applied architecture from the field v4Tihomir Ignatov
 
Migrating from RDBMS to MongoDB
Migrating from RDBMS to MongoDBMigrating from RDBMS to MongoDB
Migrating from RDBMS to MongoDBMongoDB
 
Microservices architecture
Microservices architectureMicroservices architecture
Microservices architectureMohammad Dameer
 
SQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at ComcastSQL Analytics Powering Telemetry Analysis at Comcast
SQL Analytics Powering Telemetry Analysis at ComcastDatabricks
 

Similar to Preparing Legacy Data for S1000D Automation (20)

Creating a Hybrid Approach to Legacy Conversion
Creating a Hybrid Approach to Legacy ConversionCreating a Hybrid Approach to Legacy Conversion
Creating a Hybrid Approach to Legacy Conversion
 
DesignMind SQL Server 2008 Migration
DesignMind SQL Server 2008 MigrationDesignMind SQL Server 2008 Migration
DesignMind SQL Server 2008 Migration
 
Presentation application change management and data masking strategies for ...
Presentation   application change management and data masking strategies for ...Presentation   application change management and data masking strategies for ...
Presentation application change management and data masking strategies for ...
 
Automating Data Quality Processes at Reckitt
Automating Data Quality Processes at ReckittAutomating Data Quality Processes at Reckitt
Automating Data Quality Processes at Reckitt
 
6. real time integration with odi 11g & golden gate 11g & dq 11g 20101103 -...
6. real time integration with odi 11g & golden gate 11g & dq 11g   20101103 -...6. real time integration with odi 11g & golden gate 11g & dq 11g   20101103 -...
6. real time integration with odi 11g & golden gate 11g & dq 11g 20101103 -...
 
AWS User Group October
AWS User Group OctoberAWS User Group October
AWS User Group October
 
The Great Lakes: How to Approach a Big Data Implementation
The Great Lakes: How to Approach a Big Data ImplementationThe Great Lakes: How to Approach a Big Data Implementation
The Great Lakes: How to Approach a Big Data Implementation
 
SQL Server 2008 Migration
SQL Server 2008 MigrationSQL Server 2008 Migration
SQL Server 2008 Migration
 
IBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lakeIBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lake
 
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at DatabricksLessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
 
What's New in Syncsort's Trillium Line of Data Quality Software - TSS Enterpr...
What's New in Syncsort's Trillium Line of Data Quality Software - TSS Enterpr...What's New in Syncsort's Trillium Line of Data Quality Software - TSS Enterpr...
What's New in Syncsort's Trillium Line of Data Quality Software - TSS Enterpr...
 
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
Pragmatic CQRS with existing applications and databases (Digital Xchange, May...
 
SecureKloud_Corporate Deck.pdf
SecureKloud_Corporate Deck.pdfSecureKloud_Corporate Deck.pdf
SecureKloud_Corporate Deck.pdf
 

Preparing Legacy Data for S1000D Automation

  • 1. Confidential & Proprietary | www.dclab.com. Preparing Your Legacy Data for Automation in S1000D. Naveh Greenberg, Director, U.S. Defense Development, Data Conversion Laboratory
  • 2. Valuable Content Transformed • Document Digitization • XML and HTML Conversion • eBook Production • Hosted Solutions • Big Data Automation • Conversion Management • Editorial Services • Harmonizer
  • 3. Experience the DCL Difference. DCL blends years of conversion experience with cutting-edge technology and the infrastructure to make the process easy and efficient. • World-Class Services • Leading-Edge Technology • Unparalleled Infrastructure • US-Based Management • Complex-Content Expertise • 24/7 Online Project Tracking • Automated Quality Control • Global Capabilities
  • 4. We Serve a Very Broad Client Base . . .
  • 5. . . . Spanning All Industries • Aerospace • Associations • Defense • Distribution • Education • Financial • Government • Libraries • Life Sciences • Manufacturing • Medical • Museums • Periodicals • Professional • Publishing • Reference • Research • Societies • Software • STM • Technology • Telecommunications • Universities • Utilities
  • 6. What Makes S1000D Conversion Difficult • S1000D is a conceptual departure from linear information – and is difficult for many to get used to • Turns the traditional book into a collection of DMs – Introductory material that applies to numerous DMs – Placement of Warnings, Cautions and Notes – Writer creativity • DMC & business rules – Assigning DMCs and ICNs – Hierarchy in Map Files (Publication Module) – Data can fit more than one information code • …but your documents weren’t likely to have been designed to do this.
  • 7. Structuring a Book into Data Modules in S1000D [Diagram: source content (a PDF book and a 38784 book with numbered paragraphs and appendices, an ATA book with pageblocks, tasks and subtasks, and an early S1000D publication) is broken into Descriptive, Procedural, IPD, Wiring, Crew, Fault, Process and Maintenance DMs in the S1000D Common Source Database, which feeds Publication 1, Publication 2 and Publication 3.]
  • 8. Further Complications in S1000D Conversion • There are the usual conversion issues – Accuracy of the transferred text – Tables – Math or odd-looking text – Special characters • There are also the structuring issues – Identifying DMs – Identifying reusable content – Identifying applicability • And the people issues – Getting rugged individualists to collaborate more – Deciding what needs re-authoring – Getting used to a new “document” paradigm
  • 9. Most Importantly – Plan!!! • Ask the important initial questions – Who are the stakeholders? Who is the final client/user? – What is the estimated volume and deadline? – What is the source format? Not all source data are created equal. – What version of S1000D? – Do we know what CMS or rendering tools will be used? – What is the budget? • Ask around or join discussion groups. • Get your hands on the source data, business rules, and schemas. • Begin looking for the right people. You don’t need to be S1000D-savvy, but you do, at a minimum, need to understand the concept.
  • 11. DCL’s Project Start-up Methodology: “If I had eight hours to chop down a tree, I'd spend six sharpening my ax.” – Abraham Lincoln
  • 12. What Does a Conversion Project Look Like? Conversion Setup Components: Inventory & Assessment; Reuse Analysis; Document Analysis; Conversion Specification; Architecture Design & Configuration; Design & Develop Conversion SW; Design & Develop Automation & Workflow SW; Conversion SW Testing; Training. Conversion Production Components: Organizing Content for Conversion; Hosting & Running Conversion SW; Hosting & Running Automation & Workflow SW; Scanning & OCR; Image Processing; Proofreading; Pre-Conversion Document Preparation; Conversion; Parse/View; Quality Control; Reporting, Audit & Reconciliation.
  • 13. What Does a Conversion Project Look Like? Conversion Setup Components: Inventory & Assessment; Reuse Analysis; Document Analysis; Conversion Specification; Architecture Design & Configuration; Design & Develop Conversion SW; Design & Develop Automation & Workflow SW; Conversion SW Testing; Training.
  • 14. Inventory & Assessment • Log the batches received into a production control system. • By logging and tracking each unit you can gather information that can be used to: – Project delivery schedules – Confirm that processes are working properly – Track each unit and show which step of the production process it is in.
  • 15. Inventory & Assessment: What to Convert, and in What Order • Categorizing – Active documents in good shape – Active documents that need a lot of work – Somewhat inactive documents that will likely be retired – Archival materials • Prioritizing – Documents that are most used – Documents that are customer favorites – Documents with longest product life – Start with most recent documents and go back • Identifying the process – Can be converted as is – Can be converted with some work – Needs to be rewritten – Don’t convert – just keep archival copies
  • 16. What Does a Conversion Project Look Like? Conversion Setup Components: Inventory & Assessment; Reuse Analysis; Document Analysis; Conversion Specification; Architecture Design & Configuration; Design & Develop Conversion SW; Design & Develop Automation & Workflow SW; Conversion SW Testing; Training.
  • 17. Why Is Reuse Analysis Important? • Increased consistency • Reduced development time • Lower maintenance costs • Rapid reconfiguration • Divide and conquer
  • 18. Why Is Reuse Analysis Important?
  • 19. Content Reuse Analysis Reports • Finding exact or similar text will help you when mapping to Data Modules • It will also help to detect applicability and inconsistencies
  • 20. Content Reuse Analysis Reports
  • 21. What Does a Conversion Project Look Like? Conversion Setup Components: Inventory & Assessment; Reuse Analysis; Document Analysis; Conversion Specification; Architecture Design & Configuration; Design & Develop Conversion SW; Design & Develop Automation & Workflow SW; Conversion SW Testing; Training.
  • 22. Document Analysis & Conversion Specification • Evaluate document sources to determine the relative ease & accuracy of content extraction • Identify metadata sources • Identify the types of information in the documents and the appropriate level of tagging • Identify processes for various materials • Detailed analysis of documents by type • Review enough documents to understand the potential variations • Develop tagging instructions • Prepare specification • Normalize your data
  • 23. Document Analysis – Text Extraction [Figure panels: Sample Document Text; OCR Output]
  • 24. The Conversion Specification (DMRL & specific rules)
  • 25. The Conversion Specification
  • 34. Q&A. Naveh Greenberg, Director, U.S. Defense Development, Data Conversion Laboratory, (718) 307-5758, ngreenberg@dclab.com, @dclaboratory
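
The Inventory & Assessment slide describes logging each received batch into a production control system so that every unit can be tracked and delivery dates projected. A minimal sketch of that idea follows. The step names, the `ProductionLog` class, and the throughput figure are hypothetical illustrations, not DCL's actual system.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date

# Hypothetical production steps; a real project would define its own.
STEPS = ["received", "prepared", "converted", "qc", "delivered"]


@dataclass
class Unit:
    unit_id: str
    batch_id: str
    received_on: date
    step: str = "received"
    history: list = field(default_factory=list)  # (old_step, new_step, date)


class ProductionLog:
    """Tracks every unit of every batch through the production steps."""

    def __init__(self):
        self.units = {}

    def log_batch(self, batch_id, unit_ids, received_on):
        # Register each unit of an incoming batch at the first step.
        for unit_id in unit_ids:
            self.units[unit_id] = Unit(unit_id, batch_id, received_on)

    def advance(self, unit_id, new_step, on):
        # Move a unit to a new step and keep an audit trail.
        if new_step not in STEPS:
            raise ValueError(f"unknown step: {new_step}")
        unit = self.units[unit_id]
        unit.history.append((unit.step, new_step, on))
        unit.step = new_step

    def where_is(self, unit_id):
        # Answers "which step of the process is this unit in?"
        return self.units[unit_id].step

    def units_per_step(self):
        # A pile-up at one step suggests a process that is not working.
        return Counter(unit.step for unit in self.units.values())

    def days_remaining(self, units_per_day):
        # Rough delivery projection from an assumed daily throughput.
        open_units = sum(1 for u in self.units.values() if u.step != "delivered")
        return open_units / units_per_day
```

For example, after `log.log_batch("B1", ["u1", "u2", "u3"], date(2024, 1, 2))`, a call to `log.where_is("u1")` reports the current step, and `log.units_per_step()` gives the count of units at each step.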
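
The "What to Convert, and in What Order" slide lists criteria for prioritizing documents (most used, longest product life, most recent first) and a process decision for each one. The sketch below shows one way those criteria could be turned into a conversion queue. The `Doc` fields, the process labels, and the order in which the criteria break ties are assumptions made for illustration; the slide does not rank the criteria.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    process: str             # "as_is", "some_work", "rewrite", or "archive_only"
    monthly_uses: int        # proxy for "documents that are most used"
    product_life_years: int  # remaining life of the product it documents
    last_revised_year: int   # "start with most recent documents and go back"


def conversion_queue(docs):
    """Return the documents to convert, highest priority first."""
    # "Don't convert, just keep archival copies" items never enter the queue.
    candidates = [d for d in docs if d.process != "archive_only"]
    # Assumed tie-break order: usage, then product life, then recency.
    return sorted(
        candidates,
        key=lambda d: (-d.monthly_uses, -d.product_life_years, -d.last_revised_year),
    )
```

A team that weighs customer preference or rewrite effort more heavily would change the sort key; the point is only that the categories on the slide can be recorded per document and sorted mechanically.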
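
The Content Reuse Analysis Reports slide notes that finding exact or similar text helps when mapping to Data Modules and when detecting applicability and inconsistencies. The following is a small sketch of such a report using Python's standard `difflib`. The 0.85 threshold, the paragraph IDs, and the pairwise comparison are illustrative assumptions; this is not the method behind DCL's own reports, and comparing every pair would be too slow for a large corpus.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    # Collapse whitespace and case so trivial differences do not hide reuse.
    return re.sub(r"\s+", " ", text).strip().lower()


def reuse_report(paragraphs, threshold=0.85):
    """Compare every pair of paragraphs and flag exact or similar text.

    `paragraphs` maps a paragraph ID to its text. Returns a list of
    (id_a, id_b, kind, score) tuples, where kind is "exact" or "similar".
    """
    report = []
    for (id_a, text_a), (id_b, text_b) in combinations(paragraphs.items(), 2):
        norm_a, norm_b = normalize(text_a), normalize(text_b)
        if norm_a == norm_b:
            report.append((id_a, id_b, "exact", 1.0))
            continue
        score = SequenceMatcher(None, norm_a, norm_b).ratio()
        if score >= threshold:
            # Near matches are candidates for a shared DM or for
            # applicability differences, and worth a human look.
            report.append((id_a, id_b, "similar", round(score, 2)))
    return report
```

Exact matches are natural candidates for a single reusable Data Module, while near matches (for example, the same warning with "panel" in one manual and "cover" in another) may point to either an applicability difference or an inconsistency that needs review.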

Editor's Notes

  1. -there are a lot more components to getting a conversion project done than most people think -and there are a lot more things that need to be set up so that there’s no surprise, or rework, later when you’re chunking things out -I tried to lay out the common tasks that I would expect in a large conversion project – there are of course some variations – but these are the major ones -traditionally most of this was done by whoever was “in charge of the conversion” – and that was the predominant model until a few years ago -what we’re finding today is that many times a hybrid model, where different parties handle some of the tasks, might work better, especially when the client company already has significant resources for some of the tasks but needs expertise for others -later in this talk I will discuss several case studies of how this might work -but first, I would like to go through what the various steps are, and a little about what gets done in each one -these two wheels represent the various tasks – the left wheel, read clockwise, represents what gets done to get set up, and the right wheel represents the production tasks.