Capture and Extraction: Where ECM Begins
With ChronoScan
Capture means many things when speaking
of Document/Records Management or
Enterprise Content Management.
AIIM (Association for Information and Image Management):
“Capture boils down to entering content into the system.”
Extraction is an important element
of Capture…
By extraction we mean pulling the
important information from the
content to use for classification or
taxonomy purposes, creation of
the appropriate metadata or tags,
and more.
So Why Are Capture and
Extraction So Important?
All Information Governance and Content
Management Depend on Correct Metadata
You have to correctly identify the document or content to:
• Find key information on demand
• Apply the correct data security/privacy rules
• Determine the correct data retention
• Protect your entity regarding eDiscovery/legal
compliance issues
• Turn your content or knowledge into a
competitive advantage
ChronoScan is:
a comprehensive suite of software for
document scanning, data extraction
and integration into your ECM,
CMIS-compliant system, or
line-of-business database.
Let’s categorize capture by what we’ll
call the Exterior and the Interior.
Exterior: the capture of the “thing”:
• Scans
• Faxes
• Emails
• PrintStreams
Interior: the capture of the content of the “thing”:
Actual data and information extracted
from the “thing”, such as invoice
number, line items, customer number,
vendor number, patient
name… whatever your information
concerns.
This presentation
looks at the “interior”
capture accomplished
by ChronoScan’s
“extraction” features.
ChronoScan’s Extraction Features We’ll Examine
OCR technology is the
foundation for many of
ChronoScan’s auto extraction
capabilities.
Using sophisticated OCR
technologies such as Zonal OCR and
Grid OCR, ChronoScan can extract
data to classify the document and
create indexes (metadata or tags)
from structured and unstructured
documents.
Extract only data from the area of your document where your
important information is found for fast, automatic data
extraction.
Zonal OCR Capture
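To make the zonal idea concrete, here is a minimal sketch in Python. It is not ChronoScan's implementation: it assumes an OCR engine has already returned each word with its pixel position, and the `Word`, `Zone` and `words_in_zone` names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One OCR'd word with its bounding box in pixels (hypothetical type)."""
    text: str
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


@dataclass(frozen=True)
class Zone:
    """A rectangular area of the page that holds one field (hypothetical type)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, word: Word) -> bool:
        # A word belongs to the zone when its centre point falls inside it.
        cx = word.x + word.w / 2
        cy = word.y + word.h / 2
        return self.left <= cx <= self.right and self.top <= cy <= self.bottom


def words_in_zone(words, zone):
    """Return the text found inside the zone, in reading order."""
    hits = [w for w in words if zone.contains(w)]
    hits.sort(key=lambda w: (w.y, w.x))
    return " ".join(w.text for w in hits)


# Made-up sample page: only the invoice number sits inside the zone.
page = [
    Word("ACME", 40, 30, 80, 20),
    Word("Invoice", 400, 30, 90, 20),
    Word("No.", 500, 30, 40, 20),
    Word("INV-1042", 550, 30, 110, 20),
    Word("Thank", 40, 700, 70, 20),
]
invoice_zone = Zone(left=540, top=20, right=700, bottom=60)
invoice_number = words_in_zone(page, invoice_zone)
```

Everything outside the zone is ignored, which is where the speed comes from: only the area that matters is read.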
Use Dynamic Text Anchors to link to moving text using constant
or variable patterns, thus accommodating unstructured
documents.
Zonal OCR Capture
Here, ChronoScan finds the word “subtotal” and captures the data to the
right. Extracted data can be further manipulated and used for validation.
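The “find the anchor, capture what is to its right” behaviour can be sketched as follows. This is an illustrative approximation only, assuming OCR output as positioned words; `Word`, `value_right_of` and the `line_tolerance` parameter are hypothetical names, not ChronoScan settings.

```python
import re
from decimal import Decimal
from typing import NamedTuple


class Word(NamedTuple):
    """One OCR'd word and the pixel position of its top-left corner."""
    text: str
    x: int
    y: int


def value_right_of(words, anchor_pattern, line_tolerance=8):
    """Find a word matching the anchor pattern and return the nearest
    word to its right on the same text line, or None if there is none."""
    anchor_re = re.compile(anchor_pattern, re.IGNORECASE)
    for anchor in words:
        if anchor_re.fullmatch(anchor.text.rstrip(":")):
            same_line = [
                w for w in words
                if abs(w.y - anchor.y) <= line_tolerance and w.x > anchor.x
            ]
            if same_line:
                return min(same_line, key=lambda w: w.x).text
    return None


# The anchor sits in a different place (and is spelled differently) per page.
page_a = [
    Word("Subtotal:", 400, 500), Word("1,250.00", 520, 502),
    Word("Tax", 400, 530), Word("100.00", 520, 531),
]
page_b = [Word("Sub-total", 380, 820), Word("89.90", 510, 819)]

subtotal_a = value_right_of(page_a, r"sub-?total")
subtotal_b = value_right_of(page_b, r"sub-?total")

# Further manipulation: turn the captured text into a number for validation.
subtotal_a_amount = Decimal(subtotal_a.replace(",", ""))
```

Because the pattern is variable (`sub-?total`), the same rule follows the label wherever it moves on the page.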
Optimize for your documents with multiple parameters like
image processing, OCR engine, type of data to find, regular
expression validation and more.
Zonal OCR Capture
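Regular expression validation can be pictured like this. The field names and patterns below are assumptions chosen for the example, not ChronoScan defaults; the point is only that each extracted value is checked against the shape it is expected to have.

```python
import re

# Hypothetical per-field rules; a real job would define its own.
FIELD_RULES = {
    "invoice_number": re.compile(r"INV-\d{4,}"),
    "invoice_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "total": re.compile(r"\d{1,3}(,\d{3})*\.\d{2}"),
}


def validate_fields(fields):
    """Return {field name: offending value} for every field that is
    missing or does not fully match its rule."""
    errors = {}
    for name, rule in FIELD_RULES.items():
        value = fields.get(name, "")
        if not rule.fullmatch(value):
            errors[name] = value
    return errors


# A typical OCR slip: the letter "O" read in place of the digit "0".
extracted = {
    "invoice_number": "INV-1042",
    "invoice_date": "2014-O3-01",
    "total": "1,250.00",
}
problems = validate_fields(extracted)
```

Here only `invoice_date` is flagged, so an operator would need to review one field rather than the whole document.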
Grid OCR is used for Line
Item Extraction and
Advanced Report
Breakdown or Dismount.
With Line Item
Extraction, extract and
manipulate line data found
on such forms as invoices
or delivery tickets.
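Line item extraction amounts to laying a grid over the table area: rows come from the vertical position of the text, columns from fixed horizontal boundaries. The sketch below illustrates that idea under stated assumptions (positioned OCR words as input; the column names and pixel boundaries are invented for the example) and is not ChronoScan's Grid OCR engine.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """One OCR'd word and the pixel position of its top-left corner."""
    text: str
    x: int
    y: int


# Hypothetical grid: (column name, left boundary, right boundary) in pixels.
COLUMNS = [
    ("description", 0, 300),
    ("quantity", 300, 400),
    ("unit_price", 400, 500),
    ("amount", 500, 600),
]


def extract_line_items(words, columns=COLUMNS, row_tolerance=6):
    """Group words into rows by vertical position, then drop each word
    into the column whose boundaries contain its left edge."""
    rows = []  # list of (row_y, [words in that row])
    for word in sorted(words, key=lambda w: (w.y, w.x)):
        if rows and abs(word.y - rows[-1][0]) <= row_tolerance:
            rows[-1][1].append(word)
        else:
            rows.append((word.y, [word]))

    items = []
    for _, row_words in rows:
        item = {name: "" for name, _, _ in columns}
        for word in sorted(row_words, key=lambda w: w.x):
            for name, left, right in columns:
                if left <= word.x < right:
                    item[name] = (item[name] + " " + word.text).strip()
                    break
        items.append(item)
    return items


# Two invoice lines with slightly uneven baselines, as scans often have.
table_words = [
    Word("Blue", 20, 100), Word("widget", 70, 101), Word("2", 320, 100),
    Word("9.50", 410, 99), Word("19.00", 510, 100),
    Word("Red", 20, 130), Word("gasket", 60, 130), Word("10", 320, 131),
    Word("1.20", 410, 130), Word("12.00", 510, 130),
]
line_items = extract_line_items(table_words)
```

Each resulting dictionary is one line item, ready to be checked (for example, quantity times unit price against amount) or exported.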
Advanced Report Breakdown or Dismount
(using sophisticated Grid OCR)
Convert complex PDF or scanned OCR
reports into a structured data format. With
this unique feature, ChronoScan breaks
down complex reports automatically,
splitting each record into an
independent processing unit. Extraction can
be adapted to different rules
and page limits to break down and structure
visually complex documents into a
structured data file (CSV/XLS).
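A report breakdown can be sketched as two rules: one that recognises where a new record starts and one that recognises a detail line inside it. The report layout, the patterns and the function names below are all invented for illustration; they show the general technique, not ChronoScan's own rules.

```python
import csv
import io
import re

# Made-up report text, as it might come out of OCR or a native-text PDF.
REPORT = """\
CUSTOMER: 1001  NAME: Alpha Ltd
  2014-01-05  Invoice 501   120.00
  2014-01-19  Invoice 517    80.50
CUSTOMER: 1002  NAME: Beta Inc
  2014-01-07  Invoice 503   300.00
"""

RECORD_START = re.compile(r"CUSTOMER:\s*(\d+)\s+NAME:\s*(.+)")
DETAIL_LINE = re.compile(r"\s*(\d{4}-\d{2}-\d{2})\s+Invoice\s+(\d+)\s+([\d.]+)")


def break_down(report_text):
    """Split the report into records and return one flat row per detail
    line, each carrying the header values of the record it belongs to."""
    rows = []
    customer_id = customer_name = None
    for line in report_text.splitlines():
        start = RECORD_START.match(line)
        if start:
            customer_id, customer_name = start.group(1), start.group(2).strip()
            continue
        detail = DETAIL_LINE.match(line)
        if detail and customer_id is not None:
            rows.append([customer_id, customer_name, *detail.groups()])
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["customer_id", "customer_name", "date", "invoice", "amount"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = break_down(REPORT)
csv_text = to_csv(rows)
```

The nested, visually grouped report becomes a flat table in which every row is self-describing.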
Nuance OCR
Plug-In Option
The world's most accurate
and robust OCR available.
• Dramatically increases zonal OCR
confidence
• Improves OCR trigger precision
• Better & faster background OCR
increases precision on regular
expression rules
• Better image orientation detection
Extract 1D/2D barcodes from your documents
and assign any part of them to fields for indexing,
database export, TXT report, file naming, etc.
Barcodes are tried and
true information tags.
Read Barcodes from Images
Assign custom actions based on the barcoded values such as set
field values, split documents, etc.
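Assigning parts of a barcode to fields might look like the sketch below. The barcode layout (`type|account|date`) and both function names are assumptions made for the example; a real barcode would follow whatever scheme your documents use.

```python
def fields_from_barcode(value):
    """Split a decoded barcode string into index fields.

    Assumes a hypothetical 'type|account|date' layout.
    """
    parts = value.split("|")
    if len(parts) != 3:
        raise ValueError(f"unexpected barcode layout: {value!r}")
    doc_type, account, doc_date = parts
    return {"doc_type": doc_type, "account": account, "doc_date": doc_date}


def file_name_for(fields):
    """Build an output file name from the barcode-derived fields."""
    return f"{fields['doc_type']}_{fields['account']}_{fields['doc_date']}.pdf"


fields = fields_from_barcode("INV|000123|20140301")
file_name = file_name_for(fields)
```

The same dictionary of fields could just as well feed a database export or a TXT report.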
Barcodes can be used on separator or slip sheets to designate
where documents should end and begin when a stack of
documents is scanned. The barcode information on the
separator sheets can also be extracted for indexing, naming and
routing purposes.
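Separator-sheet splitting can be sketched as follows, assuming each scanned page arrives with the barcode value read from it (or `None`). The `SEP:` prefix and the function name are hypothetical conventions for this example.

```python
SEPARATOR_PREFIX = "SEP:"  # hypothetical marker printed on separator sheets


def split_on_separators(pages):
    """Split a scanned stack into documents.

    pages: list of (page_number, barcode_value_or_None).
    A separator sheet starts a new document and supplies its index value;
    the separator sheet itself is not kept as a page.
    """
    documents = []
    current = None
    for page_number, barcode in pages:
        if barcode and barcode.startswith(SEPARATOR_PREFIX):
            current = {"index": barcode[len(SEPARATOR_PREFIX):], "pages": []}
            documents.append(current)
        elif current is not None:
            current["pages"].append(page_number)
        else:
            # Pages scanned before any separator form an unindexed document.
            current = {"index": None, "pages": [page_number]}
            documents.append(current)
    return documents


scanned_stack = [
    (1, "SEP:CLAIM-77"), (2, None), (3, None),
    (4, "SEP:CLAIM-78"), (5, None),
]
documents = split_on_separators(scanned_stack)
```

The value carried on each separator sheet (`CLAIM-77`, `CLAIM-78`) stays attached to the pages that follow it, so it can drive indexing, naming or routing.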
ChronoScan imports
PDF files with native
text so you can easily
index the fields you
want and export your
data to TXT, CSV, Excel,
Word, HTML, and
OLE/ODBC databases
to easily feed your
indexing or database
application.
Automate PDF Processing Tasks
Automatically extract fields and tables from PDF files.
ChronoScan learns the Document
Type using comprehensive layout
recognition features to “remember”
user actions. Every different
document type can be assigned to a
different template or job to customize
OCR areas, settings and actions.
Result: Scan/import documents
together, without previous
preparation to automate repetitive
tasks and improve data input.
Automatic Document
Learning:
Training ChronoScan to identify
documents with Intelligent
Document Recognition to
automatically capture information
Type 1 Documents
Type 2 Documents
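The “remember, then recognize” idea can be illustrated with a deliberately simple stand-in. ChronoScan is described as using layout recognition; the sketch below swaps in plain word overlap (Jaccard similarity) instead, purely to show how a learned document type can later select the right template. The class name, method names and threshold are all hypothetical.

```python
import re


class DocumentTypeMemory:
    """Remembers the vocabulary seen for each document type and
    classifies new text by word overlap (a stand-in for layout
    recognition, not the real technique)."""

    def __init__(self):
        self._samples = {}  # document type -> set of known words

    @staticmethod
    def _tokens(text):
        return set(re.findall(r"[a-z]+", text.lower()))

    def learn(self, doc_type, text):
        """Record the words of a document the user has labelled."""
        self._samples.setdefault(doc_type, set()).update(self._tokens(text))

    def classify(self, text, threshold=0.3):
        """Return the best-matching type, or None if nothing is close."""
        tokens = self._tokens(text)
        best_type, best_score = None, 0.0
        for doc_type, known in self._samples.items():
            union = tokens | known
            score = len(tokens & known) / len(union) if union else 0.0
            if score > best_score:
                best_type, best_score = doc_type, score
        return best_type if best_score >= threshold else None


memory = DocumentTypeMemory()
memory.learn("invoice", "Invoice number date subtotal tax total due")
memory.learn("delivery_ticket", "Delivery ticket carrier shipped quantity received by")

first = memory.classify("Invoice number 1042 subtotal 1,250.00 total due")
second = memory.classify("Delivery ticket shipped by carrier XYZ")
unknown = memory.classify("Lunch menu for Friday")
```

Once a type is recognized, its name can be used to look up the template or job that holds the OCR areas, settings and actions for that kind of document.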
Once data is identified, it can
be used for many purposes
besides indexing or metadata
creation.
Validation
File Naming
File Splitting
Routing
Classification
ECM Integration
Bookmarking
Metadata
Relying on manual scrutiny to bring this “wild content” under control simply
will not work. The failure of humans to consistently tag and classify new
documents as they are filed has created the mess in the first place.
© AIIM 2014, www.aiim.org
Remember, Everything Depends on Correct
Metadata
The Key:
Automatic Metadata Creation
With ChronoScan
For more on:
• Automated document classification
• Automated metadata creation
• Batch Document processing
• Batch PDF mining
• Batch text mining
• Batch TIF mining
• Text mining
• Extracting metadata
• Data extraction from unstructured data
• Intelligent data capture
• Data extraction
• Using regex to extract data
• Document scanning
• Extracting data
• Extract meta data
• Scanner software
• Barcode recognition
• OCR software
• Capture tutorial
• PDF scanning
• Scanning software
• Indexing
• Document indexing
• Automated capture
• Meta data
• Docufi
• Imageramp
• ChronoScan
• Data capture
• What is ChronoScan
• US Chronoscan reseller
• ChronoScan in the US
Get Started With Us
DocuFi is a premier ChronoScan Solutions Partner offering
extensive professional services to configure the system to
your specific requirements. DocuFi has been providing
custom solutions to the health care, financial services, retail,
educational and other markets since 2010.
Our solutions include ImageRamp Batch for folder processing and
ChronoScan Capture for advanced data mining and barcode requirements.
Built on over 30 years’ experience in the Document Imaging and Capture market.
Learn More: www.docufi.com info@docufi.com
Copyright ©2014

Automated Data Capture and Extraction with ChronoScan for Automated Metadata and Classification

  • 1. With ChronoScan Capture and Extraction: Where ECM Begins
  • 2. Capture means many things when speaking of Document/Records Management or Enterprise Content Management.
  • 3. AIIM Association for Information and Image Management “Capture boils down to entering content into the system.”
  • 4. Extraction is an important element of Capture…
  • 5. By extraction we mean pulling the important information from the content to use for classification or taxonomy purposes, creation of the appropriate metadata or tags, and more. Extraction is an important element of Capture…
  • 6. So Why is Capture and Extraction so Important?
  • 7. All Information Governance and Content Management Depends on Correct Metadata • Find key information on demand • Apply the correct data security/privacy rules • Determine the correct data retention • Protect your entity regarding eDiscovery/legal compliance issues • Turn your content or knowledge into a competitive advantage You have to correctly identify the document or content to:
  • 8. a comprehensive suite of software for document scanning, data extraction and integration into your ECM, CMIS compliant, or line of business database. ChronoScan is:
  • 9. The capture of the “thing”: • Scans • Faxes • Emails • PrintStreams Exterior Interior Let’s categorize capture by what we’ll call the Exterior and the Interior The capture of the content of the “thing”: Actual data and information extracted from the “thing” such as invoice number, line items, customer number, vendor number, patient name…whatever your information concerns.
  • 10. This presentation looks at the “interior” capture accomplished by ChronoScan’s “extraction” features.
  • 12. OCR technology is the foundation for many of ChronoScan’s auto extraction capabilities.
  • 13. Using sophisticated OCR technologies such as Zonal OCR and Grid OCR, ChronoScan can extract data to classify the document and create indexes (metadata or tags) from structured and unstructured documents.
  • 14. Extract only data from the area of your document where your important information is found for fast, automatic data extraction. Zonal OCR Capture
  • 15. Use Dynamic Text Anchors to link to moving text using constant or variable patterns, thus accommodating unstructured documents. Zonal OCR Capture Here, ChronoScan finds the word “subtotal” and captures the data to the right. Extracted data can be further manipulated and used for validation.
  • 16. Zonal OCR Capture: Optimize for your documents with multiple parameters such as image processing, OCR engine, type of data to find, regular expression validation, and more.
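Regular expression validation can be illustrated as follows. The field names and patterns in `RULES` are hypothetical examples, not ChronoScan settings; the point is only that each extracted value is checked against a pattern before it is accepted.

```python
import re

# Hypothetical rules: field name -> pattern the whole value must match.
RULES = {
    "invoice_number": re.compile(r"INV-\d{4,}"),
    "invoice_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "total": re.compile(r"\d+\.\d{2}"),
}


def validate(fields):
    """Return the names of fields that are missing or fail their rule."""
    failed = []
    for name, pattern in RULES.items():
        value = fields.get(name, "")
        if not pattern.fullmatch(value):
            failed.append(name)
    return failed


# The date contains a letter "O" where a zero belongs, a typical OCR misread.
extracted = {
    "invoice_number": "INV-1042",
    "invoice_date": "2014-O3-01",
    "total": "1350.54",
}
problems = validate(extracted)
```

Fields listed in `problems` could then be routed to a person for review instead of being exported as-is.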
  • 17. Grid OCR is used for Line Item Extraction and Advanced Report Breakdown or Dismount.
  • 18. With Line Item Extraction, extract and manipulate line data found on such forms as invoices or delivery tickets.
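As a rough sketch of line item extraction, the rows of an invoice table can be parsed once OCR has produced them as text. The column layout (quantity, description, unit price, line total) and the `parse_line_items` helper are assumptions for this example only.

```python
import re
from decimal import Decimal

# Hypothetical row layout: quantity, description, unit price, line total.
ROW = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d+\.\d{2})\s+(\d+\.\d{2})\s*$")


def parse_line_items(rows):
    """Parse OCR text rows into dicts, skipping rows that do not fit the layout."""
    items = []
    for row in rows:
        m = ROW.match(row)
        if m is None:
            continue  # header, footer, or noise
        qty, desc, unit, total = m.groups()
        items.append({
            "qty": int(qty),
            "description": desc,
            "unit_price": Decimal(unit),
            "total": Decimal(total),
        })
    return items


rows = [
    "Qty  Description      Unit    Total",
    "2    Blue widget      10.00   20.00",
    "1    Large red gear   5.50    5.50",
    "Subtotal                      25.50",
]
items = parse_line_items(rows)

# Manipulate the extracted data: flag rows where qty * unit price != total.
mismatched = [i for i in items if i["qty"] * i["unit_price"] != i["total"]]
```

`Decimal` is used rather than `float` so that the arithmetic check on money values is exact.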
  • 19. Advanced Report Breakdown or Dismount: Convert complex PDF or scanned OCR reports into a structured data format. With this unique feature, ChronoScan is able to break down complex reports automatically, splitting every different record as an independent processing unit. The software is able to adapt extraction to different rules and page limits to break down and structure visually complex documents into a compressible data file (CSV/XLS). Stages shown: Break Down, Extract.
  • 20. ChronoScan breaks down complex reports automatically, splitting every different record as an independent processing unit.
  • 21. Using sophisticated Grid OCR, easily adapt extraction to different rules and page limits to break down and structure visually complex documents into a compressible data file (CSV/XLS).
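The break-down-then-extract flow can be sketched as below, assuming the report is already available as text lines. The `CUSTOMER:` record marker, the `KEY: value` line format, and both helper functions are hypothetical; a real report would need its own rules.

```python
import csv
import io


def split_records(lines, start_marker="CUSTOMER:"):
    """Group report lines into records; each record starts at start_marker."""
    records, current = [], None
    for line in lines:
        if line.startswith(start_marker):
            if current is not None:
                records.append(current)
            current = [line]
        elif current is not None and line.strip():
            current.append(line)
    if current is not None:
        records.append(current)
    return records


def record_to_row(record):
    """Turn the 'KEY: value' lines of one record into a dict."""
    row = {}
    for line in record:
        key, _, value = line.partition(":")
        row[key.strip().lower()] = value.strip()
    return row


report = [
    "MONTHLY STATEMENT REPORT",
    "CUSTOMER: Acme Corp",
    "BALANCE: 120.00",
    "",
    "CUSTOMER: Globex",
    "BALANCE: 75.25",
]

# Break down: one record per customer. Extract: one CSV row per record.
rows = [record_to_row(r) for r in split_records(report)]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["customer", "balance"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Each record is handled as an independent unit, so a malformed record affects only its own row.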
  • 22. Nuance OCR Plug-In Option: The world's most accurate and robust OCR available. • Dramatically increases zonal OCR confidence • Improves OCR trigger precision • Better and faster background OCR increases precision on regular expression rules • Better image orientation detection
  • 23. Barcodes are tried and true information tags. Extract 1D/2D barcodes from your documents and assign any part of them to fields for indexing, database export, TXT report, file naming, etc.
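Assigning parts of a barcode to fields might look like the following once the barcode has been decoded to a string. The `|` separator, the field names, and the `barcode_to_fields` helper are assumptions for illustration, not ChronoScan conventions.

```python
def barcode_to_fields(value, names=("doc_type", "doc_number", "customer"), sep="|"):
    """Split a decoded barcode value and pair each part with a field name."""
    parts = value.split(sep)
    if len(parts) != len(names):
        raise ValueError(f"expected {len(names)} parts, got {len(parts)}")
    return dict(zip(names, parts))


# Hypothetical decoded barcode value.
fields = barcode_to_fields("INV|1042|ACME")

# The same fields can drive indexing, export, or file naming.
file_name = "{doc_type}_{customer}_{doc_number}.pdf".format(**fields)
```

Raising an error on an unexpected number of parts keeps a misread barcode from silently producing wrong index values.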
  • 24. 1. Read Barcodes from Images. 2. Process Captured Data: Assign custom actions based on the barcode values, such as setting field values, splitting documents, etc.
  • 25. Barcodes can be used on separator or slip sheets to designate where documents should end and begin when a stack of documents is scanned. The barcode information on the separator sheets can also be extracted for indexing, naming, and routing purposes.
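Separator-sheet splitting can be sketched as follows, assuming each scanned page comes with its decoded barcode (or `None` if it has none). The `SEP:` prefix and the `split_batch` helper are hypothetical and only illustrate the idea.

```python
SEPARATOR_PREFIX = "SEP:"  # hypothetical marker encoded on separator sheets


def split_batch(pages):
    """Split a scanned stack into documents at separator sheets.

    `pages` is a list of (page_number, barcode_or_None) tuples. Separator
    sheets start a new document and are not kept as pages themselves.
    """
    documents = []
    current = {"index": None, "pages": []}
    for number, barcode in pages:
        if barcode and barcode.startswith(SEPARATOR_PREFIX):
            if current["pages"]:
                documents.append(current)
            # The rest of the barcode value becomes the document's index.
            current = {"index": barcode[len(SEPARATOR_PREFIX):], "pages": []}
        else:
            current["pages"].append(number)
    if current["pages"]:
        documents.append(current)
    return documents


stack = [
    (1, "SEP:PATIENT-001"),
    (2, None),
    (3, None),
    (4, "SEP:PATIENT-002"),
    (5, None),
]
documents = split_batch(stack)
```

Here the separator value does double duty: it marks the document boundary and supplies the index used for naming or routing.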
  • 26. Automate PDF Processing Tasks: Automatically extract fields and tables from PDF files. ChronoScan imports PDF files with native text so you can easily index the fields you want and export your data to TXT, CSV, Excel, Word, HTML, and OLE/ODBC databases to feed your indexing or database application.
  • 27. Automatic Document Learning: Training ChronoScan to identify documents with Intelligent Document Recognition to automatically capture information. ChronoScan learns the Document Type using comprehensive layout recognition features to “remember” user actions. Every different document type can be assigned to a different template or job to customize OCR areas, settings, and actions. Result: Scan/import documents together, without previous preparation, to automate repetitive tasks and improve data input. (Illustrated: Type 1 Documents, Type 2 Documents.)
  • 29. Once data is identified, it can be used for many purposes besides indexing or metadata creation: Validation • File Naming • File Splitting • Routing • Classification • ECM Integration • Bookmarking • Metadata
  • 31. Remember, Everything Depends on Correct Metadata. Relying on manual scrutiny to bring this “wild content” under control simply will not work. The failure of humans to consistently tag and classify new documents as they are filed has created the mess in the first place. © AIIM 2014, www.aiim.org. The Key: Automatic Metadata Creation With ChronoScan
  • 32. Get Started With Us. Our solutions include ImageRamp Batch for folder processing and ChronoScan Capture for advanced data mining and barcode requirements. Built on over 30 years’ experience in the Document Imaging and Capture market, DocuFi is a premier ChronoScan Solutions Partner offering extensive professional services to configure the system to your specific requirements. DocuFi has been providing custom solutions to health care, financial services, retail, educational, and other markets since 2010. www.docufi.com info@docufi.com Copyright ©2014. For more on: • Automated document classification • Automated metadata creation • Batch document processing • Batch PDF mining • Batch text mining • Batch TIF mining • Text mining • Extracting metadata • Data extraction from unstructured data • Intelligent data capture • Data extraction • Using regex to extract data • Document scanning • Extracting data • Extract meta data • Scanner software • Barcode recognition • OCR software • Capture tutorial • PDF scanning • Scanning software • Indexing • Document indexing • Automated capture • Meta data • DocuFi • ImageRamp • ChronoScan • Data capture • What is ChronoScan • US ChronoScan reseller • ChronoScan in the US