Automated Data Capture and Extraction with ChronoScan for Automated Metadata and Classification

Capture and Extraction:
Where ECM Begins
With ChronoScan

Capture means many things when speaking
of Document/Records Management or
Enterprise Content Management.

AIIM
Association for Information and
Image Management
“Capture boils down to entering
content into the system.”

Extraction is an important element
of Capture…

By extraction we mean pulling the
important information from the
content to use for classification or
taxonomy purposes, creation of
the appropriate metadata or tags,
and more.

So Why is Capture and
Extraction so Important?

All Information Governance and Content
Management Depend on Correct Metadata
You have to correctly identify the document or content to:
• Find key information on demand
• Apply the correct data security/privacy rules
• Determine the correct data retention
• Protect your entity regarding eDiscovery/legal
compliance issues
• Turn your content or knowledge into a
competitive advantage

ChronoScan is:
a comprehensive suite of software for
document scanning, data extraction
and integration into your ECM,
CMIS-compliant system, or
line-of-business database.

Let’s categorize capture by what we’ll
call the Exterior and the Interior
Exterior
The capture of the “thing”:
• Scans
• Faxes
• Emails
• PrintStreams
Interior
The capture of the content of the “thing”:
Actual data and information extracted
from the “thing” such as invoice
number, line items, customer number,
vendor number, patient name…
whatever your information concerns.

This presentation
looks at the “interior”
capture accomplished
by ChronoScan’s
“extraction” features.

ChronoScan’s Extraction Features We’ll Examine

OCR technology is the
foundation for many of
ChronoScan’s auto extraction
capabilities.

Using sophisticated OCR
technologies such as Zonal OCR and
Grid OCR, ChronoScan can extract
data to classify the document and
create indexes (metadata or tags)
from structured and unstructured
documents.

Zonal OCR Capture
Extract data only from the area of your document where your
important information is found for fast, automatic data
extraction.

Zonal OCR Capture
Use Dynamic Text Anchors to link to moving text using constant
or variable patterns, thus accommodating unstructured
documents.
Here, ChronoScan finds the word “subtotal” and captures the data to the
right. Extracted data can be further manipulated and used for validation.
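
To make the anchor idea concrete, here is a minimal Python sketch. It is not ChronoScan code or its API; it assumes an OCR engine has already returned words with bounding boxes, and the `Word` type, the `capture_right_of_anchor` function, and the `max_gap` parameter are all hypothetical names chosen for illustration. It finds the anchor word and returns whatever sits to its right on the same text line.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One OCR word with its bounding box (hypothetical structure)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def capture_right_of_anchor(words, anchor, max_gap=400.0):
    """Return the text to the right of the first word matching `anchor`.

    A word counts as "on the same line" when its box covers the
    vertical midpoint of the anchor's box.
    """
    anchor_lc = anchor.lower()
    for a in words:
        if a.text.lower().strip(":") != anchor_lc:
            continue
        anchor_right = a.x + a.w
        anchor_mid = a.y + a.h / 2
        right = [
            w for w in words
            if w.x >= anchor_right                 # starts after the anchor ends
            and w.x - anchor_right <= max_gap      # not too far away
            and w.y <= anchor_mid <= w.y + w.h     # same text line
        ]
        right.sort(key=lambda w: w.x)
        return " ".join(w.text for w in right)
    return None  # anchor not found on this page


# Illustrative data: the anchor can move from page to page,
# the capture zone follows it.
words = [
    Word("Item", 40, 100, 50, 12),
    Word("Subtotal:", 300, 400, 80, 12),
    Word("$1,240.50", 420, 400, 90, 12),
    Word("Tax", 300, 420, 30, 12),
    Word("$99.24", 420, 420, 60, 12),
]
print(capture_right_of_anchor(words, "subtotal"))
```

Because the zone is defined relative to the anchor rather than to fixed page coordinates, the same rule keeps working when the subtotal line shifts up or down between documents.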

Zonal OCR Capture
Optimize for your documents with multiple parameters like
image processing, OCR engine, type of data to find, regular
expression validation, and more.
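
As an illustration of regular expression validation, the sketch below checks extracted field values against one pattern per field. The field names and patterns are assumptions made up for this example (they are not ChronoScan settings); real rules would be written for your own document formats.

```python
import re

# Hypothetical validation rules: one regular expression per field.
FIELD_RULES = {
    "invoice_number": re.compile(r"INV-\d{6}"),
    "invoice_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "subtotal": re.compile(r"\$?\d{1,3}(,\d{3})*\.\d{2}"),
}


def validate_fields(fields):
    """Return the names of fields whose values do not match their rule."""
    failed = []
    for name, pattern in FIELD_RULES.items():
        value = fields.get(name, "")
        # fullmatch requires the whole value to fit the pattern,
        # so stray OCR characters cause a failure.
        if not pattern.fullmatch(value.strip()):
            failed.append(name)
    return failed


good = {"invoice_number": "INV-004217", "invoice_date": "2014-03-18", "subtotal": "$1,240.50"}
# Here OCR misread a zero as the letter O.
bad = {"invoice_number": "INV-O04217", "invoice_date": "2014-03-18", "subtotal": "$1,240.50"}

print(validate_fields(good))
print(validate_fields(bad))
```

A failed field can then be flagged for an operator to review instead of being written to the index unchecked.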

Grid OCR is used for Line
Item Extraction and
Advanced Report
Breakdown or Dismount.

With Line Item
Extraction, extract and
manipulate line data found
on such forms as invoices
or delivery tickets.
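
A rough sketch of what line item extraction produces, in Python. It assumes the OCR step has already delivered the table as one text line per row; the column layout, the `LINE_ITEM` pattern, and the `extract_line_items` function are hypothetical and stand in for whatever grid definition a real job would use.

```python
import re
from decimal import Decimal

# Hypothetical row layout: quantity, SKU, description, unit price, amount.
LINE_ITEM = re.compile(
    r"^(?P<qty>\d+)\s+(?P<sku>[A-Z0-9-]+)\s+(?P<description>.+?)\s+"
    r"(?P<unit_price>\d+\.\d{2})\s+(?P<amount>\d+\.\d{2})$"
)


def extract_line_items(ocr_lines):
    """Turn OCR text rows into structured line items, skipping non-item rows."""
    items = []
    for line in ocr_lines:
        match = LINE_ITEM.match(line.strip())
        if not match:
            continue  # header, subtotal, or other non-item row
        item = match.groupdict()
        item["qty"] = int(item["qty"])
        item["unit_price"] = Decimal(item["unit_price"])
        item["amount"] = Decimal(item["amount"])
        # Manipulate the extracted data: cross-check the row arithmetic.
        item["amount_ok"] = item["qty"] * item["unit_price"] == item["amount"]
        items.append(item)
    return items


ocr_lines = [
    "Qty  SKU      Description        Unit    Amount",
    "2    AB-100   Widget, large      12.50   25.00",
    "10   CD-220   Bracket            1.20    12.00",
    "Subtotal                                 37.00",
]
for item in extract_line_items(ocr_lines):
    print(item)
```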

Advanced Report Breakdown or Dismount
Convert complex PDF or scanned OCR
reports into a structured data format. With
this unique feature, ChronoScan is able to
break down complex reports automatically,
splitting each record into an
independent processing unit. The software is
able to adapt extraction to different rules
and page limits to break down and structure
visually complex documents into a
comprehensible data file (CSV/XLS).
Advanced Report Breakdown or Dismount
(using sophisticated Grid OCR)
Break Down, then Extract
Converts complex reports to structured data.

ChronoScan breaks down complex reports
automatically, splitting each record into an
independent processing unit.

Easily adapt extraction to different rules and page
limits to break down and structure visually complex
documents into a comprehensible data file (CSV/XLS).
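
The break-down step can be pictured with a small Python sketch. This is an assumption-laden illustration, not ChronoScan's method: it presumes the report text is already available page by page, that every record starts with a "Customer:" line, and that the field names shown are the ones of interest. Records are allowed to continue across a page break.

```python
import csv
import io
import re

# Hypothetical report layout: each record begins with "Customer:".
RECORD_START = re.compile(r"^Customer:\s*(?P<customer>.+)$")
FIELD = re.compile(r"^(?P<key>Account|Balance):\s*(?P<value>.+)$")


def break_down_report(pages):
    """Split report pages into one dict per record."""
    records = []
    current = None
    for page in pages:
        for line in page.splitlines():
            line = line.strip()
            start = RECORD_START.match(line)
            if start:
                # A new record begins; later fields attach to it.
                current = {"Customer": start["customer"], "Account": "", "Balance": ""}
                records.append(current)
                continue
            field = FIELD.match(line)
            if field and current is not None:
                current[field["key"]] = field["value"]
    return records


def to_csv(records):
    """Serialize the records as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["Customer", "Account", "Balance"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


pages = [
    "Monthly Statement Report\nCustomer: Acme Corp\nAccount: 1001\nBalance: 250.00\n"
    "Customer: Birch LLC\nAccount: 1002",
    "Page 2\nBalance: 75.10\nCustomer: Cedar Inc\nAccount: 1003\nBalance: 0.00",
]
print(to_csv(break_down_report(pages)))
```

Page headers such as "Monthly Statement Report" and "Page 2" match neither pattern, so they are ignored rather than polluting the output.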

Nuance OCR
Plug-In Option
The world's most accurate
and robust OCR available.
• Dramatically increases zonal OCR
confidence
• Improves OCR trigger precision
• Better & faster background OCR
increases precision on regular
expression rules
• Better image orientation detection

Barcodes are tried and
true information tags.
Extract 1D/2D barcodes from your documents
and assign any part of them to fields for indexing,
database export, TXT report, file naming, etc.
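
Assigning parts of a barcode to fields might look like the Python sketch below. It starts from an already decoded barcode string; the pipe-delimited payload layout, the field names, and the `barcode_to_fields` function are hypothetical and chosen only to illustrate the idea.

```python
def barcode_to_fields(value, names=("doc_type", "doc_number", "vendor", "date"), sep="|"):
    """Split a decoded barcode value into named index fields."""
    parts = value.split(sep)
    if len(parts) != len(names):
        raise ValueError(f"expected {len(names)} parts, got {len(parts)}: {value!r}")
    return dict(zip(names, parts))


# Hypothetical payload printed on a cover sheet.
fields = barcode_to_fields("INV|004217|ACME|2014-03-18")
print(fields)
```

Each resulting field can then feed an index, a database column, or part of a file name.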

1. Read Barcodes from Images
2. Process Captured Data
Assign custom actions based on the barcoded values, such as setting
field values, splitting documents, etc.

Barcodes can be used on separator or slip sheets to designate
where documents should end and begin when a stack of
documents is scanned. The barcode information on the
separator sheets can also be extracted for indexing, naming and
routing purposes.
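
Separator-sheet splitting can be sketched in a few lines of Python. The sketch assumes each scanned page arrives as a pair of page identifier and decoded barcode value (or `None` when the page has no barcode), and that separator sheets carry a value starting with a hypothetical `SEP:` prefix; none of this reflects ChronoScan's internal data model.

```python
def split_on_separators(pages, prefix="SEP:"):
    """Group scanned pages into documents, starting a new one at each separator.

    `pages` is a list of (page_id, barcode_value_or_None) tuples.
    Separator sheets are dropped; their barcode text becomes the index value.
    """
    documents = []
    current = None
    for page_id, barcode in pages:
        if barcode is not None and barcode.startswith(prefix):
            current = {"index": barcode[len(prefix):], "pages": []}
            documents.append(current)
            continue
        if current is None:
            # Pages scanned before the first separator still form a document.
            current = {"index": None, "pages": []}
            documents.append(current)
        current["pages"].append(page_id)
    return documents


stack = [
    ("p1", "SEP:PATIENT-0042"),
    ("p2", None),
    ("p3", None),
    ("p4", "SEP:PATIENT-0077"),
    ("p5", None),
]
for doc in split_on_separators(stack):
    print(doc)
```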

Automate PDF Processing Tasks
Automatically extract fields and tables from PDF files.
ChronoScan imports
PDF files with native
text so you can easily
index the fields you
want and export your
data to TXT, CSV, Excel,
Word, HTML, and
OLE/ODBC databases
to easily feed your
indexing or database
application.

Automatic Document
Learning:
Training ChronoScan to identify
documents with Intelligent
Document Recognition to
automatically capture information
ChronoScan learns the Document
Type using comprehensive layout
recognition features to “remember”
user actions. Each document type
can be assigned to a different
template or job to customize
OCR areas, settings and actions.
Result: Scan/import documents
together, without previous
preparation, to automate repetitive
tasks and improve data input.
Type 1 Documents
Type 2 Documents
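
The idea of routing each document type to its own template can be illustrated with a deliberately simple Python sketch. Note the substitution: ChronoScan is described as using layout recognition, whereas this example uses plain keyword matching on OCR text as a stand-in. The template names, keyword sets, and `classify` function are all hypothetical.

```python
import re

# Hypothetical templates, each identified by a few characteristic keywords.
TEMPLATES = {
    "invoice": {"invoice", "subtotal", "due"},
    "delivery_ticket": {"delivery", "received", "carrier"},
}


def classify(text, min_hits=2):
    """Return the best-matching template name, or None if nothing fits well."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    best, best_hits = None, 0
    for name, keywords in TEMPLATES.items():
        hits = len(keywords & tokens)  # how many keywords appear in the text
        if hits > best_hits:
            best, best_hits = name, hits
    return best if best_hits >= min_hits else None


print(classify("INVOICE No. 4217\nSubtotal 37.00\nAmount due 39.96"))
print(classify("Delivery ticket\nCarrier: FastFreight\nReceived by: J. Smith"))
print(classify("Hello world"))
```

Once a document is matched to a template, the OCR zones, settings, and actions defined for that template can be applied to it, which is what lets mixed batches be scanned together.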

Once data is identified, it can
be used for many purposes
besides indexing or metadata
creation.

• Validation
• File Naming
• File Splitting
• Routing
• Classification
• ECM Integration
• Bookmarking
• Metadata
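
As one example of reusing extracted data, the Python sketch below builds a file name and a routing folder from metadata fields. The folder scheme, the field names, and the `target_path` function are assumptions for illustration, not a ChronoScan export format.

```python
import re
from pathlib import PurePosixPath


def safe(value):
    """Replace characters that are risky in file names with underscores."""
    return re.sub(r"[^A-Za-z0-9-]+", "_", value.strip()).strip("_")


def target_path(meta):
    """Derive a routing folder and file name from extracted metadata."""
    # Hypothetical scheme: archive/<document type>/<year>/<vendor>_<number>.pdf
    folder = PurePosixPath("archive") / safe(meta["doc_type"]) / meta["date"][:4]
    name = f"{safe(meta['vendor'])}_{safe(meta['invoice_number'])}.pdf"
    return str(folder / name)


meta = {
    "doc_type": "Invoice",
    "vendor": "Acme Corp.",
    "invoice_number": "INV-004217",
    "date": "2014-03-18",
}
print(target_path(meta))
```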

Remember, Everything Depends on Correct
Metadata
“Relying on manual scrutiny to bring this ‘wild content’ under control simply
will not work. The failure of humans to consistently tag and classify new
documents as they are filed has created the mess in the first place.”
© AIIM 2014, www.aiim.org

The Key:
Automatic Metadata Creation
With ChronoScan

For more on:
• Automated document classification
• Automated metadata creation
• Batch Document processing
• Batch PDF mining
• Batch text mining
• Batch TIF mining
• Text mining
• Extracting metadata
• Data extraction from unstructured data
• Intelligent data capture
• Data extraction
• Using regex to extract data
• Document scanning
• Extracting data
• Extract metadata
• Scanner software
• Barcode recognition
• OCR software
• Capture tutorial
• PDF scanning
• Scanning software
• Indexing
• Document indexing
• Automated capture
• Metadata
• DocuFi
• ImageRamp
• ChronoScan
• Data capture
• What is ChronoScan
• US ChronoScan reseller
• ChronoScan in the US
Get Started With Us
Our solutions include ImageRamp Batch for folder processing and
ChronoScan Capture for advanced data mining and barcode requirements.
Built on over 30 years’ experience in the Document Imaging and Capture market.
DocuFi is a premier ChronoScan Solutions Partner offering
extensive professional services to configure the system to
your specific requirements. DocuFi has been providing
custom solutions to health care, financial services, retail,
educational and other markets since 2010.
www.docufi.com info@docufi.com
Copyright ©2014
