December 2007 Document Recognition Technology Overview Presentation - Presentation Transcript
Document Recognition
a technology overview
Presented by:
Chris Riley of Artsyl Technologies, Inc.
But First
Your new AIIM Board!
Exciting new events
Golf
Networking
More Education Sessions
What we will cover:
Why Chris?
What Are the Document Recognition Technologies
Who Makes Them
Buyer Beware
The future
Q&A
Free Stuff!
Why Chris?
Who is Artsyl?
What qualifies Chris to talk to me?
When a developer turns to sales
Who knows what OCR is?
The Technologies
OCR – Optical Character Recognition
ICR – Intelligent Character Recognition
OMR – Optical Mark Recognition
Barcode
Handwriting
All the other ones made up for marketing purposes
CAR/LAR ( Check21 ) – Courtesy and Legal Amount Recognition
Assisted Capture
Fixed Form Processing
Semi-Structured Forms Processing
Unstructured Document Processing
The Technologies: OCR
Sample: Ship To:
The Technologies: ICR
Sample: Ilya
The Technologies: OMR
Sample: Card Account
The Technologies: Barcode
Sample: 1889094476620
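One reason barcodes are popular in capture is that many symbologies carry a check digit, so a misread can be caught automatically. Below is a minimal sketch, assuming the 13-digit sample on this slide (1889094476620) is an EAN-13 code; the slide does not say which symbology it is, and the function names are illustrative only.

```python
def ean13_check_digit(first12: str) -> int:
    """Compute the EAN-13 check digit for the first 12 digits."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_ean13(code: str) -> bool:
    """Return True if code is 13 digits and its last digit matches the check digit."""
    if len(code) != 13 or not code.isdigit():
        return False
    return ean13_check_digit(code[:12]) == int(code[-1])


# Assumption: the slide sample is treated as EAN-13 for illustration.
sample = "1889094476620"
sample_is_valid = is_valid_ean13(sample)
```

Under that assumption the sample value passes the check, and changing any single digit makes it fail, which is the kind of built-in verification that OCR on plain text does not get for free.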
The Technologies: Handwriting
Sample: * Critical *
The Technologies: Acronym Heaven
All the other ones made up for marketing purposes
The Technologies: CAR/LAR
Sample: 2 hundred dollars & no cents
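The courtesy amount is the numeric amount box on a check and the legal amount is the written-out amount line. Reading both and comparing them gives a cross-check on the result. Here is a rough sketch of that comparison, assuming both amounts have already been recognized as text. The parser only handles simple wording like the slide sample, the helper names are hypothetical, and this is not a description of how any particular engine works.

```python
import re
from decimal import Decimal

# Small vocabulary; "no" is treated as zero (as in "no cents").
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
UNITS["no"] = 0
TENS = {w: 10 * i for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}


def words_to_int(tokens):
    """Convert tokens such as ['2', 'hundred'] or ['thirty', 'four'] to an int."""
    total, current = 0, 0
    for tok in tokens:
        if tok.isdigit():
            current += int(tok)
        elif tok in UNITS:
            current += UNITS[tok]
        elif tok in TENS:
            current += TENS[tok]
        elif tok == "hundred":
            current = max(current, 1) * 100
        elif tok == "thousand":
            total += max(current, 1) * 1000
            current = 0
        # Any other token ("and", "cents", ...) is ignored.
    return total + current


def parse_legal_amount(text):
    """Parse a written amount like '2 hundred dollars & no cents'."""
    tokens = re.findall(r"[a-z]+|\d+", text.lower())
    split_at = next((i for i, t in enumerate(tokens)
                     if t in ("dollar", "dollars")), len(tokens))
    dollars = words_to_int(tokens[:split_at])
    cents = words_to_int(tokens[split_at + 1:])
    return Decimal(dollars) + Decimal(cents) / 100


def amounts_agree(courtesy_text, legal_text):
    """Compare the numeric (courtesy) amount with the written (legal) amount."""
    courtesy = Decimal(re.sub(r"[^\d.]", "", courtesy_text))
    return courtesy == parse_legal_amount(legal_text)


agree = amounts_agree("200.00", "2 hundred dollars & no cents")
```

Fraction-style cents such as "56/100" are not handled here; a mismatch between the two readings would simply be routed to a person for review.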
The Technologies: Assisted Capture
The Technologies: Fixed Form Processing
Name: Ilya
Date: 12/21/2982
80% of business end-user documents
are semi-structured
The Technologies: Semi-Structured Forms
Invoice No: 99044
Date: 06/09/04
Invoice No: 24567
Date: 06/09/04 (06/09/2004)
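On a semi-structured form the same fields appear on every document, but not in the same place, so they are found by their labels rather than by fixed coordinates. A minimal sketch of that idea, assuming the page text has already been recognized; the label patterns, the two sample layouts, and the month/day/year date order are assumptions made for illustration.

```python
import re
from datetime import datetime

# Hypothetical label patterns; "Number" is listed before "No" so the
# longer label wins when both could match.
LABELS = {
    "invoice_number": re.compile(r"Invoice\s*(?:Number|No\.?|#)\s*:?\s*(\w+)", re.I),
    "date": re.compile(r"\bDate\s*:?\s*(\d{1,2}/\d{1,2}/\d{2,4})", re.I),
}


def normalize_date(raw):
    """Return the date as MM/DD/YYYY, or None if it cannot be parsed."""
    for fmt in ("%m/%d/%Y", "%m/%d/%y"):  # assumes US month/day order
        try:
            return datetime.strptime(raw, fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue
    return None


def extract_fields(page_text):
    """Find each field by its label, wherever it sits on the page."""
    fields = {}
    for name, pattern in LABELS.items():
        match = pattern.search(page_text)
        fields[name] = match.group(1) if match else None
    if fields["date"]:
        fields["date"] = normalize_date(fields["date"])
    return fields


# Two made-up layouts with the fields in different positions.
layout_a = "ACME Supply\nInvoice No: 99044\nDate: 06/09/04\n"
layout_b = "Date: 06/09/04    Page 1 of 1\nRemit to: ACME\nInvoice No: 24567\n"
result_a = extract_fields(layout_a)
result_b = extract_fields(layout_b)
```

Both layouts yield the same normalized date, 06/09/2004, even though the field sits on a different line in each.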
The Technologies: Semi-Structured Forms
Consignee
Consignor
Date
Term
The Technologies: Common Processes
Full page conversion
Classification
Index level extraction
Redaction
Routing
Auto Filing
Re-Purposing
Image Rotation
The Technologies: Full page conversion
Image file to electronic data file
ALL text on the page
Includes:
Image Pre-processing
Document Analysis/Zoning
Extraction
Export ( Commonly PDF, DOC )
The Technologies: Classification
Software tells you the document type
Scan batches of mixed documents
Invoice
Bill of Lading
Check
PO
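As a rough illustration of classification, here is a keyword-scoring sketch that sorts recognized page text into document types such as invoices, bills of lading, checks, and purchase orders. The keyword lists and the scoring rule are assumptions chosen for the example; commercial classifiers typically use layout and trained models as well.

```python
# Hypothetical keyword lists per document type.
KEYWORDS = {
    "Invoice": ("invoice", "amount due", "remit to"),
    "Bill of Lading": ("bill of lading", "consignee", "consignor", "carrier"),
    "Check": ("pay to the order of", "dollars", "memo"),
    "PO": ("purchase order", "po number", "ship to"),
}


def classify(page_text, min_hits=1):
    """Return the type whose keywords appear most often, or 'Unknown'."""
    lowered = page_text.lower()
    scores = {doc_type: sum(kw in lowered for kw in kws)
              for doc_type, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_hits else "Unknown"


def sort_batch(pages):
    """Group a mixed batch of page texts by predicted document type."""
    piles = {}
    for text in pages:
        piles.setdefault(classify(text), []).append(text)
    return piles


batch = [
    "INVOICE\nInvoice No: 99044\nTotal Amount Due: 100.00",
    "Bill of Lading\nConsignee: A\nConsignor: B",
    "Pay to the order of J. Doe\nTwo hundred dollars",
    "lorem ipsum",
]
piles = sort_batch(batch)
```

Pages that match no keywords land in an "Unknown" pile, which is where a person would step in.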
The Technologies: Index Level Extraction
Just certain required fields extracted
Normalization of data
Export usually to a database
Invoice Number
Invoice Date
Total Amt Due
Term
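The three steps above (extract a few fields, normalize them, export to a database) can be sketched as follows. This assumes the raw field values have already been pulled from the page; the table name, the sample amount and term, and the month/day/year date order are made up for the example, and SQLite stands in for whatever database a real system would use.

```python
import sqlite3
from datetime import datetime
from decimal import Decimal

# Raw values as they might come off the page (amount and term are hypothetical).
raw_fields = {
    "Invoice Number": " 99044 ",
    "Invoice Date": "06/09/04",
    "Total Amt Due": "$1,250.00",
    "Term": "net  30",
}


def normalize(raw):
    """Clean each raw field into a consistent, database-friendly form."""
    amount = raw["Total Amt Due"].replace("$", "").replace(",", "").strip()
    return {
        "invoice_number": raw["Invoice Number"].strip(),
        # Assumes month/day/two-digit-year input; stored as ISO 8601.
        "invoice_date": datetime.strptime(
            raw["Invoice Date"].strip(), "%m/%d/%y").date().isoformat(),
        "total_due": str(Decimal(amount)),
        "term": " ".join(raw["Term"].split()).upper(),
    }


def export(conn, record):
    """Insert one normalized record into a simple index table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoice_index ("
        "invoice_number TEXT PRIMARY KEY, invoice_date TEXT, "
        "total_due TEXT, term TEXT)")
    conn.execute(
        "INSERT INTO invoice_index VALUES "
        "(:invoice_number, :invoice_date, :total_due, :term)", record)
    conn.commit()


conn = sqlite3.connect(":memory:")
export(conn, normalize(raw_fields))
stored = conn.execute("SELECT * FROM invoice_index").fetchone()
```

Normalizing before export means every record in the table uses one date format and one amount format, regardless of how the vendor printed them.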
The Technologies: How Accurate
A better question is: how do you determine accuracy?
Document Type Accuracy
Field/Zone Location Accuracy
Data Type Accuracy
Character Accuracy
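The same page can score very differently depending on which of these levels is measured. A small sketch of character-level versus field-level accuracy, assuming you have hand-verified "truth" values to compare against; the misread sample and the function names are invented for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def character_accuracy(truth, recognized):
    """Share of characters read correctly, based on edit distance."""
    if not truth:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1 - edit_distance(truth, recognized) / len(truth))


def field_accuracy(truth_fields, recognized_fields):
    """Share of fields whose value matches the truth exactly."""
    correct = sum(recognized_fields.get(name) == value
                  for name, value in truth_fields.items())
    return correct / len(truth_fields)


# Hypothetical misread: capital I read as lowercase l, zero read as letter O.
char_acc = character_accuracy("Invoice No: 99044", "lnvoice No: 99O44")

truth = {"invoice_number": "99044", "date": "06/09/2004"}
read = {"invoice_number": "99O44", "date": "06/09/2004"}
field_acc = field_accuracy(truth, read)
```

Here two wrong characters out of seventeen still leave character accuracy near 88%, yet one of the two fields is unusable, so field accuracy is only 50%. Document type accuracy can be measured with the same exact-match approach applied to predicted versus true document types.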
There really are only 3 core technology providers
It takes 50 man-years to develop OCR
using current computing abilities
Who Makes Them: Core Engines
ABBYY
Nuance ( formerly ScanSoft )
ReadI.R.I.S
Océ
CharacTell
ParaScript
A2iA
Handful of Open Source
Handful of Other Vendors
Two handfuls of OLD engines
Who Makes Them: Who Licenses Them
EVERYONE ELSE!
AnaComp
Anydoc
BancTec
BrainWare
Captaris
Captivation
Cardiff
CVision
DataCap
DigiTech
eCopy
EMC Documentum
Kofax
LaserFiche
LeadTools
Microsoft
NSi AutoStore
OnBase
Perceptive Imaging
ReadSoft
SER
Top Image Systems
Tower
Westbrook
Xerox
Hundreds More
30% of organizations that purchase,
purchase the wrong thing
Over 50% of organizations that
purchase never use it properly
Buyer Beware
If OCR is the reason for buying a solution, know
which engine it uses!
Talk about the WHOLE solution, not the pieces
Get past marketing gimmicks
Trust, love, and be certain of your reseller / vendor
Buyer Beware: Know your engine
What version?
Will they upgrade?
Buyer Beware: Talk about Whole Solution
Scanner / Input
Capture
Storage
Have Requirements List Before
Buyer Beware: Get past Gimmicks
NOTHING is 100%!
All canned demos work perfectly
Always see a test on your own documents
Version numbers are really arbitrary
Buyer Beware: Trust your vendor / reseller
Support after sale ( test them )
Where to get professional services
Do they understand the solution and not
just the pieces?
The Future
Full-page OCR will be a commodity
Advanced Document Processing will become mainstream
but less required
Think about what to do now that you will be gathering
data rapidly
There will be a new approach to OCR
Questions and Answers
Before you ask
Free Stuff
Copy of ABBYY FineReader Pro 9.0
Copy of Nuance OmniPage 16
Copy of ReadI.R.I.S Pro 11
4 Hour Consulting Session with ME!