Learn What Is Intelligent Document and Data Capture
and Get Started
The Paperless Office…
Chasing the Impossible?
In a now-famous (or infamous) 1975
BusinessWeek article titled “The Office
of the Future,” technologists describe
“The Paperless Office.”
“Vincent E. Giuliano of Arthur D. Little,
Inc., figures that the use of paper in
business for records and correspondence
should be declining by 1980, ‘and by 1990,
most record-handling will be electronic.’”
I think we can all agree that we’re
not there yet.
How about we agree that what we really
want is “The Nearly Paperless Office”?
The first part of any Document or
Content Management System is capture.
What is Intelligent
Document and Data
Capture?
To keep it simple, let’s stick with AIIM’s (Association
for Information and Image Management) definition.
AIIM is a nonprofit serving information and image
professionals.
“Document capture and data capture are not the
same thing. Document capture is the conversion of a
paper document into an electronic image of that
document. Data capture extracts data from a
business form”.
We’ll interpret “form” here as
any paper or electronic source.
Why intelligent or
automated?
Reduce Labor
Speed Processing and
Information Delivery
Comply with
Regulations
Reduce Errors
So what is the capture process?
There are many models, from broad three-step
processes to more specific five-step processes.
Let’s go with the five-step.
1. Capture
Paper Sources: Captured with scanners or
MFP devices.
Electronic Sources: Network directories, emails,
electronic forms, print streams,
faxes…anything made of 1s and 0s.
2. Classify/Organize/Categorize
Identifying what the document or information is in
order to correctly process and deliver the document
and extract the information.
Invoice, Contract, Tax Form, Patient Record, ?
How should it be processed? Where should it be
routed and stored?
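As a rough illustration of this step, a rule-based classifier can look for telltale keywords in the captured text and map the result to a destination. This is only a minimal sketch: the keyword lists, document type names, and folder names below are hypothetical, and commercial capture products typically use more robust methods.

```python
# Hypothetical keyword rules; a real system would tune these per document set.
KEYWORDS = {
    "invoice": ["invoice", "amount due", "remit to"],
    "contract": ["agreement", "hereinafter", "party"],
    "tax_form": ["form 1040", "w-2", "taxable income"],
    "patient_record": ["patient", "diagnosis", "date of birth"],
}

# Hypothetical routing table: document type -> destination folder.
ROUTES = {
    "invoice": "accounting/inbox",
    "contract": "legal/contracts",
    "tax_form": "finance/tax",
    "patient_record": "clinical/records",
    "unknown": "manual-review",
}


def classify(text):
    """Return the document type whose keywords appear most often, or 'unknown'."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(word) for word in words)
        for doc_type, words in KEYWORDS.items()
    }
    best_type = max(scores, key=scores.get)
    return best_type if scores[best_type] > 0 else "unknown"


def route(text):
    """Return (document type, destination folder) for a piece of captured text."""
    doc_type = classify(text)
    return doc_type, ROUTES[doc_type]
```

Anything that matches no rule falls through to a manual-review folder, which mirrors the "?" case above.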
3. Extract or Mine
Capturing data for the index or other purposes.
May be data such as
customer number, freight
tracking number, invoice
number, supplier name
etc.
Or, full-text indexing may
be required, where all
text on the documents
is captured. See What
is Document Indexing.
4. Validate
Using technology or manual inspection to ensure that
a document is classified and processed correctly.
With technology this may mean automatically validating
against data sources or employing business rules.
For instance, if an inventory item number should contain three
alphabetic characters followed by five digits, all documents not
following that scheme may be tagged for manual inspection
before further processing is done.
PEN21096
CAP36581
INV98453
PA568793
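The business rule above (three letters followed by five digits) can be sketched as a regular-expression check. This is a minimal illustration, not any product's implementation; the function name is hypothetical, and accepting lowercase letters is an assumption.

```python
import re

# Three letters followed by exactly five digits, e.g. "PEN21096".
# Accepting lowercase letters is an assumption made for this sketch.
ITEM_CODE = re.compile(r"[A-Za-z]{3}[0-9]{5}")


def validate_codes(codes):
    """Split codes into (valid, needs_review) according to the naming scheme."""
    valid, needs_review = [], []
    for code in codes:
        if ITEM_CODE.fullmatch(code):
            valid.append(code)
        else:
            needs_review.append(code)
    return valid, needs_review


valid, needs_review = validate_codes(["PEN21096", "CAP36581", "INV98453", "PA568793"])
```

Of the four sample codes, only PA568793 (two letters, six digits) breaks the scheme, so it lands in `needs_review` for manual inspection.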
5. Deliver or Integrate
…to or with a search and retrieval or content
management system.
Obviously, without a way to
locate documents or data, a capture
system is useless.
Henry Schein, Dentrix, Dentrix Enterprise, Dentrix Ascend,
Easy Dental, Viive, DentalVision, axiUm
Often index information is sent to the document
management system via an XML or CSV file where it can be
made immediately available to the user.
Systems such as SharePoint, Epic, Laserfiche, and other
ECM, EMR, and EHR systems have various ways of accepting
data feeds.
Filenet
Laserfiche
Documentum
MyMedicalRecords
Eaglesoft
Allscripts
Epic
Dentrix
CSV or XML
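To make the CSV or XML hand-off concrete, here is a minimal sketch that renders index records in both formats using only the Python standard library. The field names, element names, and sample record are hypothetical; each target system defines its own import layout, so treat this as an illustration of the idea rather than a format any of the systems above accepts as-is.

```python
import csv
import io
import xml.etree.ElementTree as ET


def index_to_csv(records, fields):
    """Render index records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def index_to_xml(records):
    """Render index records as an XML string, one <document> element per record.

    Field names are used as element names, so they must be valid XML names.
    """
    root = ET.Element("documents")
    for record in records:
        doc = ET.SubElement(root, "document")
        for name, value in record.items():
            ET.SubElement(doc, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical index record for one scanned invoice.
records = [
    {"file": "scan_0001.pdf", "invoice_number": "INV98453", "supplier": "Acme Supply"},
]
csv_text = index_to_csv(records, ["file", "invoice_number", "supplier"])
xml_text = index_to_xml(records)
```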
So how do we get that pig to fly?
Today we have proven and developing
technologies propelling us to The Nearly
Paperless Office.
Barcode recognition (BCR) offers
the most trustworthy recognition
technology for data capture.
Use Barcodes to …
• Split Files
• Classify Documents
• Route Files
• Index
• Name Files
• Bookmark PDFs
Learn more at What Can Barcodes Do For Me?
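As a sketch of barcode-driven splitting and naming, assume a recognition step has already produced one barcode value (or none) per scanned page. The input structure and function name below are hypothetical; the barcode decoding itself is outside this example.

```python
def split_on_barcodes(pages):
    """Group pages into documents, starting a new document at each barcode page.

    `pages` is a list of (page_number, barcode_value) tuples, where
    barcode_value is None for pages without a barcode. The barcode value
    becomes the document name; pages before the first barcode go to "unsorted".
    """
    documents = {}
    current = "unsorted"
    for page_number, barcode_value in pages:
        if barcode_value is not None:
            current = barcode_value
        documents.setdefault(current, []).append(page_number)
    return documents


# Hypothetical scan of five pages with barcodes on pages 1 and 4.
pages = [(1, "INV98453"), (2, None), (3, None), (4, "CAP36581"), (5, None)]
documents = split_on_barcodes(pages)
```

In this sketch the barcode page stays with its document, and a repeated barcode value would merge pages into the same document; a workflow using separator sheets would drop the barcode page instead.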
OCR is another mature data capture technology to...
• Digitize text images so that they can be electronically
edited, searched, and stored
• Make image-based files fully text-searchable or extract
data from a zone for indexing
• Identify document areas for automatic OCR capture
(zonal OCR)
• Drag-and-drop highlighted document text which is
automatically OCR'd and dropped into index fields (drag
and drop OCR or rubber band OCR)
• Use extracted data to split, name, route, validate, etc.
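Zonal OCR can be pictured as filtering recognized words by position. The sketch below assumes an OCR engine has already returned words with bounding boxes; the `Word` structure, the coordinates, and the zone are all hypothetical and stand in for whatever a given engine provides.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """One recognized word and its bounding box in pixels (hypothetical layout)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def text_in_zone(words, zone):
    """Return the text of words whose centers fall inside zone = (left, top, right, bottom)."""
    z_left, z_top, z_right, z_bottom = zone
    hits = []
    for word in words:
        center_x = word.left + word.width / 2
        center_y = word.top + word.height / 2
        if z_left <= center_x <= z_right and z_top <= center_y <= z_bottom:
            hits.append(word)
    # Reading order: top to bottom, then left to right.
    hits.sort(key=lambda w: (w.top, w.left))
    return " ".join(w.text for w in hits)


words = [
    Word("Invoice", 400, 40, 90, 20),
    Word("No.", 495, 40, 40, 20),
    Word("INV98453", 540, 40, 110, 20),
    Word("Total", 400, 700, 60, 20),
]
invoice_zone = (530, 30, 700, 70)  # region where the invoice number is expected
invoice_number = text_in_zone(words, invoice_zone)
```

The extracted value can then feed an index field or drive splitting, naming, and routing as listed above.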
Other Recognition Technologies For Data
Capture
ICR (Intelligent Character Recognition)
• Handwriting recognition
• Not as accurate as OCR, limited role in some capture systems
OMR (Optical Mark Recognition)
• Capturing human-marked data from document forms such as
surveys and tests.
• Like ICR, lower accuracy, limited application within data capture
Forms Recognition
• Uses BCR, OCR, ICR and OMR in a structured data capture format
• Typically templates are designed to instruct the capture software
where to look for information and how to process the information
Data or Text Mining
(Often using Regular Expressions (regex))
A fast and powerful method to search, extract and
replace specific data found within scanned documents.
• Essentially a special text string for
describing a search pattern.
• Extremely flexible and patterns can be
constructed to match almost anything.
• Use data identified with regex to
classify, split, name and route files.
Learn more at Using Regular Expressions for Automated Data Capture and Extraction.
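A short sketch of regex-based extraction follows. The patterns and the sample text describe a fictional invoice layout and are assumptions for illustration; real documents need patterns written against their own wording.

```python
import re

# Hypothetical patterns for a fictional invoice layout.
PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|#)\s*:?\s*([A-Z]{3}\d{5})"),
    "customer_number": re.compile(r"Customer\s*(?:No\.?|#)\s*:?\s*(\d{4,})"),
    "invoice_date": re.compile(r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})"),
}


def extract_fields(text):
    """Return a dict of field name -> first match, or None when a pattern is absent."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


# Text as it might come back from OCR of a scanned invoice (made up).
ocr_text = """Acme Supply
Invoice No: INV98453   Date: 2014-03-18
Customer # 20417
"""
fields = extract_fields(ocr_text)
```

The resulting values could be combined into a file name or used to pick a destination folder; a field that comes back as `None` is a natural trigger for manual review.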
Batch Document
Processing
…simply processing a large volume of
documents, generally scanned into a few files
or one file, with intelligent
capture software.
Some products process folders of
documents on demand or “watch”
folders for files to process.
Learn more at What is Batch Document Processing?
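The "watch folder" idea can be sketched with a simple polling function: each call lists the folder and returns only files it has not seen before. This is a minimal standard-library illustration, not how any particular product works; the function names, file extensions, and sample file names are assumptions.

```python
import os
import tempfile


def find_new_files(folder, seen, extensions=(".pdf", ".tif", ".tiff")):
    """Return files in `folder` not yet in `seen`, and record them as seen."""
    new_files = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        if not name.lower().endswith(extensions):
            continue
        if path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files


def process_batch(paths, handler):
    """Apply `handler` to each file and return the list of results."""
    return [handler(path) for path in paths]


# Demonstration in a temporary folder standing in for a scanner's output folder.
with tempfile.TemporaryDirectory() as inbox:
    for name in ("scan_0001.pdf", "notes.txt"):
        open(os.path.join(inbox, name), "w").close()
    seen = set()
    first_pass = process_batch(find_new_files(inbox, seen), os.path.basename)
    second_pass = find_new_files(inbox, seen)  # nothing new since the first pass
```

Calling `find_new_files` on a timer gives "watch" behavior, while a single call covers on-demand processing of a folder.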
Image Enhancement
To improve usability and increase accuracy of OCR and other
recognition technologies, image enhancement is required.
• Adaptive thresholding
• Deskew
• Despeckle
• Remove blank pages or separator sheets
• Auto rotate
• Remove lines
Learn more at Improving OCR Accuracy with Cleanup and Enhancement.
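To give a feel for one of these cleanup steps, here is a pure-Python sketch of adaptive thresholding using a local mean. It is written for clarity rather than speed, the window size and offset are arbitrary assumptions, and production tools rely on optimized imaging libraries instead.

```python
def adaptive_threshold(image, window=3, offset=10):
    """Binarize a grayscale image given as a list of rows with values 0-255.

    A pixel becomes 0 (ink) when it is darker than the mean of its
    window x window neighborhood minus `offset`; otherwise it becomes 255 (paper).
    """
    height, width = len(image), len(image[0])
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighborhood, clipped at the image borders.
            neighbors = [
                image[ny][nx]
                for ny in range(max(0, y - half), min(height, y + half + 1))
                for nx in range(max(0, x - half), min(width, x + half + 1))
            ]
            local_mean = sum(neighbors) / len(neighbors)
            row.append(0 if image[y][x] < local_mean - offset else 255)
        result.append(row)
    return result


# Tiny made-up page: light gray paper with one dark pixel in the middle.
page = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
binary = adaptive_threshold(page)
```

Because each pixel is compared with its own neighborhood rather than one global cutoff, unevenly lit or shaded scans can still separate ink from paper.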
Where is intelligent
document and data
capture going?
Cloud Computing
Increased cloud computing will bring easily
accessible resources and repositories for
documents.
See Docs in the Clouds.
“The use of cloud computing is growing,
and by 2016 this growth will increase to
become the bulk of new IT spend.”
Gartner, Inc. Oct. 2013
Security Focus
Couple the increasing number of documents being
stored with the growing number of ways to access them, and
security concerns will continue to increase.
Improved Data Mining and
Classification
The increased use of data mining and better
classification will increase OCR demands and
lower the use of barcodes and separator pages.
Increased Mobility
Increased mobility demands in business impact
all information technology. Users want all
information available from all platforms, no
matter when or where.
Don’t be caught napping,
JUST GET STARTED.
No one data capture product can “do it
all,” but there is no better time to get
started than now. “The Nearly Paperless
Office” can be yours.
Learn More about Document Imaging and Capture
For more on:
• Watching folder,
• Monitoring folder,
• Watching folders,
• Batch Processing,
• Bulk scanning,
• Split files with barcodes,
• Barcode splitting,
• How to batch process,
• Batch process folders,
• Docufi,
• Imageramp,
• Watch folders,
• Data capture,
• Scanning to folders,
• Scanning to folder,
• Scan to Folder,
• Batch Splitting
• Migration to document
management
Contact Us
DocuFi, makers of ImageRamp,
Document Management Capture Solution
30 years’ experience in the Document Imaging market
Capture Solutions www.docufi.com
Copyright ©2014
Image Credits
• Christina Rutz, “When Pigs Fly”, http://bit.ly/1giOj05
• Nottsexminer, “Utopia”, http://bit.ly/1gnZTmS
• Kenny Louie, “One Way”, http://bit.ly/1iA7pxQ
• Spiffie, “Fujitsu ScanSnap S300M”, http://bit.ly/1ksdhhv
• Doctorwonder, “Stack O'Money!”, http://bit.ly/1fgxpko
• Maciej Lewandowski, “Pig on the wings”, http://bit.ly/N6lZCJ
• Sjsharktank, “Pigs fly, so now what?”, http://bit.ly/1g8UsYc
• Elvissa, “flyingpig”, http://bit.ly/1nLMzyB
• Jennicatpink, “Piglet Pile”, http://bit.ly/1cT6KUF
• Eddi, “phone”, http://bit.ly/1ftUezJ
• Martin Cathrae, “Cute Piggie”, http://bit.ly/1nLUDiT
• Sarah Beth Dwyer, “Jim's Pig”, http://bit.ly/Prl3dl

Elevating Tactical DDD Patterns Through Object Calisthenics
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 

What is Intelligent Document and Data Capture? A look at the technologies to move to a "nearly" paperless office.

  • 1. Learn What Is Intelligent Document and Data Capture and Get Started The Paperless Office… Chasing the Impossible?
  • 2. In a now-famous (or infamous) 1975 BusinessWeek article, “The Office of the Future,” technologists described “The Paperless Office.”
  • 3. “Vincent E. Giuliano of Arthur D. Little, Inc., figures that the use of paper in business for records and correspondence should be declining by 1980, ‘and by 1990, most record-handling will be electronic.’”
  • 4. I think we can all agree that we’re not there yet.
  • 5. How about we agree that what we really want is “The Nearly Paperless Office”?
  • 6. The first part of any Document or Content Management System is capture.
  • 7. What is Intelligent Document and Data Capture?
  • 8. To keep it simple, let’s stick with the definition from AIIM (Association for Information and Image Management), a nonprofit serving information and image professionals.
  • 9. “Document capture and data capture are not the same thing. Document capture is the conversion of a paper document into an electronic image of that document. Data capture extracts data from a business form.”
  • 10. We’ll interpret “form” here as any paper or electronic source.
  • 11. Why intelligent or automated? • Reduce Labor • Speed Processing and Information Delivery • Comply with Regulations • Reduce Errors
  • 12. So what is the capture process?
  • 13. There are many models, from broad three-step processes to more specific five-step processes.
  • 14. Let’s go with the five-step.
  • 15. 1. Capture Paper Sources: Captured with scanners or MFP devices. Electronic Sources: Network directories, emails, electronic forms, print streams, faxes…anything made of 1’s and 0’s.
  • 16. 2. Classify/Organize/Categorize Identifying what the document or information is in order to correctly process and deliver the document and extract the information.
  • 17. 2. Classify/Organize/Categorize Example document types: Invoice, Contract, Tax Form, Patient Record, or unknown (“?”). How should it be processed? Where should it be routed and stored?
  • 18. 3. Extract or Mine Capturing data for the index or other purposes. This may be data such as customer number, freight tracking number, invoice number, supplier name, etc. Or full-text indexing may be required, where all text on the documents is captured. See What is Document Indexing.
  • 19. 4. Validate Using technology or manual inspection to ensure that a document is classified and processed correctly.
  • 20. 4. Validate With technology, this may mean automatically validating against data sources or employing business rules. For instance, if an inventory item should contain three alpha characters followed by five numbers, all documents not following that scheme may be tagged for manual inspection before further processing is done. Examples: PEN21096, CAP36581, INV98453, PA568793 (the last does not follow the scheme).
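The business rule on this slide can be sketched as a short check. This is a minimal illustration, not any product's actual validation logic: the function name `needs_manual_review` is hypothetical, and the pattern assumes "alpha characters" means ASCII letters in either case.

```python
import re

# Assumed rule from the slide: three letters followed by five digits.
ITEM_CODE = re.compile(r"[A-Za-z]{3}\d{5}")


def needs_manual_review(item_code: str) -> bool:
    """Return True when the code does not match the expected scheme."""
    return ITEM_CODE.fullmatch(item_code.strip()) is None


codes = ["PEN21096", "CAP36581", "INV98453", "PA568793"]
flagged = [code for code in codes if needs_manual_review(code)]
print(flagged)  # ['PA568793']
```

Using `fullmatch` rather than `search` matters here: the whole value has to fit the scheme, so a code with extra characters is flagged as well.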
  • 21. 5. Deliver or Integrate …to or with a search and retrieval or content management system. Obviously, without a way to locate documents or data, a capture system is useless.
  • 22. 5. Deliver or Integrate Often index information is sent to the document management system via an XML or CSV file, where it can be made immediately available to the user. Systems such as SharePoint, Epic, Laserfiche and other ECM, EMR, and EHR systems have various ways of accepting data feeds. Target systems shown, fed by CSV or XML: FileNet, Laserfiche, Documentum, MyMedicalRecords, Eaglesoft, Allscripts, Epic, Dentrix (Henry Schein: Dentrix, Dentrix Enterprise, Dentrix Ascend, Easy Dental, Viive, DentalVision, axiUm).
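As a rough illustration of the CSV or XML hand-off described on this slide, the sketch below serializes one index record both ways using only the Python standard library. The field names (`file`, `doc_type`, `invoice_number`) and the XML element names are assumptions made for the example; each target system defines its own import format.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical index record; field names are illustrative only.
record = {
    "file": "INV98453.pdf",
    "doc_type": "Invoice",
    "invoice_number": "INV98453",
}


def to_csv(records):
    """Serialize index records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_xml(records):
    """Serialize index records as one <document> element per record."""
    root = ET.Element("documents")
    for rec in records:
        doc = ET.SubElement(root, "document")
        for key, value in rec.items():
            ET.SubElement(doc, key).text = value
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv([record])
xml_text = to_xml([record])
print(csv_text)
print(xml_text)
```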
  • 23. So how do we get that pig to
  • 24. Today we have proven and developing technologies propelling us to The Nearly Paperless Office.
  • 25. Barcode recognition (BCR) offers the most trustworthy recognition technology for data capture.
  • 26. Use Barcodes to … • Split Files • Classify Documents • Route Files • Index • Name Files • Bookmark PDFs. Learn more at What Can Barcodes Do For Me?
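To make barcode-driven splitting concrete, here is a minimal sketch that assumes the barcodes have already been decoded by some recognition engine. The input format (one decoded value or `None` per page), the `UNSORTED` bucket, and the function name are all hypothetical choices for this example.

```python
from typing import Optional


def split_on_barcodes(pages: list[Optional[str]]) -> dict[str, list[int]]:
    """Group page indexes by the most recent barcode value seen.

    pages[i] is the barcode value decoded on page i, or None if the page
    carries no barcode. Pages before the first barcode go under "UNSORTED".
    """
    documents: dict[str, list[int]] = {}
    current = "UNSORTED"
    for index, barcode in enumerate(pages):
        if barcode is not None:
            current = barcode  # a barcode page starts a new document
        documents.setdefault(current, []).append(index)
    return documents


scanned = ["INV98453", None, None, "CAP36581", None]
result = split_on_barcodes(scanned)
print(result)  # {'INV98453': [0, 1, 2], 'CAP36581': [3, 4]}
```

The barcode value doubles as the document key, so the same pass could also name or route each output file. If the same value appears twice in a batch, this sketch merges those pages into one document.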
  • 27. OCR is another mature data capture technology to... • Digitize text images so that they can be electronically edited, searched, and stored • Make image-based files fully text-searchable or extract data from a zone for indexing • Identify document areas for automatic OCR capture (zonal OCR) • Drag-and-drop highlighted document text which is automatically OCR'd and dropped into index fields (drag and drop OCR or rubber band OCR) • Use extracted data to split, name, route, validate, etc.
  • 28. Other Recognition Technologies For Data Capture. ICR (Intelligent Character Recognition): • Handwriting recognition • Not as accurate as OCR, limited role in some capture systems. OMR (Optical Mark Recognition): • Capturing human-marked data from document forms such as surveys and tests • Like ICR, lower accuracy, limited application within data capture. Forms Recognition: • Uses BCR, OCR, ICR and OMR in a structured data capture format • Typically templates are designed to instruct the capture software where to look for information and how to process the information.
  • 29. Data or Text Mining (Often using Regular Expressions (regex)) A fast and powerful method to search, extract and replace specific data found within scanned documents.
  • 30. Data or Text Mining (Often using Regular Expressions (regex)) • Essentially a special text string for describing a search pattern. • Extremely flexible and patterns can be constructed to match almost anything. • Use data identified with regex to classify, split, name and route files. Learn more at Using Regular Expressions for Automated Data Capture and Extraction.
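A small sketch of using a regex to pull a value out of recognized text and name a file from it. The label "Invoice No", the identifier format, and the `name_from_text` helper are assumptions for illustration; a real document set would need patterns tuned to its own layouts.

```python
import re

# Hypothetical pattern: the label "Invoice No" followed by an identifier
# made of three letters and five digits.
INVOICE_RE = re.compile(
    r"Invoice\s+No\.?\s*[:#]?\s*([A-Z]{3}\d{5})", re.IGNORECASE
)


def name_from_text(ocr_text: str, default: str = "unclassified") -> str:
    """Build a file name from the first invoice number found in the text."""
    match = INVOICE_RE.search(ocr_text)
    if match is None:
        return f"{default}.pdf"
    return f"{match.group(1).upper()}.pdf"


text = "ACME Supply\nInvoice No: INV98453\nTotal due: 120.00"
print(name_from_text(text))          # INV98453.pdf
print(name_from_text("no id here"))  # unclassified.pdf
```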
  • 31. Batch Document Processing …simply processing a large volume of documents, generally scanned into a few files or one file, with intelligent capture software. Some products process folders of documents on demand or “watch” folders for files to process. Learn more at What is Batch Document Processing?
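The "watch folder" idea can be sketched as a single polling pass that a scheduler or loop would call repeatedly. This is an assumption-laden illustration: the folder names, the `.pdf` filter, and the `handler` callback are hypothetical, and real products typically add file locking, retries, and error folders.

```python
import tempfile
from pathlib import Path


def process_new_files(watch_dir: Path, done_dir: Path, handler) -> list[str]:
    """Run handler on each PDF in watch_dir, then move it to done_dir."""
    done_dir.mkdir(parents=True, exist_ok=True)
    processed = []
    for path in sorted(watch_dir.glob("*.pdf")):
        handler(path)
        path.rename(done_dir / path.name)  # moved files are not picked up again
        processed.append(path.name)
    return processed


# Demo in a temporary directory with two placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    inbox, done = Path(tmp) / "inbox", Path(tmp) / "done"
    inbox.mkdir()
    (inbox / "b.pdf").write_bytes(b"%PDF-")
    (inbox / "a.pdf").write_bytes(b"%PDF-")
    seen = []
    first_pass = process_new_files(inbox, done, lambda p: seen.append(p.name))
    second_pass = process_new_files(inbox, done, lambda p: seen.append(p.name))

print(first_pass)   # ['a.pdf', 'b.pdf']
print(second_pass)  # []
```

Moving each file out of the watched folder after handling is one simple way to avoid processing it twice; the second pass above finds nothing left to do.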
  • 32. Image Enhancement To improve usability and increase accuracy of OCR and other recognition technologies, image enhancement is required. • Adaptive thresholding • Deskew • Despeckle • Remove blank pages or separator sheets • Auto rotate • Remove lines. Learn more at Improving OCR Accuracy with Cleanup and Enhancement.
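One of the listed cleanups, blank-page removal, can be approximated by counting dark pixels. The sketch below is a simplified stand-in, not how any particular product does it: pages are plain nested lists of grayscale values, and the thresholds `dark_below` and `max_dark_ratio` are assumed values chosen for the example.

```python
def is_blank(
    page: list[list[int]], dark_below: int = 128, max_dark_ratio: float = 0.01
) -> bool:
    """Treat a grayscale page (0 = black, 255 = white) as blank when few pixels are dark."""
    total = sum(len(row) for row in page)
    dark = sum(1 for row in page for pixel in row if pixel < dark_below)
    return total == 0 or dark / total <= max_dark_ratio


blank_page = [[255] * 20 for _ in range(20)]
text_page = [[255] * 20 for _ in range(18)] + [[0] * 20 for _ in range(2)]
speck_page = [[255] * 20 for _ in range(20)]
speck_page[3][4] = 0  # one stray speck: 1 of 400 pixels is dark

pages = [("blank", blank_page), ("text", text_page), ("speck", speck_page)]
kept = [name for name, page in pages if not is_blank(page)]
print(kept)  # ['text']
```

Allowing a small ratio of dark pixels, rather than requiring none, keeps scanner noise from making an empty page look like content.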
  • 33. Where is intelligent document and data capture going?
  • 34. Cloud Computing Increased cloud computing will bring easily accessible resources and repositories for documents. See Docs in the Clouds. “The use of cloud computing is growing, and by 2016 this growth will increase to become the bulk of new IT spend.” Gartner, Inc. Oct. 2013
  • 35. Security Focus Couple the increasing number of documents being stored with the growing ways to access them, and security concerns will continue to increase.
  • 36. Improved Data Mining and Classification The increased use of data mining and better classification will increase OCR demands and lower the use of barcodes and separator pages.
  • 37. Increased Mobility Increased mobility demands in business impact all information technology. Users want all information available from all platforms, no matter when or where.
  • 38. Don’t be caught napping, JUST GET STARTED.
  • 39. No one data capture product can “do it all,” but there is no better time to get started than now. “The Nearly Paperless Office” can be yours.
  • 40. Learn More about Document Imaging and Capture
  • 41. For more on: • Watch folders • Batch processing • Bulk scanning • Barcode splitting • Data capture • Scan to folder • Batch splitting • Migration to document management. Contact Us: DocuFi Capture Solutions, makers of ImageRamp, Document Management Capture Solution. 30 years’ experience in the Document Imaging market. www.docufi.com Copyright ©2014
  • 42. Image Credits • Christina Rutz, “When Pigs Fly”, http://bit.ly/1giOj05 • Nottsexminer, “Utopia”, http://bit.ly/1gnZTmS • Kenny Louie, “One Way”, http://bit.ly/1iA7pxQ • Spiffie, “Fujitsu ScanSnap S300M”, http://bit.ly/1ksdhhv • Doctorwonder, “Stack O'Money!”, http://bit.ly/1fgxpko • Maciej Lewandowski, “Pig on the wings”, http://bit.ly/N6lZCJ • Sjsharktank, “Pigs fly, so now what?”, http://bit.ly/1g8UsYc • Elvissa, “flyingpig”, http://bit.ly/1nLMzyB • Jennicatpink, “Piglet Pile”, http://bit.ly/1cT6KUF • Eddi, “phone”, http://bit.ly/1ftUezJ • Martin Cathrae, “Cute Piggie”, http://bit.ly/1nLUDiT • Sarah Beth Dwyer, “Jim's Pig”, http://bit.ly/Prl3dl