What is Data Capture

Brief overview of data capture technology: what it is and why to use it.

Published in: Technology, Business

  1. 1. What we will cover: <ul><li>What is Data Capture? </li></ul><ul><li>Types of Data Capture </li></ul><ul><ul><li>Fixed Form </li></ul></ul><ul><ul><li>Semi-Structured </li></ul></ul><ul><ul><li>Unstructured </li></ul></ul><ul><li>Best Practices </li></ul><ul><ul><li>Preparing yourself to automate </li></ul></ul><ul><ul><li>Give it the best chance possible </li></ul></ul><ul><ul><li>Choose a technology </li></ul></ul><ul><ul><li>Let’s Talk Invoices and EOBs </li></ul></ul><ul><li>Wrap Up </li></ul>
  2. 2. What is Data Capture <ul><li>One definition: the extraction of field/data pairs from image files using OCR, ICR, OMR, and barcode technologies. </li></ul><ul><li>It is not full-page conversion </li></ul><ul><li>Usually results in a record in a database </li></ul>
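As a rough illustration of "field/data pairs" ending up as "a record in a database", the sketch below stores one set of captured fields with Python's built-in `sqlite3` module. The table name `invoices`, the column names, and the helper `store_record` are hypothetical and chosen only for this example; the slides do not name a database or schema.

```python
import sqlite3


def store_record(conn, fields):
    """Insert one set of captured field/data pairs as a single row.

    `fields` is assumed to hold the keys invoice_no and invoice_date
    (hypothetical names used only for this sketch).
    """
    conn.execute(
        "INSERT INTO invoices (invoice_no, invoice_date) "
        "VALUES (:invoice_no, :invoice_date)",
        fields,
    )
    conn.commit()


# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (invoice_no TEXT, invoice_date TEXT)")

# Field/data pairs as a capture engine might hand them over.
captured = {"invoice_no": "99044", "invoice_date": "06/09/04"}
store_record(conn, captured)

row = conn.execute("SELECT invoice_no, invoice_date FROM invoices").fetchone()
print(row)
```

The point is only that the output of capture is a small structured record, not a converted page of text.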
  3. 3. Types of data capture <ul><li>Fixed Form </li></ul><ul><ul><li>Usually survey type forms </li></ul></ul><ul><ul><li>Usually hand printed </li></ul></ul><ul><li>Semi-Structured </li></ul><ul><ul><li>80% + of the documents that exist </li></ul></ul><ul><ul><li>Only requires that fields are of consistent type </li></ul></ul><ul><li>Unstructured </li></ul><ul><ul><li>Not common </li></ul></ul><ul><ul><li>Heavily tied to business process </li></ul></ul>
  4. 4. Example of Fixed Form [Figure: sample form with the fields "Name: Ilya" and "Date: 12/21/82", and reference marks]
  5. 5. Example of Fixed Form <ul><li>One approach out there: OCR/ICR a specific zone defined by: </li></ul><ul><li>[ x, y, height, width ] </li></ul><ul><li>Always has reference marks to normalize coordinates </li></ul><ul><li>Always has the same number of fields </li></ul><ul><li>Fields are always in the same location per page </li></ul>
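A minimal sketch of how reference marks can normalize zone coordinates: a zone defined on the blank template as [x, y, height, width] is mapped onto a scanned page by comparing where two reference marks sit on the template and on the scan. The names `Zone` and `map_zone`, the use of exactly two marks, and the shift-and-scale-only model (no rotation or skew) are assumptions for illustration, not something the slides specify.

```python
from typing import NamedTuple


class Zone(NamedTuple):
    x: float       # left edge
    y: float       # top edge
    height: float
    width: float


def map_zone(zone, template_marks, page_marks):
    """Map a template zone onto a scanned page using two reference marks.

    Each *_marks argument is ((x1, y1), (x2, y2)): the top-left and
    bottom-right reference marks. Handles shift and scale only.
    """
    (tx1, ty1), (tx2, ty2) = template_marks
    (px1, py1), (px2, py2) = page_marks
    sx = (px2 - px1) / (tx2 - tx1)  # horizontal scale factor
    sy = (py2 - py1) / (ty2 - ty1)  # vertical scale factor
    return Zone(
        x=px1 + (zone.x - tx1) * sx,
        y=py1 + (zone.y - ty1) * sy,
        height=zone.height * sy,
        width=zone.width * sx,
    )


# Marks on the template, and where they were detected on one scan
# (here the scan is shifted 12 px right and 9 px up, same scale).
template_marks = ((50, 50), (2450, 3250))
page_marks = ((62, 41), (2462, 3241))

name_zone = Zone(x=300, y=400, height=80, width=900)
page_zone = map_zone(name_zone, template_marks, page_marks)
print(page_zone)
```

The mapped zone is what would be handed to the OCR/ICR engine for that page, which is why fixed forms depend on the marks being present and detectable.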
  6. 6. Example of Fixed Form <ul><li>Fields are always in the same location per page </li></ul><ul><li>Has a consistent marking type per page </li></ul><ul><li>Short automation setup time and complexity </li></ul><ul><li>Usually hand printed forms </li></ul><ul><ul><li>Occasionally typographic </li></ul></ul>
  7. 7. Example of Semi-Structured [Figure: two sample invoices, one with "Invoice No: 99044" and "Date: 06/09/04", the other with "Invoice No: 24567" and "Date: 06/09/04"]
  8. 8. Example of Semi-Structured <ul><li>Two Approaches out there: </li></ul><ul><ul><li>OCR/ICR based on keyword / value pair rules and search elements </li></ul></ul><ul><ul><ul><li>Most Flexibility, More Complex Setup </li></ul></ul></ul><ul><ul><ul><li>Example: find the word “Invoice” or “Inv No.” and find a number to the right </li></ul></ul></ul><ul><ul><li>OCR/ICR based on iterative trained templates </li></ul></ul><ul><ul><ul><li>Easier setup, Less Flexibility </li></ul></ul></ul><ul><ul><ul><li>Example: draw a template quickly for each variation and let the software find the differences </li></ul></ul></ul><ul><ul><li>Some solutions combine the two </li></ul></ul>
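The keyword/value rule from the first approach ("find the word Invoice or Inv No. and find a number to the right") can be sketched with a regular expression. This is a simplification that assumes the OCR engine returns plain text in reading order, so "to the right" becomes "later on the same line"; real products typically work on word positions instead. The pattern and the helper `find_invoice_number` are hypothetical.

```python
import re

# Keyword variants ("Invoice", "Inv", optionally followed by "No",
# "Number" or "#"), an optional separator, then the digits to capture.
INVOICE_NO = re.compile(
    r"\b(?:Invoice|Inv)\.?\s*(?:No|Number|#)?\.?\s*[:#]?\s*(\d+)",
    re.IGNORECASE,
)


def find_invoice_number(ocr_text):
    """Return the first number found to the right of an invoice keyword."""
    match = INVOICE_NO.search(ocr_text)
    return match.group(1) if match else None


print(find_invoice_number("Invoice No: 99044 Date: 06/09/04"))
print(find_invoice_number("Inv No. 24567"))
print(find_invoice_number("Date: 06/09/04"))
```

Each new keyword variant or separator means another rule to maintain, which is the "more complex setup" side of the trade-off named on the slide.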
  9. 9. Example of Semi-Structured <ul><li>Usually typographic </li></ul><ul><li>Can sometimes be processed as fixed form, with lower accuracy and lower cost </li></ul><ul><li>Often confused with fixed forms </li></ul><ul><ul><li>Number of items may vary </li></ul></ul><ul><ul><li>Print moves within boxes </li></ul></ul><ul><ul><li>Without registration marks, coordinates are not normalized </li></ul></ul><ul><li>Medium to long setup time and complexity, depending on document type </li></ul>
  10. 10. Example of Unstructured: Does it exist? <ul><li>Human Resource Documents? </li></ul><ul><li>Legal Contracts? </li></ul><ul><li>Mortgage Documents? </li></ul><ul><li>Approaches </li></ul><ul><ul><li>Argument </li></ul></ul>
  11. 11. Best Practices: Prepare yourself <ul><li>Know your document types </li></ul><ul><ul><li>Object Types </li></ul></ul><ul><ul><ul><li>Invoice </li></ul></ul></ul><ul><ul><ul><li>Check </li></ul></ul></ul><ul><ul><li>Subjective Types </li></ul></ul><ul><ul><ul><li>Medical Record </li></ul></ul></ul><ul><ul><ul><li>Remittance </li></ul></ul></ul><ul><li>Know the business process associated with each document type </li></ul><ul><li>Know how your organization will handle change </li></ul><ul><li>Understand how accuracy is determined </li></ul>
  12. 12. Best Practices: What is accuracy? <ul><li>Most people assume there is only one type of accuracy. Wrong! </li></ul><ul><li>Document Type Accuracy </li></ul><ul><li>Field/Zone Location Accuracy </li></ul><ul><li>Data Type Accuracy </li></ul><ul><li>Character Accuracy </li></ul>
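To see why one accuracy number is not enough, the sketch below scores the same capture result two ways: per character and per field. The slides do not define these measures, so the formulas here are assumptions: character accuracy as 1 minus edit distance over total ground-truth characters, and field accuracy as the share of fields captured exactly. Only two of the four listed accuracy types are shown.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def character_accuracy(truth, captured):
    """Share of ground-truth characters captured correctly (floored at 0)."""
    total = sum(len(value) for value in truth.values())
    errors = sum(edit_distance(value, captured.get(key, ""))
                 for key, value in truth.items())
    return max(0.0, 1 - errors / total)


def field_accuracy(truth, captured):
    """Share of fields whose captured value matches the truth exactly."""
    correct = sum(1 for key, value in truth.items()
                  if captured.get(key) == value)
    return correct / len(truth)


truth = {"invoice_no": "99044", "date": "06/09/04"}
captured = {"invoice_no": "99O44", "date": "06/09/04"}  # letter O read for zero

print(round(character_accuracy(truth, captured), 3))
print(field_accuracy(truth, captured))
```

A single misread character leaves character accuracy above 90% while field accuracy drops to 50%, so it matters which of the two a quoted figure refers to.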
  13. 13. Best Practices: Give it a fighting chance <ul><li>If you have control of the form, design it well </li></ul><ul><li>Good scanning / input </li></ul><ul><ul><li>Scan Settings </li></ul></ul><ul><ul><li>Document Prep </li></ul></ul><ul><ul><li>Avoid a multitude of input types </li></ul></ul><ul><ul><ul><li>Mail </li></ul></ul></ul><ul><ul><ul><li>Email </li></ul></ul></ul><ul><ul><ul><li>Print </li></ul></ul></ul><ul><ul><ul><li>Fax </li></ul></ul></ul>
  14. 14. Best Practices: Give it a fighting chance <ul><li>Image Clean-up tools </li></ul><ul><li>Set your goals – be realistic </li></ul><ul><ul><li>Accuracy </li></ul></ul><ul><ul><li>Time to automate most critical document </li></ul></ul>
  15. 15. Best Practices: Choose a technology <ul><li>What type of data capture are you looking at? </li></ul><ul><ul><li>Combo? </li></ul></ul><ul><li>Make a list of potential solutions </li></ul><ul><li>See a canned demo </li></ul><ul><li>See a demo on your own documents </li></ul><ul><ul><li>5 good, 10 normal, 5 bad </li></ul></ul>
  16. 16. Best Practices: Choose a technology <ul><li>Get a good understanding of how exceptions are handled </li></ul><ul><ul><li>Work to include in processing </li></ul></ul><ul><ul><li>Where do they go? </li></ul></ul><ul><li>Get a good understanding of setup </li></ul><ul><ul><li>How long it took to configure for your demo </li></ul></ul><ul><ul><li>What skill level is required </li></ul></ul>
  17. 17. Let’s Talk Invoices <ul><li>Invoice processing is NOT a vertical </li></ul><ul><ul><li>Commercial Invoices </li></ul></ul><ul><ul><li>Legal Invoices </li></ul></ul><ul><ul><li>Manufacturing </li></ul></ul><ul><ul><li>Telecom </li></ul></ul><ul><ul><li>Etc. </li></ul></ul><ul><li>To line-item or not to line-item </li></ul><ul><li>Don’t forget about your supporting data </li></ul><ul><li>Don’t forget about your business process </li></ul>
  18. 18. Let’s Talk EOBs <ul><li>Arguably the most difficult document out there </li></ul><ul><li>Image Clean-Up is imperative </li></ul><ul><li>Group your variants </li></ul><ul><li>DO NOT pick a solution based on export format </li></ul><ul><ul><li>HL7, 857, etc. have nothing to do with OCR </li></ul></ul>
  19. 19. Wrap up <ul><li>The success of data capture can easily be more than you expect if you let it. </li></ul><ul><li>Set your expectations straight before talking to any vendor </li></ul>