Multistream and Dropout Color Forms Processing
by Kevin Neal, Technology Nerd on Jan 13, 2012
This was a presentation I created on some simple, but extremely useful, techniques that can be used when scanning documents to drastically improve your automatic data capture accuracy.
The first technique is Multistream, which simply means that the scanner can output two versions of the scanned document from a single scan. Typically one is in color and one is in black and white. Why? You would want to save the color version of the image for retrieval purposes. In other words, the user would see an identical electronic version of the hard copy document. The black and white version is used strictly for automatic data extraction because the color is often unnecessary for OCR.
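To make the idea concrete, here is a minimal sketch of what a multistream output looks like in software. It is only an illustration: real scanners usually do this in hardware or in the driver, and the function names, the tiny sample page and the fixed threshold of 128 are all assumptions made up for this example.

```python
def to_bitonal(pixels, threshold=128):
    """Convert rows of (r, g, b) tuples into rows of 0 (black) / 1 (white)."""
    bitonal = []
    for row in pixels:
        out_row = []
        for r, g, b in row:
            # Perceived brightness using the common BT.601 luma weights.
            luma = 0.299 * r + 0.587 * g + 0.114 * b
            out_row.append(1 if luma >= threshold else 0)
        bitonal.append(out_row)
    return bitonal


def multistream(pixels, threshold=128):
    """Return two streams from one scan: a color copy and a bitonal copy."""
    color_stream = [list(row) for row in pixels]      # kept for retrieval
    bitonal_stream = to_bitonal(pixels, threshold)    # fed to OCR
    return color_stream, bitonal_stream


# Hypothetical 2x2 "page": white paper, dark ink, pale pink, dark blue ink.
page = [
    [(255, 255, 255), (20, 20, 20)],
    [(250, 200, 200), (0, 0, 120)],
]

color, bw = multistream(page)
print(bw)  # [[1, 0], [1, 0]]
```

The color stream is an untouched copy of the scan, while the bitonal stream keeps only "ink" versus "paper", which is all the OCR step needs.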
The second technique is Background Color Removal. Forms designed specifically for automatic data capture, such as the Health Care Financing Administration (HCFA) CMS-1500, UB-92 or UB-04, will have a single, consistent shade of background color. Why? The form color is designed to make it obvious to the person completing the form exactly where characters and specific information are to be placed. For example, the Social Security Number field has an exact box for each of the nine digits in your SSN. This way the software knows exactly where to look for the SSN field and can accurately populate each of the nine digits. In forms processing, you don't care about the background color, you care about the information on the form. So you "drop out" the color and expose the data.
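Here is a rough sketch of color dropout in software, again only as an illustration. The form color, the tolerance and the threshold below are assumed values chosen for the example, and comparing colors by a simple sum of channel differences is my simplification, not how any particular scanner does it.

```python
def drop_out_color(pixels, form_color, tolerance=60, threshold=128):
    """Turn pixels near form_color white, then threshold the rest to 0/1."""
    fr, fg, fb = form_color
    result = []
    for row in pixels:
        out_row = []
        for r, g, b in row:
            # Simple color distance: sum of per-channel differences.
            distance = abs(r - fr) + abs(g - fg) + abs(b - fb)
            if distance <= tolerance:
                out_row.append(1)  # form ink is dropped out (white)
            else:
                luma = 0.299 * r + 0.587 * g + 0.114 * b
                out_row.append(1 if luma >= threshold else 0)
        result.append(out_row)
    return result


# Hypothetical red form ink and a tiny 2x3 scan.
FORM_RED = (220, 60, 60)
scan = [
    [(255, 255, 255), (220, 60, 60), (215, 70, 65)],  # paper, form, form
    [(225, 55, 58), (10, 10, 10), (30, 30, 90)],      # form, black ink, blue ink
]

cleaned = drop_out_color(scan, FORM_RED)
print(cleaned)  # [[1, 1, 1], [1, 0, 0]]
```

Notice that the red form pixels are dark enough that a plain black and white threshold at 128 would turn them black, right alongside the data. Dropping the color out first leaves only what the person wrote or typed.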
I've written about additional data capture tips, tricks and techniques here: