BL Demo Day - July2011 - (3) Image Enhancement for OCR

Slides from Niall Anderson's 1st presentation on Image Enhancement and OCR at the British Library Demo-day on the 12th July 2011.

Published in: Technology, Business

  1. Image Enhancement and OCR. Niall Anderson, The British Library, 12 July 2011
  2. What is Image Enhancement?
       - Image enhancement is a suite of technical solutions to improve display or delivery of digital images – particularly text-based images
       - Main areas of improvement
           - Removing noise and other digital artefacts
           - Geometric correction for skewed images
           - Geometric correction for warped pages in paper original
  3. Example of an enhanced image (image panels: Warped, Dewarped)
  4. Why Image Enhancement?
       - To increase quality of image for display
       - To increase quality of image for printing (especially for Print On Demand)
       - To increase quality of Optical Character Recognition results
  5. OCR and Image Enhancement
       - OCR will produce its best results on material with the following characteristics
           - The layout of the text is simple, with no tables or illustrations
           - The text itself is in a modern, computer-generated typeface
           - The digital image preserves a high contrast between the text block and non-text detail (including blank space)
           - The image has been created from a perfectly flat and straight scan (if a digital copy from an analogue source)
           - The text of the analogue source is clear, well aligned and consistently presented
           - The basic material of the analogue source is undamaged; the text is in a single language
           - The image has been taken from the original physical source and not a degraded surrogate (such as microfilm)
  6. IMPACT Image Enhancement toolkit
  7. Types of image enhancement in toolkit
       - Binarisation
  8. Types of image enhancement in toolkit
       - Border removal
  9. Types of image enhancement in toolkit
       - Page splitting
  10. Types of image enhancement in toolkit
       - Dewarping
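Binarisation, the first enhancement type listed above, turns a greyscale scan into a two-level image of ink and paper. As a rough illustration only, the sketch below uses Otsu's global threshold in plain Python. This is a swapped-in textbook technique: it is not claimed to be the method the IMPACT toolkit uses, and it is not the adaptive method of Gatos et al. listed in the references. The function names `otsu_threshold` and `binarise` and the nested-list image format are assumptions made for the example.

```python
from typing import List

Image = List[List[int]]  # assumed format: rows of grey levels, 0 = black, 255 = white


def otsu_threshold(image: Image) -> int:
    """Return the grey level t that maximises the between-class variance
    of the two classes 'pixels <= t' and 'pixels > t' (Otsu's criterion)."""
    hist = [0] * 256
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_lo = 0    # number of pixels at or below the candidate threshold
    sum_lo = 0  # sum of their grey levels
    for t in range(256):
        w_lo += hist[t]
        sum_lo += t * hist[t]
        if w_lo == 0:
            continue          # no dark pixels yet
        w_hi = total - w_lo
        if w_hi == 0:
            break             # every pixel is already in the dark class
        mean_lo = sum_lo / w_lo
        mean_hi = (sum_all - sum_lo) / w_hi
        var_between = w_lo * w_hi * (mean_lo - mean_hi) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarise(image: Image) -> Image:
    """Map pixels at or below the Otsu threshold to 0 (ink), the rest to 255 (paper)."""
    t = otsu_threshold(image)
    return [[0 if p <= t else 255 for p in row] for row in image]


# Tiny made-up example: two dark "ink" pixels on a light background.
sample = [[200, 210, 30],
          [220, 40, 205]]
binary = binarise(sample)
```

A single global threshold tends to struggle on unevenly lit or stained pages, which is one reason adaptive approaches exist for degraded documents.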
  11. Using the IMPACT Image Enhancement toolkit - 1
       - Select the directory with your images or copy your images to directory
  12. Using the IMPACT Image Enhancement toolkit - 2
       - Select the directory for saving the results
  13. Using the IMPACT Image Enhancement toolkit - 3
       - Select one or more document images
  14. Using the IMPACT Image Enhancement toolkit - 4
       - Define a processing workflow
  15. Using the IMPACT Image Enhancement toolkit - 5
       - Select the method for every processing module
  16. Using the IMPACT Image Enhancement toolkit - 6
       - Execute workflow by pressing "Apply Processes"
  17. Using the IMPACT Image Enhancement toolkit - 7
       - View results on the preview window or right click on any module at the workflow line and select "View Result".
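The seven steps above amount to choosing input images, lining up processing modules in order, picking one method per module, running the chain, and keeping each module's output available for inspection. The following is a minimal sketch of that idea, assuming in-memory greyscale images rather than directories of files. Every name in it (`apply_processes`, `crop_one_pixel_border`, `fixed_threshold`, the module labels) is hypothetical, the two stand-in methods are deliberately trivial, and the module order is chosen only for illustration; none of this is the toolkit's actual interface.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]             # assumed format: rows of grey levels, 0-255
Method = Callable[[Image], Image]   # one selectable method for a module


def crop_one_pixel_border(img: Image) -> Image:
    """Stand-in 'border removal' method: drop the outermost row/column on each side."""
    return [row[1:-1] for row in img[1:-1]]


def fixed_threshold(img: Image) -> Image:
    """Stand-in 'binarisation' method: a fixed cut at grey level 128."""
    return [[0 if p <= 128 else 255 for p in row] for row in img]


def apply_processes(
    images: Dict[str, Image],
    workflow: List[Tuple[str, Method]],
) -> Dict[str, Dict[str, Image]]:
    """Run every image through the modules in order.

    The output of each module is stored under its label, so the result of any
    stage can be looked up afterwards, not only the final image.
    """
    results: Dict[str, Dict[str, Image]] = {}
    for name, img in images.items():
        per_module: Dict[str, Image] = {}
        current = img
        for module, method in workflow:
            current = method(current)
            per_module[module] = current
        results[name] = per_module
    return results


# Hypothetical input: one 4x4 page with a black one-pixel border.
pages = {
    "page1": [[0, 0, 0, 0],
              [0, 200, 50, 0],
              [0, 220, 30, 0],
              [0, 0, 0, 0]],
}
workflow = [
    ("border_removal", crop_one_pixel_border),
    ("binarisation", fixed_threshold),
]
results = apply_processes(pages, workflow)
```

Looking up `results["page1"]["border_removal"]` in this sketch plays the role of inspecting an intermediate module's result, while the last entry in the workflow holds the final output.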
  18. Indicative results – Border Removal
       - 22383 images to test border removal
           BL: 7%, BNE: 34%, BNF: 34%, BSB: 11%, JSI: 6%, NLB: 2%, ONB: 6%
       - Only images with borders
       - 38718 images to test border removal
           BL: 9%, BNE: 29%, BNF: 32%, BSB: 12%, JSI: 11%, NLB: 2%, ONB: 5%
  19. Indicative results – Page splitting
       - 458 images from BNF to test page split
       - 3009 images to test page split
           BL: 72%, BSB: 10%, JSI: 18%
  20. Indicative results – Dewarping
       - IMPACT Page Curl Correction v.4: 87.78% (81.98% only coarse correction)
       - BookRestorer: 80.87%
  21. Research and references
       - N. Stamatopoulos, B. Gatos, I. Pratikakis and S. J. Perantonis, "Goal-oriented Rectification of Camera-Based Document Images", IEEE Transactions on Image Processing, vol. 20, no. 4, pp. 910-920, 2011.
       - N. Stamatopoulos, B. Gatos and T. Georgiou, "Page frame detection for double page document images", 9th IAPR International Workshop on Document Analysis Systems (DAS 2010), pp. 401-408, Cambridge, MA, USA, June 2010.
       - B. Gatos, I. Pratikakis and S. J. Perantonis, "Adaptive Degraded Document Image Binarization", Pattern Recognition, vol. 39, pp. 317-327, 2006.
  22. Questions?
