Scene Text Detection on Images using Cellular Automata

Textual information in images is a very rich source of high-level semantics for retrieval and indexing. This paper proposes a new approach based on Cellular Automata (CA) for identifying scene text in natural images. First, a binary edge map is calculated. Then, taking advantage of the flexibility of CA, a different transition rule is applied at each of four consecutive steps, giving a CA evolution of four time steps. Finally, for images with high edge density, a post-processing technique based on edge projection analysis eliminates possible false positives. Evaluation results indicate considerable performance gains without sacrificing text detection accuracy.
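
The four-step evolution described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the 3x3 neighbourhood, the zero padding at the image border, and the "more than half" majority threshold are all hypothetical choices, since the transcript does not specify them. The function names (`neighbourhood`, `step`, `evolve`) are likewise made up for this sketch.

```python
from typing import Callable, List

Grid = List[List[int]]  # binary edge map: 1 = edge pixel, 0 = background


def neighbourhood(grid: Grid, r: int, c: int) -> List[int]:
    """3x3 neighbourhood of cell (r, c); cells outside the grid count as 0 (assumption)."""
    rows, cols = len(grid), len(grid[0])
    return [
        grid[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
    ]


def step(grid: Grid, rule: Callable[[List[int]], int]) -> Grid:
    """One synchronous CA time step: every new cell is computed from the old grid."""
    return [
        [rule(neighbourhood(grid, r, c)) for c in range(len(grid[0]))]
        for r in range(len(grid))
    ]


def rule_or(cells: List[int]) -> int:
    """Logical OR over the neighbourhood (plays the role of dilation)."""
    return int(any(cells))


def rule_and(cells: List[int]) -> int:
    """Logical AND over the neighbourhood (plays the role of erosion)."""
    return int(all(cells))


def rule_majority(cells: List[int]) -> int:
    """Majority state rule: 1 if more than half of the cells are 1 (assumed threshold)."""
    return int(sum(cells) * 2 > len(cells))


def evolve(edge_map: Grid) -> Grid:
    """Apply the four rules in the order OR, AND, OR, majority."""
    grid = edge_map
    for rule in (rule_or, rule_and, rule_or, rule_majority):
        grid = step(grid, rule)
    return grid
```

Because the rule is passed in as a function, changing the transition rule between time steps amounts to iterating over a tuple of rules, which mirrors the "changing rules" idea in the abstract.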

  1. Scene Text Detection on Images using Cellular Automata. Konstantinos Zagoris and Ioannis Pratikakis, Image Processing and Multimedia Lab, Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece. kzagoris@ee.duth.gr, ipratika@ee.duth.gr
  2. Outline: Introduction; State of the Art; Disadvantages; Architecture of the proposed method; Canny Edge Detector; Coordinating Logic Filters (CLF); Proposed Cellular Automata Text Detection Method; Evaluation and Experimental Results.
  3. Introduction. Textual information in images or video is a very rich source of high-level semantics for retrieval and indexing. It can be acquired as scene text captured by a video or photo camera as part of a scene. Text detection in natural scenes is still a hard task, and existing methods have a very high computational cost.
  4. State of the Art. Methods split into two categories: region-based and texture-based. Region-based algorithms group pixels based on common characteristics. Texture-based methods scan the image at different scales using a sliding window and classify text areas based on texture information. From another perspective, methods can be divided into heuristic-based and machine learning-based. Heuristic-based algorithms segment the image into small regions and then group them according to some constraints. Machine learning-based methods use directly
  5. Disadvantages. Many parameters have to be estimated experimentally, which condemns these methods to data dependency and a lack of generality. When the background is really complex, they become computationally expensive. Texture-based techniques cannot satisfactorily capture text that is larger than the sliding window, and enlarging the window makes these methods quite costly. In addition, they still use empirical thresholds on specific features and therefore lack adaptability.
  6. Proposed Method. Address the scene text detection problem by modeling texture in a cellular automata (CA) context. Replace costly image processing operations with their equivalent cellular operations. Eliminate most limitations, such as the empirical thresholds and heavy computational procedures.
  7. Architecture of the proposed method. Original Image → Canny Edge Map → Cellular Automata (Coordinating Logic Filters: Logical OR, Logical AND, Logical OR; then Majority State Rule) → Edge Projection Filtering → Final Text
  8. Coordinating Logic Filters (CLF). CLF execute coordinate logic operations among the pixels of the image. The CLF operations are similar to morphological operations and achieve similar functionality: morphological dilation corresponds to the logical OR, and morphological erosion to the logical AND.
  9. Canny Edge Detector. Detects the salient image edges. Uses Sobel masks, thresholding, and non-maxima suppression (low threshold equal to 20 and high threshold equal to 100). The final edge map is a binarised image with the contour pixels set to one (white) and the remaining pixels set to zero (black). This approach exploits the fact that text lines produce strong vertical edges that are horizontally aligned with a high density. gives us the opportunity to detect normal or
  10. Canny Edge Detector
  11. Proposed Cellular Automata. The proposed CA is a 2-D lattice of cells in which every pixel is represented by a cell. The width and height of the CA grid are defined by the width and height of the edge image. Each cell has two states, as the input image is binary. Taking advantage of the flexibility of CA, a different transition rule is applied at each of four consecutive steps, giving a CA evolution of four time steps.
  12. 1st Step – Logical OR
  13. 1st Step – Logical OR
  14. 2nd Step – Logical AND
  15. 2nd Step – Logical AND
  16. 3rd Step – Logical OR
  17. 3rd Step – Logical OR
  18. Majority State Rule
  19. 4th Step – Majority State Rule
  20. Edge Projection Filtering. In images with high edge density, the method produces a number of false positives, so post-processing filtering is required to remove them. The candidate areas are filtered based on their horizontal and vertical projections: areas whose mean horizontal and vertical projections are below a threshold are discarded.
  21. Edge Projection Filtering
  22. Examples
  23. Examples
  24. Evaluation
  25. Evaluation. [1] Wolf, C., Jolion, J.M.: Object count/area graphs for the evaluation of object detection and segmentation algorithms. International Journal on Document Analysis and Recognition 8(4), 280–296 (2006)
  26. Experimental Results. To showcase the advantages of the proposed method, we test it against a machine-learning, edge-based scene text detection system. For that system, we replace the CLF with the corresponding morphological operations (dilation and opening) and the majority state rule with a Support Vector Machine (SVM) classifier.

      Method                          Recall   Precision   Harmonic Mean
      Proposed CA-based method        0.7942   0.7462      0.7652
      Machine-learning based method   0.7134   0.5234      0.6038

  27. Experimental Results. Mean execution time of each method over a set of 15 images on an Intel Core 2 Quad CPU Q9550 (2.83 GHz) machine.

      Method                          Mean Execution Time (sec)
      Proposed CA-based method        2.75
      Machine-learning based method   5.96
  28. Conclusions. A method based on Cellular Automata was presented for the detection of scene text in natural images. Initially, the Canny edge detector is employed to expose the dominant edges in the image. Then a CA is used to calculate the candidate text areas. Its rules depend on Coordinating Logic Filters and on the majority state rule. A post-processing technique based on edge projection analysis is employed for images with high edge density in order to eliminate the false positives.
  29. Thank You!
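
The edge projection filtering on slide 20 is only described in one sentence, so the following is a rough sketch of one way it might work rather than the authors' procedure. It assumes that candidate text areas arrive as rectangular, non-empty boxes, that a box is discarded when either mean projection falls below its threshold, and that projections are raw edge-pixel counts per row and per column. The names `mean_projections` and `filter_boxes` and the threshold parameters `min_h` and `min_v` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[int]]           # binary edge map: 1 = edge pixel, 0 = background
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def mean_projections(edges: Grid, box: Box) -> Tuple[float, float]:
    """Mean horizontal (per-row) and vertical (per-column) edge counts inside a box.

    The box is assumed to be non-empty and to lie inside the edge map.
    """
    top, left, bottom, right = box
    window = [row[left:right] for row in edges[top:bottom]]
    horizontal = [sum(row) for row in window]      # edge pixels in each row
    vertical = [sum(col) for col in zip(*window)]  # edge pixels in each column
    return sum(horizontal) / len(horizontal), sum(vertical) / len(vertical)


def filter_boxes(edges: Grid, boxes: List[Box], min_h: float, min_v: float) -> List[Box]:
    """Keep only boxes whose mean projections both reach the (assumed) thresholds."""
    kept = []
    for box in boxes:
        mean_h, mean_v = mean_projections(edges, box)
        if mean_h >= min_h and mean_v >= min_v:
            kept.append(box)
    return kept


# Tiny usage example: a dense block of edges and a nearly empty region.
edges = [
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
dense_box = (0, 0, 2, 4)
sparse_box = (2, 2, 4, 6)
surviving = filter_boxes(edges, [dense_box, sparse_box], min_h=2.0, min_v=1.0)
```

With these illustrative thresholds only the dense box survives, which is the intended effect: regions with too few edge pixels per row and column are treated as false positives.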