Detecting Text in Natural Scenes with Stroke Width Transform
Presented by,
POOJA G N
Overview
• Introduction
• Steps involved in text detection algorithm
• Edge map
• Stroke width transform
• Finding letter candidates
• Grouping letter candidates
• Strength and weakness of SWT
• Results
• Applications
• References
Introduction
• With the increasing use of digital image-capturing devices,
content-based image analysis techniques have received intensive
attention in recent years.
• As indicative marks in natural scene images, text information
provides brief and significant clues for many image-based
applications.
• We present an image operator that seeks to find the value of
stroke width for each image pixel, and demonstrate its use on
the task of text detection in natural images.
Introduction (contd.)
Current text detection approaches can be roughly classified into three groups:
• Region-based approaches
These use similarity criteria of text, such as color, size, stroke
width, edge and gradient information, to gather pixels.
• Texture-based approaches
These use distinct textural properties of text regions to extract candidate
sub-windows; the final outputs are formed by merging these sub-windows.
• Hybrid approaches
These combine the advantages of region-based approaches, which can closely
cover text regions, and texture-based approaches, which can estimate
coarse text locations in scenes.
Steps involved in text detection algorithm
1. Image (input)
2. Edge map
• Here we use the Canny edge detection algorithm.
• The Canny edge detector is an edge detection operator that uses
a multi-stage algorithm to detect a wide range of edges in images.
[Figure: input image and edge-detected image]
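As a rough illustration of what an edge map holds, the sketch below computes Sobel gradients and applies a single magnitude threshold. It is a simplified stand-in, not the Canny detector itself: Canny additionally smooths the image, thins edges by non-maximum suppression, and links them with hysteresis thresholding. The function name `sobel_edge_map`, the toy image, and the threshold value are assumptions made for this example.

```python
import math

def sobel_edge_map(gray, threshold):
    """Return (edges, gx, gy) for a 2-D list of grey levels.

    Simplified stand-in for Canny: Sobel gradients plus one magnitude
    threshold. No smoothing, non-maximum suppression or hysteresis.
    """
    h, w = len(gray), len(gray[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    edges = [[False] * w for _ in range(h)]
    # Border pixels are skipped because the 3x3 kernel does not fit there.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            sy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            gx[y][x], gy[y][x] = float(sx), float(sy)
            edges[y][x] = math.hypot(sx, sy) >= threshold
    return edges, gx, gy

# Toy image (assumed): a dark vertical stroke in columns 3-5 on a bright background.
image = [[0 if 3 <= x <= 5 else 255 for x in range(9)] for _ in range(5)]
edges, gx, gy = sobel_edge_map(image, threshold=500)
```

The gradients `gx` and `gy` are returned alongside the edge map because the stroke width transform needs the gradient direction at every edge pixel. Without non-maximum suppression the edges here come out two pixels thick, which is one reason a real pipeline would use Canny.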
3. Stroke Width Transform
SWT is a local operator that calculates, for each pixel, the width of the most likely
stroke containing the pixel.
[Figure: panels (a), (b) and (c)]
The figures show the implementation of the SWT, where
(a) A typical stroke. The pixels of the stroke in
this example are darker than the background
pixels.
(b) p is a pixel on the boundary of the stroke.
Searching in the direction of the gradient at
p leads to finding q, the
corresponding pixel on the other side of the
stroke.
(c) Each pixel along the ray is assigned the
minimum of its current value and the
found width of the stroke.
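The ray search in panels (b) and (c) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the `dark_on_light` flag, the `max_len` cap, and the pi/6 tolerance used to decide whether the gradient at q roughly opposes the gradient at p are all choices made for this example. For dark text on a bright background the image gradient points out of the stroke, so the ray is walked against it.

```python
import math

def stroke_width_transform(edges, gx, gy, dark_on_light=True,
                           max_len=50, tol=math.pi / 6):
    """Assign each pixel the smallest stroke width of any ray crossing it."""
    h, w = len(edges), len(edges[0])
    swt = [[math.inf] * w for _ in range(h)]
    sign = -1.0 if dark_on_light else 1.0
    for y in range(h):
        for x in range(w):
            if not edges[y][x]:
                continue
            norm = math.hypot(gx[y][x], gy[y][x])
            if norm == 0:
                continue
            # Unit direction of travel from the boundary pixel p into the stroke.
            dx, dy = sign * gx[y][x] / norm, sign * gy[y][x] / norm
            ray = [(x, y)]
            for step in range(1, max_len):
                cx, cy = int(round(x + dx * step)), int(round(y + dy * step))
                if not (0 <= cx < w and 0 <= cy < h):
                    break  # left the image without finding q
                if (cx, cy) == ray[-1]:
                    continue  # rounding landed on the same pixel again
                ray.append((cx, cy))
                if edges[cy][cx]:
                    qn = math.hypot(gx[cy][cx], gy[cy][cx])
                    if qn > 0:
                        # Accept q only if its gradient roughly opposes p's.
                        dot = sign * (dx * gx[cy][cx] + dy * gy[cy][cx]) / qn
                        if dot <= -math.cos(tol):
                            width = math.hypot(cx - x, cy - y)
                            for px, py in ray:
                                swt[py][px] = min(swt[py][px], width)
                    break
    return swt

# Toy input (assumed): edges of a dark vertical stroke at columns 2 and 6.
h, w = 5, 9
edges = [[x in (2, 6) for x in range(w)] for _ in range(h)]
gx = [[-1.0 if x == 2 else 1.0 if x == 6 else 0.0 for x in range(w)]
      for _ in range(h)]
gy = [[0.0] * w for _ in range(h)]
swt = stroke_width_transform(edges, gx, gy)
```

Pixels never crossed by an accepted ray keep the value `math.inf`, which makes it easy to treat them as background in later steps.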
4. Finding letter candidates
The rules for accepting components as letter candidates are as follows:
• The variance of the stroke width within a
component must not be too large.
• The aspect ratio of a component must be within a
small range of values, in order to reject long and
narrow components.
• Components that are too large or too small
are also ignored.
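The component rules for letter candidates can be expressed as a small filter. The slides give no numeric limits, so every threshold below (variance at most half the mean stroke width, aspect ratio between 0.1 and 10, height between 10 and 300 pixels) is an assumed value for illustration, as is the function name `is_letter_candidate`.

```python
from statistics import mean, pvariance

def is_letter_candidate(stroke_widths, width, height,
                        max_var_ratio=0.5,
                        min_aspect=0.1, max_aspect=10.0,
                        min_height=10, max_height=300):
    """Apply the three rejection rules to one connected component.

    stroke_widths: SWT values of the component's pixels.
    width, height: bounding-box size in pixels.
    All thresholds are assumed defaults, not values from the slides.
    """
    # Rule 1: stroke width must be roughly constant inside a letter.
    if pvariance(stroke_widths) > max_var_ratio * mean(stroke_widths):
        return False
    # Rule 2: reject long and narrow components.
    aspect = width / height
    if not (min_aspect <= aspect <= max_aspect):
        return False
    # Rule 3: reject components that are too small or too large.
    return min_height <= height <= max_height
```

For example, `is_letter_candidate([4, 4, 5, 4, 5], 20, 30)` passes all three rules, while a component with stroke widths `[2, 10, 3, 12, 2]` fails the variance rule.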
5. Grouping letter candidates into regions of text
• Pixels are grouped into letter candidates based on their stroke width.
• The grouping is done with a connected-component algorithm.
• The image partition creates a set of connected components from an input
image, including both text characters and unwanted noise.
• We perform structural analysis of text strings to distinguish connected
components representing text characters from those representing noise.
• Assuming that a text string has at least three characters in alignment, we
develop two methods to locate regions containing text strings: adjacent
character grouping and text line grouping.
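The connected-component grouping by stroke width mentioned above might look like the sketch below. It is a breadth-first labeling over 8-connected neighbours in which two pixels join the same component only when their stroke widths are similar. The ratio limit of 3.0 and the name `label_components` are assumptions for this example, and background pixels are assumed to hold `math.inf`.

```python
import math
from collections import deque

def label_components(swt, max_ratio=3.0):
    """Label 8-connected pixels whose stroke widths differ by at most max_ratio.

    Returns (labels, count); label 0 marks background (infinite stroke width).
    Stroke widths are assumed to be positive.
    """
    h, w = len(swt), len(swt[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if math.isinf(swt[y][x]) or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(x, y)])
            while queue:
                cx, cy = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if not (0 <= nx < w and 0 <= ny < h):
                            continue
                        if labels[ny][nx] or math.isinf(swt[ny][nx]):
                            continue
                        a, b = swt[cy][cx], swt[ny][nx]
                        # Neighbours merge only if their widths are similar.
                        if max(a, b) / min(a, b) <= max_ratio:
                            labels[ny][nx] = count
                            queue.append((nx, ny))
    return labels, count

INF = math.inf
toy_swt = [
    [2, 2, 9, INF, 3],
    [2, 2, 9, INF, 3],
]
toy_labels, toy_count = label_components(toy_swt)
```

In the toy grid the pixels of width 2 and width 9 touch but are not merged, because 9 / 2 exceeds the assumed ratio limit, so three components result.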
Grouping letter candidates into regions of text (contd.)
• Closely positioned letter candidates are grouped into regions of text.
• This filters out many falsely identified letter candidates and improves the
reliability of the algorithm's results.
The rules for pairing letters are as follows:
• Two letter candidates should have similar
stroke widths.
• The distance between letters must not
exceed three times the width of the wider
one.
• Characters of the same word are expected
to have a similar color; therefore we
compare the average color of the candidates
for pairing.
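The pairing rules can be written as a single predicate over two letter candidates. Only the factor of three for the distance rule comes from the slides; the `Letter` record, the stroke-width ratio limit of 2.0, and the color distance limit of 40 (Euclidean distance in RGB) are hypothetical choices for this sketch.

```python
import math
from dataclasses import dataclass

@dataclass
class Letter:
    cx: float       # centre x of the bounding box
    cy: float       # centre y of the bounding box
    width: float    # bounding-box width in pixels
    stroke: float   # representative stroke width of the component
    color: tuple    # average (R, G, B) of the component's pixels

def can_pair(a, b, max_stroke_ratio=2.0, max_dist_factor=3.0,
             max_color_dist=40.0):
    """Return True if two letter candidates satisfy all three pairing rules."""
    # Rule 1: similar stroke width (ratio limit is an assumed value).
    if max(a.stroke, b.stroke) / min(a.stroke, b.stroke) > max_stroke_ratio:
        return False
    # Rule 2: distance at most three times the width of the wider letter.
    dist = math.hypot(a.cx - b.cx, a.cy - b.cy)
    if dist > max_dist_factor * max(a.width, b.width):
        return False
    # Rule 3: similar average color (distance limit is an assumed value).
    return math.dist(a.color, b.color) <= max_color_dist

a = Letter(10, 20, 12, 4.0, (20, 20, 20))
b = Letter(30, 20, 14, 4.5, (25, 22, 18))
far = Letter(200, 20, 12, 4.0, (20, 20, 20))
bright = Letter(30, 20, 14, 4.5, (220, 220, 220))
```

Here `can_pair(a, b)` holds, while `far` fails the distance rule and `bright` fails the color rule. Accepted pairs would then be chained into longer text lines.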
Resultant Image at each step of the algorithm
Strengths of SWT
• The SW detector can detect letters of different languages (English, Hebrew, Arabic, etc.).
• The text can be of varying sizes.
• The text can be of different orientations, including curved text.
• Even handwriting can be detected.
Weaknesses of SWT
• Noise can appear in the output.
• Foliage can resemble letters.
• Round and curved letters are not handled.
• Small, closely spaced letters tend to be grouped together in the SW labeling phase, and these
groups may be dismissed in the "finding letter candidates" phase.
Results
Applications
• Mobile text recognition
• Content-based web image search
• Automatic geocoding
• Robotic navigation
• License plate reading
References
1) Gili Werner, "Text Detection in Natural Scene with Stroke Width Transform," ICBV,
February 2013.
2) B. Epshtein, E. Ofek, and Y. Wexler, "Detecting text in natural scenes with stroke
width transform," in Computer Vision and Pattern Recognition (CVPR), Conference
on. IEEE, 2010.
3) Hemil A. Patel and Kishori S. Shekokar, "Text Detection in Natural Scenes with
Stroke Width Transform," [Patel, 3(11): November, 2014], ISSN: 2277-9655.
4) L. Neumann and J. Matas, "A method for text localization and recognition in real-world
images," ACCV, 2010.
Any queries?
Thank you
