Scene Text Recognition in Images – A
Deep Learning Era Survey
Scene text carries rich semantic information, which has drawn increasing
attention in recent years. Text identification has enabled many
applications in computer vision and natural language processing.
Techniques developed in recent years address traditional problems and
open up new applications. In this article I survey the approaches and
state-of-the-art methods developed for text recognition systems over the
past couple of years.
This article gives you an understanding of the following:
1. The work at a high level
2. Previous techniques
3. Recent advances
4. Popular datasets available
5. Future areas
Overview of the work:
Techniques used to detect and recognize text in scene images and
videos fall into three categories:
Text Detection and Localization: This determines whether text is
present in the given scene image or video and, if so, identifies the
location of the text.
Challenges: Varying orientations, languages, colors, and sizes; complex
backgrounds; occlusion; blur; noise; non-uniform illumination.
Text Recognition: Text recognition aims at converting the localized text
in images into character codes.
Challenges:
1. Scene complexity: Images or videos generally suffer from noise,
distortion, non-uniform illumination, and partial occlusion, and the
text can be confused with the background. Complex backgrounds make
text detection and recognition harder in the real world.
2. Text diversity and more stringent practical requirements: Scene
text varies in color, size, orientation, font, and language, and parts
of the text may be missing.
End-to-End Text Recognition: This combines both text detection and
localization with recognition.
Classical Previous Methods and Their Adapted
Methodologies
• Photo OCR system for text extraction
1. Text detection
2. Segmentation (Niblack binarization, a local thresholding approach
used together with morphological operations; binary sliding-window
classifier)
3. Beam search
4. Character classifier (a fully connected network with ReLU units)
5. Language modeling (a standard N-gram model)
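The segmentation step above mentions Niblack binarization. As a minimal sketch of the idea (not the Photo OCR system's actual code), the threshold for each pixel is the local mean plus k times the local standard deviation. The function name, the window size, and the value of k are illustrative assumptions, and the image is a plain list of rows of gray values.

```python
import math


def niblack_binarize(image, window=3, k=-0.2):
    """Binarize a grayscale image (list of rows) with Niblack's local threshold.

    For each pixel the threshold is T = mean + k * std, computed over a
    window x window neighbourhood clipped at the image border.
    Pixels darker than T are marked as text (1), everything else as 0.
    The default window and k are illustrative, not tuned values.
    """
    height, width = len(image), len(image[0])
    half = window // 2
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the neighbourhood values around (x, y).
            values = [
                image[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            threshold = mean + k * math.sqrt(variance)
            result[y][x] = 1 if image[y][x] < threshold else 0
    return result
```

With a negative k, a dark stroke on a bright background falls below its local threshold, while a uniform region produces no foreground at all.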
• Real-time lexicon-free scene text localization and recognition
1. Character detection: extremal regions (ERs), incrementally computable
descriptors (area, bounding box, perimeter), and a sequential classifier
(first a Real AdaBoost classifier with decision trees, then an SVM with
an RBF kernel)
2. Text line formation
3. Character recognition
4. Sequence selection
• Recursive recurrent nets with attention modeling for OCR in the wild
1. Character sequence model (encodes image features using recursive
convolutional layers and decodes text using RNNs)
2. RNN with an attention function
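To make the attention function more concrete, here is a minimal sketch of soft attention: the decoder state scores every image feature vector, a softmax turns the scores into weights, and the weighted sum becomes the context used to predict the next character. Dot-product scoring and the names `attend`, `decoder_state`, and `image_features` are assumptions for illustration; the source does not say which scoring function the model uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, image_features):
    """Return (weights, context) for one decoding step.

    decoder_state: list of floats (current RNN state).
    image_features: list of feature vectors, one per image location.
    Dot-product scoring is an assumption made for this sketch.
    """
    scores = [
        sum(q * f for q, f in zip(decoder_state, feature))
        for feature in image_features
    ]
    weights = softmax(scores)
    dim = len(image_features[0])
    # Context vector: attention-weighted sum of the feature vectors.
    context = [
        sum(w * feature[d] for w, feature in zip(weights, image_features))
        for d in range(dim)
    ]
    return weights, context
```

A state that matches one feature vector strongly puts almost all of the weight on that location, which is how the decoder "looks" at one character region at a time.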
Recent Advances
Text Detection and Localization: This mainly focuses on processing
the images to identify the text and its position. It is broadly
categorized into the three forms below.
Connected component (CC)-based methods: Find the smaller
components, combine them into larger ones, and filter out non-text
components using a classifier. Finally, extract the text components from
the image and combine them into one text region. An arguable limitation
of this methodology is handling rotation, scale changes, and complex
backgrounds.
Techniques that address this:
1. Maximally Stable Extremal Regions (MSER): It is robust to geometric
and illumination changes. Its disadvantage is that it adapts only to
horizontal text.
2. Stroke Width Transform (SWT): It appears to be very efficient and
has the advantage of detecting text in any font or language. It is
insensitive to direction and scale.
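The connected component approach described above can be sketched in a few lines: label the foreground regions of a binary image, discard components that do not look like characters, and merge the rest into one text region. This is only an illustration under stated assumptions: a simple area and aspect-ratio rule stands in for the trained classifier, and all function names and thresholds are hypothetical.

```python
from collections import deque


def connected_components(binary):
    """Find 4-connected foreground regions in a binary image (list of rows).

    Returns a list of dicts with an inclusive bounding box
    (x0, y0, x1, y1) and the pixel area of each component.
    """
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            components.append({"box": (x0, y0, x1, y1), "area": area})
    return components


def filter_text_like(components, min_area=2, max_aspect=5.0):
    """Keep components that pass a simple rule (a stand-in for a classifier)."""
    kept = []
    for comp in components:
        x0, y0, x1, y1 = comp["box"]
        w, h = x1 - x0 + 1, y1 - y0 + 1
        aspect = max(w, h) / min(w, h)
        if comp["area"] >= min_area and aspect <= max_aspect:
            kept.append(comp)
    return kept


def merge_boxes(components):
    """Combine the remaining components into one enclosing text region."""
    boxes = [c["box"] for c in components]
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))
```

Because the filter relies on fixed size and shape rules, it shows why rotation, scale changes, and cluttered backgrounds are hard for this family of methods.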
Texture-based methods: The idea is to find text in images by its
distinct textural properties, which separate it from the background.
The techniques below are mostly used in this method.
1. Gabor filters
2. Wavelet transform
3. Fast Fourier transform (FFT)
4. Discrete cosine transform (DCT) domain
5. Laplacian wavelet, wavelet decomposition
6. Symmetry-based text line detector, built on the observation of the
symmetry and self-similarity properties of character groups.
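As an illustration of the first technique in the list, the sketch below builds a real-valued Gabor kernel (a Gaussian envelope multiplied by a cosine carrier) and measures how strongly an image patch responds to it. Stroke-like patterns at the kernel's orientation and wavelength respond much more strongly than flat regions. The function names and parameter values are illustrative assumptions, and the kernel size is assumed to be odd.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Build a size x size real Gabor kernel (size is assumed to be odd).

    g(x, y) = exp(-(xr^2 + (gamma * yr)^2) / (2 * sigma^2))
              * cos(2 * pi * xr / wavelength + psi)
    where (xr, yr) are the coordinates rotated by theta.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(patch, kernel):
    """Absolute correlation between a patch and a kernel of the same size."""
    return abs(sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    ))
```

In practice a bank of such kernels at several orientations and wavelengths is applied across the image, and the responses become texture features for a text/non-text decision.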
Deep learning-based methods: CNNs have been widely explored and have
changed the field entirely, answering previously unresolved questions.
Their main advantage is that they extract features directly from the
images at a low computational cost. The advanced properties of CNNs have
helped a lot with scene text detection in natural images. Within
deep learning-based methods, CNN implementations broadly fall into the
3 groups below.
1. Region proposal-based methods: Simple CNNs with MSER, and R-CNNs
and their successors. R-CNNs use bounding box detection, creating
boxes around the objects in the image. Faster R-CNN classifies and
detects objects in two stages: a region proposal network (RPN), which
acts like an attention mechanism, proposes regions of interest (ROIs),
and ROI pooling then extracts fixed-size features for each ROI, from
which the class label is predicted. Mask R-CNN extends this to
instance segmentation: for each bounding box it also generates a
shaded mask of the object (this is instance segmentation rather than
semantic segmentation). Pixel-by-pixel fully convolutional
networks can also be categorized under this method.
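ROI pooling, mentioned above, can be illustrated with a short sketch: whatever the size of the proposed region, it is divided into a fixed grid of bins and the maximum value in each bin is kept, so every ROI yields features of the same shape for the classifier. The function names, the single-channel feature map, and the inclusive box convention are assumptions for this example, not the API of any particular library.

```python
def ceil_div(a, b):
    """Integer ceiling division for positive b."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one region of interest to a fixed output size.

    feature_map: 2D list (a single channel) of numbers.
    roi: inclusive box (x0, y0, x1, y1) in feature-map coordinates.
    output_size: (rows, cols) of the pooled result.
    """
    x0, y0, x1, y1 = roi
    out_h, out_w = output_size
    roi_h, roi_w = y1 - y0 + 1, x1 - x0 + 1
    pooled = []
    for by in range(out_h):
        # Bin b covers [floor(b * size / n), ceil((b + 1) * size / n)).
        row_start = y0 + (by * roi_h) // out_h
        row_end = y0 + ceil_div((by + 1) * roi_h, out_h)
        pooled_row = []
        for bx in range(out_w):
            col_start = x0 + (bx * roi_w) // out_w
            col_end = x0 + ceil_div((bx + 1) * roi_w, out_w)
            pooled_row.append(max(
                feature_map[y][x]
                for y in range(row_start, row_end)
                for x in range(col_start, col_end)
            ))
        pooled.append(pooled_row)
    return pooled
```

Both a 4x4 region and a 3x3 region come out as 2x2 grids here, which is the property that lets one classification head handle proposals of any size.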
2. Segmentation-based methods: These mainly focus on producing
more precise multi-size text detection but are ineffective at detecting
individual words. An example is a cascade of two convolutional
networks: TextSegNet and a YOLO-based WordDetNet. A superpixel
segmentation method with hierarchical clustering for extracting new
character candidates also belongs to the segmentation techniques.
3. Hybrid methods using multitask learning: These take into account
both CC-based methods and texture-based methods. The techniques below
fall under this group.
a. Character candidate detection (cascade boosting technique) with a
min-cost flow network (false character candidate removal, text line
extraction, text line verification).
b. Connectionist Text Proposal Network (CTPN), which can be extended to
multilingual and multi-scale text detection (vertical anchor mechanism,
in-network RNN).
c. Text-attentional convolutional neural network (Text-CNN) with
contrast-enhanced MSER.
Text Recognition: Identifying and understanding the text in the
candidate region. This is mainly classified into the 3 groups below.
Character-based methods:
1. “Strokelets”: a set of multi-scale mid-level primitives that can be
learned automatically from bounding box labels. They are very good at
describing characters.
2. Character recognition by extracting low-level features and
integrating them automatically via a region-based feature pooling
technique.
Word-based methods (recognizing text at the word level):
1. Dense SIFT in a bag-of-keypoints framework, with which characters
can be recognized robustly.
2. Word segmentation combined with recognition in a probabilistic
framework; lexical decision and sparse beam search were used to improve
the recognition accuracy.
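Beam search combined with a language model can be sketched as follows: at every character position only the best few partial words are kept, scored by the character classifier's probability plus a bigram language model term. This is a simplified illustration, not the method of any cited system; the function name, the "^" start symbol, the penalty for unseen bigrams, and the toy probabilities in the usage note are all assumptions.

```python
import math


def beam_search(char_probs, bigram_logp, beam_width=2, lm_weight=1.0,
                unseen_logp=-5.0):
    """Decode a word from per-position character probabilities.

    char_probs: list over positions of {char: probability}, with all
        probabilities strictly positive (classifier output).
    bigram_logp: {(previous_char, char): log-probability}; "^" marks the
        start of the word. Missing pairs get unseen_logp.
    Returns the highest-scoring string found within the beam.
    """
    beams = [("", 0.0)]
    for position in char_probs:
        candidates = []
        for text, score in beams:
            previous = text[-1] if text else "^"
            for char, prob in position.items():
                lm = bigram_logp.get((previous, char), unseen_logp)
                candidates.append(
                    (text + char, score + math.log(prob) + lm_weight * lm)
                )
        # Keep only the best beam_width partial hypotheses.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]
```

For example, if the classifier slightly prefers "o" over "a" in the middle of a three-letter word, it alone would read "cot", but a bigram model that favors "ca" and "at" can pull the result to "cat".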
Sequence-based methods: Text is represented as a character sequence.
1. An irregular text recognizer called RARE (Robust text
recognizer with Automatic REctification). This model combines a
spatial transformer network (STN) and a sequence recognition
network (SRN).
2. A lexicon-free photo-OCR system called recursive recurrent neural
networks with attention modeling (R2AM).
3. Convolutional recurrent neural network (CRNN).
4. An auxiliary dense character detection model combined with an
attention model.
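Sequence-based recognizers such as CRNN emit one prediction per image column (frame) rather than one per character, and CRNN is commonly paired with connectionist temporal classification (CTC) to turn those frames into text. The sketch below shows greedy CTC decoding only: take the best label per frame, collapse repeated labels, and drop the blank symbol. The function name and the label layout (blank at index 0) are assumptions for illustration.

```python
def ctc_greedy_decode(frame_probs, labels, blank=0):
    """Greedy CTC decoding.

    frame_probs: list of per-frame probability lists, one entry per label.
    labels: list of symbols; labels[blank] is the CTC blank placeholder.
    Returns the decoded string.
    """
    # Best label index for every frame.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_probs]
    decoded = []
    previous = None
    for index in best:
        # Emit a symbol only when the label changes and is not the blank.
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)
```

The blank is what allows doubled letters: in the frame sequence "h h - e l l - l o" the two runs of "l" are separated by a blank, so the output is "hello" rather than "helo".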
End-to-End Text Recognition System: CNNs drastically changed
the procedures and techniques for building a combined model that
handles text detection and text recognition in one place. TextBoxes
was developed as part of the advances in end-to-end models.
Sliding-window and connected-component methods are among the best-known
proposed techniques, including an unconstrained end-to-end real-time
text localization and recognition method.
Benchmark datasets for scene text identification in images are listed here:
https://airtable.com/tbl3hLG7GneipFrDD/viwGd2IomvBf5D5Ud?blocks=show
Future areas:
We have seen numerous approaches across different categories and
methods, all developing rapidly; the CNN+RNN framework is especially
popular. Now, let’s discuss some exciting future directions.
1. Complex scenes and large-scale datasets: You can refer to the COCO
dataset and try to build cases with multiple challenges.
2. Multilingual detection and recognition: Identifying multiple
languages, for example through script identification across varied
backgrounds and text sizes, is one of the areas I aspire to work on.
3. Real-time detection and recognition: Identifying text in
images in real time is another major focus area.

 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 

Scene Text detection in Images-A Deep Learning Survey

  • 1. Scene Text Recognition in Images – A Deep Learning Era Survey
  • 2. Scene text carries rich semantic information and has attracted increasing attention in recent years. Text identification has enabled many applications in computer vision and natural language processing, and recently developed techniques both address long-standing problems and open up new applications. In this article I survey the approaches and state-of-the-art methods developed for text recognition systems over the past couple of years.
       This article gives you an understanding of the following:
       1. The work at a high level
       2. Previous techniques
       3. Recent advances
       4. Popular datasets available
       5. Future areas
       Overview of the work: techniques used to detect and recognize text in scene images and videos fall into three categories.
       Text Detection and Localization: determines whether text is present in a given scene image or video and, if it is, identifies the location of the text.
  • 3. Challenges: varying orientations, languages, colors, and sizes; complex backgrounds; occlusion; blur; noise; non-uniform illumination.
       Text Recognition: converts the localized text in images into character codes.
       Challenges:
       1. Scene complexity: images and videos generally suffer from noise, distortion, non-uniform illumination, partial occlusion, and confusion between text and background. Complex backgrounds make text detection and recognition harder in real-world scenes.
       2. Text diversity and more stringent practical requirements: scene text varies in color, size, orientation, font, and language, and may be partially missing.
       End-to-End Text Recognition: combines text detection and localization with recognition.
       Classical Previous Methods and Their Adapted Methodologies
       - Photo OCR system for text extraction
       1. Text detection
       2. Segmentation (Niblack binarization (a morphological approach), binary sliding-window classifier)
       3. Beam search
  • 4. 4. Character classifier (a fully connected network with ReLU units)
       5. Language modeling (a standard n-gram model)
       - Real-time lexicon-free scene text localization and recognition
       1. Character detection: extremal regions; incrementally computable descriptors (area, bounding box, perimeter); a sequential classifier (first a Real AdaBoost classifier with decision trees, then an SVM with an RBF kernel)
       2. Text line formation
       3. Character recognition
       4. Sequence selection
       - Recursive recurrent nets (RRN) with attention modeling for OCR in the wild
       1. Character sequence model (encodes image features with recursive convolutional layers and decodes text with RNNs)
       2. RNN with an attention function
       Recent Advances
       Text Detection and Localization: focuses on processing images to identify text and its position. Methods fall broadly into three forms: connected component (CC)-based, texture-based, and deep learning-based.
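To make the segmentation step of the classical Photo OCR pipeline above more concrete, here is a minimal sketch of Niblack binarization. It assumes the usual formulation, in which each pixel is compared with a local threshold T = m + k·s (m and s being the mean and standard deviation of the surrounding window). The function name, the window size, and k = -0.2 are illustrative choices, not values taken from the slides.

```python
from statistics import mean, pstdev


def niblack_binarize(image, window=3, k=-0.2):
    """Return a 0/1 mask; 1 marks pixels darker than the local Niblack threshold.

    `image` is a list of rows of grayscale values (assumed dark text on a
    light background). `window` and `k` are illustrative defaults.
    """
    height, width = len(image), len(image[0])
    radius = window // 2
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Gather the neighbourhood, clipped at the image borders.
            patch = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # Niblack threshold: local mean plus k times local std deviation.
            threshold = mean(patch) + k * pstdev(patch)
            mask[y][x] = 1 if image[y][x] < threshold else 0
    return mask


if __name__ == "__main__":
    # A light 5x5 patch with one dark "ink" pixel in the middle.
    img = [[200] * 5 for _ in range(5)]
    img[2][2] = 20
    for row in niblack_binarize(img):
        print(row)
```

Because the threshold adapts to each neighbourhood, the method tolerates the non-uniform illumination listed among the challenges better than a single global threshold would. A real system would use an optimized image library rather than nested Python loops.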
  • 5. Connected component (CC)-based methods: find small components, merge them into larger ones, and filter out non-text components with a classifier. Finally, the text components are extracted from the image and combined into a single text region. An arguable limitation of this methodology is its handling of rotation, scale changes, and complex backgrounds. Techniques that address it:
       1. Maximally Stable Extremal Regions (MSER): robust to geometric and illumination changes. Its disadvantage is that it adapts only to horizontal text.
       2. Stroke Width Transform (SWT): efficient, detects text regardless of font or language, and is insensitive to direction and scale.
       Texture-based methods: the idea is that text in images has distinct textural properties that separate it from the background. The following techniques are the most commonly used:
       1. Gabor filters
       2. Wavelet transform
       3. Fast Fourier transform
       4. Discrete cosine transform (DCT) domain
       5. Laplacian wavelet, wavelet decomposition
  • 6. 6. Symmetry-based text line detector, built on the observed symmetry and self-similarity properties of character groups.
       Deep learning-based methods: CNNs have been widely explored, have changed the field, and have answered previously unresolved questions. Their main advantage is that they extract features directly from images at a lower computational cost. These properties have helped considerably with text detection in natural images. CNN-based implementations fall broadly into three groups:
       1. Region proposal-based methods: simple CNNs with MSER, and R-CNNs and their successors. R-CNNs use bounding-box detection, drawing boxes around the objects in an image. Faster R-CNN performs classification and detection of objects in two stages: a Region Proposal Network (RPN), which acts like an attention mechanism, proposes regions of interest (ROIs); ROI pooling then extracts features for each ROI so that it can be assigned a class label. Mask R-CNN performs instance segmentation: in addition to the bounding box, it generates a segmentation mask for each detected object. Pixel-by-pixel fully convolutional networks can also be placed in this category.
       2. Segmentation-based methods: these mainly aim at more precise multi-scale text detection but are ineffective at detecting individual words. An example is a cascade of two convolutional networks, TextSegNet and a YOLO-based WordDetNet. Superpixel segmentation with hierarchical clustering for character candidate extraction also belongs to this group.
  • 7. 3. Hybrid methods using multitask learning: these draw on both CC-based and texture-based methods. Examples:
       a. Character candidate detection (cascade boosting) with a min-cost flow network (false character candidate removal, text line extraction, text line verification).
       b. Connectionist Text Proposal Network, which can be extended to multilingual and multi-scale text detection (vertical anchor mechanism, in-network RNN).
       c. Text-attentional convolutional neural network (Text-CNN) with contrast-enhanced MSER.
       Text Recognition: identifying and understanding the text in the candidate region. Methods fall into three classes.
       Character-based methods:
       1. "Strokelets": a set of multi-scale mid-level primitives that can be learned automatically from bounding-box labels and that describe characters very well.
       2. Character recognition by extracting low-level features and integrating them automatically via region-based feature pooling.
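The box-based detectors described above (the R-CNN family and proposal networks) typically emit many overlapping candidate boxes for the same piece of text. A common post-processing step, not spelled out in the slides, is greedy non-maximum suppression (NMS) based on intersection over union (IoU). The sketch below is a plain-Python illustration; the function names, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes to keep, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


if __name__ == "__main__":
    candidate_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    candidate_scores = [0.9, 0.8, 0.7]
    print(non_max_suppression(candidate_boxes, candidate_scores))
```

Axis-aligned IoU works for horizontal text. Oriented or curved text generally needs polygon-based overlap instead, which ties into the rotation and orientation challenges listed earlier.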
  • 8. Word-based methods (recognizing text at the word level):
       1. Dense SIFT in a bag-of-keypoints framework, with which characters can be recognized robustly.
       2. Word segmentation combined with recognition in a probabilistic framework; lexical decision and sparse beam search are used to improve recognition accuracy.
       Sequence-based methods: text is represented as a character sequence.
       1. RARE (Robust text recognizer with Automatic REctification), an irregular-text recognizer that combines a spatial transformer network (STN) with a sequence recognition network (SRN).
       2. A lexicon-free photo-OCR system called recursive recurrent neural networks with attention modeling (R2AM).
       3. Convolutional recurrent neural network (CRNN).
       4. An auxiliary dense character detection model combined with an attention model.
       End-to-End Text Recognition System: CNNs have drastically changed the procedures and techniques for building a combined model that handles both text detection and text recognition in one place. Text boxes have been developed as part of the advances in proposed end-to-end models. Sliding-window and connected-component methods are the best-known proposed
  • 9. techniques; one such work proposed an unconstrained, end-to-end, real-time text localization and recognition method.
       Benchmark datasets for scene text identification in images are listed here: https://airtable.com/tbl3hLG7GneipFrDD/viwGd2IomvBf5D5Ud?blocks=show
       Future areas: We have seen numerous approaches across different categories and methods, all developing rapidly; the CNN+RNN framework is especially popular. Now, let's look at some exciting future directions.
       1. Complex scenes and large-scale datasets: you can refer to the COCO dataset and try to build more challenging cases.
       2. Multilingual detection and recognition: identifying multiple languages, including script identification across varied backgrounds and text sizes, is one of the areas I aspire to work on.
       3. Real-time detection and recognition: identifying text in images in real time is also a major focus area.
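As a closing illustration of the CNN+RNN sequence recognizers mentioned above: CRNN-style models are commonly trained with a CTC (Connectionist Temporal Classification) objective, and the simplest way to turn their per-frame scores into text is greedy CTC decoding. The sketch below assumes index 0 is the CTC blank and that `labels` maps class indices to characters; both, along with the function name, are illustrative conventions rather than details from the slides.

```python
def ctc_greedy_decode(frame_scores, labels, blank=0):
    """Greedy CTC decoding.

    frame_scores: one list of class scores per time step (per image column).
    labels: labels[i] is the character for class i; labels[blank] is unused.
    Steps: take the best class per frame, collapse consecutive repeats,
    then drop blanks.
    """
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    chars = []
    previous = None
    for index in best_path:
        if index != previous and index != blank:
            chars.append(labels[index])
        previous = index
    return "".join(chars)


if __name__ == "__main__":
    labels = ["-", "c", "a", "t"]  # "-" stands in for the blank class

    def one_hot(index, size=4):
        return [1.0 if i == index else 0.0 for i in range(size)]

    # Best path c c - a a - t collapses to "cat".
    frames = [one_hot(i) for i in [1, 1, 0, 2, 2, 0, 3]]
    print(ctc_greedy_decode(frames, labels))
```

The blank class is what lets the model emit doubled letters: two identical characters separated by a blank survive the collapse step, while adjacent repeats are merged. Beam search with a language model, as in the classical pipelines above, is the usual step up from greedy decoding.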