Hindi Scene Text Recognition
Guide: Dr. Gaurav Harit
Surya Yadav, Vikas Yadav, Vikas Goyal
Objective
Create a system that detects and recognizes characters in natural scene
images containing Devanagari text.
Motivation
• Hindi is the most widely spoken language in India and the third most spoken
language in the world.
• Most Devanagari websites use images to represent text. Such images need to
be indexed by the text they contain so that they can be searched easily.
• Tourists often face problems in India, so there is demand for an automated
system that understands natural scene images and provides translated
information.
• Scene text such as shop names, company names, traffic information, road signs
and other display boards is important to recognize and process.
Steps:
1. Natural scene image
2. Text block detection
3. Word and character segmentation
4. Feature detection and classification
5. Error correction
6. Output
Text Block Detection
Steps:
1. Image
2. Grayscale image
3. Canny edge map
4. Morphological closing
5. Connected component region extraction
6. Verification of uniform thickness
7. Use of script-specific rules
8. Use of similarity measures to find text regions missed in the previous step
Input Image
Gray Image
Canny Edge Map
We compute the Canny edge map of the grayscale image to obtain
the connected components.
Distance Transform of a binary image
Each pixel in the image is set to its distance from the nearest background pixel.
Computation of Stroke Thickness
• For each pixel with a nonzero value in the distance-transformed
image, if the pixel is a local maximum within the 3x3 window
centered on it, we store its value in a list.
• We compute the mean and standard deviation of the values in the
list.
• If the mean is greater than twice the standard deviation, we
decide that the thickness of the underlying stroke is nearly
uniform, select the sub-image as a candidate text region, and
draw its bounding box.
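The uniform-thickness check can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the binary image is a list of rows with 1 for foreground and 0 for background, the distance is Euclidean and computed by brute force, and the function names are hypothetical rather than taken from the project.

```python
import math
from statistics import mean, pstdev

def distance_transform(binary):
    """Distance from each foreground pixel (1) to the nearest background pixel (0).
    Brute force, so only suitable for small candidate regions.
    Assumes the image contains at least one background pixel if it has foreground."""
    h, w = len(binary), len(binary[0])
    background = [(r, c) for r in range(h) for c in range(w) if binary[r][c] == 0]
    dist = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if binary[r][c]:
                dist[r][c] = min(math.hypot(r - br, c - bc) for br, bc in background)
    return dist

def local_maxima(dist):
    """Values of nonzero pixels that are the maximum of their 3x3 window."""
    h, w = len(dist), len(dist[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            v = dist[r][c]
            if v <= 0:
                continue
            window = [dist[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            if v >= max(window):
                peaks.append(v)
    return peaks

def has_uniform_stroke(binary):
    """True when mean > 2 * standard deviation of the local maxima."""
    peaks = local_maxima(distance_transform(binary))
    if not peaks:
        return False
    return mean(peaks) > 2 * pstdev(peaks)
```

A bar of constant thickness passes the test because all its ridge values are equal, so the standard deviation is zero.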
Conditions based on geometry
Each region selected in the previous step is first tested against
the following rules.
1. The aspect ratio of the text region should lie between 0.1 and
10.
2. Both height and width of a candidate text region cannot
be larger than half of the corresponding size of the input
image.
3. The height of a candidate text region should be greater than
10 pixels.
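A minimal sketch of these three rules as a filter function. Two readings are assumptions here, not statements from the slides: the aspect ratio is taken as width divided by height, and rule 2 is read as rejecting a region only when both its height and its width exceed half of the image. The function name is hypothetical.

```python
def passes_geometry_rules(box_w, box_h, img_w, img_h):
    """Apply the three geometric rules to one candidate region."""
    # Rule 3: height must be greater than 10 pixels.
    if box_h <= 10:
        return False
    # Rule 1: aspect ratio (assumed width / height) between 0.1 and 10.
    aspect = box_w / box_h
    if not (0.1 <= aspect <= 10):
        return False
    # Rule 2 (assumed reading): reject when both dimensions exceed half the image.
    if box_w > img_w / 2 and box_h > img_h / 2:
        return False
    return True
```

For a 640x480 image, a 100x30 region is kept, while a 100x8 region fails the height rule and a 400x300 region fails the size rule.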
Overlapping problem
• Many bounding boxes overlapped with each other.
• The overlap between two bounding boxes of adjacent text
regions should not be greater than 30% of either.
• To resolve this, we merge each pair of bounding boxes
whose intersection area is greater than a threshold
value.
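The merging step might look like the sketch below. It assumes boxes are `(x1, y1, x2, y2)` tuples and that the threshold is 30% of the smaller box's area, which is one way to read "30% of either"; the slides do not state the exact threshold used, and the function names are hypothetical.

```python
def area(box):
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])

def intersection_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0

def merge_overlapping(boxes, max_overlap=0.3):
    """Repeatedly merge any pair whose intersection exceeds
    max_overlap times the area of the smaller box."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                if intersection_area(a, b) > max_overlap * min(area(a), area(b)):
                    # Replace the pair with its enclosing box and restart the scan.
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes
```

Restarting the scan after each merge lets a newly enlarged box absorb further neighbours that it now overlaps.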
After applying the geometry conditions and solving the overlapping problem
Sobel Filtering
• We now use the Sobel edge detection algorithm to detect possible horizontal and
possible vertical lines.
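As a rough illustration of directional Sobel filtering, the sketch below applies the two standard 3x3 Sobel kernels to a grayscale image stored as a list of rows. Border pixels are left at zero, and the constant and function names are hypothetical.

```python
# Standard 3x3 Sobel kernels, named by the kind of line they respond to.
SOBEL_HORIZONTAL_LINES = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
SOBEL_VERTICAL_LINES = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

def filter3x3(image, kernel):
    """Correlate the image with a 3x3 kernel; border pixels stay 0."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
                            for i in range(3) for j in range(3))
    return out
```

On an image whose top half is dark and bottom half is bright, only the horizontal-line kernel gives a nonzero response at the boundary, which is why a single filtering direction can isolate or suppress headline-like structures.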
Detection of headlines
• For each region above, we compute the probabilistic Hough transform of the
horizontally Sobel-filtered image from the previous step to obtain the
characteristic horizontal headlines of Devanagari text.
• A necessary condition for selecting a line as a candidate headline is that it
lies in the upper half of the bounding box.
Detection of vertical lines
• The final decision on whether one of the candidate horizontal lines is a
headline is based on the vertical Hough lines.
• We compute vertical lines by applying the Hough transform again with a lower
threshold value, since they are not as prominent as the horizontal lines.
• If the majority of the vertical lines lie below a candidate horizontal line, that
horizontal line is treated as the headline.
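The headline decision can be sketched as below, assuming the Hough step has already produced line segments as `(x1, y1, x2, y2)` tuples in bounding-box coordinates with y increasing downward. Comparing segment midpoints to decide "below" is an assumption, and the function name is hypothetical.

```python
def pick_headlines(horizontal, vertical, box_height):
    """Return the horizontal segments that lie in the upper half of the box
    and have the majority of vertical segments below them."""
    headlines = []
    for seg in horizontal:
        y = (seg[1] + seg[3]) / 2
        # Necessary condition: candidate lies in the upper half of the box.
        if y >= box_height / 2:
            continue
        # Count vertical segments whose midpoint is below the candidate.
        below = sum(1 for v in vertical if (v[1] + v[3]) / 2 > y)
        if vertical and below > len(vertical) / 2:
            headlines.append(seg)
    return headlines
```

With two horizontal candidates at y = 5 and y = 40 in a box of height 50, only the first can qualify, and it does so when most vertical strokes hang beneath it.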
Detected Horizontal and vertical Lines
Output Image
Character Segmentation
(Next Proposed step)
• Applying the Sobel filter in only one direction, the vertical direction,
removes the headline from the candidate region.
• After removing the headline in each bounding box, we segment the
word based on vertical histogram analysis.
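Vertical histogram analysis can be sketched as a column projection: after the headline is removed, columns with no text pixels separate neighbouring characters. The sketch assumes a binary word image with 1 for text, and the function name is hypothetical.

```python
def segment_columns(binary):
    """Split a headline-free binary word image at empty columns.
    Returns a list of (first_column, last_column) ranges."""
    width = len(binary[0])
    # Vertical histogram: number of text pixels in each column.
    profile = [sum(row[c] for row in binary) for c in range(width)]
    segments, start = [], None
    for c, count in enumerate(profile):
        if count > 0 and start is None:
            start = c
        elif count == 0 and start is not None:
            segments.append((start, c - 1))
            start = None
    if start is not None:
        segments.append((start, width - 1))
    return segments
```

Touching characters produce no empty column between them, so a real system would likely need a small nonzero threshold or a follow-up step for such cases.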
Next Step: Phase II
• After headline removal we perform character
segmentation on the selected image.
• Character segmentation yields the individual
characters of the Devanagari script.
• For each character we then perform character
recognition.
Segmentation
Guide: Dr. Gaurav Harit
Vikas Yadav, Vikas Goyal, Surya Yadav
Previous Work
• So far we are able to obtain bounding boxes around words.
Segmentation
1. Image
2. RGB normalization where needed
3. K-means clustering on normalized pixels; Otsu's thresholding on pixels not normalized
4. Combine clusters from both methods
5. Text and background separation
6. Conversion of text to black and background to white
7. Obtain thinned image
8. Obtain skew angle by detecting a near-horizontal line in the upper half of the image
9. Obtain skew-corrected image
10. Headline detection
11. Character segmentation from upper and middle-lower zones
12. Baseline detection
13. Character segmentation from middle and lower zones
Text and Background Detection
• Converting the image into a binary image with a popular global or
local thresholding method cannot segment the text from the background
properly.
• Therefore, we apply a combination of Otsu's thresholding and unsupervised
K-means clustering to cluster the different colour regions in an image.
• Scene text is often affected by varying lightness. To handle this, we
normalize the RGB values of the image before applying K-means clustering,
but we do not normalize pixels whose RGB values are near gray.
• For each pixel we check
(max(R, G, B) - min(R, G, B)) / max(R, G, B) > 0.2
The threshold value 0.2 is selected to filter out pixels with near-gray RGB
values.
• For the set of pixels not satisfying the above criterion, we convert the RGB
values to gray and perform Otsu's thresholding.
• For the set of pixels satisfying the criterion, RGB normalization is carried out
to remove the lightness effect from those pixels while keeping the colour
information intact.
• We perform K-means clustering on the normalized set to obtain text and
background separately.
• We combine the clusters from Otsu's thresholding and K-means clustering to
obtain the text and background clusters.
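The two-branch separation can be sketched in plain Python. Several details below are assumptions rather than facts from the slides: normalization divides each channel by R + G + B, gray conversion uses the common 0.299/0.587/0.114 weights, K-means runs with k = 2 and a deterministic farthest-point initialization, and all function names are hypothetical.

```python
def is_chromatic(rgb, threshold=0.2):
    """(max - min) / max > threshold, i.e. the pixel is not near gray."""
    hi, lo = max(rgb), min(rgb)
    return hi > 0 and (hi - lo) / hi > threshold

def normalize_rgb(rgb):
    """Assumed normalization: divide each channel by R + G + B."""
    total = sum(rgb)
    return tuple(ch / total for ch in rgb)

def otsu_threshold(gray_values):
    """Threshold t maximizing between-class variance; class 0 is values <= t."""
    hist = [0] * 256
    for g in gray_values:
        hist[g] += 1
    total = len(gray_values)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k=2, iterations=10):
    """Tiny K-means with farthest-point initialization (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels

def separate(pixels):
    """Return (labels of near-gray pixels via Otsu, labels of chromatic pixels via K-means)."""
    grayish = [p for p in pixels if not is_chromatic(p)]
    chromatic = [p for p in pixels if is_chromatic(p)]
    gray_values = [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in grayish]
    gray_labels = []
    if gray_values:
        t = otsu_threshold(gray_values)
        gray_labels = [int(g > t) for g in gray_values]
    color_labels = kmeans([normalize_rgb(p) for p in chromatic]) if chromatic else []
    return gray_labels, color_labels
```

Because normalization removes overall brightness, a bright red and a dark red pixel map to the same point and land in the same cluster. Deciding which cluster is text and which is background is left out of this sketch.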
Skew Correction
• Apply a thinning algorithm to the text region to obtain a skeleton image.
• Use the Hough transform to obtain all line segments in the upper half of the image
with slopes less than 65°.
• If the longest of these line segments is longer than an
empirically selected threshold value, it is taken as the headline.
• If this headline is not parallel to the x-axis, its skew is corrected by
rotating the word image.
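A minimal sketch of the skew estimate, assuming the Hough step has already returned segments as `(x1, y1, x2, y2)` tuples with y increasing downward. The length threshold is passed in because the empirically chosen value is not given in the slides, and the function name is hypothetical.

```python
import math

def estimate_skew_angle(segments, image_height, min_length, max_angle_deg=65.0):
    """Angle in degrees of the longest near-horizontal segment in the upper half,
    or None when no segment is long enough to count as the headline."""
    best_angle, best_length = None, 0.0
    for x1, y1, x2, y2 in segments:
        # Keep only segments lying entirely in the upper half of the image.
        if max(y1, y2) >= image_height / 2:
            continue
        if x2 < x1:
            x1, y1, x2, y2 = x2, y2, x1, y1
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        if abs(angle) >= max_angle_deg:
            continue
        length = math.hypot(x2 - x1, y2 - y1)
        if length > best_length:
            best_angle, best_length = angle, length
    if best_angle is None or best_length <= min_length:
        return None
    return best_angle
```

Rotating the word image by the negative of the returned angle would bring the detected headline parallel to the x-axis.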
(i) Skeleton image obtained for detecting headline for skew correction
Headline Detection
• To segment the characters we need to detect the thick headline.
• Compute the projection profile as the row-wise sum of gray values for each
row in the upper half of the word image.
• Scan the normalized projection profile values of successive rows upward,
starting from the spine, and stop when the value drops below
a predefined threshold value. This row of the word image is
considered the upper boundary of the headline.
• Similarly, scan the projection profile values downward starting from
the spine; the row at which the value drops below the same
threshold value is considered the lower boundary of the headline.
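The boundary scan can be sketched as follows. The sketch assumes a binary word image with 1 for text, takes the spine to be the row with the largest projection value, and uses 0.5 as a placeholder threshold; none of these specifics are stated in the slides, and the function name is hypothetical.

```python
def headline_bounds(binary, threshold=0.5):
    """Return (upper_row, lower_row): the first rows above and below the spine
    whose normalized projection value drops below the threshold."""
    half = len(binary) // 2
    profile = [sum(row) for row in binary[:half]]   # upper half only
    peak = max(profile)
    if peak == 0:
        return None
    norm = [v / peak for v in profile]
    spine = norm.index(1.0)          # assumed spine: row of maximum projection
    upper = spine
    while upper > 0 and norm[upper] >= threshold:
        upper -= 1
    lower = spine
    while lower < half - 1 and norm[lower] >= threshold:
        lower += 1
    return upper, lower
```

If the profile never drops below the threshold, the scan stops at the edge of the upper half, so the returned rows should be read as bounds rather than guaranteed headline edges.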
Character Segmentation
• Use the region growing method to extract the individual characters or their
parts from the binarized and skew-corrected word image B.
• Locate the lowest and leftmost black pixel in B and use it as the seed
point for the region growing module.
• The current segment is extracted using the standard region growing
approach based on the 8-neighborhood. Region growing stops
when it either
(i) reaches the upper or lower boundary of the thick headline, or
(ii) reaches a white pixel.
• Extraction of the current segment continues until no pixel satisfying
the above is left to visit.
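A small sketch of headline-bounded region growing, assuming a binary image with 1 for black (text) pixels and the headline given as an inclusive row range. The seed choice is read as "largest row index first, then smallest column", which is an assumption, and the function names are hypothetical.

```python
from collections import deque

def grow(binary, seed, top, bottom):
    """8-neighborhood region growing that never enters headline rows top..bottom."""
    h, w = len(binary), len(binary[0])
    segment, queue = {seed}, deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < h and 0 <= cc < w \
                        and (rr, cc) not in segment \
                        and binary[rr][cc] == 1 \
                        and not (top <= rr <= bottom):
                    segment.add((rr, cc))
                    queue.append((rr, cc))
    return segment

def extract_segments(binary, top, bottom):
    """Extract segments one by one, seeding from the lowest, leftmost black pixel."""
    remaining = {(r, c)
                 for r, row in enumerate(binary)
                 for c, v in enumerate(row)
                 if v == 1 and not (top <= r <= bottom)}
    segments = []
    while remaining:
        seed = min(remaining, key=lambda p: (-p[0], p[1]))
        segment = grow(binary, seed, top, bottom)
        segments.append(segment)
        remaining -= segment
    return segments
```

Blocking growth at the headline is what keeps two characters hanging from the same headline from merging into a single segment.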
Appending local headline
• Append part of the headline to the segment extracted above as
follows.
• The top-left and top-right pixels of this segment lie on the lower boundary of
the headline; the portion of the thick headline just above these two
pixels is appended to the segment before its extraction.
• Repeat until there is no black pixel left.
Baseline Detection
• The baseline detection module is fed all segments of the middle-lower
zone that hang either from the headline or from immediately below it (at most 0.2
times the height of the middle-lower region).
• Find the height h_i of each segment and normalize it to h_i',
where 0 < h_i' < 10. Now find
h_min = min{ h_i' | h_i' > 6.0 }
Next we find
h* = max_i { h_i' | h_i' > h_min and h_i' < floor(h_min) + 1 }
• The horizontal line through the bottommost pixel of the segment with
normalized height h* is the baseline.
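The selection of h* can be sketched as below. The sketch takes already-normalized heights as input, since the normalization itself is not specified, and pairs each height with the row of the segment's bottommost pixel. Falling back to the segment of height h_min when no height lies strictly between h_min and floor(h_min) + 1 is an assumption added so the function always has a candidate; the function name is hypothetical.

```python
import math

def find_baseline_row(segments, min_height=6.0):
    """segments: list of (normalized_height, bottom_row) pairs.
    Returns the bottom row of the segment with height h*, or None."""
    tall = [h for h, _ in segments if h > min_height]
    if not tall:
        return None
    h_min = min(tall)
    upper = math.floor(h_min) + 1
    # Candidates as written: h_min < h' < floor(h_min) + 1.
    candidates = [(h, row) for h, row in segments if h_min < h < upper]
    if not candidates:
        # Assumed fallback: use the segment(s) of height h_min itself.
        candidates = [(h, row) for h, row in segments if h == h_min]
    h_star, bottom_row = max(candidates, key=lambda s: s[0])
    return bottom_row
```

For normalized heights 4.0, 6.2, 6.8 and 9.5, h_min is 6.2 and h* is 6.8, so the baseline passes through the bottom of the 6.8 segment rather than the much taller 9.5 one.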
(i) Input image
(ii) Image obtained after applying K-means clustering, Otsu's thresholding and
skew correction.
(iii) Segments obtained after character segmentation
References
• Prakriti Banik, Ujjwal Bhattacharya, Swapan K. Parui. Segmentation of
Bangla Words in Scene Images.
• U. Bhattacharya, S. K. Parui, and S. Mondal. Devanagari and Bangla text
extraction from natural scene images. Proc. of Int. Conf. on Document
Analysis and Recognition, pages 171-175, 2009.

TrustArc Webinar - 2024 Global Privacy Survey
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 

Presen_Segmentation

  • 1. Hindi Scene Text Recognition. Guide: Dr. Gaurav Harit. Surya Yadav, Vikas Yadav, Vikas Goyal
  • 2. Objective: Create a system that detects and recognizes characters in natural scene images containing Devanagari text.
  • 3. Motivation
      • Hindi is the most spoken language in India and the third most spoken language in the world.
      • Most websites in Devanagari use images to represent text. Such images need to be indexed by the text they contain so that they can be searched easily.
      • Tourists often face problems in India, so there is demand for an automated system that understands natural scene images and provides translated information.
      • Scene text such as shop names, company names, traffic information, road signs, and other signboards is important to recognize and process.
  • 6. Steps: Image; grayscale image; Canny edge map; morphological closing; use of similarity measures to find text regions missed in the previous step; use of script-specific rules; verification of uniform thickness; connected component region extraction
  • 9. Canny Edge Map: We compute the Canny edge map of the grayscale image in order to obtain the connected components.
  • 10. Distance Transform of a Binary Image: Each pixel in the image is set to a value equal to its distance from the nearest background pixel.
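The distance transform on slide 10 can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the slides do not say which distance metric was used, so city-block (4-neighbour) distance is assumed here, and the function name `distance_transform` is hypothetical.

```python
from collections import deque


def distance_transform(binary):
    """City-block distance from every pixel to the nearest background pixel.

    `binary` is a list of rows; 0 marks background, non-zero marks foreground.
    Assumption: city-block distance, computed with a multi-source BFS.
    If the image has no background pixel at all, every distance stays 0.
    """
    rows, cols = len(binary), len(binary[0])
    dist = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Every background pixel is a BFS source at distance 0.
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0:
                seen[r][c] = True
                queue.append((r, c))

    # Expand outward one step at a time over the 4-neighbourhood.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not seen[nr][nc]:
                seen[nr][nc] = True
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist
```

For a 3x3 block of foreground pixels surrounded by background, the outer ring of the block receives the value 1 and its centre the value 2.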
  • 11. Computation of Stroke Thickness
      • For each pixel with a non-zero value in the distance-transformed image, if the pixel is a local maximum within the 3x3 window centered on it, we store its value in a list.
      • We compute the mean and variance of the values in the list.
      • If the mean is greater than twice the standard deviation (the square root of the variance), we decide that the thickness of the underlying stroke is nearly uniform, select the sub-image as a candidate text region, and draw its bounding box.
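The uniformity test on slide 11 can be illustrated with the short sketch below. It takes an already computed distance map as input. Two details are assumptions not stated on the slide: a pixel that ties with a neighbour still counts as a local maximum, and the population standard deviation is used. The function names are hypothetical.

```python
from statistics import mean, pstdev


def local_maxima(dist):
    """Collect values of non-zero pixels that are maximal in their 3x3 window."""
    rows, cols = len(dist), len(dist[0])
    values = []
    for r in range(rows):
        for c in range(cols):
            value = dist[r][c]
            if value == 0:
                continue
            window = [
                dist[nr][nc]
                for nr in range(max(0, r - 1), min(rows, r + 2))
                for nc in range(max(0, c - 1), min(cols, c + 2))
            ]
            # Assumption: ties with a neighbour still count as a maximum.
            if value >= max(window):
                values.append(value)
    return values


def has_uniform_stroke(dist):
    """Return True when mean > 2 * standard deviation of the local maxima."""
    values = local_maxima(dist)
    if not values:
        return False
    return mean(values) > 2 * pstdev(values)
```

A horizontal bar of constant thickness yields identical local maxima (standard deviation 0), so it passes the test; a region whose maxima vary widely does not.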
  • 12. Condition Based on Geometry: Each region selected in the previous step is first tested against the following set of rules.
      1. The aspect ratio of the text region should lie between 0.1 and 10.
      2. Both the height and the width of a candidate text region cannot be larger than half of the corresponding dimension of the input image.
      3. The height of a candidate text region should be greater than 10 pixels.
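The three geometric rules on slide 12 translate into a small predicate. This sketch assumes rule 2 means that neither the width nor the height may exceed half of the matching image dimension, which is one reading of the slide. The aspect ratio is taken as width divided by height; because the allowed range 0.1 to 10 is symmetric under inversion, the opposite convention gives the same result. The function name is hypothetical.

```python
def passes_geometry_rules(box_w, box_h, img_w, img_h):
    """Apply the three slide-12 rules to a candidate bounding box (in pixels)."""
    if box_w <= 0 or box_h <= 0:
        return False

    # Rule 1: aspect ratio between 0.1 and 10.
    aspect = box_w / box_h
    if not 0.1 <= aspect <= 10:
        return False

    # Rule 2 (assumed reading): neither side exceeds half the image side.
    if box_w > img_w / 2 or box_h > img_h / 2:
        return False

    # Rule 3: the region must be taller than 10 pixels.
    return box_h > 10
```

For a 640x480 image, a 100x30 box passes, while a 400x30 box fails rule 2 and a 100x8 box fails rule 3.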
  • 13. Overlapping Problem
      • Many bounding boxes overlapped each other.
      • The overlap between the bounding boxes of two adjacent text regions should not be greater than 30% of either box.
      • To solve this, we merge each pair of bounding boxes whose intersection area is greater than a threshold value.
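A possible way to merge overlapping boxes is sketched below. The slide only says that pairs are merged when their intersection exceeds "some threshold value"; using the 30% figure from the previous bullet as that threshold, measured against the smaller box, is an assumption, as are the `(x1, y1, x2, y2)` box layout and the function names.

```python
def area(box):
    """Area of a box given as (x1, y1, x2, y2)."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def intersection_area(a, b):
    """Area of the overlap between two boxes, 0 if they do not intersect."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0


def merge_overlapping(boxes, max_overlap=0.3):
    """Repeatedly replace overlapping pairs by their enclosing box."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                smaller = min(area(a), area(b))
                # Assumed threshold: overlap above 30% of the smaller box.
                if smaller > 0 and intersection_area(a, b) > max_overlap * smaller:
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes
```

The outer loop restarts after every merge because the enlarged box may now overlap boxes it did not touch before.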
  • 14. After applying the geometry conditions and solving the overlapping problem
  • 15. Sobel Filtering: We next use the Sobel edge detection algorithm to detect candidate horizontal and vertical lines.
  • 16. Detection of Headlines
      • For each region above, we compute the probabilistic Hough transform of the image from the previous step (that is, after horizontal Sobel filtering) to obtain the characteristic horizontal headlines of Devanagari text.
      • A necessary condition for selecting a line as a candidate headline is that it lies in the upper half of the bounding box.
  • 17. Detection of Vertical Lines
      • The final decision on whether one of the candidate horizontal lines is a headline is based on the computation of vertical Hough lines.
      • We compute vertical lines by applying the Hough transform again with a lower threshold value, because vertical lines are not as prominent as horizontal ones.
      • If the majority of vertical lines lie below a candidate horizontal line, that horizontal line is treated as the headline.
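The decision rule of slides 16 and 17 can be sketched as below, assuming the Hough lines have already been extracted. The representation is hypothetical: horizontal candidates are given by their row coordinate, vertical segments by their top and bottom rows, and image rows grow downward. Treating a vertical segment as "below" a line when its midpoint is below it, and returning the topmost qualifying candidate, are assumptions.

```python
def pick_headline(box_top, box_bottom, horizontal_ys, vertical_segments):
    """Return the row of the accepted headline, or None if no candidate fits.

    horizontal_ys: rows of candidate horizontal lines.
    vertical_segments: (y_top, y_bottom) pairs of detected vertical lines.
    """
    if not vertical_segments:
        return None
    middle = (box_top + box_bottom) / 2

    for y in sorted(horizontal_ys):
        # Necessary condition: the candidate lies in the upper half of the box.
        if y >= middle:
            continue
        # Assumption: a vertical segment is "below" if its midpoint is below y.
        below = sum(1 for y0, y1 in vertical_segments if (y0 + y1) / 2 > y)
        if below > len(vertical_segments) / 2:
            return y
    return None
```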
  • 18. Detected horizontal and vertical lines
  • 20. Character Segmentation (Next Proposed Step)
      • Applying the Sobel filter in only one direction, the vertical direction, removes the headline from the candidate region.
      • After removing the headline in each bounding box, we segment the word based on vertical histogram analysis.
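Vertical histogram analysis, as mentioned on slide 20, can be sketched as follows. The sketch assumes the headline has already been removed and that the word image is binary with non-zero values for text pixels; cutting at every completely empty column is a simplifying assumption, and the function name is hypothetical.

```python
def segment_columns(binary):
    """Split a headline-free word image into column ranges containing text.

    Returns a list of (first_column, last_column) pairs, one per segment.
    """
    cols = len(binary[0])
    # Vertical histogram: number of text pixels in each column.
    histogram = [sum(1 for row in binary if row[c]) for c in range(cols)]

    segments, start = [], None
    for c, count in enumerate(histogram):
        if count > 0 and start is None:
            start = c                       # a new segment begins
        elif count == 0 and start is not None:
            segments.append((start, c - 1))  # an empty column ends it
            start = None
    if start is not None:
        segments.append((start, cols - 1))
    return segments
```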
  • 21. Next Step: Phase II
      • After headline removal we perform character segmentation on the selected image.
      • Character segmentation yields the individual characters of the Devanagari script.
      • We then perform character recognition on each character.
  • 22. Segmentation. Guide: Dr. Gaurav Harit. Vikas Yadav, Vikas Goyal, Surya Yadav
  • 23. Previous Work: So far we are able to obtain bounding boxes around words.
  • 24. Segmentation (flowchart, listed from the input image to the final step)
      • Image
      • RGB normalization where needed
      • K-means clustering on normalized pixels; Otsu's thresholding on pixels not normalized
      • Combine clusters from both methods
      • Text and background separation
      • Conversion of text to black and background to white
      • Obtain thinned image
      • Obtain skew angle by detecting a near-horizontal line in the upper half of the image
      • Obtain skew-corrected image
      • Headline detection
      • Character segmentation from upper and middle-lower zone
      • Baseline detection
      • Character segmentation from middle and lower zone
  • 25. Text and Background Detection
      • Converting the image into a binary image with a popular global or local thresholding method cannot segment the text from the background properly.
      • Therefore, we apply a combination of Otsu's thresholding and unsupervised K-means clustering to cluster the different colour regions in an image.
      • Scene text is often affected by varying lightness. To handle this effect, we normalize the RGB values of the image before applying K-means clustering. However, we do not normalize pixels whose RGB values are nearly gray.
  • 26.
      • For each pixel we check $\frac{\max(R,G,B) - \min(R,G,B)}{\max(R,G,B)} > 0.2$. The threshold value 0.2 is selected to filter out pixels whose RGB values are nearly gray.
      • For the set of pixels not satisfying this criterion, we convert the RGB values to gray and perform Otsu's thresholding.
      • For the set of pixels satisfying the criterion, we carry out RGB normalization to remove the lightness effect from those pixels while keeping the colour information intact.
      • We then perform K-means clustering on the normalized set to obtain text and background separately.
      • Finally, we combine the clusters from Otsu's thresholding and K-means clustering to obtain the text and background clusters.
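The pixel split, the normalization, and Otsu's thresholding from slides 25 and 26 can be sketched in plain Python as follows. Several details are assumptions: the near-gray test is read as (max - min) / max > 0.2, RGB normalization is taken to be division by R + G + B (chromaticity), and the gray conversion uses the common 0.299/0.587/0.114 weights. The K-means step and the final merging of clusters are not shown. All function names are hypothetical.

```python
def is_colored(r, g, b, threshold=0.2):
    """True when (max - min) / max exceeds the threshold (pixel is not near gray)."""
    hi = max(r, g, b)
    if hi == 0:
        return False
    return (hi - min(r, g, b)) / hi > threshold


def split_pixels(pixels):
    """Separate pixels into the coloured set and the near-gray set."""
    colored, grayish = [], []
    for pixel in pixels:
        (colored if is_colored(*pixel) else grayish).append(pixel)
    return colored, grayish


def normalize_rgb(r, g, b):
    """Assumed normalization: divide each channel by R + G + B."""
    total = r + g + b
    return (r / total, g / total, b / total)


def to_gray(r, g, b):
    """Assumed gray conversion with the usual luminance weights."""
    return min(255, round(0.299 * r + 0.587 * g + 0.114 * b))


def otsu_threshold(gray_values):
    """Return t maximizing between-class variance; class 0 is values <= t."""
    hist = [0] * 256
    for value in gray_values:
        hist[value] += 1
    total = len(gray_values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

Under the assumed normalization, (200, 100, 100) and (100, 50, 50) map to the same triple, which is the sense in which the lightness effect is removed while the colour information is kept.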
  • 27. Skew Correction
      • Apply a thinning algorithm to the text region to obtain a skeleton image.
      • Use the Hough transform to obtain all line segments in the upper half of the image with slopes less than 65°.
      • If the length of the longest of these line segments is greater than an empirically selected threshold value, it is taken as the headline.
      • If this headline is not parallel to the x-axis, its skew is corrected by rotating the word image.
      • Figure (i): Skeleton image obtained for detecting the headline for skew correction.
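Selecting the headline segment and measuring its skew, as described on slide 27, can be sketched as below. The sketch assumes the Hough line segments are already available as `(x1, y1, x2, y2)` tuples; the `min_length` parameter stands in for the empirically selected threshold, whose value is not given on the slide. Rotating the word image by the negative of the returned angle is left out. The function name is hypothetical.

```python
import math


def skew_angle(segments, max_slope_deg=65.0, min_length=0.0):
    """Angle in degrees of the longest near-horizontal segment, or None.

    segments: iterable of (x1, y1, x2, y2) line segments.
    """
    best = None  # (length, angle) of the longest qualifying segment
    for x1, y1, x2, y2 in segments:
        # Order the endpoints left to right so the angle lies in [-90, 90].
        if x2 < x1:
            x1, y1, x2, y2 = x2, y2, x1, y1
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        length = math.hypot(x2 - x1, y2 - y1)
        if abs(angle) < max_slope_deg and length > min_length:
            if best is None or length > best[0]:
                best = (length, angle)
    return None if best is None else best[1]
```

A return value of 0 means the headline is already parallel to the x-axis and no rotation is needed.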
  • 28. Headline Detection
      • In order to segment the characters we need to detect the thick headline.
      • Compute the projection profile as the row-wise sum of gray values for each row in the upper half of the word image.
      • Scan the normalized projection profile of successive rows in the upward direction, starting from the spine, and stop when the value drops below a pre-defined threshold. This row of the word image is taken as the upper boundary of the headline.
      • Similarly, scan the projection profile values downward from the spine; the row at which the value drops below the same threshold is taken as the lower boundary of the headline.
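The boundary scan on slide 28 might look like the sketch below. The slide does not define the spine or the normalization, so this sketch assumes the spine is the row with the largest projection value in the upper half and that the profile is normalized by that value. It also assumes an "ink" image in which text pixels carry the higher values. The threshold of 0.5 and the function name are placeholders.

```python
def headline_bounds(ink, threshold=0.5):
    """Return (upper_row, lower_row) bounding the headline, or None.

    ink: list of rows, higher values where there is text (assumption).
    Each returned row is the first one, scanning away from the spine, whose
    normalized profile falls below the threshold; if the scan reaches the
    edge of the upper half first, the edge row is returned instead.
    """
    if not ink:
        return None
    upper_half = ink[: max(1, len(ink) // 2)]
    profile = [sum(row) for row in upper_half]
    peak = max(profile)
    if peak == 0:
        return None
    normalized = [value / peak for value in profile]

    # Assumption: the spine is the row with the largest projection value.
    spine = profile.index(peak)

    top = spine
    while top > 0 and normalized[top] >= threshold:
        top -= 1
    bottom = spine
    while bottom < len(normalized) - 1 and normalized[bottom] >= threshold:
        bottom += 1
    return top, bottom
```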
  • 29. Character Segmentation
      • Use the region growing method to extract the individual characters, or their parts, from the binarized and skew-corrected word image B.
      • Locate the lowest and leftmost black pixel in B and use it as the seed point for the region growing module.
      • The current segment is extracted using the standard region growing approach based on the 8-neighborhood. Region growing stops when it either (i) reaches the upper or lower boundary of the thick headline or (ii) reaches a white pixel.
      • Extraction of the current segment continues until no pixel satisfying the above conditions is left to visit.
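The region growing step on slide 29 can be sketched as follows. The image is assumed to be a list of rows with non-zero values for black (text) pixels, and the headline is assumed to occupy the inclusive row range `headline_top` to `headline_bottom`, both hypothetical parameters. "Lowest and leftmost" is read here as lowest row first, then leftmost column, which is an assumption. Appending the local headline piece (slide 30) is not shown.

```python
def grow_region(black, seed, blocked_rows, visited):
    """Collect the 8-connected black pixels reachable from the seed."""
    rows, cols = len(black), len(black[0])
    stack = [seed]
    visited.add(seed)
    segment = []
    while stack:
        r, c = stack.pop()
        segment.append((r, c))
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and (nr, nc) not in visited
                        and black[nr][nc]              # stop at white pixels
                        and nr not in blocked_rows):   # stop at the headline
                    visited.add((nr, nc))
                    stack.append((nr, nc))
    return segment


def extract_segments(black, headline_top, headline_bottom):
    """Extract all segments outside the headline rows, lowest seed first."""
    blocked = set(range(headline_top, headline_bottom + 1))
    visited = set()
    segments = []
    rows, cols = len(black), len(black[0])
    for r in range(rows - 1, -1, -1):      # lowest row first
        for c in range(cols):              # then leftmost column
            if black[r][c] and r not in blocked and (r, c) not in visited:
                segments.append(sorted(grow_region(black, (r, c), blocked, visited)))
    return segments
```

Blocking the headline rows is what keeps two characters that hang from the same headline from being merged into a single segment.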
  • 30. Appending the Local Headline
      • Append the corresponding part of the headline to the segment extracted above, as follows.
      • The top-left and top-right pixels of the segment lie on the lower boundary of the headline, and the portion of the thick headline just above these two pixels is appended to the segment before its extraction.
      • Repeat until there is no black pixel left.
  • 31. Baseline Detection
      • To the baseline detection module we feed all segments of the middle-lower zone that either hang from the headline or start immediately below it (at most 0.2 times the height of the middle-lower region below the headline).
      • Find the height $h_i$ of each segment and normalize it to $h_i'$, where $0 < h_i' < 10$. Now find $h_{\min} = \min\{\, h_i' \mid h_i' > 6.0 \,\}$. Next we find $h^* = \max_i \{\, h_i' \mid h_i' > h_{\min} \text{ and } h_i' < \lfloor h_{\min} \rfloor + 1 \,\}$.
      • The horizontal line through the bottommost pixel of the segment with normalized height $h^*$ is the baseline.
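The height-based baseline rule on slide 31 can be sketched as below. Two points are assumptions. First, the slide does not say how heights are normalized, so the sketch scales them by 10 / (tallest height + 1) to keep them strictly between 0 and 10. Second, the slide writes a strict inequality above the minimum qualifying height; the sketch uses "greater than or equal" so that the candidate set is never empty. Segments are given as hypothetical `(height, bottom_row)` pairs.

```python
import math


def find_baseline(segments):
    """Return the bottom row of the segment chosen as baseline, or None.

    segments: list of (height, bottom_row) pairs for middle-lower segments.
    """
    if not segments:
        return None
    tallest = max(height for height, _ in segments)

    # Assumed normalization: scale heights into the open interval (0, 10).
    normalized = [(10 * height / (tallest + 1), bottom)
                  for height, bottom in segments]

    tall = [h for h, _ in normalized if h > 6.0]
    if not tall:
        return None
    h_min = min(tall)
    upper = math.floor(h_min) + 1

    # Assumption: ">=" instead of the slide's ">" so h_min itself qualifies.
    candidates = [(h, bottom) for h, bottom in normalized if h_min <= h < upper]
    _, bottom = max(candidates, key=lambda item: item[0])
    return bottom
```

In the example heights 20, 21, 30 and 8, the tallest segment (for instance a character with a part below the baseline) falls outside the window starting at the smallest qualifying height, so the baseline is taken from the segment of height 21.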
  • 32. (i) Input image. (ii) Image obtained after applying K-means clustering, Otsu's thresholding, and skew correction. (iii) Segments obtained after character segmentation.
  • 33. References
      • Prakriti Banik, Ujjwal Bhattacharya, Swapan K. Parui. Segmentation of Bangla Words in Scene Images.
      • U. Bhattacharya, S. K. Parui, and S. Mondal. Devanagari and Bangla text extraction from natural scene images. Proc. of Int. Conf. on Document Analysis and Recognition, pages 171-175, 2009.