Text Extraction From Image

Prepared By:
Amit Bhoraniya (7022)
Kaushik Godhani (7009)
Mayur Halai (7016)
Vikram Ghunsar (7039)

Guided By:
Mr. Udesang Jaliya
Mr. Kirti Sharma
What Is Text Extraction?
Text extraction is the process of converting a printed document, scanned
page, or image that contains text into ASCII characters that a computer
can recognize.
Goal of the Project
GENERAL APTITUDE
Computer Science
Electronics &
Communication
Engineering
How Will We Achieve That Goal?

1. Pre-Processing
2. Segmentation
3. Recognition
Pre-Processing
1. Gray Scale
2. Noise Removal
3. Thresholding
Gray Scale
Noise Removal
Noise removal is used to enhance the image.
For enhancement we used a median filter:
 FilteredImage = MedianFilter(OriginalImage, FilterSize)
 We used FilterSize = [5, 5]
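
The filtering step above can be sketched in plain Python. This is only an illustration, not the code used in the project: the function name `median_filter`, the list-of-rows image representation, and the border handling (edge pixels are replicated) are all assumptions made for this example.

```python
from statistics import median


def median_filter(image, size=5):
    """Return a median-filtered copy of a 2-D grayscale image (list of rows).

    `size` is the side length of the square window; it should be odd.
    """
    height, width = len(image), len(image[0])
    half = size // 2
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Gather the size x size neighbourhood. Coordinates are clamped
            # at the border, i.e. edge pixels are replicated (an assumption
            # of this sketch, not something stated on the slides).
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-half, half + 1)
                for dx in range(-half, half + 1)
            ]
            result[y][x] = median(window)
    return result
```

Calling `median_filter(image, 5)` corresponds to the [5, 5] filter size mentioned above. An isolated bright or dark pixel is replaced by the median of its neighbourhood, which is why this filter suits salt-and-pepper noise.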
Thresholding
• Edge detection
• Dilate the image
• Detect text areas using a histogram
• Apply an individual threshold to each text area
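
The last step, thresholding each text area on its own, might look like the sketch below. The slides do not say how the threshold for an area is chosen, so the rule used here (the mean intensity of the area) is an assumption, as are the function name `threshold_regions` and the `(top, bottom, left, right)` box format.

```python
def threshold_regions(image, regions):
    """Binarize each region of a grayscale image with its own threshold.

    `regions` holds (top, bottom, left, right) boxes with half-open bounds:
    rows top..bottom-1 and columns left..right-1.
    Pixels outside every region are left as 0.
    """
    binary = [[0] * len(image[0]) for _ in image]
    for top, bottom, left, right in regions:
        pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
        if not pixels:
            continue
        # Assumed rule: the threshold is the mean intensity of the region.
        level = sum(pixels) / len(pixels)
        for y in range(top, bottom):
            for x in range(left, right):
                # Assumes dark text on a lighter background: dark pixels become 1.
                binary[y][x] = 1 if image[y][x] < level else 0
    return binary
```

The point of a per-area threshold is that a dim region and a bright region of the same image each get a level suited to their own contrast, which a single global threshold cannot provide.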
Edge Detection using Canny
Dilate
Text Area Using Histogram
Algorithm
• Compute the row histogram
• Separate regions by (no. of pixels > 60)
• For each row region
– Separate regions by (no. of pixels > (height of row) / 4)
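
The algorithm above can be sketched as follows for a binary image stored as a list of rows (1 = foreground pixel). The helper names are hypothetical, and the second step is interpreted here as a column histogram computed inside each row band; the slides do not spell this out, so treat it as an assumption.

```python
def row_histogram(binary):
    """Count the foreground pixels in every row."""
    return [sum(row) for row in binary]


def find_runs(counts, min_count):
    """Return half-open (start, end) ranges where counts[i] > min_count."""
    runs, start = [], None
    for i, count in enumerate(counts):
        if count > min_count and start is None:
            start = i
        elif count <= min_count and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(counts)))
    return runs


def text_areas(binary, row_min=60):
    """Return (top, bottom, left, right) boxes of candidate text areas."""
    areas = []
    for top, bottom in find_runs(row_histogram(binary), row_min):
        band = binary[top:bottom]
        # Column histogram of this row band only (assumed interpretation).
        column_counts = [sum(column) for column in zip(*band)]
        # Second-level threshold: a quarter of the band height.
        for left, right in find_runs(column_counts, (bottom - top) / 4):
            areas.append((top, bottom, left, right))
    return areas
```

The default `row_min=60` mirrors the pixel count given above; it depends on image resolution, so it would likely need tuning for other inputs.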
Segmentation
1. Line Segmentation
2. Word Segmentation
3. Character Segmentation
TEXT SEGMENTATION

The image above is segmented into separate lines. Below is an example
for a single line.
Segmentation
Find all the words, then combine the text area into one image.

Characters are then separated from each word.
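
One common way to separate characters is to cut a binarized word image at columns that contain no ink. The sketch below assumes that approach; the slides do not state which method the project used, and the function name `split_characters` is hypothetical.

```python
def split_characters(word):
    """Split a binary word image (list of rows, 1 = ink) at empty columns.

    Returns one sub-image (list of rows) per character candidate.
    """
    column_ink = [sum(column) for column in zip(*word)]
    characters, start = [], None
    # The trailing 0 closes a character that touches the right edge.
    for x, ink in enumerate(column_ink + [0]):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            characters.append([row[start:x] for row in word])
            start = None
    return characters
```

This simple cut only works when characters are separated by at least one blank column; touching or connected characters would need a different strategy.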
Recognition
1. Feature Extraction
2. Classifier
3. Text Document
Recognition

• Feature Extraction
– Binary Code Method
– Chain Code Method
– PCA (Principal Component Analysis)
– LDA (Linear Discriminant Analysis)
• Classifier
– Artificial Neural Network
– Support Vector Machine
Applications
• Banking (reading credit cards)
• Libraries (converting scanned pages to text)
• Government sector (form processing)
• Car number plate recognition systems
• Removal of undesirable text from images
Thank You
