Pattern Recognition via Character Recognition
OCR : Optical Character Recognition
A system that provides full alphanumeric recognition of printed or
handwritten characters at electronic speed by simply scanning the form.
Intelligent Character Recognition (ICR) describes the process of
interpreting image data, in particular alphanumeric text.
OCR is sometimes also referred to as ICR.
Functions & Features of OCR
• Forms are scanned through a scanner; the recognition engine of the
OCR system then interprets the images and turns images of
handwritten or printed characters into ASCII data (machine-readable
characters).
• The technology provides a complete form processing and
document capture solution.
• Allows an open, scalable workflow.
• Includes forms definition, scanning, image pre-processing, and
recognition capabilities.
ICR, OCR and OMR Differences
• ICR and OCR are recognition engines used with imaging;
• OMR is a data collection technology that does not require a
recognition engine.
• OMR cannot recognize hand-printed or machine-printed
characters.
Optical Mark Reader (OMR)
• Forms
  • OMR works with a specialized document that contains timing
  tracks along one edge of the form. The timing tracks, which look
  like black boxes along the top or bottom of a form, tell the
  scanner where to read for marks.
  • The cut of the form is very precise, and the bubbles must be in
  the same location on every form.
• Storage
  • With OMR, the image of a document is not scanned and stored.
• Accuracy
  • OMR is simpler than OCR.
  • When designed properly, OMR is more accurate than OCR.
OCR/ ICR
• Forms
  • OCR/ICR is more flexible, since no timing tracks or block-like form
  IDs are required.
  • The image can float on a page.
  • ICR/OCR technology uses registration marks on the four corners of a
  document when recognizing an image. Respondents place one
  character per box on the form.
  • The use of drop color reduces the size of the scanner’s output and
  enhances accuracy.
• Storage/retrieval
  • If the document needs to be electronically stored and maintained, then
  OCR/ICR is needed.
  • With OCR/ICR technologies, images can be scanned, indexed, and written
  to optical media.
OMR-OCR/ICR Compared
Advantages and Disadvantages
• Advantages of Using Images Rather Than Paper
  • Quicker processing; no moving or storage of questionnaires near
  operators
  • Cost savings and efficiencies from not handling the paper questionnaires
  • Scanning and recognition allow efficient management and planning for
  the rest of the processing workload
  • Reduced long-term storage requirements; questionnaires can be
  destroyed after the initial scanning, recognition and repair
  • Quick retrieval for editing and reprocessing
  • Minimizes errors associated with physical handling of the questionnaires
Disadvantages of Using Images Rather Than Paper
• Accuracy
  • While OCR technology can be effective in converting
  handwritten or typed characters, it is not as accurate as OMR
  for reading data where users simply mark forms.
• Additional workload for data collectors
  • OCR has severe limitations when it comes to human handwriting.
  • Characters must be hand-printed separately, one character
  per box.
OCR Systems
OCR systems consist of five major stages:
• Pre-processing
• Segmentation
• Feature Extraction
• Classification
• Post-processing
Pre-processing
The raw data is subjected to a number of preliminary processing steps to
make it usable in the descriptive stages of character analysis.
Pre-processing aims to produce data on which the OCR system can
operate accurately. The main objectives of pre-processing are:
• Binarization
• Noise reduction
• Stroke width normalization
• Skew correction
• Slant removal
Binarization
Document image binarization (thresholding) refers to the
conversion of a gray-scale image into a binary image. There are two
categories of thresholding:
• Global: picks one threshold value for the entire document
image, often based on an estimate of the background level
from the intensity histogram of the image.
• Adaptive (local): uses a different threshold for each pixel,
based on local area information.
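The difference between the two categories can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the image is a list of rows of gray levels, the global threshold is simply the mean gray level (a stand-in for a histogram-based background estimate), and the names `global_threshold`, `adaptive_threshold`, `radius` and `offset` are hypothetical.

```python
def global_threshold(gray):
    """Binarize with one threshold for the whole image (assumed: the mean gray level)."""
    pixels = [v for row in gray for v in row]
    t = sum(pixels) / len(pixels)
    return [[1 if v < t else 0 for v in row] for row in gray]  # 1 = ink (dark)


def adaptive_threshold(gray, radius=7, offset=10):
    """Binarize each pixel against the mean of its local window minus an offset."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # The window is clipped at the image border.
            rows = range(max(0, r - radius), min(h, r + radius + 1))
            cols = range(max(0, c - radius), min(w, c + radius + 1))
            local = [gray[i][j] for i in rows for j in cols]
            local_mean = sum(local) / len(local)
            out[r][c] = 1 if gray[r][c] < local_mean - offset else 0
    return out
```

On a page with uneven illumination, a single global threshold tends to mark the darker part of the background as ink, while the local variant follows the background level.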
Noise Reduction - Normalization
• Normalization provides a tremendous reduction in data size; thinning
extracts the shape information of the characters.
• Noise reduction improves the quality of the document.
Two main approaches:
  • Filtering (masks)
  • Morphological operations (erosion, dilation, etc.)
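As a rough sketch of the morphological approach, the following plain-Python functions implement erosion, dilation and their combination (opening) with a 3x3 square structuring element on a binary image. The function names and the choice of structuring element are assumptions made for illustration.

```python
def _window(img, r, c):
    """Values of the 3x3 neighborhood of (r, c); pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    return [img[i][j] if 0 <= i < h and 0 <= j < w else 0
            for i in range(r - 1, r + 2) for j in range(c - 1, c + 2)]


def erode(img):
    """A pixel stays 1 only if its whole 3x3 neighborhood is 1."""
    return [[min(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img):
    """A pixel becomes 1 if any pixel in its 3x3 neighborhood is 1."""
    return [[max(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def opening(img):
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(img))
```

Opening removes isolated noise pixels while restoring the size of larger strokes; applying the two steps in the opposite order (closing) would instead fill small holes.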
Skew Correction
• Skew correction methods align the scanned document with the
coordinate system of the scanner. The main approaches for skew detection
include correlation, projection profiles, and the Hough transform.
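A minimal sketch of the projection-profile approach, assuming a binary image given as a list of rows (1 = ink): for each candidate angle the foreground pixels are projected onto the rotated vertical axis, and the angle whose profile is most sharply peaked is kept. The function name `estimate_skew`, the search range and the peakedness score (sum of squared bin counts) are assumptions, not a prescribed method.

```python
import math


def estimate_skew(img, max_angle=15.0, step=0.5):
    """Return the angle in degrees whose projection profile is most peaked.

    Positive angles mean the text lines run downward to the right
    in image coordinates (y grows downward).
    """
    points = [(x, y) for y, row in enumerate(img) for x, v in enumerate(row) if v]
    best_angle, best_score = 0.0, -1.0
    steps = int(round(2 * max_angle / step))
    for k in range(steps + 1):
        angle = -max_angle + k * step
        a = math.radians(angle)
        profile = {}
        for x, y in points:
            # Row coordinate of the pixel once the page is rotated back by `angle`.
            row = round(y * math.cos(a) - x * math.sin(a))
            profile[row] = profile.get(row, 0) + 1
        # Straight, horizontal text lines concentrate pixels in few rows.
        score = sum(n * n for n in profile.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle
```

Rotating the image by the negative of the returned angle would then align the text lines with the scanner axes.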
Slant Removal
• The slant of handwritten text varies from writer to writer. Slant
removal methods are used to normalize all characters to a
standard form.
• Popular deslanting techniques are:
  • Calculation of the average angle of near-vertical elements
  • Bozinovic-Srihari Method (BSM)
Slant Removal
• Entropy
  • The dominant slope of the character is the angle at which the
  slope-corrected character gives the minimum entropy of the vertical
  projection histogram. The vertical projection histogram is calculated
  for a range of angles ±R; in our case R = 60, which seems to cover all
  writing styles. The slope of the character, $a_m$, is found from:
$$a_m = \arg\min_{a \in [-R, R]} H$$
$$H = -\sum_{i=1}^{N} p_i \log p_i$$
  • The character is then corrected using $a_m$:
$$x' = x - y \tan(a_m), \qquad y' = y$$
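The entropy criterion can be sketched as follows. The sketch assumes the character is given as a list of foreground pixel coordinates (x, y) with y measured upward from the baseline, that the histogram bins are one pixel wide, that each p_i is the fraction of pixels falling in bin i, and that candidate angles are whole degrees; the function names are hypothetical.

```python
import math


def projection_entropy(points, angle_deg):
    """Entropy of the vertical projection histogram after shearing by angle_deg."""
    t = math.tan(math.radians(angle_deg))
    hist = {}
    for x, y in points:
        col = round(x - y * t)  # x' = x - y tan(a), y' = y
        hist[col] = hist.get(col, 0) + 1
    total = len(points)
    return -sum((n / total) * math.log(n / total) for n in hist.values())


def estimate_slant(points, R=60):
    """Angle in [-R, R] (whole degrees) that minimizes the projection entropy."""
    return min(range(-R, R + 1), key=lambda a: projection_entropy(points, a))


def deslant(points, angle_deg):
    """Apply the shear correction to every foreground pixel."""
    t = math.tan(math.radians(angle_deg))
    return [(round(x - y * t), y) for x, y in points]
```

When the shear angle matches the slant, near-vertical strokes collapse into few histogram columns, which is what drives the entropy down.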
Segmentation
• Text line detection (Hough transform, projections,
smearing)
• Word extraction (vertical projections, connected component
analysis)
• Word extraction 2 (RLSA, the run-length smoothing algorithm)
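As one example of the projection idea, text lines can be located by scanning the horizontal projection for runs of rows that contain ink. This sketch assumes a deskewed binary image (list of rows, 1 = ink) with blank rows between lines; `find_text_lines` is a hypothetical name.

```python
def find_text_lines(img):
    """Return (top, bottom) row index pairs, one per run of rows that contain ink."""
    profile = [sum(row) for row in img]  # horizontal projection: ink per row
    lines, start = [], None
    for r, count in enumerate(profile):
        if count and start is None:
            start = r                      # a text line begins
        elif not count and start is not None:
            lines.append((start, r - 1))   # a blank row ends the line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(img) - 1))
    return lines
```

The same scan applied to the vertical projection of a single text line gives a simple form of word extraction, with gaps wider than some threshold treated as word boundaries.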
Segmentation
• Explicit Segmentation
In explicit approaches one tries to identify the smallest
possible word segments (primitive segments), which may be
smaller than letters but cannot be segmented further.
Later in the recognition process these primitive segments are
assembled into letters based on input from the character
recognizer. This strategy is robust and quite
straightforward, but not very flexible.
• Implicit Segmentation
In implicit approaches the words are recognized as a whole,
without segmenting them into letters. This is most effective,
and viable, only when the set of possible words is small and
known in advance, as in the recognition of bank checks
and postal addresses.
Statistical Features
• Representing a character image by the statistical distribution of
its points accommodates style variations to some extent.
• The major statistical features used for character representation are:
  • Zoning
  • Projections and profiles
  • Crossings and distances
Zoning
• The character image is divided into N×M zones. Features are extracted
from each zone to form the feature vector. The goal of zoning is to
obtain local rather than global characteristics.
Zoning – Density Features
• The number of foreground pixels, or the normalized number of
foreground pixels, in each cell is considered a feature.
Darker squares indicate a higher density of foreground pixels in the zone.
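A minimal sketch of density features, assuming a binary character image (list of rows, 1 = foreground) whose height and width are exact multiples of the zone grid; `zoning_density` is a hypothetical name.

```python
def zoning_density(img, n_rows, n_cols):
    """Normalized foreground count of each zone, listed row by row."""
    h, w = len(img), len(img[0])
    zh, zw = h // n_rows, w // n_cols  # zone height and width (assumed to divide evenly)
    features = []
    for zr in range(n_rows):
        for zc in range(n_cols):
            ink = sum(img[r][c]
                      for r in range(zr * zh, (zr + 1) * zh)
                      for c in range(zc * zw, (zc + 1) * zw))
            features.append(ink / (zh * zw))
    return features
```

For a 4x4 image split into 2x2 zones, the result is a 4-element feature vector with one density value per zone.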
Zoning – Direction Features
• Based on the contour of the character image
• For each zone the contour is followed and a directional
histogram is obtained by analyzing the adjacent pixels in a
3x3 neighborhood
Zoning – Direction Features
• Based on the skeleton of the character image
• Distinguish individual line segments
• Labeling line segment information
  • Line segments are coded with a direction number:
    2 = vertical line segment
    3 = right diagonal line segment
    4 = horizontal line segment
    5 = left diagonal line segment
• Line type normalization
• Formation of feature vector through zoning, with the entries:
  • number of horizontal lines
  • total length of horizontal lines
  • number of right diagonal lines
  • total length of right diagonal lines
  • number of vertical lines
  • total length of vertical lines
  • number of left diagonal lines
  • total length of left diagonal lines
  • number of intersection points
Projection Histograms
• The basic idea behind projections is that character images, which are
2-D signals, can be represented as 1-D signals. These features, although
insensitive to noise and deformation, depend on rotation.
• Projection histograms count the number of pixels in each
column and row of a character image. Projection histograms
can separate characters such as “m” and “n”.
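The counting step is short enough to sketch directly. The image format (list of rows, 1 = foreground) and the name `projection_histograms` are assumptions.

```python
def projection_histograms(img):
    """Return (horizontal, vertical) projections of a binary character image."""
    horizontal = [sum(row) for row in img]        # foreground pixels per row
    vertical = [sum(col) for col in zip(*img)]    # foreground pixels per column
    return horizontal, vertical


# Crude 3-row glyphs: the vertical projection of "m" shows three peaks, "n" two.
GLYPH_N = [[1, 1, 1, 1],
           [1, 0, 0, 1],
           [1, 0, 0, 1]]
GLYPH_M = [[1, 1, 1, 1, 1],
           [1, 0, 1, 0, 1],
           [1, 0, 1, 0, 1]]
```

The toy glyphs are only there to show how the number of peaks in the vertical projection differs between the two letters.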
Profiles
• The profile counts the number of pixels (distance) between the bounding
box of the character image and the edge of the character. Profiles
describe the external shapes of characters well and make it possible to
distinguish between a great number of letters, such as “p” and “q”.
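A small sketch of left and right profiles, assuming the binary image has already been cropped to the character's bounding box; rows without ink are given the full width as their distance. The name `left_right_profiles` and the toy glyphs are illustrative only.

```python
def left_right_profiles(img):
    """Per-row distance from the left and right bounding-box edges to the character."""
    w = len(img[0])
    left, right = [], []
    for row in img:
        cols = [c for c, v in enumerate(row) if v]
        if cols:
            left.append(cols[0])            # background pixels before the first ink pixel
            right.append(w - 1 - cols[-1])  # background pixels after the last ink pixel
        else:
            left.append(w)
            right.append(w)
    return left, right


# Crude glyphs: "p" has its stem on the left, "q" on the right.
GLYPH_P = [[1, 1, 1], [1, 0, 1], [1, 1, 1], [1, 0, 0], [1, 0, 0]]
GLYPH_Q = [[1, 1, 1], [1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 0, 1]]
```

The two glyphs have identical projection histograms along the rows, but their left and right profiles are swapped, which is what separates them.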
Profiles
• Profiles can also be applied to the contour of the character image:
  • Extract the contour of the character
  • Locate the uppermost and the lowermost points of the
  contour
  • Calculate the in and out profiles of the contour
Crossings and Distances
• Crossings count the number of transitions from background to foreground
pixels along vertical and horizontal lines through the character image.
• Distances measure how far the first foreground pixel lies from the
upper and lower boundaries of the image along vertical lines, and from the
left and right boundaries along horizontal lines.
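Both features reduce to simple scans over rows and columns. The sketch below assumes a binary image as a list of rows (1 = foreground); the function names are hypothetical, and only the distance from the upper boundary is shown, the other three being symmetric.

```python
def count_crossings(line):
    """Number of background-to-foreground transitions along one scan line."""
    crossings, previous = 0, 0
    for value in line:
        if value and not previous:
            crossings += 1
        previous = value
    return crossings


def crossings(img):
    """Crossings along every horizontal line (rows) and vertical line (columns)."""
    rows = [count_crossings(row) for row in img]
    cols = [count_crossings(col) for col in zip(*img)]
    return rows, cols


def distances_from_top(img):
    """For each column, the row index of the first foreground pixel (height if none)."""
    h = len(img)
    return [next((r for r, v in enumerate(col) if v), h) for col in zip(*img)]
```

For a crude "n" shape, rows through the two legs yield two crossings each, while the top bar yields one.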
Structural Features
• Characters can be represented by structural features with high
tolerance to distortions and style variations. This type of
representation may also encode some knowledge about the
structure of the object, or about the sort of components that
make up the object.
• Structural features are based on topological and geometrical
properties of the character, such as aspect ratio, cross points, loops,
branch points, strokes and their directions, inflection between two
points, horizontal curves at top or bottom, etc.
Structural Features
• A structural feature extraction method for recognizing Greek handwritten
characters [Kavallieratou et al., 2002]
• Three types of features:
  • Horizontal and vertical projection histograms
  • Radial histogram
  • Radial out-in and radial in-out profiles
Global Transformations - Moments
• The Fourier Transform (FT) of the contour of the image is
calculated. Since the first n coefficients of the FT can be used
to reconstruct the contour, these n coefficients are taken as
an n-dimensional feature vector that represents the character.
• Moments, such as central moments and Zernike moments, make the
process of recognizing an object scale, translation, and rotation
invariant. The original image can be completely reconstructed from the
moment coefficients.
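A rough sketch of contour-based Fourier features, assuming the contour is an ordered list of (x, y) boundary points traversed counter-clockwise. The normalization shown (dropping the zeroth coefficient, dividing by the magnitude of the first, keeping magnitudes only) is one common way to obtain translation, scale and rotation invariance and is an assumption here, as is the name `fourier_descriptors`.

```python
import cmath
import math


def fourier_descriptors(contour, n):
    """First n normalized Fourier descriptor magnitudes of a closed contour."""
    z = [complex(x, y) for x, y in contour]  # boundary as a complex signal
    N = len(z)
    # Direct DFT of the low-frequency coefficients 0..n.
    coeffs = [sum(z[t] * cmath.exp(-2j * math.pi * k * t / N) for t in range(N)) / N
              for k in range(n + 1)]
    scale = abs(coeffs[1])  # assumed non-zero for a simple closed contour
    # coeffs[0] (the centroid) is dropped, so translation has no effect.
    return [abs(c) / scale for c in coeffs[1:]]
```

With this normalization the first entry is always 1, so in practice the informative part of the feature vector starts at the second entry.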
Classification
• k-Nearest Neighbour (k-NN), Bayes classifier,
Neural Networks (NN), Hidden Markov Models
(HMM), Support Vector Machines (SVM), etc.
There is no such thing as the “best classifier”. The choice
of classifier depends on many factors, such as the
available training set, the number of free parameters, etc.
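Of the classifiers listed, k-NN is the simplest to sketch. The example assumes feature vectors are tuples of numbers, uses Euclidean distance and a majority vote, and requires Python 3.8 or later for `math.dist`; `knn_classify` is a hypothetical name.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label of the majority among the k training samples closest to `query`.

    train: list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In an OCR setting the feature vectors would come from the feature extraction stage, for example zoning densities or projection histograms.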
Pattern_Recognition_via_Character_Recogn.pptx

More Related Content

Similar to Pattern_Recognition_via_Character_Recogn.pptx

Optical Character Recognition (OCR) based Retrieval
Optical Character Recognition (OCR) based RetrievalOptical Character Recognition (OCR) based Retrieval
Optical Character Recognition (OCR) based RetrievalBiniam Asnake
 
OCR for Urdu translation
OCR for Urdu translation OCR for Urdu translation
OCR for Urdu translation Yasar Hayat
 
A Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition SystemA Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition Systemiosrjce
 
"FingerPrint Recognition Using Principle Component Analysis(PCA)”
"FingerPrint Recognition Using Principle Component Analysis(PCA)”"FingerPrint Recognition Using Principle Component Analysis(PCA)”
"FingerPrint Recognition Using Principle Component Analysis(PCA)”Er. Arpit Sharma
 
A Review of Optical Character Recognition System for Recognition of Printed Text
A Review of Optical Character Recognition System for Recognition of Printed TextA Review of Optical Character Recognition System for Recognition of Printed Text
A Review of Optical Character Recognition System for Recognition of Printed Textiosrjce
 
Improvement of the Recognition Rate by Random Forest
Improvement of the Recognition Rate by Random ForestImprovement of the Recognition Rate by Random Forest
Improvement of the Recognition Rate by Random ForestIJERA Editor
 
Improvement oh the recognition rate by random forest
Improvement oh the recognition rate by random forestImprovement oh the recognition rate by random forest
Improvement oh the recognition rate by random forestYoussef Rachidi
 
Handwritten amazigh-character-recognition-system-for-image-obtained-by-camera...
Handwritten amazigh-character-recognition-system-for-image-obtained-by-camera...Handwritten amazigh-character-recognition-system-for-image-obtained-by-camera...
Handwritten amazigh-character-recognition-system-for-image-obtained-by-camera...Youssef Rachidi
 
Character Recognition (Devanagari Script)
Character Recognition (Devanagari Script)Character Recognition (Devanagari Script)
Character Recognition (Devanagari Script)IJERA Editor
 
IRJET- Optical Character Recognition using Image Processing
IRJET-  	  Optical Character Recognition using Image ProcessingIRJET-  	  Optical Character Recognition using Image Processing
IRJET- Optical Character Recognition using Image ProcessingIRJET Journal
 
PORTABLE CAMERA-BASED ASSISTIVE TEXT AND PRODUCT LABEL READING FROM HAND- H...
PORTABLE CAMERA-BASED  ASSISTIVE TEXT AND PRODUCT  LABEL READING FROM HAND- H...PORTABLE CAMERA-BASED  ASSISTIVE TEXT AND PRODUCT  LABEL READING FROM HAND- H...
PORTABLE CAMERA-BASED ASSISTIVE TEXT AND PRODUCT LABEL READING FROM HAND- H...Sathmica K
 
Optical character recognition IEEE Paper Study
Optical character recognition IEEE Paper StudyOptical character recognition IEEE Paper Study
Optical character recognition IEEE Paper StudyEr. Ashish Pandey
 
Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...ijfcstjournal
 
A Survey of Modern Character Recognition Techniques
A Survey of Modern Character Recognition TechniquesA Survey of Modern Character Recognition Techniques
A Survey of Modern Character Recognition Techniquesijsrd.com
 
Optical Character Recognition from Text Image
Optical Character Recognition from Text ImageOptical Character Recognition from Text Image
Optical Character Recognition from Text ImageEditor IJCATR
 

Similar to Pattern_Recognition_via_Character_Recogn.pptx (20)

Optical Character Recognition (OCR) based Retrieval
Optical Character Recognition (OCR) based RetrievalOptical Character Recognition (OCR) based Retrieval
Optical Character Recognition (OCR) based Retrieval
 
05a
05a05a
05a
 
OCR for Urdu translation
OCR for Urdu translation OCR for Urdu translation
OCR for Urdu translation
 
A Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition SystemA Comprehensive Study On Handwritten Character Recognition System
A Comprehensive Study On Handwritten Character Recognition System
 
A017240107
A017240107A017240107
A017240107
 
"FingerPrint Recognition Using Principle Component Analysis(PCA)”
"FingerPrint Recognition Using Principle Component Analysis(PCA)”"FingerPrint Recognition Using Principle Component Analysis(PCA)”
"FingerPrint Recognition Using Principle Component Analysis(PCA)”
 
E017322833
E017322833E017322833
E017322833
 
A Review of Optical Character Recognition System for Recognition of Printed Text
A Review of Optical Character Recognition System for Recognition of Printed TextA Review of Optical Character Recognition System for Recognition of Printed Text
A Review of Optical Character Recognition System for Recognition of Printed Text
 
Improvement of the Recognition Rate by Random Forest
Improvement of the Recognition Rate by Random ForestImprovement of the Recognition Rate by Random Forest
Improvement of the Recognition Rate by Random Forest
 
Improvement oh the recognition rate by random forest
Improvement oh the recognition rate by random forestImprovement oh the recognition rate by random forest
Improvement oh the recognition rate by random forest
 
Handwritten amazigh-character-recognition-system-for-image-obtained-by-camera...
Handwritten amazigh-character-recognition-system-for-image-obtained-by-camera...Handwritten amazigh-character-recognition-system-for-image-obtained-by-camera...
Handwritten amazigh-character-recognition-system-for-image-obtained-by-camera...
 
Character Recognition (Devanagari Script)
Character Recognition (Devanagari Script)Character Recognition (Devanagari Script)
Character Recognition (Devanagari Script)
 
E123440
E123440E123440
E123440
 
IRJET- Optical Character Recognition using Image Processing
IRJET-  	  Optical Character Recognition using Image ProcessingIRJET-  	  Optical Character Recognition using Image Processing
IRJET- Optical Character Recognition using Image Processing
 
PORTABLE CAMERA-BASED ASSISTIVE TEXT AND PRODUCT LABEL READING FROM HAND- H...
PORTABLE CAMERA-BASED  ASSISTIVE TEXT AND PRODUCT  LABEL READING FROM HAND- H...PORTABLE CAMERA-BASED  ASSISTIVE TEXT AND PRODUCT  LABEL READING FROM HAND- H...
PORTABLE CAMERA-BASED ASSISTIVE TEXT AND PRODUCT LABEL READING FROM HAND- H...
 
Optical character recognition IEEE Paper Study
Optical character recognition IEEE Paper StudyOptical character recognition IEEE Paper Study
Optical character recognition IEEE Paper Study
 
Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...
 
A Survey of Modern Character Recognition Techniques
A Survey of Modern Character Recognition TechniquesA Survey of Modern Character Recognition Techniques
A Survey of Modern Character Recognition Techniques
 
Optical Character Recognition from Text Image
Optical Character Recognition from Text ImageOptical Character Recognition from Text Image
Optical Character Recognition from Text Image
 
Computer Vision
Computer VisionComputer Vision
Computer Vision
 

Recently uploaded

The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesPrabhanshu Chaturvedi
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 

Recently uploaded (20)

The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 

Pattern_Recognition_via_Character_Recogn.pptx

  • 1. Pattern Recognition via Character Recognition
  • 2. Is a system that provides a full alphanumeric recognition of printed or handwritten characters at electronic speed by simply scanning the form OCR : Optical Character Recognition Intelligent Character Recognition (ICR) is used to describe the process of interpreting image data, in particular alphanumeric text. Sometimes OCR is known as ICR
  • 3. Functions & Features of OCR • Forms can be scanned through a scanner and then the recognition engine of the OCR system interpret the images and turn images of handwritten or printed characters into ASCII data (machine-readable characters). • The technology provides a complete form processing and documents capture solution. • Allows an open, scalable and workflow. • Includes forms definition, scanning, image • pre-processing, and recognition capabilities.
  • 4. ICR,OCR and OMR Differences • ICR and OCR are recognition engines used with imaging; • OMR is a data collection technology that does not require a recognition engine. • OMR cannot recognize hand-printed or machine-printed characters.
  • 5. Optical Mark Reader (OMR) • Forms • An OMR works with a specialized document and contains timing tracks along one edge of the form to indicate scanner where to read for marks which look like black boxes on the top or bottom of a form. • The cut of the form is very precise and the bubbles on a form must be located in the same location on every form. • Storage • With OMR, the image of a document is not scanned and stored. • Accuracy • OMR is simpler than OCR. • designed properly, OMR has more accuracy than OCR.
  • 6. OCR/ ICR • Forms • OCR/ ICR is more flexible since no timing tracks or block like form IDs required. • The image can float on a page. • ICR/ OCR technology uses registration mark on the four-corners of a document, in the recognition of an image. Respondents place one character per box on this form. • The use of drop color reduces the size of the scanner’s output and enhances the accuracy. • Storage/ retrieval • If the document needs to be electronically stored and maintained, then OCR/ ICR is needed. • OCR/ICR technologies, images can be scanned, indexed, and written to optical media.
  • 8. Advantages and Disadvantages • Advantages of Using Images Rather Than Paper • Quicker processing; no moving or storage of questionnaires near operators • Savings in costs and efficiencies by not having the paper questionnaires • Scanning and recognition allowed efficient management and planning for the rest of the processing workload • Reduced long term storage requirements, questionnaires could be destroyed after the initial scanning, recognition and repair • Quick retrieval for editing and reprocessing • Minimizes errors associated with physical handling of the questionnaires
  • 9. Disadvantages of Using Images Rather Than Paper • Accuracy • While OCR technology can be effective in converting handwritten or typed characters, it is not as accurate as OMR for reading data, where users simply mark the form. • Additional workload for data collectors • OCR has severe limitations when it comes to human handwriting. • Characters must be hand-printed as separate characters in boxes.
  • 10. OCR Systems OCR systems consist of five major stages: • Pre-processing • Segmentation • Feature Extraction • Classification • Post-processing
  • 11. Pre-processing The raw data is subjected to a number of preliminary processing steps to make it usable in the descriptive stages of character analysis. Pre-processing aims to produce data on which the OCR system can operate accurately. The main pre-processing steps are: • Binarization • Noise reduction • Stroke width normalization • Skew correction • Slant removal
  • 12. Binarization Document image binarization (thresholding) refers to the conversion of a gray-scale image into a binary image. Two categories of thresholding: • Global, picks one threshold value for the entire document image which is often based on an estimation of the background level from the intensity histogram of the image. • Adaptive (local), uses different values for each pixel according to the local area information
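The two thresholding categories above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the method of the slides: the global threshold is chosen with Otsu's method (a swapped-in histogram-based technique; the slide only says the threshold is estimated from the intensity histogram), and the adaptive variant compares each pixel with the mean of its local window. The function names, the `radius` and `offset` parameters, and the convention that dark pixels are foreground (1) are all assumptions.

```python
def otsu_threshold(image):
    """Global threshold: gray level (0-255) maximizing between-class variance."""
    hist = [0] * 256
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # intensity sum of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize_global(image, t):
    """One threshold for the whole page; 1 = dark ink, 0 = background."""
    return [[1 if v <= t else 0 for v in row] for row in image]


def binarize_adaptive(image, radius=1, offset=0):
    """Per-pixel threshold: the mean of the (2*radius+1)^2 window around it."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [image[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            local_mean = sum(vals) / len(vals)
            row.append(1 if image[y][x] < local_mean - offset else 0)
        out.append(row)
    return out
```

A global threshold is usually enough for evenly lit scans; the adaptive variant is the one to reach for when the background brightness varies across the page.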
  • 13. Noise Reduction - Normalization • Normalization provides a tremendous reduction in data size; thinning extracts the shape information of the characters. • Noise reduction improves the quality of the document. Two main approaches: • Filtering (masks) • Morphological operations (erosion, dilation, etc.)
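As an illustration of the morphological approach, the sketch below removes isolated specks from a binary image with an opening (erosion followed by dilation) using a 3x3 structuring element. The function names are hypothetical, and pixels outside the image are simply ignored, which is one of several possible border conventions.

```python
def _window(img, y, x):
    """Values in the 3x3 neighborhood of (y, x), clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))]


def erode(img):
    """A pixel stays foreground only if its whole neighborhood is foreground."""
    return [[min(_window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img):
    """A pixel becomes foreground if any pixel in its neighborhood is foreground."""
    return [[max(_window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def opening(img):
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(img))
```

Opening keeps strokes that are at least as thick as the structuring element and drops anything thinner, so on thin handwriting a smaller element or a filtering mask may be the safer choice.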
  • 14. Skew Correction • Skew correction methods are used to align the paper document with the coordinate system of the scanner. The main approaches for skew detection include correlation, projection profiles, and the Hough transform.
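Of the approaches listed, the projection-profile one is the easiest to sketch. The idea: for each candidate angle, project the foreground pixels onto the vertical axis along that angle and score how "peaky" the resulting row histogram is; text lines that are correctly aligned pile up into a few tall bins. The code below is a rough sketch under assumptions: it uses a shear approximation rather than a true rotation (reasonable for small angles), scores a profile by its sum of squared bin counts, and measures angles in image coordinates with y pointing down. All names are hypothetical.

```python
import math


def projection_score(points, angle_deg):
    """Sum of squared row counts after undoing a skew of angle_deg (shear approximation)."""
    t = math.tan(math.radians(angle_deg))
    hist = {}
    for x, y in points:
        r = round(y - x * t)
        hist[r] = hist.get(r, 0) + 1
    return sum(c * c for c in hist.values())


def detect_skew(image, candidates):
    """Return the candidate angle (degrees) with the sharpest horizontal projection profile."""
    points = [(x, y) for y, row in enumerate(image)
              for x, v in enumerate(row) if v]
    return max(candidates, key=lambda a: projection_score(points, a))
```

For example, `detect_skew(image, range(-10, 11))` searches whole-degree angles between -10 and 10; a finer step, or a coarse-to-fine search, trades time for precision.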
  • 15. Slant Removal • The slant of handwritten text varies from user to user. Slant removal methods are used to normalize all characters to a standard form. • Popular deslanting techniques are: • Calculation of the average angle of near-vertical elements • Bozinovic–Srihari Method (BSM).
  • 16. Slant Removal • Entropy • The dominant slope of the character is found as the angle whose slope-corrected character gives the minimum entropy of the vertical projection histogram. The vertical projection histogram is calculated for a range of angles ±R; in our case R = 60, which seems to cover all writing styles. The slope of the character, $a_m$, is found from $H(a_m) = \min_{-R \le a \le R} H(a)$, where $H = -\sum_{i=1}^{N} p_i \log p_i$ and $p_i$ is the normalized value of bin $i$ of the histogram. • The character is then corrected by using $x' = x - y \tan(a_m)$, $y' = y$.
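The entropy criterion can be sketched as follows. For each candidate angle a in the range ±R, the character is sheared with x' = x - y·tan(a), the vertical projection histogram of the sheared pixels is built, and its entropy H = -Σ p_i log p_i is computed; the angle with the smallest entropy is taken as the slant. This is an illustrative sketch, not the original implementation: the character is represented as a list of (x, y) foreground coordinates, the 5-degree search step is an assumption, and the sign of the estimated angle depends on whether y grows upward or downward in your coordinate system.

```python
import math


def vertical_projection_entropy(points, angle_deg):
    """Entropy of the column histogram after shearing by angle_deg."""
    t = math.tan(math.radians(angle_deg))
    hist = {}
    for x, y in points:
        col = round(x - y * t)
        hist[col] = hist.get(col, 0) + 1
    n = len(points)
    return -sum((c / n) * math.log(c / n) for c in hist.values())


def estimate_slant(points, r=60, step=5):
    """Angle in [-r, r] (degrees) giving the minimum-entropy vertical projection."""
    candidates = range(-r, r + 1, step)
    return min(candidates, key=lambda a: vertical_projection_entropy(points, a))


def deslant(points, angle_deg):
    """Apply the correction x' = x - y * tan(a), y' = y to every foreground pixel."""
    t = math.tan(math.radians(angle_deg))
    return [(round(x - y * t), y) for x, y in points]
```

A perfectly upright stroke falls into a single histogram column and has zero entropy, which is why the minimum of H marks the corrected character.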
  • 17. Segmentation • Text line detection (Hough transform, projections, smearing) • Word extraction (vertical projections, connected component analysis) • Word extraction, second approach (RLSA)
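Two of the techniques named above, projections and run-length smearing, fit in a short sketch. Text lines are taken as runs of non-empty rows in the horizontal projection, words as runs of non-empty columns in the vertical projection of one line (with small inter-character gaps merged), and a horizontal RLSA pass fills short background runs between foreground pixels. Function names and the `min_gap` and `threshold` parameters are assumptions made for illustration; segment ends are exclusive.

```python
def segments(profile, min_gap=1):
    """(start, end) pairs of non-zero runs; runs separated by fewer than min_gap zeros are merged."""
    runs, start = [], None
    for i, v in enumerate(list(profile) + [0]):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs.append((start, i))
            start = None
    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged


def text_lines(image):
    """Text lines from the horizontal projection (foreground count per row)."""
    return segments([sum(row) for row in image])


def words_in_line(image, top, bottom, min_gap=3):
    """Words from the vertical projection of rows top..bottom-1."""
    cols = [sum(image[y][x] for y in range(top, bottom))
            for x in range(len(image[0]))]
    return segments(cols, min_gap)


def rlsa_horizontal(row, threshold):
    """Fill background runs of length <= threshold lying between two foreground pixels."""
    out = list(row)
    last_fg = None
    for i, v in enumerate(row):
        if v:
            if last_fg is not None and 0 < i - last_fg - 1 <= threshold:
                for k in range(last_fg + 1, i):
                    out[k] = 1
            last_fg = i
    return out
```

After smearing every row with `rlsa_horizontal`, the characters of a word merge into one blob, so words can then be picked out as connected components.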
  • 18. Segmentation • Explicit Segmentation In explicit approaches one tries to identify the smallest possible word segments (primitive segments), which may be smaller than letters but cannot be segmented further. Later in the recognition process these primitive segments are assembled into letters based on input from the character recognizer. This strategy is robust and quite straightforward, but not very flexible. • Implicit Segmentation In implicit approaches the words are recognized as a whole, without segmenting them into letters. This is effective and viable only when the set of possible words is small and known in advance, as in the recognition of bank checks and postal addresses.
  • 19. Statistical Features • Representing a character image by the statistical distribution of its points takes care of style variations to some extent. • The major statistical features used for character representation are: • Zoning • Projections and profiles • Crossings and distances
  • 20. Zoning • The character image is divided into N×M zones. From each zone, features are extracted to form the feature vector. The goal of zoning is to obtain local characteristics instead of global characteristics.
  • 21. Zoning – Density Features • The number of foreground pixels, or the normalized number of foreground pixels, in each cell is considered a feature. Darker squares indicate a higher density of foreground pixels in the zone.
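A density feature vector can be computed in a few lines. The sketch below splits a binary character image into n rows by m columns of zones and returns, for each zone, the fraction of its pixels that are foreground (the normalized variant mentioned above). The function name is hypothetical, and the image is assumed to be at least n pixels high and m pixels wide.

```python
def zoning_density(image, n, m):
    """Fraction of foreground pixels in each of the n x m zones, in row-major order."""
    h, w = len(image), len(image[0])
    features = []
    for zi in range(n):
        y0, y1 = zi * h // n, (zi + 1) * h // n
        for zj in range(m):
            x0, x1 = zj * w // m, (zj + 1) * w // m
            count = sum(image[y][x]
                        for y in range(y0, y1)
                        for x in range(x0, x1))
            features.append(count / ((y1 - y0) * (x1 - x0)))
    return features
```

Normalizing by zone area keeps the features comparable when the image size is not an exact multiple of the grid and zones differ slightly in size.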
  • 22. Zoning – Direction Features • Based on the contour of the character image • For each zone the contour is followed and a directional histogram is obtained by analyzing the adjacent pixels in a 3x3 neighborhood
  • 23. Zoning – Direction Features • Based on the skeleton of the character image • Distinguish individual line segments • Labeling line segment information: line segments are coded with a direction number (2 = vertical line segment, 3 = right diagonal line segment, 4 = horizontal line segment, 5 = left diagonal line segment) • Line type normalization • The feature vector is formed through zoning from: number of horizontal lines, total length of horizontal lines, number of right diagonal lines, total length of right diagonal lines, number of vertical lines, total length of vertical lines, number of left diagonal lines, total length of left diagonal lines, number of intersection points
  • 24. Projection Histograms • The basic idea behind using projections is that character images, which are 2-D signals, can be represented as 1-D signals. These features, although insensitive to noise and deformation, depend on rotation. • Projection histograms count the number of pixels in each column and row of a character image. Projection histograms can separate characters such as “m” and “n”.
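As a minimal sketch, the two projection histograms of a binary character image are just its row sums and column sums. The function name is hypothetical and the image is assumed to be a list of rows of 0/1 values.

```python
def projection_histograms(image):
    """Return (row_counts, column_counts) of foreground pixels."""
    rows = [sum(row) for row in image]
    cols = [sum(col) for col in zip(*image)]
    return rows, cols
```

In the column histogram, the vertical strokes of a letter show up as peaks, which is what lets it tell three-stroke shapes like “m” from two-stroke shapes like “n”.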
  • 25. Profiles • The profile counts the number of pixels (distance) between the bounding box of the character image and the edge of the character. Profiles describe the external shapes of characters well and make it possible to distinguish between a great number of letters, such as “p” and “q”.
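A small sketch of left and right profiles, assuming the binary image has already been cropped to the character's bounding box. For each row, the profile is the number of background pixels between the box edge and the first foreground pixel; rows with no foreground get the full width. The function names and the crude 3x5 glyphs in the usage note are made up for illustration.

```python
def left_profile(image):
    """Distance from the left edge of the bounding box to the first foreground pixel, per row."""
    w = len(image[0])
    return [next((x for x, v in enumerate(row) if v), w) for row in image]


def right_profile(image):
    """Distance from the right edge of the bounding box to the first foreground pixel, per row."""
    w = len(image[0])
    return [next((x for x, v in enumerate(reversed(row)) if v), w)
            for row in image]
```

With a toy “p” (a bowl on top, a stem at the lower left) the left profile is flat and the right profile steps inward at the stem; for a toy “q” it is the other way round, so the pair of profiles separates the two letters.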
  • 26. Profiles • Profiles can also be applied to the contour of the character image • Extract the contour of the character • Locate the uppermost and the lowermost points of the contour • Calculate the in and out profiles of the contour
  • 27. Crossings and Distances • Crossings count the number of transitions from background to foreground pixels along vertical and horizontal lines through the character image. • Distances measure how far the first foreground pixel lies from the upper and lower boundaries of the image along vertical lines, and from the left and right boundaries along horizontal lines.
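Both feature types reduce to simple scans over the rows and columns of a binary image, as the sketch below shows. It treats the area outside the image as background, so a line that starts with a foreground pixel counts one transition; that convention, like the function names, is an assumption.

```python
def crossings(line):
    """Number of background-to-foreground transitions along one scan line."""
    prev, count = 0, 0
    for v in line:
        if v and not prev:
            count += 1
        prev = v
    return count


def first_hit(line):
    """Distance from the start of the line to its first foreground pixel (len(line) if none)."""
    return next((i for i, v in enumerate(line) if v), len(line))


def crossing_features(image):
    """Crossings along every horizontal line, then along every vertical line."""
    rows = [crossings(r) for r in image]
    cols = [crossings(c) for c in zip(*image)]
    return rows, cols


def distance_features(image):
    """Distances from the top, bottom, left and right boundaries to the first foreground pixel."""
    columns = [list(c) for c in zip(*image)]
    top = [first_hit(c) for c in columns]
    bottom = [first_hit(c[::-1]) for c in columns]
    left = [first_hit(r) for r in image]
    right = [first_hit(r[::-1]) for r in image]
    return top, bottom, left, right
```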
  • 28. Structural Features • Characters can be represented by structural features with high tolerance to distortions and style variations. This type of representation may also encode some knowledge about the structure of the object or about the sort of components that make up that object. • Structural features are based on topological and geometrical properties of the character, such as aspect ratio, cross points, loops, branch points, strokes and their directions, inflection between two points, horizontal curves at top or bottom, etc.
  • 30. Structural Features • A structural feature extraction method for recognizing Greek handwritten characters [Kavallieratou et al. 2002] • Three types of features: • Horizontal and vertical projection histograms • Radial histogram • Radial out-in and radial in-out profiles
  • 31. Global Transformations - Moments • The Fourier Transform (FT) of the contour of the image is calculated. Since the first n coefficients of the FT can be used to reconstruct the contour, these n coefficients are taken as an n-dimensional feature vector that represents the character. • Central and Zernike moments make the process of recognizing an object scale, translation, and rotation invariant. The original image can be completely reconstructed from the moment coefficients.
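To make the invariance idea concrete, here is a small sketch of central moments, which are computed relative to the centroid and therefore do not change when the character is translated, together with the usual normalization by a power of the zeroth moment, which additionally compensates for scale. It covers central moments only, not Zernike moments or Fourier descriptors. The character is given as a list of (x, y) foreground coordinates, and the function names are hypothetical.

```python
def central_moment(points, p, q):
    """mu_pq = sum over foreground pixels of (x - cx)^p * (y - cy)^q."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return sum(((x - cx) ** p) * ((y - cy) ** q) for x, y in points)


def normalized_central_moment(points, p, q):
    """eta_pq = mu_pq / mu_00^(1 + (p + q) / 2), with mu_00 = number of foreground pixels."""
    mu00 = len(points)
    return central_moment(points, p, q) / mu00 ** (1 + (p + q) / 2)
```

Shifting every point by the same offset moves the centroid by that offset too, so each term (x - cx) and (y - cy) is unchanged; that is the whole source of the translation invariance.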
  • 32. Classification • k-Nearest Neighbour (k-NN), Bayes classifier, Neural Networks (NN), Hidden Markov Models (HMM), Support Vector Machines (SVM), etc. There is no such thing as the “best classifier”. The choice of classifier depends on many factors, such as the available training set, the number of free parameters, etc.
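Of the classifiers listed, k-NN is the simplest to sketch: a feature vector is assigned the majority label among its k closest training vectors. The example below is a minimal illustration, assuming Euclidean distance and a training set given as (feature_vector, label) pairs; the function name and the toy data in the usage note are made up.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors nearest to query (Euclidean distance)."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

With a training set such as `[((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b")]`, the feature vectors would in practice be the zoning, projection, or moment features described earlier. k-NN has no training phase and only one free parameter, k, but its prediction time grows with the size of the training set.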