Artificial Neural
Network Based Optical
Character Recognition
ARNAB MAHATA
Index
Introduction to OCR
Overview of Artificial Neural Networks (ANNs)
Working Principle of ANN in OCR
Background Theory
Applications of OCR
Challenges and Future Directions
Conclusion
References
Introduction to OCR
Definition of OCR
OCR stands for Optical Character Recognition. It is a technology that converts printed or handwritten text into a machine-readable
format. OCR uses complex algorithms to automatically identify and extract text from unstructured documents like images,
screenshots, and physical paper documents.
Importance and applications of OCR
• OCR automates the conversion of printed or handwritten text into digital format, reducing the time and effort required for
manual data entry.
• OCR enables the digitization of historical documents, books, and archives, preserving valuable information in electronic format
for easy access and long-term storage.
• OCR plays a crucial role in extracting valuable data from documents, such as forms, surveys, and invoices, for further analysis
and decision-making.
Overview of Artificial Neural Networks (ANNs)
Definition of ANNs
An Artificial Neural Network (ANN) is a computational model inspired by the human brain’s neural structure. It consists of
interconnected nodes (neurons) organized into layers. Information flows through these nodes, and the network adjusts the
connection strengths (weights) during training to learn from data, enabling it to recognize patterns, make predictions, and solve
various tasks in machine learning and artificial intelligence.
Components of Artificial Neural Networks
Neurons (Nodes):
Neurons are the fundamental units of ANNs.
Layers:
ANNs consist of multiple layers of neurons organized into an input layer, one or more hidden layers, and an output layer.
Weights and Biases:
Weights represent the strength of connections between neurons in adjacent layers. Biases are per-neuron offsets added to the
weighted sum of a neuron's inputs.
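As a minimal sketch of how these components combine, the snippet below computes the output of one neuron: a weighted sum of the inputs plus a bias, passed through an activation function. The sigmoid activation, the function name `neuron_output`, and all numeric values are assumptions chosen for illustration; the slides do not name a specific activation.

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, squashed by a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values: three inputs, three connection weights, one bias.
out = neuron_output([1.0, 0.0, 1.0], [0.5, -0.3, 0.8], bias=-1.3)
print(out)
```

Here the weighted sum and the bias cancel, so the sigmoid sits at its midpoint; changing any weight moves the output toward 0 or 1.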
Working Principle of ANN in OCR
Input Layer:
The input layer receives pixel values or features extracted from the input image representing the characters to be recognized.
Hidden Layer:
Intermediate layers, known as hidden layers, perform computations on the input data through weighted connections.
Output Layer:
The output layer consists of neurons corresponding to the possible classes or characters to be recognized.
Activation Functions:
Neurons within the network apply activation functions to the weighted sum of their inputs to introduce non-linearity and determine their
output.
Training Process:
During training, the network learns to associate input images with their corresponding labels (characters).
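The layers and the training process described above can be sketched with a very small network. This is an illustrative toy under stated assumptions, not the method of any particular OCR system: the 3x3 glyphs, the hidden layer size, the tanh and softmax activations, the learning rate, and the number of passes are all hypothetical choices.

```python
import math
import random

# Toy 3x3 glyphs (1 = ink, 0 = background), flattened row by row.
GLYPHS = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "-": [0, 0, 0, 1, 1, 1, 0, 0, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
}
CLASSES = list(GLYPHS)
N_IN, N_HID, N_OUT = 9, 5, len(CLASSES)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_HID)]
b1 = [0.0] * N_HID
W2 = [[rng.uniform(-0.5, 0.5) for _ in range(N_HID)] for _ in range(N_OUT)]
b2 = [0.0] * N_OUT

def forward(x):
    """Input layer -> hidden layer (tanh) -> output layer (softmax)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    z = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return h, [v / s for v in e]

def mean_loss():
    """Average cross-entropy over the training glyphs."""
    total = sum(-math.log(forward(x)[1][k])
                for k, x in enumerate(GLYPHS.values()))
    return total / len(GLYPHS)

def train_step(x, target, lr=0.2):
    """One gradient-descent update for a single labelled glyph."""
    h, p = forward(x)
    # Gradient of cross-entropy with respect to the output pre-activations.
    dz = [p[k] - (1.0 if k == target else 0.0) for k in range(N_OUT)]
    # Back-propagate to the hidden layer before changing W2.
    dh = [sum(dz[k] * W2[k][j] for k in range(N_OUT)) * (1.0 - h[j] ** 2)
          for j in range(N_HID)]
    for k in range(N_OUT):
        for j in range(N_HID):
            W2[k][j] -= lr * dz[k] * h[j]
        b2[k] -= lr * dz[k]
    for j in range(N_HID):
        for i in range(N_IN):
            W1[j][i] -= lr * dh[j] * x[i]
        b1[j] -= lr * dh[j]

def predict(x):
    """Return the class whose output neuron has the highest probability."""
    p = forward(x)[1]
    return CLASSES[p.index(max(p))]

initial_loss = mean_loss()
for _ in range(1000):
    for target, x in enumerate(GLYPHS.values()):
        train_step(x, target)
final_loss = mean_loss()
predictions = [predict(x) for x in GLYPHS.values()]
print(initial_loss, final_loss, predictions)
```

Each output neuron corresponds to one character class, and training only adjusts the weights and biases; the network structure stays fixed.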
Background Theory
Image acquisition
A scanner reads documents and converts them to binary data. The OCR software analyzes the scanned image and classifies the light areas as background and
the dark areas as text.
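The light/dark classification can be illustrated with a fixed-threshold binarization. This is a sketch only: the threshold of 128 on a 0-255 grayscale and the sample pixel values are assumptions, and practical systems often choose the threshold adaptively.

```python
def binarize(gray, threshold=128):
    """Map each 0-255 grayscale pixel to 1 (dark = text) or 0 (light = background)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

# Hypothetical scan fragment: high values are paper, low values are ink.
scan = [
    [250, 245, 30, 240],
    [248, 20, 25, 251],
    [252, 247, 35, 249],
]
binary = binarize(scan)
print(binary)
```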
Preprocessing
The OCR software first cleans the image and removes errors to prepare it for reading.
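One simple form of cleaning is despeckling, sketched below: ink pixels with no neighbouring ink are treated as noise and cleared. The rule and the name `despeckle` are illustrative assumptions; real preprocessing usually also includes steps such as deskewing and smoothing.

```python
def despeckle(binary):
    """Clear ink pixels (1) that have no ink among their 8 neighbours."""
    rows, cols = len(binary), len(binary[0])

    def has_ink_neighbour(r, c):
        return any(
            binary[y][x]
            for y in range(max(r - 1, 0), min(r + 2, rows))
            for x in range(max(c - 1, 0), min(c + 2, cols))
            if (y, x) != (r, c)
        )

    return [
        [1 if binary[r][c] and has_ink_neighbour(r, c) else 0 for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical input: a vertical stroke plus one stray pixel in the top-right corner.
noisy = [
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
cleaned = despeckle(noisy)
print(cleaned)
```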
Text recognition
The two main approaches that OCR software uses for text recognition are pattern matching and feature extraction.
Pattern matching
Pattern matching works by isolating a character image, called a glyph, and comparing it with stored glyphs. Pattern matching works only if the
stored glyph has a similar font and scale to the input glyph. This method works well with scanned images of documents that have been typed in a known font.
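A minimal sketch of pattern matching follows, assuming every glyph has already been scaled to the same 3x3 grid. The templates, the pixel-agreement score, and the names `similarity` and `match_glyph` are hypothetical and only illustrate the idea of comparing an input glyph against stored ones.

```python
# Stored glyphs: '1' is ink, '0' is background.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}

def similarity(a, b):
    """Fraction of pixels on which two same-sized glyphs agree."""
    pairs = [(pa, pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(pa == pb for pa, pb in pairs) / len(pairs)

def match_glyph(glyph, templates=TEMPLATES):
    """Return (label, score) of the stored glyph closest to the input."""
    scored = ((label, similarity(glyph, t)) for label, t in templates.items())
    return max(scored, key=lambda item: item[1])

noisy_t = ["111", "010", "011"]  # a 'T' with one stray pixel
label, score = match_glyph(noisy_t)
print(label, score)
```

Because the comparison is pixel by pixel, a glyph in a different font or at a different scale would score poorly against every template, which is the limitation noted above.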
Feature extraction
Feature extraction breaks down or decomposes the glyphs into features such as lines, closed loops, line direction, and line intersections. It then uses these
features to find the best match or the nearest neighbor among its various stored glyphs.
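The sketch below illustrates feature extraction followed by a nearest-neighbour lookup. It assumes just two features, the number of closed loops and the fraction of inked pixels; these choices, the 5x5 glyphs, and the function names are illustrative assumptions rather than the feature set of any real OCR engine.

```python
import math

def count_loops(glyph):
    """Count background regions that are fully enclosed by ink ('#')."""
    rows, cols = len(glyph), len(glyph[0])
    seen = [[glyph[r][c] == "#" for c in range(cols)] for r in range(rows)]

    def flood(r, c):
        stack = [(r, c)]
        while stack:
            y, x = stack.pop()
            if 0 <= y < rows and 0 <= x < cols and not seen[y][x]:
                seen[y][x] = True
                stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])

    # Background connected to the border is "outside", not a loop.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                flood(r, c)
    loops = 0
    for r in range(rows):
        for c in range(cols):
            if not seen[r][c]:
                loops += 1
                flood(r, c)
    return loops

def features(glyph):
    """Feature vector: (closed loops, fraction of inked pixels)."""
    ink = sum(row.count("#") for row in glyph)
    return (count_loops(glyph), ink / (len(glyph) * len(glyph[0])))

STORED = {
    "O": [".###.", "#...#", "#...#", "#...#", ".###."],
    "C": [".###.", "#....", "#....", "#....", ".###."],
    "8": [".###.", "#...#", ".###.", "#...#", ".###."],
}
STORED_FEATURES = {label: features(g) for label, g in STORED.items()}

def classify(glyph):
    """Return the stored label whose feature vector is nearest (Euclidean)."""
    f = features(glyph)
    return min(STORED_FEATURES, key=lambda label: math.dist(f, STORED_FEATURES[label]))

noisy_o = [".###.", "#...#", "#..##", "#...#", ".###."]  # an 'O' with one extra pixel
result = classify(noisy_o)
print(result)
```

Unlike pixel-by-pixel matching, the loop count is unchanged by the extra pixel, so the noisy glyph still lands nearest to the stored 'O'.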
Postprocessing
After analysis, the system converts the extracted text data into a computerized file. Some OCR systems can create annotated PDF files that include both the
before and after versions of the scanned document.
Applications of OCR
Optical Character Recognition (OCR) is a versatile technology with widespread applications across various
industries, leveraging its ability to convert images of text into editable and searchable data.
Document Digitization:
OCR is extensively used to convert paper documents, such as books, reports, and archives, into digital
formats.
Data Extraction and Analysis:
OCR enables the extraction of valuable data from documents, including forms, invoices, and receipts.
Automatic Number Plate Recognition (ANPR):
OCR is employed in ANPR systems to automatically read and recognize license plate numbers from images
captured by cameras.
Mobile Applications:
OCR is integrated into mobile scanning apps for smartphones and tablets, allowing users to digitize
documents on-the-go.
Challenges and Future Directions
While Optical Character Recognition (OCR) has made significant
advancements, several challenges persist, alongside promising future
directions for the technology.
Challenges:
• Handwriting Recognition
• Noise and Distortion
• Layout and Formatting
Future Directions:
• Deep Learning Advancements
• Semantic Understanding
Conclusion
Optical Character Recognition (OCR) is a
versatile technology with widespread
applications across various industries,
ranging from document management and
data extraction to accessibility and mobile
applications. Its ability to automate data
entry, digitize documents, and improve
accessibility makes it indispensable in
today's digital age.
References
• Vinod Chandra and R. Sudhakar, “Recent
Developments in Artificial Neural Network Based
Character Recognition: A Performance Study”,
IEEE, 1988.
• Evelina Maria De Almeida Neves, Adilson
Gonzaga, Annie France Frere Slaets, “A Multi-Font
Character Recognition Based on its Fundamental
Features by Artificial Neural Networks”, IEEE, 1997.
• D. Sasikala, R. Neelaveni, “Correlation Coefficient
Measure of Multimodal Brain Image Registration
using Fast Walsh Hadamard Transform”, Journal of
Theoretical and Applied Information Technology,
2005.
Thank You
Arnab Mahata