The document describes a project to develop optical character recognition (OCR) software for recognizing online and offline handwritten text in multiple languages. It aims to recognize characters from scanned documents or real-time handwriting input and create a user profile. The system scope includes recognizing handwriting from multiple users and cursive script. It will store recognized characters in a text file and optionally convert words to audio for reading documents aloud. The document provides details on OCR technology, applications, literature review, user and system requirements, and the project's goal of using OCR for applications like forms processing.
1. Multilingual OCR Introduction
ABSTRACT
The aim of the project ‘Multilingual OCR’ is to develop OCR software for
online/offline handwriting recognition. Optical character recognition (OCR) is the
mechanical or electronic translation of images of handwritten or typewritten text (usually
captured by a scanner) into machine-editable text. OCR is a field of research in pattern
recognition, artificial intelligence and machine vision.
Handwriting recognition is the term most often used to describe the ability of a computer to
translate human writing into text. This may take place in one of two ways: either by scanning
written text or by writing directly on a peripheral input device.
PES’s Modern College of Engineering, Shivajinagar, Pune-5 Page 1
Aim: To develop OCR software for online/offline handwriting recognition.
Description:
We will implement software that recognizes the characters in an online or offline
document (in image format) and uses them to build an individual user profile.
The OCR developed here recognizes handwritten English characters.
Optical character recognition (OCR) is the mechanical or electronic translation of
images of handwritten or typewritten text (usually captured by a scanner) into
machine-editable text. OCR is a field of research in pattern recognition, artificial
intelligence and machine vision.
Scope of the project:
The system can be extended to support multiple users by improving the software so that it
recognizes the handwriting of more than one user. In addition, if stroke information is captured
and supplied to the system, it will be possible to recognize cursive script as well.
The recognized characters are stored in a text file. Sound files can be associated with words and
invoked through the program so that the recognized words are read aloud. In this way the
computer can read a handwritten document aloud.
Block Diagram:
[Figure: On-line (real-time) input from a touch pad and off-line input from a scanned
document are fed to the PC. In the software domain the image passes through grayscale
conversion, filtering, thinning, feature extraction and pattern recognition against the
stored characters, producing the recognition output.]
Fig. Block Diagram for OCR
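As a rough illustration of the grayscale conversion stage named in the block diagram, the following Java sketch converts an RGB image to 8-bit gray levels. The class name GrayscaleSketch, the method toGrayscale and the use of the common 0.299/0.587/0.114 luminance weights are assumptions made for illustration; the report does not state how the project performs this step.

```java
import java.awt.image.BufferedImage;
import java.awt.image.WritableRaster;

// Hypothetical helper, not taken from the project source.
final class GrayscaleSketch {
    private GrayscaleSketch() {}

    // Returns a new 8-bit grayscale copy of src.
    // Assumption: standard luminance weights (0.299 R + 0.587 G + 0.114 B),
    // applied with integer arithmetic.
    static BufferedImage toGrayscale(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage gray = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster raster = gray.getRaster();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                int lum = (299 * r + 587 * g + 114 * b) / 1000;
                // Write the gray level directly into band 0 of the raster.
                raster.setSample(x, y, 0, lum);
            }
        }
        return gray;
    }
}
```

The gray levels are written through the raster rather than through setRGB so that the stored value is exactly the computed luminance.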
1. Introduction
1.1 Problem Statement:
To develop OCR software for online/offline handwriting recognition.
1.2 Project Scope:
The system can be extended to support multiple users by improving the software so
that it recognizes the handwriting of more than one user. In addition, if stroke information
is captured and supplied to the system, it will be possible to recognize cursive script as
well.
The recognized characters are stored in a text file. Sound files can be associated with
words and invoked through the program so that the recognized words are read aloud. In this
way the computer can read a handwritten document aloud.
1.3 Project Objectives:
This software recognizes handwritten characters and creates a profile for each
individual user. It supports various languages (except Marathi and Hindi). The
software can be used for security purposes and for creating a font from a user’s handwriting.
1.4 Assumptions and dependencies:
1. “Multilingual OCR” requires an input image with a black background and a white foreground.
For this purpose, the software has an Invert Image option, which converts the image to the
required format.
2. The system is designed only for Windows OS. It may not work on other operating systems.
3. The system will recognize any set of characters, provided that they are written legibly.
4. The characters must be properly separated for greater accuracy.
5. The input given to the system must be a bitmap (BMP), PNG or JPEG (.jpeg/.jpg) file.
6. The distance between characters and between rows should be constant to ensure accuracy.
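The Invert Image option mentioned in the first assumption can be pictured with a short Java sketch. It is only an illustration under stated assumptions: the class name InvertSketch and the method invert are hypothetical, and the report does not describe how the project's own option is implemented.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper illustrating an "Invert Image" style operation.
final class InvertSketch {
    private InvertSketch() {}

    // Returns a copy of src in which every RGB channel value v becomes 255 - v,
    // so dark ink on a light page turns into light strokes on a dark background.
    static BufferedImage invert(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = src.getRGB(x, y) & 0xFFFFFF; // drop the alpha byte
                out.setRGB(x, y, rgb ^ 0xFFFFFF);      // flip all 24 colour bits
            }
        }
        return out;
    }
}
```

Applying invert twice returns the original colours, so the option can be toggled safely if the user selects it by mistake.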
1.5 Applications of OCR:
• Practical Applications:
In recent years, OCR (Optical Character Recognition) technology has been applied
throughout the entire spectrum of industries, revolutionizing the document management
process. OCR has enabled scanned documents to become more than just image files, turning
into fully searchable documents with text content that is recognized by computers. With the
help of OCR, people no longer need to manually retype important documents when entering
them into electronic databases. Instead, OCR extracts relevant information and enters it
automatically. The result is accurate, efficient information processing in less time.
• Banking:
The uses of OCR vary across different fields. One widely known application is in
banking, where OCR is used to process checks without human involvement. A check can be
inserted into a machine, the writing on it is scanned instantly, and the correct amount of
money is transferred. This technology has nearly been perfected for printed checks, and is
fairly accurate for handwritten checks as well, though it occasionally requires manual
confirmation. Overall, this reduces wait times in many banks.
• Legal:
In the legal industry, there has also been a significant movement to digitize paper
documents. In order to save space and eliminate the need to sift through boxes of paper files,
documents are being scanned and entered into computer databases. OCR further simplifies
the process by making documents text-searchable, so that they are easier to locate and work
with once in the database. Legal professionals now have fast, easy access to a huge library of
documents in electronic format, which they can find simply by typing in a few keywords.
• Healthcare:
Healthcare has also seen an increase in the use of OCR technology to process
paperwork. Healthcare professionals always have to deal with large volumes of forms for
each patient, including insurance forms as well as general health forms. To keep up with all
of this information, it is useful to input relevant data into an electronic database that can be
accessed as necessary. Form processing tools, powered by OCR, are able to extract
information from forms and put it into databases, so that every patient's data is promptly
recorded. As a result, healthcare providers can focus on delivering the best possible service to
every patient.
• OCR in Other Industries:
OCR is widely used in many other fields, including education, finance, and government
agencies. OCR has made countless texts available online, saving money for students and
allowing knowledge to be shared. Invoice imaging applications are used in many businesses
to keep track of financial records and prevent a backlog of payments from piling up. In
government agencies and independent organizations, OCR simplifies data collection and
analysis, among other processes. As the technology continues to develop, more and more
applications are found for OCR technology, including increased use of handwriting
recognition. Furthermore, other technologies related to OCR, such as barcode recognition, are
used daily in retail and other industries.
1.6 Literature Survey:
At present, the available software recognizes only English characters. It
recognizes the characters and stores them in ASCII format.
Optical character recognition, usually abbreviated to OCR, is the mechanical or
electronic translation of images of handwritten, typewritten or printed text (usually captured
by a scanner) into machine-editable text.
OCR is a field of research in pattern recognition, artificial intelligence and machine
vision. Though academic research in the field continues, the focus on OCR has shifted to
implementation of proven techniques. Optical character recognition (using optical techniques
such as mirrors and lenses) and digital character recognition (using scanners and computer
algorithms) were originally considered separate fields. Because very few applications survive
that use true optical techniques, the OCR term has now been broadened to include digital
image processing as well.
Early systems required training (the provision of known samples of each character) to
read a specific font. "Intelligent" systems with a high degree of recognition accuracy for most
fonts are now common. Some systems are even capable of reproducing formatted output that
closely approximates the original scanned page including images, columns and other non-
textual components.
In about 1965, Reader's Digest and RCA collaborated to build an OCR Document
reader designed to digitize the serial numbers on Reader's Digest coupons returned from
advertisements. The fonts used on the documents were printed by an RCA Drum printer
using the OCR-A font. The reader was connected directly to an RCA 301 computer (one of
the first solid state computers). This reader was followed by a specialised document reader
installed at TWA where the reader processed Airline Ticket stock. The readers processed
documents at a rate of 1,500 documents per minute, and checked each document, rejecting
those it was not able to process correctly. The product became part of the RCA product line
as a reader designed to process "Turn around Documents" such as those utility and insurance
bills returned with payments.
The United States Postal Service has been using OCR machines to sort mail since
1965 based on technology devised primarily by the prolific inventor Jacob Rabinow. The first
use of OCR in Europe was by the British General Post Office (GPO). In 1965 it began
planning an entire banking system, the National Giro, using OCR technology, a process that
revolutionized bill payment systems in the UK. Canada Post has been using OCR systems
since 1971.
In 1974 Ray Kurzweil started the company Kurzweil Computer Products, Inc. and led
development of the first omni-font optical character recognition system — a computer
program capable of recognizing text printed in any normal font. He decided that the best
application of this technology would be to create a reading machine for the blind, which
would allow blind people to have a computer read text to them out loud. This device required
the invention of two enabling technologies — the CCD flatbed scanner and the text-to-speech
synthesizer.
In 1978 Kurzweil Computer Products began selling a commercial version of the
optical character recognition computer program. LexisNexis was one of the first customers,
and bought the program to upload paper legal and news documents onto its nascent online
databases.
From 1992 to 1996, commissioned by the U.S. Department of Energy (DOE), the
Information Science Research Institute (ISRI) conducted the Annual Test of OCR Accuracy,
the most authoritative test of its kind, for five consecutive years. ISRI is a research and
development unit of the University of Nevada, Las Vegas. It was established in 1990 with
funding from the U.S. Department of Energy. Its mission is to foster the improvement of
automated technologies for understanding machine-printed documents.
One study based on recognition of 19th and early 20th century newspaper pages
concluded that character-by-character OCR accuracy for commercial OCR software varied
from 71% to 98%; total accuracy can only be achieved by human review. Other areas—
including recognition of hand printing, cursive handwriting, and printed text in other scripts
(especially those East Asian language characters which have many strokes for a single
character)—are still the subject of active research.
3.5 User Characteristics:
• The user should be given proper training to operate the whole system.
• The user must have basic knowledge of computers.
• The user must know how to handle the different devices, e.g. scanner, mouse, etc.
3.6 Specific Requirement:
3.6.1 User Interfaces
The user will interact with the system in one of two ways:
• By writing directly in the text area provided in the GUI.
• By first writing in an image file and then giving that file as input to the system.
The required output is generated depending on the type of user. The user will be asked to
save the generated text in a .TXT file.
3.6.2 Hardware Requirements
• Intel Pentium II processor
• Minimum CPU speed of 500 MHz
• Minimum 64 MB of RAM
• Mouse
• Keyboard
• Scanner
• Monitor
3.6.3 Software Requirements
• Microsoft Windows 98/NT/XP/2000
• JDK 1.4 or later
• Java 2D API
• Java Advanced Imaging API
• Java Image I/O API
• Java Media Framework
3.6.4 Performance Requirements:
• Accuracy: The extent to which a program satisfies its specification and fulfils the
customer's mission objectives.
• Reliability: The extent to which a program can be expected to perform its
intended function with the required precision.
• Speed: The time required for a program to perform the given task.
• Maintainability: The effort required to locate and fix an error in the program.
• Portability: The effort required to transfer a program from one hardware and/or
software system environment to another.
• Availability: The system is expected to be available around the clock at the
installed site.
3.6.5 Functional Requirements:
1. For static OCR, the software should provide a way to load a scanned document for
recognition.
2. If the scanned image does not have a black background and a white foreground, the
software should provide a facility for image inversion.
3. The software should process the image and extract the characters.
4. The user should be able to save the extracted data in a format of their choice.
5. For dynamic OCR, the software should recognize characters as the user draws
them.
6. If the software does not give correct output, there should be a way to train the
software's database.
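The requirement to save the extracted data can be sketched in Java as follows, taking a plain .TXT file as the output format. This is a minimal illustration, not the project's code: the class TextOutputSketch and its methods are hypothetical, UTF-8 encoding is an assumption, and the java.nio.file calls need a newer Java release than the JDK 1.4 minimum listed under the software requirements.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for writing recognized text to a .TXT file.
final class TextOutputSketch {
    private TextOutputSketch() {}

    // Writes the recognized text to the given file, replacing any old content.
    // Assumption: UTF-8 encoding, so non-English characters are preserved.
    static void saveText(Path file, String recognized) {
        try {
            Files.write(file, recognized.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the file back, e.g. to display or verify the saved text.
    static String loadText(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would pass the path chosen by the user in a save dialog together with the string produced by the recognition stage.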
3.6.6 Other Requirements:
• The input image must be in the bitmap file format.
• For a scanned image, a high-quality scanner and good paper quality are
required. The resolution of the scanner should be set to a minimum of 300 dots per inch
(dpi).
• A tilt of up to 20° introduced during scanning can be corrected.
• Discontinuities in the handwritten characters are tolerated up to a gap of 3 pixels
in width.
• A first order median filter is used.
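The median filtering step in the list above can be sketched in Java as follows. The report does not define "first order", so this sketch assumes the smallest common window, 3x3, and handles the image border by repeating the nearest edge pixel. The class name MedianFilterSketch, the method median3x3 and the int[][] gray-level representation are likewise assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical helper; assumes a non-empty rectangular array of gray levels.
final class MedianFilterSketch {
    private MedianFilterSketch() {}

    // Replaces each pixel by the median of its 3x3 neighbourhood.
    // Border pixels reuse the nearest edge value (index clamping).
    static int[][] median3x3(int[][] img) {
        int h = img.length;
        int w = img[0].length;
        int[][] out = new int[h][w];
        int[] window = new int[9];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int n = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int yy = Math.min(Math.max(y + dy, 0), h - 1);
                        int xx = Math.min(Math.max(x + dx, 0), w - 1);
                        window[n++] = img[yy][xx];
                    }
                }
                Arrays.sort(window);
                out[y][x] = window[4]; // middle of the 9 sorted values
            }
        }
        return out;
    }
}
```

A median filter removes isolated noise pixels while keeping stroke edges sharper than an averaging filter would, which suits the thinning stage that follows it.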
3.7 Position Statement:
Optical handwriting recognition is the term most often used to describe
the ability of a computer to translate human writing into text. This system can
be used for:
• Railway reservation forms
• Libraries
• Government agencies
• School/college admission forms
• Making other lengthy documents available electronically