Multilingual OCR                                                                 Introduction


                                       ABSTRACT
       The aim of the project ‘Multilingual OCR’ is to develop OCR software for
online/offline handwriting recognition. Optical character recognition (OCR) is the
mechanical or electronic translation of images of handwritten or typewritten text (usually
captured by a scanner) into machine-editable text. OCR is a field of research in pattern
recognition, artificial intelligence and machine vision.
       Handwriting recognition most often refers to the ability of a computer to
translate human writing into text. This may take place in one of two ways: either by scanning
written text or by writing directly on a peripheral input device.




PES’s Modern College of Engineering, Shivajinagar, Pune-5                              Page 1


Aim: To develop an OCR system for online/offline handwriting recognition.


Description:
       We are going to implement software that recognizes characters from an
online or offline document (in image format) and uses them to build an individual user profile.
       The OCR system being developed here will recognize handwritten English characters.
Optical character recognition (OCR) is the mechanical or electronic translation of
images of handwritten or typewritten text (usually captured by a scanner) into
machine-editable text. OCR is a field of research in pattern recognition, artificial intelligence and
machine vision.






Scope of the project:
       This system can be used by multiple users, which requires extending the software to
recognize the handwriting of more than one user. In addition, if stroke information is captured and
passed to the system, it becomes possible to recognize cursive script as well.

The recognized characters are stored in a text file. Words can be mapped to sound files that the
program invokes, so that the recognized words are read aloud. In this way the
computer can read a handwritten document aloud.


Block Diagram:

       Inputs:          Touch Pad (online / real-time input); Scanned Document (offline input)
       PC:              shown together with Stored Characters
       Software domain: Grayscale Conversion -> Filtering -> Thinning -> Feature Extraction ->
                        Pattern Recognition -> Recognition Output

                                 Fig. Block Diagram for OCR
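
       The first stage of the software domain, grayscale conversion, can be sketched in Java as
follows. This is only an illustration: the class and method names are hypothetical, and the
luminance weights are the common ITU-R BT.601 values, which are an assumption because the
report does not state the formula actually used.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper for the "Grayscale Conversion" stage; not the project's actual code. */
final class GrayscaleSketch {

    /** Returns an 8-bit luminance value (0..255) for every pixel, indexed as [y][x]. */
    static int[][] toGray(BufferedImage src) {
        int[][] gray = new int[src.getHeight()][src.getWidth()];
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Assumed weights (ITU-R BT.601 luma); the report does not specify them.
                gray[y][x] = (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
            }
        }
        return gray;
    }
}
```

       Returning a plain two-dimensional array keeps the later stages (filtering, thinning, feature
extraction) independent of the image type of the input file.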








1. Introduction

1.1 Problem Statement:

         To develop an OCR system for online/offline handwriting recognition.


1.2 Project Scope:

         This system can be used by multiple users, which requires extending the software
to recognize the handwriting of more than one user. In addition, if stroke information is
captured and passed to the system, it becomes possible to recognize cursive script as
well.
         The recognized characters are stored in a text file. Words can be mapped to sound
files that the program invokes, so that the recognized words are read aloud.
In this way the computer can read a handwritten document aloud.


1.3 Project Objectives:

         This software recognizes handwritten characters and creates a profile for each
particular user. It supports various languages (except Marathi and Hindi). The
software can be used for security purposes and for creating a font from a user’s handwriting.


1.4 Assumptions and dependencies:

    1. “Multilingual OCR” requires an input image with a black background and a white foreground.
         For this purpose, the software has an Invert Image option, which converts the image to
         the proper format.
    2. The system is designed only for Windows OS. It may not work on other operating systems.
    3. The system will recognize any set of characters provided that they are written legibly.
    4. The characters must be properly separated for greater accuracy.
    5. The input given to the system must be a bitmap (BMP), PNG or JPEG (.jpeg, .jpg) file.
    6. The distance between characters and between rows should be constant to ensure accuracy.
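
    The Invert Image option mentioned in assumption 1 could work along the following lines. This
    is a minimal sketch, not the project's own implementation: the class and method names are
    hypothetical, and it assumes that inversion means replacing every colour channel value c
    by 255 - c.

```java
import java.awt.image.BufferedImage;

/** Hypothetical sketch of an "Invert Image" option; names are illustrative only. */
final class InvertImageSketch {

    /** Returns a new image in which every colour channel value c becomes 255 - c. */
    static BufferedImage invert(BufferedImage src) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                // Flipping the low 24 bits inverts the red, green and blue channels at once.
                out.setRGB(x, y, ~rgb & 0xFFFFFF);
            }
        }
        return out;
    }
}
```

    A scan of dark ink on white paper would pass through invert once before recognition, giving
    the white-on-black image that the system expects.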





1.5 Applications of OCR:

•   Practical Applications:
       In recent years, OCR (Optical Character Recognition) technology has been applied
throughout the entire spectrum of industries, revolutionizing the document management
process. OCR has enabled scanned documents to become more than just image files, turning
into fully searchable documents with text content that is recognized by computers. With the
help of OCR, people no longer need to manually retype important documents when entering
them into electronic databases. Instead, OCR extracts relevant information and enters it
automatically. The result is accurate, efficient information processing in less time.


•   Banking:
    The uses of OCR vary across different fields. One widely known application is in
banking, where OCR is used to process checks without human involvement. A check can be
inserted into a machine, the writing on it is scanned instantly, and the correct amount of
money is transferred. This technology has nearly been perfected for printed checks, and is
fairly accurate for handwritten checks as well, though it occasionally requires manual
confirmation. Overall, this reduces wait times in many banks.


•   Legal:
    In the legal industry, there has also been a significant movement to digitize paper
documents. In order to save space and eliminate the need to sift through boxes of paper files,
documents are being scanned and entered into computer databases. OCR further simplifies
the process by making documents text-searchable, so that they are easier to locate and work
with once in the database. Legal professionals now have fast, easy access to a huge library of
documents in electronic format, which they can find simply by typing in a few keywords.


•   Healthcare:
    Healthcare has also seen an increase in the use of OCR technology to process
paperwork. Healthcare professionals always have to deal with large volumes of forms for
each patient, including insurance forms as well as general health forms. To keep up with all
of this information, it is useful to input relevant data into an electronic database that can be
accessed as necessary. Form processing tools, powered by OCR, are able to extract
information from forms and put it into databases, so that every patient's data is promptly
recorded. As a result, healthcare providers can focus on delivering the best possible service to
every patient.


•   OCR in Other Industries:
    OCR is widely used in many other fields, including education, finance, and government
agencies. OCR has made countless texts available online, saving money for students and
allowing knowledge to be shared. Invoice imaging applications are used in many businesses
to keep track of financial records and prevent a backlog of payments from piling up. In
government agencies and independent organizations, OCR simplifies data collection and
analysis, among other processes. As the technology continues to develop, more and more
applications are found for OCR technology, including increased use of handwriting
recognition. Furthermore, other technologies related to OCR, such as barcode recognition, are
used daily in retail and other industries.








1.6 Literature Survey:

       Nowadays, the available software recognizes only English characters and
stores the recognized characters in ASCII format.

       Optical character recognition, usually abbreviated to OCR, is the mechanical or
electronic translation of images of handwritten, typewritten or printed text (usually captured
by a scanner) into machine-editable text.

       OCR is a field of research in pattern recognition, artificial intelligence and machine
vision. Though academic research in the field continues, the focus on OCR has shifted to
implementation of proven techniques. Optical character recognition (using optical techniques
such as mirrors and lenses) and digital character recognition (using scanners and computer
algorithms) were originally considered separate fields. Because very few applications survive
that use true optical techniques, the OCR term has now been broadened to include digital
image processing as well.

       Early systems required training (the provision of known samples of each character) to
read a specific font. "Intelligent" systems with a high degree of recognition accuracy for most
fonts are now common. Some systems are even capable of reproducing formatted output that
closely approximates the original scanned page including images, columns and other non-
textual components.

         In about 1965, Reader's Digest and RCA collaborated to build an OCR Document
reader designed to digitize the serial numbers on Reader's Digest coupons returned from
advertisements. The fonts used on the documents were printed by an RCA Drum printer
using the OCR-A font. The reader was connected directly to an RCA 301 computer (one of
the first solid state computers). This reader was followed by a specialised document reader
installed at TWA where the reader processed Airline Ticket stock. The readers processed
documents at a rate of 1,500 documents per minute, and checked each document, rejecting
those it was not able to process correctly. The product became part of the RCA product line
as a reader designed to process "Turn around Documents" such as those utility and insurance
bills returned with payments.

       The United States Postal Service has been using OCR machines to sort mail since
1965 based on technology devised primarily by the prolific inventor Jacob Rabinow. The first
use of OCR in Europe was by the British General Post Office (GPO). In 1965 it began
planning an entire banking system, the National Giro, using OCR technology, a process that
revolutionized bill payment systems in the UK. Canada Post has been using OCR systems
since 1971.

        In 1974 Ray Kurzweil started the company Kurzweil Computer Products, Inc. and led
development of the first omni-font optical character recognition system — a computer
program capable of recognizing text printed in any normal font. He decided that the best
application of this technology would be to create a reading machine for the blind, which
would allow blind people to have a computer read text to them out loud. This device required
the invention of two enabling technologies — the CCD flatbed scanner and the text-to-speech
synthesizer.

        In 1978 Kurzweil Computer Products began selling a commercial version of the
optical character recognition computer program. LexisNexis was one of the first customers,
and bought the program to upload paper legal and news documents onto its nascent online
databases.

        From 1992 to 1996, commissioned by the U.S. Department of Energy (DOE), the Information
Science Research Institute (ISRI) conducted the most authoritative of the Annual Tests of
OCR Accuracy, for five consecutive years. ISRI is a research and development unit of the
University of Nevada, Las Vegas. It
was established in 1990 with funding from the U.S. Department of Energy. Its mission is to
foster the improvement of automated technologies for understanding machine-printed
documents.

       One study based on recognition of 19th and early 20th century newspaper pages
concluded that character-by-character OCR accuracy for commercial OCR software varied
from 71% to 98%; total accuracy can only be achieved by human review. Other areas—
including recognition of hand printing, cursive handwriting, and printed text in other scripts
(especially those East Asian language characters which have many strokes for a single
character)—are still the subject of active research.








3.5 User Characteristics:

   •   The user should be given proper training to operate the whole system.
   •   The user must have basic knowledge of computers.
   •   The user must know how to handle devices such as a scanner and a mouse.


3.6 Specific Requirement:

3.6.1 User Interfaces

   The user will interact with the system as follows:

   •   The required output is generated depending on the type of user.
   •   By writing directly in the text area provided on the GUI.
   •   By first writing in an image file and then giving it as input to the system.
   •   The user will be asked to save the generated text in a .TXT file.


3.6.2 Hardware Requirements

   •   Intel Pentium II processor
   •   CPU clock speed of at least 500 MHz
   •   Minimum 64 MB of RAM
   •   Mouse
   •   Keyboard
   •   Scanner
   •   Monitor






3.6.3 Software Requirements

   •   Microsoft Windows 98/NT/XP/2000
   •   JDK 1.4 or later
   •   Java 2D API
   •   Java Advanced Imaging API
   •   Java Image I/O API
   •   Java Media Framework


3.6.4 Performance Requirements:
   •   Accuracy: The extent to which a program satisfies its specification and fulfils the
       customer's mission objectives.
   •   Reliability: The extent to which a program can be expected to perform its
       intended function with the required precision.
   •   Speed: The time required for a program to perform a given task.
   •   Maintainability: The effort required to locate and fix an error in the program.
   •   Portability: The effort required to transfer a program from one hardware and/or
       software system environment to another.
   •   Availability: The system is expected to be available around the clock at the
       installed site.

3.6.5 Functional Requirements:
    1. For static OCR, the software should provide a way to load a scanned document for
       recognition.
    2. If the scanned image does not have a black background and a white foreground, the
       software should provide a facility for image inversion.
    3. The software should process the image and extract characters.
    4. The user should be able to save the extracted data in the format of their choice.
    5. For dynamic OCR, the software should recognize characters as the user
       draws them.
    6. If the software does not give proper output, there should be a way to train the
       software's database.
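
    Requirements 1 and 4 touch file handling on both ends of the pipeline. The following Java
    sketch shows one way to check that an input file has one of the accepted image extensions and
    to save recognized text to a file. The class name, method names and the UTF-8 encoding are
    assumptions made for illustration; the report does not describe how the software does this.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

/** Hypothetical I/O helpers; names and encoding are illustrative assumptions. */
final class OcrIoSketch {

    /** True if the file name ends in one of the image extensions listed in the assumptions. */
    static boolean isSupportedImage(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".bmp") || lower.endsWith(".png")
                || lower.endsWith(".jpg") || lower.endsWith(".jpeg");
    }

    /** Writes the recognized text to the given file, replacing any existing content. */
    static void saveText(Path target, String recognizedText) throws IOException {
        // UTF-8 is an assumed encoding; the report only says a .TXT file is produced.
        Files.write(target, recognizedText.getBytes(StandardCharsets.UTF_8));
    }
}
```

    A caller would typically reject a file for which isSupportedImage returns false before
    attempting to load it, and call saveText once recognition has finished.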

3.6.6 Other Requirements:

    •   The input image must be in the bitmap file format.
    •   For a scanned image, a high-quality scanner and good paper quality are
        required. The scanner resolution should be set to at least 300 dots per inch
        (dpi).
    •   A tilt of up to 20° introduced during scanning can be corrected.
    •   Discontinuities in handwritten characters are tolerated up to a gap 3 pixels
        wide.
    •   A first-order median filter is used.
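
    The median filtering step can be illustrated with the Java sketch below. It assumes that the
    filter operates on a 3x3 neighbourhood of a grayscale image stored as a non-empty
    two-dimensional array, and that border pixels reuse their nearest neighbours. The report
    states neither the window size nor the border handling, so both are assumptions, and the
    class name is hypothetical.

```java
import java.util.Arrays;

/** Hypothetical 3x3 median filter; window size and border handling are assumptions. */
final class MedianFilterSketch {

    /** Returns a filtered copy of a non-empty grayscale image indexed as [y][x]. */
    static int[][] median3x3(int[][] gray) {
        int h = gray.length;
        int w = gray[0].length;
        int[][] out = new int[h][w];
        int[] window = new int[9];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int n = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        // Clamp coordinates so border pixels reuse their nearest neighbours.
                        int yy = Math.min(h - 1, Math.max(0, y + dy));
                        int xx = Math.min(w - 1, Math.max(0, x + dx));
                        window[n++] = gray[yy][xx];
                    }
                }
                Arrays.sort(window);
                out[y][x] = window[4]; // middle of the nine sorted values
            }
        }
        return out;
    }
}
```

    Because the median ignores isolated extreme values, a single noisy pixel surrounded by
    background is removed, while larger strokes are left largely intact.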




3.7 Position Statement:

        Optical handwriting recognition most often refers to
the ability of a computer to translate human writing into text. This system can
be used for:
                Railway reservation forms
                Libraries
                Government agencies
                School/college admission forms
                Making other lengthy documents available electronically




5012013040600550120130406005
50120130406005
 
300GroupProject_handwritingsoftware.pptx
300GroupProject_handwritingsoftware.pptx300GroupProject_handwritingsoftware.pptx
300GroupProject_handwritingsoftware.pptx
 
Smart Assistant for Blind Humans using Rashberry PI
Smart Assistant for Blind Humans using Rashberry PISmart Assistant for Blind Humans using Rashberry PI
Smart Assistant for Blind Humans using Rashberry PI
 
131 133
131 133131 133
131 133
 
OCR, optical character reader
OCR, optical character readerOCR, optical character reader
OCR, optical character reader
 
PB.docx
PB.docxPB.docx
PB.docx
 
OPTICAL CHARACTER RECOGNITION IN HEALTHCARE
OPTICAL CHARACTER RECOGNITION IN HEALTHCAREOPTICAL CHARACTER RECOGNITION IN HEALTHCARE
OPTICAL CHARACTER RECOGNITION IN HEALTHCARE
 

Recently uploaded

Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 

Recently uploaded (20)

INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 

Multilingual OCR Introduction and Applications

Scope of the project:
       This system can be used by multiple users, provided the software is extended to recognize the handwriting of more than one user. If stroke information is also captured and passed to the system, it becomes possible to recognize cursive script as well. The recognized characters are stored in a text file. Words can be mapped to sound files that the program invokes, so that the recognized words are read aloud. In this way the computer can read a handwritten document.

Block Diagram:
       [Fig. Block Diagram for OCR. Labels recoverable from the figure: Touch Pad (On Line / Real Time Input), Scanned Document (Off Line Input), PC, Grayscale Conversion, Filtering, Thinning, Feature Extraction, Pattern Recognition, Stored Characters, Recognition Output, Software Domain.]
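The block diagram lists Grayscale Conversion among the processing stages. The report does not say how the conversion is done, so the following is only a minimal sketch under stated assumptions: pixels are packed 0xRRGGBB integers (the layout returned by `BufferedImage.getRGB` in the Java 2D API), and the common 0.299/0.587/0.114 luminance weights are used. The class and method names are hypothetical.

```java
// Hypothetical sketch: grayscale conversion of packed RGB pixels.
// Assumption: luminance weights 0.299/0.587/0.114; the report does not
// state which formula the project actually uses.
final class GrayscaleSketch {

    // Converts packed 0xRRGGBB pixels to gray levels in the range 0..255.
    static int[][] toGray(int[][] rgb) {
        int[][] gray = new int[rgb.length][];
        for (int y = 0; y < rgb.length; y++) {
            gray[y] = new int[rgb[y].length];
            for (int x = 0; x < rgb[y].length; x++) {
                int r = (rgb[y][x] >> 16) & 0xFF;
                int g = (rgb[y][x] >> 8) & 0xFF;
                int b = rgb[y][x] & 0xFF;
                gray[y][x] = (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
            }
        }
        return gray;
    }

    public static void main(String[] args) {
        int[][] gray = toGray(new int[][] {{0xFFFFFF, 0x000000, 0xFF0000}});
        System.out.println(gray[0][0] + " " + gray[0][1] + " " + gray[0][2]);
    }
}
```

Working on plain arrays keeps the sketch independent of any particular image class; in a Java 2D program the input array would be filled from the loaded image first.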
1. Introduction

1.1 Problem Statement:
       To develop an OCR for online/offline handwriting recognition.

1.2 Project Scope:
       This system can be used by multiple users, provided the software is extended to recognize the handwriting of more than one user. If stroke information is also captured and passed to the system, it becomes possible to recognize cursive script as well. The recognized characters are stored in a text file. Words can be mapped to sound files that the program invokes, so that the recognized words are read aloud. In this way the computer can read a handwritten document.

1.3 Project Objectives:
       This software recognizes handwritten characters and creates a profile for each user. It supports various languages (except Marathi and Hindi). The software can be used for security purposes and for creating a font from the user's handwriting.

1.4 Assumptions and dependencies:
       1. "Multilingual OCR" requires an input image with a black background and a white foreground. For this purpose, the software has an Invert Image option, which converts the image into the proper format.
       2. The system is designed only for Windows OS. It may not work on other operating systems.
       3. The system will recognize any set of characters, provided that they are written legibly.
       4. The characters must be properly separated for greater accuracy.
       5. The input given to the system must be a Bitmap, png, jpeg or jpg file.
       6. The distance between characters and between rows should be constant to ensure accuracy.
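Assumption 1 in section 1.4 relies on an Invert Image option that turns dark writing on a light page into the white-on-black form the recognizer expects. The report does not show how that option is implemented; the following is a minimal sketch assuming 8-bit gray levels (0 = black, 255 = white) held in a plain array. The class and method names are hypothetical.

```java
// Hypothetical sketch of an "Invert Image" step on 8-bit gray levels.
// Assumption: 0 is black and 255 is white; the project's actual
// implementation is not described in the report.
final class InvertImageSketch {

    // Returns a new image in which every gray level v becomes 255 - v.
    static int[][] invert(int[][] gray) {
        int[][] out = new int[gray.length][];
        for (int y = 0; y < gray.length; y++) {
            out[y] = new int[gray[y].length];
            for (int x = 0; x < gray[y].length; x++) {
                out[y][x] = 255 - gray[y][x];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Dark ink (0) on a white page (255) becomes white ink on black.
        int[][] inverted = invert(new int[][] {{255, 0, 255}});
        System.out.println(java.util.Arrays.toString(inverted[0]));
    }
}
```

The method returns a copy rather than modifying its argument, so the original scan stays available if the user toggles the option off again.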
1.5 Applications of OCR:

• Practical Applications:
       In recent years, OCR (Optical Character Recognition) technology has been applied across the entire spectrum of industries, revolutionizing the document management process. OCR has enabled scanned documents to become more than just image files, turning them into fully searchable documents whose text content is recognized by computers. With the help of OCR, people no longer need to manually retype important documents when entering them into electronic databases. Instead, OCR extracts the relevant information and enters it automatically. The result is accurate, efficient information processing in less time.

• Banking:
       The uses of OCR vary across different fields. One widely known application is in banking, where OCR is used to process checks without human involvement. A check can be inserted into a machine, the writing on it is scanned instantly, and the correct amount of money is transferred. This technology has nearly been perfected for printed checks, and is fairly accurate for handwritten checks as well, though it occasionally requires manual confirmation. Overall, this reduces wait times in many banks.

• Legal:
       In the legal industry, there has also been a significant movement to digitize paper documents. To save space and eliminate the need to sift through boxes of paper files, documents are being scanned and entered into computer databases. OCR further simplifies the process by making documents text-searchable, so that they are easier to locate and work with once in the database. Legal professionals now have fast, easy access to a huge library of documents in electronic format, which they can find simply by typing in a few keywords.

• Healthcare:
       Healthcare has also seen an increase in the use of OCR technology to process paperwork. Healthcare professionals have to deal with large volumes of forms for each patient, including insurance forms as well as general health forms. To keep up with all of this information, it is useful to enter the relevant data into an electronic database that can be accessed as necessary. Form processing tools, powered by OCR, are able to extract information from forms and put it into databases, so that every patient's data is promptly recorded. As a result, healthcare providers can focus on delivering the best possible service to every patient.

• OCR in Other Industries:
       OCR is widely used in many other fields, including education, finance, and government agencies. OCR has made countless texts available online, saving money for students and allowing knowledge to be shared. Invoice imaging applications are used in many businesses to keep track of financial records and prevent a backlog of payments from piling up. In government agencies and independent organizations, OCR simplifies data collection and analysis, among other processes. As the technology continues to develop, more and more applications are found for OCR, including increased use of handwriting recognition. Furthermore, other technologies related to OCR, such as barcode recognition, are used daily in retail and other industries.
1.6 Literature Survey:
       Nowadays, the available software recognizes only English characters and stores the recognized characters in ASCII format.
       Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of images of handwritten, typewritten or printed text (usually captured by a scanner) into machine-editable text. OCR is a field of research in pattern recognition, artificial intelligence and machine vision. Though academic research in the field continues, the focus on OCR has shifted to implementation of proven techniques. Optical character recognition (using optical techniques such as mirrors and lenses) and digital character recognition (using scanners and computer algorithms) were originally considered separate fields. Because very few applications survive that use true optical techniques, the term OCR has now been broadened to include digital image processing as well.
       Early systems required training (the provision of known samples of each character) to read a specific font. "Intelligent" systems with a high degree of recognition accuracy for most fonts are now common. Some systems are even capable of reproducing formatted output that closely approximates the original scanned page, including images, columns and other non-textual components.
       In about 1965, Reader's Digest and RCA collaborated to build an OCR document reader designed to digitize the serial numbers on Reader's Digest coupons returned from advertisements. The documents were printed by an RCA drum printer using the OCR-A font. The reader was connected directly to an RCA 301 computer (one of the first solid state computers). This reader was followed by a specialised document reader installed at TWA, where it processed airline ticket stock. The readers processed documents at a rate of 1,500 documents per minute and checked each document, rejecting those they were not able to process correctly. The product became part of the RCA product line as a reader designed to process "turn around documents", such as the utility and insurance bills returned with payments.
       The United States Postal Service has been using OCR machines to sort mail since 1965, based on technology devised primarily by the prolific inventor Jacob Rabinow. The first use of OCR in Europe was by the British General Post Office (GPO). In 1965 it began planning an entire banking system, the National Giro, using OCR technology, a process that revolutionized bill payment systems in the UK. Canada Post has been using OCR systems since 1971.
       In 1974 Ray Kurzweil started the company Kurzweil Computer Products, Inc. and led development of the first omni-font optical character recognition system, a computer program capable of recognizing text printed in any normal font. He decided that the best application of this technology would be a reading machine for the blind, which would allow blind people to have a computer read text to them out loud. This device required the invention of two enabling technologies: the CCD flatbed scanner and the text-to-speech synthesizer.
       In 1978 Kurzweil Computer Products began selling a commercial version of the optical character recognition computer program. LexisNexis was one of the first customers, and bought the program to upload paper legal and news documents onto its nascent online databases.
       1992-1996: Commissioned by the U.S. Department of Energy (DOE), the Information Science Research Institute (ISRI) conducted the Annual Test of OCR Accuracy, regarded as the most authoritative test of its kind, for five consecutive years in the mid-1990s. ISRI is a research and development unit of the University of Nevada, Las Vegas. It was established in 1990 with funding from the U.S. Department of Energy. Its mission is to foster the improvement of automated technologies for understanding machine-printed documents.
       One study based on recognition of 19th and early 20th century newspaper pages concluded that character-by-character OCR accuracy for commercial OCR software varied from 71% to 98%; total accuracy can only be achieved by human review.
       Other areas, including recognition of hand printing, cursive handwriting, and printed text in other scripts (especially East Asian language characters, which have many strokes for a single character), are still the subject of active research.
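The survey quotes character-by-character accuracy figures of 71% to 98% without saying how they were computed. One common way to measure such a figure, shown here only as an illustrative assumption and not as the cited study's method, is to take the edit distance between the recognized text and a hand-checked reference and divide by the reference length:

$$\text{accuracy} = \frac{n - d}{n}$$

where $n$ is the number of reference characters and $d$ is the Levenshtein distance. The class and method names below are hypothetical.

```java
// Hypothetical sketch: character accuracy from Levenshtein distance.
// Assumption: accuracy = (n - d) / n, clamped at 0; this is a common
// definition, not necessarily the one used in the study cited above.
final class CharAccuracySketch {

    // Minimum number of single-character insertions, deletions and
    // substitutions needed to turn a into b (two-row dynamic programming).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    // Fraction of reference characters recognized correctly, in 0.0..1.0.
    static double accuracy(String reference, String recognized) {
        if (reference.isEmpty()) {
            return recognized.isEmpty() ? 1.0 : 0.0;
        }
        int d = editDistance(reference, recognized);
        return Math.max(0.0, (reference.length() - d) / (double) reference.length());
    }

    public static void main(String[] args) {
        // One substituted character out of seven.
        System.out.println(accuracy("optical", "optlcal"));
    }
}
```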
3.5 User Characteristics:
       • The user should be given proper training to operate the whole system.
       • The user must have basic knowledge of computers.
       • The user must know how to handle the required devices, e.g. scanner, mouse etc.

3.6 Specific Requirement:

3.6.1 User Interfaces
       The user will interact with the system as follows:
       • The output generated depends on the type of user.
       • By writing directly in the text area provided on the GUI.
       • By first writing in an image file and then giving it as input to the system.
       • The user will be asked to save the generated text in a .TXT file.

3.6.2 Hardware Requirements
       • Intel Pentium II processor
       • CPU speed of at least 500 MHz
       • Minimum 64 MB of RAM
       • Mouse
       • Keyboard
       • Scanner
       • Monitor
3.6.3 Software Requirements
       • Microsoft Windows 98/NT/XP/2000
       • JDK 1.4 or later
       • Java 2D API
       • Java Advanced Imaging API
       • Java Image I/O API
       • Java Media Framework

3.6.4 Performance Requirements:
       • Accuracy: The extent to which a program satisfies its specification and fulfils the customer's mission objective.
       • Reliability: The extent to which a program can be expected to perform its intended function with the required precision.
       • Speed: The time required for a program to perform the given task.
       • Maintainability: The effort required to locate and fix an error in the program.
       • Portability: The effort required to transfer a program from one hardware and/or software system environment to another.
       • Availability: The system is expected to be available around the clock at the installed site.

3.6.5 Functional Requirements:
       1. For static OCR, the software should provide a way to load a scanned document for recognition.
       2. If the scanned image does not have a black background and a white foreground, the software should provide a facility for image inversion.
       3. The software should process the image and extract characters.
       4. The user should be able to save the extracted data in the format of his or her choice.
       5. For dynamic OCR, the software should recognize characters as the user draws them.
       6. If the software does not give proper output, there should be a way to train the software's database.

3.6.6 Other Requirements:
       • The input image is to be in the bitmap file format.
       • For a scanned image, a high quality scanner as well as good paper quality is required. The resolution of the scanner should be set to a minimum of 300 dots per inch (dpi).
       • A tilt of up to 20º introduced during scanning can be corrected.
       • Discontinuities in the handwritten characters are tolerated up to a gap of 3 pixels.
       • A first order median filter is used.

3.7 Position Statement:
       Optical handwriting recognition is most often used to describe the ability of a computer to translate human writing into text. This system can be used for:
       • Railway Reservation Forms
       • Libraries
       • Government Agencies
       • School/College Admission Forms
       • Making other lengthy documents available electronically
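Section 3.6.6 states that a first order median filter is used but gives no further detail. The sketch below assumes this means a 3x3 window applied to 8-bit gray levels, with border pixels copied unchanged; both choices are assumptions for illustration, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch of a median filter for removing isolated noise pixels.
// Assumptions: 3x3 window, borders left untouched, non-empty rectangular
// input; the report does not specify any of these details.
final class MedianFilterSketch {

    // Replaces each interior pixel with the median of its 3x3 neighbourhood.
    static int[][] median3x3(int[][] gray) {
        int h = gray.length;
        int w = gray[0].length;
        int[][] out = new int[h][w];
        int[] window = new int[9];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (y == 0 || x == 0 || y == h - 1 || x == w - 1) {
                    out[y][x] = gray[y][x]; // border: no full window available
                    continue;
                }
                int k = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        window[k++] = gray[y + dy][x + dx];
                    }
                }
                Arrays.sort(window);
                out[y][x] = window[4]; // middle of the nine sorted values
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A single white speck on a black background is removed.
        int[][] noisy = {{0, 0, 0}, {0, 255, 0}, {0, 0, 0}};
        System.out.println(median3x3(noisy)[1][1]);
    }
}
```

A median is used rather than a mean in this kind of preprocessing because it removes isolated specks without blurring stroke edges, which matters for the thinning stage that follows.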