Optical Character Recognition System for Urdu (Naskh Font)Using Pattern Match...CSCJournals
The offline optical character recognition (OCR) for different languages has been developed over the recent years. Since 1965, the US postal service has been using this system for automating their services. The range of the applications under this area is increasing day by day, due to its utility in almost major areas of government as well as private sector. This technique has been very useful in making paper free environment in many major organizations as far as the backup of their previous file record is concerned. Our this system has been proposed for the Offline Character Recognition for Isolated Characters of Urdu language, as Urdu language forms words by combining Isolated Characters. Urdu is a cursive language, having connected characters making words. The major area of utility for Urdu OCR will be digitizing of a lot of literature related material already stocked in libraries. Urdu language is famous and spoken in more than 3 big countries including Pakistan, India and Bangladesh. A lot of work has been done in Urdu poetry and literature up to the recent century. Creation of OCR for Urdu language will make an important role in converting all those work from physical libraries to electronic libraries. Most of the stuff already placed on internet is in the form of images having text, which took a lot of space to transfer and even read online. So the need of an Urdu OCR is a must. The system is of training system type. It consists of the image preprocessing, line and character segmentation, creation of xml file for training purpose. While Recognition system includes taking xml file, the image to be recognized, segment it and creation of chain codes for character images and matching with already stored in xml file. The system has been implemented and it has 89% recognition accuracy with a 15 char/sec recognition rate.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Angular Symmetric Axis Constellation Model for Off-line Odia Handwritten Char...IJAAS Team
Optical character recognition is one of the emerging research topics in the field of image processing, and it has extensive area of application in pattern recognition. Odia handwritten script is the most research concern area because it has eldest and most likable language in the state of odisha, India. Odia character is a usually handwritten, which was generally occupied by scanner into machine readable form. In this regard several recognition technique have been evolved for variance kind of languages but writing pattern of odia character is just like as curve appearance; Hence it is more difficult for recognition. In this article we have presented the novel approach for Odia character recognition based on the different angle based symmetric axis feature extraction technique which gives high accuracy of recognition pattern. This empirical model generates a unique angle based boundary points on every skeletonised character images. These points are interconnected with each other in order to extract row and column symmetry axis. We extracted feature matrix having mean distance of row, mean angle of row, mean distance of column and mean angle of column from centre of the image to midpoint of the symmetric axis respectively. The system uses a 10 fold validation to the random forest (RF) classifier and SVM for feature matrix. We have considered the standard database on 200 images having each of 47 Odia character and 10 Odia numeric for simulation. As we have noted outcome of simulation of SVM and RF yields 96.3% and 98.2% accuracy rate on NIT Rourkela Odia character database and 88.9% and 93.6% from ISI Kolkata Odia numerical database.
Optical character recognition is one of the emerging research topics in the field of image processing, and it has extensive area of application in pattern recognition. Odia handwritten script is the most research concern area because it has eldest and most likable language in the state of odisha, India. Odia character is a usually handwritten, which was generally occupied by scanner into machine readable form. In this regard several recognition technique have been evolved for variance kind of languages but writing pattern of odia character is just like as curve appearance; Hence it is more difficult for recognition. In this article we have presented the novel approach for Odia character recognition based on the different angle based symmetric axis feature extraction technique which gives high accuracy of recognition pattern. This empirical model generates a unique angle based boundary points on every skeletonised character images. These points are interconnected with each other in order to extract row and column symmetry axis. We extracted feature matrix having mean distance of row, mean angle of row, mean distance of column and mean angle of column from centre of the image to midpoint of the symmetric axis respectively. The system uses a 10 fold validation to the random forest (RF) classifier and SVM for feature matrix. We have considered the standard database on 200 images having each of 47 Odia character and 10 Odia numeric for simulation. As we have noted outcome of simulation of SVM and RF yields 96.3% and 98.2% accuracy rate on NIT Rourkela Odia character database and 88.9% and 93.6% from ISI Kolkata Odia numerical database.
Optical Character Recognition System for Urdu (Naskh Font)Using Pattern Match...CSCJournals
The offline optical character recognition (OCR) for different languages has been developed over the recent years. Since 1965, the US postal service has been using this system for automating their services. The range of the applications under this area is increasing day by day, due to its utility in almost major areas of government as well as private sector. This technique has been very useful in making paper free environment in many major organizations as far as the backup of their previous file record is concerned. Our this system has been proposed for the Offline Character Recognition for Isolated Characters of Urdu language, as Urdu language forms words by combining Isolated Characters. Urdu is a cursive language, having connected characters making words. The major area of utility for Urdu OCR will be digitizing of a lot of literature related material already stocked in libraries. Urdu language is famous and spoken in more than 3 big countries including Pakistan, India and Bangladesh. A lot of work has been done in Urdu poetry and literature up to the recent century. Creation of OCR for Urdu language will make an important role in converting all those work from physical libraries to electronic libraries. Most of the stuff already placed on internet is in the form of images having text, which took a lot of space to transfer and even read online. So the need of an Urdu OCR is a must. The system is of training system type. It consists of the image preprocessing, line and character segmentation, creation of xml file for training purpose. While Recognition system includes taking xml file, the image to be recognized, segment it and creation of chain codes for character images and matching with already stored in xml file. The system has been implemented and it has 89% recognition accuracy with a 15 char/sec recognition rate.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Angular Symmetric Axis Constellation Model for Off-line Odia Handwritten Char...IJAAS Team
Optical character recognition is one of the emerging research topics in the field of image processing, and it has extensive area of application in pattern recognition. Odia handwritten script is the most research concern area because it has eldest and most likable language in the state of odisha, India. Odia character is a usually handwritten, which was generally occupied by scanner into machine readable form. In this regard several recognition technique have been evolved for variance kind of languages but writing pattern of odia character is just like as curve appearance; Hence it is more difficult for recognition. In this article we have presented the novel approach for Odia character recognition based on the different angle based symmetric axis feature extraction technique which gives high accuracy of recognition pattern. This empirical model generates a unique angle based boundary points on every skeletonised character images. These points are interconnected with each other in order to extract row and column symmetry axis. We extracted feature matrix having mean distance of row, mean angle of row, mean distance of column and mean angle of column from centre of the image to midpoint of the symmetric axis respectively. The system uses a 10 fold validation to the random forest (RF) classifier and SVM for feature matrix. We have considered the standard database on 200 images having each of 47 Odia character and 10 Odia numeric for simulation. As we have noted outcome of simulation of SVM and RF yields 96.3% and 98.2% accuracy rate on NIT Rourkela Odia character database and 88.9% and 93.6% from ISI Kolkata Odia numerical database.
Optical character recognition is one of the emerging research topics in the field of image processing, and it has extensive area of application in pattern recognition. Odia handwritten script is the most research concern area because it has eldest and most likable language in the state of odisha, India. Odia character is a usually handwritten, which was generally occupied by scanner into machine readable form. In this regard several recognition technique have been evolved for variance kind of languages but writing pattern of odia character is just like as curve appearance; Hence it is more difficult for recognition. In this article we have presented the novel approach for Odia character recognition based on the different angle based symmetric axis feature extraction technique which gives high accuracy of recognition pattern. This empirical model generates a unique angle based boundary points on every skeletonised character images. These points are interconnected with each other in order to extract row and column symmetry axis. We extracted feature matrix having mean distance of row, mean angle of row, mean distance of column and mean angle of column from centre of the image to midpoint of the symmetric axis respectively. The system uses a 10 fold validation to the random forest (RF) classifier and SVM for feature matrix. We have considered the standard database on 200 images having each of 47 Odia character and 10 Odia numeric for simulation. As we have noted outcome of simulation of SVM and RF yields 96.3% and 98.2% accuracy rate on NIT Rourkela Odia character database and 88.9% and 93.6% from ISI Kolkata Odia numerical database.
An Optical Character Recognition for Handwritten Devanagari ScriptIJERA Editor
Optical Character Recognition is process of recognition of character from scanned document and lots of OCR now available in the market. But most of these systems work for Roman, Chinese, Japanese and Arabic characters . There are no sufficient number of work on Indian language script like Devanagari so this paper present a review on optical character recognition on handwritten Devanagari script
A NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGESijnlc
Large amount of information is lying dormant in historical documents and manuscripts. This information would go futile if not stored in digital form. Searching some relevant information from these scanned images would ideally require converting these document images to text form by doing optical character
recognition (OCR). For indigenous scripts of India, there are very few OCRs that can successfully recognize printed text images of varying quality, size, style and font. An alternate approach using word spotting can be effective to access large collections of document images. We propose a word spotting
technique based on codes for matching the word images of Devanagari script. The shape information is utilised for generating integer codes for words in the document image and these codes are matched for final retrieval of relevant documents. The technique is illustrated using Marathi document images.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Script Identification of Text Words from a Tri-Lingual Document Using Voting ...CSCJournals
In a multi script environment, majority of the documents may contain text information printed in more than one script/language forms. For automatic processing of such documents through Optical Character Recognition (OCR), it is necessary to identify different script regions of the document. In this context, this paper proposes to develop a model to identify and separate text words of Kannada, Hindi and English scripts from a printed tri-lingual document. The proposed method is trained to learn thoroughly the distinct features of each script. The binary tree classifier is used to classify the input text image. Experimentation conducted involved 1500 text words for learning and 1200 text words for testing. Extensive experimentation has been carried out on both manually created data set and scanned data set. The results are very encouraging and prove the efficacy of the proposed model. The average success rate is found to be 99% for manually created data set and 98.5% for data set constructed from scanned document images.
Design and implementation of optical character recognition using template mat...eSAT Journals
Abstract
Optical character recognition (OCR) is an efficient way of converting scanned image into machine code which can further edit. There are variety of methods have been implemented in the field of character recognition. This paper proposes Optical character recognition by using Template Matching. The templates formed, having variety of fonts and size .In this proposed system, Image pre-processing, Feature extraction and classification algorithms have been implemented so as to build an excellent character recognition technique for different scripts .Result of this approach is also discussed in this paper. This system is implemented in Matlab.
Keywords- OCR, Feature Extraction, Classification
SCRIPTS AND NUMERALS IDENTIFICATION FROM PRINTED MULTILINGUAL DOCUMENT IMAGEScscpconf
Identification of scripts from multi-script document is one of the important steps in the design
of an OCR system for successful analysis and recognition. Most optical character recognition
(OCR) systems can recognize at most a few scripts. But for large archives of document images
containing different scripts, there must be some way to automatically categorize these
documents before applying the proper OCR on them. Much work has already been reported in
this area. In the Indian context, though some results have been reported, the task is still at its
infancy. This paper presents a research in the identification of Tamil, English and Hindi
scripts at word level irrespective of their font faces and sizes. It also identifies English
numerals from multilingual document images. The proposed technique performs document
vectorization method which generates vectors from the nine zones segmented over the
characters based on their shape, density and transition features. Script is then determined by
using Rule based classifiers and its sub classifiers containing set of classification rules which
are raised from the vectors. The proposed system identifies scripts from document images
even if it suffers from noise and other kinds of distortions. Results from experiments,
simulations, and human vision encounter that the proposed technique identifies scripts and numerals with minimal pre-processing and high accuracy. In future, this can also be extended for other scripts.
Comparative Analysis of PSO and GA in Geom-Statistical Character Features Sel...IJERA Editor
Online handwriting recognition today has special interest due to increased usage of the hand held devices and it
has become a difficult problem because of the high variability and ambiguity in the character shapes written by
individuals. One major problem encountered by researchers in developing character recognition system is
selection of efficient features (optimal features). In this paper, a feature extraction technique for online character
recognition system was developed using hybrid of geometrical and statistical (Geom-statistical) features. Thus,
through the integration of geometrical and statistical features, insights were gained into new character
properties, since these types of features were considered to be complementary. Several optimization techniques
have been used in literature for feature selection in character recognition such as; Ant Colony Optimization
Algorithm (ACO), Genetic Algorithm (GA), Particle Swarm Optimization (PSO) and Simulated Annealing but
comparative analysis of GA and PSO in online character has not been carried out. In this paper, a comparative
analysis of performance was made between the GA and PSO in optimizing the Geom-statistical features in
online character recognition using Modified Optical Backpropagation (MOBP) as classifier. Simulation of the
system was done and carried out on Matlab 7.10a. The results generated show that PSO is a well-accepted
optimization algorithm in selection of optimal features as it outperforms the GA in terms of number of features
selected, training time and recognition accuracy.
DISCRIMINATION OF ENGLISH TO OTHER INDIAN LANGUAGES (KANNADA AND HINDI) FOR O...IJCSEA Journal
India is a multilingual multi-script country. In every state of India there are two languages one is state local language and the other is English. For example in Andhra Pradesh, a state in India, the document may contain text words in English and Telugu script. For Optical Character Recognition (OCR) of such a bilingual document, it is necessary to identify the script before feeding the text words to the OCRs of individual scripts. In this paper, we are introducing a simple and efficient technique of script identification for Kannada, English and Hindi text words of a printed document. The proposed approach is based on the horizontal and vertical projection profile for the discrimination of the three scripts. The feature extraction is done based on the horizontal projection profile of each text words. We analysed 700 different words of Kannada, English and Hindi in order to extract the discrimination features and for the development of knowledge base. We use the horizontal projection profile of each text word and based on the horizontal projection profile we extract the appropriate features. The proposed system is tested on 100 differentdocument images containing more than 1000 text words of each script and a classification rate of 98.25%, 99.25% and 98.87% is achieved for Kannada, English and Hindi respectively.
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
OCR-THE 3 LAYERED APPROACH FOR DECISION MAKING STATE AND IDENTIFICATION OF TE...ijaia
Optical Character recognition is the method of digitalization of hand and type written or printed text into
machine-encoded form and is superfluity of the various applications of envision of human’s life. In present
human life OCR has been successfully using in finance, legal, banking, health care and home need
appliances. India is a multi cultural, literature and traditional scripted country. Telugu is the southern
Indian language, it is a syllabic language, symbol script represents a complete syllable and formed with the
conjunct mixed consonants in their representation. Recognition of mixed conjunct consonants is critical
than the normal consonants, because of their variation in written strokes, conjunct maxing with pre and
post level of consonants. This paper proposes the layered approach methodology to recognize the
characters, conjunct consonants, mixed- conjunct consonants and expressed the efficient classification of
the hand written and printed conjunct consonants. This paper implements the Advanced Fuzzy Logic system
controller to take the text in the form of written or printed, collected the text images from the scanned file,
digital camera, Processing the Image with Examine the high intensity of images based on the quality
ration, Extract the image characters depends on the quality then check the character orientation and
alignment then to check the character thickness, base and print ration. The input image characters can
classify into the two ways, first way represents the normal consonants and the second way represents
conjunct consonants. Digitalized image text divided into three layers, the middle layer represents normal
consonants and the top and bottom layer represents mixed conjunct consonants. Here recognition process
starts from middle layer, and then it continues to check the top and bottom layers. The recognition process
treat as conjunct consonants when it can detect any symbolic characters in top and bottom layers of
present base character otherwise treats as normal consonants. The post processing technique applied to all
three layered characters. Post processing of the image: concentrated on the image text readability and
compatibility, if the readability is not process then repeat the process again. In this recognition process
includes slant correction, thinning, normalization, segmentation, feature extraction and classification. In
the process of development of the algorithm the pre-processing, segmentation, character recognition and
post-processing modules were discussed. The main objectives to the development of this paper are: To
develop the classification, identification of deference prototyping for written and printed consonants,
conjunct consonants and symbols based on 3 layered approaches with different measurable area by using
fuzzy logic and to determine suitable features for handwritten character recognition.
ARABIC ONLINE HANDWRITING RECOGNITION USING NEURAL NETWORKijaia
This article presents the development of an Arabic online handwriting recognition system. To develop our
system, we have chosen the neural network approach. It offers solutions for most of the difficulties linked
to Arabic script recognition. We test the approach with our collected databases. This system shows a good
result and it has a high accuracy (98.50% for characters, 96.90% for words).
Signboard Text Translator: A Guide to TouristIJECEIAES
The travelers face troubles in understanding the signboards which are written in local lan- guage. The travelers can rely on smart phone for traveling purposes. Smart phones become most popular in recent years in terms of market value and the number of useful applications to the users. This work intends to build up a web application that can recognize the English content present on signboard pictures captured using a smart phone, translate the content from English to Telugu, and display the translated Telugu text back onto the screen of the phone. Experiments have been conducted on various signboard pictures and the outcomes demonstrate the viability of the proposed approach.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
An Optical Character Recognition for Handwritten Devanagari ScriptIJERA Editor
Optical Character Recognition is process of recognition of character from scanned document and lots of OCR now available in the market. But most of these systems work for Roman, Chinese, Japanese and Arabic characters . There are no sufficient number of work on Indian language script like Devanagari so this paper present a review on optical character recognition on handwritten Devanagari script
A NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGESijnlc
Large amount of information is lying dormant in historical documents and manuscripts. This information would go futile if not stored in digital form. Searching some relevant information from these scanned images would ideally require converting these document images to text form by doing optical character
recognition (OCR). For indigenous scripts of India, there are very few OCRs that can successfully recognize printed text images of varying quality, size, style and font. An alternate approach using word spotting can be effective to access large collections of document images. We propose a word spotting
technique based on codes for matching the word images of Devanagari script. The shape information is utilised for generating integer codes for words in the document image and these codes are matched for final retrieval of relevant documents. The technique is illustrated using Marathi document images.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Script Identification of Text Words from a Tri-Lingual Document Using Voting ...CSCJournals
In a multi script environment, majority of the documents may contain text information printed in more than one script/language forms. For automatic processing of such documents through Optical Character Recognition (OCR), it is necessary to identify different script regions of the document. In this context, this paper proposes to develop a model to identify and separate text words of Kannada, Hindi and English scripts from a printed tri-lingual document. The proposed method is trained to learn thoroughly the distinct features of each script. The binary tree classifier is used to classify the input text image. Experimentation conducted involved 1500 text words for learning and 1200 text words for testing. Extensive experimentation has been carried out on both manually created data set and scanned data set. The results are very encouraging and prove the efficacy of the proposed model. The average success rate is found to be 99% for manually created data set and 98.5% for data set constructed from scanned document images.
Design and implementation of optical character recognition using template mat...eSAT Journals
Abstract
Optical character recognition (OCR) is an efficient way of converting scanned image into machine code which can further edit. There are variety of methods have been implemented in the field of character recognition. This paper proposes Optical character recognition by using Template Matching. The templates formed, having variety of fonts and size .In this proposed system, Image pre-processing, Feature extraction and classification algorithms have been implemented so as to build an excellent character recognition technique for different scripts .Result of this approach is also discussed in this paper. This system is implemented in Matlab.
Keywords- OCR, Feature Extraction, Classification
SCRIPTS AND NUMERALS IDENTIFICATION FROM PRINTED MULTILINGUAL DOCUMENT IMAGEScscpconf
Identification of scripts from multi-script document is one of the important steps in the design
of an OCR system for successful analysis and recognition. Most optical character recognition
(OCR) systems can recognize at most a few scripts. But for large archives of document images
containing different scripts, there must be some way to automatically categorize these
documents before applying the proper OCR on them. Much work has already been reported in
this area. In the Indian context, though some results have been reported, the task is still at its
infancy. This paper presents a research in the identification of Tamil, English and Hindi
scripts at word level irrespective of their font faces and sizes. It also identifies English
numerals from multilingual document images. The proposed technique performs document
vectorization method which generates vectors from the nine zones segmented over the
characters based on their shape, density and transition features. Script is then determined by
using Rule based classifiers and its sub classifiers containing set of classification rules which
are raised from the vectors. The proposed system identifies scripts from document images
even if it suffers from noise and other kinds of distortions. Results from experiments,
simulations, and human vision encounter that the proposed technique identifies scripts and numerals with minimal pre-processing and high accuracy. In future, this can also be extended for other scripts.
Comparative Analysis of PSO and GA in Geom-Statistical Character Features Sel...IJERA Editor
Online handwriting recognition today has special interest due to increased usage of the hand held devices and it
has become a difficult problem because of the high variability and ambiguity in the character shapes written by
individuals. One major problem encountered by researchers in developing character recognition system is
selection of efficient features (optimal features). In this paper, a feature extraction technique for online character
recognition system was developed using hybrid of geometrical and statistical (Geom-statistical) features. Thus,
through the integration of geometrical and statistical features, insights were gained into new character
properties, since these types of features were considered to be complementary. Several optimization techniques
have been used in literature for feature selection in character recognition such as; Ant Colony Optimization
Algorithm (ACO), Genetic Algorithm (GA), Particle Swarm Optimization (PSO) and Simulated Annealing but
comparative analysis of GA and PSO in online character has not been carried out. In this paper, a comparative
analysis of performance was made between the GA and PSO in optimizing the Geom-statistical features in
online character recognition using Modified Optical Backpropagation (MOBP) as classifier. Simulation of the
system was done and carried out on Matlab 7.10a. The results generated show that PSO is a well-accepted
optimization algorithm in selection of optimal features as it outperforms the GA in terms of number of features
selected, training time and recognition accuracy.
DISCRIMINATION OF ENGLISH TO OTHER INDIAN LANGUAGES (KANNADA AND HINDI) FOR O...IJCSEA Journal
India is a multilingual multi-script country. In every state of India there are two languages one is state local language and the other is English. For example in Andhra Pradesh, a state in India, the document may contain text words in English and Telugu script. For Optical Character Recognition (OCR) of such a bilingual document, it is necessary to identify the script before feeding the text words to the OCRs of individual scripts. In this paper, we are introducing a simple and efficient technique of script identification for Kannada, English and Hindi text words of a printed document. The proposed approach is based on the horizontal and vertical projection profile for the discrimination of the three scripts. The feature extraction is done based on the horizontal projection profile of each text words. We analysed 700 different words of Kannada, English and Hindi in order to extract the discrimination features and for the development of knowledge base. We use the horizontal projection profile of each text word and based on the horizontal projection profile we extract the appropriate features. The proposed system is tested on 100 differentdocument images containing more than 1000 text words of each script and a classification rate of 98.25%, 99.25% and 98.87% is achieved for Kannada, English and Hindi respectively.
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
OCR-THE 3 LAYERED APPROACH FOR DECISION MAKING STATE AND IDENTIFICATION OF TE...ijaia
Optical Character recognition is the method of digitalization of hand and type written or printed text into
machine-encoded form and is superfluity of the various applications of envision of human’s life. In present
human life OCR has been successfully using in finance, legal, banking, health care and home need
appliances. India is a multi cultural, literature and traditional scripted country. Telugu is the southern
Indian language, it is a syllabic language, symbol script represents a complete syllable and formed with the
conjunct mixed consonants in their representation. Recognition of mixed conjunct consonants is critical
than the normal consonants, because of their variation in written strokes, conjunct maxing with pre and
post level of consonants. This paper proposes the layered approach methodology to recognize the
characters, conjunct consonants, mixed- conjunct consonants and expressed the efficient classification of
the hand written and printed conjunct consonants. This paper implements the Advanced Fuzzy Logic system
controller to take the text in the form of written or printed, collected the text images from the scanned file,
digital camera, Processing the Image with Examine the high intensity of images based on the quality
ration, Extract the image characters depends on the quality then check the character orientation and
alignment then to check the character thickness, base and print ration. The input image characters can
classify into the two ways, first way represents the normal consonants and the second way represents
conjunct consonants. Digitalized image text divided into three layers, the middle layer represents normal
consonants and the top and bottom layer represents mixed conjunct consonants. Here recognition process
starts from middle layer, and then it continues to check the top and bottom layers. The recognition process
treat as conjunct consonants when it can detect any symbolic characters in top and bottom layers of
present base character otherwise treats as normal consonants. The post processing technique applied to all
three layered characters. Post processing of the image: concentrated on the image text readability and
compatibility, if the readability is not process then repeat the process again. In this recognition process
includes slant correction, thinning, normalization, segmentation, feature extraction and classification. In
the process of development of the algorithm the pre-processing, segmentation, character recognition and
post-processing modules were discussed. The main objectives to the development of this paper are: To
develop the classification, identification of deference prototyping for written and printed consonants,
conjunct consonants and symbols based on 3 layered approaches with different measurable area by using
fuzzy logic and to determine suitable features for handwritten character recognition.
ARABIC ONLINE HANDWRITING RECOGNITION USING NEURAL NETWORKijaia
This article presents the development of an Arabic online handwriting recognition system. To develop our
system, we have chosen the neural network approach. It offers solutions for most of the difficulties linked
to Arabic script recognition. We test the approach with our collected databases. This system shows a good
result and it has a high accuracy (98.50% for characters, 96.90% for words).
Signboard Text Translator: A Guide to TouristIJECEIAES
The travelers face troubles in understanding the signboards which are written in local lan- guage. The travelers can rely on smart phone for traveling purposes. Smart phones become most popular in recent years in terms of market value and the number of useful applications to the users. This work intends to build up a web application that can recognize the English content present on signboard pictures captured using a smart phone, translate the content from English to Telugu, and display the translated Telugu text back onto the screen of the phone. Experiments have been conducted on various signboard pictures and the outcomes demonstrate the viability of the proposed approach.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Preprocessing techniques for recognition of ancient Kannada epigraphsIJECEIAES
The Dravidian language Kannada is most spoken in the state of Karnataka, and because of its extensive library of epigraphs, including old manuscripts and inscriptions, it is regarded as a repository of knowledge. To make this knowledge more accessible to the people, efforts are being made to digitize the documents for optimized usage and storage using optical character recognition (OCR) but oftentimes these epigraphs are in poor conditions and the quality of the image being fed to the OCR model may not be good enough to achieve high accuracy of recognition and classification. Preprocessing techniques are used to make the dataset more reliable by improving the quality and accuracy of the model. Preprocessing methods including binarization, smoothing, edge detection, and segmentation help to increase the model's interpretability, decrease overfitting, and train it more quickly and with fewer resources. When applied to the epigraphs, these preprocessing approaches significantly increase the image quality and minimize noise, making it simpler to identify and digitize the text.
Optical character recognition (OCR) is process of classification of optical patterns contained in a digital image. The process of OCR Recognition involves several steps including pre-processing, segmentation, feature extraction, classification. Pre-processing is for done the basic operation on input image like noise reduction which remove the noisy signal from image. Segmentation stage for segment the given image into line by line and segment each character from segmented line. Future extraction calculates the characteristics of character. A Radial Basis Function Neural Network (RBFNN) is used to classification contains the database and does the comparison.
Because of the rapid growth in technology breakthroughs, including
multimedia and cell phones, Telugu character recognition (TCR) has recently
become a popular study area. It is still necessary to construct automated and
intelligent online TCR models, even if many studies have focused on offline
TCR models. The Telugu character dataset construction and validation using
an Inception and ResNet-based model are presented. The collection of 645
letters in the dataset includes 18 Achus, 38 Hallus, 35 Othulu, 34×16
Guninthamulu, and 10 Ankelu. The proposed technique aims to efficiently
recognize and identify distinctive Telugu characters online. This model's main
pre-processing steps to achieve its goals include normalization, smoothing,
and interpolation. Improved recognition performance can be attained by using
stochastic gradient descent (SGD) to optimize the model's hyperparameters.
An optical character recognition (OCR) refers to a process of converting the text document images into editable and searchable text. OCR process poses several challenges in particular in the Arabic language due to it has caused a high percentage of errors. In this paper, a method, to improve the outputs of the arabic optical character recognition (AOCR) Systems is suggested based on a statistical language model built from the available huge corpora. This method includes detecting and correcting non-word and real words error according to the context of the word in the sentence. The results show that the percentage of improvement in the results is up to (98%) as a new accuracy for AOCR output.
DIMENSION REDUCTION FOR SCRIPT CLASSIFICATION- PRINTED INDIAN DOCUMENTSijait
Automatic identification of a script in a given document image facilitates many important applications such
as automatic archiving of multilingual documents, searching online archives of document images and for
the selection of script specific OCR in a multilingual environment. This paper provides a comparison study
of three dimension reduction techniques, namely partial least squares (PLS), sliced inverse regression (SIR)
and principal component analysis (PCA), and evaluates the relative performance of classification
procedures incorporating those methods
Dimension Reduction for Script Classification - Printed Indian Documentsijait
Automatic identification of a script in a given document image facilitates many important applications such as automatic archiving of multilingual documents, searching online archives of document images and for the selection of script specific OCR in a multilingual environment. This paper provides a comparison study of three dimension reduction techniques, namely partial least squares (PLS), sliced inverse regression (SIR)
and principal component analysis (PCA), and evaluates the relative performance of classification procedures incorporating those methods. For given script we extracted different features like Gray Level Co-occurrence Method (GLCM) and Scale invariant feature transform (SIFT) features. The features are
extracted globally from a given text block which does not require any complex and reliable segmentation of the document image into lines and characters. Extracted features are reduced using various dimension reduction techniques. The reduced features are fed into Nearest Neighbor classifier. Thus the proposed
scheme is efficient and can be used for many practical
pplications which require processing large volumes
of data. The scheme has been tested on 10 Indian scripts and found to be robust in the process of scanning and relatively insensitive to change in font size. This proposed system achieves good classification accuracy on a large testing data set.
An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...iosrjce
Segmentation technique plays a major role in scripting the documents for extraction of various
features. Many researchers are doing various research works in this field to make the segmenting process
simple as well as efficient. In this paper a simple segmentation technique for both the line and word
segmentation of a script document has been proposed. The main objective of this technique is to recognize the
spaces that separate two text lines.For the Word segmentation technique also similar procedure is followed. In
this work ,three different scanned document have been taken as input images for both line and word
segmentation techniques. The results found were outstanding with average accuracy for both line and word. It
provides 100% accuracy for line segmentation and 100% for line segmentation as well. Evaluation results show
that our method outperforms several competing methods.
The Art Pastor's Guide to Sabbath | Steve ThomasonSteve Thomason
What is the purpose of the Sabbath Law in the Torah. It is interesting to compare how the context of the law shifts from Exodus to Deuteronomy. Who gets to rest, and why?
How to Make a Field invisible in Odoo 17Celine George
It is possible to hide or invisible some fields in odoo. Commonly using “invisible” attribute in the field definition to invisible the fields. This slide will show how to make a field invisible in odoo 17.
We all have good and bad thoughts from time to time and situation to situation. We are bombarded daily with spiraling thoughts(both negative and positive) creating all-consuming feel , making us difficult to manage with associated suffering. Good thoughts are like our Mob Signal (Positive thought) amidst noise(negative thought) in the atmosphere. Negative thoughts like noise outweigh positive thoughts. These thoughts often create unwanted confusion, trouble, stress and frustration in our mind as well as chaos in our physical world. Negative thoughts are also known as “distorted thinking”.
The Indian economy is classified into different sectors to simplify the analysis and understanding of economic activities. For Class 10, it's essential to grasp the sectors of the Indian economy, understand their characteristics, and recognize their importance. This guide will provide detailed notes on the Sectors of the Indian Economy Class 10, using specific long-tail keywords to enhance comprehension.
For more information, visit-www.vavaclasses.com
This is a presentation by Dada Robert in a Your Skill Boost masterclass organised by the Excellence Foundation for South Sudan (EFSS) on Saturday, the 25th and Sunday, the 26th of May 2024.
He discussed the concept of quality improvement, emphasizing its applicability to various aspects of life, including personal, project, and program improvements. He defined quality as doing the right thing at the right time in the right way to achieve the best possible results and discussed the concept of the "gap" between what we know and what we do, and how this gap represents the areas we need to improve. He explained the scientific approach to quality improvement, which involves systematic performance analysis, testing and learning, and implementing change ideas. He also highlighted the importance of client focus and a team approach to quality improvement.
Instructions for Submissions thorugh G- Classroom.pptxJheel Barad
This presentation provides a briefing on how to upload submissions and documents in Google Classroom. It was prepared as part of an orientation for new Sainik School in-service teacher trainees. As a training officer, my goal is to ensure that you are comfortable and proficient with this essential tool for managing assignments and fostering student engagement.