The document discusses four different methods for Bangla handwritten digit recognition. Method 1 uses preprocessing techniques like binarization, noise reduction, and segmentation followed by feature extraction and classification with a CNN. It achieves 94% accuracy. Method 2 also uses a CNN called MathNET with data augmentation, achieving 97% accuracy. Method 3 uses preprocessing, HOG feature extraction, and an SVM classifier, achieving 97.08% accuracy. Method 4 develops a dataset, performs data augmentation, uses a multi-layer CNN model with ensembling, and achieves 96.788% accuracy even on noisy images. The methods demonstrate high and improving recognition accuracy for Bangla handwritten digits.
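The ensembling step in Method 4 is not spelled out above; as a minimal sketch, under the assumption that it combines per-model digit predictions, a majority vote could look like this (the function name and tie-breaking rule are illustrative, not taken from the paper):

```python
from collections import Counter

def ensemble_vote(predictions):
    """Majority vote over the digit predicted by each model for one image.

    predictions: list of digit labels, one per model in the ensemble.
    Ties break toward the smallest label so the result is deterministic.
    """
    counts = Counter(predictions)
    return max(counts.items(), key=lambda kv: (kv[1], -kv[0]))[0]
```

With predictions [3, 3, 7] the vote returns 3; real systems often weight each vote by the model's softmax confidence instead of counting plainly.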
This document presents a simple signature recognition system that uses invariant central moments and a modified Zernike moment for feature extraction. The system is divided into preprocessing, feature extraction, and recognition/verification stages. In preprocessing, the input signature image is converted to grayscale and then binary, and the region of interest is extracted. Feature extraction uses invariant central moments and Zernike moments to capture shape features. Recognition and verification are performed with a backpropagation neural network, chosen for its high accuracy and low computational complexity. The system was tested on a database of 500 signatures from 50 individuals and achieved suitable performance for signature verification.
This document summarizes a research paper about a simple signature recognition system designed using MATLAB. The system extracts features from signatures using invariant central moment and modified Zernike moment for invariant feature extraction. It is divided into preprocessing, feature extraction, and recognition/verification. Preprocessing prepares the signature image for processing. Feature extraction uses invariant central moments and Zernike moments. Recognition uses a backpropagation neural network for classification. The system was tested on a database of 500 signatures from 50 individuals, achieving high accuracy and low computational complexity.
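The invariant central moments both summaries mention can be sketched directly. Below is a minimal NumPy version; the exact moment orders and normalization the paper uses are not given, so the helper names and the scale normalization here are assumptions:

```python
import numpy as np

def central_moment(img, p, q):
    """Central moment mu_pq of a binary image (nonzero pixels = foreground).

    Moments are taken about the foreground centroid, which makes them
    invariant to where the signature sits inside the image frame.
    """
    ys, xs = np.nonzero(img)
    xbar, ybar = xs.mean(), ys.mean()
    return float((((xs - xbar) ** p) * ((ys - ybar) ** q)).sum())

def normalized_moment(img, p, q):
    """Scale-normalized central moment eta_pq (adds scale invariance)."""
    mu00 = central_moment(img, 0, 0)  # equals the foreground area
    return central_moment(img, p, q) / mu00 ** ((p + q) / 2 + 1)
```

Because the moments are computed about the centroid, translating the signature inside the frame leaves them unchanged, which is the invariance property the system relies on.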
IOSR Journal of Mechanical and Civil Engineering (IOSR-JMCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of mechanical and civil engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in mechanical and civil engineering. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Faster Training Algorithms in Neural Network Based Approach For Handwritten T... (CSCJournals)
Handwritten text and character recognition is a challenging task compared with recognition of handwritten numerals and machine-printed text, owing to the large variability of handwriting. In principle there is a one-step, self-sufficient deterministic solution to such recognition problems: invert the Hessian matrix and multiply the inverse by the first-order local gradient vector. In practice, however, when the neural network is large the inversion of the Hessian matrix becomes unmanageable, and a further condition must hold, namely that the Hessian matrix be positive definite, which may not be satisfied. In these cases iterative, recursive models are used instead. Research over the past decade has shown that neural-network-based approaches give the most reliable performance in handwritten character and text recognition, but recognition performance depends on several factors: the number of training samples, the reliability and number of features per character, training time, and the variety of handwriting. Important features are collected from different styles of handwriting and fed to the neural network for training. More features do increase test accuracy, but they also lengthen the time needed for the error curve to converge. To reduce training time, a proper training algorithm should be chosen so that the system reaches the best training and test accuracy in the least possible time. We have evaluated several second-order conjugate gradient algorithms for training the network and found the Scaled Conjugate Gradient (SCG) algorithm to be the fastest for our application: training with SCG takes minimum time with excellent test accuracy. A scanned handwritten text is taken as input and character-level segmentation is performed.
Important and reliable features are extracted from each character and used as input to a neural network for training. When the error falls to a satisfactory level (10^-12), the weights are accepted for testing on a test script. Finally, a lexicon-matching algorithm resolves minor misclassifications.
Hangul Recognition Using Support Vector Machine (Editor IJCATR)
Recognition of Hangul images is more difficult than recognition of Latin script because of Hangul's structural arrangement: Hangul characters are composed in two dimensions, while Latin runs only left to right. This research builds a system that converts Hangul images into Latin text for use as learning material for reading Hangul. The recognition system has three steps. The first is preprocessing: binarization, segmentation via connected-component labeling, and thinning with the Zhang-Suen algorithm to reduce each pattern to its skeleton. The second is feature extraction from each character image using the chain code method. The third is classification using a Support Vector Machine (SVM) with several kernels. The system handles both letter images and Hangul word recognition. There are 34 letters, each with 15 different patterns, giving 510 patterns in total, divided into 3 data scenarios. The best result, 94.7%, was achieved with the polynomial and radial basis function SVM kernels; recognition accuracy grows with the amount of training data. Word recognition was applied to type 2 Hangul words with 6 patterns each, the patterns differing by font. The training fonts were Batang, Dotum, Gaeul, Gulim, and Malgun Gothic; Arial Unicode MS was used for testing. The lowest word accuracy, 69%, came from the radial basis function kernel, while the linear and polynomial kernels both achieved 72%.
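The chain code step above can be sketched as a generic 8-direction Freeman chain code; the paper's exact formulation is not given, so the direction numbering and image-coordinate convention here are assumptions:

```python
# 8-direction Freeman codes in image coordinates (y grows downward):
# 0 = E, 1 = NE, 2 = N, 3 = NW, 4 = W, 5 = SW, 6 = S, 7 = SE.
DIRECTIONS = {(1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
              (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7}

def chain_code(boundary):
    """Encode an ordered list of (x, y) boundary pixels as direction codes.

    Consecutive pixels must be 8-neighbors; each step contributes one code,
    so the shape is described independently of its absolute position.
    """
    return [DIRECTIONS[(x2 - x1, y2 - y1)]
            for (x1, y1), (x2, y2) in zip(boundary, boundary[1:])]
```

The resulting code sequence is what gets fed to the SVM as a positional-invariant description of the thinned character outline.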
Segmentation and recognition of handwritten digit numeral string using a mult... (ijfcstjournal)
In this paper, a Multi-Layer Perceptron (MLP) neural network model is proposed for recognizing unconstrained offline handwritten numeral strings. The numeral strings are segmented and isolated numerals are obtained using a connected component labeling (CCL) algorithm. The structural part of the model is handled by the MLP network. The paper also presents a new technique to remove slope and slant from handwritten numeral strings, normalize the size of the text images, and classify them with supervised learning methods. Experimental results on a database of 102 numeral-string patterns written by 3 different people show a recognition rate of 99.7% on the individual digits contained in the strings, including both skewed and slanted data.
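The connected component labeling step used to isolate numerals can be sketched with a simple BFS flood fill; the paper's exact CCL variant is not specified, so the 4-connectivity and function name here are assumptions:

```python
from collections import deque

def label_components(grid):
    """Label 4-connected foreground regions of a binary grid (list of lists).

    Returns a parallel grid of labels (0 = background, 1..n = components)
    plus the component count, the way isolated numerals are pulled out of
    a segmented numeral string.
    """
    h, w = len(grid), len(grid[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for sy in range(h):
        for sx in range(w):
            if grid[sy][sx] and not labels[sy][sx]:
                current += 1            # start a new component here
                labels[sy][sx] = current
                queue = deque([(sy, sx)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current
```

Each labeled region can then be cropped to its bounding box and passed to the MLP as one isolated numeral.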
Cursive Handwriting Recognition System using Feature Extraction and Artif... (IRJET Journal)
The document describes a system for recognizing cursive handwriting using feature extraction and an artificial neural network. It involves preprocessing scanned images, segmenting them into individual characters, extracting features from the characters using a diagonal scanning method, and classifying the characters using a neural network. This approach provides higher recognition accuracy compared to conventional methods. The key steps are preprocessing images, segmenting into characters, extracting 54 features from each character by moving along diagonals in a grid, and training a neural network classifier on the extracted features.
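The 54-feature diagonal extraction can be sketched as follows. The zone and image sizes (10x10 zones over a 90x60 character) follow the common diagonal-feature formulation and are assumptions here, not confirmed by the summary:

```python
import numpy as np

def diagonal_features(char_img):
    """54 diagonal features from a 90x60 character image (values in [0, 1]).

    The image is split into 54 zones of 10x10 pixels; within each zone the
    mean along each of the 19 diagonals is averaged into a single feature.
    """
    assert char_img.shape == (90, 60)
    feats = []
    for r in range(0, 90, 10):
        for c in range(0, 60, 10):
            zone = char_img[r:r + 10, c:c + 10]
            # offsets -9..9 enumerate all 19 diagonals of a 10x10 zone
            diag_means = [zone.diagonal(k).mean() for k in range(-9, 10)]
            feats.append(float(np.mean(diag_means)))
    return feats
```

The 54 resulting values form the input vector for the neural network classifier described above.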
Reconstructing the Path of the Object based on Time and Date OCR in Surveilla... (ijtsrd)
The inclusion of time-based queries in video indexing applications is enabled by the recognition of time and date stamps in CCTV video. In this paper, we propose a system for reconstructing the path of an object across surveillance cameras based on time and date optical character recognition. Since the time and date region has no explicit boundary, a Discrete Cosine Transform (DCT) method is applied to locate it. Once the time and date region is located, it is segmented and features for its symbols are extracted. A back-propagation neural network recognizes the features, and the results are stored in a database, from which the system reconstructs the object's path over time. The proposed system is implemented in MATLAB. Pyae Phyo Thu | Mie Mie Tin | Ei Phyu Win | Cho Thet Mon, "Reconstructing the Path of the Object based on Time and Date OCR in Surveillance System", International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3, Issue-5, August 2019. URL: https://www.ijtsrd.com/papers/ijtsrd27981.pdf Paper URL: https://www.ijtsrd.com/home-science/education/27981/reconstructing-the-path-of-the-object-based-on-time-and-date-ocr-in-surveillance-system/pyae-phyo-thu
IRJET- Optical Character Recognition using Image Processing (IRJET Journal)
This document discusses optical character recognition (OCR) using image processing. It begins with an abstract that defines OCR as the conversion of typed, handwritten, or printed text into machine-encoded text from scanned documents or photos. The document then outlines the components and steps of a typical OCR system, including optical scanning, preprocessing, segmentation, character extraction, and recognition. It describes using techniques like thresholding, smoothing, projection profiles, clustering algorithms, and creating a character database to classify and recognize characters for conversion to machine-encoded text.
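The projection-profile segmentation mentioned above can be sketched minimally: sum the ink pixels per row, and treat each run of non-blank rows as one text line. The function name and the zero-threshold are assumptions, since the document does not give details:

```python
def text_line_bands(binary_rows):
    """Find text-line bands from a horizontal projection profile.

    binary_rows: 2D list where 1 marks an ink pixel. Rows with a nonzero
    pixel sum belong to a line; each maximal run of such rows becomes a
    (top, bottom) band of row indices.
    """
    profile = [sum(row) for row in binary_rows]
    bands, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a text band begins
        elif not value and start is not None:
            bands.append((start, i - 1))   # blank row ends the band
            start = None
    if start is not None:
        bands.append((start, len(profile) - 1))
    return bands
```

Applying the same idea to column sums inside each band then separates individual characters, the next step in the pipeline above.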
This document discusses a method for handwritten character recognition using a K-nearest neighbors (K-NN) classification algorithm. It begins by introducing the problem of handwritten character recognition and the challenges involved. It then describes the main steps of the proposed method: preprocessing the image data, extracting features, and classifying characters using K-NN. The document tests the method on the MNIST dataset of handwritten digits, achieving an accuracy of 97.67%. It concludes that the method is able to accurately recognize handwritten characters independently of size, font, or writer style.
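The K-NN classification step can be sketched in a few lines; the distance metric and k value below are common defaults and are assumptions, since the document does not state them:

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify a feature vector by majority label among its k nearest
    training samples under squared Euclidean distance.

    train: list of (feature_vector, label) pairs.
    """
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda sample: dist(sample[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

On real MNIST data the feature vectors would be the 784 pixel intensities (or features extracted from them), and efficient implementations use KD-trees or vectorized distance computation rather than a full sort.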
Feature Extraction and Feature Selection using Textual Analysis (vivatechijri)
After pre-processing the images in a character recognition system, the images are segmented based on certain characteristics known as "features". The feature space identified for character recognition, however, spans a huge dimensionality. To address this, feature selection and feature extraction methods are used. In this paper we discuss the different techniques for feature extraction and feature selection, and how these techniques reduce the dimensionality of the feature space to improve the performance of text categorization.
Comparative study of two methods for Handwritten Devanagari Numeral Recognition (IOSR Journals)
Abstract: In this paper two different methods for numeral recognition are proposed and their results are compared. The objective is to provide an efficient and reliable method for recognition of handwritten numerals. The first method employs a grid-based feature extraction and recognition algorithm: the features of the image are extracted using the grid technique, and this feature set is compared with the feature set of the database image for classification. The second method uses the Image Centroid Zone and Zone Centroid Zone algorithms for feature extraction, and the features are fed to an Artificial Neural Network for recognition of the input image. Machine text recognition is an important research area because of its applications in banks, post offices, hospitals, and similar settings.
Keywords: Handwritten Numeral Recognition, Grid Technique, ANN, Feature Extraction, Classification.
OCR for Gujarati Numeral using Neural Network (ijsrd.com)
This paper describes an optical character recognition (OCR) system for handwritten Gujarati numerals. A great deal of work exists for Indian languages such as Hindi, Tamil, Bengali, Malayalam, and Gurmukhi, but Gujarati is a language for which hardly any work is traceable, especially for handwritten characters. In this work a neural network is presented for handwritten Gujarati numeral recognition: a multilayered feed-forward neural network is proposed for classification of the digits. Features of the Gujarati digits are extracted from four different profiles of each digit. Thinning and skew correction are also applied as preprocessing of the handwritten digits before classification. The work achieved approximately 81% accuracy on Gujarati handwritten numerals.
Tracking number plate from vehicle using... (ijfcstjournal)
This document presents a new algorithm in MATLAB to extract vehicle number plates from images in various lighting conditions. The algorithm uses preprocessing techniques like grayscale conversion, dilation, and edge detection. It then segments the region of interest containing the number plate and extracts it. Individual characters are then segmented and recognized using template matching. The algorithm achieves 99% accuracy on images taken from a fixed angle and distance under controlled conditions. It is less accurate for images with problematic backgrounds or lighting. The algorithm provides an automated way to extract number plates for applications like traffic monitoring, parking management, and stolen vehicle identification.
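The template matching step for character recognition can be sketched minimally: score each candidate template against the segmented glyph and keep the best match. Real systems normalize size and use correlation scores; the sum-of-absolute-differences scoring and function name below are simplifying assumptions:

```python
def match_character(glyph, templates):
    """Pick the template label with the smallest sum of absolute
    differences (SAD) to the glyph.

    glyph: 2D list of pixel values for one segmented character.
    templates: list of (label, image) pairs with the same dimensions.
    """
    def sad(a, b):
        return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return min(templates, key=lambda t: sad(glyph, t[1]))[0]
```

A lower SAD means a closer pixel-wise match, so the returned label is the template most similar to the extracted plate character.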
Bangla Optical Digits Recognition using Edge Detection Method (IOSR Journals)
Abstract: This paper presents Bangla Optical Digit Recognition (ODR) using an edge detection technique. In this method, a Bangla digit image is converted into gray-scale and stored as an M by N array. The input data are off-line printed digit images collected from computer-generated images, scanned documents, or printed text. The gray-scale image is addressed as an M by N array whose entries are 255 for pure white, 0 (zero) for pure black, and intermediate values for mixes of white and dark in the image. In the next step, four edge-touch points, together with each touch point's ratio, are used as parameters to determine each Bangla digit uniquely. Keywords: Edge, image, gray-scale, matrix, ODR.
Numeral recognition is an important research direction in the field of pattern recognition, with broad application prospects. Targeting the four arithmetic operations in common printed formats, this article adopts a hybrid recognition method applied to automatic calculation. The method mainly uses a BP neural network and template matching to distinguish numerals and operators, in order to increase operation speed and recognition accuracy. Sample images of four arithmetic operations were extracted from printed books and used to test the performance of the proposed recognition method. The experiments show that the method achieves a correct recognition rate of 96% and a correct calculation rate of 89%.
Document Layout analysis using Inverse Support Vector Machine (I-SVM) for Hin... (IRJET Journal)
This document summarizes a research paper that proposes a method for document layout analysis of Hindi newspaper images using inverse support vector machines (I-SVM). The method extracts blocks of text and graphics from newspaper images using bounding boxes. Feature vectors are extracted from each block and classified using an I-SVM classifier to determine the block type and analyze the newspaper layout. Experimental results demonstrate the effectiveness of the proposed algorithm for newspaper layout analysis and article extraction.
IRJET- Document Layout analysis using Inverse Support Vector Machine (I-SV... (IRJET Journal)
This document discusses using inverse support vector machines (I-SVM) for document layout analysis of Hindi newspaper images for optical character recognition. It proposes a framework that uses bounding box segmentation, feature extraction using subline direction and bounding box shape detection, and I-SVM classification. Preprocessing steps include binarization, removing horizontal/vertical lines, and morphological operations. Experimental results show the algorithm can accurately label blocks in newspaper layouts and extract articles for OCR.
AN INTEGRATED APPROACH TO CONTENT BASED IMAGE RETRIEVAL by Madhu (Madhu Rock)
This document summarizes an integrated approach to content-based image retrieval. It discusses extracting both color and texture features from images using color moments and local binary patterns. The system is tested on a database of 1000 images across 10 classes. Results show the integrated approach of using both color and texture features provides more accurate retrievals than using either feature alone. Evaluation metrics like precision, recall and accuracy are calculated to quantitatively analyze the system's performance. Overall, the proposed multi-feature approach is found to improve content-based image retrieval compared to single-feature methods.
DCT based Steganographic Evaluation parameter analysis in Frequency domain by... (IOSR Journals)
This document analyzes DCT-based steganography using a modified JPEG luminance quantization table to improve evaluation parameters like PSNR, mean square error, and capacity. The authors propose modifying the default 8x8 quantization table by adjusting frequency values in 4 bands to increase image quality for the embedded stego image. Experimental results on test images show that using the modified table improves PSNR, decreases mean square error, and increases maximum embedding capacity compared to the default table. Therefore, the proposed method allows more secret data to be hidden with less distortion and improved image quality.
This document analyzes DCT-based steganography using a modified JPEG luminance quantization table to improve embedding capacity and image quality. The authors propose modifying the default 8x8 quantization table by changing frequency values to increase the peak signal-to-noise ratio and capacity while decreasing the mean square error of embedded images. Experimental results on test images show increased capacity, PSNR and reduced error when using the modified versus default table, indicating improved stego image quality. The proposed method aims to securely embed more data with less distortion than traditional DCT-based steganography.
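The evaluation parameters both summaries cite, mean square error and PSNR, have standard definitions that can be sketched directly (function names are illustrative; the papers' exact test setup is not given):

```python
import math

def mse(original, stego):
    """Mean square error between two equal-sized grayscale images
    given as 2D lists of pixel values."""
    diffs = [(a - b) ** 2 for ra, rb in zip(original, stego)
             for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def psnr(original, stego, peak=255):
    """Peak signal-to-noise ratio in dB; higher means less distortion
    introduced by embedding the secret data."""
    error = mse(original, stego)
    return float('inf') if error == 0 else 10 * math.log10(peak ** 2 / error)
```

These are the quantities the modified quantization table is tuned to improve: a smaller MSE and a larger PSNR for the same embedding capacity.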
Enhancement and Segmentation of Historical Records (csandit)
Document Analysis and Recognition (DAR) aims to extract automatically the information in a document and also to aid human comprehension. Automatic processing of degraded historical documents is an application of document image analysis that faces many difficulties due to storage conditions and the complexity of the script. The main interest of enhancing historical documents is to remove undesirable artifacts that appear in the background and to highlight the foreground, so as to enable automatic recognition of the documents with high accuracy. This paper addresses pre-processing and segmentation of ancient scripts, as an initial step toward automating the task of an epigraphist in reading and deciphering inscriptions. Pre-processing involves enhancement of degraded ancient document images, achieved through four different spatial filters for smoothing or sharpening, namely Median, Gaussian blur, Mean, and Bilateral filters, with different mask sizes. This is followed by binarization of the enhanced image using Otsu's thresholding algorithm to highlight the foreground information. In the second phase, segmentation is carried out using Drop Fall and Water Reservoir approaches to obtain sampled characters, which can be used in later stages of OCR. The system showed good results when tested on nearly 150 samples of variously degraded epigraphic images, giving the best enhanced output with a 4x4 mask for the Median filter, a 2x2 mask for Gaussian blur, and 4x4 masks for the Mean and Bilateral filters. The system can effectively sample characters from enhanced images, giving segmentation rates of 85%-90% for both the Drop Fall and Water Reservoir techniques.
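The Otsu binarization step named above has a standard formulation that can be sketched in NumPy; this is a generic implementation, not the paper's code:

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's global threshold for an 8-bit grayscale NumPy array.

    Exhaustively picks the threshold that maximizes the between-class
    variance of background and foreground pixel populations.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    cum_w = np.cumsum(hist)                    # pixels at or below t
    cum_m = np.cumsum(hist * np.arange(256))   # intensity mass at or below t
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0, w1 = cum_w[t], total - cum_w[t]
        if w0 == 0 or w1 == 0:
            continue                           # one class would be empty
        m0 = cum_m[t] / w0
        m1 = (cum_m[-1] - cum_m[t]) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Pixels at or below the returned threshold become background and the rest foreground, which is exactly the separation the enhancement stage prepares for.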
Finding similarities between structured documents as a crucial stage for gene... (Alexander Decker)
This document discusses methods for classifying structured documents by finding similarities between them. It describes how pre-processing steps like thresholding and size normalization are used. A key step is tilting documents based on detecting reference lines using clustering. Features are then extracted from the tilted images, like reference lines and logos, to classify documents. The proposed method calculates tilt angle based on the largest cluster of connected line pixels to properly orient documents for classification.
The document discusses optical character recognition (OCR): its history, current capabilities, and challenges. OCR is a technology that uses optical mechanisms to recognize text characters automatically, much as humans read, converting scanned images of text into machine-encoded text. The summary notes some key difficulties in OCR, such as distinguishing similar characters like 'O' and '0' or reading text against complex backgrounds. It also previews the paper, which analyzes the advancements and limitations of existing OCR systems to determine whether OCR is suitable for different needs.
This is the Bangla Handwritten Digit Recognition report; it can serve as a helpful reference.
- Bengali is the world's fifth most spoken language, with 265 million native and non-native speakers accounting for 4% of the global population.
- Despite the large number of Bengali speakers, very little research has been conducted on Bangla handwritten digit recognition.
- The applications of the BHwDR system are wide, from postal code digit recognition to license plate recognition, and from digit recognition on cheques in the banking system to exam paper registration number recognition.
Introduction to image processing and pattern recognitionSaibee Alam
This PowerPoint presentation provides a brief introduction to image processing and pattern recognition, surveys related research papers, and closes with a conclusion.
Text Extraction and Recognition Using Median FilterIRJET Journal
This document discusses a method for extracting and recognizing text from digital comic images. It begins with an introduction describing the challenges of text extraction from complex comic images. It then describes the specific method used, which includes preprocessing the image with a median filter to reduce noise, detecting text "balloons" using connected component labeling algorithms, and then applying optical character recognition with an image centroid concept to extract and recognize the text. The key aspects of the proposed method are preprocessing with median filtering for edge preservation, balloon detection using connected component labeling, and using image centroid zones for feature extraction in optical character recognition of the text.
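The balloon-detection step above relies on connected component labeling. A minimal sketch of the idea (a 4-connected BFS flood fill in numpy; an illustration, not the paper's implementation):

```python
import numpy as np
from collections import deque

def label_components(binary):
    """Label 4-connected components of a binary image via BFS flood fill."""
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    count = 0
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and labels[sy, sx] == 0:
                count += 1
                labels[sy, sx] = count
                q = deque([(sy, sx)])
                while q:
                    y, x = q.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = count
                            q.append((ny, nx))
    return labels, count
```

In a comic page, unusually large components with low ink density would be candidate text balloons.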
The document summarizes the key steps in an optical character recognition (OCR) system for recognizing printed text:
1. Image acquisition involves obtaining the image, which can be done using scanners or digital cameras.
2. Pre-processing prepares the image for recognition through techniques like converting to grayscale, skew correction, binarization, noise reduction, and thinning.
3. Segmentation separates the image into lines and individual characters.
4. Recognition identifies the characters by comparing features or templates to stored models.
The paper then discusses specific algorithms that could implement grayscale conversion, skew correction, and other steps in the OCR system.
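Two of the steps above, grayscale conversion and binarization, can be sketched in a few lines of numpy. This is an illustrative example using standard BT.601 luminance weights and a fixed threshold, not any particular paper's algorithm:

```python
import numpy as np

def to_grayscale(rgb):
    """Weighted luminance (ITU-R BT.601 coefficients)."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def binarize(gray, threshold=128):
    """1 = ink (dark pixel), 0 = background (light pixel)."""
    return (gray < threshold).astype(np.uint8)
```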
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...KhondokerAbuNaim
The document proposes a comparative analysis of deep learning models for flower recognition and health prediction. Specifically, it aims to:
1) Build and evaluate multiple deep learning models like CNNs and ResNets on public flower datasets to identify the most accurate and efficient architecture for flower classification.
2) Develop models like CNNs, LSTMs, and Transformers on health datasets for tasks such as disease diagnosis and outcome prediction, assessing performance on metrics like accuracy and AUC.
3) Analyze the strengths, weaknesses, computational requirements, and interpretability of different models to provide insights on applicability and improvements in flower recognition and health prediction.
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...KhondokerAbuNaim
The document presents a comparative analysis of deep learning models for flower recognition and health prediction. It describes a dataset of over 8,000 images of 24 flower types, augmented to over 14,000 images. Several deep learning models are evaluated using transfer learning and fine-tuning, including MobileNet, DenseNet201, VGG16, ResNet50V2, Xception, EfficientNetB0, and EfficientNetV2B0. DenseNet201 achieved the highest validation accuracy of 99.06% with fine-tuning and 97.64% with transfer learning, outperforming other models. The study recommends DenseNet201 and Xception for flower recognition and health prediction applications.
IRJET- Optical Character Recognition using Image ProcessingIRJET Journal
This document discusses optical character recognition (OCR) using image processing. It begins with an abstract that defines OCR as the conversion of typed, handwritten, or printed text into machine-encoded text from scanned documents or photos. The document then outlines the components and steps of a typical OCR system, including optical scanning, preprocessing, segmentation, character extraction, and recognition. It describes using techniques like thresholding, smoothing, projection profiles, clustering algorithms, and creating a character database to classify and recognize characters for conversion to machine-encoded text.
This document discusses a method for handwritten character recognition using a K-nearest neighbors (K-NN) classification algorithm. It begins by introducing the problem of handwritten character recognition and the challenges involved. It then describes the main steps of the proposed method: preprocessing the image data, extracting features, and classifying characters using K-NN. The document tests the method on the MNIST dataset of handwritten digits, achieving an accuracy of 97.67%. It concludes that the method is able to accurately recognize handwritten characters independently of size, font, or writer style.
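The K-NN classification step described above can be sketched as follows. This is an illustrative numpy version (Euclidean distance plus majority vote), not the paper's code, and the MNIST loading is omitted:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training samples."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest].tolist()).most_common(1)[0][0]
```

For digit images, each row of `X_train` would be a flattened, size-normalized character image.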
Feature Extraction and Feature Selection using Textual Analysisvivatechijri
After pre-processing the images in character recognition systems, the images are segmented based on
certain characteristics known as "features". The feature space identified for character recognition, however,
ranges across a huge dimensionality. To solve this dimensionality problem, feature selection and feature
extraction methods are used. In this paper we discuss the different techniques for feature
extraction and feature selection, and how these techniques are used to reduce the dimensionality of the feature space
to improve the performance of text categorization.
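One of the simplest feature selection techniques of the kind surveyed here is variance-based filtering: drop the feature columns that vary least across samples. A minimal numpy sketch, illustrative only (the paper discusses several techniques, not specifically this one):

```python
import numpy as np

def select_by_variance(X, k):
    """Keep the k feature columns with the highest variance across samples."""
    order = np.argsort(X.var(axis=0))[::-1][:k]
    keep = np.sort(order)              # preserve original column order
    return keep, X[:, keep]
```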
Comparative study of two methods for Handwritten Devanagari Numeral RecognitionIOSR Journals
Abstract : In this paper two different methods for Numeral Recognition are proposed and their results are
compared. The objective of this paper is to provide an efficient and reliable method for recognition of
handwritten numerals. The first method employs a grid-based feature extraction and recognition algorithm. In this
method the features of the image are extracted using the grid technique, and this feature set is then compared
with the feature set of the database image for classification. The second method uses the Image Centroid Zone
and Zone Centroid Zone algorithms for feature extraction, and the features are applied to an Artificial Neural
Network for recognition of the input image. Machine text recognition is an important research area because of its
applications in many areas like banks, post offices, and hospitals.
Keywords: Handwritten Numeral Recognition, Grid Technique, ANN, Feature Extraction, Classification.
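The Image Centroid Zone idea can be sketched as follows: split the image into zones and, for each zone, average the distance of its ink pixels from the whole image's centroid. This numpy sketch is an illustrative approximation of the ICZ scheme, not the paper's exact feature definition:

```python
import numpy as np

def icz_features(img, zones=4):
    """Per-zone mean distance of ink pixels from the whole-image centroid."""
    ys, xs = np.nonzero(img)
    cy, cx = ys.mean(), xs.mean()                      # image centroid
    zh, zw = img.shape[0] // zones, img.shape[1] // zones
    feats = []
    for i in range(zones):
        for j in range(zones):
            by, bx = np.nonzero(img[i*zh:(i+1)*zh, j*zw:(j+1)*zw])
            if by.size == 0:
                feats.append(0.0)                      # empty zone
            else:
                feats.append(np.hypot(by + i*zh - cy, bx + j*zw - cx).mean())
    return np.array(feats)
```

The resulting fixed-length vector (here 16 values for a 4x4 grid) is what gets fed to the neural network.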
OCR for Gujarati Numeral using Neural Networkijsrd.com
This paper presents an optical character recognition (OCR) system for handwritten Gujarati numerals. A great deal of work exists for Indian languages such as Hindi, Tamil, Bengali, Malayalam, and Gurmukhi, but Gujarati is a language for which hardly any work is traceable, especially for handwritten characters. In this work a neural network is presented for Gujarati handwritten numeral recognition. A multi-layered feed-forward neural network is suggested for classification of the numerals. The features of the Gujarati numerals are extracted from four different profiles of the digits. Thinning and skew correction are also performed as preprocessing of the handwritten numerals before classification. This work achieved approximately 81% accuracy for Gujarati handwritten numerals.
Tracking number plate from vehicle usingijfcstjournal
This document presents a new algorithm in MATLAB to extract vehicle number plates from images in various lighting conditions. The algorithm uses preprocessing techniques like grayscale conversion, dilation, and edge detection. It then segments the region of interest containing the number plate and extracts it. Individual characters are then segmented and recognized using template matching. The algorithm achieves 99% accuracy on images taken from a fixed angle and distance under controlled conditions. It is less accurate for images with problematic backgrounds or lighting. The algorithm provides an automated way to extract number plates for applications like traffic monitoring, parking management, and stolen vehicle identification.
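The template matching step can be illustrated with zero-mean normalized cross-correlation, a common way to score a segmented character against stored templates. This is a sketch of the general technique, not the paper's exact matcher:

```python
import numpy as np

def ncc(patch, template):
    """Zero-mean normalized cross-correlation, in [-1, 1]."""
    a = patch.astype(float) - patch.mean()
    b = template.astype(float) - template.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom else 0.0
```

Recognition then picks the template with the highest score for each segmented character.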
Bangla Optical Digits Recognition using Edge Detection MethodIOSR Journals
Abstract: This paper addresses Bangla Optical Digit Recognition (ODR) using an edge detection technique. In this method, a Bangla digit image is converted into gray-scale and stored as an M by N array. The input data are off-line printed digit images collected from computer-generated images, scanned documents, or printed text. The gray-scale image is addressed as an M by N array whose entries are 255 for fully white space, 0 (zero) for fully dark space, and intermediate values for mixed white and dark regions of the image. In the next step, four edge touch points, together with each touch point's ratio, are used as parameters to determine each Bangla digit uniquely. Keywords: Edge, image, gray-scale, matrix, ODR.
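The edge touch points mentioned in the abstract can be approximated by counting ink pixels along each border of the digit's bounding box. The following numpy sketch is an illustrative guess at such a feature, not the paper's exact definition:

```python
import numpy as np

def edge_touch_points(binary):
    """Count ink pixels touching each border of the digit's bounding box."""
    ys, xs = np.nonzero(binary)
    box = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    return {
        'top': int(box[0].sum()), 'bottom': int(box[-1].sum()),
        'left': int(box[:, 0].sum()), 'right': int(box[:, -1].sum()),
    }
```

Ratios between these counts give size-independent parameters for distinguishing digits.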
Numeral recognition is an important research direction in the field of pattern recognition, and it has
broad application prospects. Aiming at the four arithmetic operations in common printed formats, this article
adopts a multiple hybrid recognition method applied to automatic calculation. This method mainly
uses a BP neural network and a template matching method to distinguish the numerals and operators, in order
to increase the operation speed and recognition accuracy. Sample images of the four arithmetic operations are
extracted from printed books, and they are used for testing the performance of the proposed recognition
method. The experiments show that the method provides a correct recognition rate of 96% and a correct
calculation rate of 89%.
Document Layout analysis using Inverse Support Vector Machine (I-SVM) for Hin...IRJET Journal
This document summarizes a research paper that proposes a method for document layout analysis of Hindi newspaper images using inverse support vector machines (I-SVM). The method extracts blocks of text and graphics from newspaper images using bounding boxes. Feature vectors are extracted from each block and classified using an I-SVM classifier to determine the block type and analyze the newspaper layout. Experimental results demonstrate the effectiveness of the proposed algorithm for newspaper layout analysis and article extraction.
IRJET- Document Layout analysis using Inverse Support Vector Machine (I-SV...IRJET Journal
This document discusses using inverse support vector machines (I-SVM) for document layout analysis of Hindi newspaper images for optical character recognition. It proposes a framework that uses bounding box segmentation, feature extraction using subline direction and bounding box shape detection, and I-SVM classification. Preprocessing steps include binarization, removing horizontal/vertical lines, and morphological operations. Experimental results show the algorithm can accurately label blocks in newspaper layouts and extract articles for OCR.
AN INTEGRATED APPROACH TO CONTENT BASED IMAGERETRIEVAL by MadhuMadhu Rock
This document summarizes an integrated approach to content-based image retrieval. It discusses extracting both color and texture features from images using color moments and local binary patterns. The system is tested on a database of 1000 images across 10 classes. Results show the integrated approach of using both color and texture features provides more accurate retrievals than using either feature alone. Evaluation metrics like precision, recall and accuracy are calculated to quantitatively analyze the system's performance. Overall, the proposed multi-feature approach is found to improve content-based image retrieval compared to single-feature methods.
DCT based Steganographic Evaluation parameter analysis in Frequency domain by...IOSR Journals
This document analyzes DCT-based steganography using a modified JPEG luminance quantization table to improve evaluation parameters like PSNR, mean square error, and capacity. The authors propose modifying the default 8x8 quantization table by adjusting frequency values in 4 bands to increase image quality for the embedded stego image. Experimental results on test images show that using the modified table improves PSNR, decreases mean square error, and increases maximum embedding capacity compared to the default table. Therefore, the proposed method allows more secret data to be hidden with less distortion and improved image quality.
This document analyzes DCT-based steganography using a modified JPEG luminance quantization table to improve embedding capacity and image quality. The authors propose modifying the default 8x8 quantization table by changing frequency values to increase the peak signal-to-noise ratio and capacity while decreasing the mean square error of embedded images. Experimental results on test images show increased capacity, PSNR and reduced error when using the modified versus default table, indicating improved stego image quality. The proposed method aims to securely embed more data with less distortion than traditional DCT-based steganography.
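The embedding idea both summaries describe, quantizing a block's DCT coefficients and hiding data in them, can be sketched as follows. The flat quantization table, the coefficient position, and the LSB embedding rule below are illustrative assumptions, not the authors' modified luminance table:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis; D @ block @ D.T is the 2-D DCT of the block."""
    j = np.arange(n)
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j[None, :] + 1) * j[:, None] / (2 * n))
    D[0] /= np.sqrt(2)
    return D

def embed_bit(block, q_table, bit, pos=(3, 3)):
    """Quantize the block's DCT coefficients, then force one mid-frequency LSB to `bit`."""
    D = dct_matrix(block.shape[0])
    coeffs = np.round(D @ block @ D.T / q_table).astype(int)
    coeffs[pos] = (coeffs[pos] & ~1) | bit      # LSB embedding
    return coeffs

def extract_bit(coeffs, pos=(3, 3)):
    return int(coeffs[pos] & 1)
```

Lowering the quantization values in selected frequency bands, as the paper proposes, preserves more coefficient precision and hence raises PSNR and capacity.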
Enhancement and Segmentation of Historical Recordscsandit
Document Analysis and Recognition (DAR) aims to extract automatically the information in a document, and also supports human comprehension. The automatic processing of degraded
historical documents is an application of the document image analysis field, one confronted with many difficulties due to storage conditions and the complexity of the scripts. The main interest
of enhancement of historical documents is to remove undesirable artifacts that appear in the
background and to highlight the foreground, so as to enable automatic recognition of documents
with high accuracy. This paper addresses pre-processing and segmentation of ancient scripts as an initial step toward automating the task of an epigraphist in reading and deciphering inscriptions.
Pre-processing involves enhancement of degraded ancient document images, achieved through four different spatial filtering methods for smoothing or sharpening, namely Median,
Gaussian blur, Mean, and Bilateral filters, with different mask sizes. This is followed by
binarization of the enhanced image to highlight the foreground information, using Otsu's
thresholding algorithm. In the second phase, segmentation is carried out using Drop Fall and
Water Reservoir approaches to obtain sampled characters, which can be used in later stages of
OCR. The system showed good results when tested on nearly 150 samples of variously
degraded epigraphic images, giving the best enhanced output with a 4x4 mask size
for the Median filter, a 2x2 mask size for Gaussian blur, and 4x4 mask sizes for the Mean and Bilateral filters.
The system can effectively sample characters from enhanced images, with segmentation rates of 85%-90% for both the Drop Fall and Water Reservoir techniques.
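Two of the building blocks named above, median filtering and Otsu thresholding, can be sketched in plain numpy. These are illustrative reference implementations of the standard techniques, not the paper's code:

```python
import numpy as np

def median_filter(img, k=3):
    """k x k median smoothing, with edge replication at the borders."""
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

def otsu_threshold(gray):
    """Threshold that maximizes between-class variance of the histogram."""
    prob = np.bincount(gray.ravel(), minlength=256) / gray.size
    bins = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0 = prob[:t].sum()
        w1 = 1.0 - w0
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = (bins[:t] * prob[:t]).sum() / w0      # mean of the dark class
        mu1 = (bins[t:] * prob[t:]).sum() / w1      # mean of the bright class
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t
```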
Bangla Hand Written Digit Recognition presentation slide .pptxKhondokerAbuNaim
This document describes a project on Bangla handwritten digit recognition using deep learning models. It discusses preprocessing a dataset of 2500 training and 500 testing Bangla handwritten digit images. Two models - EfficientNetB0 and MobileNet - were trained using baseline, transfer learning, and fine-tuning methods. Fine-tuning achieved the best results, with 99.4% and 94% accuracy for EfficientNetB0 and MobileNet respectively. Limitations and future work are discussed to improve dataset quality and model performance.
The document is a project proposal for Bangla handwritten digit recognition using deep learning. It outlines collecting a dataset of Bangla handwritten digits with various variations, preprocessing the images, using a convolutional neural network model for feature extraction and classification, training the model on the dataset, evaluating the trained model on a test set, and developing a user interface to demonstrate recognition of input digits. The overall goal is to develop an accurate system for recognizing Bangla handwritten digits with applications in fields such as banking, mail, and document digitization.
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...KhondokerAbuNaim
Bangla alphabet handwritten recognition using deep learning is a process of automatically identifying and classifying handwritten Bangla characters using machine learning techniques. This involves training a deep learning model on a large dataset of handwritten Bangla characters to learn the features that distinguish each character from one another. The trained model can then be used to recognize new handwritten Bangla characters and convert them into digital text.
Assignment-1-NF.docx
Bangladesh Army University of Science and Technology
(BAUST)
Department of Computer Science and Engineering
Assignment #1, Winter 2023 Level-4 Term-I
Course Code: CSE 4131 Course Title: Artificial Neural Networks and Fuzzy
Systems
Submission Date: CO Number: CO2 Full Marks: 15
ID: 200101103 Name: Khondoker Abu Naim
Bangla Handwritten Digit Recognition
1. Introduction
Bangla handwritten digit recognition is a classical problem in the field of computer vision. The
system has many practical applications, such as OCR, postal code recognition, license plate
recognition, and bank cheque recognition, and recognizing Bangla digits in documents is
becoming increasingly important. Bangla has ten unique digits, so the recognition task is a
ten-class classification problem. Handwritten digit recognition is difficult because every person
has their own writing style. Our contribution targets the more challenging goal of achieving
robust performance and high accuracy on a large, unbiased, unprocessed, and highly augmented
"bangla-digit" dataset. The dataset combines ten-class datasets gathered from different sources
at different times and contains blurring, noise, rotation, translation, shear, zooming,
height/width shift, brightness and contrast changes, occlusions, and superimposition. We have not
handled every kind of augmentation in this dataset; we have mainly processed blurred and noisy
images. The processed images are then classified by a deep convolutional neural network (CNN).
2. Literature Review
2.1 Method 1
Proposed Method:
The purpose of OCR is to recognize and identify characters in images of text documents and map
them to computer-readable character codes that can be used for further text processing. A typical
workflow for recognizing characters from image documents is shown in the paper's figure. It
includes the following steps:
1) Preprocessing: The input image goes through a series of preprocessing steps. The purpose of
preprocessing is to allow the OCR engine to work with greater accuracy. This can be achieved
through a series of operations.
a) Binarization: The document image is thresholded to convert the grayscale image to a
binary image. Image thresholding can be global or local (adaptive). Global image
thresholding uses only one threshold for the entire image, whereas local (adaptive)
thresholding uses different thresholds for different image segments according to local
information.
b) Noise Reduction: Noise reduction improves image quality. Two common approaches are
usually taken for noise reduction: 1) image filtering, such as the Wiener filter, Gaussian
filter, and median filter, and 2) morphological operations, such as erosion and dilation.
c) Normalization: Normalizing inter-user and intra-user variability due to character size or
choice of font family such as boldface is always a good idea. Common normalization steps
include stroke width normalization or thinning, and normalization of aspect ratio and size of
the image.
d) Skew correction: Skew correction methods are employed in order to align the image
document. Major approaches for skew detection include correlation, projection profiles, and
Hough transform.
e) Slant correction: The slant of handwritten text is user dependent. A slant-elimination
method is used to reduce variability due to different writing styles and normalize all
characters to a canonical form.
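The binarization step (a) can be sketched in a few lines of NumPy. This is a generic implementation of Otsu's global threshold for illustration, not code from the reviewed paper:

```python
import numpy as np

def otsu_threshold(gray):
    """Return the global threshold that maximizes between-class variance (Otsu)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = gray.size
    cum = np.cumsum(hist)                        # pixel count at or below t
    cum_mean = np.cumsum(hist * np.arange(256))  # intensity mass at or below t
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0 = cum[t - 1] / total                  # weight of the dark class
        w1 = 1.0 - w0                            # weight of the bright class
        if w0 == 0.0 or w1 == 0.0:
            continue
        m0 = cum_mean[t - 1] / cum[t - 1]
        m1 = (cum_mean[-1] - cum_mean[t - 1]) / (total - cum[t - 1])
        var = w0 * w1 * (m0 - m1) ** 2           # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(gray):
    """Global binarization: 1 = foreground ink (dark), 0 = background."""
    return (gray < otsu_threshold(gray)).astype(np.uint8)
```

Local (adaptive) thresholding, the alternative mentioned above, runs the same idea per image segment instead of once for the whole image.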
2) Segmentation: The purpose of image segmentation in OCR systems is to extract isolated
characters from image documents. The segmentation step includes text line detection, word
extraction, and character segmentation, usually performed in a top-down manner: line segmentation
first, then word segmentation, then character segmentation.
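Top-down segmentation of this kind is commonly driven by projection profiles. A minimal sketch of the line-segmentation step (an illustrative implementation, not the paper's code):

```python
import numpy as np

def segment_lines(binary):
    """Split a binary page (ink = 1) into text-line bands using the
    horizontal projection profile: rows with no ink separate lines."""
    profile = binary.sum(axis=1)        # ink pixels per row
    bands, start = [], None
    for i, ink in enumerate(profile > 0):
        if ink and start is None:
            start = i                   # entering a text band
        elif not ink and start is not None:
            bands.append((start, i))    # leaving a text band
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return bands
```

Word and character segmentation repeat the same scan vertically inside each detected band.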
3) Feature Extraction: In the feature extraction step, the segmented characters are transformed into a
set of features called feature vectors. Each character is represented by its feature vector. Feature
extraction provides dimensionality reduction, extracting relevant information from character images to
facilitate better separation and identification of different characters in feature space.
4) Classification: Classification schemes provide decision rules for identifying characters based on
their feature vectors. This task can be accomplished with machine learning approaches such as
Artificial Neural Networks (ANN), K-Nearest Neighbors (KNN), Hidden Markov Models (HMM), and
Support Vector Machines (SVM), among other standard classifiers.
5) Post-processing: Dictionary-based approaches and context can be used to improve recognition
rates, e.g., by correcting spelling errors and selecting the most likely words.
Result: The recognition accuracy for the different feature sets depends on the zone dimension. For
the original 32×32 image (without zoning), the recognition accuracy was rather poor at 78.5%.
Applying zoning to the character image significantly improves accuracy: 16×16 zoning gives 86.5%,
and 8×8 zoning gives the best accuracy of 94.0%. Reducing the zone dimension further lowers
accuracy again; for example, 4×4 zoning gives 89.2%. This reflects the fact that while zoning helps
reduce feature dimensionality, as discussed in Section III, excessive feature reduction can hurt
recognition accuracy. For the best-performing feature set (8×8 zones), the recognition accuracy
was also calculated for each digit separately: digits such as 4 and 8 have very high accuracy,
while digits such as 3 and 5 have relatively poor recognition accuracy.
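The zoning feature discussed above amounts to a per-zone pixel-density computation. The sketch below assumes zones of 8×8 pixels over a 32×32 image, which is one plausible reading of "8×8 zoning":

```python
import numpy as np

def zoning_features(img, zone=8):
    """Mean ink density per zone: a 32x32 binary image with 8x8-pixel
    zones yields a 16-dimensional feature vector."""
    h, w = img.shape
    assert h % zone == 0 and w % zone == 0
    return np.array([img[r:r + zone, c:c + zone].mean()
                     for r in range(0, h, zone)
                     for c in range(0, w, zone)])
```

Smaller zones (4×4) give more features and larger zones (16×16) give fewer; as the reported accuracies show, the 8×8 setting was the sweet spot between detail and over-compression.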
Limitation: For Bangla numeral recognition, the method demonstrates an excellent result with 94%
overall accuracy. This result is promising and would likely improve if preprocessing techniques
such as normalization, skew correction, and slant removal were applied. Further improvement may be
achieved with features specific to Bangla digits, and with variants of SRC such as regularized SRC
and kernel SRC. Comparison with other conventional classifiers should be considered in future as a
continuation of this work. The results should also be verified on other standard handwritten
character databases, such as the ISI database of handwritten Bangla numerals.
2.2 Method 2
Proposed Method: The CNN model proposed here, called "MathNET", has several phases, as
illustrated below.
Dataset: The model uses 6,000 digit images (0-9) from 'Ekush' [8], plus 44 classes of handwritten
mathematical symbols totalling 26,400 images, collected from 500 students. Image clarity depends
on character size: in each image the black background padding is small and the text is white. The
images in this dataset have an undistorted size of 28×28 pixels, and the edges of the images
appear blurred. The two datasets are concatenated into a final dataset of 32,400 images.
Preparation of dataset: In deep learning, variety within the dataset is very important. The images
are resized to 28×28 px, unnecessary black pixels are removed, and the whole dataset is converted
to CSV format for faster computation. Every row of the dataset has 785 columns: 28 × 28 = 784
columns contain the pixel values representing the image, and the 785th column stores the label or
class of the digit or symbol.
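The 785-column layout described above can be produced as follows (a sketch of the conversion, not the authors' script):

```python
import numpy as np

def to_csv_rows(images, labels):
    """Flatten N 28x28 images into an (N, 785) array:
    columns 0..783 hold pixel values, column 784 holds the class label."""
    images = np.asarray(images)
    flat = images.reshape(len(images), -1)            # 28 * 28 = 784 columns
    return np.hstack([flat, np.asarray(labels).reshape(-1, 1)])
```

Each row can then be written out with `np.savetxt(..., delimiter=",")` or loaded back without any image decoding, which is what makes the CSV form fast to process.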
The model has max-pooling layers, fully connected dense layers, and uses Dropout [9] as the
regularization method. The first two convolutional layers have 32 filters with kernel size (5,5)
and use the ReLU activation function with padding = `same`. The output of dropout_1 feeds into
layers conv2d_3 and conv2d_4. The max_pooling2d_2 layer takes its input from conv2d_3 and
conv2d_4 and passes its output to the 25% dropout_2 layer. After these eight operations, the
output goes through the flatten_1 layer and is attached to a dense_1 layer with 256 hidden units.
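Tracing the spatial dimensions through the stack described above clarifies what the dense layer sees. The helpers below are the standard conv/pool output-size rules; the 3×3 kernel for conv2d_3/conv2d_4 and the position of the first pooling layer are assumptions, since the text does not state them:

```python
import math

def conv_out(size, kernel, stride=1, padding="same"):
    """Spatial size after a convolution (standard formula)."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1

def pool_out(size, pool=2, stride=2):
    """Spatial size after max pooling (no padding)."""
    return (size - pool) // stride + 1

# 28x28 input -> two 'same' 5x5 convs -> pool -> two 'same' convs -> pool
s = 28
s = conv_out(s, 5); s = conv_out(s, 5)   # 'same' padding keeps 28
s = pool_out(s)                          # 28 -> 14
s = conv_out(s, 3); s = conv_out(s, 3)   # still 14
s = pool_out(s)                          # 14 -> 7
print(s)  # 7x7 feature maps are flattened into dense_1 (256 units)
```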
The MathNET model uses the RMSprop [10] [11] optimizer with a low learning rate. RMSprop is
similar to the momentum gradient descent algorithm: it limits the oscillations in the vertical
direction, which in turn allows a larger effective learning rate, so the algorithm can take bigger
steps in the horizontal direction and converge more rapidly.
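The behaviour described here, damping the oscillating directions by dividing each gradient component by a running estimate of its magnitude, is captured by the standard RMSprop update rule. This is a generic sketch; the `lr` and `rho` values are conventional defaults, not values taken from the paper:

```python
import numpy as np

def rmsprop_step(w, grad, cache, lr=0.001, rho=0.9, eps=1e-8):
    """One RMSprop update: components with large recent gradients
    (the 'oscillating' directions) get proportionally smaller steps."""
    cache = rho * cache + (1 - rho) * grad ** 2  # running mean of squared grads
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache
```

With one large-gradient and one small-gradient component, the per-component step sizes come out nearly equal: the division by `sqrt(cache)` is exactly what suppresses the vertical oscillation.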
A CNN model works better when it sees a lot of data during training, which is where the data
augmentation method comes in. It generates artificial data and helps avoid overfitting of the
model. Several augmentation methods were chosen: zoom_range set to 0.1, randomly shifting images
horizontally by 0.1, and randomly shifting images vertically by 0.1.
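The random shifts above can be mimicked without a deep-learning framework. Below is a minimal NumPy stand-in for the shift part of the augmentation; the 0.1 fraction follows the text, while the zero-fill behaviour at the vacated border is an assumption:

```python
import numpy as np

def random_shift(img, max_frac=0.1, rng=None):
    """Shift an image by up to max_frac of its height/width in a random
    direction, filling vacated pixels with zeros (background)."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = img.shape
    dy = int(rng.integers(-int(h * max_frac), int(h * max_frac) + 1))
    dx = int(rng.integers(-int(w * max_frac), int(w * max_frac) + 1))
    out = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    # zero out the wrapped-around border so pixels do not leak across edges
    if dy > 0:
        out[:dy] = 0
    elif dy < 0:
        out[dy:] = 0
    if dx > 0:
        out[:, :dx] = 0
    elif dx < 0:
        out[:, dx:] = 0
    return out
```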
Result: From the confusion on the given test set, MathNET successfully recognizes 97% of the
images in the test data.
Limitation: The top 6 errors are shown in Fig. 4. These occur because of wrongly labelled data in
the test set, and some of the errors are confusing even to a human.
2.3 Method 3
Proposed Method: The digit recognition process is mainly divided into three main parts:
preprocessing, feature extraction, and classification.
Preprocessing: The steps performed before feature extraction are called preprocessing. The purpose
of preprocessing is to improve the image data by suppressing unwanted distortions or enhancing
image features that matter for further processing. Preprocessing steps include image acquisition,
binarization, denoising, skew detection, segmentation, and scaling.
1) Image Capture: Any device with a camera or scanner can capture images [4]. Images from PDF
files can also be imported into the system. The image may be a single digit or a series of digits
collected from license plates, bank cheques, postal codes, etc.
2) Image binarization: RGB images are converted to grayscale before binarization. Binarization
is performed with a fixed threshold using Otsu's thresholding method.
3) Denoising: Denoising is performed to reduce the possibility of misclassification due to poor
image quality. Here a median filter, a commonly used smoothing method, is used for noise
reduction [6].
4) Skew Detection: Skew is usually caused by the image being placed at an angle when it is
captured. It is usually removed by rotating the image by an angle opposite to the estimated skew
value.
5) Segmentation: Segmentation includes line splitting and digit splitting. Line segmentation is
performed by horizontally scanning the image for the white-pixel frequency in each row of the
original image. Next, digit segmentation is performed by scanning each line vertically; gaps
between digits are detected, and the sub-images are saved.
6) Scaling: To compare feature vectors, all digits must be scaled to a fixed size. As the image
size increases, more features can be extracted, increasing accuracy, but memory requirements and
processing time also increase. In contrast, a smaller image has fewer features, resulting in lower
accuracy. All images are scaled to 32×32 matrices to balance feature size and processing time.
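The scaling step can be sketched with nearest-neighbour index mapping. This is an illustrative implementation; the paper does not state which interpolation method it uses:

```python
import numpy as np

def scale_to(img, size=32):
    """Nearest-neighbour rescale of a 2-D image to size x size."""
    h, w = img.shape
    rows = (np.arange(size) * h) // size  # source row for each target row
    cols = (np.arange(size) * w) // size  # source col for each target col
    return img[rows][:, cols]
```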
Result: The results are summarized in Table 3. For a very large number of training features, the
linear SVM works very efficiently. But if the feature size is smaller than the number of
observations, an RBF or polynomial kernel is preferred, because these kernels fit such a dataset
properly, resulting in higher accuracy than the linear SVM.
Limitation: In this paper, the comparative performance of three well-known kernels of the SVM
classification algorithm is investigated to find the appropriate kernel function for the sample
dataset of Bangla handwritten digits. Experimental results show that, using HOG features,
handwritten digit recognition reaches at most 97.08% accuracy with the polynomial kernel function.
This performance depends mostly on the preprocessing and feature extraction techniques; the
recognition rate could be improved by combining more than one feature extraction technique.
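The HOG features behind this result are per-cell histograms of gradient orientation, weighted by gradient magnitude. The sketch below is a simplified HOG without block normalization, an assumption for illustration rather than the paper's exact descriptor:

```python
import numpy as np

def hog_features(img, cell=8, bins=9):
    """Simplified HOG: for each cell x cell block, histogram the unsigned
    gradient orientations (0-180 deg), weighted by gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    h, w = img.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            block_mag = mag[r:r + cell, c:c + cell].ravel()
            block_ang = ang[r:r + cell, c:c + cell].ravel()
            hist, _ = np.histogram(block_ang, bins=bins,
                                   range=(0.0, 180.0), weights=block_mag)
            feats.append(hist)
    return np.concatenate(feats)
```

On a 32×32 digit this yields 16 cells × 9 bins = 144 features, which would then be fed to the SVM classifier.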
2.4 Method 4
Proposed Method:
A. Dataset preparation and image preprocessing: In this study the model is trained with the
recently developed large dataset NumtaDB, which consists of 85,000+ samples; 72,040 specimens from
the dataset were used for training initially. Before feeding data into the model, some image
preprocessing was done to remove unnecessary features and artifacts as much as possible, so that
training is efficient. First the images were converted from RGB to grayscale, then reshaped to
64×64×1 to keep the same dimensions across all training data. Then a Gaussian blur with a standard
deviation of 10 was applied, and the blurred images were blended back with the grayscale images
using cv2. The preprocessing was applied to all train and test images.
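The blur-and-blend step reads like a weighted average of the grayscale image and its blurred copy. A NumPy equivalent is given below; the 0.5 weight is an assumption, since the exact value is truncated in the source:

```python
import numpy as np

def blur_blend(gray, blurred, alpha=0.5):
    """Blend a grayscale image with its Gaussian-blurred copy:
    out = (1 - alpha) * gray + alpha * blurred, clipped to uint8 range."""
    out = (1.0 - alpha) * gray.astype(float) + alpha * blurred.astype(float)
    return np.clip(out, 0, 255).astype(np.uint8)
```

In OpenCV the same operation is a single `cv2.addWeighted` call; blending softens pen-stroke edges without discarding the original intensity structure.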
B. Image Augmentation: The training data in the provided dataset is clean and mostly easy to read,
but the validation/test data contains some of the most challenging cases, designed to evaluate
model performance under very noisy conditions. The dataset therefore had to be augmented
artificially to increase variation, initially with the built-in augmentation and image
preprocessing functions from the Keras library (width shift range of 0.2, height shift range of
0.2). Later, accuracy was improved by enlarging the main database with more manually generated
augmented images for more variation. Images with salt-and-pepper noise were generated using the
MATLAB function imnoise(). Blurred, washed-out images at random angles of -35, -30, 20, 10, 20,
30, and 40 degrees were generated using cv2, which applied a normalized box filter to the images.
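The salt-and-pepper step can be reproduced without MATLAB. Below is a NumPy stand-in for imnoise(I, 'salt & pepper'); the noise density of 0.05 is MATLAB's default and is assumed here, since the text gives no value:

```python
import numpy as np

def salt_pepper(img, density=0.05, rng=None):
    """Flip a `density` fraction of pixels to pure black (pepper) or
    pure white (salt), half each, like imnoise's salt & pepper mode."""
    if rng is None:
        rng = np.random.default_rng(0)
    out = img.copy()
    u = rng.random(img.shape)
    out[u < density / 2] = 0                         # pepper
    out[(u >= density / 2) & (u < density)] = 255    # salt
    return out
```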
C. Proposed model for classification: Different CNN models were tried, and the two best performing
models (Model A and Model B) were taken for ensembling. In both models the first convolutional
layer consists of 32 filters with a 5×5 kernel, which mostly extract low-level features such as
vertical and horizontal edges, followed by a second layer of 32 filters with a 3×3 kernel. After
that, max-pool layers with a 2×2 kernel and a stride of 2 are employed to reduce the features by
taking the maximum value, which greatly cuts computation and overfitting. Similarly, two
convolutional layers of 64 filters each with 3×3 kernels are added, with max-pool layers
configured like the previous one. Experimenting with different configurations eventually showed
that a slightly wider convolutional layer at the end of Model A offers a small accuracy boost.
The Rectified Linear Unit (ReLU) activation function is used in every layer, including the fully
connected (FC) layers, except the final FC layers in both models. The convolutional feature maps
are flattened and connected to an FC layer with 64 neurons. Dropout layers are added before the
final FC layers. Finally, FC layers of 10 neurons with a Softmax activation function are added for
classification into the 10 classes, and the models are ensembled by averaging the final output
layers. Same-padding is used in all convolutional layers in both models.
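The final ensembling step, averaging the two softmax outputs, is simple to state. The following is a sketch of the mechanism described, not the authors' code:

```python
import numpy as np

def ensemble_predict(probs_a, probs_b):
    """Average Model A's and Model B's softmax outputs per sample,
    then take the argmax over the classes."""
    avg = (np.asarray(probs_a) + np.asarray(probs_b)) / 2.0
    return avg.argmax(axis=1)
```

Averaging tends to cancel uncorrelated mistakes: a class only wins if it scores reasonably well under both models.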
Result: In this training, 20 percent of the 116,395 specimens were used as a validation set for
measuring model performance, and the other 96,116 specimens were used for training. The result was
also compared with training on the original, non-augmented dataset of 72,000+ samples, to show the
performance difference between the proposed method and other non-augmented machine learning and
feature-extraction-based approaches. The model was implemented with the Keras v2.4.2 python
library, and the MATLAB image processing toolbox from MATLAB r2017a was used for manual
augmentation, running on a local machine (RAM: 8.00 GB, Graphics: NVIDIA GT-940MX, 2 GB) and on
the Google Colaboratory [24] cloud platform with an NVIDIA Tesla K-80 GPU and 12 GB RAM. Testing
with various iteration levels showed that at least 30 and 6 epochs are needed to reach maximum
performance in Model A and Model B respectively. The proposed method performs worst on the numeral
'১', which is misclassified in 57 of 1,750 specimens, the lowest accuracy at 96.74%. The numeral
'৯' has the second-lowest detection rate, with 97.360% accuracy. The model confuses these two
numerals, misclassifying '১' as '৯' 26 times and '৯' as '১' 25 times, because '১' and '৯' can be
confusing even to the human eye, depending on the test case. The numeral '৪' has the highest
detection rate, misclassified in only 25 of 1,774 test cases (98.59% accuracy). Some of the test
specimens are very confusing even for human eyes. Among the 17,760 specimens, only 570 test cases
are misclassified; most of the misclassified specimens are heavily augmented, noisy data. The
model achieves 96.788% overall accuracy.
Limitation: A previous work achieved 98.98% accuracy with image augmentation, though its test
images were not very noisy. The proposed model outperforms previous works on clear images,
achieving 99.2%, and also gives very good accuracy, beyond 90%, on noisy, highly augmented
specimens. Before augmentation, accuracy was very low for tilted, random-box-noise, and
colour-shifted specimens: for tilted images accuracy was only 13%. After augmentation, accuracy
jumped to 95%. As shown in Table VI, the proposed model outperforms well-established models like
ResNet-18 and LeNet-5. It was also compared with another ensemble technique, in which Model A was
trained with 5-fold cross-validation to obtain 5 different models of the same architecture, and
the proposed model outperforms that as well.
Result and Analysis
Font Family: Times New Roman, Font Size: 12, Justified
Compare the results of all four methods and write your analysis.
3. Conclusion
Font Family: Times New Roman, Font Size: 12, Justified
References
[1] Khan, Haider Adnan, Abdullah Al Helal, and Khawza I. Ahmed. "Handwritten bangla
digit recognition using sparse representation classifier." 2014 International Conference on
Informatics, Electronics & Vision (ICIEV). IEEE, 2014.
[2] Shuvo, Shifat Nayme, et al. "MathNET: using CNN bangla handwritten digit,
mathematical symbols, and trigonometric function recognition." Soft Computing Techniques
and Applications: Proceeding of the International Conference on Computing and
Communication (IC3 2020). Springer Singapore, 2021.
[3] Rehana, Hasin. "Bangla handwritten digit classification and recognition using SVM
algorithm with HOG features." 2017 3rd International Conference on Electrical Information
and Communication Technology (EICT). IEEE, 2017.
[4] Noor, Rouhan, Kazi Mejbaul Islam, and Md Jakaria Rahimi. "Handwritten bangla
numeral recognition using ensembling of convolutional neural network." 2018 21st
international conference of computer and information technology (ICCIT). IEEE, 2018.