Bangla Handwritten Digit Recognition Using Deep Learning.
Khondoker Abu Naim (200101103) and Abu Rayhan Mouno (180201118)
Course Code: CSE 4132 Course Title: Artificial Neural Networks and Fuzzy Systems Sessional
Semester: Winter 2023
*Department of Computer Science and Engineering, Bangladesh Army University of Science and Technology (BAUST)
Abstract
Handwritten digit recognition has always been difficult because of the wide range of shapes,
sizes, and writing styles. For all of the recognition algorithms built in this thesis, the rewarded
method is used. Bangla-digit, one of the largest collections of Bengali handwritten digits with
over 3000 images, is the dataset used. This dataset, however, is difficult to work with due to
its complexity. The aim of this project is to preprocess the images so that they can be used to
train deep learning models with high accuracy. Various preprocessing techniques are developed
for image processing, with a deep convolutional neural network (CNN) functioning as the
classification algorithm. The performance of this process is systematically evaluated on the
bangla-digit image dataset. Finally, 93% accuracy is obtained on the bangla-digit dataset.
1. Introduction
Bangla handwritten digit recognition is a classical problem in the field of computer vision.
Such a system has various practical applications, such as OCR, postal code recognition,
license plate recognition, and bank check recognition. Recognizing Bangla digits from
documents is becoming more important. There are 10 unique Bangla digits, so the recognition
task is to classify 10 different classes. The critical task of handwritten digit recognition is
recognizing unique handwritten digits, because every person has their own writing style.
Our contribution addresses a more challenging task: achieving robust performance and high
accuracy on the large, unbiased, unprocessed, and highly augmented “bangla-digit” dataset.
The dataset is a combination of ten class datasets that were gathered from different sources
and at different times, containing blurring, noise, rotation, translation, shear, zooming,
height/width shift, brightness, contrast, occlusions, and superimposition. We have not
processed all kinds of augmentation in this dataset; we have mainly processed blurred and
noisy images. Our processed images are then classified by a deep convolutional neural
network (CNN).
2. Literature Review
Bangla digits have been identified using the local binary pattern, with researchers reporting
improved accuracy. The convolutional neural network, a deep learning approach, was used to
improve supervised learning and performance. Handwritten digit recognition was proposed by
EMNIST for both balanced and imbalanced datasets. As a result of the recent success of deep
learning, especially convolutional neural networks (CNNs) for computer vision, many
researchers have been encouraged to use CNNs to recognize handwritten characters and digits
as a computer vision challenge. One study proposed handwritten Bangla digit recognition using
classifier combination by the Dempster–Shafer (DS) technique. They used the DS technique
and an MLP classifier for classification on a training dataset of 6000 handwritten samples,
with threefold cross-validation. Another work used this concept in a genetic algorithm-based
handwritten digit recognition technique. Islam and his colleagues suggested Bayanno-Net, in
which they used a CNN to recognize Bangla handwritten numerals. A comparative analysis of
existing classification algorithms for recognizing Bengali handwritten numerals has also been
offered. A model based on handwritten digits was proposed by Shawon et al.
Gulia et al. presented an efficient automated design to generate UML diagrams from natural
language specifications; the main aim of their paper is the production of activity and sequence
diagrams from natural language specifications. A conceptual architecture for creating a human
computing environment involving the applicability of deep learning algorithms was presented
in "Intelligent human computing: A Deep Learning-Based Approach". Saifur Rahman, Partha
Chakraborty et al. presented Bangla document classification using a deep recurrent neural
network with BiLSTM; the collected Bangla text documents were processed, the model
architecture was designed, and the training data were fitted to the model. Md Mahbubur
Rahman and colleagues presented Bangla document classification using transformer-based
deep learning models, applying the BERT and ELECTRA models for Bangla text
classification. The main goal of another paper is to identify fake profiles and eliminate fake
accounts on online social networks.
3. Dataset Description
For recognition and generation, the “bangla-digit” dataset has been used, which contains 2746
handwritten Bangla digits. This is one of the biggest datasets of handwritten Bangla digits.
The dataset is divided into 10 subfolders. The 2746 samples were collected from 30 persons,
after which image augmentation was applied. The images are gray-scale with dimensions of
124 x 124. Sample images from the “bangla-digit” dataset are shown in Fig-1.
Fig-1: Sample images from the bangla-digit dataset.
Class    Number of Samples
0        274
1        276
2        276
3        276
4        276
5        275
6        275
7        276
8        275
9        267
Table-1: Summary of the dataset.
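As a quick consistency check, the per-class counts in Table-1 can be summed and compared with the stated total of 2746 images. The sketch below copies the counts from the table; the variable names are illustrative only.

```python
# Per-class sample counts copied from Table-1 (class label -> number of images).
class_counts = {0: 274, 1: 276, 2: 276, 3: 276, 4: 276,
                5: 275, 6: 275, 7: 276, 8: 275, 9: 267}

total = sum(class_counts.values())

# Ratio between the largest and smallest class, as a rough balance indicator.
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())

print(f"total samples: {total}")
print(f"largest / smallest class: {imbalance_ratio:.3f}")
```

The counts add up to the stated dataset size, and the classes are close to balanced, so plain accuracy is a reasonable metric for this dataset.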
4. Methodology
The experimentation process was conducted using two cloud-based platforms, namely
Google Colab and Kaggle. To run the experiments, we set up the required software packages
and libraries in both environments, including TensorFlow, Keras, and NumPy. We used
TensorFlow version 2.4.0 and Keras version 2.4.3, which were the latest stable versions at the
time of the experiments.
Digit recognition conventionally involves three main steps: preprocessing, feature extraction,
and classification. Here the feature extraction and classification steps are combined in one
step, so our digit recognition has two main steps: image preprocessing and a deep CNN model.
The models used are EfficientNetB0 and MobileNet. These models were chosen based on
their proven performance in image classification tasks.
These steps are shown in Fig-2.
Fig-2: Block diagram of the proposed Bangla digit recognition model (Input Image -> Image
Preprocessing -> Feature Extraction -> Image Classification -> Detected Digit).
Baseline model: A baseline model is a simple model that is used as a starting point for
developing more complex models. It is typically the simplest model that can be built for a
given problem, and it serves as a reference point against which other models can be
compared. The purpose of a baseline model is to establish a minimum level of performance
that must be exceeded by more complex models in order to justify their additional
complexity.
For example, in machine learning, a baseline model might be a simple linear regression
model that uses only one or two features to make predictions. This model can be used to
establish a baseline level of prediction accuracy, and more complex models can be developed
and compared to this baseline to determine if they are worth the additional computational cost
and complexity.
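For a 10-class problem such as this one, an even simpler reference point than a trained baseline is a classifier that always predicts the most frequent class. The sketch below computes that floor from the class counts in Table-1. It is an illustration only, not the baseline model used in this report, and it assumes accuracy is measured over the whole dataset.

```python
# Per-class sample counts copied from Table-1.
class_counts = {0: 274, 1: 276, 2: 276, 3: 276, 4: 276,
                5: 275, 6: 275, 7: 276, 8: 275, 9: 267}


def majority_class_accuracy(counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(counts.values()) / sum(counts.values())


def uniform_guess_accuracy(counts):
    """Expected accuracy of guessing each class with equal probability."""
    return 1.0 / len(counts)


floor = majority_class_accuracy(class_counts)
chance = uniform_guess_accuracy(class_counts)
print(f"majority-class accuracy: {floor:.2%}")
print(f"uniform-guess accuracy:  {chance:.2%}")
```

Both values are about 10%, so any model scoring near 10% on this dataset is performing at roughly chance level.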
Transfer learning: Transfer learning was used to adapt the pre-trained models to the
bangla-digit dataset. In this approach, the pre-trained models were used as feature
extractors, and the final layer(s) of the models were replaced with new layer(s) that were
trained on the bangla-digit dataset.
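The idea of keeping a feature extractor fixed and training only a new final layer can be illustrated with a toy example in plain Python. This is a conceptual sketch under stated assumptions, not the Keras code used in the experiments: `frozen_features` is a hypothetical stand-in for a pre-trained backbone, the data are made up, and the new head is a single logistic unit.

```python
import math


def frozen_features(x):
    """Stand-in for a pre-trained backbone: fixed mapping, never updated."""
    return [x[0] + x[1], x[0] - x[1]]


def train_head(samples, labels, epochs=200, lr=0.5):
    """Train only the new logistic head on top of the frozen features."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = frozen_features(x)
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the log loss with respect to z
            w[0] -= lr * err * f[0]
            w[1] -= lr * err * f[1]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Classify one input using the frozen features and the trained head."""
    f = frozen_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0


# Toy two-class data (hypothetical, for illustration only).
samples = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 0.8)]
labels = [0, 0, 1, 1]

w, b = train_head(samples, labels)
predictions = [predict(w, b, x) for x in samples]
print(predictions)
```

Only `w` and `b` change during training, while the feature mapping stays fixed. Fine-tuning differs in that some of the backbone's own weights are also updated.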
Dataset preprocessing: The collected images were first cropped, then resized to a standard
size of 124x124 pixels and converted to JPEG format.
Data Augmentation: Data augmentation was performed using the ImageDataGenerator class
from the keras.preprocessing.image module. The following transformations were applied to
the images:
• Rotation: Randomly rotate the image by an angle within a range of 10 degrees.
• Width and Height Shift: Randomly shift the image horizontally and vertically by a
fraction of the image size, with ranges of 0.1 and 0.2.
• Zoom: Randomly zoom into the image by a factor with a range of 0.0.
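To make the ranges above concrete, the sketch below draws random transform parameters the way Keras documents them for ImageDataGenerator: rotation in degrees within plus or minus the range, shifts as a fraction of the image size, and zoom as a factor in [1 - range, 1 + range]. It only samples parameters and does not transform any image. Mapping 0.1 to the width shift and 0.2 to the height shift is an assumption, since the text lists the two values without saying which is which.

```python
import random

# Ranges as listed above; which shift value belongs to which axis is assumed.
ROTATION_RANGE_DEG = 10
WIDTH_SHIFT_RANGE = 0.1
HEIGHT_SHIFT_RANGE = 0.2
ZOOM_RANGE = 0.0
IMAGE_SIZE = 124  # pixels, from the dataset description


def sample_augmentation(rng):
    """Draw one random set of transform parameters within the listed ranges."""
    return {
        "rotation_deg": rng.uniform(-ROTATION_RANGE_DEG, ROTATION_RANGE_DEG),
        "shift_x_px": rng.uniform(-WIDTH_SHIFT_RANGE, WIDTH_SHIFT_RANGE) * IMAGE_SIZE,
        "shift_y_px": rng.uniform(-HEIGHT_SHIFT_RANGE, HEIGHT_SHIFT_RANGE) * IMAGE_SIZE,
        "zoom_factor": rng.uniform(1 - ZOOM_RANGE, 1 + ZOOM_RANGE),
    }


rng = random.Random(0)  # fixed seed so the draws are repeatable
params = [sample_augmentation(rng) for _ in range(1000)]
print(params[0])
```

With a zoom range of 0.0 the zoom factor is always 1, so under this reading no zoom is actually applied.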
Hyperparameters: Several hyperparameters have been used for model training and
optimization:
• Batch Size: This hyperparameter determines the number of samples that are processed
in a single batch. A batch size of 32 has been used.
• Image Size: This hyperparameter defines the size of the input image after resizing. In
the code, the size has been set to 124x124 pixels.
• Epochs: This hyperparameter defines the number of times the entire dataset is passed
through the model during training. The number of epochs has been set to 16, and it
varies during fine-tuning.
• Learning Rate: This hyperparameter determines the step size at which the optimizer
adjusts the model parameters during training. A learning rate of 0.001 has been used
with the Adam optimizer; it varies across models and variations during training.
• Optimizer: This hyperparameter defines the optimization algorithm used to adjust the
weights of the model during training. The Adam optimizer has been used.
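The role of the learning rate in Adam can be seen in a single update of one scalar parameter. The sketch below follows the standard Adam update rule with the learning rate of 0.001 stated above. The values of `beta1`, `beta2`, and `eps` are commonly used defaults and are assumptions, since the report does not state them.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for a single scalar parameter; returns (theta, m, v)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


theta, m, v = 1.0, 0.0, 0.0
theta, m, v = adam_step(theta, grad=2.0, m=m, v=v, t=1)
print(theta)
```

On the first step the update is close to the learning rate itself regardless of the gradient's magnitude, which is why 0.001 acts as an approximate bound on the per-step change of each weight.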
5. Result and Analysis
A train/test split ratio of 10% is used for our dataset, and accuracy is calculated for every
training run. Three outcomes (baseline, transfer learning, and fine-tuning) are obtained after
16 epochs on our own dataset. Results are given in Table-2. The processing time of training
was also measured: eight seconds are required per epoch.
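The figures quoted in this report imply the following rough training budget. The sketch assumes that the 10% ratio means 10% of the 2746 images are held out for testing and that the last partial batch is kept. Neither detail is stated explicitly, so the numbers are indicative only.

```python
import math

TOTAL_IMAGES = 2746       # dataset size from the dataset description
TEST_FRACTION = 0.10      # assumption: 10% of the images are held out for testing
BATCH_SIZE = 32
EPOCHS = 16
SECONDS_PER_EPOCH = 8

test_size = round(TOTAL_IMAGES * TEST_FRACTION)
train_size = TOTAL_IMAGES - test_size

# Number of batches per epoch, counting the final partial batch.
steps_per_epoch = math.ceil(train_size / BATCH_SIZE)
total_training_seconds = EPOCHS * SECONDS_PER_EPOCH

print(train_size, test_size, steps_per_epoch, total_training_seconds)
```

Under these assumptions a full 16-epoch run takes roughly two minutes, which is consistent with training on limited cloud resources.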
Table-2: Result comparison in different modes.
Fig-3: Model Comparison
EfficientNetB0: The EfficientNetB0 model achieved the highest accuracy in transfer
learning, with a score of 95%. The baseline model achieved 32% accuracy, which is
relatively low. However, fine-tuning improved the accuracy to 99.4%, which is quite good.
MobileNet: The MobileNet model did not perform as well as the EfficientNetB0 model.
Transfer learning achieved a moderate accuracy of 92%. The baseline model achieved only
9% accuracy, which is relatively low. However, fine-tuning improved the accuracy to 94%.
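The accuracies reported in the two paragraphs above can be put side by side to see how much fine-tuning adds over transfer learning for each model. The numbers are copied from the text; the dictionary layout and names are illustrative only.

```python
# Accuracy (%) per training mode, as reported in the text above.
reported_accuracy = {
    "EfficientNetB0": {"baseline": 32.0, "transfer_learning": 95.0, "fine_tuning": 99.4},
    "MobileNet": {"baseline": 9.0, "transfer_learning": 92.0, "fine_tuning": 94.0},
}


def fine_tuning_gain(scores):
    """Percentage points gained by fine-tuning over transfer learning."""
    return round(scores["fine_tuning"] - scores["transfer_learning"], 1)


gains = {model: fine_tuning_gain(s) for model, s in reported_accuracy.items()}
best_model = max(reported_accuracy, key=lambda m: reported_accuracy[m]["fine_tuning"])

print(gains)
print(best_model)
```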
[Bar chart: Result Analysis of Model Comparison. Accuracy (%) for EffNetB0 and MobileNet.
Baseline: 31, 9; Fine-tuning: 99.4, 94; Transfer learning: 95, 92.]
Fig-4: Graph of model accuracy and loss
6. Stakeholders
The stakeholders for Bangla handwriting recognition could include:
• Researchers and developers working in the field of Bangla handwriting recognition
technology.
• Users of the technology, such as businesses or organizations that utilize Bangla
handwriting recognition for data entry or analysis.
• Government bodies and policy-makers who are interested in promoting the
development and use of Bangla handwriting recognition technology.
• Educational institutions and instructors who teach Bangla language and writing, as
handwriting recognition technology could be used in language learning.
• Bangla language and culture advocates, who may be interested in preserving and
promoting the use of the Bangla language in the digital age.
7. Issues Encountered
• Large image size: The size (in terms of space) of the images used in the dataset was
too large, which made it difficult to work with. As a result, preprocessing techniques
had to be used to reduce the image size to a more manageable level.
• Face data augmentation problem.
• Variability in handwriting: Bengali script has many complex characters, and each
character can be written in several different ways. This variability makes it difficult to
develop a reliable recognition system that can accurately recognize all variations of
each character.
8. Conclusion, Limitations and Future Recommendations
Despite the large variation in the test set, our ensemble of residual-network-based
approach with Xception architecture performed really well, even though the final model has
a comparatively low number of parameters and the models were trained with limited
resources. The impact of using residual networks and batch normalization was prominent in
improving the overall performance of the classifier model. The number of parameters can be
further reduced with a more optimized selection of parameters, while introducing more
augmentation can further improve the overall performance of the model.
Limitations:
i. Sample size: The sample size used in this research was limited, which may have
affected the statistical power and generalizability of the findings.
ii. Participant Expression Accuracy: The study was limited by the participants' inability
to accurately convey their facial expressions, which may have impacted the reliability
and validity of the data collected.
iii. Data quality: The quality of the data used in this research was dependent on the
accuracy and completeness of the responses provided by the participants, which may
have been affected by recall bias or response.
Future recommendations:
i. Explore additional pre-trained models: While the current implementation uses
pre-trained models, there are many other options available. Exploring additional
models could help improve the accuracy of the digit recognition system.
ii. Improve dataset quality: Collecting a high-quality dataset is crucial for training
accurate machine learning models. To improve the performance of the current
system, more high-quality data could be collected, particularly for underrepresented
classes.
References
[1] Chakraborty, Partha, Syeda Surma Jahanapi, and Tanupriya Choudhury. "Bangla
Handwritten Digit Recognition." Cyber Intelligence and Information Retrieval: Proceedings
of CIIR 2021. Springer Singapore, 2022.
[2] Hossain, M. Zahid, M. Ashraful Amin, and Hong Yan. "Rapid feature extraction for
Bangla handwritten digit recognition." 2011 International Conference on Machine Learning
and Cybernetics. Vol. 4. IEEE, 2011.
[3] Fahim Sikder, Md. "Bangla handwritten digit recognition and generation." Proceedings of
International Joint Conference on Computational Intelligence: IJCCI 2018. Springer
Singapore, 2020.
[4] Hoq, Md Nazmul, et al. "A comparative overview of classification algorithm for Bangla
handwritten digit recognition." Proceedings of International Joint Conference on
Computational Intelligence: IJCCI 2018. Springer Singapore, 2020.
[5] Rabby, AKM Shahariar Azad, et al. "Bangla handwritten digit recognition using
convolutional neural network." Emerging Technologies in Data Mining and Information
Security: Proceedings of IEMIS 2018, Volume 1. Springer Singapore, 2019.
Appendix
Attainment of Complex Engineering Problem (CP)
S.L.  CP No.  Attainment  Remarks
1. P1: Depth of Knowledge Required
   K3 (Engineering Fundamentals):
   K4 (Engineering Specialization):
   K5 (Design):
   K6 (Technology):
   K8 (Research):
2. P2: Range of Conflicting Requirements
3. P3: Depth of Analysis Required
4. P4: Familiarity of Issues
5. P5: Extent of Applicable Codes
6. P6: Extent of Stakeholder Involvement and Conflicting Requirements
7. P7: Interdependence
Mapping of Complex Engineering Activities (CA)
S.L.  CA No.  Attainment  Remarks
1. A1: Range of resources
2. A2: Level of interaction
3. A3: Innovation
4. A4: Consequences for Society and the Environment
5. A5: Familiarity
Code:
• https://www.kaggle.com/code/khondokerabunaim/mobilenetv2transfer
• https://www.kaggle.com/code/khondokerabunaim/mobilenetv2finetune
• https://www.kaggle.com/code/khondokerabunaim/mobilenetv2baseline
More Related Content

Similar to Bangla Handwritten Digit Recognition Report.pdf

IRJET- Emotion Classification of Human Face Expressions using Transfer Le...
IRJET-  	  Emotion Classification of Human Face Expressions using Transfer Le...IRJET-  	  Emotion Classification of Human Face Expressions using Transfer Le...
IRJET- Emotion Classification of Human Face Expressions using Transfer Le...IRJET Journal
 
Performance Comparison between Pytorch and Mindspore
Performance Comparison between Pytorch and MindsporePerformance Comparison between Pytorch and Mindspore
Performance Comparison between Pytorch and Mindsporeijdms
 
Image Captioning Generator using Deep Machine Learning
Image Captioning Generator using Deep Machine LearningImage Captioning Generator using Deep Machine Learning
Image Captioning Generator using Deep Machine Learningijtsrd
 
IRJET- American Sign Language Classification
IRJET- American Sign Language ClassificationIRJET- American Sign Language Classification
IRJET- American Sign Language ClassificationIRJET Journal
 
IRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten CharactersIRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten CharactersIRJET Journal
 
IRJET - Hand Gesture Recognition to Perform System Operations
IRJET -  	  Hand Gesture Recognition to Perform System OperationsIRJET -  	  Hand Gesture Recognition to Perform System Operations
IRJET - Hand Gesture Recognition to Perform System OperationsIRJET Journal
 
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...ijaia
 
AGE AND GENDER DETECTION.pptx
AGE AND GENDER DETECTION.pptxAGE AND GENDER DETECTION.pptx
AGE AND GENDER DETECTION.pptxssuserb4a9ba
 
ageandgenderdetection-220802061020-9ee5a2cd.pptx
ageandgenderdetection-220802061020-9ee5a2cd.pptxageandgenderdetection-220802061020-9ee5a2cd.pptx
ageandgenderdetection-220802061020-9ee5a2cd.pptxdhaliwalharsh055
 
Image Compression Using Hybrid Svd Wdr And Svd Aswdr
Image Compression Using Hybrid Svd Wdr And Svd AswdrImage Compression Using Hybrid Svd Wdr And Svd Aswdr
Image Compression Using Hybrid Svd Wdr And Svd AswdrMelanie Smith
 
FACIAL EMOTION RECOGNITION SYSTEM
FACIAL EMOTION RECOGNITION SYSTEMFACIAL EMOTION RECOGNITION SYSTEM
FACIAL EMOTION RECOGNITION SYSTEMIRJET Journal
 
Handwriting identification using deep convolutional neural network method
Handwriting identification using deep convolutional neural network methodHandwriting identification using deep convolutional neural network method
Handwriting identification using deep convolutional neural network methodTELKOMNIKA JOURNAL
 
Devanagari Digit and Character Recognition Using Convolutional Neural Network
Devanagari Digit and Character Recognition Using Convolutional Neural NetworkDevanagari Digit and Character Recognition Using Convolutional Neural Network
Devanagari Digit and Character Recognition Using Convolutional Neural NetworkIRJET Journal
 
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION cscpconf
 
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLE
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLESIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLE
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLEIRJET Journal
 

Similar to Bangla Handwritten Digit Recognition Report.pdf (20)

One shot learning
One shot learningOne shot learning
One shot learning
 
IRJET- Emotion Classification of Human Face Expressions using Transfer Le...
IRJET-  	  Emotion Classification of Human Face Expressions using Transfer Le...IRJET-  	  Emotion Classification of Human Face Expressions using Transfer Le...
IRJET- Emotion Classification of Human Face Expressions using Transfer Le...
 
Performance Comparison between Pytorch and Mindspore
Performance Comparison between Pytorch and MindsporePerformance Comparison between Pytorch and Mindspore
Performance Comparison between Pytorch and Mindspore
 
Image Captioning Generator using Deep Machine Learning
Image Captioning Generator using Deep Machine LearningImage Captioning Generator using Deep Machine Learning
Image Captioning Generator using Deep Machine Learning
 
IRJET- American Sign Language Classification
IRJET- American Sign Language ClassificationIRJET- American Sign Language Classification
IRJET- American Sign Language Classification
 
Real-Time Face Tracking with GPU Acceleration
Real-Time Face Tracking with GPU AccelerationReal-Time Face Tracking with GPU Acceleration
Real-Time Face Tracking with GPU Acceleration
 
CUDA Accelerated Face Recognition
CUDA Accelerated Face RecognitionCUDA Accelerated Face Recognition
CUDA Accelerated Face Recognition
 
IRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten CharactersIRJET- Intelligent Character Recognition of Handwritten Characters
IRJET- Intelligent Character Recognition of Handwritten Characters
 
IRJET - Hand Gesture Recognition to Perform System Operations
IRJET -  	  Hand Gesture Recognition to Perform System OperationsIRJET -  	  Hand Gesture Recognition to Perform System Operations
IRJET - Hand Gesture Recognition to Perform System Operations
 
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...
 
AGE AND GENDER DETECTION.pptx
AGE AND GENDER DETECTION.pptxAGE AND GENDER DETECTION.pptx
AGE AND GENDER DETECTION.pptx
 
ageandgenderdetection-220802061020-9ee5a2cd.pptx
ageandgenderdetection-220802061020-9ee5a2cd.pptxageandgenderdetection-220802061020-9ee5a2cd.pptx
ageandgenderdetection-220802061020-9ee5a2cd.pptx
 
Image Compression Using Hybrid Svd Wdr And Svd Aswdr
Image Compression Using Hybrid Svd Wdr And Svd AswdrImage Compression Using Hybrid Svd Wdr And Svd Aswdr
Image Compression Using Hybrid Svd Wdr And Svd Aswdr
 
FACIAL EMOTION RECOGNITION SYSTEM
FACIAL EMOTION RECOGNITION SYSTEMFACIAL EMOTION RECOGNITION SYSTEM
FACIAL EMOTION RECOGNITION SYSTEM
 
Handwriting identification using deep convolutional neural network method
Handwriting identification using deep convolutional neural network methodHandwriting identification using deep convolutional neural network method
Handwriting identification using deep convolutional neural network method
 
Devanagari Digit and Character Recognition Using Convolutional Neural Network
Devanagari Digit and Character Recognition Using Convolutional Neural NetworkDevanagari Digit and Character Recognition Using Convolutional Neural Network
Devanagari Digit and Character Recognition Using Convolutional Neural Network
 
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION
CNN FEATURES ARE ALSO GREAT AT UNSUPERVISED CLASSIFICATION
 
Ag044216224
Ag044216224Ag044216224
Ag044216224
 
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLE
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLESIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLE
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLE
 
InternshipReport
InternshipReportInternshipReport
InternshipReport
 

More from KhondokerAbuNaim

A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...KhondokerAbuNaim
 
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...KhondokerAbuNaim
 
Student_results_management_system proposel.pdf
Student_results_management_system proposel.pdfStudent_results_management_system proposel.pdf
Student_results_management_system proposel.pdfKhondokerAbuNaim
 
Bangla Hand Written Digit Recognition presentation slide .pptx
Bangla Hand Written Digit Recognition presentation slide .pptxBangla Hand Written Digit Recognition presentation slide .pptx
Bangla Hand Written Digit Recognition presentation slide .pptxKhondokerAbuNaim
 
Bangla_handwritten_dig1] final proposal .pdf
Bangla_handwritten_dig1] final proposal .pdfBangla_handwritten_dig1] final proposal .pdf
Bangla_handwritten_dig1] final proposal .pdfKhondokerAbuNaim
 
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...Project t Proposal Bangla alphabet handwritten recognition using deep learnin...
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...KhondokerAbuNaim
 
BAUST Lab report cover page .docx
BAUST Lab report cover page .docxBAUST Lab report cover page .docx
BAUST Lab report cover page .docxKhondokerAbuNaim
 
BAUST Assignment cover page .docx
BAUST Assignment cover page .docxBAUST Assignment cover page .docx
BAUST Assignment cover page .docxKhondokerAbuNaim
 
Online Voting System Project Proposal ( Presentation Slide).pptx
Online Voting System Project Proposal ( Presentation Slide).pptxOnline Voting System Project Proposal ( Presentation Slide).pptx
Online Voting System Project Proposal ( Presentation Slide).pptxKhondokerAbuNaim
 
Online Voting System project proposal report.doc
Online Voting System project proposal report.docOnline Voting System project proposal report.doc
Online Voting System project proposal report.docKhondokerAbuNaim
 

More from KhondokerAbuNaim (11)

A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...
 
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...
A Comparative Analysis of Deep Learning Modelsfor Flower Recognition and Heal...
 
Student_results_management_system proposel.pdf
Student_results_management_system proposel.pdfStudent_results_management_system proposel.pdf
Student_results_management_system proposel.pdf
 
Bangla Hand Written Digit Recognition presentation slide .pptx
Bangla Hand Written Digit Recognition presentation slide .pptxBangla Hand Written Digit Recognition presentation slide .pptx
Bangla Hand Written Digit Recognition presentation slide .pptx
 
Bangla_handwritten_dig1] final proposal .pdf
Bangla_handwritten_dig1] final proposal .pdfBangla_handwritten_dig1] final proposal .pdf
Bangla_handwritten_dig1] final proposal .pdf
 
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...Project t Proposal Bangla alphabet handwritten recognition using deep learnin...
Project t Proposal Bangla alphabet handwritten recognition using deep learnin...
 
Quiz-Gam_1-converted.pptx
Quiz-Gam_1-converted.pptxQuiz-Gam_1-converted.pptx
Quiz-Gam_1-converted.pptx
 
BAUST Lab report cover page .docx
BAUST Lab report cover page .docxBAUST Lab report cover page .docx
BAUST Lab report cover page .docx
 
BAUST Assignment cover page .docx
BAUST Assignment cover page .docxBAUST Assignment cover page .docx
BAUST Assignment cover page .docx
 
Online Voting System Project Proposal ( Presentation Slide).pptx
Online Voting System Project Proposal ( Presentation Slide).pptxOnline Voting System Project Proposal ( Presentation Slide).pptx
Online Voting System Project Proposal ( Presentation Slide).pptx
 
Online Voting System project proposal report.doc
Online Voting System project proposal report.docOnline Voting System project proposal report.doc
Online Voting System project proposal report.doc
 

Recently uploaded

Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxabhijeetpadhi001
 

Bangla Handwritten Digit Recognition Report.pdf

Bangla Handwritten Digit Recognition Using Deep Learning

Khondoker Abu Naim (200101103) and Abu Rayhan Mouno (180201118)
Course Code: CSE 4132
Course Title: Artificial Neural Networks and Fuzzy Systems Sessional
Semester: Winter 2023
Department of Computer Science and Engineering, Bangladesh Army University of Science and Technology (BAUST)

Abstract

Handwritten digit recognition has always been difficult because of the wide range of shapes, sizes, and writing styles. The rewarded method is used for all of the recognition algorithms built in this thesis. Bangla-digit is the largest dataset; it is a collection of Bengali handwritten digits with over 3000 images. The dataset is, however, difficult to work with due to its complexity. The aim of this project is to preprocess the images so that they can be used to train deep learning models with high accuracy. Various preprocessing techniques are developed for image processing, and a deep convolutional neural network (CNN) serves as the classification algorithm. The performance of this process is systematically evaluated on the bangla-digit image dataset. Finally, 93% accuracy is obtained on the bangla-digit dataset.

1. Introduction

Bangla handwritten digit recognition is a classical problem in the field of computer vision. Such a system has many practical applications, including OCR, postal code recognition, license plate recognition, and bank check recognition. Recognizing Bangla digits in documents is becoming more important. There are 10 unique Bangla digits, so the recognition task is to classify 10 different classes. The critical part of handwritten digit recognition is recognizing each individual handwritten digit, because every person has their own writing style. Our contribution, however, addresses a more challenging task.
The challenge is to achieve robust performance and high accuracy on the large, unbiased, unprocessed, and highly augmented “bangla-digit” dataset. The dataset combines ten class datasets that were gathered from different sources and at different times, and it contains blurring, noise, rotation, translation, shear, zooming, height/width shift, brightness, contrast, occlusions, and superimposition. We have not processed every kind of augmentation in this dataset; we mainly processed blurred and noisy images. The processed images are then classified by a deep convolutional neural network (CNN).

2. Literature Review

Researchers have used the local binary pattern to identify Bangla digits, achieving higher accuracy. The convolutional neural network, a deep learning approach, has been used to improve supervised learning and performance. Handwritten digit recognition has been studied on EMNIST for both balanced and imbalanced datasets. The recent success of deep learning, especially of convolutional neural networks (CNNs) in computer vision, has encouraged many researchers to use CNNs to recognize handwritten characters and digits as a computer vision challenge. Handwritten Bangla digit recognition using classification combinations by the Dempster–Shafer (DS) technique has also been proposed.
They used the DS technique and an MLP classifier for classification on a training dataset of 6000 handwritten samples, together with threefold cross-validation. The same concept was used in a genetic algorithm-based handwritten digit recognition technique. Islam and his colleagues suggested Bayanno-Net, in which they used a CNN to recognize Bangla handwritten numerals. Another study offered a comparative analysis of existing classification algorithms for recognizing Bengali handwritten numerals. A model based on handwritten digits was proposed by Shawon et al.

Gulia et al. presented an efficient automated design to generate UML diagrams from natural language specifications. The main aim of their paper is the production of activity diagrams and sequence diagrams from natural language specifications, together with a conceptual architecture for creating a human computing environment that involves deep learning algorithms ("Intelligent Human Computing: A Deep Learning-Based Approach"). Saifur Rahman, Partha Chakraborty et al. classified Bangla documents using a deep recurrent neural network with BiLSTM: the collected Bangla text documents were processed, the model architecture was designed, and the training data were fitted to the model. Md Mahbubur Rahman and colleagues classified Bangla documents using transformer-based deep learning models, applying the BERT and ELECTRA models to Bangla text classification. The main goal of their paper is to identify fake profiles and eliminate fake accounts on online social networks.

3. Dataset Description

For recognition and generation, the “bangla-digit” dataset is used, which contains 2746 handwritten Bangla digits. It is one of the biggest datasets of handwritten Bangla digits. The dataset is divided into 10 subfolders. The 2746 samples were collected from 30 persons, after which image augmentation was applied.
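A dataset laid out as one subfolder per class can be checked before training by counting the images in each folder. The following is a minimal sketch only: the folder names, the accepted file extensions, and the helper name `count_images_per_class` are assumptions for illustration and do not come from the report.

```python
from pathlib import Path

# Assumed image file types; the report only mentions conversion to JPEG.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Map each class subfolder name under `root` to its number of image files."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

Calling `count_images_per_class("path/to/dataset")` (a hypothetical path) returns a dictionary whose values can be compared with the per-class sample counts reported for the dataset.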
The images are grayscale and their dimension is 124 × 124. Sample images of the “bangla-digit” dataset are shown in Fig-1.

Fig-1: Sample images from the bangla-digit dataset.

Class   Number of Samples
0       274
1       276
2       276
3       276
4       276
5       275
6       275
7       276
8       275
9       267

Table-1: Summary of the dataset.

4. Methodology

The experiments were conducted on two cloud-based platforms, Google Colab and Kaggle. In both environments we set up the required software packages and libraries, including TensorFlow, Keras, and NumPy. We used TensorFlow version 2.4.0 and Keras version 2.4.3, which were the latest stable versions at the time of the experiments.

Three main steps are applied to recognize handwritten digits. Two of them, feature extraction and classification, are combined into one step, so our digit recognition has two main steps: image preprocessing and a deep CNN model. The models included are EfficientNetB0 and MobileNet, chosen for their proven performance in image classification tasks. The steps are shown in Fig-2.

Input Image → Image Preprocessing → Feature Extraction → Image Classification → Detected Digit

Fig-2: Block diagram of the proposed Bangla digit recognition model.

Baseline model: A baseline model is a simple model that is used as a starting point for developing more complex models. It is typically the simplest model that can be built for a given problem, and it serves as a reference point against which other models can be compared. Its purpose is to establish a minimum level of performance that more complex models must exceed in order to justify their additional complexity. In machine learning, for example, a baseline model might be a simple linear regression model that uses only one or two features to make predictions. It establishes a baseline level of prediction accuracy, and more complex models can be compared against it to determine whether they are worth the additional computational cost and complexity.

Transfer learning: Transfer learning was used to train the pre-trained models on the facial emotion recognition dataset. In this approach, the pre-trained models were used as feature extractors, and the final layer(s) of the models were replaced with new layer(s) that were trained on the facial emotion recognition dataset.

Dataset preprocessing: The images were first collected and then cropped. The collected dataset was preprocessed by resizing the images to a standard size of 124x124 pixels and then converting them to JPEG format.

Data Augmentation: Data augmentation was performed using the ImageDataGenerator class from the keras.preprocessing.image module. The following transformations were applied to the images:
• Rotation: Randomly rotate the image by an angle within a range of 10 degrees.
• Width and Height Shift: Randomly shift the image horizontally and vertically by a fraction of the image size, with ranges 0.1 and 0.2.
• Zoom: Randomly zoom into the image by a factor in the range 0.0.
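To make the augmentation ranges above concrete, the sketch below samples transform parameters the way Keras's ImageDataGenerator interprets such settings: rotation in degrees drawn from [-r, r], shifts below 1 treated as fractions of the image size, and a zoom range z giving a factor in [1 - z, 1 + z]. It uses only the standard library and does not transform any image. Assigning 0.1 to the width shift and 0.2 to the height shift is an assumption, since the report lists the two values without saying which is which; the function name is illustrative.

```python
import random

IMAGE_SIZE = 124          # pixels, as stated in the report
ROTATION_RANGE = 10       # degrees
WIDTH_SHIFT_RANGE = 0.1   # fraction of width (assumed mapping of "0.1, 0.2")
HEIGHT_SHIFT_RANGE = 0.2  # fraction of height (assumed mapping of "0.1, 0.2")
ZOOM_RANGE = 0.0          # zoom factor is drawn from [1 - z, 1 + z]


def sample_augmentation(rng):
    """Draw one random set of transform parameters within the stated ranges."""
    return {
        "rotation_deg": rng.uniform(-ROTATION_RANGE, ROTATION_RANGE),
        "shift_x_px": rng.uniform(-WIDTH_SHIFT_RANGE, WIDTH_SHIFT_RANGE) * IMAGE_SIZE,
        "shift_y_px": rng.uniform(-HEIGHT_SHIFT_RANGE, HEIGHT_SHIFT_RANGE) * IMAGE_SIZE,
        "zoom": rng.uniform(1 - ZOOM_RANGE, 1 + ZOOM_RANGE),
    }


rng = random.Random(0)  # fixed seed so the draws are repeatable
for _ in range(3):
    print(sample_augmentation(rng))
```

Under these assumptions a 124-pixel image moves by at most about 12.4 pixels horizontally and 24.8 pixels vertically, and a zoom range of 0.0 leaves the scale unchanged.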
Hyperparameters: Several hyperparameters were used for model training and optimization:
• Batch Size: The number of samples processed in a single batch. A batch size of 32 was used.
• Size: The size of the input image after resizing. In the code, the size is set to 124x124 pixels.
• Epochs: The number of times the entire dataset is passed through the model during training. The number of epochs was set to 16 and varies during fine-tuning.
• Learning Rate: The step size at which the optimizer adjusts the model parameters during training. The Adam optimizer with a learning rate of 0.001 was used; the rate also varies across models and variations during training.
• Optimizer: The optimization algorithm used to adjust the weights of the model during training. The Adam optimizer was used.
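The batch size and epoch count translate into a number of gradient updates per training run. The sketch below works this out from the per-class counts in Table-1. It assumes that 10% of the images are held out for testing (the split reported in the results section), that the held-out count is rounded down, and that augmentation is applied on the fly without adding extra images per epoch.

```python
import math

# Per-class sample counts from Table-1 (classes 0 to 9).
CLASS_COUNTS = [274, 276, 276, 276, 276, 275, 275, 276, 275, 267]
BATCH_SIZE = 32
EPOCHS = 16
TEST_PERCENT = 10  # assumption: 10% of the images are held out for testing

total = sum(CLASS_COUNTS)
test_size = total * TEST_PERCENT // 100               # rounded down (assumption)
train_size = total - test_size
steps_per_epoch = math.ceil(train_size / BATCH_SIZE)  # last batch may be partial
total_steps = steps_per_epoch * EPOCHS

print(total, train_size, test_size, steps_per_epoch, total_steps)
# prints: 2746 2472 274 78 1248
```

The class counts sum to 2746, which matches the dataset size given in the dataset description.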
5. Result and Analysis

The train and test split ratio for our dataset is 10%. Accuracy is calculated for every training dataset. Three outcomes are obtained after 16 epochs on our own dataset; the results are given in Table-2. The processing time of training was also measured: eight seconds are required per epoch.

Table-2: Result comparison in different modes.

Fig-3: Model comparison.

EfficientNetB0: The EfficientNetB0 model achieved the highest accuracy in transfer learning, with a score of 95%. The baseline model achieved 32% accuracy, which is relatively low. Fine-tuning, however, improved the accuracy to 99.4%, which is quite good.

MobileNet: The MobileNet model did not perform as well as the EfficientNetB0 model. Transfer learning achieved a moderate accuracy of 92%, while the baseline model achieved only 9% accuracy, which is relatively low. Fine-tuning improved the accuracy to 94%.

Result Analysis of Model Comparison (values plotted in Fig-3, accuracy in %):

Model       Baseline   Fine-tuning   Transfer learning
EffNetB0    31         99.4          95
MobileNet   9          94            92
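The reported accuracies can be put side by side with a few lines of code. The sketch below uses the figures quoted in the text of this section (the chart gives 31 rather than 32 for the EfficientNetB0 baseline); the dictionary layout and function names are illustrative assumptions.

```python
# Accuracies (%) as quoted in the text of this section.
ACCURACY = {
    "EfficientNetB0": {"baseline": 32.0, "transfer": 95.0, "fine_tuning": 99.4},
    "MobileNet": {"baseline": 9.0, "transfer": 92.0, "fine_tuning": 94.0},
}


def best_configuration(results):
    """Return (model, mode, accuracy) for the highest reported accuracy."""
    return max(
        (
            (model, mode, acc)
            for model, modes in results.items()
            for mode, acc in modes.items()
        ),
        key=lambda item: item[2],
    )


def gain_over_baseline(results, mode):
    """Percentage-point gain of `mode` over the baseline, per model."""
    return {m: round(v[mode] - v["baseline"], 1) for m, v in results.items()}


print(best_configuration(ACCURACY))
print(gain_over_baseline(ACCURACY, "fine_tuning"))
```

On these figures, fine-tuned EfficientNetB0 is the best configuration, and MobileNet shows the larger gain over its baseline (85 percentage points against 67.4).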
Fig-4: Graph of model accuracy and loss.

6. Stakeholders

The stakeholders for Bangla handwriting recognition could include:
• Researchers and developers working in the field of Bangla handwriting recognition technology.
• Users of the technology, such as businesses or organizations that use Bangla handwriting recognition for data entry or analysis.
• Government bodies and policy-makers who are interested in promoting the development and use of Bangla handwriting recognition technology.
• Educational institutions and instructors who teach Bangla language and writing, since handwriting recognition technology could be used in language learning.
• Bangla language and culture advocates, who may be interested in preserving and promoting the use of the Bangla language in the digital age.

7. Issues Encountered

• Large image size: The images in the dataset took up too much storage space, which made them difficult to work with. Preprocessing techniques therefore had to be used to reduce the image size to a more manageable level.
• Face data augmentation problem.
• Variability in handwriting: Bengali script has many complex characters, and each character can be written in several different ways. This variability makes it difficult to develop a reliable recognition system that accurately recognizes all variations of each character.

8. Conclusion, Limitations and Future Recommendations

Despite the large variation in the test set, our ensemble of residual networks based approach with the Xception architecture performed really well, even though the final model has a comparatively low number of parameters and the models were trained with limited resources. The use of residual networks and batch normalization had a prominent impact on the overall performance of the classifier model. The number of parameters can be reduced further with a more optimized parameter selection, while introducing more augmentation can further improve the overall performance of the model.

Limitations:
i. Sample size: The sample size used in this research was limited, which may have affected the statistical power and generalizability of the findings.
ii. Participant Expression Accuracy: The study was limited by the participants' inability to accurately convey their facial expressions, which may have impacted the reliability and validity of the data collected.
iii. Data quality: The quality of the data used in this research depended on the accuracy and completeness of the responses provided by the participants, which may have been affected by recall bias or response.

Here are some potential future recommendations:
i. Explore additional pre-trained models: While the current implementation uses a variety of pre-trained models, many other options are available. Exploring additional models could help improve the accuracy of the facial emotion recognition system.
ii. Improve dataset quality: Collecting a high-quality dataset is crucial for training accurate machine learning models. To improve the performance of the current system, more high-quality data could be collected, particularly for underrepresented classes.

References

[1] Chakraborty, Partha, Syeda Surma Jahanapi, and Tanupriya Choudhury. "Bangla Handwritten Digit Recognition." Cyber Intelligence and Information Retrieval: Proceedings of CIIR 2021. Springer Singapore, 2022.
[2] Hossain, M. Zahid, M. Ashraful Amin, and Hong Yan. "Rapid feature extraction for Bangla handwritten digit recognition." 2011 International Conference on Machine Learning and Cybernetics. Vol. 4. IEEE, 2011.
[3] Fahim Sikder, Md. "Bangla handwritten digit recognition and generation." Proceedings of International Joint Conference on Computational Intelligence: IJCCI 2018. Springer Singapore, 2020.
[4] Hoq, Md Nazmul, et al. "A comparative overview of classification algorithm for Bangla handwritten digit recognition." Proceedings of International Joint Conference on Computational Intelligence: IJCCI 2018. Springer Singapore, 2020.
[5] Rabby, AKM Shahariar Azad, et al. "Bangla handwritten digit recognition using convolutional neural network." Emerging Technologies in Data Mining and Information Security: Proceedings of IEMIS 2018, Volume 1. Springer Singapore, 2019.
Appendix

Attainment of Complex Engineering Problem (CP)

S.L.  CP No.                                                             Attainment  Remarks
1.    P1: Depth of Knowledge Required
        K3 (Engineering Fundamentals):
        K4 (Engineering Specialization):
        K5 (Design):
        K6 (Technology):
        K8 (Research):
2.    P2: Range of Conflicting Requirements
3.    P3: Depth of Analysis Required
4.    P4: Familiarity of Issues
5.    P5: Extent of Applicable Codes
6.    P6: Extent of Stakeholder Involvement and Conflicting Requirements
7.    P7: Interdependence

Mapping of Complex Engineering Activities (CA)

S.L.  CA No.                                                             Attainment  Remarks
1.    A1: Range of resources
2.    A2: Level of interaction
3.    A3: Innovation
4.    A4: Consequences for Society and the Environment
5.    A5: Familiarity

Code: https://www.kaggle.com/code/khondokerabunaim/mobilenetv2transfer