The document summarizes a research article that proposes using a convolutional neural network (CNN) to improve classification of Arabic alphabet characters. The researchers developed a CNN model and tested various optimization algorithms to achieve high recognition accuracy on two Arabic handwritten character datasets. Their best-performing model achieved 98.48% accuracy on one dataset and 91.24% accuracy on the other, outperforming other state-of-the-art models for Arabic character recognition. The researchers augmented the datasets with various techniques to improve the CNN model's robustness when faced with limited Arabic handwriting data.
Efficient feature descriptor selection for improved Arabic handwritten words ... (IJECEIAES)
This document presents a new approach for recognizing Arabic handwritten words that combines three image descriptors for feature extraction: histogram of oriented gradients (HOG), Gabor filter, and local binary pattern (LBP). The approach applies preprocessing techniques to standardize images from the IFN/ENIT dataset before extracting features using the three descriptors. Three k-nearest neighbor models are trained on the extracted features and their predictions are combined using majority voting. Testing achieves a high recognition rate of 99.88%, outperforming previous methods on this dataset. The results demonstrate the effectiveness of combining multiple descriptors and classifiers for Arabic handwriting recognition.
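The majority-voting fusion step described above can be sketched in a few lines (a minimal pure-Python sketch; the function and variable names are illustrative, not from the paper, and the real system would feed it the predictions of the three trained k-NN models):

```python
from collections import Counter

def majority_vote(*prediction_lists):
    """Combine per-sample predictions from several classifiers by majority vote."""
    fused = []
    for votes in zip(*prediction_lists):
        # most_common(1) returns the label with the highest vote count;
        # ties are broken by first-seen order.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused

# Hypothetical predictions from the HOG-, Gabor-, and LBP-based k-NN models
preds_hog   = ["w1", "w2", "w2"]
preds_gabor = ["w1", "w2", "w3"]
preds_lbp   = ["w4", "w2", "w3"]
print(majority_vote(preds_hog, preds_gabor, preds_lbp))  # ['w1', 'w2', 'w3']
```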
Arabic handwriting recognition system using convolutional neural.pdf (Nesrine Wagaa)
The document presents a new dataset of Arabic letters written by children aged 7-12, called Hijja, containing 47,434 characters written by 591 participants. It also proposes a convolutional neural network model for offline Arabic handwriting recognition. The model achieves accuracies of 97% on an existing dataset and 88% on the new Hijja dataset, outperforming other models. The document provides background on convolutional neural networks and reviews related work on Arabic handwriting recognition using various methods.
An improved Arabic text classification method using word embedding (IJECEIAES)
Feature selection (FS) is a widely used method for removing redundant or irrelevant features to improve classification accuracy and decrease a model's computational cost. In this paper, we present an improved method (referred to hereafter as RARF) for Arabic text classification (ATC) that employs term frequency-inverse document frequency (TF-IDF) and the Word2Vec embedding technique to identify words that have a particular semantic relationship. In addition, we compare our method with four benchmark FS methods, namely principal component analysis (PCA), linear discriminant analysis (LDA), chi-square, and mutual information (MI). Support vector machine (SVM), k-nearest neighbors (K-NN), and naive Bayes (NB) are the three machine learning algorithms used in this work, and two different Arabic datasets are utilized for a comparative analysis. This paper also evaluates the efficiency of our method for ATC on the basis of performance metrics, viz. accuracy, precision, recall, and F-measure. Results revealed that the highest accuracy, 94.75%, was achieved by the SVM classifier on the Khaleej-2004 Arabic dataset, while the same classifier recorded 94.01% on the Watan-2004 Arabic dataset.
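The TF-IDF weighting that the method builds on can be illustrated with a minimal sketch (a toy implementation of the standard formula, not the paper's RARF code; production systems would normally use a library implementation such as scikit-learn's TfidfVectorizer):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    tf(t, d) = count of t in d / number of tokens in d
    idf(t)   = log(N / df(t)), where df(t) = number of docs containing t
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # each doc counts a term at most once
    weights = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        weights.append({t: (c / total) * math.log(n_docs / df[t])
                        for t, c in tf.items()})
    return weights

docs = [["news", "sports", "goal"], ["news", "economy"], ["economy", "market", "economy"]]
w = tf_idf(docs)
# "news" appears in 2 of 3 docs, so its idf is log(3/2);
# a term appearing in every document would get idf = log(1) = 0.
```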
Artificial neural networks have proved their efficiency in a large number of research domains. In this paper, we apply artificial neural networks to Arabic text for language modeling, text generation, and missing-text prediction. On the one hand, we adapt recurrent neural network architectures to model the Arabic language and generate correct Arabic sequences. On the other hand, convolutional neural networks are parameterized, based on specific features of Arabic, to predict missing text in Arabic documents. We demonstrate the power of our adapted models in generating and predicting correct Arabic text compared to the standard model. The models were trained and tested on well-known free Arabic datasets, with promising results and sufficient accuracy.
ARABIC ONLINE HANDWRITING RECOGNITION USING NEURAL NETWORK (ijaia)
This article presents the development of an Arabic online handwriting recognition system based on a neural network approach, which offers solutions for most of the difficulties of Arabic script recognition. The approach is tested on our collected databases and achieves high accuracy: 98.50% for characters and 96.90% for words.
An Ensemble Learning Approach Using Decision Fusion For The Recognition Of Ar... (Brittany Brown)
The document presents a new ensemble learning approach for recognizing Arabic handwritten characters that combines decision fusion of machine learning classifiers with feature extraction. It achieves state-of-the-art accuracy of 97.97% on the IFHCDB dataset and 92.91% on the AIA9K dataset, outperforming individual machine learning classifiers and deep learning networks. The approach first applies a grayscale skeletonization technique to extract structural and statistical features from images. It then builds a model that fuses the results of support vector machines, K-nearest neighbors, and random forest classifiers to take advantage of each algorithm.
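The decision-fusion idea of combining the outputs of several classifiers so their strengths complement each other can be sketched as weighted score averaging (an illustrative sketch, not the paper's exact fusion rule; the class labels and scores below are hypothetical):

```python
def fuse_scores(score_dicts, weights=None):
    """Fuse per-class scores from several classifiers by (weighted) averaging.

    score_dicts: list of {class_label: score} dicts, one per classifier.
    Returns the label with the highest combined score.
    """
    weights = weights or [1.0] * len(score_dicts)
    totals = {}
    for w, scores in zip(weights, score_dicts):
        for label, s in scores.items():
            totals[label] = totals.get(label, 0.0) + w * s
    return max(totals, key=totals.get)

# Hypothetical per-class confidences for one character image
svm_scores = {"alif": 0.6, "ba": 0.4}
knn_scores = {"alif": 0.3, "ba": 0.7}
rf_scores  = {"alif": 0.5, "ba": 0.5}
print(fuse_scores([svm_scores, knn_scores, rf_scores]))  # ba (1.6 vs 1.4)
```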
A study of feature extraction for Arabic calligraphy characters recognition (IJECEIAES)
Optical character recognition (OCR) is one of the widely used pattern recognition systems. However, the research on ancient Arabic writing recognition has suffered from a lack of interest for decades, despite the availability of thousands of historical documents. One of the reasons for this lack of interest is the absence of a standard dataset, which is fundamental for building and evaluating an OCR system. In 2022, we published a database of ancient Arabic words as the only public dataset of characters written in Al-Mojawhar Moroccan calligraphy. Therefore, such a database needs to be studied and evaluated. In this paper, we explored the proposed database and investigated the recognition of Al-Mojawhar Arabic characters. We studied feature extraction by using the most popular descriptors used in Arabic OCR. The studied descriptors were associated with different machine learning classifiers to build recognition models and verify their performance. In order to compare the learned and handcrafted features on the proposed dataset, we proposed a deep convolutional neural network for character recognition. Regarding the complexity of the character shapes, the results obtained were very promising, especially by using the convolutional neural network model, which gave the highest accuracy score.
Arabic Handwritten Text Recognition and Writer Identification (Mustafa Salam)
A Ph.D. seminar that explains a proposed system for recognizing Arabic handwritten text and identifying the writer. The proposed steps and the obtained results are described in detail.
The effect of training set size in authorship attribution: application on sho... (IJECEIAES)
Authorship attribution (AA) is a subfield of linguistic analysis that aims to identify the original author among a set of candidate authors. Many methods and models have been developed for other languages, but related work on Arabic is limited. Moreover, the impact of short word length and training set size is not well addressed; to the best of our knowledge, no published work, even in other languages, investigates this direction. We therefore propose to investigate this effect, taking different stylometric combinations into account. The Mahalanobis distance (MD), linear regression (LR), and multilayer perceptron (MP) are selected as AA classifiers. During the experiments, the training set size is increased and the accuracy of each classifier is recorded. The results are quite interesting and show different classifier behaviours. Combining word-based stylometric features with n-grams provides the best accuracy, averaging 93%.
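One common stylometric feature family used in such studies, character n-gram frequency profiles, can be computed with a short sketch (illustrative only; the paper's exact feature set and preprocessing are not reproduced here):

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Relative-frequency profile of character n-grams, a common stylometric feature."""
    text = text.replace(" ", "_")  # keep word boundaries visible to the n-grams
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    return {g: c / total for g, c in Counter(grams).items()}

# Profiles of two candidate authors' texts can then be compared with any
# distance measure (e.g. Mahalanobis, as in the paper) over a shared vocabulary.
profile = char_ngrams("in the name of", n=3)
```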
ATAR: Attention-based LSTM for Arabizi transliteration (IJECEIAES)
A non-standard romanization of Arabic script, known as Arabizi, is widely used in Arabic online and SMS/chat communities. However, since state-of-the-art tools and applications for Arabic NLP expect Arabic written in Arabic script, handling content written in Arabizi requires special attention, either by building customized tools or by transliterating it into Arabic script. The latter approach is the more common one, and this work presents two significant contributions in this direction. The first is the collection and public release of the first large-scale Arabizi-to-Arabic-script parallel corpus, focusing on the Jordanian dialect and consisting of more than 25k pairs carefully created and inspected by native speakers to ensure the highest quality. Second, we present ATAR, an attention-based LSTM model for Arabizi transliteration. Training and testing this model on our dataset yields an impressive accuracy (79%) and BLEU score (88.49).
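To see why a learned model such as ATAR is needed, consider a naive fixed-table baseline (a toy sketch, not the ATAR model; the mapping table is illustrative and deliberately incomplete):

```python
# Toy character-mapping baseline for Arabizi -> Arabic script.
# Seq2seq models like ATAR learn context-dependent mappings,
# which a fixed table cannot capture.
ARABIZI_MAP = {
    "3": "\u0639",  # ain
    "7": "\u062D",  # haa
    "b": "\u0628", "t": "\u062A", "s": "\u0633",
    "m": "\u0645", "r": "\u0631", "a": "\u0627", "l": "\u0644",
}

def transliterate(word):
    """Map each Arabizi character independently; unknown characters pass through."""
    return "".join(ARABIZI_MAP.get(ch, ch) for ch in word.lower())

print(transliterate("mar7aba"))  # a rough, letter-by-letter rendering
```

Because Arabizi often spells out short vowels that Arabic script omits, a fixed table tends to produce misspellings; this is exactly the kind of context dependence that motivates an attention-based sequence model.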
The document proposes an automatic Arabic grammar error correction model based on expectation-maximization routing and bidirectional agreement between left-to-right and right-to-left models. It generates synthetic training data using back-translation and direct error injection. The model uses capsule networks and dynamic layer aggregation to efficiently combine information across layers. It also introduces a regularization term to minimize divergence between the two models and resolve the exposure bias problem during training. Experiments on two benchmarks show the proposed model achieves state-of-the-art performance for Arabic grammar error correction.
Writer Identification via CNN Features and SVM (IRJET Journal)
This document summarizes a research paper that proposes a model for identifying the author of Arabic handwritten documents based on their handwriting style. The model uses a convolutional neural network to extract features from images of handwritten text that have been augmented through data preprocessing techniques. A support vector machine is then used to classify the extracted features and identify the writer. The proposed method was tested on a dataset of 202 Arabic writers, achieving a classification accuracy of 97.20%.
Design and Development of a 2D-Convolution CNN model for Recognition of Handw... (CSCJournals)
Owing to the innumerable appearances caused by different writers, writing styles, technical environments, and noise, handwritten character recognition has always been one of the most challenging tasks in pattern recognition. The emergence of deep learning has provided a new direction to break the limits of decades-old traditional methods. Many scripts in the world are used by millions of people, and handwritten character recognition studies for several of these scripts, using different hand-crafted feature sets, are found in the literature. Feature-based approaches derive important properties from the test patterns and employ them in a more sophisticated classification model. Feature extraction using Zernike moment and polar harmonic transformation techniques was also performed, achieving moderate classification accuracy. The problems faced with these techniques led us to a CNN-based recognition approach, which learns the feature vector from the training character image samples without any hand-crafting. This paper presents a deep learning paradigm using a convolutional neural network (CNN) implemented for handwritten Gurumukhi and Devanagari character recognition (HGDCR). In the present experiment, a 34-layer CNN was trained on a GPU (graphics processing unit) machine for a 35-class self-generated handwritten Gurumukhi dataset and a 60-class (50 alphabet and 10 digit) handwritten Devanagari character dataset. The experiment resulted in an average recognition accuracy of more than 92% on the handwritten Gurumukhi character dataset and 97.25% on the handwritten Devanagari character dataset. Training and classification through our network design also ran about 10 times faster than on a moderately fast CPU.
Hybrid convolutional neural networks-support vector machine classifier with d... (TELKOMNIKA JOURNAL)
This research paper explores hybrid models for Javanese character recognition using 15,600 characters gathered from digital and handwritten sources. The hybrid model combines the merits of deep learning, using convolutional neural networks (CNN) for feature extraction, with a machine learning classifier, the support vector machine (SVM). A dropout layer also manages overfitting and enhances training accuracy. For evaluation purposes, we compared three CNN architectures and the hybrid CNN-SVM models against multilayer perceptron (MLP) models with one and two hidden layers, on both classification accuracy and training time. The experimental outcomes showed that all CNN models outperform both MLP models. The highest testing accuracy for a basic CNN is 94.2%, obtained with CNN model 3; adding hidden layers to the MLP only slightly enhances accuracy. Furthermore, the hybrid model gained the highest accuracy of 98.35% on the testing data when combining CNN model 3 with the SVM classifier. We conclude that the hybrid CNN-SVM model enhances accuracy in Javanese character recognition.
Spoken language identification using i-vectors, x-vectors, PLDA and logistic ... (journalBEEI)
This document discusses spoken language identification using i-vectors and x-vectors for feature extraction, and PLDA and logistic regression for classification. It examines extracting features from Javanese, Sundanese, and Minangkabau languages, then classifying the languages using various parameters. The study finds that x-vector outperforms i-vector when using PLDA classification, except when using logistic regression, where i-vector performs better. It tunes parameters for i-vector UBM size, i-vector dimension, x-vector max frame size, and num repeats, reporting equal error rates to evaluate performance on test segments of 3, 10 and 30 seconds.
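The equal error rate (EER) used to evaluate these systems is the operating point where the false-rejection and false-acceptance rates coincide; its estimation from score lists can be sketched as follows (illustrative scores, standard library only):

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from target (genuine) and non-target (impostor) scores.

    Sweeps a threshold over all observed scores and returns the error rate at
    the point where false rejection and false acceptance are closest.
    """
    best = None
    for thr in sorted(set(genuine + impostor)):
        frr = sum(s < thr for s in genuine) / len(genuine)     # targets rejected
        far = sum(s >= thr for s in impostor) / len(impostor)  # non-targets accepted
        gap = abs(frr - far)
        if best is None or gap < best[0]:
            best = (gap, (frr + far) / 2)
    return best[1]

genuine  = [0.9, 0.8, 0.7, 0.6, 0.3]
impostor = [0.5, 0.4, 0.2, 0.1, 0.65]
print(equal_error_rate(genuine, impostor))  # 0.2
```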
Dialect classification using acoustic and linguistic features in Arabic speech (IAESIJAI)
Speech dialects refer to linguistic and pronunciation variations in the speech of the same language. Automatic dialect classification requires considerable acoustic and linguistic differences between the dialect categories of speech. This paper proposes a classification model composed of a combination of classifiers for Arabic dialects that utilizes both the acoustic and linguistic features of spontaneous speech. The acoustic classification comprises an ensemble of classifiers focusing on different frequency ranges within the short-term spectral features, as well as a classifier utilizing the i-vector, whilst the linguistic classifiers use features extracted by transformer models pre-trained on large Arabic text datasets. The proposed fusion of multiple classifiers achieves a classification accuracy of 82.44% on the identification task of five Arabic dialects. This is the highest accuracy reported on the dataset, despite the relative simplicity of the proposed model, and shows its applicability and relevance for dialect identification tasks.
Arabic handwritten digits recognition based on convolutional neural networks ... (nooriasukmaningtyas)
Handwritten digit recognition has attracted the attention of researchers in pattern recognition because of its importance in many real-life applications, such as reading bank checks and formal documents, and it has remained a continuous challenge in recent years. Researchers have created several recognition algorithms for different human languages, but the problem remains open for Arabic, despite its importance in the many Arab and Islamic countries whose people speak the language; there is still little work on recognizing its letters and digits. In this paper, a new method is proposed that uses a pre-trained convolutional neural network with a ResNet-34 model, an approach known as transfer learning, for recognizing Arabic digits with high accuracy. This work uses the well-known Arabic handwritten digits dataset MADBase, which contains 60,000 training and 10,000 testing samples that were converted to grayscale for convenient handling during training. The proposed method recorded the highest accuracy compared to previous methods: 99.6%.
IRJET- Review on Optical Character Recognition (IRJET Journal)
This document provides a review of optical character recognition (OCR) technologies. It discusses research that has been conducted on OCR for English, Arabic, and Devanagari characters. For English characters, various studies that used techniques like neural networks, support vector machines, and nearest-neighbor classification are summarized. For Arabic characters, research using neural networks, Haar-like feature extraction, and boosting classifiers achieved recognition rates from 87% to 99%. Studies on Devanagari characters employed techniques such as curvelet transforms, neural networks, k-means clustering, and support vector machines, and achieved recognition rates from 90% to 98.5%. In general, the document reviews past work on OCR and the techniques researchers have used to recognize characters in these scripts.
TUNING DARI SPEECH CLASSIFICATION EMPLOYING DEEP NEURAL NETWORKS (kevig)
Recently, many researchers have focused on building and improving speech recognition systems to facilitate and enhance human-computer interaction. Today, automatic speech recognition (ASR) systems have become important and common tools in applications from games to translation systems, robots, and so on. However, there is still a need for research on speech recognition systems for low-resource languages. This article deals with isolated-word recognition for the Dari language, using the Mel-frequency cepstral coefficients (MFCCs) feature extraction method and three different deep neural networks: a convolutional neural network (CNN), a recurrent neural network (RNN), and a multilayer perceptron (MLP). We evaluate our models on our isolated Dari words corpus, which consists of 1000 utterances of 20 short Dari terms. This study obtained an impressive average accuracy of 98.365%.
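The MFCC front end used here starts by spacing triangular filters evenly on the mel scale; the standard conversions can be sketched as follows (a generic sketch of the textbook formulas, not the authors' code):

```python
import math

def hz_to_mel(f):
    """Standard HTK-style mel-scale mapping used when building MFCC filter banks."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_centers(n_filters, f_min, f_max):
    """Center frequencies (Hz) of triangular mel filters, equally spaced in mel."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

# A typical 26-filter bank for 16 kHz audio (Nyquist = 8 kHz)
centers = mel_filter_centers(26, 0.0, 8000.0)
```

The mel spacing concentrates filters at low frequencies, mirroring the ear's resolution, which is why MFCCs work well for short spoken words like these.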
Arabic tweeps dialect prediction based on machine learning approach (IJECEIAES)
In this paper, we present our approach for profiling Arabic authors on Twitter, based on their tweets. We consider here the dialect of an Arabic author as an important trait to be predicted. For this purpose, many indicators, feature vectors and machine learning-based classifiers were implemented. The results of these classifiers were compared to find out the best dialect prediction model. The best dialect prediction model was obtained using random forest classifier with full forms and their stems as feature vector.
TUNING DARI SPEECH CLASSIFICATION EMPLOYING DEEP NEURAL NETWORKS (IJCI JOURNAL)
Recently, many researchers have focused on building and improving speech recognition systems to facilitate and enhance human-computer interaction. Today, automatic speech recognition (ASR) systems have become important and common tools in applications from games to translation systems, robots, and so on. However, there is still a need for research on speech recognition systems for low-resource languages. This article deals with isolated-word recognition for the Dari language, using the Mel-frequency cepstral coefficients (MFCCs) feature extraction method and three different deep neural networks, a convolutional neural network (CNN), a recurrent neural network (RNN), and a multilayer perceptron (MLP), plus two hybrid models of CNN and RNN. We evaluate our models on our isolated Dari words corpus, which consists of 1000 utterances of 20 short Dari terms. This study obtained an impressive average accuracy of 98.365%.
In this paper we present a new approach for a cursive Arabic text recognition system. The objective is to propose an analytical methodology for offline recognition of handwritten Arabic suitable for rapid implementation. The first part of the recognition system is a preprocessing phase that prepares the data, followed by the extraction of a set of simple statistical features by two methods: from a window sliding along the text line from right to left, and by the VH2D approach (which projects every character onto the abscissa, the ordinate, and the 45° and 135° diagonals). The resulting feature vectors are injected into hidden Markov models (HMMs), and the two HMMs are combined by a multi-stream approach.
A MULTI-STREAM HMM APPROACH TO OFFLINE HANDWRITTEN ARABIC WORD RECOGNITION (ijnlc)
This document presents a multi-stream HMM approach for offline handwritten Arabic word recognition. It extracts two sets of features from each word using a sliding window approach and VH2D projection approach. These features are input to separate HMM classifiers, and the outputs are combined in a multi-stream HMM to provide more reliable recognition. The system is evaluated on 200 words, achieving a recognition rate of 83.8% using the multi-stream approach compared to 78.2% and 76.6% for the individual classifiers.
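The VH2D projection features can be sketched for a toy binary glyph (an illustrative reconstruction of the projection idea described above, not the authors' implementation):

```python
def vh2d_projections(img):
    """VH2D-style projections of a binary character image: foreground-pixel
    counts along the vertical axis, the horizontal axis, and the 45/135 degree
    diagonals (indexed by x+y and x-y respectively)."""
    h, w = len(img), len(img[0])
    vert = [sum(img[y][x] for y in range(h)) for x in range(w)]   # per column
    horiz = [sum(row) for row in img]                             # per row
    diag45 = [0] * (h + w - 1)
    diag135 = [0] * (h + w - 1)
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                diag45[x + y] += 1
                diag135[x - y + h - 1] += 1
    return vert, horiz, diag45, diag135

# A tiny plus-shaped glyph as a stand-in for a normalized character image
glyph = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0]]
v, hz, d45, d135 = vh2d_projections(glyph)
```

Concatenating the four projection vectors gives one of the two feature streams fed to the HMMs; the other comes from the sliding window.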
The document describes a two-layer classification technique for online handwritten Gujarati character recognition. In the first layer, an SVM classifier with an RBF kernel is used. In the second layer, a k-NN classifier is used for characters identified as similar by the first layer. The system achieves an average accuracy of 94.65% and processing time of 0.095 seconds per stroke using a hybrid feature set including derivatives, zoning, and normalized chain codes.
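The chain-code feature mentioned above quantizes the direction between consecutive stroke points into eight codes; a minimal sketch (illustrative only; it assumes mathematical y-up coordinates, whereas digitizer ink is often y-down):

```python
import math

def chain_code(points):
    """8-direction Freeman chain code of an online stroke.

    Each consecutive point pair is quantized to one of 8 directions
    (0 = east, counting counter-clockwise in 45-degree steps).
    """
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)            # in (-pi, pi]
        codes.append(int(round(angle / (math.pi / 4))) % 8)
    return codes

# A stroke moving right, then up-right, then up
stroke = [(0, 0), (1, 0), (2, 1), (2, 2)]
print(chain_code(stroke))  # [0, 1, 2]
```

Normalizing such codes (e.g. relative to the first direction) makes the feature invariant to stroke orientation, which is one reason chain codes combine well with zoning and derivative features.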
Data entry is by nature a time-consuming and error-prone procedure. In addition, checking the validity of submitted information is no easier than retyping it. In a mega-corporation like Kanoon Farhangi Amoozesh, there is almost no way to control the authenticity of students' educational background. By virtue of fast computer architectures, optical character recognition (OCR) systems have become viable. Unfortunately, general-purpose OCR systems like Google's Tesseract are not well suited here because they have no a priori information about what they are reading. In this paper the authors take an in-depth look at what has been done in the field of OCR over the last 60 years. Then a custom-made system adapted to the problem is presented, which is far more accurate than general-purpose OCR. The developed system reads more than 60 digits per second. As shown in the results section, the accuracy of the devised method is reasonable enough for public use.
ArabicWordNet: fine-tuning MobileNetV2-based model for Arabic Handwritten Wor... (IRJET Journal)
The document describes a study that developed a deep learning model called ArabicWordNet for recognizing Arabic handwritten words. Researchers evaluated five pre-trained convolutional neural network architectures on an Arabic handwritten words dataset and found that MobileNetV2 performed best, achieving 96.93% validation accuracy. They then fine-tuned the MobileNetV2 model through an ablation study, improving validation accuracy to 98.98% and test accuracy to 99.02%. The proposed ArabicWordNet model uses a MobileNetV2 architecture with additional layers and is effective for recognizing Arabic handwritten words.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Embedded machine learning-based road conditions and driving behavior monitoringIJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
More Related Content
Similar to Improved Arabic Alphabet Characters Classification Using.pdf
The effect of training set size in authorship attribution: application on sho...IJECEIAES
Authorship attribution (AA) is a subfield of linguistic analysis, aiming to identify the original author among a set of candidate authors. Several research papers have been published and several methods and models developed for many languages. However, the number of related works for Arabic is limited. Moreover, the impact of short word length and training set size is not well addressed; to the best of our knowledge, no published work in this direction, even in other languages, is available. Therefore, we propose to investigate this effect, taking into account different stylometric combinations. The Mahalanobis distance (MD), linear regression (LR), and multilayer perceptron (MP) are selected as AA classifiers. During the experiment, the training dataset size is increased and the accuracy of the classifiers is recorded. The results are quite interesting and show different classifier behaviours. Combining word-based stylometric features with n-grams provides the best accuracy, reaching 93% on average.
ATAR: Attention-based LSTM for Arabizi transliterationIJECEIAES
A non-standard romanization of Arabic script, known as Arabizi, is widely used in Arabic online and SMS/chat communities. However, since state-of-the-art tools and applications for Arabic NLP expect Arabic to be written in Arabic script, handling content written in Arabizi requires special attention, either by building customized tools or by transliterating it into Arabic script. The latter approach is the more common one, and this work presents two significant contributions in this direction. The first is to collect and publicly release the first large-scale “Arabizi to Arabic script” parallel corpus, focusing on the Jordanian dialect and consisting of more than 25k pairs carefully created and inspected by native speakers to ensure the highest quality. Second, we present ATAR, an ATtention-based LSTM model for ARabizi transliteration. Training and testing this model on our dataset yields an impressive accuracy (79%) and BLEU score (88.49).
The document proposes an automatic Arabic grammar error correction model based on expectation-maximization routing and bidirectional agreement between left-to-right and right-to-left models. It generates synthetic training data using back-translation and direct error injection. The model uses capsule networks and dynamic layer aggregation to efficiently combine information across layers. It also introduces a regularization term to minimize divergence between the two models and resolve the exposure bias problem during training. Experiments on two benchmarks show the proposed model achieves state-of-the-art performance for Arabic grammar error correction.
Writer Identification via CNN Features and SVMIRJET Journal
This document summarizes a research paper that proposes a model for identifying the author of Arabic handwritten documents based on their handwriting style. The model uses a convolutional neural network to extract features from images of handwritten text that have been augmented through data preprocessing techniques. A support vector machine is then used to classify the extracted features and identify the writer. The proposed method was tested on a dataset of 202 Arabic writers, achieving a classification accuracy of 97.20%.
Design and Development of a 2D-Convolution CNN model for Recognition of Handw...CSCJournals
Owing to the innumerable appearances caused by different writers, their writing styles, differing technical environments, and noise, handwritten character recognition has always been one of the most challenging tasks in pattern recognition. The emergence of deep learning has provided a new direction to break the limits of decades-old traditional methods. Many scripts in the world are used by millions of people, and handwritten character recognition studies of several of these scripts are found in the literature. Different hand-crafted feature sets have been used in these studies: feature-based approaches derive important properties from the test patterns and employ them in a more sophisticated classification model. Feature extraction using Zernike moments and polar harmonic transformation techniques was also performed, achieving moderate classification accuracy. The problems faced with these techniques led us to a CNN-based recognition approach, which learns the feature vector from the training character image samples without any hand-crafting. This paper presents a deep learning paradigm using a Convolutional Neural Network (CNN) for handwritten Gurumukhi and Devanagari character recognition (HGDCR). In the present experiment, a 34-layer CNN was trained on a GPU (graphics processing unit) machine for a 35-class self-generated handwritten Gurumukhi dataset and a 60-class (50 alphabets and 10 digits) handwritten Devanagari dataset. The experiment yielded an average recognition accuracy of more than 92% for the handwritten Gurumukhi dataset and 97.25% for the handwritten Devanagari dataset. It was also found that training and classification with our network design ran about 10 times faster than on a moderately fast CPU.
The advantage of this framework is demonstrated by the experimental results.
Hybrid convolutional neural networks-support vector machine classifier with d...TELKOMNIKA JOURNAL
This research paper explores hybrid models for Javanese character recognition using 15,600 characters gathered from digital and handwritten sources. The hybrid model combines the merits of deep learning, using convolutional neural networks (CNN) for feature extraction, with a machine learning classifier, the support vector machine (SVM). A dropout layer also manages overfitting and enhances training accuracy. For evaluation, we also compared three CNN architectures with multilayer perceptron (MLP) models with one and two hidden layer(s). In this research, we evaluated the three CNN variants and the hybrid CNN-SVM models on both classification accuracy and training time. The experimental outcomes showed that all CNN models outperform both MLP models in classification performance. The highest testing accuracy for a basic CNN is 94.2%, obtained with model 3. Adding hidden layers to the MLP model only slightly enhances accuracy. Furthermore, the hybrid model gained the highest accuracy, 98.35%, on the testing data when combining model 3 CNN with the SVM classifier. We conclude that the hybrid CNN-SVM model can improve accuracy in Javanese character recognition.
Spoken language identification using i-vectors, x-vectors, PLDA and logistic ...journalBEEI
This document discusses spoken language identification using i-vectors and x-vectors for feature extraction, and PLDA and logistic regression for classification. It examines extracting features from Javanese, Sundanese, and Minangkabau languages, then classifying the languages using various parameters. The study finds that x-vector outperforms i-vector when using PLDA classification, except when using logistic regression, where i-vector performs better. It tunes parameters for i-vector UBM size, i-vector dimension, x-vector max frame size, and num repeats, reporting equal error rates to evaluate performance on test segments of 3, 10 and 30 seconds.
Dialect classification using acoustic and linguistic features in Arabic speechIAESIJAI
Speech dialects refer to linguistic and pronunciation variations in the speech of the same language. Automatic dialect classification requires considerable acoustic and linguistic differences between the dialect categories of speech. This paper proposes a classification model composed of a combination of classifiers for Arabic dialects, utilizing both the acoustic and linguistic features of spontaneous speech. The acoustic classification comprises an ensemble of classifiers focusing on different frequency ranges within the short-term spectral features, as well as a classifier utilizing the ‘i-vector’, whilst the linguistic classifiers use features extracted by transformer models pre-trained on large Arabic text datasets. The proposed fusion of multiple classifiers achieves a classification accuracy of 82.44% for the identification of five Arabic dialects. This is the highest accuracy reported on the dataset, despite the relative simplicity of the proposed model, demonstrating its applicability and relevance for dialect identification tasks.
Arabic handwritten digits recognition based on convolutional neural networks ...nooriasukmaningtyas
Handwritten digit recognition has attracted the attention of researchers in pattern recognition, due to its importance in many real-life applications, such as reading bank checks and formal documents, and it has remained a continuous challenge in recent years. Motivated by this, researchers have created several recognition algorithms for different human languages, but for Arabic the problem is still largely open, despite the language's importance in the many Arab and Islamic countries where it is spoken; there is still little work on recognizing Arabic letters and digits. In this paper, a new method is proposed that uses a pre-trained convolutional neural network with a ResNet-34 model, a technique known as transfer learning, for recognizing Arabic digits, and it provides high accuracy when this type of network is applied. This work uses a well-known Arabic handwritten digits dataset called MADBase, which contains 60,000 training and 1,000 testing samples that were converted to grayscale for convenient handling during training. The proposed method recorded the highest accuracy compared to previous methods: 99.6%.
IRJET- Review on Optical Character RecognitionIRJET Journal
This document provides a review of optical character recognition (OCR) technologies. It discusses research conducted on OCR for English, Arabic, and Devanagari characters. For English characters, studies using techniques such as neural networks, support vector machines, and nearest-neighbor classification are summarized. For Arabic characters, research using neural networks, Haar-like feature extraction, and boosting classifiers achieved recognition rates from 87% to 99%. Studies on Devanagari characters employed techniques such as curvelet transforms, neural networks, k-means clustering, and support vector machines, achieving recognition rates from 90% to 98.5%. In general, the document reviews past work on OCR and the techniques researchers have used to recognize characters in these scripts.
TUNING DARI SPEECH CLASSIFICATION EMPLOYING DEEP NEURAL NETWORKSkevig
Recently, many researchers have focused on building and improving speech recognition systems to facilitate and enhance human-computer interaction. Today, Automatic Speech Recognition (ASR) systems have become important and common tools in applications from games to translation systems, robots, and more. However, there is still a need for research on speech recognition for low-resource languages. This article deals with isolated-word recognition for the Dari language, using the Mel-frequency cepstral coefficients (MFCCs) feature extraction method and three different deep neural networks: Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Multilayer Perceptron (MLP). We evaluate our models on our own isolated Dari words corpus, which consists of 1000 utterances for 20 short Dari terms. This study obtained an impressive average accuracy of 98.365%.
Arabic tweeps dialect prediction based on machine learning approach IJECEIAES
In this paper, we present our approach for profiling Arabic authors on Twitter, based on their tweets. We consider here the dialect of an Arabic author as an important trait to be predicted. For this purpose, many indicators, feature vectors and machine learning-based classifiers were implemented. The results of these classifiers were compared to find out the best dialect prediction model. The best dialect prediction model was obtained using random forest classifier with full forms and their stems as feature vector.
TUNING DARI SPEECH CLASSIFICATION EMPLOYING DEEP NEURAL NETWORKSIJCI JOURNAL
Recently, many researchers have focused on building and improving speech recognition systems to facilitate and enhance human-computer interaction. Today, Automatic Speech Recognition (ASR) system has become an important and common tool from games to translation systems, robots, and so on. However, there is still a need for research on speech recognition systems for low-resource languages. This article deals with the recognition of a separate word for Dari language, using Mel-frequency cepstral coefficients (MFCCs) feature extraction method and three different deep neural networks including Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Multilayer Perceptron (MLP), and two hybrid models of CNN and RNN. We evaluate our models on our built-in isolated Dari words corpus that consists of 1000 utterances for 20 short Dari terms. This study obtained the impressive result of 98.365% average accuracy.
This paper presents a new approach to cursive Arabic text recognition. The objective is to propose an analytical methodology for offline recognition of handwritten Arabic suited to rapid implementation. The first part of the recognition system is the preprocessing phase, which prepares the data. Two methods then extract a set of simple statistical features: a window sliding along the text line from right to left, and the VH2D approach (projecting every character onto the abscissa, the ordinate, and the 45° and 135° diagonals). The resulting feature vectors are fed to Hidden Markov Model (HMM) classifiers, and the two HMMs are combined using a multi-stream approach.
A MULTI-STREAM HMM APPROACH TO OFFLINE HANDWRITTEN ARABIC WORD RECOGNITIONijnlc
This document presents a multi-stream HMM approach for offline handwritten Arabic word recognition. It extracts two sets of features from each word using a sliding window approach and VH2D projection approach. These features are input to separate HMM classifiers, and the outputs are combined in a multi-stream HMM to provide more reliable recognition. The system is evaluated on 200 words, achieving a recognition rate of 83.8% using the multi-stream approach compared to 78.2% and 76.6% for the individual classifiers.
The CBC machine is a common diagnostic tool used by doctors to measure a patient's red blood cell count, white blood cell count and platelet count. The machine uses a small sample of the patient's blood, which is then placed into special tubes and analyzed. The results of the analysis are then displayed on a screen for the doctor to review. The CBC machine is an important tool for diagnosing various conditions, such as anemia, infection and leukemia. It can also help to monitor a patient's response to treatment.
Advanced control scheme of doubly fed induction generator for wind turbine us...IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
International Conference on NLP, Artificial Intelligence, Machine Learning an...gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
UNLOCKING HEALTHCARE 4.0: NAVIGATING CRITICAL SUCCESS FACTORS FOR EFFECTIVE I...amsjournal
The Fourth Industrial Revolution is transforming industries, including healthcare, by integrating digital,
physical, and biological technologies. This study examines the integration of 4.0 technologies into
healthcare, identifying success factors and challenges through interviews with 70 stakeholders from 33
countries. Healthcare is evolving significantly, with varied objectives across nations aiming to improve
population health. The study explores stakeholders' perceptions on critical success factors, identifying
challenges such as insufficiently trained personnel, organizational silos, and structural barriers to data
exchange. Facilitators for integration include cost reduction initiatives and interoperability policies.
Technologies like IoT, Big Data, AI, Machine Learning, and robotics enhance diagnostics, treatment
precision, and real-time monitoring, reducing errors and optimizing resource utilization. Automation
improves employee satisfaction and patient care, while Blockchain and telemedicine drive cost reductions.
Successful integration requires skilled professionals and supportive policies, promising efficient resource
use, lower error rates, and accelerated processes, leading to optimized global healthcare outcomes.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
ACEP Magazine edition 4th launched on 05.06.2024Rahul
This document provides information about the third edition of the magazine "Sthapatya" published by the Association of Civil Engineers (Practicing) Aurangabad. It includes messages from current and past presidents of ACEP, memories and photos from past ACEP events, information on life time achievement awards given by ACEP, and a technical article on concrete maintenance, repairs and strengthening. The document highlights activities of ACEP and provides a technical educational article for members.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces the hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system including the required equation to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram which sets the priorities and requirements of the system is presented. The proposed approach allows setup to advance their power stability, especially during power outages. The presented information supports researchers and plant owners to complete the necessary analysis while promoting the deployment of clean energy. The result of a case study that represents a dairy milk farmer supports the theoretical works and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novelty approach for the sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line which enhances the safety of the electrical network
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
(i) Suggesting a CNN model for recognizing Arabic handwritten characters.
(ii) Tuning different hyperparameters to improve the model's performance.
(iii) Applying different optimization algorithms and reporting the effectiveness of the best ones.
(iv) Presenting different data augmentation techniques and reporting the influence of each method on the improvement of Arabic handwritten character recognition.
(v) Mixing two different Arabic handwritten character datasets to vary character shapes, and testing the impact of the presented data augmentation approaches on the mixed dataset.
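The augmentation techniques of item (iv) are not detailed at this point, but typical character-level augmentations can be sketched in plain NumPy; this is a generic illustration, not the authors' exact pipeline. Note that mirroring is usually avoided for characters, since a flipped letter may become a different (or invalid) glyph.

```python
import numpy as np

def shift(img, dy, dx):
    """Translate a 2-D grayscale image by (dy, dx), zero-padding vacated pixels."""
    h, w = img.shape
    out = np.zeros_like(img)
    src = img[max(-dy, 0):h - max(dy, 0), max(-dx, 0):w - max(dx, 0)]
    out[max(dy, 0):h - max(-dy, 0), max(dx, 0):w - max(-dx, 0)] = src
    return out

def augment(img):
    """Yield shifted and rotated variants of one character image."""
    for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        yield shift(img, dy, dx)
    # 90-degree rotation shown only because it needs no interpolation;
    # realistic pipelines use small-angle rotations instead.
    yield np.rot90(img)
```

In practice, small random rotations of a few degrees and elastic distortions are common; those require an interpolating library such as scipy.ndimage, which is why only exact shifts and a 90° rotation are shown here.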
The rest of this paper is organized as follows. In Section 2, we review related work on Arabic handwritten character classification. In Sections 3 and 4, we describe the convolutional neural network architecture and the model's tuning hyperparameters. In Section 5, we give a detailed description of the optimization algorithms used. In Section 6, we describe the data augmentation techniques chosen for this study. In Section 7, we provide an overview of the experimental results showing the CNN's distinguished performance. Section 8 concludes and outlines possible future research directions.
2. Related Work
In recent years, many studies have addressed the classification and recognition of letters, including Arabic handwritten characters. However, relatively few approaches have been proposed for recognizing individual characters in the Arabic language. As a result, Arabic handwritten character recognition is less common compared to English, French, Chinese, Devanagari, Hangul, Malayalam, etc.
Impressive results were achieved in the classification of
handwritten characters from different languages, using deep
learning models and in particular the CNN.
El-Sawy et al. [6] gathered their own Arabic Handwritten Character Dataset (AHCD) from 60 participants. AHCD consists of 16,800 characters. They achieved a classification accuracy of 88% using a CNN model consisting of 2 convolutional layers. To improve the CNN performance, regularization and different optimization techniques were applied to the model, raising the testing accuracy to 94.93%.
Altwaijry and Turaiki [13] presented a new Arabic handwritten letters dataset (named "Hijja") comprising 47,434 characters written by 591 participants. Their proposed CNN model achieved 88% and 97% testing accuracy on the Hijja and AHCD datasets, respectively.
Younis [19] designed a CNN model to recognize Arabic
handwritten characters. The CNN consisted of three con-
volutional layers followed by one final fully connected layer.
The model achieved an accuracy of 94.7% for the AHCD
database and 94.8% for the AIA9K (Arabic alphabet’s
dataset).
Latif et al. [20] designed a CNN to recognize a mix of
handwriting of multiple languages: Persian, Devanagari,
Eastern Arabic, Urdu, and Western Arabic. The input image
is of size (28 × 28) pixels, followed by two convolutional
layers, and then a max-pooling operation is applied to both
convolution layers. The overall accuracy of the combined
multilanguage database was 99.26%. The average accuracy is
around 99% for each individual language.
Alrobah and Albahl [21] analyzed the Hijja dataset and found irregularities such as distorted letters and blurred symbols. They used a CNN model to extract the important features and an SVM model for classification, achieving a testing accuracy of 96.3%.
Mudhsh et al. [22] designed the VGG net architecture for
recognizing Arabic handwritten characters and digits. The
model consists of 13 convolutional layers, 2 max-pooling
layers, and 3 fully connected layers. Data augmentation and
Figure 1: One possible connected component box for the word "أمراء".

Table 1: Twenty-eight shapes of Arabic alphabets.

| No. | Name | Isolated | Beginning | Middle | End |
|----|------|----------|-----------|--------|-----|
| 1 | Alif | ا | ا | ـا | ـا |
| 2 | Baa | ب | بـ | ـبـ | ـب |
| 3 | Tea | ت | تـ | ـتـ | ـت |
| 4 | Thea | ث | ثـ | ـثـ | ـث |
| 5 | Jam | ج | جـ | ـجـ | ـج |
| 6 | Haa | ح | حـ | ـحـ | ـح |
| 7 | Khaa | خ | خـ | ـخـ | ـخ |
| 8 | Daal | د | د | ـد | ـد |
| 9 | Thaal | ذ | ذ | ـذ | ـذ |
| 10 | Raa | ر | ر | ـر | ـر |
| 11 | Zaay | ز | ز | ـز | ـز |
| 12 | Seen | س | سـ | ـسـ | ـس |
| 13 | Sheen | ش | شـ | ـشـ | ـش |
| 14 | Sad | ص | صـ | ـصـ | ـص |
| 15 | Dhad | ض | ضـ | ـضـ | ـض |
| 16 | Tah | ط | طـ | ـطـ | ـط |
| 17 | Dha | ظ | ظـ | ـظـ | ـظ |
| 18 | Ain | ع | عـ | ـعـ | ـع |
| 19 | Ghen | غ | غـ | ـغـ | ـغ |
| 20 | Fa | ف | فـ | ـفـ | ـف |
| 21 | Qaf | ق | قـ | ـقـ | ـق |
| 22 | Kaf | ك | كـ | ـكـ | ـك |
| 23 | Lam | ل | لـ | ـلـ | ـل |
| 24 | Meem | م | مـ | ـمـ | ـم |
| 25 | Noon | ن | نـ | ـنـ | ـن |
| 26 | Ha | ه | هـ | ـهـ | ـه |
| 27 | Waw | و | و | ـو | ـو |
| 28 | Yaa | ي | يـ | ـيـ | ـي |
Computational Intelligence and Neuroscience
dropout methods were used to avoid the overfitting problem. The model was trained and evaluated by using two
different datasets: the ADBase for the Arabic handwritten
digits classification topic and HACDB for the Arabic
handwritten characters classification task. The model
achieved an accuracy of 99.66% and 97.32% for ADBase and
HACDB, respectively.
Boufenar et al. [23] used the popular CNN architecture
Alexnet. It consists of 5 convolutional layers, 3 max-pooling
layers, and 3 fully connected layers. Experiments were
conducted on two different databases, OIHACDB-40 and
AHCD. Based on the good tuning of the CNN hyper-
parameters and by using dropout and minibatch techniques,
a CNN accuracy of 100% and 99.98% for OIHACDB-40 and
AHCD was achieved.
Mustapha et al. [24] proposed a Conditional Deep
Convolutional Generative Adversarial Network (CDCGAN)
for a guided generation of isolated handwritten Arabic
characters. The CDCGAN was trained on the AHCD dataset.
They achieved a 10% performance gap between real and
generated handwritten Arabic characters.
Table 2 summarizes the literature reviewed for recog-
nizing Arabic handwriting characters using the CNN
models. From the previous literature, we notice that most
CNN architectures have been trained by using adult Arabic
handwriting letters “AHCD”. In addition, we observe that
most researchers try to improve the performance through
the good tuning of the CNN model hyperparameters.
3. The Proposed Arabic Handwritten Characters Recognition System
As shown in Figure 2, the model that we proposed in this
study is composed of three principal components: CNN
proposed architecture, optimization algorithms, and data
augmentation techniques.
In this paper, the proposed CNN model contains four convolution layers, two max-pooling operations, and an ANN model with three fully connected hidden layers used for classification. To avoid overfitting and improve the model performance, various optimization techniques were used, such as dropout, minibatch training, and the choice of the activation function.
Figure 3 describes the proposed CNN model. Also, in
this work, the recognition performance of Arabic hand-
written letters was improved through the good choice of the
optimization algorithm and by using different data aug-
mentation techniques “geometric transformations, feature
space augmentation, noise injection, and mixing images.”
4. Convolution Neural Network Architecture
A CNN model [25–34] is a series of convolution layers
followed by fully connected layers. Convolution layers allow
the extraction of important features from the input data.
Fully connected layers are used for the classification of data.
The CNN input is the image to be classified; the output
corresponds to the predicted class of the Arabic handwritten
character.
4.1. Input Data. The input data is an image I of size (m × m × s), where (m × m) defines the width and height of the image and s denotes the number of channels. The value of s is 1 for a grayscale image and 3 for an RGB color image.
4.2. Convolution Layer. The convolution layer consists of a
convolution operation followed by a pooling operation.
4.2.1. Convolution Operation. The basic concept of the
classical convolution operation between an input image I of
dimension (m × m) and a filter F of size (n × n) is defined as
follows (see Figure 4):
C = I ⊗ F. (1)

Here, ⊗ denotes the convolution operation and C is the convolution map of size (a × a), where a = (m − n + 2p)/s_L + 1. s_L is the stride and denotes the number of pixels by which F slides over I. p is the padding; it is often necessary to add a border of zeros around I to preserve complete image information. Figure 4 is an example of the convolution operation between an input image of dimension (8 × 8) and a filter F of size (3 × 3). Here, the convolution map C is of size (6 × 6), with a stride s_L = 1 and a padding p = 0.
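As an illustration of equation (1) and the output-size formula above, the following minimal NumPy sketch (our own illustration, not the authors' code; the function names are ours) computes a valid convolution map:

```python
import numpy as np

def conv_output_size(m, n, p=0, s=1):
    # a = (m - n + 2p)/s + 1, the side length of the convolution map C
    return (m - n + 2 * p) // s + 1

def conv2d(I, F, stride=1):
    """Naive 2D convolution (cross-correlation, as used in CNNs) of image I with filter F."""
    m, n = I.shape[0], F.shape[0]
    a = conv_output_size(m, n, 0, stride)
    C = np.zeros((a, a))
    for i in range(a):
        for j in range(a):
            patch = I[i * stride:i * stride + n, j * stride:j * stride + n]
            C[i, j] = np.sum(patch * F)  # elementwise product, then sum
    return C

# Figure 4's setting: an (8 x 8) input and a (3 x 3) filter with stride 1
# and no padding give a (6 x 6) convolution map.
```

For the (32 × 32) inputs used later in the paper, a (3 × 3) filter with padding 1 and stride 1 preserves the spatial size.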
Generally, a nonlinear activation function is applied on
the convolution map C. The commonly used activation
functions are Sigmoid [34–36], Hyperbolic Tangent “Tanh”
[35, 37], and Rectified Linear Unit “ReLU” [37, 38] where
Ca = f(C), (2)

here, Ca is the convolution map after applying the nonlinear activation function f. Figure 5 shows the Ca map when the ReLU activation function is applied to C.
4.2.2. Pooling Operation. The pooling operation is used to
reduce the dimension of Ca thus reducing the computational
complexity of the network. During the pooling operation, a kernel K of size (sp × sp) slides over Ca; sp denotes the stride by which K is moved over Ca. In our analysis, sp is set to 2. The pooling operation is expressed as

P = pool(Ca), (3)
where P is the pooling map and pool is the pooling oper-
ation. The commonly used pooling operations are average-
pooling, max-pooling, and min-pooling. Figure 6 describes
the concept of average-pooling and max-pooling operations
using a kernel of size (2 × 2) and a stride of 2.
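The pooling step of equation (3) can be sketched in NumPy as follows (an illustrative sketch; `pool2d` is our own helper, not from the paper):

```python
import numpy as np

def pool2d(Ca, size=2, stride=2, mode="max"):
    """Slide a (size x size) kernel over the activation map Ca with the given
    stride and keep either the max or the average of each patch (cf. Figure 6)."""
    a = Ca.shape[0]
    out = (a - size) // stride + 1
    P = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            patch = Ca[i * stride:i * stride + size, j * stride:j * stride + size]
            P[i, j] = patch.max() if mode == "max" else patch.mean()
    return P
```

With a (2 × 2) kernel and stride 2, the map's width and height are halved, matching the architecture described later.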
4.3. Concatenation Operation. The concatenation operation
maps the set of the convoluted images into a vector called the
concatenation vector Y.
Y = [P_1^c, P_2^c, …, P_n^c]^T, (4)
here, P_i^c is the output of the ith convolution layer, and n denotes the number of filters applied to the convoluted images P_{i−1}^{c−1}.
4.4. Fully Connected Layer. The CNN classification opera-
tion is performed through the fully connected layer [39]. Its
input is the concatenation vector Y; the predicted class y is
the output of the CNN classifier. The classification oper-
ation is performed through a series of t fully connected
hidden layers. Each fully connected hidden layer is a
parallel collection of artificial neurons. Like synapses in the
biological brain, the artificial neurons are connected
through weights W. The output of the ith fully connected hidden layer is
Table 2: Summary of Arabic handwritten characters recognition using CNN models.

| References | Year | Dataset | Type (size) | Method | Optimization | Accuracy (%) | Loss (%) |
|---|---|---|---|---|---|---|---|
| El-Sawy et al. [6] | 2017 | AHCD | Chars (16,800) | CNN | Minibatch | 94.93 | 5.1 |
| Mudhsh et al. [22] | 2017 | ADBase | Digits (70,000) | CNN (based on VGG net) | Dropout, data augmentation | 99.6 | — |
| Mudhsh et al. [22] | 2017 | HACDB | Chars (6,600) | CNN (based on VGG net) | Dropout, data augmentation | 97.32 | — |
| Boufenar et al. [23] | 2017 | OIHACDB-40 | Chars (6,600) | CNN (based on Alexnet) | Dropout, minibatch | 100 | — |
| Boufenar et al. [23] | 2017 | AHCD | Chars | CNN (based on Alexnet) | Dropout, minibatch | 99.98 | — |
| Younis [19] | 2018 | AHCD | Chars (8,737) | CNN | — | 97.7 | — |
| Younis [19] | 2018 | AIA9K | Chars | CNN | — | 94.8 | — |
| Latif et al. [20] | 2018 | Mix of handwriting of multiple languages | Chars | CNN | — | 99.26 | 0.02 |
| Altwaijry and Turaiki [13] | 2020 | Hijja | Chars (47,434) | CNN | — | 88 | — |
| Altwaijry and Turaiki [13] | 2020 | AHCD | Chars | CNN | — | 97 | — |
| Alrobah and Albahl [21] | 2021 | Hijja | Chars (47,434) | CNN + SVM | — | 96.3 | — |
| Mustapha et al. [24] | 2021 | AHCD | — | CDCGAN | — | — | — |
Figure 2: Proposed recognition schema for Arabic handwritten characters datasets. The schema combines three components: the proposed CNN architecture with an ANN classifier (ELU activations, Softmax output, dropout, minibatch, cross-entropy loss, learning rate); the optimization algorithms (gradient descent, gradient descent with momentum, RMSprop, AdaGrad, AdaDelta, Adam, AdaMax, Nadam); and the data augmentation techniques (geometric transformations and feature space augmentation: rotation, flipping, shifting, zooming; noise injection: Gaussian, salt-and-pepper, speckle; and mixing the AHCD and Hijja image datasets).
Y^i = f(H^i), (5)

where the weighted sum vector H^i is

H^i = W^i Y^{i−1} + B^i, (6)

here, f is a nonlinear activation function (Sigmoid, Tanh, ReLU, etc.), and the bias value B^i defines the activation level of the artificial neurons.
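Equations (5) and (6) amount to one matrix-vector product per hidden layer. A minimal sketch (our own illustration; the helper name is ours):

```python
import numpy as np

def dense_forward(Y_prev, W, B, f=np.tanh):
    """One fully connected layer: H = W @ Y_prev + B, then Y = f(H) (eqs. (5)-(6))."""
    H = W @ Y_prev + B
    return f(H)
```

Stacking t such calls, with the concatenation vector Y of equation (4) as the first input, gives the full classifier head.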
5. CNN Learning Process
A trained CNN is a system capable of determining the exact
class of a given input data. The training is achieved through
an update of the layer’s parameters (filters, weights, and
biases) based on the error between the CNN predicted class
and the class label. The CNN learning process is an iterative
process based on the feedforward propagation and back-
propagation operations.
5.1. Feedforward Propagation. For the CNN model, the
feedforward equations can be derived from (1)–(5) and (6).
The Softmax activation [40, 41] function is applied in the
final layer to generate the predicted value of the class of the
input image I. For a multiclass model, the Softmax is
expressed as follows:
y_i = exp(h_i) / Σ_{j=1}^{c} exp(h_j), (7)

where c denotes the number of classes, y_i is the ith coordinate of the output vector y, and the artificial neuron output is h_i = Σ_{j=1}^{n} w_{ij} y_j.
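Equation (7) can be implemented directly; subtracting the maximum score before exponentiating (a standard numerical-stability trick, not part of the paper's formula) leaves the result unchanged:

```python
import numpy as np

def softmax(h):
    """y_i = exp(h_i) / sum_j exp(h_j) over the c class scores (equation (7))."""
    e = np.exp(h - np.max(h))  # max-shift for numerical stability
    return e / e.sum()
```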
5.2. Backpropagation. To update the CNN parameters and
perform the learning process, a backpropagation optimi-
zation algorithm is developed to minimize a selected cost
function E. In this analysis, the cross-entropy (CE) cost
function [40] is used.
E = − Σ_{j=1}^{p} [ ŷ_j log(y_j) + (1 − ŷ_j) log(1 − y_j) ], (8)

here, ŷ_j is the desired output (data label).
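A direct NumPy rendering of the cross-entropy cost of equation (8) (our own sketch; the `eps` clipping guards against log(0) and is our addition):

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """E = -sum_j [y*_j log(y_j) + (1 - y*_j) log(1 - y_j)] (equation (8))."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
```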
The most used optimization algorithm to solve classi-
fication problems is the gradient descent (GD). Various
optimizers for the GD algorithm such as momentum,
AdaGrad, RMSprop, Adam, AdaMax, and Nadam were used
to improve the CNN performance.
5.2.1. Gradient Descent [40, 42]. GD is the simplest of the gradient descent optimization algorithms. It is easy to implement and gives significant classification accuracy. The general update equation of the CNN parameters using the GD algorithm is

φ(t + 1) = φ(t) − α ∂E/∂φ(t), (9)

where φ represents the update of the filters F, the weights W, and the biases B; ∂E/∂φ is the gradient of the cost with respect to the parameter φ; and α is the model learning rate. A too-large value of α may lead to divergence of the GD algorithm and may cause the model performance to oscillate, while a too-small α stalls the learning process.
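Equation (9) in code, with a toy run on E(φ) = φ² to show convergence (an illustrative sketch of our own; the objective and learning rate are arbitrary choices):

```python
def gd_step(phi, grad, alpha=0.1):
    """phi(t+1) = phi(t) - alpha * dE/dphi(t) (equation (9))."""
    return phi - alpha * grad

# Minimize E(phi) = phi^2, whose gradient is 2*phi.
phi = 1.0
for _ in range(100):
    phi = gd_step(phi, 2 * phi, alpha=0.1)
# phi shrinks by a factor (1 - 2*alpha) = 0.8 per step, approaching 0.
```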
5.2.2. Gradient Descent with Momentum [43]. The momentum hyperparameter m defines the velocity by which the learning rate α is increased when the model approaches the minimum of the cost function E. The update equations using the momentum GD algorithm are expressed as follows:
Figure 3: Proposed CNN architecture (input layer: (32 × 32) image; four convolution layers with (3 × 3) filters of 16, 16, 32, and 32 channels, each with ELU activation; a (2 × 2) max-pooling and dropout stage after each pair of convolution layers; a flatten step; two fully connected layers of 1000 units with ELU; and a Softmax output of size (28 × 1)).
Figure 4: Example of 2D convolution operation [13, 31].
v(t) = m v(t − 1) + ∂E/∂φ,
φ(t + 1) = φ(t) − α v(t), (10)

where v(t) is the momentum gained at the tth iteration.
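A one-parameter sketch of the update in equation (10) (our own illustration; the step sizes are arbitrary):

```python
def momentum_step(phi, v, grad, alpha=0.05, m=0.9):
    """v(t) = m*v(t-1) + dE/dphi; phi(t+1) = phi(t) - alpha*v(t) (equation (10))."""
    v = m * v + grad
    return phi - alpha * v, v

# Minimize E(phi) = phi^2: the velocity term accumulates past gradients,
# which speeds up progress along consistent descent directions.
phi, v = 1.0, 0.0
for _ in range(300):
    phi, v = momentum_step(phi, v, 2 * phi)
```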
5.2.3. AdaGrad [44]. In this algorithm, the learning rate is a function of the gradient ∂E/∂φ. It is defined as follows:

α(t) = β / √(G(t) + ε), (11)

where

β = α(0) √ε,
G(t) = Σ_{i=1}^{t} (∂E/∂φ(i))², (12)

where ε is a small smoothing value used to avoid division by 0 and G(t) is the sum of the squares of the past gradients ∂E/∂φ(i).

With a small magnitude of ∂E/∂φ, the value of α increases; when ∂E/∂φ is very large, the value of α stays nearly constant. The AdaGrad optimization algorithm changes the learning rate for each parameter at a given time t, taking the previous gradient updates into account. The parameter update equation using AdaGrad is expressed as follows:

φ(t + 1) = φ(t) − α(t) ∂E/∂φ(t). (13)
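A scalar sketch of equations (11)-(13) (our own illustration; α₀ and the toy objective are our choices):

```python
import numpy as np

def adagrad_step(phi, G, grad, alpha0=0.5, eps=1e-8):
    """Accumulate squared gradients in G and scale the step by 1/sqrt(G + eps)
    (equations (11)-(13))."""
    G = G + grad ** 2
    return phi - alpha0 * grad / np.sqrt(G + eps), G

# Minimize E(phi) = phi^2: each parameter gets its own shrinking learning rate.
phi, G = 1.0, 0.0
for _ in range(200):
    phi, G = adagrad_step(phi, G, 2 * phi)
```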
5.2.4. AdaDelta [45]. The issue with AdaGrad is that over many iterations the learning rate becomes very small, which leads to slow convergence. To fix this problem, the AdaDelta algorithm takes an exponentially decaying average of the squared gradients instead:

E[G²](t) = γ E[G²](t − 1) + (1 − γ) G²(t),
Δθ(t) = G(t) / √(E[G²](t) + ε),
φ(t + 1) = φ(t) − α Δθ(t), (14)

where E[G²](t) is the decaying average over past squared gradients and γ is usually set to around 0.9.
5.2.5. RMSprop [45, 46]. In fact, RMSprop is identical to the first update vector of AdaDelta derived above:

E[G²](t) = 0.9 E[G²](t − 1) + 0.1 G²(t),
φ(t + 1) = φ(t) − α Δθ(t). (15)
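Equations (14)-(15) in scalar form (our own sketch; γ = 0.9 matches RMSprop's fixed setting above):

```python
import numpy as np

def rmsprop_step(phi, Eg2, grad, alpha=0.01, gamma=0.9, eps=1e-8):
    """Keep a decaying average E[G^2] of squared gradients instead of AdaGrad's
    full sum, then scale the step by 1/sqrt(E[G^2] + eps) (equations (14)-(15))."""
    Eg2 = gamma * Eg2 + (1 - gamma) * grad ** 2
    return phi - alpha * grad / np.sqrt(Eg2 + eps), Eg2

# Toy run on E(phi) = phi^2; the decaying average keeps the step size alive.
phi, Eg2 = 1.0, 0.0
for _ in range(500):
    phi, Eg2 = rmsprop_step(phi, Eg2, 2 * phi)
```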
5.2.6. ADAM [17, 45, 46]. This gradient descent optimizer computes the learning rate α based on two vectors:

r(t) = β1 r(t − 1) + (1 − β1) ∂E/∂φ(t),
v(t) = β2 v(t − 1) + (1 − β2) (∂E/∂φ(t))², (16)

where r(t) and v(t) are the 1st- and 2nd-order moment vectors, β1 and β2 are the decay rates, and r(t − 1) and v(t − 1) represent the mean and the variance of the previous gradients.
When r(t) and v(t) are very small, a large step size is
needed for parameters update. To avoid this issue, a bias
correction value is added to r(t) and v(t).
r̂(t) = r(t) / (1 − β1^t),
v̂(t) = v(t) / (1 − β2^t), (17)

where β1^t is β1 to the power t and β2^t is β2 to the power t.

The Adam update equation is expressed as follows:

φ(t + 1) = φ(t) − α r̂(t) / (√(v̂(t)) + ε). (18)
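Equations (16)-(18) as a scalar sketch (our own illustration; the defaults β1 = 0.9, β2 = 0.999 follow common practice and are not prescribed by the text):

```python
import numpy as np

def adam_step(phi, r, v, grad, t, alpha=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """First/second moment estimates with bias correction (equations (16)-(18));
    t counts iterations starting from 1."""
    r = b1 * r + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    r_hat = r / (1 - b1 ** t)  # bias-corrected mean, eq. (17)
    v_hat = v / (1 - b2 ** t)  # bias-corrected variance, eq. (17)
    return phi - alpha * r_hat / (np.sqrt(v_hat) + eps), r, v
```

On the first step, the bias correction makes the update roughly α in magnitude regardless of the gradient scale.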
Figure 5: Convolution map after applying the ReLU activation function [13, 25, 31].
Figure 6: Average-pooling and max-pooling with a (2 × 2) filter and stride 2 [13, 31] (e.g., max(1, 2, 0, 1) = 2; average(5, 2, 0, 1) = 2).
5.2.7. AdaMax [45, 47]. The factor v(t) in the Adam algorithm scales the gradient inversely proportionally to the ℓ2 norm of the past gradients (via the v(t − 1) term) and the current gradient ∂E/∂φ(t):

v(t) = β2 v(t − 1) + (1 − β2) |∂E/∂φ(t)|². (19)

The generalization of this update to the ℓp norm is as follows:

v(t) = β2^p v(t − 1) + (1 − β2^p) |∂E/∂φ(t)|^p. (20)

Such norms tend to become numerically unstable for large p, which is why the ℓ1 and ℓ2 norms are most common in practice. However, the ℓ∞ norm also exhibits stable behavior. As a result, the authors propose AdaMax and demonstrate that v(t) with the ℓ∞ norm converges to the following more stable value:

u(t) = β2^∞ v(t − 1) + (1 − β2^∞) |∂E/∂φ(t)|^∞
     = max(β2 v(t − 1), |∂E/∂φ(t)|),
φ(t + 1) = φ(t) − α r̂(t) / u(t). (21)
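The ℓ∞ variant of equation (21), as a scalar sketch (our own illustration, reusing Adam's first-moment update):

```python
def adamax_step(phi, r, u, grad, t, alpha=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """AdaMax (equation (21)): u(t) = max(b2*u(t-1), |grad|) replaces the
    square root of the second moment used in Adam."""
    r = b1 * r + (1 - b1) * grad
    u = max(b2 * u, abs(grad))
    r_hat = r / (1 - b1 ** t)  # same first-moment bias correction as Adam
    return phi - alpha * r_hat / (u + eps), r, u
```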
5.2.8. Nadam [43]. Nadam is a combination of Adam and NAG. The parameter update equations using NAG are defined as follows:

v(t) = γ v(t − 1) + α ∂E/∂φ (φ(t) − γ v(t − 1)),
φ(t + 1) = φ(t) − v(t), (22)

where the gradient is evaluated at the look-ahead point φ(t) − γ v(t − 1).
The update equations using Nadam are expressed as follows:

r̄(t) = β1 r̂(t) + ((1 − β1) / (1 − β1^t)) ∂E/∂φ(t),
φ(t + 1) = φ(t) − α r̄(t) / (√(v̂(t)) + ε). (23)
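Following the paper's equation (23), a scalar sketch of our own (it reuses Adam's moment updates from equation (16); the defaults are our choices):

```python
import numpy as np

def nadam_step(phi, r, v, grad, t, alpha=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Nadam (equation (23)): Adam's bias-corrected moment plus a
    Nesterov-style look-ahead term on the current gradient."""
    r = b1 * r + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    r_hat = r / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    r_bar = b1 * r_hat + (1 - b1) * grad / (1 - b1 ** t)
    return phi - alpha * r_bar / (np.sqrt(v_hat) + eps), r, v
```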
6. Data Augmentation Techniques
Deep convolutional neural networks are heavily reliant on
big data to achieve excellent performance and avoid the
overfitting problem.
To solve the problem of insufficient data for Arabic
handwritten characters, we present some basic data aug-
mentation techniques that enhance the size and quality of
training datasets.
The image augmentation approaches used in this study
include geometric transformations, feature space augmen-
tation, noise injection, and mixing images.
Data augmentation based on geometric transformations
and feature space augmentation [17, 48] is often related to
the application of rotation, flipping, shifting, and zooming.
6.1. Rotation. The input data is rotated right or left about an axis, between 1° and 359°. The rotation degree parameter has a significant impact on the safety of the dataset labels. For example, on digit identification tasks like MNIST, slight rotations such as 1° to 20° (or −1° to −20°) can be useful, but as the rotation degree increases, the CNN probably cannot accurately distinguish between some digits.
6.2. Flipping. The input image is flipped horizontally or
vertically. This augmentation is one of the simplest to im-
plement and has proven useful on some datasets such as
ImageNet and CIFAR-10.
6.3. Shifting. The input image is shifted right, left, up, or down. This transformation is a highly effective adjustment to prevent positional bias. Figure 7 shows an example of the shifting data augmentation technique using Arabic alphabet characters.
6.4. Zooming. The input image is zoomed, either by adding some pixels around the image or by applying random zooms to it. The amount of zooming influences the quality of the image; for example, if we apply too much zooming, we can lose some image pixels.
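The shifting and flipping transforms described above are easy to sketch in NumPy (our own helpers, not the paper's code; in a Keras pipeline like the one used here, equivalent options exist on the built-in image augmentation utilities):

```python
import numpy as np

def shift_image(img, dx=0, dy=0, fill=0):
    """Shift a 2D image by (dx, dy) pixels (positive = right/down),
    filling the vacated border with `fill` (cf. Figure 7)."""
    out = np.full_like(img, fill)
    h, w = img.shape
    src_y = slice(max(0, -dy), min(h, h - dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

def flip_image(img, horizontal=True):
    """Mirror the image horizontally or vertically."""
    return img[:, ::-1] if horizontal else img[::-1, :]
```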
6.5. Noise Injection. As can be seen in Arabic handwritten characters, natural noise is present in the images. Noise makes recognition more difficult; for this reason, it is usually reduced by image preprocessing techniques. The goal of noise reduction is to achieve high classification performance, but it can alter the character shape. The main datasets in this research area are considered denoised. The question we answer here is how robust the method can be to noise.
Adding noise [48, 49] to a convolution neural network
during training helps the model learn more robust features,
resulting in better performance and faster learning. We can
add several types of noise when recognizing images, such as
the following.
(i) Gaussian noise: injecting a matrix of random values drawn from a Gaussian distribution
(ii) Salt-and-pepper noise: randomly changing a certain amount of the pixels to completely white ("salt") or completely black ("pepper")
(iii) Speckle noise: multiplicative noise, in which the image is multiplied by random values
Adding noise to the input data is the most commonly
used approach, but during training, we can add random
noise to other parts of the CNN model. Some examples
include the following:
(i) Adding noise to the outputs of each layer
(ii) Adding noise to the gradients to update the model
parameters
(iii) Adding noise to the target variables
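Two of the noise types above, sketched in NumPy on images scaled to [0, 1] (our own illustration; the seed and amounts are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_noise(img, sigma=0.1):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return img + rng.normal(0.0, sigma, img.shape)

def salt_and_pepper(img, amount=0.05):
    """Flip a random fraction `amount` of pixels to black (pepper) or white (salt)."""
    out = img.copy()
    mask = rng.random(img.shape)
    out[mask < amount / 2] = 0.0
    out[mask > 1 - amount / 2] = 1.0
    return out
```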
6.6.MixingImage’sDatabases. In this study, we augment the
training dataset by mixing two different Arabic handwritten
characters datasets, AHCD and Hijja, respectively. AHCD is
a clean database, but Hijja is a dataset with very low-reso-
lution images. It comprises many distorted alphabets
images.
Then, we evaluate the influence of different mentioned
data augmentation techniques (geometric transformations,
feature space augmentation, and noise injection) on the
recognition performance of the new mixing dataset.
7. Experimental Results and Discussion
7.1. Datasets. In this study, two datasets of Arabic hand-
written characters were used: Arabic handwritten characters
dataset “AHCD” and Hijja dataset.
AHCD [6] comprises 16,800 handwritten characters of size (32 × 32 × 1) pixels. It was written by 60 participants between the ages of 19 and 40 years, and most of the participants are right-handed. Each participant wrote the Arabic alphabet from "alef" to "yeh" 10 times. The dataset has 28 classes. It is divided into a training set of 13,440 characters and a testing set of 3,360 characters.
The Hijja dataset [13] consists of 47,434 Arabic characters of size (32 × 32 × 1) pixels. It was written by 591 schoolchildren ranging in age from 7 to 12 years. Collecting data from children is a very hard task: malformed characters are characteristic of children's handwriting, so the dataset comprises repeated letters, missing letters, and many distorted or unclear characters. The dataset has 29 classes. It is divided into a training set of 37,933 characters and a testing set of 9,501 characters (80% for training and 20% for testing).
Figure 8 shows a sample of AHCD and Hijja Arabic
handwritten letters datasets.
7.2. Experimental Environment and Performance Evaluation. In this study, the implementation and evaluation of the CNN model are carried out in the Keras deep learning environment with a TensorFlow backend on Google Colab using a GPU accelerator.
We evaluate the performance of our proposed model via
the following measures:
Accuracy (A) measures how many correct predictions the model made over the complete test dataset:

A = (TP + TN) / (TP + TN + FN + FP). (24)

Recall (R) is the fraction of images that are correctly classified over the total number of images that belong to the class:

R = TP / (TP + FN). (25)

Precision (P) is the fraction of images that are correctly classified over the total number of images classified as the class:

P = TP / (TP + FP). (26)

The F1 measure is a combination of the Recall and Precision measures:

F1 = 2 · (P · R) / (P + R). (27)
Here, TP � true positive (is the total number of images
that can be correctly labeled as belonging to a class x), FP �
false positive (represents the total number of images that
have been incorrectly labeled as belonging to a class x),
FN � false negative (represents the total number of images
that have been incorrectly labeled as not belonging to a class
x), TN � true negative (represents the total number of images
that have been correctly labeled as not belonging to a class x).
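Equations (24)-(27) computed from the four confusion counts (a small helper of our own, not the paper's code):

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, recall, precision, and F1 from confusion counts (eqs. (24)-(27))."""
    accuracy = (tp + tn) / (tp + tn + fn + fp)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, precision, f1
```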
We also plot the area under the ROC curve (AUC). An ROC curve (receiver operating characteristic curve) is a graph showing classification performance at all classification thresholds. The curve plots two parameters:

(i) True-positive rate
(ii) False-positive rate

AUC stands for "area under the ROC curve": it measures the entire two-dimensional area underneath the ROC curve from (0, 0) to (1, 1).
7.3. Tuning of CNN Hyperparameters. The objective is to choose the best model that fits the AHCD and Hijja datasets well. Many trial-and-error runs of the network configuration tuning mechanism were performed.
The best performance was achieved when the CNN
model was constructed of four convolution layers followed
by three fully connected hidden layers. The model starts with
Figure 7: Data augmentation through shifting a single image in four directions by two pixels (original, shifted left, shifted right, up-shifted, down-shifted).
two convolution layers with 16 filters of size (3 × 3); the remaining two convolution layers have 32 filters of size (3 × 3); and each pair of convolution layers is followed by a max-pooling layer with a (2 × 2) kernel. Finally, three fully connected (dense) layers, with a Softmax activation function at the output, perform the prediction. ELU, a nonlinear activation function, was used to handle negative inputs by mapping them to small negative values rather than zero. The values of the weights and biases are updated by a backward propagation process to minimize the loss function.
To reduce the overfitting problem, a dropout layer with rate 0.6 is added to the model between the dense layers; it applies to the outputs of the prior layer that are fed to the subsequent layer. The optimized settings used to improve the CNN performance were as follows: the optimizer algorithm is Adam, the loss function is the cross-entropy, the learning rate is 0.001, the batch size is 16, and the number of epochs is 40.
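The dropout layer described above can be sketched as inverted dropout (our own illustration of the idea; Keras's built-in Dropout layer implements the same behavior):

```python
import numpy as np

rng = np.random.default_rng(1)

def dropout(x, rate=0.6, training=True):
    """Zero each activation with probability `rate` during training and rescale
    the survivors by 1/(1 - rate) so the expected activation is unchanged;
    at test time the layer is the identity."""
    if not training:
        return x
    keep = (rng.random(x.shape) >= rate).astype(x.dtype)
    return x * keep / (1.0 - rate)
```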
We compare our model to CNN-for-AHCD over both
the Hijja dataset and the AHCD dataset. The code for CNN-
for-AHCD is available online [31], which allows comparison
of its performance over various datasets.
On the Hijja dataset, which has 29 classes, our model
achieved an average overall test set accuracy of 88.46%,
precision of 87.98%, recall of 88.46%, and an F1 score of
88.47%, while CNN-for-AHCD achieved an average overall
test set accuracy of 80%, precision of 80.79%, recall of
80.47%, and an F1 score of 80.4%.
On the AHCD dataset, which has 28 classes, our model
achieved an average overall test set accuracy of 96.66%,
precision of 96.75%, recall of 96.67%, and an F1 score of
96.67%, while CNN-for-AHCD achieved an average overall
test set accuracy of 93.84%, precision of 93.99%, recall of
93.84%, and an F1 score of 93.84%.
The detailed metrics are reported per character in
Table 3.
We note that our model outperforms CNN-for-AHCD
by a large margin on all metrics.
Figure 9 shows the testing result AUC of AHCD and
Hijja dataset.
7.4. Optimizer Algorithms. The objective is to choose the optimization algorithms that yield the best performance on AHCD and Hijja. In this context, we tested the influence of the following algorithms on the classification of handwritten Arabic characters:
(i) Adam
(ii) SGD
(iii) RMSprop
(iv) AdaGrad
(v) Nadam
(vi) Momentum
(vii) AdaMax
By using Nadam optimization algorithm, on the Hijja
dataset, our model achieved an average overall test set ac-
curacy of 88.57%, precision of 87.86%, recall of 87.98%, and
an F1 score of 87.95%.
On the AHCD dataset, our model achieved an average
overall test set accuracy of 96.73%, precision of 96.80%,
recall of 96.73%, and an F1 score of 96.72%.
The detailed results of different optimizations algorithms
are mentioned in Table 4.
7.5. Results of Data Augmentation Techniques. Generally, neural network performance is improved through good tuning of the model hyperparameters. Such improvement in CNN accuracy is linked to the availability of training data; networks are heavily reliant on big data to avoid overfitting and to perform well.
Data augmentation is the solution to the problem of
limited data. The image augmentation techniques used and
discussed in this study include geometric transformations
and feature space augmentation (rotation, shifting, flipping,
and zooming), noise injection, and mixing images from two
different datasets.
For the geometric transformations and feature space augmentation, we carefully choose the amount of
Figure 8: Arabic handwritten letters dataset. (a) AHCD dataset. (b) Hijja dataset.
rotation, shifting, flipping, and zooming so that the model attains good performance. For example, if we rotate the
Latin handwritten number database (MNIST) by 180°, the
network will not be able to accurately distinguish between
the handwritten digits “6” and “9”. Likewise, on the AHCD
and Hijja datasets, if rotation or flipping techniques are used, the network will be unable to distinguish between some handwritten Arabic characters. For example, as shown in Figure 10, with a rotation of 180°, the isolated character Daal (د) will be the same as the isolated character Noon (ن).
The detailed results of rotation, shifting, flipping, and
zooming data augmentation techniques are mentioned in
Table 5.
As shown in Table 5 and Figure 11, by using rotation and
shifting augmentation approaches, our model achieved a
testing accuracy of 98.48% and 91.24% on AHCD dataset
and Hijja dataset, respectively. We achieved this accuracy
through rotating the input image by 10° and shifting it just by
one pixel.
Adding noise is another technique used to augment the training input data; in most cases, it also increases the robustness of our network.
In this work we used the three types of noise to augment
our data:
(i) Gaussian noise
(ii) Salt-and-pepper noise
Figure 10: Example of Arabic handwritten characters that cannot be properly distinguished when applying the rotation and flipping data augmentation techniques. (a) Confusion caused by rotation (Daal isolated vs. Noon isolated). (b) Confusion caused by flipping (Lam beginning vs. Alif end).
Table 5: Experimental results of data augmentation techniques on the Hijja and AHCD datasets, using the Nadam algorithm (training accuracy / testing accuracy, %).

| Dataset | Rotation | Shifting | Rotation and shifting | Flipping | Zooming |
|---------|----------|----------|----------------------|----------|---------|
| AHCD | 99.82 / 97.64 | 99.56 / 98.09 | 99.41 / 98.48 | 99.69 / 97.29 | 99.85 / 98.00 |
| Hijja | 97.18 / 89.86 | 92.72 / 90.28 | 91.73 / 91.24 | 95.39 / 88.94 | 99.16 / 88.66 |
Figure 11: Rotation and shifting data augmentation results on the Arabic handwritten characters datasets (training and testing accuracy over 40 epochs, with and without augmentation). (a) Rotation and shifting results on the AHCD dataset. (b) Rotation and shifting results on the Hijja dataset.
(iii) Speckle noise
The detailed results of the different types of noise injection are reported in Table 6. As shown, adding the different types of noise improves the model accuracy, which demonstrates the robustness of our proposed architecture. We achieved good results when adding noise to the outputs of each layer.
The idea proposed in this study is to augment the training data by mixing the two datasets, AHCD and Hijja, and then to apply the previously mentioned data augmentation methods to the new mixed dataset. Our purpose in using malformed handwritten characters, as found in the Hijja dataset, is to improve the accuracy of our method on noisy data.
The detailed results of the data augmentation techniques on the mixed database are reported in Table 7. As shown, the model performance depends on the proportion of the Arabic handwriting "Hijja" database that is used. The children had trouble following the reference paper, which results in very low-resolution images comprising many unclear characters; therefore, mixing in this dataset tends to reduce performance.
8. Conclusions and Possible Future
Research Directions
In this paper, we proposed a convolutional neural network (CNN) to recognize Arabic handwritten characters. We trained the model on two Arabic datasets, AHCD and Hijja. By careful tuning of the network hyperparameters, we achieved accuracies of 96.73% and 88.57% on AHCD and Hijja, respectively.
To improve the model performance, we implemented different optimization algorithms. For both databases, the best performance was achieved with the Nadam optimizer.
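As an illustration of the update rule involved, here is a self-contained NumPy sketch of one Nadam step (Adam with a Nesterov-style lookahead, following Dozat's formulation); the hyperparameter values are common defaults, not necessarily the ones tuned in the paper:

```python
import numpy as np

def nadam_step(theta, grad, m, v, t, lr=0.002, b1=0.9, b2=0.999, eps=1e-8):
    """One Nadam parameter update (t is the 1-based iteration count)."""
    m = b1 * m + (1 - b1) * grad           # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2      # second-moment estimate
    m_hat = m / (1 - b1 ** t)              # bias corrections
    v_hat = v / (1 - b2 ** t)
    # Nesterov lookahead: blend corrected momentum with the current gradient.
    m_nes = b1 * m_hat + (1 - b1) * grad / (1 - b1 ** t)
    theta = theta - lr * m_nes / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimise f(x) = x^2 from x = 3; the gradient is 2x.
x, m, v = 3.0, 0.0, 0.0
for t in range(1, 501):
    x, m, v = nadam_step(x, 2 * x, m, v, t, lr=0.05)
assert abs(x) < 0.5
```

The lookahead term lets the optimizer react to the current gradient before the momentum has fully caught up, which is the intuition behind Nadam's faster convergence over plain Adam.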
To address the scarcity of Arabic handwritten datasets, we applied different data augmentation techniques based on geometric transformation, feature space augmentation, noise injection, and mixing of datasets.
Using the rotation and shifting techniques, we achieved accuracies of 98.48% and 91.24% on AHCD and Hijja, respectively.
To improve the robustness of the CNN model and increase the amount of training data, we added three types of noise (Gaussian, salt-and-pepper, and speckle noise).
In this work, we also first augmented the database by mixing the two Arabic handwritten character datasets and then tested the previously mentioned data augmentation techniques on the new mixed dataset, where the first database, AHCD, comprises clear images with very good resolution, while the second, Hijja, contains many distorted characters. The experiments show that the geometric transformations (rotation, shifting, and flipping), feature space augmentation, and noise injection consistently improve network performance, but a larger share of the unclean Hijja database harms the model accuracy.
Table 6: Experimental results on the Hijja and AHCD datasets with noise injection (Nadam algorithm). Each cell lists training / testing accuracy (%).

| Dataset | Gaussian noise | Salt-and-pepper noise | Speckle noise |
|---------|----------------|-----------------------|---------------|
| AHCD | 99.17 / 96.78 | 99.16 / 97.15 | 99.14 / 96.82 |
| Hijja | 97.18 / 89.85 | 92.96 / 90.10 | 91.32 / 89.73 |

Table 7: Experimental results of data augmentation techniques on mixes of the Hijja and AHCD datasets (Nadam algorithm). Each row gives the training mix followed by the testing mix; each cell lists training / testing accuracy (%).

| Mixed dataset (training; testing) | Rotation | Shifting | Rotation and shifting | Flipping | Gaussian noise |
|-----------------------------------|----------|----------|-----------------------|----------|----------------|
| 80% AHCD + 20% Hijja; 20% AHCD + 10% Hijja | 99.62 / 97.42 | 99.36 / 98.02 | 99.38 / 98.32 | 99.77 / 97.08 | 99.15 / 96.74 |
| 80% Hijja + 20% AHCD; 20% Hijja + 10% AHCD | 97.36 / 88.47 | 93.66 / 89.07 | 94.91 / 90.54 | 95.21 / 88.21 | 96.68 / 88.49 |
| 80% AHCD + 80% Hijja; 20% AHCD + 20% Hijja | 97.27 / 74.53 | 97.16 / 75.13 | 98.13 / 78.13 | 98.16 / 74.22 | 96.98 / 74.02 |

An interesting future direction is the cleaning and processing of the Hijja dataset to eliminate the problem of low-resolution and unclear images, followed by applying the proposed CNN network and data augmentation techniques to the new mixed and cleaned database.
In addition, we are interested in evaluating the effect of other augmentation approaches, such as adversarial training, neural style transfer, and generative adversarial networks, on the recognition of Arabic handwritten characters. We also plan to incorporate our work into an application that teaches children Arabic spelling.
Abbreviations
AHCR: Arabic handwritten characters recognition
DL: Deep learning
CNNs: Convolutional neural networks
AHCD: Arabic handwritten character dataset
SVM: Support vector machine
ADBase: Arabic digits database
HACDB: Handwritten Arabic characters database
OIHACDB: Offline handwritten Arabic character database
CDCGAN: Conditional deep convolutional generative
adversarial network
Tanh: Hyperbolic tangent
ReLU: Rectified linear unit
CE: Cross-entropy
GD: Gradient descent
NAG: Nesterov accelerated gradient
TP: True positive
FP: False positive
FN: False negative
TN: True negative
AUC: Area under curve
ROC: Receiver operating characteristic
ELU: Exponential linear unit
Symbols
I: Image
m: Width and height of the image
c: Number of channels
F: Filter
n: Filter size
⊗: Convolution operation
C: Convolution map
a: Size of the convolution map
S_c: Stride
p: Padding
f: Nonlinear activation function
C_a: Convolution map after applying f
K: Kernel
S_p: Number of patches
pool: Pooling operation
P: Pooling map
Y_0: Concatenation vector
P_i^c: Output of the convolution layer
P_{i-1}^{c-1}: Convoluted image
Y_{i-1}: Input of the i-th fully connected hidden layer
Y_i: Output of the i-th fully connected hidden layer
H_i: Weighted sum vector
B_i: Bias
E: Cost function
ŷ_i: Desired output
φ: Update of the filter F
(∂E/∂φ): Gradient
α: Model learning rate
m: Momentum
v(t): Moment gained at the t-th iteration
ε: Smoothing value
G(t): Sum of the squares of the gradients
E[G(t)²]: Decaying average
r(t): Moments vector
β: Decay rate
r(t − 1): Mean of the previous gradient
v(t − 1): Variance of the previous gradient.
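The convolution symbols above are typically tied together by the standard output-size relation; the equation itself is not reproduced in this excerpt, so the following is a hedged reconstruction using the listed symbols (m × m input, n × n filter, padding p, stride S_c, convolution map size a):

```latex
a = \left\lfloor \frac{m - n + 2p}{S_c} \right\rfloor + 1
```

For example, a 32 × 32 image convolved with a 3 × 3 filter at p = 1 and S_c = 1 yields a 32 × 32 convolution map.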
Data Availability
Previously reported AHCD data were used to support this study and are available at https://www.kaggle.com/mloey1/ahcd1. These prior studies (and datasets) are cited at relevant places within the text as [43].
Conflicts of Interest
The authors declare that there are no conflicts of interest
regarding the publication of this study.
References
[1] Y. M. Alginahi, “Arabic character segmentation: a survey,”
International Journal on Document Analysis and Recognition,
vol. 16, no. 2, pp. 105–126, 2013.
[2] M. T. Parvez and S. A. Mahmoud, “Offline arabic handwritten
text recognition: a survey,” ACM Computing Surveys, vol. 45,
no. 2, pp. 1–35, 2013.
[3] D. K. Sahu and C. V. Jawahar, “Unsupervised feature learning
for optical character recognition,” in Proceedings of the 2015
13th International Conference on Document Analysis and
Recognition (ICDAR), pp. 1041–1045, Tunis, Tunisia, August
2015.
[4] J. Sueiras, V. Ruiz, A. Sanchez, and J. F. Velez, “Offline
continuous handwriting recognition using sequence to se-
quence neural networks,” Neurocomputing, vol. 289,
pp. 119–128, 2018.
[5] J. Bai, Z. Chen, B. Feng, and B. Xu, “Image character rec-
ognition using deep convolutional neural network learned
from different languages,” in Proceedings of the 2014 IEEE
International Conference on Image Processing (ICIP),
pp. 2560–2564, Paris, France, October 2014.
[6] A. El-Sawy, M. Loey, and E. Hazem, “Arabic handwritten
characters recognition using convolutional neural network,”
WSEAS Transactions on Computer Research, vol. 5, pp. 11–19,
2017.
[7] J. Xiao, Z. Xuehong, H. Chuangxia, Y. Xiaoguang,
W. Fenghua, and M. Zhong, “A new approach for stock price
analysis and prediction based on SSA and SVM,” Interna-
tional Journal of Information Technology Decision Making,
vol. 18, pp. 287–310, 2019.
[8] N. Wagaa and H. Kallel, “Vector-based back propagation
algorithm of supervised convolution neural network,” in
Proceedings of the International Conference on Control,
Automation and Diagnosis (ICCAD), Paris, France, October
2020.
[9] Y. Liang, J. Wang, S. Zhou, Y. Gong, and N. Zheng, “In-
corporating image priors with deep convolutional neural
networks for image super-resolution,” Neurocomputing,
vol. 194, pp. 340–347, 2016.
[10] H. M. Najadat, A. A. Alshboul, and A. F. Alabed, “Arabic
handwritten characters recognition using convolutional
neural network,” in Proceedings of the 10th International
Conference on Information and Communication Systems,
Irbid, Jordan, June 2019.
[11] H. Kaur and S. Rani, “Handwritten gurumukhi character
recognition using convolution neural network,” International
Journal of Computational Intelligence Research, vol. 13,
pp. 933–943, 2017.
[12] P. Ahamed, S. Kundu, T. Khan, V. Bhateja, R. Sarkar, and
A. Faruk Mollah, “Handwritten arabic numerals recognition
using convolutional neural network,” Journal of Ambient In-
telligence and Humanized Computing, vol. 11, pp. 5445–5457,
2020.
[13] N. Altwaijry and I. Al-Turaiki, “Arabic handwriting recog-
nition system using convolutional neural network,” Neural
Computing and Applications, vol. 33, 2020.
[14] K. V. Greeshma and K. Sreekumar, “Hyperparameter opti-
mization and regularization on fashion-MNIST classifica-
tion,” International Journal of Recent Technology and
Engineering (IJRTE), vol. 8, pp. 2277–3878, 2019.
[15] O. Y. Bakhteev and V. V. Strijov, “Comprehensive analysis of
gradient-based hyperparameter optimization algorithms,”
Annals of Operations Research, vol. 289, 2020.
[16] S. Derya, “A comparison of optimization algorithms for deep
learning,” International Journal of Pattern Recognition and
Artificial Intelligence, vol. 34, pp. 51–65, 2020.
[17] A. H. García and P. König, “Further advantages of data
augmentation on convolutional neural networks,” in Pro-
ceedings of the International Conference on Artificial Neural
Networks, pp. 95–103, Springer, Cham, Switzerland, Sep-
tember 2018.
[18] A. A. Hidayata, K. Purwandaria, T. W. Cenggoroa, and
B. Pardamean, “A convolutional neural network-based an-
cient sundanese character classifier with data augmentation,”
in Proceedings of the 5th International Conference on Com-
puter Science and Computational Intelligence, pp. 195–201,
Indonesia, 2021.
[19] K. Younis, “Arabic handwritten characters recognition based
on deep convolutional neural networks,” Jordan Journal
Computers and Information Technology (JJCIT), vol. 3, 2018.
[20] G. Latif, J. Alghazo, L. Alzubaidi, M. M. Naseer, and
Y. Alghazo, “Deep convolutional neural network for recog-
nition of unified multi-language handwritten numerals,” in
Proceedings of the 2018 IEEE 2nd International Workshop on
Arabic and Derived Script Analysis and Recognition (ASAR),
pp. 90–95, London, UK, March 2018.
[21] N. Alrobah and S. Albahli, “A hybrid deep model for rec-
ognizing arabic handwritten characters,” IEEE Access, vol. 9,
pp. 87058–87069, 2021.
[22] M. A. Mudhsh and R. Almodfer, “Arabic handwritten al-
phanumeric character recognition using very deep neural
network,” MDPI, Information, vol. 8, 2017.
[23] C. Boufenar, A. Kerboua, and M. Batouche, “Investigation on
deep learning for off-line handwritten arabic character rec-
ognition,” Cognitive Systems Research, vol. 50, 2017.
[24] I. B. Mustapha, S. Hasan, H. Nabus, and S. M. Shamsuddin,
“Conditional deep convolutional generative adversarial
networks for isolated handwritten arabic character genera-
tion,” Arabian Journal for Science and Engineering, 2021.
[25] A. Yuan, G. Bai, L. Jiao, and Y. Liu, “Offline handwritten
English character recognition based on convolutional neural
network,” in Proceedings of the 2012 10th IAPR International
Workshop on Document Analysis Systems (DAS), pp. 125–129,
IEEE, Gold Coast, Australia, March 2012.
[26] M. Mhapsekar, P. Mhapsekar, A. Mhatre, and V. Sawant,
“Implementation of residual network (ResNet) for devanagari
handwritten character recognition,” Algorithms for Intelligent
Systems, pp. 137–148, 2020.
[27] N. K. Pius and A. Johny, “Malayalam handwritten character
recognition system using convolutional neural network,”
International Journal of Applied Engineering Research, vol. 15,
pp. 918–920, 2020.
[28] R. S. Hussien, A. A. Elkhidir, and M. G. Elnourani, “Optical
character recognition of arabic handwritten characters using
neural network,” in Proceedings of the 2015 International
Conference on Computing, Control, Networking, Electronics
and Embedded Systems Engineering (ICCNEEE), pp. 456–461,
IEEE, Khartoum, Sudan, September 2015.
[29] A. ElAdel, R. Ejbali, M. Zaied, and C. B. Amar, “Dyadic
multiresolution analysis-based deep learning for arabic
handwritten character classification,” in Proceedings of the
2015 IEEE 27th International Conference on Tools with Ar-
tificial Intelligence (ICTAI), pp. 807–812, IEEE, Vietri sul
Mare, Italy, November 2015.
[30] M. Elleuch, N. Tagougui, and M. Kherallah, “Arabic handwritten
characters recognition using deep belief neural networks,” in
Proceedings of the 2015 12th International Multi-Conference on
Systems, Signals & Devices (SSD), pp. 1–5, IEEE, Mahdia, Tunisia,
March 2015.
[31] A. Elsawy, M. Loey, and H. M. El-Bakry, “Arabic handwritten
characters recognition using convolutional neural network,”
WSEAS Transactions on Computer Research, vol. 5, pp. 11–19,
2017.
[32] K. M. M. Yaagoup and M. E. M. Mus, “Online Arabic
handwriting characters recognition using deep learning,”
International Journal of Advanced Research in Computer and
Communication Engineering, vol. 9, 2020.
[33] M. Shams, A. Elsonbaty, and W. ElSawy, “Arabic handwritten
character recognition based on convolution neural networks
and support vector machine,” International Journal of Ad-
vanced Computer Science and Applications, vol. 11, 2020.
[34] S. Narayan, “The generalized sigmoid activation function:
competitive supervised learning,” Information Sciences,
vol. 99, no. 1-2, pp. 69–82, 1997.
[35] B. L. Kalman and C. Kwasny, “Why tanh: choosing a sig-
moidal function,” in Proceedings of the JCNN International
Joint Conference on Neural Networks, IEEE, Baltimore, MD,
USA, June 1992.
[36] B. I. Dlimi and H. Kallel, “Robust Neural Control for Robotic
Manipulators,” International Journal of Enhanced Research in
Science, Technology and Engineering, IJERSTE, vol. 5, no. 2,
pp. 198–205, 2016.
[37] P. Sibi, S. Allwyn Jones, and P. Siddarth, “Analysis of different
activation functions using back propagation neural net-
works,” Journal of Theoretical and Applied Information
Technology, vol. 47, 2013.
[38] G. Lin and W. Shen, “Research on convolutional neural
network based on improved relu piecewise activation func-
tion,” Procedia Computer Science, vol. 131, pp. 977–984, 2018.
[39] M. Kubat, “Artificial neural networks,” An Introduction to
Machine Learning, pp. 91–111, 2015.
[40] L. Bottou, “Large-scale machine learning with stochastic
gradient descent,” Proceedings of COMPSTAT’2010, pp. 177–
186, 2010.
[41] D. Wang, L. Huang, and L. Tang, “Dissipativity and synchro-
nization of generalized BAM neural networks with multivariate
discontinuous activations,” IEEE Transactions on Neural Net-
work Learning System, vol. 29, pp. 3815–3827, 2017.
[42] R. Sutton, “Two problems with back propagation and other
steepest descent learning procedures for networks,” in Pro-
ceedings of the Eighth Annual Conference of the Cognitive
Science Society, pp. 823–832, Chicago, IL, USA, December
1986.
[43] I. Sutskever, J. Martens, G. Dahl, and G. Hinton, “On the
importance of initialization and momentum in deep learn-
ing,” in Proceedings of the International Conference on Ma-
chine Learning, pp. 1139–1147, Atlanta, GA, USA, June 2013.
[44] J. Duchi, E. Hazan, and Y. Singer, “Adaptive subgradient
methods for online learning and stochastic optimization,”
Journal of Machine Learning Research, vol. 12, no. Jul,
pp. 2121–2159, 2011.
[45] S. Derya, “A comparison of optimization algorithms for deep
learning,” International Journal of Pattern Recognition and
Artificial Intelligence, 2020.
[46] G. Hinton, N. Srivastava, and K. Swersky, rmsprop: Divide the
Gradient by a Running Average of its Recent Magnitude, 2012.
[47] D. P. Kingma and L. J. Ba, “Adam: a method for stochastic
optimization,” in Proceedings of the International Conference
on Learning Representations (ICLR), San Diego, CA, USA,
May 2015.
[48] S. Connor and M. K. Taghi, “A survey on image data aug-
mentation for deep learning,” Journal of Big Data, vol. 6, 2019.
[49] P. Panda and K. Roy, “Implicit adversarial data augmentation
and robustness with noise-based learning,” Neural Networks,
vol. 141, pp. 1139–1147, 2021.