Recognition of Bengali Handwritten Digits Using Convolutional Neural Network Architectures is a Digital Image Processing topic in which the digits of the Bangla language are recognized using some basic image operations.
Recognition of Bengali Handwritten Digits Using Convolutional
Neural Network Architectures
Summary : Recognition of handwritten digits has long been one of the leading interests in deep
learning. Because the required datasets are lacking, the amount of research in this particular
field is significantly low. The NamtaDB dataset is a collection of Bengali handwritten digits
that contains 85,000 digits from 2700 contributors. The dataset is taken to represent diverse
handwriting, since it was collected from both children and adults who come from different
regions.
Different types of augmentation are applied to the dataset for this kind of work, for
example rotation, translation, blurring, zooming in, and salt-and-pepper noise.
Rotating Image : The image dimensions may not be preserved after rotation.
The rotation takes place about the center of the image, and the inverse
transformation is computed for every pixel. For RGB images it is computed
for every color plane.
Translation : The movement of a digit's location along the rows or columns.
The image is shifted along the x and y axes by increasing or decreasing the
values of the coordinates.
Blurring : Smooths the image by applying a linear filter to the initial
image. The effect is to average out rapid changes in pixel intensity.
Hue Saturation Value (HSV) shifting : HSV is an alternative representation of
an RGB image. The HSV color model was created as a more convenient way for
people to specify colors.
Superimpose : This is done to replicate the effect of text written on the back of
an already written page. An image is vertically flipped and a weighted sum of the
two images is taken.
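The augmentations listed above can be sketched in a few lines each. The code below is only an illustration and not the pipeline used in the summarized work: the function names, the nearest-neighbour sampling, the fixed output size for rotation (corners are cropped rather than the canvas being enlarged), the clamped borders for blurring, and the default weight `alpha` are all assumptions. Images are plain lists of rows so that the sketch needs only the Python standard library.

```python
import colorsys
import math


def translate(img, dx, dy, fill=0):
    """Shift a grayscale image by dx columns and dy rows."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx  # source pixel of this output pixel
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def rotate(img, degrees, fill=0):
    """Rotate about the image center using the inverse transformation.

    Assumption: the output keeps the input size, so corners can be cropped.
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    t = math.radians(degrees)
    cos_t, sin_t = math.cos(t), math.sin(t)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            rx, ry = x - cx, y - cy
            # Inverse rotation: where did this output pixel come from?
            sx = int(round(cos_t * rx + sin_t * ry + cx))
            sy = int(round(-sin_t * rx + cos_t * ry + cy))
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def box_blur(img, k=3):
    """Linear averaging filter over a k x k window (borders are clamped)."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / (k * k)
    return out


def shift_hsv(rgb, dh=0.0, ds=0.0, dv=0.0):
    """Shift one RGB pixel (floats in 0..1) in HSV space and convert back."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    h = (h + dh) % 1.0                 # hue wraps around
    s = min(max(s + ds, 0.0), 1.0)     # saturation and value are clipped
    v = min(max(v + dv, 0.0), 1.0)
    return colorsys.hsv_to_rgb(h, s, v)


def superimpose(img, other, alpha=0.8):
    """Weighted sum of img and a vertically flipped copy of other."""
    flipped = other[::-1]  # vertical flip: reverse the order of the rows
    return [[alpha * a + (1 - alpha) * b for a, b in zip(ra, rb)]
            for ra, rb in zip(img, flipped)]
```

For an RGB image, `translate`, `rotate`, and `box_blur` would be applied to each color plane separately, in line with the note on rotation above.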
Some of the augmentations include affine transformation, coarse dropout, addition of noise,
superimposition, inversion, etc. Among the most difficult augmentations to undo are coarse
dropout and affine transformation: digit edges are deleted, and images are scaled, rotated, or
shear mapped, and sometimes several of these effects are combined.
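As a rough illustration of two of these effects, the sketch below adds salt-and-pepper noise and applies coarse dropout to a grayscale image stored as a list of rows. The function names, the patch size, the number of patches, and the noise fraction are assumptions chosen for the example; the summarized work does not state its settings.

```python
import random


def salt_pepper(img, amount=0.05, rng=None, low=0, high=255):
    """Set roughly `amount` of the pixels to pure black or pure white."""
    rng = rng or random.Random()
    out = [row[:] for row in img]  # copy so the input stays untouched
    for y in range(len(out)):
        for x in range(len(out[0])):
            u = rng.random()
            if u < amount / 2:
                out[y][x] = low    # pepper
            elif u < amount:
                out[y][x] = high   # salt
    return out


def coarse_dropout(img, n_patches=3, patch=4, rng=None, fill=0):
    """Blank out n_patches square regions at random positions."""
    rng = rng or random.Random()
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for _ in range(n_patches):
        top = rng.randrange(0, max(h - patch, 0) + 1)
        left = rng.randrange(0, max(w - patch, 0) + 1)
        for y in range(top, min(top + patch, h)):
            for x in range(left, min(left + patch, w)):
                out[y][x] = fill
    return out
```

Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible, which helps when inspecting which samples become hard to recognize.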
Even if an image looks fine visually, it still may not be good data for a Convolutional Neural
Network. To help with this, a median blur with a 9x9 filter is used. It removes Gaussian noise
and sharp edges. A problem may occur, however: some test data fall victim to the blurring, and
a digit can be blurred to the point that it is unrecognizable.
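A median blur can be sketched as follows. This is a dependency-free illustration with clamped borders, not the implementation used in the summarized work; in practice a library routine such as OpenCV's `cv2.medianBlur(img, 9)` would normally be used. The function name and the border handling are assumptions.

```python
from statistics import median


def median_blur(img, k=9):
    """Replace each pixel by the median of its k x k window (k must be odd)."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = median(window)
    return out
```

The same mechanism explains both the benefit and the risk described above: an isolated noisy pixel never forms a majority in its window and disappears, but a stroke much thinner than the window does not form a majority either, so a large window such as 9x9 can also erase thin parts of a digit.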
Sometimes images are inverted, yet the results show that the inversion makes no difference to
the outcome. The system recognizes the digits perfectly well whatever the background is. This
happens because convolutional layers work on edges, which is why the network recognizes the
boundaries regardless of which color is the background and which is the foreground.
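The intuition behind this observation can be checked with a simple edge measure. The sketch below is an assumption-laden illustration, not the network from the summarized work: it uses absolute finite differences as a stand-in for edge-detecting filters. Inverting an image only flips the sign of each difference, so the edge magnitudes are identical. A trained CNN is not guaranteed to be exactly invariant in this way.

```python
def invert(img, max_val=255):
    """Swap foreground and background intensities."""
    return [[max_val - p for p in row] for row in img]


def edge_magnitude(img):
    """Sum of absolute horizontal and vertical differences per pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * (w - 1) for _ in range(h - 1)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]  # horizontal change
            gy = img[y + 1][x] - img[y][x]  # vertical change
            out[y][x] = abs(gx) + abs(gy)
    return out
```

Calling `edge_magnitude(img)` and `edge_magnitude(invert(img))` on the same image gives equal results, which matches the observation that inverted digits are recognized just as well.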
Some samples in NamtaDB are mislabeled even after rigorous checking. Some samples are so
distorted that they are not easy even for a human being to recognize. If an update of NamtaDB
with a minimal number of mislabeled images becomes available, handling these distorted images
will be easier. Here, the obtained accuracy is 99.3359%. In future work, the approach might be
applied to broader problems such as license plate recognition or handwritten character recognition.