Handwritten digits recognition report

Handwritten Digits Recognition
Jan 14, 2020
Abstract
In this paper I present a Keras Sequential neural network for recognizing
handwritten digits. The network is evaluated on the well-known MNIST data set.
With no pre-processing of the data set beyond normalizing the pixel values, the
network achieves a low classification error (about 2.7% on the test set). Combined
with clustering techniques, it can be used to build an artificial intelligence system
that automatically segments individual digits from images and finds their labels.
Introduction
In this project, a handwritten digit recognition system was implemented using the well-known
MNIST data set. This is not a new topic, yet after several decades the MNIST data set is
still very popular and important for evaluating and validating new algorithms.
The handwritten digit recognition problem has been studied by researchers since 1998 with
almost all the algorithms designed up to then and since. The test error rate decreased
from 12% in 1998 with a linear classifier to 0.23% in 2012 with convolutional nets, and these
days more and more data scientists and machine learning experts are developing and
validating unsupervised learning methods such as auto-encoders and deep learning models.
Methods
The program I implemented focuses on identifying the digits 0-9 from segmented
pictures of handwritten digits. For simplicity, input images are pre-treated to
be of a fixed size, and each input image should contain only one unknown digit
in the middle. These requirements are not too harsh because they can be met
using simple image processing or computer vision techniques. In addition, such
pre-treated image data sets are easy to obtain. For my implementation, the popular
MNIST data set ([1]) is a good choice. Each image in MNIST is already normalized to
28x28 in the above sense, and the data set itself is publicly available. The MNIST
data set is large: it contains 60,000 training samples and 10,000 test samples, and
it has become a standard data set for testing various algorithms.

The output of my program is the digit 0-9 contained in the input image. The
method I use is a Keras Sequential neural network running on TensorFlow. Unlike a
lazy learning method such as the nearest neighbor classifier, which stores the whole
training set and classifies each new input case by case, this network implicitly
learns the mapping between images of handwritten digits and their actual 0-9
identities.

Preferably, I want the output units to provide the conditional probability of each
class given the input (thus the output of each unit is between 0 and 1, and the
outputs of all 10 units sum to 1), and the unit with the maximum output determines
the class label. As a result, softmax activation is the desirable choice for the
output units. As training targets, the unit corresponding to the correct class label
of each input has the value 0.91, while the other units all have the value 0.01. I do
not set them to 1s and 0s because extreme values are hard for activation functions
to reach.
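
As a minimal sketch of the output encoding described above, the following plain-Python
code computes a softmax over ten raw scores, builds the 0.91/0.01 target vector, and
evaluates the cross-entropy between the two. The function names (softmax, soft_target,
cross_entropy) and the example scores are illustrative assumptions. The Keras program in
the Source Code section passes integer labels to sparse_categorical_crossentropy, so
this only illustrates the idea and is not the code that produced the results.

```python
import math

def softmax(scores):
    """Map raw scores to probabilities that are positive and sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target(label, num_classes=10, on=0.91, off=0.01):
    """Target vector with `on` at the true class and `off` everywhere else."""
    return [on if k == label else off for k in range(num_classes)]

def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs))

# Hypothetical raw scores for one image; the largest score is at index 3.
scores = [0.1, 0.2, 0.0, 4.0, 0.3, 0.1, 0.0, 0.2, 0.1, 0.0]
probs = softmax(scores)

# The predicted label is the index of the largest output.
predicted = max(range(len(probs)), key=probs.__getitem__)

target = soft_target(3)
loss = cross_entropy(target, probs)
print(predicted, loss)
```

With the default values, the target entries add up to 0.91 + 9 x 0.01 = 1, so the target
is itself a valid probability distribution over the ten classes.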
Results and Discussions
Among all the methods I tried on the MNIST data set, the model built as a Keras
Sequential neural network on TensorFlow performed best, reaching 97.2899% accuracy
(about a 2.71% error rate), although this result is not always repeatable between runs.
Tweaking the kernel sizes and the number of convolutional layers indicates that more
convolutional layers help improve the accuracy, while changing the kernel size has
little effect and sometimes even decreased the accuracy.
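
To make the relation between the reported accuracy and the error rate concrete, here is
a small sketch that converts the accuracy into a count of misclassified test images. It
assumes the evaluation used the full 10,000-image MNIST test set; the helper name
error_summary is hypothetical.

```python
def error_summary(accuracy_percent, num_test=10000):
    """Return (correct, wrong, error_percent) for an accuracy given in percent."""
    correct = round(accuracy_percent / 100 * num_test)  # nearest whole image count
    wrong = num_test - correct
    error_percent = 100 * wrong / num_test
    return correct, wrong, error_percent

correct, wrong, error_percent = error_summary(97.2899)
print(correct, wrong, error_percent)
```

Under that assumption, 97.2899% accuracy corresponds to 9729 correctly classified and
271 misclassified images, which is an error rate of about 2.71%.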

Source Code
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# Load MNIST: 60000 training and 10000 test images, each 28x28.
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Scale the pixel values (L2 normalization along axis 1).
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)

# Two hidden layers of 128 ReLU units and a 10-unit softmax output.
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3)

# Evaluate on the test set and print loss and accuracy scaled by 100.
val_loss, val_acc = model.evaluate(x_test, y_test)
print(val_loss * 100, val_acc * 100)

# Show the first training image and its normalized pixel values.
plt.imshow(x_train[0])
plt.show()
print(x_train[0])

# Save and reload the model. Newer Keras releases may require a ".keras" or ".h5" suffix.
model.save("epic_num_reader.model")
new_model = tf.keras.models.load_model("epic_num_reader.model")

# Predict class probabilities for the test set, then take the most likely digit.
predictions = new_model.predict(x_test)
print(predictions)
print(np.argmax(predictions, axis=1))

# Display the first 11 test images.
for i in range(0, 11):
    first_image = np.array(x_test[i], dtype='float')
    pixels = first_image.reshape((28, 28))
    plt.imshow(pixels)
    plt.show()
Conclusion
In conclusion, I implemented a large Sequential neural network for recognizing
handwritten digits. I trained the network with a cross-entropy loss using error
back-propagation to obtain the optimal weight values. I also found useful
applications for the network I built. By doing this project, I practiced using
what I learned in the Machine Learning course. My program
probably does not beat the state of the art in handwritten digit recognition.
However, I have for the first time observed the practical problems of using
powerful neural networks. This experience will definitely be helpful for
my future research.
Acknowledgments
I want to thank the whole Eckovation team, and especially the mentors of the
Machine Learning course. I learned a lot of new things from their
lectures. I am also grateful to them for their constant support and for the
useful discussions in the Forum group about the details of the project.
____________
Name: Swayamdipta Saha
E-mail: swayamsaha@gmail.com
