This document describes a pattern recognition application for classifying banknotes as genuine or counterfeit. Several supervised and unsupervised classifiers are implemented and compared using a banknote dataset containing features extracted from images. The quadratic classifier performed slightly better than the linear and k-means classifiers, with misclassification error rates of around 11-14% overall. The application could provide an automated solution to classify banknotes and help address the problem of counterfeiting.
Application of pattern recognition techniques in banknote classification
Nikolaos Mouzakitis
February 27, 2019
Abstract
This paper describes a pattern recognition application for banknote
classification, based on the banknote dataset available in the UCI
Machine Learning repository. The proposed classifiers are described,
compared and analyzed on this two-class separation problem before
conclusions are drawn about the efficiency of the solution. Supervised
classifiers (linear and quadratic) are implemented, while k-means and
the single linkage algorithm are used for unsupervised classification.
For the supervised classifiers, three different training/validation
splits were created: 70%/30%, 80%/20% and 90%/10%.
1 Introduction
Although the use of electronic money and digital currency has increased
rapidly over the last decade, recent technological advances in color
printing have made counterfeiting an increasingly serious problem. In a
perfect scenario only the authorities would be able to print banknotes,
but forged currency is appearing more often with the aid of technology.
Detecting such counterfeits is an important task wherever money is
exchanged. As for the size of the problem, Grace and Sheema [3] describe
how counterfeiting became a big hurdle in India, as well as how managing
a large amount of counterfeits imposes additional problems on banks and
other organizations.
A general conclusion is that anybody with a computer or a laser printer is
capable of generating forged banknotes, which adds value to applications
that solve this particular problem successfully.
Several solutions have been proposed in this context, such as classification
by a convolutional neural network (CNN) [5] and detection based on edge
detection [7], while Feng, Ren, Zhang and Suen proposed recognition of the
serial numbers of the banknotes [8]. Raho, Al-Khiat and Al-Hamami [9] also
investigate the use of the k-Nearest Neighbor algorithm to determine the
category of the currency and whether it is genuine.
2 Proposed Application
To solve the problem and avoid manual testing of each banknote, which is a
time-consuming task, an automated application is needed. The proposed
application is based on a set of classifiers that includes supervised
algorithms (linear and quadratic classifiers) and unsupervised ones
(k-means and hierarchical single linkage clustering). The proposed solution
is a system that requires a camera to acquire the image of the banknote to
be classified and a Wavelet Transform tool. The classifiers can either be
implemented locally (on a microcontroller/microprocessor) or, if a
connection (Ethernet/Cloud) is available, run on a distributed platform
that sends back the classification results.
2.1 Description
The input data of the proposed application are the camera images captured
for each new instance to be classified. After an image is acquired, feature
extraction takes place using a Wavelet Transform tool in order to obtain the
values of the four main features described in the Dataset subsection. When
all four features have been extracted, as shown in Figure 1, the instance is
classified by the several classifiers, either locally or in a distributed
way. In the distributed case the results are transmitted back to the
requesting microprocessor, and the final decision is made based on the
classifiers' results. Note that, for the application to work as designed,
various digital image processing techniques alongside pattern recognition
algorithms are essential in the implementation. Another important point is
that, because several classifiers are implemented, the final decision can
also be the result of 'voting' between the classifiers, so that the result
is cross-checked across all of them. This also provides a new metric of how
confident we are in a given decision, since we know how each classifier
classified the instance.
Figure 1: Distributed application scheme
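The 'voting' idea described in this subsection can be sketched as a simple
majority vote. This is only an illustrative sketch: the function name
majority_vote, the label strings and the example votes are assumptions and
do not come from the original implementation. The agreement fraction plays
the role of the confidence metric mentioned above.

```python
from collections import Counter

def majority_vote(predictions):
    """Return (winning_label, agreement) for one banknote.

    predictions: list of class labels, one per classifier.
    agreement: fraction of classifiers that voted for the winner.
    Ties are resolved in favour of the label that appears first.
    """
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)

# Hypothetical outputs of four classifiers for a single instance.
example_votes = ["genuine", "genuine", "counterfeit", "genuine"]
label, agreement = majority_vote(example_votes)
print(label, agreement)
```

An agreement of 1.0 would mean that all classifiers gave the same answer,
while a value close to 0.5 would flag an instance that may deserve manual
inspection.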
2.2 Dataset
The dataset is owned by Volker Lohweg (University of Applied Sciences,
Ostwestfalen-Lippe) and was donated by Helene Dörksen (University of
Applied Sciences, Ostwestfalen-Lippe). It contains information extracted
from images of genuine and forged banknotes. The data are available for
download from the UCI Machine Learning repository. The implemented
classifiers are evaluated and compared using this dataset. It contains 1372
instances, is intended for classification tasks, and has a rather large
number of web hits (approximately 150000 at the time of writing). A Wavelet
Transform tool was used to extract the features. There are five attributes
in the dataset: the variance of the WTI (Wavelet Transformed Image), the
skewness of the WTI, the kurtosis of the WTI, the entropy of the image, and
the class (genuine or counterfeit). Boxplots of the four feature attributes,
generated with Octave-Forge, are displayed in Figure 2. For implementing and
visualizing the linear and quadratic classifiers, the first two features
(variance and skewness) were selected.
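As an illustration of the dataset layout, the following sketch parses
comma-separated rows and keeps only the two features used by the supervised
classifiers. It assumes that each row holds the four features followed by
the class label; the helper name parse_banknotes and the sample rows are
made up for illustration and are not taken from the dataset.

```python
import csv
import io

def parse_banknotes(text):
    """Parse rows of the form: variance, skewness, kurtosis, entropy, class."""
    features, labels = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        *values, label = row
        features.append([float(v) for v in values])
        labels.append(int(label))
    return features, labels

# Made-up rows in the assumed layout, for illustration only.
sample = """1.5,7.0,-2.0,-0.5,0
-3.0,-4.0,5.0,0.1,1
"""
features, labels = parse_banknotes(sample)

# Keep only variance and skewness, the first two features.
two_features = [row[:2] for row in features]
print(two_features, labels)
```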
Figure 2: Boxplots of dataset’s features
2.3 Software
The results were generated with GNU Octave, an open-source project with a
high degree of similarity to MATLAB. Several Octave packages supported the
work, including nan and statistics, which implement crucial functions
that were used.
2.4 Classifiers
Figures 3, 4 and 5 show the instances of all samples, the training set and
the validation set, respectively, when 80% of the data form the training set
and 20% the validation set. In the following subsections the classifiers
used are described and their classification output is visualized.
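A training/validation split such as the 80/20 one can be sketched as
follows. The paper does not state how the instances were assigned to each
set, so the random shuffle, the fixed seed and the function name
split_dataset are assumptions made for this example.

```python
import random

def split_dataset(instances, train_fraction, seed=0):
    """Shuffle a copy of the instances and split it into (training, validation)."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Integer stand-ins for the 1372 instances of the dataset.
data = list(range(1372))
for fraction in (0.7, 0.8, 0.9):
    train, valid = split_dataset(data, fraction)
    print(fraction, len(train), len(valid))
```

Using a fixed seed keeps the split reproducible, so that all classifiers
can be compared on exactly the same validation instances.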
2.5 Linear classifier
Figure 3: All samples of the dataset
Figure 4: Training Set
Figure 5: Validation Set
The first classifier implemented is the linear classifier. Linear
classifiers identify the class to which an instance belongs by making the
classification decision based on the value of a linear combination of the
characteristics. In the proposed implementation the two selected features
are the variance and the skewness, and the classifier is based on the
Euclidean distance metric. Figure 6 shows the classifier and its results on
the validation set for the 80/20 per cent split. Results have also been
produced for the 70/30 and 90/10 splits.
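One common way to obtain a linear classifier from the Euclidean distance is
the minimum-distance classifier, which assigns an instance to the class with
the nearest mean. Its decision boundary is the perpendicular bisector of the
segment joining the two class means, which is a straight line in two
dimensions. The sketch below assumes this interpretation; the paper does not
give the exact formulation, and the function names and the (variance,
skewness) values are made up for illustration.

```python
def class_means(points, labels):
    """Mean feature vector of each class."""
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(point, means):
    """Assign the class whose mean is nearest in squared Euclidean distance."""
    def sq_dist(mean):
        return sum((p - m) ** 2 for p, m in zip(point, mean))
    return min(means, key=lambda label: sq_dist(means[label]))

# Made-up (variance, skewness) training pairs with labels 0 and 1.
train = [[2.0, 6.0], [3.0, 8.0], [-2.0, -3.0], [-1.0, -5.0]]
train_labels = [0, 0, 1, 1]
means = class_means(train, train_labels)
print(classify([1.5, 5.0], means))
```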
2.6 Quadratic classifier
Quadratic classifiers are based on QDA (quadratic discriminant analysis), a
statistical classification method that separates the two classes with a
conic section (i.e. a line, circle, ellipse, parabola or hyperbola). The
quadratic model is therefore a generalization of the linear model, with the
advantage that it can produce more complex separating surfaces. The second
classifier implemented is the quadratic classifier.
In this implementation the selected features were again the variance and
skewness of the banknotes, since they offer the best way to distinguish the
classes, and the results cover the 70/30, 80/20 and 90/10 per cent splits
between training and validation sets.
Figure 7 visualizes the 80/20 result of the quadratic classification using
QDA.
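A minimal two-feature QDA can be sketched as below. It models each class as
a Gaussian with its own mean and covariance and picks the class with the
largest discriminant score. This is a sketch under stated assumptions, not
the Octave implementation used for the results: the function names, the use
of class frequencies as priors and the sample points are all illustrative.

```python
import math

def fit_gaussian(points):
    """Mean and 2x2 sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def discriminant(x, mean, cov, prior):
    """Quadratic discriminant score of point x for one class."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * math.log(det) - 0.5 * maha + math.log(prior)

def qda_fit(points, labels):
    """Estimate (mean, covariance, prior) for every class."""
    model = {}
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        mean, cov = fit_gaussian(members)
        model[label] = (mean, cov, len(members) / len(points))
    return model

def qda_predict(x, model):
    """Return the class with the highest discriminant score."""
    return max(model, key=lambda label: discriminant(x, *model[label]))

# Made-up (variance, skewness) pairs for two classes.
points = [(2, 6), (3, 8), (4, 5), (3, 7), (-2, -3), (-1, -5), (-3, -4), (-2, -6)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
model = qda_fit(points, labels)
print(qda_predict((2.5, 6.0), model), qda_predict((-2.0, -4.0), model))
```

Because each class keeps its own covariance matrix, the boundary where the
two scores are equal is a conic section rather than a straight line.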
Figure 6: Results of linear classifier visualized
Figure 7: Results of quadratic classifier visualized
2.7 K-means
The first unsupervised classification algorithm implemented is k-means
clustering, with the number of clusters fixed to two.
K-means aims to partition a dataset of n instances into k clusters, where
every instance belongs to the cluster with the nearest mean. Figure 8 shows
the results of k-means clustering on the banknote dataset.
Figure 8: Visualization of K-means clustering
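The k-means procedure described in this subsection can be sketched with
Lloyd's algorithm, which alternates between assigning each instance to the
nearest centroid and recomputing the centroids. The initialization from the
first k points, the function name kmeans and the sample points are
assumptions made for this example; the Octave routine used for the results
may differ in these details.

```python
def kmeans(points, k=2, iterations=100):
    """Lloyd's algorithm; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        new_assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no instance changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Made-up (variance, skewness) pairs forming two visible groups.
points = [[2, 6], [3, 8], [-2, -3], [-1, -5], [3, 7], [-2, -6]]
assignment, centroids = kmeans(points, k=2)
print(assignment)
```

The cluster indices returned by k-means are arbitrary, so they have to be
matched to the genuine and counterfeit classes before an error rate can be
computed.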
2.8 Hierarchical Clustering
Single linkage clustering has also been implemented and evaluated on the
dataset. This method belongs to the family of hierarchical clustering.
It builds clusters bottom-up, each time merging the two clusters that
contain the closest pair of instances, and it tends to produce long, thin
clusters in which neighboring elements are close to each other.
Dendrograms are used to visualize the linking between the instances in this
form of clustering.
In the dendrogram (Figure 9) one can see two big clusters representing the
two classes (genuine and counterfeit), while on the far right of the
dendrogram lie some possible outliers that have not been removed from the
dataset.
Figure 9: Dendrogram of single linkage clustering
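The bottom-up merging used by single linkage can be sketched with a naive
implementation that repeatedly merges the two clusters containing the
closest pair of instances. The function name single_linkage, the stopping
rule at a fixed number of clusters and the sample points are assumptions
made for illustration; this version is far too slow for large datasets and
is meant only to show the idea.

```python
import math

def single_linkage(points, n_clusters=2):
    """Naive agglomerative clustering with single (nearest-pair) linkage.

    Returns the clusters as sorted lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a].extend(clusters.pop(b))
    return [sorted(c) for c in clusters]

# Made-up points: a chain of three on the left, a pair on the right.
points = [[0, 0], [0, 1], [0, 2], [5, 5], [5, 6]]
print(single_linkage(points, n_clusters=2))
```

Recording the distance at which each merge happens would give the heights
needed to draw a dendrogram like the one in Figure 9.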
3 Results
The linear classifier had an error of 12.65% for the 80/20 split, 13.542%
for the 70/30 split and 11.98% for the 90/10 split.
The quadratic classifier had an error of 11.29% for the 80/20 split, 12.187%
for the 70/30 split and 11.82% for the 90/10 split.
K-means had a misclassification error of 12.75%.
The single linkage algorithm, as shown in the dendrogram, detects two main
clusters to which the majority of the instances belong. The
misclassification rate is similar across all the implemented classifiers.
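The error rates reported in this section are fractions of misclassified
instances. A possible way to compute them is sketched below. For the
clustering algorithms, cluster indices carry no class meaning, so the sketch
evaluates both possible labelings and keeps the better one. How the
clustering error was actually computed is not stated in the paper, so this
matching step, the function names and the example labels are assumptions.

```python
def error_rate(predicted, actual):
    """Fraction of instances whose predicted label differs from the true one."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)

def clustering_error_rate(cluster_ids, actual):
    """Error of a two-cluster result under the better of the two labelings.

    Assumes cluster ids and class labels both take the values 0 and 1, so
    swapping the cluster names turns an error of e into 1 - e.
    """
    direct = error_rate(cluster_ids, actual)
    return min(direct, 1.0 - direct)

# Made-up labels for eight instances.
actual = [0, 0, 0, 1, 1, 1, 1, 0]
supervised = [0, 0, 1, 1, 1, 1, 1, 0]
clusters = [1, 1, 1, 0, 0, 0, 1, 1]
print(error_rate(supervised, actual), clustering_error_rate(clusters, actual))
```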
4 Conclusion
In conclusion, as the results show, all classifiers were close in their
classification errors, with the quadratic classifier performing slightly
better. The quadratic classifier obtained better results than the linear
classifier in every examined split of the dataset, while the k-means
algorithm did not produce a significantly different error rate. In the
application context, since a 'voting' system can be used to reach the final
decision, the results of the individual classifiers can be weighted to
determine the class of a new instance. Such an approach makes use of all the
information each classifier can offer, and it can be customized by changing
the weight of each classifier's contribution to the final decision.
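The weighted decision described above can be sketched as follows. The
classifier names, the weights and the predictions are hypothetical values
chosen for illustration; in practice the weights could, for example, be
derived from each classifier's validation error.

```python
def weighted_vote(predictions, weights):
    """Return the label with the largest total weight and its weight share.

    predictions: dict mapping classifier name to predicted label.
    weights: dict mapping classifier name to a non-negative weight.
    """
    totals = {}
    for name, label in predictions.items():
        totals[label] = totals.get(label, 0.0) + weights[name]
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(totals.values())

# Hypothetical predictions and weights for a single banknote.
predictions = {
    "linear": "counterfeit",
    "quadratic": "genuine",
    "kmeans": "counterfeit",
    "single_linkage": "genuine",
}
weights = {"linear": 1.0, "quadratic": 2.0, "kmeans": 1.0, "single_linkage": 1.0}
print(weighted_vote(predictions, weights))
```

With equal weights this example would be a tie, whereas giving the
quadratic classifier a larger weight lets its prediction decide the result.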
5 Bibliography
1. Recognition of Fake Currency Based on Security Thread Feature of
Currency, E. Pilania, B. Arora
2. Forgery Detection and Value Identification of Euro Banknotes, A. Bruna,
G.M. Farinella, G. Guarnera, S. Battiato
3. A Survey on Fake Indian Paper Currency Identification System, P.J. Grace,
A. Sheema
4. Wikipedia.org
5. Multi-National Banknote Classification Based on Visible-Light Line
Sensor and Convolutional Neural Network, T.D. Pham, D.E. Lee, K.R. Park
6. Counterfeit Currency Recognition Using SVM With Note to Coin
Exchanger, S.V. Walke, D.D. Chandwadkar
7. Original and Counterfeit Money Detection Based on Edge Detection,
M. Akbar, Awaluddin, A. Sedayu, S. Widyarto
8. Automatic recognition of serial numbers in bank notes, B. Feng, M. Ren,
X-Y. Zhang, C.Y. Suen
9. Cash Currencies Recognition Using k-Nearest Neighbor Classifier,
G.I. Raho, A. Al-Khiat, A.H. Al-Hamami