Application of Data Mining and Machine
Learning for Credit Card Fraud Detection
A Comparative Analysis on two Academic Papers
Author: Christian Adom
Computer Science Department , City University London
26th April 2015
Abstract
The use of credit cards as a payment method has become increas-
ingly popular in recent years. As advancements in e-commerce tech-
nologies continue to emerge, consumers are increasingly taking advan-
tage of the convenience and flexibility offered by credit card purchases.
This change in consumer behaviour has given rise to an unprecedented
increase in cases of credit card fraud and subsequently led to substantial
financial loss for consumers, issuing banks and merchants in
the payments industry. In this paper we compare and discuss two aca-
demic research papers that attempt to apply data mining and machine
learning techniques to address the problem of credit card fraud detec-
tion and prevention. Our aim is to critically assess the methodologies,
techniques and results presented by the researchers for both papers in
countering the problem of credit card fraud. Next we identify the areas
of similarities in the approach and methodologies used, while clearly
delineating the differences in the techniques implemented. Finally,
we provide a short excursion into the use of these techniques in
industry.
1 Introduction
The use of credit cards as the primary payment method has become in-
creasingly popular with consumers in recent years. A study conducted in
2014 by the UK Cards Association revealed there were approximately 175.6
million cards in issue (55.4 million of which were credit cards) and that card
expenditure rose by £0.6 billion, amounting to a total of £49.0 billion. [1]
There is strong evidence to suggest that this growing trend in consumer
behaviour is driven by advancements in e-commerce technologies and con-
sumers are increasingly taking advantage of the convenience and flexibility
offered by credit card purchases.
Although a prosperous economic time for the e-commerce market, issuing
banks, merchants and payment providers are now facing an unprecedented
rise in the number of credit card fraud cases as a result of this rise. A study
conducted by Financial Fraud Action UK (FFA) estimates that fraud
losses on UK cards totalled £450.4 million in 2013, a 16% increase from
£388.3 million in 2012. [2]
In an effort to address this growing problem, a number of fraud detection
models and techniques have been proposed by researchers within both the
payments industry and academia. A survey of the current literature on fraud
detection methods reveals a number of approaches to tackling this problem.
Below are a few of the research areas: [3] [4] [5]
• Bayesian Network
• Genetic Algorithm
• Neural Network
• Support Vector Machine
• Decision Tree
• Fuzzy Logic Based System
• Hidden Markov Model
• Meta Learning Strategy
For the purpose of this paper, we will discuss two academic research papers
that address the issue of credit card fraud detection and prevention, namely:
1. Credit Card Fraud detection using Hidden Markov Model [6]
2. Neural data mining for Credit card Fraud Detection [7]
The aim of this paper is to critically assess the methodologies, techniques
and results presented by the researchers in countering the problem of credit
card fraud. Areas of similarities in the approach and methodologies used are
identified, while clearly delineating between the differences in the techniques
implemented.
Generally, credit card fraud can be divided into two types, off-line and
on-line fraud. For off-line fraud, the crime is committed by using the stolen
physical card to make (usually face-to-face) unauthorized transactions. In
the case of on-line fraud, the fraud is committed by stealing the card details
(without the knowledge of the legal card holder) and proceeding to make
unauthorized transactions through the internet, phone or other card-holder
not present (CNP) channels. [8]
In order to successfully detect this kind of fraud it is necessary to develop
methods to analyse the spending pattern on a card and find inconsistencies
with respect to the normal spending patterns of the card-holder.
Generally, humans exhibit specific behaviour in their spending habits, thus
every card-holder can be represented by a set of patterns containing unique
information such as purchase category, time/location of purchase and
transaction amount. Deviation from known patterns is considered to be a
potential threat.
The principles introduced above are the key insights into successful fraud
detection presented in both papers, and it is on this common ground that
we will address the research. The remainder of this paper is organised as
follows: Sections 2 and 3 present a summary and review of the two
papers under comparison. Section 4 covers a comparative analysis.
2 Credit Card Fraud Detection Using Hidden Markov
Model
Credit card fraud detection for CNP transactions using a Hidden Markov
Model (HMM) is based on the analysis of the spending pattern of a
card-holder. The key approach is to model the sequence of operations for
processing credit card transactions using the HMM.
The details of items purchased in individual transactions (which are not
known to the fraud detection system) are represented as the underlying
finite Markov chain, which is not observable. The transactions can only be
observed through a stochastic process that produces the sequence of the
amounts of money spent in each transaction.
The HMM is trained on the normal behaviour of the card-holder using
synthetic data. Artificial data is needed because researchers face
difficulties in obtaining real credit card data sets due to security, privacy
and cost issues. Upon completion of model training, the fraud detection
system (FDS) is tested by running transactions through it. Incoming
transactions that are not accepted by the HMM with sufficiently high
probability are marked as fraud.
2.1 HMM Background
In probability theory, a Markov Model is a stochastic model used to model
randomly changing systems where it is assumed that future states depend
only on the present state and not on the sequence of events that precede
it. [9] When the system being modelled is assumed to be a Markov process
with unobserved (hidden) states, we can represent it as a Hidden Markov
Model.
An HMM has a finite set of states which are governed by a set of
transition probabilities, where for any particular state, an outcome or
observation can be generated according to an associated probability
distribution. It is only the outcome and not the state that is visible to an
external observer (hence the term "hidden").
2.1.1 Characteristics of an HMM
To set the foundation for understanding how HMMs are applied to the
problem of fraud detection, it is necessary to provide a formal description
of the characteristics of an HMM.
Each Hidden Markov Model is defined by the following elements: states,
observation symbols, transition probabilities, observation symbol
probabilities, and initial probabilities.
1. The N states of the model are defined as:
\[ S = \{S_1, S_2, \ldots, S_N\} \tag{1} \]
2. The M observation symbols per state are defined as:
\[ V = \{V_1, V_2, \ldots, V_M\} \tag{2} \]
3. The state transition probability distribution $A = [a_{ij}]$ is given by:
\[ a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \quad 1 \le i, j \le N \tag{3} \]
where $q_t$ is the state at time $t$.
The transition probabilities satisfy the stochastic constraints:
\[ a_{ij} \ge 0, \quad 1 \le i, j \le N \quad \text{and} \quad \sum_{j=1}^{N} a_{ij} = 1, \quad 1 \le i \le N \]
4. The observation symbol probability distribution $B = [b_j(k)]$ in each
state is given by:
\[ b_j(k) = P(V_k \text{ at } t \mid q_t = S_j), \quad 1 \le j \le N, \; 1 \le k \le M \tag{4} \]
The observation symbol probabilities satisfy the stochastic constraints:
\[ b_j(k) \ge 0, \quad 1 \le j \le N, \; 1 \le k \le M \quad \text{and} \quad \sum_{k=1}^{M} b_j(k) = 1, \quad 1 \le j \le N \]
5. The initial state probability vector $\pi$ gives the probability that the
model is in state $S_i$ at the initial time step $t = 1$ and is defined as:
\[ \pi_i = P(q_1 = S_i), \quad 1 \le i \le N, \quad \text{such that} \quad \sum_{i=1}^{N} \pi_i = 1 \tag{5} \]
From the above definitions, the complete specification of an HMM requires
the estimation of two model parameters, N and M, and three probability
distributions A, B, and π. This is represented by the notation:
\[ \lambda = (A, B, \pi) \tag{6} \]
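As a minimal sketch of the definitions above, the parameters λ = (A, B, π) can be held as plain nested lists and checked against the stochastic constraints. The function names and all numeric values below are illustrative assumptions, not values taken from either paper.

```python
import math

def is_stochastic_row(row, tol=1e-9):
    """A row is stochastic if its entries are non-negative and sum to 1."""
    return all(p >= 0 for p in row) and math.isclose(sum(row), 1.0, abs_tol=tol)

def is_valid_hmm(A, B, pi):
    """Check the shapes and stochastic constraints of lambda = (A, B, pi)."""
    n = len(pi)
    # A must be N x N and B must have one row per state.
    if len(A) != n or len(B) != n or any(len(row) != n for row in A):
        return False
    # Every state must emit over the same M observation symbols.
    if len({len(row) for row in B}) != 1:
        return False
    rows = list(A) + list(B) + [pi]
    return all(is_stochastic_row(row) for row in rows)

# Hypothetical model: N = 3 hidden states, M = 3 observation symbols.
A = [[0.6, 0.1, 0.3], [0.3, 0.3, 0.4], [0.4, 0.2, 0.4]]
B = [[0.7, 0.25, 0.05], [0.1, 0.3, 0.6], [0.5, 0.4, 0.1]]
pi = [0.5, 0.1, 0.4]
print(is_valid_hmm(A, B, pi))
```

Checking the constraints up front is a cheap guard, because a row that does not sum to 1 silently distorts every probability computed from the model.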
2.2 Implementing the HMM for Credit Card Fraud Detection
With the mathematical characteristics of the HMM defined, we now turn to
providing a high level outline of the fraud detection process using an HMM.
This can be summarised in three steps:
1. Each incoming transaction is submitted to the FDS (usually running
at an issuing bank) for verification.
2. The FDS tries to find an anomaly in the transaction based on the
spending profile of the card-holder.
3. If the FDS confirms the transaction to be fraudulent, it raises an alert
and the issuing bank declines the transaction.
This general process raises a number of questions, such as:
• How are the credit card transaction processing operations mapped in
terms of an HMM?
• How are the spending profiles of the cardholders determined and
categorised?
We discuss these and other issues in the next section.
2.2.1 HMM for Credit Card Transaction Processing
The process for mapping credit card transaction processing in terms of an
HMM can be enumerated in six steps:
1. Decide on the observation symbols in the model.
2. Quantize purchase values x into M price ranges V1, V2, ..., VM forming
the observation symbols.
3. Define the observation symbols as V = {l, m, h}, making M = 3,
where l = low, m = medium, h = high.
4. The transition in purchase type is used as the state transition in the
model.
5. The set of all possible types of purchases forms the set of hidden states
of the HMM.
6. Compute the probability matrices A, B and π.
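Steps 2 and 3 can be sketched as follows. The price boundaries are hypothetical fixed values chosen for illustration; how the ranges are actually determined is not described in this review.

```python
from bisect import bisect_right

SYMBOLS = ("l", "m", "h")      # low, medium, high (M = 3)
PRICE_BOUNDS = (50.0, 200.0)   # hypothetical boundaries between the ranges

def quantize(amount, bounds=PRICE_BOUNDS):
    """Map a purchase amount to its observation symbol."""
    # bisect_right counts how many boundaries are <= amount,
    # which is exactly the index of the matching price range.
    return SYMBOLS[bisect_right(bounds, amount)]

def to_observations(amounts, bounds=PRICE_BOUNDS):
    """Turn a sequence of purchase amounts into an observation sequence."""
    return [quantize(a, bounds) for a in amounts]

print(to_observations([12.5, 80.0, 450.0]))
```

With this convention an amount equal to a boundary falls into the higher range; the opposite convention would use `bisect_left`.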
We can make some general comments about the steps defined above.
In step 4 the transition in purchase type is used as the state transition in the
model. This choice seems to contradict our intuition: since a credit
card-holder makes different kinds of purchases of different amounts over a
period of time, the natural choice would be to consider the sequence of
transaction amounts instead. However, the sequence of types of purchase is
considered to be more reliable than the transaction amounts, because a
card-holder makes purchases depending on his/her need for different
types of items over a period of time. This spending behaviour subsequently
generates a sequence of transaction amounts. Furthermore, the individual
transaction amount generally depends on the associated type of purchase.
Figure 1: Special case of fully connected HMM
In step 6, the optimal values for these parameters are determined in the
training phase using the Baum-Welch (forward-backward) algorithm.
With the process defined, we present a graphical representation of an HMM
in Figure 1. This is a special case of a fully connected HMM in
which every state of the model can be reached in a single step from every
other state. In this figure GR, EL and MI represent Groceries, Electronics
and Miscellaneous purchases respectively.
2.3 Process flow of FDS
Figure 2: Process flow of the FDS
After obtaining an estimate of the HMM parameters through the training
phase, Abhinav et al. obtain an initial sequence of R observation symbols
from the cardholder's transactions. This sequence is then passed as input
to the HMM to compute the probability of acceptance:
\[ \alpha_1 = P(O_1, O_2, \ldots, O_R \mid \lambda) \tag{7} \]
They then form another sequence of length R by discarding $O_1$ and
appending $O_{R+1}$. The new sequence is passed to the HMM and the
probability of acceptance is computed as:
\[ \alpha_2 = P(O_2, O_3, \ldots, O_{R+1} \mid \lambda) \tag{8} \]
where $O_{R+1}$ is the symbol generated by a new transaction at time t + 1.
The metric for accepting or declining the sequence is defined as:
\[ \Delta\alpha = \alpha_1 - \alpha_2 \tag{9} \]
If $\Delta\alpha > 0$, the new sequence is accepted by the HMM with lower
probability than the old one, so the new transaction is potentially
fraudulent. It is flagged as fraud provided that the following additional
condition holds:
\[ \frac{\Delta\alpha}{\alpha_1} \ge \text{Threshold} \tag{10} \]
Otherwise, if the condition does not hold, $O_{R+1}$ is permanently
added to the sequence and the new sequence is used for determining the
validity of the next transaction.
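The detection test described above can be sketched with the standard forward algorithm, which computes P(O | λ) for an observation sequence. This is an illustrative sketch, not the authors' implementation: the function names, the integer encoding of symbols (0 = l, 1 = m, 2 = h), the two-state model and the threshold value are all assumptions.

```python
def sequence_probability(obs, A, B, pi):
    """Forward algorithm: P(O | lambda) for a list of symbol indices."""
    n = len(pi)
    # Initialisation: start in state i and emit the first symbol.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: move to state j, then emit the next symbol.
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

def is_suspicious(window, new_symbol, A, B, pi, threshold):
    """Sliding-window test: compare the old and new sequence probabilities."""
    alpha1 = sequence_probability(window, A, B, pi)
    alpha2 = sequence_probability(window[1:] + [new_symbol], A, B, pi)
    delta = alpha1 - alpha2
    # delta > 0 implies alpha1 > 0, so the division is safe.
    return delta > 0 and delta / alpha1 >= threshold

# Hypothetical two-state model over the symbols l, m, h.
A = [[0.9, 0.1], [0.5, 0.5]]
B = [[0.8, 0.15, 0.05], [0.3, 0.4, 0.3]]
pi = [0.7, 0.3]
window = [0, 0, 1, 0, 0]   # mostly low-value purchases
print(is_suspicious(window, 2, A, B, pi, threshold=0.5))
```

For long sequences the raw probabilities become very small, so a practical version would work with scaled or log probabilities; that refinement is omitted here for brevity.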
The complete process flow of the FDS is shown in Figure 2, where the
process is divided into two separate phases (Training and Detection)
2.4 Results and Analysis
In the final stage of testing and analysis, large-scale simulations were carried
out to test the effectiveness of the FDS. Abhinav et al. used True
Positive (TP), False Positive (FP), TP-FP spread and Accuracy metrics
to measure the capability of the system. In this context, TP represents the
fraction of fraudulent transactions correctly classified as fraudulent, whereas
FP is the fraction of genuine transactions incorrectly classified as fraudulent.
Their counterparts, True Negative (TN) and False Negative (FN), were also
used as part of the accuracy calculation.
To measure the performance of the FDS, the difference between TP and
FP, called the TP-FP spread, was used as a metric. Accuracy represents
the fraction of the total number of transactions (both genuine and fraudulent)
that have been classified correctly, and is given by:
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11} \]
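The metrics above can be computed from raw classification counts as in the sketch below. The function name and the example counts are illustrative assumptions, not results from the paper.

```python
def evaluate(tp, fp, tn, fn):
    """Compute TP rate, FP rate, TP-FP spread and accuracy from counts."""
    tp_rate = tp / (tp + fn)   # fraction of fraudulent transactions caught
    fp_rate = fp / (fp + tn)   # fraction of genuine transactions flagged
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "TP": tp_rate,
        "FP": fp_rate,
        "spread": tp_rate - fp_rate,
        "accuracy": accuracy,
    }

# Hypothetical run: 100 fraudulent and 900 genuine transactions.
print(evaluate(tp=80, fp=50, tn=850, fn=20))
```

Because genuine transactions heavily outnumber fraudulent ones, accuracy alone can look high even when many frauds are missed, which is one reason the TP-FP spread is reported alongside it.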
Lastly, experiments were carried out to determine the correct combination
of HMM design parameters, namely the number of states, the sequence
length and the threshold value. After obtaining these parameters, a
comparative study with another fraud detection system was carried out as a
means of benchmarking the system.
2.5 Performance Comparison
The performance of the proposed system was measured while varying the
number of fraudulent transactions and the spending profile of the card-holder.
The performance was then compared with the credit card fraud detection
technique proposed by S. J. Stolfo et al. in the paper "Credit Card Fraud
Detection Using Meta-Learning". [10]
Abhinav et al. carried out experiments on four profiles, one of which is a
mixed profile, meaning that the spending profile was not considered in
their approach. The other three profiles are (55 35 10),
(70 20 10), and (95 3 2). Here, an (x y z) profile represents a low spending
profile card-holder who has carried out x% of their transactions in the
low, y% in the medium, and z% in the high range. The goal was to determine
how the system performed for different mixes of transaction amount ranges
in the transactions.
For every combination of spending profile and malicious transaction
distribution, they carried out 100 test runs and recorded the average result.
For consistency, the same data set was used to determine the performance
of both the approach of Abhinav et al. and that of S. J. Stolfo et al.
(denoted "OA" and "ST" respectively for convenience).
Fig 3a shows the variation of TP and FP for the two approaches using
the spending profile (95 3 2). The variation of TP-FP spread and Accuracy is
shown in Fig 3b. The graphs show that the TP of the researchers' approach
is very close to that of Stolfo et al.'s approach. Furthermore, both
approaches have similar values of FP.
They concluded that the two systems had comparable accuracies and average
TP-FP spread, and showed a similar trend with variation in µ.
Figure 3: Performance variation of the two systems (OA and ST) for the
spending profile (95 3 2)
From the testing results, Abhinav et al. concluded that the proposed
system has an overall accuracy of 80%, even under large variations in
input conditions, which is much higher than the overall accuracy of the
method proposed by Stolfo et al. They therefore argue that their system
correctly detects most of the fraudulent transactions.
3 Neural Data Mining for Card Fraud Detection
This paper applies a combination of data mining techniques and a neural
network algorithm to address the fraud detection problem. The aim is to ob-
tain high fraud coverage, combined with a low false alarm rate. The general
approach is to model the sequence of operations in credit card transaction
processing using a confidence-based neural network. To ensure the accuracy
and effectiveness of fraud detection, receiver operating characteristic (ROC)
analysis is applied as a means of measuring the accuracy of the model.
A neural network is initially trained with synthetic data. If the trained
neural network model (NNM) then accepts an incoming credit card
transaction only with sufficiently low confidence, the transaction is
considered to be fraudulent. The paper shows how confidence values, a neural
network algorithm and ROC analysis can be combined successfully to perform
credit card fraud detection.
3.1 Background on Neural Networks
Fraud detection using Artificial Neural Networks takes inspiration from the
biological nervous system and brain function of human beings. The
approach is based on the brain's ability to learn from past experience and
apply the data/knowledge in decision making and problem solving. The
goal of researchers has been to apply these principles to credit card fraud
detection methods. [11]
The neural network is usually trained with both personal and historical
transaction data of the card-holder, such as occupation, income, transaction
amount, purchase location, frequency and time period of purchases. In ad-
dition to training the network with this information, it is also very common
to include the variety of credit card fraud faced by a particular issuing bank
into the training data.
The neural network is typically depicted as having three interconnected
layers (as shown in Figure 4):
• Input Layer: Receives input from an external source such as a database
• Hidden Layer: A layer hidden from external observers that receives
input from the input layer or another hidden layer
• Output Layer: Exposes the network to external observers and provides
the final output of the network
Figure 4: Multi-layered neural network
The output of the neural network generally takes real values between 0
and 1. If the output is below a specified threshold value (for example
0.6), the transaction is classified as genuine. If the output is above the
threshold, the "probability" that the transaction is fraudulent
is considered to be high. This is a rather subtle distinction.
3.2 Data Set
As discussed in previous sections, a critical part of designing an effective
NNM is to supply the model with realistic transactional data. However, due
to security, privacy and cost issues, researchers face difficulties in obtaining
credit card data sets. The typical approach taken by researchers has
therefore been to generate "synthetic" data to facilitate the development
and testing of the model.
In generating the data, it is necessary to provide the neural network with
a mix of genuine as well as fraudulent transactions to train the classifiers.
Tao et al specify a ratio of approximately 100 good transactions for each
fraudulent transaction in the training data set in order to accurately simulate
real customer transactions.
In designing the data set for the model, it is important to clearly specify
the credit card payment-related training data attributes for the NNM. Tao
et al. specify key attributes such as:
• time of transaction
• location of transaction
• type of merchandise
• business code for merchandise
• business type for merchandise
• transaction amount
Furthermore, the idea of "Actual Target Values" is used (for
classification) to guide the neural network learning process, where a target
value of 1 represents abnormal, and 0 represents normal.
3.3 Calculation of confidence value
The distinctive approach introduced by Tao et al. in applying neural
network techniques to fraud detection is to convert both training and
testing data into confidence values before feeding them into the NNM. These
values are scaled to the range [0.0, 1.0], and each input reflects the
card-holder's historical information up to that point. Furthermore, they
categorise the input attributes as discrete or continuous: attributes such as
time, location and type of merchandise are treated as discrete, whilst an
attribute such as transaction amount is continuous. Two separate methods
are therefore proposed for calculating confidence values, depending on the
category of the input attribute.
To illustrate the calculation of confidence values for discrete and continu-
ous attributes, they consider the location of the transaction and transaction
amount respectively as examples:
Given a sequence of transactions:
\[ X = \{x_1, x_2, \ldots, x_n\} \]
The confidence for the transaction location is given by:
\[ C(x_i) = \frac{m_{x_i}}{n} \tag{12} \]
where:
$n$ is the number of uses of the credit card
$x_i$ for $i = 1, 2, \ldots, n$ is the location of the i-th use of the credit card
$m_{x_i}$ denotes the number of uses of the credit card in the location $x_i$
The confidence for the transaction amount is given by:
\[ C(x_i) = e^{-\frac{1}{2}\left(\frac{x_i - \mu}{\sigma}\right)^2} \tag{13} \]
where:
$n$ is the number of uses of the credit card
$x_i$ for $i = 1, 2, \ldots, n$ is the transaction amount of the i-th use of the credit card
$\sigma$ is the standard deviation of the transaction amount
$\mu$ is the average transaction amount
The purpose of calculating the confidence values is twofold. Firstly,
they are tested against a threshold value that enables the researchers to
determine whether the transaction is genuine or fraudulent. Secondly,
the confidence calculation scales every neural network input to the
range [0.0, 1.0], and this uniform format helps to speed up the neural
network learning process.
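Equations (12) and (13) can be sketched directly in code. The function names and sample data are hypothetical, and the use of the population standard deviation is an assumption, since the review does not say which estimator is used.

```python
import math
from collections import Counter
from statistics import mean, pstdev

def location_confidence(locations):
    """Equation (12): share of past card uses made at each location."""
    n = len(locations)
    return {loc: count / n for loc, count in Counter(locations).items()}

def amount_confidence(amount, past_amounts):
    """Equation (13): Gaussian-shaped confidence for a transaction amount."""
    mu = mean(past_amounts)
    sigma = pstdev(past_amounts)   # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("confidence is undefined when all amounts are equal")
    return math.exp(-0.5 * ((amount - mu) / sigma) ** 2)

history = ["London", "London", "Paris", "London"]
print(location_confidence(history))
print(amount_confidence(25.0, [10.0, 20.0, 30.0]))
```

Both functions return values in [0.0, 1.0]: a location never seen before has confidence 0, and an amount equal to the historical mean has confidence 1.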
3.4 Back Propagation and Receiver operating characteristic
A brief overview of the theory and operation of the NNM was given in
Sections 3 and 3.1. In this section we look at the methods applied by Tao et
al. in further detail.
3.4.1 Back Propagation
In the proposed system, the researchers apply a multi-layer neural network
model and a backpropagation (BP) algorithm on the model.
The BP algorithm is a common approach to training artificial neural
networks. It computes the gradient of a cost function with respect to all the
weights in the network. The gradient is passed as input into an optimization
method (such as steepest descent) which subsequently updates the weights,
with the aim of minimizing the cost function. [12]
In this study the BP algorithm learns by iteratively processing a data set
of training "tuples" (finite ordered lists of elements):
\[ X = \{x_1, x_2, \ldots, x_n\} \]
The algorithm compares the network's prediction for each tuple with the
actual known target value. For each training tuple, the weights are adjusted
to minimize the mean squared error between the network's prediction and the
actual target value. The modifications are made in the backwards direction,
from the output layer $Y = \{y_1, y_2, \ldots, y_n\}$, through each hidden
layer down to the first hidden layer.
For this study the researchers used a sigmoid function:
\[ S(t) = \frac{1}{1 + e^{-t}} \tag{14} \]
for the nodes in the hidden layers and the output layer.
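The training loop described above can be sketched for a network with a single hidden layer and one output node, using the sigmoid of equation (14) and a squared-error cost. This is a minimal illustration rather than the researchers' implementation: the layer size, learning rate, epoch count, initial weight range and toy data are all assumptions.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def forward(x, W1, b1, W2, b2):
    """Forward pass: input -> hidden layer -> single output node."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, y

def train(data, hidden=3, lr=0.5, epochs=2000, seed=0):
    """Per-tuple gradient descent on the squared error."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, target in data:
            h, y = forward(x, W1, b1, W2, b2)
            total += (y - target) ** 2
            # Output delta, using S'(t) = S(t) * (1 - S(t)).
            d_out = 2 * (y - target) * y * (1 - y)
            # Hidden deltas, propagated backwards through the old W2.
            d_hid = [d_out * W2[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            for j in range(hidden):
                W2[j] -= lr * d_out * h[j]
                b1[j] -= lr * d_hid[j]
                for i in range(n_in):
                    W1[j][i] -= lr * d_hid[j] * x[i]
            b2 -= lr * d_out
        losses.append(total / len(data))   # mean squared error this epoch
    return (W1, b1, W2, b2), losses

# Toy tuples: two confidence inputs each; target 1 = abnormal, 0 = normal.
data = [([0.9, 0.8], 0), ([0.8, 0.9], 0), ([0.1, 0.2], 1), ([0.2, 0.1], 1)]
params, losses = train(data)
print(losses[0], losses[-1])
```

The hidden deltas are computed before any weight is changed, so each update uses the gradient of the error at the current weights, which is what steepest descent requires.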
3.4.2 Receiver operating characteristic (ROC)
In this paper, ROC analysis has the dual purpose of ensuring the accuracy
and performance of the model is adequate. This is achieved by obtaining
an optimal threshold for determining whether a transaction is genuine or
fraudulent. This threshold value is tested against the output of the NNM,
which takes the form of confidence value Y = y1, y2, ..., yn. Here the classi-
fication of the transaction as genuine or fraudulent will then be determined
by whether this confidence value is higher or lower than the threshold.
A crucial part of ROC analysis is the specification of the confusion matrix,
whose layout is shown in Table 1. The confusion matrix compares
actual classification values (rows) against the model's predictions of fraud
(columns). If the model predicted fraud with perfect accuracy, all
observations in the confusion matrix would reside in the two cells labelled
"True Positive" and "True Negative". The objective is to maximize correct
predictions while managing the increase in false alarms. [13]
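In this context the False Positive Rate (FPR) is the ratio of normal spending
patterns incorrectly detected as abnormal to the total number of normal
spending patterns, and is given by:
\[ FPR = \frac{FP}{FP + TN} \tag{15} \]
Similarly, the True Positive Rate (TPR) is the ratio of abnormal spending
patterns correctly detected as abnormal to the total number of abnormal
spending patterns, and is given by:
\[ TPR = \frac{TP}{TP + FN} \tag{16} \]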
Table 1: Confusion matrix
                  Predicted Y        Predicted N
Actual Y          True Positive      False Negative
Actual N          False Positive     True Negative
With the FPR and TPR metrics defined, we introduce another important
metric at this point, namely the Youden Index. In the medical and biological
sciences, the Youden Index (or Youden exponent, as it is termed in the
paper under review) is typically used as a summary measure of the ROC
curve. It both measures the effectiveness of a "diagnostic marker" and
enables the selection of an optimal threshold value (cutoff point) for the
marker.
It is defined as J = sensitivity + specificity − 1 [14]
Its value ranges from 0 to 1, where a value of 1 indicates that there are no
false positives or false negatives, i.e. the test is perfect. A value of 0
indicates that the test performs no better than chance.
In the context of this paper, when considering the optimal point on the
ROC curve, Tao et al. define the Youden exponent E, which is maximal at
the optimal point, as:
\[ E = TPR - FPR \tag{17} \]
Then, taking into consideration the cost of false negatives and false
positives, the weighted exponent (CE) is defined as:
\[ CE = \frac{FNC}{FPC + FNC} \cdot TPR - \frac{FPC}{FPC + FNC} \cdot FPR \tag{18} \]
where FNC is the cost of a false negative and FPC is the cost of a false
positive.
Satisfying the constraints:
\[ 0 \le FPC \le 1, \quad 0 \le FNC \le 1, \quad FPC + FNC \neq 0 \tag{19} \]
And, in the case of equal costs:
\[ FPC = FNC \neq 0 \tag{20} \]
Now when equation (20) holds, equation (18) reduces to:
\[ CE = \frac{1}{2} (TPR - FPR) = \frac{1}{2} E \tag{21} \]
The crucial point illustrated here is that the use of the cost-weighted
exponent overcomes the inadequacies of setting a threshold without
considering error cost.
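A short sketch shows how a cost-weighted exponent can move the chosen threshold. It assumes that a higher network output means a more likely fraud, and that the true positive rate is weighted by the false negative cost while the false positive rate is weighted by the false positive cost. The function names, scores, labels, candidate thresholds and costs are all hypothetical.

```python
def rates(scores, labels, threshold):
    """TPR and FPR when outputs >= threshold are classified as fraud (1)."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp / (tp + fn), fp / (fp + tn)

def weighted_exponent(tpr, fpr, fnc, fpc):
    """Cost-weighted exponent; assumes fnc + fpc is non-zero."""
    total = fnc + fpc
    return fnc / total * tpr - fpc / total * fpr

def best_threshold(scores, labels, candidates, fnc=1.0, fpc=1.0):
    """Pick the candidate threshold that maximises the weighted exponent."""
    return max(
        candidates,
        key=lambda t: weighted_exponent(*rates(scores, labels, t), fnc, fpc),
    )

scores = [0.1, 0.2, 0.35, 0.45, 0.6, 0.9]   # hypothetical network outputs
labels = [0, 0, 0, 1, 0, 1]                 # 1 = fraudulent, 0 = genuine
candidates = [0.3, 0.4, 0.5, 0.7]
print(best_threshold(scores, labels, candidates))                    # equal costs
print(best_threshold(scores, labels, candidates, fnc=0.1, fpc=0.9))  # costly false alarms
```

With equal costs the selection is the same as maximising E = TPR − FPR; making false alarms more expensive pushes the selected threshold upwards on this toy data.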
3.5 Results and Analysis
In this section we summarise the experimental results and concluding anal-
ysis obtained by the researchers after running test transactions through the
FDS.
Firstly, 7000 records of synthetic data were used for training the NNM and
3000 for testing purposes.
Figure 5: Calculation of confidence and classification values per record
The table in Figure 5 shows 10 records of card-holder behaviour attributes
(Time of transactions, Merchant type, Business code, etc) with their asso-
ciated values (0.56, 0.83, 0.55, etc) after confidence values were calculated.
For each record a target value of 0 (normal) or 1 (abnormal) was assigned
based on testing the output yi against a specified threshold value (see
Section 3.4.2).
We now consider the ROC curve shown in Figure 6. From this the
researchers attempt to show that setting the threshold value at 0.4 improves
the detection accuracy of the model. Without considering the cost factor
(CE), the model reaches its optimal point where the TPR is 91.2% and the
FPR is 13.55%, providing a reasonably good ratio of true positives to
false positives. They can then choose to factor in the cost by
computing the threshold value according to equation (21), noting that the
optimal threshold value will be adjusted when the relative cost changes.
Figure 6: ROC curve
4 Comparative Analysis
In this section we compare and contrast both papers by identifying the
areas of similarities in the approach used, while clearly delineating between
the differences. Figure 7 provides an illustration of the key areas that will be
discussed in this section, where we attempt to show that there is a general
concord in the scope, motivation and methodologies applied in both papers
whilst expounding on the primary difference in the implementation and
techniques applied to solving the fraud detection problem.
4.1 Areas of concordance
4.1.1 Scope
As an introduction to this section, it should be stated that these two papers
were carefully selected from a range of available literature on the topic of
fraud detection because they both approach the issue from a
perspective that incorporates data mining, machine learning and statistical
techniques within the context of credit card fraud detection. The scope of
their research is therefore closely aligned, making for an interesting
exposition and comparison.
Figure 7: Illustration of various components of both papers
4.1.2 Motivation
As discussed in Section 1, e-commerce has grown rapidly in the retail
market and gained popularity due to its ability to facilitate instantaneous
transactions. Consequently, credit card payment has become the most
important means of payment, helped by the rapid development of
information technology globally. However, as the usage of credit cards
increases, the rate of fraudulent practices is also increasing substantially.
Both Abhinav et al. and Tao et al. are acutely aware of this problem and its
implications for card-holders and especially issuing banks, who face the risk
of losing millions in fraud compensation and fines. It is therefore clear that
this common concern acts as the motivation for the research presented in
both papers.
4.1.3 Methodology
The use of transaction data to understand the spending patterns of
card-holders and to detect credit card fraud is not a new concept, and it has
been widely recognised by researchers in this area (as evidenced by the
extensive literature on fraud detection techniques) as the most effective
means of solving the fraud detection problem. It is therefore not surprising
that we find this methodological approach adopted by both Abhinav et al.
and Tao et al. in their work.
The common approach adopted by both is to model the sequence of
operations for processing credit card transactions, then test the model by
running transactions through it. This subsequently leads Abhinav et al.
and Tao et al. to introduce TP and FP metrics as a means of ensuring the
accuracy and effectiveness of their fraud detection models. Furthermore, the
researchers analyse historical transactional data and attempt to find
inconsistencies in spending patterns as a means of detecting fraudulent
transactions.
To ensure the fraud detection model is successful in its primary purpose,
Abhinav et al. and Tao et al. place a strong emphasis on training their
models with realistic data that contains a good mix/ratio of fraudulent to
genuine transactions. However, they both have to deal with issues
surrounding the security, privacy and cost of obtaining real transaction
data, hence the need to generate synthetic data.
Lastly, both researchers run test transactions against their trained models,
making the decision to accept or reject transactions based on a specified
threshold value.
4.2 Implementation and Technical differences
4.2.1 Modelling the sequence of operations in credit card transaction processing
In the approach proposed by Abhinav et al., the key idea is to model the
sequence of operations for processing credit card transactions using the HMM,
where the details of items purchased in individual transactions are
represented as the underlying finite Markov chain, which is not observable.
The transactions can only be observed through a stochastic process that
produces the sequence of the amounts of money spent in each transaction.
In the work of Tao et al., by contrast, the sequence of operations in credit
card transaction processing is modelled using a confidence-based neural
network and ROC analysis. The calculation of confidence values is
introduced for both discrete and continuous input attributes.
4.3 Training Data and Learning Time
The distinctive approach introduced by Tao et al. in applying neural
network techniques to fraud detection is to convert both training
and testing data into confidence values before feeding them into the NNM.
These values are scaled to the range [0.0, 1.0], and each input reflects the
card-holder's historical information up to that point. In Abhinav et al.'s
approach, once the sequence is formed from the cardholder's transactions,
it is passed directly into the HMM without formatting.
With regard to the training and learning time of the model, it is important
to note the differences in the training approaches employed by the two sets
of researchers. In Abhinav et al's implementation, although the training is
done offline, the learning time of the model can have a strong impact on the
scalability of the system, because an HMM has to be trained for every
cardholder. Considering that an issuing bank such as HSBC processes millions
of transactions for an equally large number of cardholders, this
implementation could lead to performance and scalability issues.
In Tao et al's approach, the challenge to the performance and scalability of
the system lies in finding an efficient optimization algorithm for
minimizing the cost function using its gradient and subsequently updating
the weights. Fortunately, optimization algorithms are well studied and a
large body of techniques is available that can be implemented to address
this problem.
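To make the weight-update step concrete, here is a minimal sketch of batch gradient descent for a single sigmoid unit with a cross-entropy cost. It is a generic textbook procedure under assumed settings (learning rate, epoch count, toy data), not the network or optimizer used by Tao et al.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Output of the unit: a value between 0 and 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean cross-entropy cost."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    m = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = predict(weights, bias, x) - y  # gradient of the cost w.r.t. z
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        # Move each parameter against its averaged gradient.
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias
```

For example, training on hypothetical one-dimensional confidence inputs where low confidence is labelled fraudulent (1) should yield a unit whose output rises as confidence falls.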
4.4 Threshold values
In Tao et al's work, the key insight is to take the costs of false negatives
and false positives into consideration. The approach overcomes the
inadequacy of setting a threshold without considering error costs by
adjusting the optimal threshold value when the relative costs change.
Furthermore, ROC analysis is introduced to show that setting the threshold
at the optimal value improves the detection accuracy of the model. In
Abhinav et al's implementation, by contrast, the threshold value is learnt
empirically through the training stage, which uses the Baum-Welch algorithm,
and is then effectively fixed for the cardholder's Markov model.
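The idea of a cost-dependent threshold can be sketched as a simple scan over candidate thresholds, picking the one with the lowest total misclassification cost. The cost values, scores and labels below are hypothetical, and the scan is a simplification rather than the ROC-based procedure of Tao et al.

```python
def total_cost(scores, labels, threshold, cost_fn, cost_fp):
    """Cost incurred when every score at or above the threshold is flagged as fraud."""
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return cost_fn * fn + cost_fp * fp


def best_threshold(scores, labels, cost_fn, cost_fp):
    """Return the candidate threshold with the lowest total cost."""
    # Candidates are the observed scores, plus infinity (flag nothing).
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates,
               key=lambda t: total_cost(scores, labels, t, cost_fn, cost_fp))


# Hypothetical model outputs and true labels (1 = fraud).
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]
```

With these toy values, making missed fraud ten times as costly as a false alarm selects a lower threshold (0.4) than the reverse cost ratio does (0.7), which is the adjustment behaviour described above.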
4.5 Acceptance and Declining of incoming transactions
In Tao et al's implementation, the output of the neural network generally
takes real values between 0 and 1. If the output is below a specified
threshold value, the transaction is classified as genuine; if the output is
above the threshold, the "probability" that the transaction is fraudulent is
considered to be high.
By comparison, in Abhinav et al's work, after obtaining an estimate of the
HMM parameters through the training phase, the authors form an initial
sequence of symbols from the cardholder's transactions. This sequence is
then passed as input to the HMM to compute its probability of acceptance.
Subsequently, if the metric ∆α > 0, they conclude that the sequence is
accepted by the HMM with low probability and is therefore potentially
fraudulent, provided additional conditions are satisfied.
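The acceptance test can be sketched with the standard forward algorithm. Here α1 is the probability of the existing symbol window and α2 the probability of the window shifted to include the new transaction, so ∆α = α1 - α2. The relative-drop check (∆α/α1 at or above a threshold) is an assumption about the "additional conditions", and the model parameters in the usage note are made up for illustration.

```python
def forward_probability(obs, start, trans, emit):
    """P(obs | model) via the forward algorithm; obs holds symbol indices."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                 for j in range(n)]
    return sum(alpha)


def is_suspicious(history, new_symbol, model, threshold):
    """Flag the new symbol if it makes the window markedly less probable."""
    alpha1 = forward_probability(history, *model)
    # Drop the oldest symbol and append the new one.
    alpha2 = forward_probability(history[1:] + [new_symbol], *model)
    delta = alpha1 - alpha2
    return delta > 0 and delta / alpha1 >= threshold


# Hypothetical 2-state model over symbols 0 = low, 1 = medium, 2 = high.
model = (
    [0.6, 0.4],                              # initial state probabilities
    [[0.7, 0.3], [0.4, 0.6]],                # state transition matrix
    [[0.8, 0.15, 0.05], [0.5, 0.4, 0.1]],    # emission probabilities
)
```

For a cardholder whose recent window is mostly low spending, `is_suspicious([0, 0, 1, 0], 2, model, 0.5)` flags a sudden high-value transaction, while another low-value transaction passes.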
5 Usage of HMM and NNM in Industry
To place these papers in the context of practical application, it is helpful
to examine one case of general application of these techniques and another
case related directly to credit card fraud detection. Considering HMMs
first, they have applications in a broad range of scientific and
mathematical fields where the goal is to recover a data sequence that is not
immediately observable (but other data that depend on the sequence are).
For example, HMMs have applications in automatic speech recognition, where
the model is trained to recognize speech utterances from given observations
[15]. They have also been used extensively in biological sequence analysis,
where HMMs can be used to solve various sequence analysis problems such as
pairwise and multiple sequence alignment, gene annotation, classification,
etc. [16]
Neural network models are an immensely popular technique for fraud detection
and are used by some of the world's largest banks. For example, Santander
bank uses a fraud detection system called Falcon Fraud Manager from FICO,
which is based heavily on neural models and leverages adaptive analytics.
The adaptive model adjusts the base neural network "Falcon score" in
response to real-time fraud tactics that were not present at the time of the
neural network model training [17][18]. FICO analytics software and tools
are used across multiple industries to manage risk, fight fraud, build more
profitable customer relationships, optimize operations and meet strict
government regulations [19].
References
[1] UK Cards Association
http://www.theukcardsassociation.org.uk/wm documents/December
2014 Full Report.pdf
[2] Financial Fraud Action UK
http://www.financialfraudaction.org.uk/downloads.asp?genre=consumer
[3] S. Benson Edwin Raj, A. Annie Portia, Analysis on Credit Card Fraud
Detection Methods. International Conference on Computer, Communication
and Electrical Technology, ICCCET2011, 18th-19th March 2011
[4] Masoumeh Zareapoor, Seeja K.R. and M. Afshar Alam, Analysis of Credit
Card Fraud Detection Techniques: based on Certain Design Criteria.
International Journal of Computer Applications (0975-8887), Volume 52,
No. 3, August 2012
[5] Khyati Chaudhary, Jyoti Yadav, Bhawna Mallick, A review of Fraud
Detection Techniques: Credit Card. International Journal of Computer
Applications (0975-8887), Volume 45, No. 1, May 2012
[6] Abhinav Srivastava, Amlan Kundu, Shamik Sural and Arun K. Majumdar,
Credit Card Fraud detection using Hidden Markov Model. IEEE
Transactions on Dependable and Secure Computing, Vol. 5, No. 1,
January-March 2008
[7] Tao Guo, Gui-Yang Li, Neural data mining for Credit card Fraud
Detection. Proceedings of the Seventh International Conference on Machine
Learning and Cybernetics, Kunming, 12-15 July 2008
[8] Yufeng Kou, Chang-Tien Lu, Sirirat Sinvongwattana, Survey of Fraud
Detection Techniques. Proceedings of the 2004 IEEE International
Conference on Networking, Sensing & Control, Taipei, Taiwan, March 21-23,
2004
[9] Markov Model
http://en.wikipedia.org/wiki/Markov model
[10] S.J. Stolfo, D.W. Fan, W. Lee, A.L. Prodromidis, and P.K. Chan, Credit
Card Fraud Detection Using Meta-Learning: Issues and Initial Results.
Proc. AAAI Workshop AI Methods in Fraud and Risk Management, pp.
83-90, 1997
[11] Khyati Chaudhary, Jyoti Yadav, Bhawna Mallick, A review of Fraud
Detection Techniques: Credit Card. International Journal of Computer
Applications (0975-8887), Volume 45, No. 1, May 2012
[12] David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams, Learning
representations by back-propagating errors. Nature 323 (6088): 533-536,
8 October 1986
[13] Using Data Mining Techniques for Fraud Detection, A SAS Institute
Best Practices Paper
http://www.ag.unr.edu/gf/dm/dmfraud.pdf
[14] Ronen Fluss, David Faraggi, and Benjamin Reiser, Estimation of the
Youden index and its associated cutoff point. Biometrical Journal, 2005
[15] HMM Speech Recognition
http://www.fysiskplanering.se/fou/cuppsats.nsf/all/e156a6197d8b0678c1256bbb003f62
[16] Byung-Jun Yoon, Hidden Markov Models and their Applications in
Biological Sequence Analysis
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2766791/
[17] FICO Analytics
http://www.fico.com/en/node/8140?file=5380
[18] FICO Analytics
http://www.fico.com/en/blogs/tag/score-performance/page/5/
[19] FICO Analytics
http://www.fico.com/en/about-us
23

More Related Content

What's hot

Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...Melissa Moody
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsHariteja Bodepudi
 
Credit Card Fraud Detection System: A Survey
Credit Card Fraud Detection System: A SurveyCredit Card Fraud Detection System: A Survey
Credit Card Fraud Detection System: A SurveyIJMER
 
IRJET- Fraud Detection in Online Credit Card Payment
IRJET-  	  Fraud Detection in Online Credit Card PaymentIRJET-  	  Fraud Detection in Online Credit Card Payment
IRJET- Fraud Detection in Online Credit Card PaymentIRJET Journal
 
Detecting Fraud Using Transaction Frequency Data
Detecting Fraud Using Transaction Frequency DataDetecting Fraud Using Transaction Frequency Data
Detecting Fraud Using Transaction Frequency DataITIIIndustries
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detectionvineeta vineeta
 
Game Theory Approach for Identity Crime Detection
Game Theory Approach for Identity Crime DetectionGame Theory Approach for Identity Crime Detection
Game Theory Approach for Identity Crime DetectionIOSR Journals
 
Unsupervised Learning for Credit Card Fraud Detection
Unsupervised Learning for Credit Card Fraud DetectionUnsupervised Learning for Credit Card Fraud Detection
Unsupervised Learning for Credit Card Fraud DetectionIRJET Journal
 
Ijigsp v6-n2-6
Ijigsp v6-n2-6Ijigsp v6-n2-6
Ijigsp v6-n2-6Anita Pal
 
Subscription fraud analytics using classification
Subscription fraud analytics using classificationSubscription fraud analytics using classification
Subscription fraud analytics using classificationSomdeep Sen
 
Emerging technologies enabling in fraud detection
Emerging technologies enabling in fraud detectionEmerging technologies enabling in fraud detection
Emerging technologies enabling in fraud detectionUmasree Raghunath
 
Telecommunication Fraud Detection and Prevention
Telecommunication Fraud Detection and PreventionTelecommunication Fraud Detection and Prevention
Telecommunication Fraud Detection and PreventionSumera Khan
 
Fraud Detection with Cost-Sensitive Predictive Analytics
Fraud Detection with Cost-Sensitive Predictive AnalyticsFraud Detection with Cost-Sensitive Predictive Analytics
Fraud Detection with Cost-Sensitive Predictive AnalyticsAlejandro Correa Bahnsen, PhD
 
Credit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperCredit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperGarvit Burad
 
Dn31538540
Dn31538540Dn31538540
Dn31538540IJMER
 
Data Leakage Detectionusing Distribution Method
Data Leakage Detectionusing Distribution Method Data Leakage Detectionusing Distribution Method
Data Leakage Detectionusing Distribution Method IJMER
 
Pollyanna Document Classifier
Pollyanna Document ClassifierPollyanna Document Classifier
Pollyanna Document ClassifierVijay PG
 
IRJET-Fake Product Review Monitoring
IRJET-Fake Product Review MonitoringIRJET-Fake Product Review Monitoring
IRJET-Fake Product Review MonitoringIRJET Journal
 
CNP Payment Fraud and its Affect on Gift Cards
CNP Payment Fraud and its Affect on Gift CardsCNP Payment Fraud and its Affect on Gift Cards
CNP Payment Fraud and its Affect on Gift CardsChristopher Uriarte
 
Understanding the Card Fraud Lifecycle : A Guide For Private Label Issuers
Understanding the Card Fraud Lifecycle :  A Guide For Private Label IssuersUnderstanding the Card Fraud Lifecycle :  A Guide For Private Label Issuers
Understanding the Card Fraud Lifecycle : A Guide For Private Label IssuersChristopher Uriarte
 

What's hot (20)

Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
 
Credit Card Fraud Detection System: A Survey
Credit Card Fraud Detection System: A SurveyCredit Card Fraud Detection System: A Survey
Credit Card Fraud Detection System: A Survey
 
IRJET- Fraud Detection in Online Credit Card Payment
IRJET-  	  Fraud Detection in Online Credit Card PaymentIRJET-  	  Fraud Detection in Online Credit Card Payment
IRJET- Fraud Detection in Online Credit Card Payment
 
Detecting Fraud Using Transaction Frequency Data
Detecting Fraud Using Transaction Frequency DataDetecting Fraud Using Transaction Frequency Data
Detecting Fraud Using Transaction Frequency Data
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detection
 
Game Theory Approach for Identity Crime Detection
Game Theory Approach for Identity Crime DetectionGame Theory Approach for Identity Crime Detection
Game Theory Approach for Identity Crime Detection
 
Unsupervised Learning for Credit Card Fraud Detection
Unsupervised Learning for Credit Card Fraud DetectionUnsupervised Learning for Credit Card Fraud Detection
Unsupervised Learning for Credit Card Fraud Detection
 
Ijigsp v6-n2-6
Ijigsp v6-n2-6Ijigsp v6-n2-6
Ijigsp v6-n2-6
 
Subscription fraud analytics using classification
Subscription fraud analytics using classificationSubscription fraud analytics using classification
Subscription fraud analytics using classification
 
Emerging technologies enabling in fraud detection
Emerging technologies enabling in fraud detectionEmerging technologies enabling in fraud detection
Emerging technologies enabling in fraud detection
 
Telecommunication Fraud Detection and Prevention
Telecommunication Fraud Detection and PreventionTelecommunication Fraud Detection and Prevention
Telecommunication Fraud Detection and Prevention
 
Fraud Detection with Cost-Sensitive Predictive Analytics
Fraud Detection with Cost-Sensitive Predictive AnalyticsFraud Detection with Cost-Sensitive Predictive Analytics
Fraud Detection with Cost-Sensitive Predictive Analytics
 
Credit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research PaperCredit Card Fraudulent Transaction Detection Research Paper
Credit Card Fraudulent Transaction Detection Research Paper
 
Dn31538540
Dn31538540Dn31538540
Dn31538540
 
Data Leakage Detectionusing Distribution Method
Data Leakage Detectionusing Distribution Method Data Leakage Detectionusing Distribution Method
Data Leakage Detectionusing Distribution Method
 
Pollyanna Document Classifier
Pollyanna Document ClassifierPollyanna Document Classifier
Pollyanna Document Classifier
 
IRJET-Fake Product Review Monitoring
IRJET-Fake Product Review MonitoringIRJET-Fake Product Review Monitoring
IRJET-Fake Product Review Monitoring
 
CNP Payment Fraud and its Affect on Gift Cards
CNP Payment Fraud and its Affect on Gift CardsCNP Payment Fraud and its Affect on Gift Cards
CNP Payment Fraud and its Affect on Gift Cards
 
Understanding the Card Fraud Lifecycle : A Guide For Private Label Issuers
Understanding the Card Fraud Lifecycle :  A Guide For Private Label IssuersUnderstanding the Card Fraud Lifecycle :  A Guide For Private Label Issuers
Understanding the Card Fraud Lifecycle : A Guide For Private Label Issuers
 

Viewers also liked

Credit risk scoring model final
Credit risk scoring model finalCredit risk scoring model final
Credit risk scoring model finalRitu Sarkar
 
Survey on Credit Card Fraud Detection Using Different Data Mining Techniques
Survey on Credit Card Fraud Detection Using Different Data Mining TechniquesSurvey on Credit Card Fraud Detection Using Different Data Mining Techniques
Survey on Credit Card Fraud Detection Using Different Data Mining Techniquesijsrd.com
 
Frauds in telecom sector
Frauds in telecom sectorFrauds in telecom sector
Frauds in telecom sectorsksahu099
 
Data Mining in telecommunication industry
Data Mining in telecommunication industryData Mining in telecommunication industry
Data Mining in telecommunication industrypragya ratan
 
Application of data mining
Application of data miningApplication of data mining
Application of data miningSHIVANI SONI
 
Data mining in Telecommunications
Data mining in TelecommunicationsData mining in Telecommunications
Data mining in TelecommunicationsMohsin Nadaf
 
Anomaly detection in deep learning
Anomaly detection in deep learningAnomaly detection in deep learning
Anomaly detection in deep learningAdam Gibson
 
Fraud Detection presentation
Fraud Detection presentationFraud Detection presentation
Fraud Detection presentationHernan Huwyler
 
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic ConceptsData Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic ConceptsSalah Amean
 
ACFE Presentation on Analytics for Fraud Detection and Mitigation
ACFE Presentation on Analytics for Fraud Detection and MitigationACFE Presentation on Analytics for Fraud Detection and Mitigation
ACFE Presentation on Analytics for Fraud Detection and MitigationScott Mongeau
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data miningDataminingTools Inc
 
PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014Sri Ambati
 
The Top Skills That Can Get You Hired in 2017
The Top Skills That Can Get You Hired in 2017The Top Skills That Can Get You Hired in 2017
The Top Skills That Can Get You Hired in 2017LinkedIn
 

Viewers also liked (17)

Fraud Detection Architecture
Fraud Detection ArchitectureFraud Detection Architecture
Fraud Detection Architecture
 
Credit risk scoring model final
Credit risk scoring model finalCredit risk scoring model final
Credit risk scoring model final
 
Big Data CDR Analyzer - Kanthaka
Big Data CDR Analyzer - KanthakaBig Data CDR Analyzer - Kanthaka
Big Data CDR Analyzer - Kanthaka
 
Survey on Credit Card Fraud Detection Using Different Data Mining Techniques
Survey on Credit Card Fraud Detection Using Different Data Mining TechniquesSurvey on Credit Card Fraud Detection Using Different Data Mining Techniques
Survey on Credit Card Fraud Detection Using Different Data Mining Techniques
 
Lecture - Data Mining
Lecture - Data MiningLecture - Data Mining
Lecture - Data Mining
 
Frauds in telecom sector
Frauds in telecom sectorFrauds in telecom sector
Frauds in telecom sector
 
Data Mining in telecommunication industry
Data Mining in telecommunication industryData Mining in telecommunication industry
Data Mining in telecommunication industry
 
Application of data mining
Application of data miningApplication of data mining
Application of data mining
 
Data mining in Telecommunications
Data mining in TelecommunicationsData mining in Telecommunications
Data mining in Telecommunications
 
Anomaly detection in deep learning
Anomaly detection in deep learningAnomaly detection in deep learning
Anomaly detection in deep learning
 
Fraud Detection presentation
Fraud Detection presentationFraud Detection presentation
Fraud Detection presentation
 
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic ConceptsData Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
Data Mining:Concepts and Techniques, Chapter 8. Classification: Basic Concepts
 
ACFE Presentation on Analytics for Fraud Detection and Mitigation
ACFE Presentation on Analytics for Fraud Detection and MitigationACFE Presentation on Analytics for Fraud Detection and Mitigation
ACFE Presentation on Analytics for Fraud Detection and Mitigation
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data mining
 
Deep Learning for Fraud Detection
Deep Learning for Fraud DetectionDeep Learning for Fraud Detection
Deep Learning for Fraud Detection
 
PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014
 
The Top Skills That Can Get You Hired in 2017
The Top Skills That Can Get You Hired in 2017The Top Skills That Can Get You Hired in 2017
The Top Skills That Can Get You Hired in 2017
 

Similar to Application of Data Mining and Machine Learning techniques for Fraud Detection_v.1.0

Analysis of Spending Pattern on Credit Card Fraud Detection
Analysis of Spending Pattern on Credit Card Fraud DetectionAnalysis of Spending Pattern on Credit Card Fraud Detection
Analysis of Spending Pattern on Credit Card Fraud DetectionIOSR Journals
 
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...IRJET Journal
 
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdf
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdfTanvi_Sharma_Shruti_Garg_pre.pdf.pdf
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdfShrutiGarg649495
 
A Survey of Online Credit Card Fraud Detection using Data Mining Techniques
A Survey of Online Credit Card Fraud Detection using Data Mining TechniquesA Survey of Online Credit Card Fraud Detection using Data Mining Techniques
A Survey of Online Credit Card Fraud Detection using Data Mining TechniquesIJSRD
 
A Comparative Study on Online Transaction Fraud Detection by using Machine Le...
A Comparative Study on Online Transaction Fraud Detection by using Machine Le...A Comparative Study on Online Transaction Fraud Detection by using Machine Le...
A Comparative Study on Online Transaction Fraud Detection by using Machine Le...IRJET Journal
 
A Novel Framework for Credit Card.
A Novel Framework for Credit Card.A Novel Framework for Credit Card.
A Novel Framework for Credit Card.Shakas Technologies
 
FRAUD DETECTION IN CREDIT CARD TRANSACTIONS
FRAUD DETECTION IN CREDIT CARD TRANSACTIONSFRAUD DETECTION IN CREDIT CARD TRANSACTIONS
FRAUD DETECTION IN CREDIT CARD TRANSACTIONSIRJET Journal
 
Credit Card Fraud Detection System Using Machine Learning Algorithm
Credit Card Fraud Detection System Using Machine Learning AlgorithmCredit Card Fraud Detection System Using Machine Learning Algorithm
Credit Card Fraud Detection System Using Machine Learning AlgorithmIRJET Journal
 
Problem Reduction in Online Payment System Using Hybrid Model
Problem Reduction in Online Payment System Using Hybrid ModelProblem Reduction in Online Payment System Using Hybrid Model
Problem Reduction in Online Payment System Using Hybrid ModelIJMIT JOURNAL
 
Software for Payment Cards: Choosing Wisely
Software for Payment Cards: Choosing WiselySoftware for Payment Cards: Choosing Wisely
Software for Payment Cards: Choosing WiselyCognizant
 
IRJET - Online Credit Card Fraud Detection and Prevention System
IRJET - Online Credit Card Fraud Detection and Prevention SystemIRJET - Online Credit Card Fraud Detection and Prevention System
IRJET - Online Credit Card Fraud Detection and Prevention SystemIRJET Journal
 
Secure Payments: How Card Issuers and Merchants Can Stay Ahead of Fraudsters
Secure Payments: How Card Issuers and Merchants Can Stay Ahead of FraudstersSecure Payments: How Card Issuers and Merchants Can Stay Ahead of Fraudsters
Secure Payments: How Card Issuers and Merchants Can Stay Ahead of FraudstersCognizant
 
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLING
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLINGCREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLING
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLINGIRJET Journal
 
Review on Fraud Detection in Electronic Payment Gateway
Review on Fraud Detection in Electronic Payment GatewayReview on Fraud Detection in Electronic Payment Gateway
Review on Fraud Detection in Electronic Payment GatewayIRJET Journal
 
The potentials for e-Commerce payments' growth in Ethiopia and the need for s...
The potentials for e-Commerce payments' growth in Ethiopia and the need for s...The potentials for e-Commerce payments' growth in Ethiopia and the need for s...
The potentials for e-Commerce payments' growth in Ethiopia and the need for s...The i-Capital Africa Institute
 
A Research Paper on Credit Card Fraud Detection
A Research Paper on Credit Card Fraud DetectionA Research Paper on Credit Card Fraud Detection
A Research Paper on Credit Card Fraud DetectionIRJET Journal
 
A rule-based machine learning model for financial fraud detection
A rule-based machine learning model for financial fraud detectionA rule-based machine learning model for financial fraud detection
A rule-based machine learning model for financial fraud detectionIJECEIAES
 
credit card fruad detection from the fake users.pptx
credit card fruad detection from the fake users.pptxcredit card fruad detection from the fake users.pptx
credit card fruad detection from the fake users.pptx227r1a0519
 

Similar to Application of Data Mining and Machine Learning techniques for Fraud Detection_v.1.0 (20)

J017216164
J017216164J017216164
J017216164
 
Analysis of Spending Pattern on Credit Card Fraud Detection
Analysis of Spending Pattern on Credit Card Fraud DetectionAnalysis of Spending Pattern on Credit Card Fraud Detection
Analysis of Spending Pattern on Credit Card Fraud Detection
 
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
 
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdf
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdfTanvi_Sharma_Shruti_Garg_pre.pdf.pdf
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdf
 
A Survey of Online Credit Card Fraud Detection using Data Mining Techniques
A Survey of Online Credit Card Fraud Detection using Data Mining TechniquesA Survey of Online Credit Card Fraud Detection using Data Mining Techniques
A Survey of Online Credit Card Fraud Detection using Data Mining Techniques
 
A Comparative Study on Online Transaction Fraud Detection by using Machine Le...
A Comparative Study on Online Transaction Fraud Detection by using Machine Le...A Comparative Study on Online Transaction Fraud Detection by using Machine Le...
A Comparative Study on Online Transaction Fraud Detection by using Machine Le...
 
A Novel Framework for Credit Card.
A Novel Framework for Credit Card.A Novel Framework for Credit Card.
A Novel Framework for Credit Card.
 
FRAUD DETECTION IN CREDIT CARD TRANSACTIONS
FRAUD DETECTION IN CREDIT CARD TRANSACTIONSFRAUD DETECTION IN CREDIT CARD TRANSACTIONS
FRAUD DETECTION IN CREDIT CARD TRANSACTIONS
 
Credit Card Fraud Detection System Using Machine Learning Algorithm
Credit Card Fraud Detection System Using Machine Learning AlgorithmCredit Card Fraud Detection System Using Machine Learning Algorithm
Credit Card Fraud Detection System Using Machine Learning Algorithm
 
Problem Reduction in Online Payment System Using Hybrid Model
Problem Reduction in Online Payment System Using Hybrid ModelProblem Reduction in Online Payment System Using Hybrid Model
Problem Reduction in Online Payment System Using Hybrid Model
 
Aw03303090311
Aw03303090311Aw03303090311
Aw03303090311
 
Software for Payment Cards: Choosing Wisely
Software for Payment Cards: Choosing WiselySoftware for Payment Cards: Choosing Wisely
Software for Payment Cards: Choosing Wisely
 
IRJET - Online Credit Card Fraud Detection and Prevention System
IRJET - Online Credit Card Fraud Detection and Prevention SystemIRJET - Online Credit Card Fraud Detection and Prevention System
IRJET - Online Credit Card Fraud Detection and Prevention System
 
Secure Payments: How Card Issuers and Merchants Can Stay Ahead of Fraudsters
Secure Payments: How Card Issuers and Merchants Can Stay Ahead of FraudstersSecure Payments: How Card Issuers and Merchants Can Stay Ahead of Fraudsters
Secure Payments: How Card Issuers and Merchants Can Stay Ahead of Fraudsters
 
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLING
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLINGCREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLING
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLING
 
Review on Fraud Detection in Electronic Payment Gateway
Review on Fraud Detection in Electronic Payment GatewayReview on Fraud Detection in Electronic Payment Gateway
Review on Fraud Detection in Electronic Payment Gateway
 
The potentials for e-Commerce payments' growth in Ethiopia and the need for s...
The potentials for e-Commerce payments' growth in Ethiopia and the need for s...The potentials for e-Commerce payments' growth in Ethiopia and the need for s...
The potentials for e-Commerce payments' growth in Ethiopia and the need for s...
 
A Research Paper on Credit Card Fraud Detection
A Research Paper on Credit Card Fraud DetectionA Research Paper on Credit Card Fraud Detection
A Research Paper on Credit Card Fraud Detection
 
A rule-based machine learning model for financial fraud detection
A rule-based machine learning model for financial fraud detectionA rule-based machine learning model for financial fraud detection
A rule-based machine learning model for financial fraud detection
 
credit card fruad detection from the fake users.pptx
credit card fruad detection from the fake users.pptxcredit card fruad detection from the fake users.pptx
credit card fruad detection from the fake users.pptx
 

Application of Data Mining and Machine Learning techniques for Fraud Detection_v.1.0

  • 1. Application of Data Mining and Machine Learning for Credit Card Fraud Detection A Comparative Analysis on two Academic Papers Author: Christian Adom Computer Science Department , City University London 26th April 2015 Abstract The use of credit cards as a payment method has become increas- ingly popular in recent years. As advancements in e-commerce tech- nologies continue to emerge, consumers are increasingly taking advan- tage of the convenience and flexibility offered by credit card purchases. This change in consumer behaviour has given rise to an unprecedented increase in cases of credit card frauds and subsequently lead to sub- stantial financial loss for consumers, issuing banks and merchants in the payments industry. In this paper we compare and discuss two aca- demic research papers that attempt to apply data mining and machine learning techniques to address the problem of credit card fraud detec- tion and prevention. Our aim is to critically assess the methodologies, techniques and results presented by the researchers for both papers in countering the problem of credit card fraud. Next we identify the areas of similarities in the approach and methodologies used, while clearly delineating between the differences in the techniques implemented. Fi- nally we provide a short excursion of the use of these techniques in industry. 1 Introduction The use of credit cards as the primary payment method has become in- creasingly popular with consumers in recent years. A study conducted in 2014 by the UK Cards Association revealed there were approximately 175.6 million cards in issue (55.4 million of which were credit cards) and that card expenditure rose by £0.6 billion, amounting to a total of £49.0 billion. [1] 1
  • 2. There is strong evidence to suggest that this growing trend in consumer behaviour is driven by advancements in e-commerce technologies and con- sumers are increasingly taking advantage of the convenience and flexibility offered by credit card purchases. Although a prosperous economic time for the e-commerce market, issuing banks, merchants and payment providers are now facing an unprecedented rise in the number of credit card fraud cases as a result of this rise. A study conducted by the Financial Fraud Action UK (FFA) estimates the fraud losses on UK cards totalled £450.4 million in 2013, a 16 % increases from £388.3 million in 2012. [2] In an effort to address this growing problem, a number of fraud detection models and techniques have been proposed by researchers within both the payments industry and academia. A survey of the current literature on fraud detection methods reveals a number of approaches to tackling this problem. Below are a few of the research areas: [3] [4] [5] • Bayesian Network • Genetic Algorithm • Neural Network • Support Vector Machine • Decision Tree • Fuzzy Logic Based System • Hidden Markov Model • Meta Learning Strategy For the purpose of this paper, we will discuss two academic research papers that address the issue of credit card fraud detection and prevention, namely: 1. Credit Card Fraud detection using Hidden Markov Model [6] 2. Neural data mining for Credit card Fraud Detection [7] The aim of this paper is to critically assess the methodologies, techniques and results presented by the researchers in countering the problem of credit card fraud. Areas of similarities in the approach and methodologies used are identified, while clearly delineating between the differences in the techniques implemented. 2
  • 3. Generally, credit card fraud can be divided into two types, off-line and on-line fraud. In off-line fraud, the crime is committed by using a stolen physical card to make (usually face-to-face) unauthorized transactions. In on-line fraud, the card details are stolen (without the knowledge of the legitimate card-holder) and used to make unauthorized transactions through the internet, phone or other card-holder not present (CNP) channels. [8]
In order to detect this kind of fraud it is necessary to develop methods that analyse the spending pattern on a card and find inconsistencies with respect to the normal spending patterns of the card-holder. Generally, humans exhibit specific behaviour in their spending habits, thus every card-holder can be represented by a set of patterns containing information such as purchase category, time and location of purchase, transaction amount, etc. A deviation from these known patterns is treated as a potential sign of fraud.
The principles introduced above are the key insights into successful fraud detection presented in both papers, and it is on this common ground that we will address the research.
The remainder of this paper is organised as follows: Sections 2 and 3 summarise and review the two papers under comparison. Section 4 covers a comparative analysis.
2 Credit Card Fraud Detection Using Hidden Markov Model
Credit card fraud detection for CNP transactions using a Hidden Markov Model (HMM) is based on the analysis of the spending pattern of a card-holder. The key approach is to model the sequence of operations for processing credit card transactions using the HMM.
The details of items purchased in individual transactions (not known to the fraud detection system) are represented as the underlying finite Markov chain, which is not observable.
The transactions can only be observed through a stochastic process that produces the sequence of the amounts of money spent in each transaction.
The HMM is trained with the normal behaviour of the card-holder using synthetic data. Artificial data is needed because researchers face difficulties in obtaining real credit card data sets due to security, privacy and cost issues. Upon completion of model training, the fraud detection system (FDS) is tested by running transactions through the system. Incoming transactions not accepted by the HMM with sufficiently high probability are marked as fraud.
  • 4. 2.1 HMM Background
In probability theory, a Markov model is a stochastic model used to model randomly changing systems where it is assumed that future states depend only on the present state and not on the sequence of events that preceded it. [9]
When the system being modelled is assumed to be a Markov process with unobserved (hidden) states, we can represent it as a Hidden Markov Model.
An HMM has a finite set of states governed by a set of transition probabilities, where for any particular state an outcome or observation can be generated according to an associated probability distribution. Only the outcome, and not the state, is visible to an external observer (hence the term "hidden").
2.1.1 Characteristics of an HMM
To set the foundation for understanding how HMMs are applied to the problem of fraud detection, it is necessary to provide a formal description of the characteristics of an HMM.
Each Hidden Markov Model is defined by the following elements: states, observation symbols, transition probabilities, observation symbol probabilities and initial probabilities.
1. The $N$ states of the model are defined as:
$$S = \{S_1, S_2, \ldots, S_N\} \tag{1}$$
2. The $M$ observation symbols per state are defined as:
$$V = \{V_1, V_2, \ldots, V_M\} \tag{2}$$
3. The state transition probability distribution $A = [a_{ij}]$ is given by:
$$a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i) \tag{3}$$
where $q_t$ denotes the state at time $t$. The transition probabilities satisfy the stochastic constraints:
$$a_{ij} \ge 0, \quad 1 \le i, j \le N \qquad \text{and} \qquad \sum_{j=1}^{N} a_{ij} = 1, \quad 1 \le i \le N$$
  • 5. 4. The observation symbol probability distribution $B = [b_j(k)]$ in each state is given by:
$$b_j(k) = P(V_k \text{ at } t \mid q_t = S_j), \quad 1 \le j \le N, \ 1 \le k \le M \tag{4}$$
The observation symbol probabilities satisfy the stochastic constraints:
$$b_j(k) \ge 0, \quad 1 \le j \le N, \ 1 \le k \le M \qquad \text{and} \qquad \sum_{k=1}^{M} b_j(k) = 1, \quad 1 \le j \le N$$
5. The initial state probability vector $\pi$ gives the probability that the model is in state $S_i$ at the initial time step $t = 1$ and is defined as:
$$\pi_i = P(q_1 = S_i), \quad 1 \le i \le N, \qquad \text{such that} \quad \sum_{i=1}^{N} \pi_i = 1 \tag{5}$$
From the above definitions, the complete specification of an HMM requires the estimation of two model parameters, $N$ and $M$, and three probability distributions $A$, $B$ and $\pi$. This is represented by the notation:
$$\lambda = (A, B, \pi) \tag{6}$$
2.2 Implementing the HMM for Credit Card Fraud Detection
With the mathematical characteristics of the HMM defined, we now turn to a high level outline of the fraud detection process using an HMM. This can be summarised in three steps:
1. Each incoming transaction is submitted to the FDS (usually running at an issuing bank) for verification.
2. The FDS tries to find an anomaly in the transaction based on the spending profile of the card-holder.
3. If the FDS finds the transaction to be fraudulent, it raises an alert and the issuing bank declines the transaction.
This general process raises a number of questions, such as:
• How are the credit card transaction processing operations mapped in terms of an HMM?
• How are the spending profiles of the card-holders determined and categorised?
We discuss these and other issues in the next section.
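As a minimal sketch of the specification λ = (A, B, π) from Section 2.1.1, the Python snippet below stores the three distributions as nested lists and checks the stochastic constraints stated with equations (3) to (5). The choice of N = 3 states and M = 3 symbols, and every probability value, are invented for illustration and are not taken from the paper under review.

```python
import math


def is_stochastic(rows, tol=1e-9):
    """Return True if every row is non-negative and sums to 1."""
    return all(
        all(p >= 0 for p in row) and math.isclose(sum(row), 1.0, abs_tol=tol)
        for row in rows
    )


# Hypothetical model: N = 3 hidden states, M = 3 observation symbols.
A = [[0.6, 0.3, 0.1],   # a_ij = P(q_{t+1} = S_j | q_t = S_i)
     [0.2, 0.5, 0.3],
     [0.3, 0.3, 0.4]]
B = [[0.7, 0.2, 0.1],   # b_j(k) = P(V_k | q_t = S_j)
     [0.1, 0.3, 0.6],
     [0.3, 0.4, 0.3]]
pi = [0.5, 0.2, 0.3]    # pi_i = P(q_1 = S_i)

hmm = {"A": A, "B": B, "pi": pi}   # lambda = (A, B, pi)
N, M = len(A), len(B[0])
valid = is_stochastic(A) and is_stochastic(B) and is_stochastic([pi])
```

Checking the constraints up front is a cheap guard: a row that does not sum to 1 usually signals a transcription or estimation error rather than a valid model.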
  • 6. 2.2.1 HMM for Credit Card Transaction Processing
The process for mapping credit card transaction processing in terms of an HMM can be enumerated in six steps:
1. Decide on the observation symbols in the model.
2. Quantize the purchase values $x$ into $M$ price ranges $V_1, V_2, \ldots, V_M$, forming the observation symbols.
3. Define the observation symbols as $V = \{l, m, h\}$, making $M = 3$, where l = low, m = medium, h = high.
4. Use the transition in purchase type as the state transition in the model.
5. Let the set of all possible types of purchase form the set of hidden states of the HMM.
6. Compute the probability matrices $A$, $B$ and $\pi$.
We can make some general comments about the steps defined above:
In step 4 the transition in purchase type is used as the state transition in the model. This choice may seem counter-intuitive: since a card-holder makes different kinds of purchases of different amounts over a period of time, the natural choice would be to consider the sequence of transaction amounts instead. However, the sequence of purchase types is considered to be more reliable than the transaction amounts because a card-holder makes purchases depending on his or her need for different types of items over a period of time. This spending behaviour subsequently generates a sequence of transaction amounts. Furthermore, the individual transaction amount generally depends on the associated type of purchase.
Figure 1: Special case of fully connected HMM
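Steps 2 and 3 above can be sketched as a small quantization function that maps a purchase amount to one of the symbols l, m or h. The text does not state where the price ranges begin and end, so the two cut-off values below are purely hypothetical.

```python
def quantize(amount, low_max=50.0, medium_max=200.0):
    """Map a purchase amount to an observation symbol: 'l', 'm' or 'h'.

    The cut-offs low_max and medium_max are hypothetical; in practice they
    would be derived from the spending history of each card-holder.
    """
    if amount < 0:
        raise ValueError("amount must be non-negative")
    if amount <= low_max:
        return "l"
    if amount <= medium_max:
        return "m"
    return "h"


# Invented transaction amounts, turned into an observation sequence.
amounts = [12.5, 49.99, 180.0, 950.0, 35.0]
symbols = [quantize(a) for a in amounts]
```

The resulting list of symbols is the kind of observation sequence the HMM consumes, one symbol per transaction.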
  • 7. In step 6, the optimal values for these parameters are determined in the training phase using the Baum-Welch (forward-backward) algorithm.
With the process defined, we present a graphical representation of an HMM in Figure 1. This is a special case of a fully connected HMM in which every state of the model can be reached in a single step from every other state. In this figure GR, EL and MI represent Groceries, Electronics and Miscellaneous purchases respectively.
2.3 Process flow of FDS
Figure 2: Process flow of the FDS
After obtaining an estimate of the HMM parameters through the training phase, Abhinav et al. obtain an initial sequence of $R$ symbols from the card-holder's transactions. This sequence is passed as input to the HMM to compute the probability of acceptance:
$$\alpha_1 = P(O_1, O_2, \ldots, O_R \mid \lambda) \tag{7}$$
They then form another sequence of length $R$ by discarding $O_1$ and appending $O_{R+1}$. This new sequence is passed to the HMM and the probability of acceptance is computed as:
$$\alpha_2 = P(O_2, O_3, \ldots, O_{R+1} \mid \lambda) \tag{8}$$
where $O_{R+1}$ is the symbol generated by a new transaction at time $t + 1$. The metric for accepting or declining the sequence is defined as:
$$\Delta\alpha = \alpha_1 - \alpha_2 \tag{9}$$
  • 8. If $\Delta\alpha > 0$, they conclude that the new sequence is accepted by the HMM with lower probability and is therefore potentially fraudulent, provided that the following additional condition holds:
$$\frac{\Delta\alpha}{\alpha_1} \ge \text{Threshold} \tag{10}$$
Otherwise, $O_{R+1}$ is added permanently to the sequence and the new sequence is used for determining the validity of the next transaction. The complete process flow of the FDS is shown in Figure 2, where the process is divided into two separate phases (training and detection).
2.4 Results and Analysis
In the final stage of testing and analysis, large-scale simulations were carried out to test the effectiveness of the FDS. Abhinav et al. used True Positive (TP), False Positive (FP), TP-FP spread and Accuracy metrics to measure the capability of the system. In this context, TP represents the fraction of fraudulent transactions correctly classified as fraudulent, whereas FP is the fraction of genuine transactions incorrectly classified as fraudulent. The complementary quantities, True Negative (TN) and False Negative (FN), were also used as part of the accuracy calculation.
To measure the performance of the FDS, the difference between TP and FP, called the TP-FP spread, was used as a metric. Accuracy represents the fraction of the total number of transactions (both genuine and fraudulent) that have been classified correctly, and is given by:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$
Lastly, experiments were carried out to determine the correct combination of HMM design parameters, namely the number of states, the sequence length and the threshold value. After obtaining these parameters, a comparative study with another fraud detection system was carried out as a means of benchmarking the system.
2.5 Performance Comparison
The performance of the proposed system was measured while varying the number of fraudulent transactions and the spending profile of the card-holder.
The performance was then compared with the credit card fraud detection technique proposed by S.J. Stolfo et al. in the paper "Credit Card Fraud Detection Using Meta-Learning". [10]
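Returning briefly to the detection step of Section 2.3, equations (7) to (10) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used by the researchers: the acceptance probability is computed with the standard forward algorithm, the two-state model, the symbol coding (0 = l, 1 = m, 2 = h) and the threshold of 0.5 are all invented, and no numerical scaling is applied, so it is only suitable for short sequences.

```python
def sequence_probability(obs, A, B, pi):
    """Forward algorithm: P(O_1, ..., O_R | lambda) for symbol indices obs."""
    n = len(pi)
    fwd = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        fwd = [
            sum(fwd[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return sum(fwd)


def check_transaction(window, new_symbol, A, B, pi, threshold):
    """Return (is_suspicious, next_window) following equations (7)-(10)."""
    alpha1 = sequence_probability(window, A, B, pi)       # equation (7)
    candidate = window[1:] + [new_symbol]                 # drop O_1, add O_{R+1}
    alpha2 = sequence_probability(candidate, A, B, pi)    # equation (8)
    delta = alpha1 - alpha2                               # equation (9)
    if delta > 0 and delta / alpha1 >= threshold:         # equation (10)
        return True, window        # flagged: keep the old sequence
    return False, candidate        # accepted: slide the window forward


# Hypothetical two-state model over the symbols 0 = l, 1 = m, 2 = h.
A = [[0.8, 0.2], [0.4, 0.6]]
B = [[0.85, 0.10, 0.05], [0.60, 0.30, 0.10]]
pi = [0.7, 0.3]

history = [0, 0, 0, 0, 0]          # card-holder who spends in the low range
flag_high, _ = check_transaction(history, 2, A, B, pi, threshold=0.5)
flag_low, next_window = check_transaction(history, 0, A, B, pi, threshold=0.5)
```

With this invented model a sudden high-range purchase is flagged, while another low-range purchase is accepted and becomes part of the sequence used for the next check.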
  • 9. Abhinav et al. carried out experiments with four profiles, one of which is a mixed profile, meaning that the spending profile was not considered in their approach. The other profiles they considered are (55 35 10), (70 20 10) and (95 3 2). Here, an (x y z) profile represents a low spending profile card-holder who has carried out x% of their transactions in the low, y% in the medium and z% in the high range. The goal was to determine how the system performed for different mixes of transaction amount ranges.
For every combination of spending profile and malicious transaction distribution, they carried out 100 test runs and recorded the average result. For consistency, the same data set was used to determine the performance of both the approach of Abhinav et al. and that of S.J. Stolfo et al. (denoted "OA" and "ST" respectively for convenience).
Fig 3a shows the variation of TP and FP for the two approaches using the spending profile (95 3 2). The variation of TP-FP and Accuracy is shown in Fig 3b. The graph shows that the TP of the researchers' approach is very close to that of Stolfo et al.'s approach. Furthermore, both approaches have similar values of FP. They concluded that the two systems had comparable accuracies and average TP-FP spread and showed a similar trend with variation in µ.
Figure 3: Performance variation of the two systems (OA and ST) for the spending profile (95 3 2)
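The metrics used in this comparison (TP, FP, TP-FP spread and the Accuracy of equation (11)) can be computed from labelled outcomes as in the sketch below. The ten labelled transactions are invented for illustration, and the function assumes that both fraudulent and genuine transactions are present, since otherwise one of the fractions is undefined.

```python
def evaluate(actual, predicted):
    """Compute TP, FP, TP-FP spread and accuracy (True = fraudulent).

    TP and FP are returned as fractions, matching the definitions in
    Section 2.4; accuracy follows equation (11) using raw counts.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    tp_fraction = tp / (tp + fn)   # fraudulent correctly flagged
    fp_fraction = fp / (fp + tn)   # genuine incorrectly flagged
    return {
        "TP": tp_fraction,
        "FP": fp_fraction,
        "spread": tp_fraction - fp_fraction,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Invented example: 4 fraudulent and 6 genuine transactions.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
metrics = evaluate(actual, predicted)
```

In this example three of the four fraudulent transactions are caught and one of the six genuine ones is wrongly flagged, giving TP = 0.75, FP of about 0.17 and an accuracy of 0.8.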
  • 10. From the testing results, Abhinav et al. concluded that the proposed system has an overall Accuracy of 80%, even under large variations in input conditions, which is much higher than the overall Accuracy of the method proposed by Stolfo. Their system therefore correctly detects most of the fraudulent transactions.
3 Neural Data Mining for Card Fraud Detection
This paper applies a combination of data mining techniques and a neural network algorithm to address the fraud detection problem. The aim is to obtain high fraud coverage combined with a low false alarm rate. The general approach is to model the sequence of operations in credit card transaction processing using a confidence-based neural network. To ensure the accuracy and effectiveness of fraud detection, receiver operating characteristic (ROC) analysis is applied as a means of measuring the accuracy of the model.
A neural network is initially trained with synthetic data, then if an incoming credit card transaction is not accepted by the trained neural network model (NNM) with sufficiently low confidence, it is considered to be fraudulent. The paper shows how confidence values, a neural network algorithm and ROC analysis can be combined to perform credit card fraud detection.
3.1 Background on Neural Networks
Fraud detection using an Artificial Neural Network takes inspiration from the biological nervous system and brain function of human beings. The approach is based on the brain's ability to learn from past experience and apply that knowledge in decision making and problem solving. The goal of researchers has been to apply these principles to credit card fraud detection methods. [11]
The neural network is usually trained with both personal and historical transaction data of the card-holder, such as occupation, income, transaction amount, purchase location, frequency and time period of purchases. In addition to training the network with this information, it is also very common to include in the training data the varieties of credit card fraud faced by a particular issuing bank.
The neural network is typically depicted as having three interconnected layers (as shown in Figure 4):
• Input Layer: Receives input from an external source such as a database
• Hidden Layer: A layer hidden from external observers that receives input from the input layer or another hidden layer
• Output Layer: Exposes the network to external observers and provides the final output of the network
  • 11. Figure 4: Multi-layered neural network
The output of the neural network generally takes real values between 0 and 1. If the output is below some specified threshold value (for example 0.6) then the transaction is classified as genuine. If the output is above the threshold then the "probability" that the transaction is fraudulent is considered to be high. This is a rather subtle distinction.
3.2 Data Set
As discussed in previous sections, a critical part of designing an effective NNM is to supply the model with realistic transactional data. However, due to security, privacy and cost issues, researchers face difficulties in obtaining credit card data sets. The typical approach taken by researchers has therefore been to generate "synthetic" data to facilitate the development and testing of the model.
  • 12. In generating the data, it is necessary to provide the neural network with a mix of genuine and fraudulent transactions to train the classifiers. Tao et al. specify a ratio of approximately 100 good transactions for each fraudulent transaction in the training data set in order to simulate real customer transactions accurately.
In designing the data set for the model, it is important to specify clearly the credit card payment-related training data attributes for the NNM. Tao et al. specify key attributes such as:
• time of transaction
• location of transaction
• type of merchandise
• business code for merchandise
• business type for merchandise
• transaction amount
Furthermore, the idea of "Actual Target Values" is used (for classification) to guide the neural network learning process, where a target value of 1 represents abnormal and 0 represents normal.
3.3 Calculation of confidence value
The unique approach introduced by Tao et al. in the application of neural network techniques to fraud detection is to convert both training and testing data into confidence values before feeding them into the NNM. These values are formatted to the range [0.0, 1.0], and each input value reflects the historical information available at that time.
Furthermore, they categorise the input attributes into discrete and continuous values, where attributes such as time, location, type of merchandise, etc. are defined as discrete, whilst an attribute such as transaction amount belongs to the continuous category. There are therefore two separate methods for calculating confidence values, depending on the category of the input attribute.
To illustrate the calculation of confidence values for discrete and continuous attributes, they consider the location of the transaction and the transaction amount respectively as examples:
  • 13. Given a sequence of transactions:
$$X = \{x_1, x_2, \ldots, x_n\}$$
The confidence for the transaction location is given by:
$$C(x_i) = \frac{m_{x_i}}{n} \tag{12}$$
where:
$n$ is the number of uses of the credit card
$x_i$, for $i = 1, 2, \ldots, n$, is the location of use $i$ of the credit card
$m_{x_i}$ denotes the number of uses of the credit card in the location $x_i$
The confidence for the transaction amount is given by:
$$C(x_i) = e^{-\frac{1}{2}\left(\frac{x_i - \mu}{\sigma}\right)^2} \tag{13}$$
where:
$n$ is the number of uses of the credit card
$x_i$, for $i = 1, 2, \ldots, n$, is the transaction amount of use $i$ of the credit card
$\sigma$ is the standard deviation of the transaction amount
$\mu$ is the average transaction amount
The purpose of calculating the confidence values is twofold. Firstly, the value is tested against a threshold that enables the researchers to determine whether the transaction is genuine or fraudulent. Secondly, the confidence calculation formats every neural network input to the range [0.0, 1.0], which helps to speed up the neural network learning process.
3.4 Back Propagation and Receiver operating characteristic
A brief overview of the theory and operation of the NNM was given in sections 3 and 3.1. In this section we look at the methods applied by Tao et al. in further detail.
3.4.1 Back Propagation
In the proposed system, the researchers apply a multi-layer neural network model trained with the backpropagation (BP) algorithm.
The BP algorithm is a common approach to training artificial neural networks. It computes the gradient of a cost function with respect to all the weights in the network. The gradient is passed as input to an optimization method (such as steepest descent), which subsequently updates the weights with the aim of minimizing the cost function. [12]
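The two confidence calculations of Section 3.3, equations (12) and (13), can be sketched in Python as below. The location and amount histories are invented, and the use of the population standard deviation is an assumption, since the text does not say which estimator Tao et al. use.

```python
import math
from collections import Counter


def discrete_confidence(history):
    """Equation (12): C(x) = m_x / n for each distinct value x in history."""
    n = len(history)
    counts = Counter(history)
    return {value: count / n for value, count in counts.items()}


def continuous_confidence(x, history):
    """Equation (13): exp(-0.5 * ((x - mu) / sigma) ** 2).

    Assumes the population standard deviation and a history with
    at least two distinct values (so that sigma > 0).
    """
    n = len(history)
    mu = sum(history) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in history) / n)
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Invented card history.
locations = ["London", "London", "Leeds", "London"]
amounts = [20.0, 30.0, 40.0]

location_conf = discrete_confidence(locations)
usual_amount = continuous_confidence(30.0, amounts)     # at the mean
unusual_amount = continuous_confidence(300.0, amounts)  # far from the mean
```

Both functions return values in [0.0, 1.0]: frequently used locations and amounts close to the card-holder's average score near 1, while a location never seen before (absent from the dictionary, so effectively 0) or an amount far from the average scores near 0.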
  • 14. In this study the BP algorithm learns by iteratively processing a data set of training "tuples" (finite ordered lists of elements):
$$X = \{x_1, x_2, \ldots, x_n\}$$
The algorithm compares the network's prediction for each tuple with the actual known target value. For each training tuple, the weights are adjusted to minimize the mean squared error between the network's prediction and the actual target value. The modifications are made in the backwards direction, from the output layer $Y = \{y_1, y_2, \ldots, y_n\}$ through each hidden layer down to the first hidden layer.
For this study the researchers used a sigmoid function:
$$S(t) = \frac{1}{1 + e^{-t}} \tag{14}$$
for the nodes in the hidden layers and the output layer.
3.4.2 Receiver operating characteristic (ROC)
In this paper, ROC analysis has the dual purpose of ensuring that both the accuracy and the performance of the model are adequate. This is achieved by obtaining an optimal threshold for determining whether a transaction is genuine or fraudulent. This threshold value is tested against the output of the NNM, which takes the form of confidence values $Y = \{y_1, y_2, \ldots, y_n\}$. The classification of a transaction as genuine or fraudulent is then determined by whether its confidence value is higher or lower than the threshold.
A crucial part of ROC analysis is the specification of the confusion matrix, whose layout is shown in Table 1. The confusion matrix compares actual classification values (rows) against the model's predictions of fraud (columns). If the model predicted fraud with perfect accuracy, all observations in the confusion matrix would reside in the two cells labelled "True Positive" and "True Negative". The objective is to maximize correct predictions while managing the increase in false alarms. [13]
In this context the False Positive Rate is the ratio of normal spending patterns incorrectly detected as abnormal to the total number of normal spending patterns, and is given by:
$$FPR = \frac{FP}{FP + TN} \tag{15}$$
Conversely, the True Positive Rate is the ratio of abnormal spending patterns correctly detected as abnormal to the total number of abnormal spending patterns:
$$TPR = \frac{TP}{TP + FN} \tag{16}$$
  • 15. Table 1: Confusion matrix
                            Predicted Y        Predicted N
Actual Y                    True Positive      False Negative
Actual N                    False Positive     True Negative
With the FPR and TPR metrics defined, we introduce another important metric at this point, namely the Youden Index.
In the medical and biological sciences, the Youden Index (or Youden exponent, as it is called in the paper) is typically used as a summary measure of the ROC curve. It both measures the effectiveness of a "diagnostic marker" and enables the selection of an optimal threshold value (cutoff point) for the marker. It is defined as
$$J = \text{Sensitivity} + \text{Specificity} - 1$$
[14] Its value ranges from 0 to 1, where a value of 1 indicates that there are no false positives or false negatives, i.e. the test is perfect, and a value of 0 indicates that the test performs no better than chance.
In the context of this paper, when considering the optimal point on the ROC curve, Tao et al. define the maximal Youden exponent $E$ as:
$$E = TPR - FPR \tag{17}$$
Then, taking into consideration the cost of false negatives and false positives, the cost-weighted exponent ($CE$) is defined as:
$$CE = \frac{FNC}{FPC + FNC} \cdot TPR - \frac{FNC}{FPC + FNC} \cdot FPR \tag{18}$$
where $FNC$ is the cost of a false negative and $FPC$ is the cost of a false positive, satisfying the constraints:
$$0 \le FPC \le 1, \quad 0 \le FNC \le 1, \quad FPC + FNC \ne 0 \tag{19}$$
and:
$$FPC = FNC \ne 0 \tag{20}$$
When equation (20) holds, equation (18) reduces to:
$$CE = \frac{1}{2} \cdot (TPR - FPR) = \frac{1}{2} \cdot E \tag{21}$$
The crucial point illustrated here is that the cost-weighted exponent overcomes the inadequacy of setting a threshold without considering error cost.
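As a rough illustration of how equations (15) to (18) can drive the choice of threshold, the sketch below scans a few candidate thresholds and keeps the one that maximizes the Youden exponent E. The scores, labels and candidate thresholds are invented, an output above the threshold is treated as fraudulent (as described in Section 3.1), and `cost_weighted_exponent` follows equation (18) exactly as it is printed above, where both rates carry the same weight FNC / (FPC + FNC).

```python
def rates(scores, labels, threshold):
    """Return (TPR, FPR); scores above the threshold are predicted abnormal."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s > threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s <= threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s > threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s <= threshold)
    return tp / (tp + fn), fp / (fp + tn)   # equations (16) and (15)


def youden_exponent(tpr, fpr):
    """Equation (17): E = TPR - FPR."""
    return tpr - fpr


def cost_weighted_exponent(tpr, fpr, fnc, fpc):
    """Equation (18) as printed in the text; requires FPC + FNC != 0."""
    weight = fnc / (fpc + fnc)
    return weight * tpr - weight * fpr


def best_threshold(scores, labels, candidates):
    """Candidate threshold with the largest Youden exponent."""
    return max(candidates,
               key=lambda t: youden_exponent(*rates(scores, labels, t)))


# Invented network outputs with target values (1 = abnormal, 0 = normal).
scores = [0.1, 0.2, 0.3, 0.5, 0.55, 0.58, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]

chosen = best_threshold(scores, labels, candidates=[0.2, 0.4, 0.6])
tpr, fpr = rates(scores, labels, chosen)
e = youden_exponent(tpr, fpr)
ce = cost_weighted_exponent(tpr, fpr, fnc=0.5, fpc=0.5)
```

With equal costs the last line reproduces equation (21), CE = E / 2, which gives a quick sanity check on any implementation of the cost weighting.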
  • 16. 3.5 Results and Analysis
In this section we summarise the experimental results and concluding analysis obtained by the researchers after running test transactions through the FDS. Firstly, 7000 records of synthetic data were used for training the NNM and 3000 for testing purposes.
Figure 5: Calculation of confidence and classification values per record
The table in Figure 5 shows 10 records of card-holder behaviour attributes (time of transaction, merchant type, business code, etc.) with their associated values (0.56, 0.83, 0.55, etc.) after confidence values were calculated. For each record a target value of 0 (normal) or 1 (abnormal) was assigned based on testing the output $y_i$ against a specified threshold value (see the section on ROC).
We now consider the ROC curve shown in Figure 6. From this the researchers attempt to show that setting the threshold value at 0.4 improves the detection accuracy of the model. Without considering the cost factor (CE), the model obtains the optimal value at the point where the TP rate reaches 91.2% and the FPR is 13.55%, providing a reasonably good ratio of true positives to false positives. They can then choose to factor in the cost by computing the threshold value according to equation [21], noting that the optimal threshold value will be adjusted when the relative cost changes.
  • 17. Figure 6: ROC curve
4 Comparative Analysis
In this section we compare and contrast both papers by identifying the areas of similarity in the approaches used, while clearly delineating the differences. Figure 7 illustrates the key areas that will be discussed in this section, where we attempt to show that there is general concord in the scope, motivation and methodologies of both papers, whilst expounding on the primary differences in the implementation and techniques applied to solving the fraud detection problem.
4.1 Areas of concordance
4.1.1 Scope
As an introduction to this section, it should be stated that these two papers were carefully selected from a range of available literature on the topic of fraud detection because they both approach the issue from a perspective that incorporates data mining, machine learning and statistical techniques within the context of credit card fraud detection. Therefore the scope of their research was closely aligned, making for an interesting exposition and comparison.
  • 18. Figure 7: Illustration of various components of both papers
4.1.2 Motivation
As discussed in section 1, in the retail market environment e-commerce has grown rapidly and gained popularity due to its ability to facilitate instantaneous transactions. Subsequently credit card payment has become the most important means of payment due to the rapid development of information technology globally. However, as the usage of credit cards increases, the rate of fraudulent practices is also increasing substantially.
Both Abhinav et al. and Tao et al. are acutely aware of this problem and its implications for card-holders and especially issuing banks, who face the risk of losing millions in fraud compensation and fines. It is therefore clear that this common concern acts as the motivation for the research presented in both papers.
4.1.3 Methodology
The usage of transaction data to understand the spending pattern of card-holders and to detect credit card fraud is not a new concept and has been largely recognised by researchers in this area (as is evidenced by the extensive literature on fraud detection techniques) as the most effective means of solving the fraud detection problem. It is therefore not surprising that we should find this methodological approach adopted by both Abhinav et al. and Tao et al. in their work.
  • 19. The common approach adopted by both is to model the sequence of operations for processing credit card transactions, then test the model by running transactions through it. This subsequently leads to Abhinav et al. and Tao et al. introducing TP and FP metrics as a means to ensure the accuracy and effectiveness of their fraud detection models. Furthermore, the researchers analyse historical transactional data and attempt to find inconsistencies in spending patterns as a means of detecting fraudulent transactions.
To ensure the fraud detection model is successful in its primary purpose, Abhinav and Tao place a strong emphasis on training their models with realistic data that contains a good mix of fraudulent and genuine transactions. However, they both have to deal with issues surrounding the security, privacy and cost of obtaining real transaction data, hence the need to generate synthetic data.
Lastly, both sets of researchers run test transactions against their trained models, making the decision to accept or reject transactions based on a specified threshold value.
4.2 Implementation and Technical differences
4.2.1 Modelling the sequence of operations in credit card transaction processing
In the approach proposed by Abhinav et al., the key idea is to model the sequence of operations for processing credit card transactions using the HMM, where the details of items purchased in individual transactions are represented as the underlying finite Markov chain, which is not observable. The transactions can only be observed through a stochastic process that produces the sequence of the amounts of money spent in each transaction. In the work of Tao et al., by contrast, the sequence of operations in credit card transaction processing is modelled using a confidence-based neural network and ROC analysis. The calculation of confidence values is introduced for both discrete and continuous input attributes.
4.3 Training Data and Learning Time
The unique approach introduced by Tao et al. in the application of neural network techniques to fraud detection is to convert both the training and testing data into confidence values before feeding them into the NNM. These values are formatted to the range [0.0, 1.0], and each input value reflects the historical information available at that time. In the approach of Abhinav et al., once the sequence is formed from the card-holder's transactions, it is passed directly into the HMM without formatting.
  • 20. With regard to the training and learning time of the model, it is important to note the differences in the training approaches employed by the researchers. In Abhinav's implementation, although the training is done offline, the learning time of the model can have a strong impact on the scalability of the system. This is because an HMM has to be trained for every card-holder. Considering that an issuing bank such as HSBC processes millions of transactions for an equally large number of card-holders, their implementation could lead to issues with the performance and scalability of the system.
In Tao's approach, the challenge to the performance and scalability of the system lies in finding an efficient optimization algorithm for minimizing the cost function from its gradient and subsequently updating the weights. Fortunately, optimization algorithms are well studied and a large body of techniques is available to address this problem.
4.4 Threshold values
In the work of Tao et al., the key insight is to take the cost of false negatives and false positives into consideration. The approach overcomes the inadequacy of setting a threshold without considering error cost by adjusting the optimal threshold value when the relative cost changes. Furthermore, ROC analysis is applied to show that setting the threshold at the optimal value improves the detection accuracy of the model.
In Abhinav's implementation, by contrast, the threshold value is learnt empirically through the training stage using the Baum-Welch algorithm, and is then effectively fixed for the card-holder's Markov model.
4.5 Acceptance and Declining of incoming transactions
In Tao's implementation, the output of the neural network generally takes real values between 0 and 1.
If the output is below some specified threshold value then the transaction is classified as genuine; if the output is above the threshold then the "probability" that the transaction is fraudulent is considered to be high.
By comparison, in Abhinav's work, after obtaining an estimate of the HMM parameters through the training phase, they obtain an initial sequence of symbols from the card-holder's transactions. This sequence is passed as input to the HMM to compute the probability of acceptance. Subsequently, if the metric ∆α > 0 and the additional threshold condition is satisfied, they conclude that the sequence is accepted by the HMM with low probability and is thus potentially fraudulent.
5 Usage of HMM and NMM in Industry
To place these papers in the context of how the methods are applied in practice, it is helpful to examine one case of general application of these techniques and another case related directly to credit card fraud detection.
Considering HMMs first, they have applications in a broad range of scientific and mathematical fields where the goal is to recover a data sequence that is not immediately observable (but other data that depend on the sequence are). For example, HMMs are used in automatic speech recognition, where the model is trained to recognize speech utterances from given observations [15]. They have also been used extensively in biological sequence analysis, where HMMs can be used to solve various sequence analysis problems such as pairwise and multiple sequence alignment, gene annotation and classification [16].
Neural network models are an immensely popular technique for fraud detection and are used by some of the world's largest banks. For instance, Santander bank uses a fraud detection system called Falcon Fraud Manager from FICO, which is based heavily on neural models and leverages adaptive analytics. The adaptive model adjusts the base neural network "Falcon score" in response to real-time fraud tactics that were not present at the time the neural network model was trained [17][18]. FICO analytics software and tools are used across multiple industries to manage risk, fight fraud, build more profitable customer relationships, optimize operations and meet strict government regulations [19].
References
[1] UK Cards Association http://www.theukcardsassociation.org.uk/wm documents/December 2014 Full Report.pdf
[2] Financial Fraud Action UK http://www.financialfraudaction.org.uk/downloads.asp?genre=consumer
[3] S. Benson Edwin Raj, A. Annie Portia, Analysis on Credit Card Fraud Detection Methods. International Conference on Computer, Communication and Electrical Technology – ICCCET2011, 18th, 19th March, 2011
[4] Masoumeh Zareapoor, Seeja.K.R, and M.Afshar.Alam, Analysis of Credit Card Fraud Detection Techniques: based on Certain Design Criteria. International Journal of Computer Applications (0975 – 8887) Volume 52– No.3, August 2012
[5] Khyati Chaudhary, Jyoti Yadav, Bhawna Mallick, A review of Fraud Detection Techniques: Credit Card. International Journal of Computer Applications (0975 – 8887) Volume 45– No.1, May 2012.
[6] Abhinav Srivastava, Amlan Kundu, Shamik Sural and Arun K. Majumdar, Credit Card Fraud detection using Hidden Markov Model. IEEE Transactions on Dependable and Secure Computing, Vol. 5, No. 1, January-March 2008
[7] Tao Guo, Gui-Yang Li, Neural data mining for Credit card Fraud Detection. Proceedings of the Seventh International Conference on Machine Learning and Cybernetics, Kunming, 12-15 July 2008
[8] Yufeng Kou, Chang-Tien Lu, Sirirat Sinvongwattana, Survey of Fraud Detection Techniques. Proceedings of the 2004 IEEE International Conference on Networking, Sensing & Control, Taipei, Taiwan, March 21-23, 2004
[9] Markov Model http://en.wikipedia.org/wiki/Markov_model
[10] S.J. Stolfo, D.W. Fan, W. Lee, A.L. Prodromidis, and P.K. Chan, Credit Card Fraud Detection Using Meta-Learning: Issues and Initial Results. Proc. AAAI Workshop AI Methods in Fraud and Risk Management, pp. 83-90, 1997
[11] Khyati Chaudhary, Jyoti Yadav, Bhawna Mallick, A review of Fraud Detection Techniques: Credit Card. International Journal of Computer Applications (0975 – 8887) Volume 45– No.1, May 2012.
[12] Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J., Learning representations by back-propagating errors. Nature 323 (6088): 533–536 (8 October 1986)
[13] Using Data Mining Techniques for Fraud Detection, A SAS Institute Best Practices Paper http://www.ag.unr.edu/gf/dm/dmfraud.pdf
[14] Ronen Fluss, David Faraggi, and Benjamin Reiser, Estimation of the Youden index and its associated cutoff point. Biometrical Journal, 2005
[15] HMM Speech Recognition http://www.fysiskplanering.se/fou/cuppsats.nsf/all/e156a6197d8b0678c1256bbb003f62
[16] Byung-Jun Yoon, Hidden Markov Models and their Applications in Biological Sequence Analysis http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2766791/
[17] FICO Analytics http://www.fico.com/en/node/8140?file=5380
[18] FICO Analytics http://www.fico.com/en/blogs/tag/score-performance/page/5/
[19] FICO Analytics http://www.fico.com/en/about-us