International Journal of Computer Trends and Technology Volume 69 Issue 8, 1-3, Aug, 2021
ISSN: 2231 – 2803 /doi:10.14445/22312803/IJCTT-V69I8P101 © 2021 Seventh Sense Research Group®
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Credit Card Fraud Detection Using Unsupervised
Machine Learning Algorithms
Hariteja Bodepudi
Irving, USA
Received Date: 01 July 2021
Revised Date: 02 August 2021
Accepted Date: 13 August 2021
Abstract: Internet usage has grown rapidly and has become an
essential part of daily life. With the growth of e-commerce,
buying and selling products over the internet has become
easier and more flexible. Online shopping and online bill
payment have increased considerably with the introduction of
modern technologies such as online banking and credit card
payments.
As online payment and online shopping have grown, the risk
attached to credit card usage has also increased. Because a
card is used in many places, it becomes hard for the bank to
distinguish the consumer's genuine transactions from
fraudulent ones. Fraudulent transactions can also occur when
a customer accidentally loses the credit card, which makes it
difficult for banks to stop the fraud at that point.
This paper discusses how to detect fraudulent transactions
and block the payments before processing by using Machine
Learning on a real-time basis. It explains how anomalies can
be detected with an unsupervised approach while achieving
high accuracy. The anomaly detection process used in this
paper can be applied in a wide range of applications, such as
fraud detection in banking, underbilling/overbilling of
customers in telecommunications, security monitoring,
network traffic, health care, and a wide range of
manufacturing industries.
Keywords — Anomaly, Machine Learning, Supervised
Learning, Unsupervised Learning
I. INTRODUCTION
E-commerce plays a crucial role in day-to-day human
activities. The usage of the internet and IoT devices has
increased considerably, which results in massive volumes of
data. [1] This data is collected in different forms, such as
structured and semi-structured data, and is commonly
referred to as Big Data. [2] Such large volumes of data are
essential to organizations for data analysis, especially for
banks that track transactions on a real-time basis. Credit
card security has become a major concern in the modern world
because of the wide range of online usage and the growing
volume of transactions that must be analyzed to detect
fraud. In a busy life, it is easier for users to shop and
pay bills online than to make physical visits.
As the usage of credit cards across the globe and online has
increased and data has grown drastically, it has become a
challenge for banks to manually detect fraudulent
transactions and block the usage. This paper explains the
use of unsupervised Machine Learning algorithms from the
field of Artificial Intelligence to detect anomalies.
II. ANOMALY DETECTION
Anomaly detection, also referred to as outlier detection,
helps to identify events or data points that differ
substantially from the normal events. Anomalies are the data
points that deviate from the behavior of the normal data
points. [3]
Anomalies can be detected in many ways. In this paper, I
focus on detecting anomalies, i.e., the bank's fraudulent
transactions, using unsupervised Machine Learning
algorithms.
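As a minimal illustration of what "deviating from normal behavior" can mean, and not the method used in this paper, the sketch below flags values that lie far from the rest with a simple z-score rule. The transaction amounts and the threshold of 2.5 are made-up assumptions.

```python
import numpy as np

# Hypothetical transaction amounts; the last value is deliberately extreme.
amounts = np.array([12.5, 8.0, 15.2, 9.9, 11.3, 14.8, 10.1, 13.7, 9.4, 950.0])

# z-score: how many standard deviations each value lies from the mean.
z_scores = (amounts - amounts.mean()) / amounts.std()

# Assumed threshold: anything beyond 2.5 standard deviations is an anomaly.
is_anomaly = np.abs(z_scores) > 2.5

print(amounts[is_anomaly])
```

A fixed rule like this only looks at one feature at a time; the algorithms in Section IV work on all features together.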
III. MACHINE LEARNING
Machine Learning is a branch of Artificial Intelligence. The
idea behind Machine Learning algorithms is to learn from the
data, identify behaviors, and make predictions with minimal
human intervention. [4]
Two popular families of Machine Learning methods are widely
used:
1) Supervised Learning Algorithms
2) Unsupervised Learning Algorithms
A. Supervised Learning Algorithms
Supervised Learning Algorithms learn the data patterns from
labeled data, i.e., data for which the output is known. These
algorithms are trained with a training data set that has
both inputs and outputs. The algorithm learns how the input
features are related to the target variable, i.e., the
output. Because the output is fed into the algorithm, it
adjusts its weights until the model fits the training data
well. Once the model is trained on the training data set, it
is used to make predictions on previously unseen data.
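The train-then-predict cycle described above can be sketched as follows. This is only an illustration under stated assumptions: the data is synthetic, and logistic regression from scikit-learn is an arbitrary choice of supervised model that this paper does not use.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic labeled data: two features, label 1 when their sum is positive.
X = rng.normal(size=(500, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out 20% of the rows to play the role of unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Training uses both the inputs and the known outputs.
model = LogisticRegression()
model.fit(X_train, y_train)

# Prediction on rows the model did not see during training.
y_pred = model.predict(X_test)
test_accuracy = (y_pred == y_test).mean()
print(f"accuracy on unseen data: {test_accuracy:.2f}")
```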
Hariteja Bodepudi / IJCTT, 69(8), 1-3, 2021
B. Unsupervised Learning Algorithms
Unsupervised Algorithms are used for unlabeled data sets,
i.e., data sets that do not have an output variable. These
algorithms find unknown patterns in the data and segment the
data into clusters based on behavior: data points with
similar behavioral patterns are grouped into the same
cluster. [5]
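As a small sketch of clustering without labels, the example below groups synthetic points with k-means. Both the data and the choice of k-means are assumptions made for illustration; k-means is not one of the algorithms evaluated in this paper.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic unlabeled data: two groups of points with different behavior.
group_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(100, 2))
X = np.vstack([group_a, group_b])

# No output variable is passed: the algorithm only sees the inputs.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Number of points assigned to each cluster.
print(np.bincount(cluster_ids))
```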
IV. IMPLEMENTATION
In this paper, I use the Kaggle credit card data set to find
fraudulent transactions using unsupervised algorithms. [6]
The models are therefore not trained with the output
variable, i.e., the target variable; they are trained
directly on the data set without any labels. After training,
the next step is to predict on the same data set to find the
fraudulent transactions in the whole data set.
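The workflow above (drop the label, fit on the unlabeled rows, predict on the same rows) might look like the sketch below. It runs on a small synthetic table rather than the real Kaggle file, the column names (including the label column "Class") are assumptions, and Isolation Forest is used only as an example detector.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic stand-in for the transactions table (not the real Kaggle file).
# The column names, including the "Class" label column, are assumptions.
df = pd.DataFrame({
    "V1": rng.normal(size=1000),
    "V2": rng.normal(size=1000),
    "Amount": rng.exponential(scale=50.0, size=1000),
    "Class": rng.binomial(1, 0.01, size=1000),
})

# Drop the label so the model never sees the output variable.
X = df.drop(columns=["Class"])

# Fit on the unlabeled data, then predict on the same rows.
model = IsolationForest(random_state=0)
model.fit(X)
predictions = model.predict(X)  # +1 = inlier, -1 = outlier

print(int((predictions == -1).sum()), "rows flagged as anomalies")
```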
To predict the anomalies in the data set, I used 3
unsupervised algorithms to test the accuracy and identify
the best-performing algorithm. The unsupervised approach is
used in this paper because, in the real world, labeled data
is often not available, which makes an unsupervised approach
well suited to fraud detection.
The algorithms I’m using in this paper are:
1) Isolation Forest
2) Local Outlier Factor
3) One Class SVM
A. Isolation Forest
Isolation Forest is an unsupervised algorithm. Like Random
Forest, it is an ensemble built on the concept of decision
trees. The main idea of the Isolation Forest algorithm is to
isolate the anomalies, i.e., fraudulent transactions,
directly instead of profiling the characteristics of the
normal data points. It randomly selects a feature and then
randomly selects a split value between the minimum and
maximum values of that feature. [7]
An important parameter of the Isolation Forest algorithm is
the contamination parameter, which sets the expected
proportion of outliers, e.g., 0.1 for 10%. If the proportion
is not known, it can be left at auto so that the algorithm
determines the threshold while fitting.
Figure 1: Isolation Forest [8]
Figure 1 shows how the outliers are separated from the
normal data points by the Isolation Forest. The red data
points that lie far from the other data points are
considered outliers.
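A minimal sketch of Isolation Forest with an explicit contamination value is shown below, using scikit-learn's IsolationForest. The data is synthetic and the parameter values are illustrative assumptions, not the settings used for the results in this paper.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic data: 200 normal points near the origin and 10 far-away points.
normal_points = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
signs = rng.choice([-1.0, 1.0], size=(10, 2))
far_points = rng.uniform(low=6.0, high=10.0, size=(10, 2)) * signs
X = np.vstack([normal_points, far_points])

# contamination is the expected share of outliers; 10 / 210 is about 0.05.
# contamination="auto" lets the library choose the threshold instead.
forest = IsolationForest(n_estimators=100, contamination=0.05, random_state=0)
labels = forest.fit_predict(X)  # +1 = inlier, -1 = outlier

# Lower scores mean the point was isolated with fewer random splits.
scores = forest.score_samples(X)

print("flagged:", int((labels == -1).sum()))
```

Points that sit far from the bulk of the data need fewer random splits to be isolated, which is why they receive the lowest scores.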
B. Local Outlier Factor
Local Outlier Factor (LOF) is an unsupervised algorithm used
to identify anomalies, i.e., fraudulent transactions, in the
data. The algorithm measures the local density of each
sample and compares it with the local densities of its
neighbors. The samples whose density is substantially lower
than that of their neighbors receive the lowest scores and
are the anomalies. [9]
Figure 2: Local Outlier Factor (LOF) [10]
Figure 2 shows the data points and their outlier scores; the
points with a lower density than their neighbors are
considered outliers.
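The density comparison can be sketched with scikit-learn's LocalOutlierFactor as below. The data is synthetic, and n_neighbors=20 and contamination=0.05 are illustrative assumptions rather than the settings behind the reported results.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# Synthetic data: one dense cluster plus 5 isolated points around it.
dense = rng.normal(loc=0.0, scale=0.5, size=(200, 2))
isolated = np.array(
    [[4.0, 4.0], [-4.0, 4.0], [4.0, -4.0], [-4.0, -4.0], [0.0, 6.0]]
)
X = np.vstack([dense, isolated])

# Each point's local density is compared with that of its 20 nearest neighbors.
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.05)
labels = lof.fit_predict(X)  # +1 = inlier, -1 = outlier

# More negative values mean lower density relative to the neighbors.
scores = lof.negative_outlier_factor_

print("flagged:", int((labels == -1).sum()))
```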
C. One Class SVM
One-Class Support Vector Machine (SVM) is an unsupervised
algorithm used to detect the outliers in a given data set.
The idea of the One-Class SVM is to classify the whole data
set into two classes, i.e., inliers and outliers. It learns
a boundary around the high-density region of the data, and
the extreme points that fall in the low-density region
outside it are considered outliers. [11]
Figure 3: One-Class Support Vector Machine [12]
Figure 3 shows how the One-Class SVM separates the outliers.
The yellow data points in the figure are considered
anomalies.
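A minimal One-Class SVM sketch using scikit-learn's OneClassSVM follows. The data is synthetic, and the RBF kernel, gamma="scale", and nu=0.05 are illustrative assumptions; the paper does not state which settings were used.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic data: 200 inliers near the origin and 10 far-away points.
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
signs = rng.choice([-1.0, 1.0], size=(10, 2))
far_points = rng.uniform(low=6.0, high=10.0, size=(10, 2)) * signs
X = np.vstack([inliers, far_points])

# nu roughly bounds the fraction of training points treated as outliers.
svm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
labels = svm.fit_predict(X)  # +1 = inlier, -1 = outlier

# Signed distance to the learned boundary; negative values fall outside it.
decision = svm.decision_function(X)

print("flagged:", int((labels == -1).sum()))
```

One-Class SVM is sensitive to its kernel and nu settings and to feature scaling, so results on real data can vary considerably with these choices.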
I trained the above 3 unsupervised algorithms on the Kaggle
credit card data and used them to predict the anomalies,
i.e., to separate the fraudulent transactions from the
normal transactions.
V. RESULTS
After predicting with the unsupervised algorithms Isolation
Forest, Local Outlier Factor, and One-Class SVM, Isolation
Forest outperforms the other 2 algorithms.
A. Accuracy of Model
Algorithm              Accuracy
Isolation Forest       99.74%
Local Outlier Factor   99.65%
One-Class SVM          70.09%
Figure 4: Results Table from the Algorithms
Figure 5: Results Plot from the Algorithms
The results table in Figure 4 and the plot in Figure 5 show
that the accuracy of Isolation Forest is 99.74%, that of
Local Outlier Factor is 99.65%, and that of One-Class SVM is
70.09%.
Comparing the 3 models, Isolation Forest and Local Outlier
Factor perform almost identically, and both perform much
better than the One-Class SVM, which performs poorly.
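The paper does not show how the accuracy figures were computed. One plausible way, sketched here on made-up values, is to map the detector output (+1/-1) to the 0/1 label convention and compare it with the known labels using scikit-learn's accuracy_score. Both arrays below are hypothetical.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical detector output: +1 = inlier, -1 = outlier.
predictions = np.array([1, 1, -1, 1, 1, 1, -1, 1, 1, 1])

# Hypothetical ground truth in the 0/1 convention (1 = fraud).
y_true = np.array([0, 0, 1, 0, 0, 0, 0, 0, 1, 0])

# Convert the detector output to the same convention: -1 -> 1, +1 -> 0.
y_pred = np.where(predictions == -1, 1, 0)

# Share of transactions whose predicted label matches the true label.
accuracy = accuracy_score(y_true, y_pred)
print(f"accuracy: {accuracy:.2%}")
```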
VI. CONCLUSION
This paper discusses how fraudulent credit card transactions
can be detected with unsupervised algorithms. In the real
world, the outputs for the inputs are often not available,
so this paper concentrates on unsupervised learning, in
which the algorithms detect the anomalies without observing
the output variable.
Unsupervised algorithms are useful and effective detectors
of anomalies. The algorithms used in this paper can be
applied to anomaly detection in other fields, i.e., to detect
outliers or fraudulent transactions.
VII. REFERENCES
[1] H. Bodepudi, Faster The Slow Running RDBMS Queries With
Spark Framework, (2020). [Online]. Available:
https://www.researchgate.net/publication/347390599_Faster_The_Slow_Running_RDBMS_Queries_With_Spark_Framework.
[Accessed July 2021].
[2] H. Bodepudi, Data Transfer Between RDBMS and HDFS By
Using The Spark Framework In Sqoop For Better Performance,
International Journal of Computer Trends and Technology, 69(3)
(2021) 1.
[3] Anomaly Detection, (2021). [Online]. Available:
https://avinetworks.com/glossary/anomaly-detection/.
[Accessed July 2021].
[4] What is Machine Learning, [Online]. Available:
https://www.ibm.com/cloud/learn/machine-learning.
[Accessed July 2021].
[5] Popular Machine Learning Methods, [Online]. Available:
https://www.sas.com/en_us/insights/analytics/machine-learning.html.
[Accessed July 2021].
[6] CreditCard Data, [Online]. Available:
https://www.kaggle.com/mlg-ulb/creditcardfraud?select=creditcard.csv.
[Accessed July 2021].
[7] H2O.ai Data Science, [Online]. Available:
https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/if.html.
[8] Isolation Forest Example, [Online]. Available:
https://scikit-learn.org/stable/auto_examples/ensemble/plot_isolation_forest.html#sphx-glr-auto-examples-ensemble-plot-isolation-forest-py.
[9] Anomaly Detection With Local Outlier Factor, [Online].
Available:
https://www.datatechnotes.com/2020/04/anomaly-detection-with-local-outlier-factor-in-python.html.
[10] Outlier detection with Local Outlier Factor (LOF), [Online].
Available:
https://scikit-learn.org/stable/auto_examples/neighbors/plot_lof_outlier_detection.html.
[Accessed August 2021].
[11] One-Class Support Vector Machine, [Online]. Available:
https://www.xlstat.com/en/solutions/features/1-class-support-vector-machine.
[12] One-Class SVM, [Online]. Available:
https://scikit-learn.org/stable/auto_examples/svm/plot_oneclass.html#sphx-glr-auto-examples-svm-plot-oneclass-py.
More Related Content

What's hot

Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detection
kalpesh1908
 

What's hot (20)

Detecting fraud with Python and machine learning
Detecting fraud with Python and machine learningDetecting fraud with Python and machine learning
Detecting fraud with Python and machine learning
 
Fraud detection with Machine Learning
Fraud detection with Machine LearningFraud detection with Machine Learning
Fraud detection with Machine Learning
 
A Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine LearningA Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine Learning
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learning
 
Is Machine learning useful for Fraud Prevention?
Is Machine learning useful for Fraud Prevention?Is Machine learning useful for Fraud Prevention?
Is Machine learning useful for Fraud Prevention?
 
Credit card fraud dection
Credit card fraud dectionCredit card fraud dection
Credit card fraud dection
 
credit card fraud detection
credit card fraud detectioncredit card fraud detection
credit card fraud detection
 
Credit Card Fraud Detection
Credit Card Fraud DetectionCredit Card Fraud Detection
Credit Card Fraud Detection
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detection
 
A Survey of Online Credit Card Fraud Detection using Data Mining Techniques
A Survey of Online Credit Card Fraud Detection using Data Mining TechniquesA Survey of Online Credit Card Fraud Detection using Data Mining Techniques
A Survey of Online Credit Card Fraud Detection using Data Mining Techniques
 
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
 
Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)
 
CREDIT_CARD.ppt
CREDIT_CARD.pptCREDIT_CARD.ppt
CREDIT_CARD.ppt
 
Credit card fraud detection using python machine learning
Credit card fraud detection using python machine learningCredit card fraud detection using python machine learning
Credit card fraud detection using python machine learning
 
Outlier analysis and anomaly detection
Outlier analysis and anomaly detectionOutlier analysis and anomaly detection
Outlier analysis and anomaly detection
 
Loan Default Prediction with Machine Learning
Loan Default Prediction with Machine LearningLoan Default Prediction with Machine Learning
Loan Default Prediction with Machine Learning
 
Credit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In DatabricksCredit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In Databricks
 
Credit card payment_fraud_detection
Credit card payment_fraud_detectionCredit card payment_fraud_detection
Credit card payment_fraud_detection
 
Anomaly detection Workshop slides
Anomaly detection Workshop slidesAnomaly detection Workshop slides
Anomaly detection Workshop slides
 
Loan default prediction with machine language
Loan  default  prediction with  machine  language Loan  default  prediction with  machine  language
Loan default prediction with machine language
 

Similar to Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms

credit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstractcredit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstract
Venkat Projects
 

Similar to Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms (20)

A Review of deep learning techniques in detection of anomaly incredit card tr...
A Review of deep learning techniques in detection of anomaly incredit card tr...A Review of deep learning techniques in detection of anomaly incredit card tr...
A Review of deep learning techniques in detection of anomaly incredit card tr...
 
IRJET- Credit Card Fraud Detection using Isolation Forest
IRJET- Credit Card Fraud Detection using Isolation ForestIRJET- Credit Card Fraud Detection using Isolation Forest
IRJET- Credit Card Fraud Detection using Isolation Forest
 
A Comparative Study for Credit Card Fraud Detection System using Machine Lear...
A Comparative Study for Credit Card Fraud Detection System using Machine Lear...A Comparative Study for Credit Card Fraud Detection System using Machine Lear...
A Comparative Study for Credit Card Fraud Detection System using Machine Lear...
 
IRJET - Online Credit Card Fraud Detection and Prevention System
IRJET - Online Credit Card Fraud Detection and Prevention SystemIRJET - Online Credit Card Fraud Detection and Prevention System
IRJET - Online Credit Card Fraud Detection and Prevention System
 
IRJET - Fraud Detection in Credit Card using Machine Learning Techniques
IRJET -  	  Fraud Detection in Credit Card using Machine Learning TechniquesIRJET -  	  Fraud Detection in Credit Card using Machine Learning Techniques
IRJET - Fraud Detection in Credit Card using Machine Learning Techniques
 
Empirical analysis of ensemble methods for the classification of robocalls in...
Empirical analysis of ensemble methods for the classification of robocalls in...Empirical analysis of ensemble methods for the classification of robocalls in...
Empirical analysis of ensemble methods for the classification of robocalls in...
 
A Comparative Study on Credit Card Fraud Detection
A Comparative Study on Credit Card Fraud DetectionA Comparative Study on Credit Card Fraud Detection
A Comparative Study on Credit Card Fraud Detection
 
Detection of Fake Accounts in Instagram Using Machine Learning
Detection of Fake Accounts in Instagram Using Machine LearningDetection of Fake Accounts in Instagram Using Machine Learning
Detection of Fake Accounts in Instagram Using Machine Learning
 
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNING
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNINGDETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNING
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNING
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
 
Unsupervised Learning for Credit Card Fraud Detection
Unsupervised Learning for Credit Card Fraud DetectionUnsupervised Learning for Credit Card Fraud Detection
Unsupervised Learning for Credit Card Fraud Detection
 
IRJET- Survey on Credit Card Fraud Detection
IRJET- Survey on Credit Card Fraud DetectionIRJET- Survey on Credit Card Fraud Detection
IRJET- Survey on Credit Card Fraud Detection
 
Problem Reduction in Online Payment System Using Hybrid Model
Problem Reduction in Online Payment System Using Hybrid ModelProblem Reduction in Online Payment System Using Hybrid Model
Problem Reduction in Online Payment System Using Hybrid Model
 
Fraudulent Activities Detection in E-commerce Websites
Fraudulent Activities Detection in E-commerce WebsitesFraudulent Activities Detection in E-commerce Websites
Fraudulent Activities Detection in E-commerce Websites
 
Emerging technologies enabling in fraud detection
Emerging technologies enabling in fraud detectionEmerging technologies enabling in fraud detection
Emerging technologies enabling in fraud detection
 
ATM fraud detection system using machine learning algorithms
ATM fraud detection system using machine learning algorithmsATM fraud detection system using machine learning algorithms
ATM fraud detection system using machine learning algorithms
 
MACHINE LEARNING ALGORITHMS FOR CREDIT CARD FRAUD DETECTION
MACHINE LEARNING ALGORITHMS FOR CREDIT CARD FRAUD DETECTIONMACHINE LEARNING ALGORITHMS FOR CREDIT CARD FRAUD DETECTION
MACHINE LEARNING ALGORITHMS FOR CREDIT CARD FRAUD DETECTION
 
Machine Learning-Based Approaches for Fraud Detection in Credit Card Transact...
Machine Learning-Based Approaches for Fraud Detection in Credit Card Transact...Machine Learning-Based Approaches for Fraud Detection in Credit Card Transact...
Machine Learning-Based Approaches for Fraud Detection in Credit Card Transact...
 
credit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstractcredit card fraud analysis using predictive modeling python project abstract
credit card fraud analysis using predictive modeling python project abstract
 

Recently uploaded

Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
SayantanBiswas37
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
HyderabadDolls
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
gajnagarg
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
nirzagarg
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
gajnagarg
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
Health
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
nirzagarg
 

Recently uploaded (20)

Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about them
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 

Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms

  • 1. International Journal of Computer Trends and Technology Volume 69 Issue 8, 1-3, Aug, 2021 ISSN: 2231 – 2803 /doi:10.14445/22312803/IJCTT-V69I8P101 © 2021 Seventh Sense Research Group® This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/) Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms Hariteja Bodepudi Irving, USA Received Date: 01 July 2021 Revised Date: 02 August 2021 Accepted Date: 13 August 2021 Abstract: In the modern era, the usage of the internet has increased a lot these days and becomes an essential part of the life. As the e-commerce has increased, the buying and selling the products over the internet becomes more easy and flexible. The usage of online shopping, online bill payment has increased a lot these days with the introduction of the modern technology like online banking, credit card payments. Due to the increase of the online payment and online shopping, the risk of credit card usage also increased as the credit card was used in many places as it becomes hard for the bank to distinguish the real transactions of the consumer versus the fraud transactions. Also, the credit card fraud transaction can also happen if the customer accidentally loses the credit card. So, it becomes complex for the banks to stop the fraud transactions at that point. This paper talks about how to detect the fraud transactions and block the payments before processing by using the Machine Learning on a real-time basis. This paper clearly explains about how the anomalies can be detected in an unsupervised approach and to achieve the higher accuracy. This Anomaly detection process used in this paper can be applied in a wide range of applications like fraud detections in banking, underbilling/overbilling for customers in telecommunications, security monitoring, network traffic, health care, and a wide range of manufacturing industries. 
Keywords: Anomaly, Machine Learning, Supervised Learning, Unsupervised Learning

I. INTRODUCTION

E-commerce plays a crucial role in day-to-day human activities. The use of the internet and IoT devices has increased greatly, which results in massive volumes of data [1]. This data is collected in different forms, such as structured and semi-structured data, and is commonly referred to as Big Data [2]. This large volume of data has become essential to organizations for data analysis, especially to banks that track transactions on a real-time basis.

Credit card security has become a major concern in the modern world because of widespread online usage and the rapidly growing volume of transaction data that must be analyzed to detect fraud. In a busy life, it is easier for users to shop and pay bills online than to make physical visits. As credit card usage across the globe and online has increased and the data has grown drastically, it has become a challenge for banks to manually detect fraudulent transactions and block the usage. This paper explains the use of unsupervised Machine Learning algorithms to detect anomalies.

II. ANOMALY DETECTION

Anomaly detection, also referred to as outlier detection, helps identify events or data points that differ markedly from the other, normal events. Anomalies are data points that deviate from the behavior of the normal data points [3]. Anomalies can be detected in many ways. In this paper, I focus on detecting anomalies, i.e., fraudulent bank transactions, using unsupervised Machine Learning algorithms.

III. MACHINE LEARNING

Machine Learning is a branch of Artificial Intelligence. Machine Learning algorithms learn from data, identify behaviors, and predict the future with minimal human intervention.
[4]

Two Machine Learning methods are widely used:

1) Supervised Learning Algorithms
2) Unsupervised Learning Algorithms

A. Supervised Learning Algorithms

Supervised learning algorithms learn data patterns from labeled data, i.e., data that includes the output. They are trained on a training data set that has both inputs and outputs, and they learn how the input features are related to the target variable, i.e., the output. As the output is fed into the algorithm, it adjusts its weights until the model has been fitted appropriately. Once the model has been trained on the training data set, it is used to make predictions on a dataset it has not seen before.
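The supervised workflow described above can be sketched as follows. This is only an illustration under stated assumptions, not code from the paper: it assumes scikit-learn, uses synthetic data with hypothetical labels, and picks logistic regression as an arbitrary example of a supervised model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)

# Synthetic inputs and hypothetical labels (assumption: not the paper's data).
X = rng.normal(size=(400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out part of the data so the model is evaluated on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = LogisticRegression()
clf.fit(X_train, y_train)        # training uses both inputs and outputs (labels)
y_pred = clf.predict(X_test)     # prediction on data not seen during training

test_accuracy = (y_pred == y_test).mean()
print(f"accuracy on unseen data: {test_accuracy:.3f}")
```

The key point is that `fit` receives the labels `y_train`; the unsupervised detectors discussed later are fitted on the inputs alone.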
B. Unsupervised Learning Algorithms

Unsupervised algorithms are used for unlabeled data sets, i.e., data sets that do not have an output variable. They find unknown patterns in the data and analyze and segment the data into clusters based on behavior: data points with similar behavioral patterns are grouped together. This makes them widely useful for unlabeled datasets [5].

IV. IMPLEMENTATION

In this paper, I use the Kaggle credit card data set [6] to find fraudulent transactions using unsupervised algorithms. The models are therefore not trained with the output variable, i.e., the target variable; they are fitted directly on the actual dataset without any labels. After fitting, the next step is to predict on the same dataset to find the fraudulent transactions in the whole dataset.

To predict the anomalies in the dataset, I used three unsupervised algorithms in order to compare their accuracy and identify the best-performing one. The unsupervised approach is used in this paper because labeled data is mostly unavailable in the real world, which makes it the approach I consider best suited to fraud detection.

The algorithms used in this paper are:

1) Isolation Forest
2) Local Outlier Factor
3) One-Class SVM

A. Isolation Forest

Isolation Forest is an unsupervised algorithm. Like Random Forest, it is built on the concept of decision trees. Its main idea is to isolate the anomalies, i.e., the fraudulent transactions, directly instead of profiling the normal characteristics of the data points. It randomly selects a feature and then randomly selects a split value between the minimum and maximum values of that feature; anomalies tend to be isolated after fewer such splits than normal points.
[7]

An important parameter of the Isolation Forest algorithm is contamination, the expected proportion of outliers in the data; for example, it is set to 0.1 if about 10% of the points are expected to be outliers. If the proportion is not known, the parameter can be left at auto so that the algorithm handles the outliers while fitting.

Figure 1: Isolation Forest [8]

Figure 1 shows how the outliers are separated from the normal data points by the Isolation Forest. The red data points that lie far from the other data points are considered outliers.

B. Local Outlier Factor

Local Outlier Factor (LOF) is an unsupervised algorithm used to identify anomalies, i.e., fraudulent transactions, in the data. LOF computes a local density score for each sample and compares it with the scores of the neighboring data points; the samples with the lowest density relative to their neighbors are the anomalies [9].

Figure 2: Local Outlier Factor (LOF) [10]

Figure 2 shows the data points and their outlier scores; points whose density is lower than that of their neighbors are considered outliers.

C. One-Class SVM

One-Class Support Vector Machine is an unsupervised algorithm used to detect the outliers in a given dataset. The idea of One-Class SVM is to classify the whole data into two classes, i.e., inliers and outliers. It captures the high-density region of the data, and the points at the extremities, i.e., in lower-density regions, are considered outliers [11].

Figure 3: One-Class Support Vector Machine [12]

Figure 3 shows how the One-Class SVM separates the outliers. The yellow data points in the figure are considered anomalies.
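The three detectors described above can be sketched as follows. This is a minimal illustration, not the implementation used in this paper: it assumes scikit-learn (the source of the figures cited above; the paper does not state which library was used), and it uses synthetic two-dimensional data in place of the Kaggle data set. The names `to_fraud_flag`, `detectors`, and `results`, and all parameter values, are hypothetical choices made for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import accuracy_score
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)

# Synthetic stand-in for transaction features (assumption: not the Kaggle data).
n_normal, n_fraud = 1000, 20
X_normal = rng.normal(loc=0.0, scale=1.0, size=(n_normal, 2))
X_fraud = rng.uniform(low=6.0, high=9.0, size=(n_fraud, 2))  # far from the main cluster
X = np.vstack([X_normal, X_fraud])

# Labels are used only to score the predictions, never to fit the models.
y_true = np.concatenate([np.zeros(n_normal, dtype=int), np.ones(n_fraud, dtype=int)])

# Expected proportion of outliers; known here only because the data is synthetic.
contamination = n_fraud / len(X)

detectors = {
    "Isolation Forest": IsolationForest(contamination=contamination, random_state=42),
    "Local Outlier Factor": LocalOutlierFactor(n_neighbors=20, contamination=contamination),
    "One-Class SVM": OneClassSVM(kernel="rbf", gamma="scale", nu=contamination),
}


def to_fraud_flag(raw_pred):
    """Map scikit-learn's +1 (inlier) / -1 (outlier) output to 0 / 1."""
    return (raw_pred == -1).astype(int)


results = {}
for name, model in detectors.items():
    raw_pred = model.fit_predict(X)      # fit and predict on the same unlabeled data
    y_pred = to_fraud_flag(raw_pred)
    results[name] = accuracy_score(y_true, y_pred)
    print(f"{name}: accuracy = {results[name]:.4f}")
```

scikit-learn's outlier detectors return +1 for inliers and -1 for outliers, so the predictions are mapped to 0/1 before scoring. On data this imbalanced, accuracy is dominated by the normal class, so precision and recall on the fraud class are worth reporting alongside it.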
I trained and predicted on the Kaggle credit card data using the above three unsupervised algorithms to identify the anomalies, i.e., to separate the fraudulent transactions from the normal transactions.

V. RESULTS

After predicting with the unsupervised algorithms Isolation Forest, Local Outlier Factor, and One-Class SVM, Isolation Forest outperforms the other two algorithms.

A. Accuracy of Model

Algorithm               Accuracy
Isolation Forest        99.74%
Local Outlier Factor    99.65%
One-Class SVM           70.09%

Figure 4: Results Table from the Algorithms

Figure 5: Results Plot from the Algorithms (bar chart "Accuracy of the Model": Isolation Forest 99.74%, Local Outlier Factor 99.65%, One-Class SVM 70.09%)

From the results table in Figure 4 and the plot in Figure 5, we can see that the accuracy of Isolation Forest is 99.74%, of Local Outlier Factor 99.65%, and of One-Class SVM 70.09%. Comparing the three models, Isolation Forest and Local Outlier Factor perform almost equally well, and both perform better than One-Class SVM, which performs poorly.

VI. CONCLUSION

This paper discusses how fraudulent credit card transactions can be detected with unsupervised algorithms. In the real world, the outputs for the inputs are often not available, so this paper concentrates on unsupervised learning, in which the algorithms detect the anomalies without observing the output variables. Unsupervised algorithms are very useful and are good detectors of anomalies. The algorithms used in this paper can be applied in any field for anomaly detection, i.e., to detect outliers or fraudulent transactions.

VII. REFERENCES

[1] H. Bodepudi, Faster The Slow Running RDBMS Queries With Spark Framework, (2020). [Online]. Available: https://www.researchgate.net/publication/347390599_Faster_The_Slow_Running_RDBMS_Queries_With_Spark_Framework. [Accessed July 2021].
[2] H. Bodepudi, Data Transfer Between RDBMS and HDFS By Using The Spark Framework In Sqoop For Better Performance, International Journal of Computer Trends and Technology, 69(3) (2021) 1.
[3] Anomaly Detection, [Online]. Available: https://avinetworks.com/glossary/anomaly-detection/. [Accessed July 2021].
[4] What is Machine Learning, [Online]. Available: https://www.ibm.com/cloud/learn/machine-learning. [Accessed July 2021].
[5] Popular Machine Learning Methods, [Online]. Available: https://www.sas.com/en_us/insights/analytics/machine-learning.html. [Accessed July 2021].
[6] CreditCard Data, [Online]. Available: https://www.kaggle.com/mlg-ulb/creditcardfraud?select=creditcard.csv. [Accessed July 2021].
[7] H2o.ai Data Science, [Online]. Available: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/if.html.
[8] Isolation Forest Example, [Online]. Available: https://scikit-learn.org/stable/auto_examples/ensemble/plot_isolation_forest.html#sphx-glr-auto-examples-ensemble-plot-isolation-forest-py.
[9] Anomaly Detection With Local Outlier Factor, [Online]. Available: https://www.datatechnotes.com/2020/04/anomaly-detection-with-local-outlier-factor-in-python.html.
[10] Outlier detection with Local Outlier Factor (LOF), [Online]. Available: https://scikit-learn.org/stable/auto_examples/neighbors/plot_lof_outlier_detection.html. [Accessed August 2021].
[11] One-Class Support Vector Machine, [Online]. Available: https://www.xlstat.com/en/solutions/features/1-class-support-vector-machine.
[12] One-Class SVM, [Online]. Available: https://scikit-learn.org/stable/auto_examples/svm/plot_oneclass.html#sphx-glr-auto-examples-svm-plot-oneclass-py.