SlideShare a Scribd company logo
1 of 26
Credit Card Fraudulent
Transaction Detection
Garvit Burad Dr. Brijendra Gupta
Scholar Associate Professor
Author Co-Author
Department of Computer Engineering
Poornima Institute of Engineering & Technology
The Problem
$21.84 Billion
Lost worldwide due to Credit Card Frauds in 2015
Less than 0.2% Transactions are Fraud
Hence very hard to predict
Our Dataset had
2,84,807 Authentic Transactions
492 Fraudulent Transactions
0.172% of Total Transactions
How Frauds are Recognized
The methods that are generally employed by
the credit card companies.
● Location: Purchase made from different location
● Items you buy: If you deviate from your regular
buying pattern or time
● Frequency: Make a large number of transactions in
short period of time
● Amount: Suddenly if the costly items are purchased
Challenges
Since the data is highly skewed
Authentic Fraud
Challenges
■ Normal machine learning algorithms like
◻Logistic Regression, SVM Classifiers etc.
Would give 99%+ Accuracy
But we can get 99.8% accuracy even if we
classify all the Frauds as Legitimate
Understanding the Data
■ The dataset contains 31 features.
■Out of which V1 to V28 have been
transformed with PCA for privacy of
users
■ The other three features are time,
frequency and the class either 1 for
fraud or 0 otherwise
Approaching the Problem
■ Given the class imbalance, the confusion matrix
accuracy would not be good criteria.
■ So we used Area Under Precision Recall Curve
(AUPC) and Receiver operating characteristic curve
(ROC)
■ A high area under the curve represents both high
recall and high precision
■ High precision relates to a low false positive rate, and
high recall relates to a low false negative rate
■ The ROC plots the true positive (TP) rate against the
false positive (FP) rate at various threshold settings
Dividing the dataset into Test &
Train
To validate our model's prediction
Divide dataset into
◻75% for Training (2,13,605 rows)
◻25% for Testing (71,202 rows)
Resampling Methods
■ The resampling methods are used to adjust
the class distribution of the data as the
minority class is not equally represented.
■There are two methods to perform the
resampling.
◻Oversampling
◻Undersampling
1. Preprocessing
2. Filtering
3. Classification
4. Decision Trees
Feature Engineering
■ To select only those features which matter
the most
■ Those features can be identified by
◻finding the correlation factor among the
features
◻studying the domain of the problem
◻by transforming some existing features to
make them more worthy
1. Preprocessing
2. Filtering
3. Classification
4. Decision Trees
Correlation Factor Among Features
Logistic Regression
■ Logistic regression is a binary classifier
based on the regression model, where the
dependable variable is categorical.
■ In logistic regression, we find
logit(P) = a + bX,
1. Preprocessing
2. Filtering
3. Classification
4. Decision Trees
Logistic Regression
Logistic Regression
we can observe that our logistic regression
classifier classifies
98% as true negative and 92% as true positive
but the precision for the fraud class is 0.06.
---Classification Report---
precision -recall
0 1.00 0.98
1 0.06 0.92
LOGISTIC REGRESSION
WITH SMOTE
■ SMOTE stands for Synthetic Minority
Oversampling Technique to an input dataset
■ It works by generating new instances of data
from the existing by taking feature space of
each target class and its nearest neighbors
■ This is a statistical technique to increase the
number of samples in minority class in the
dataset to make it balanced.
LOGISTIC REGRESSION WITH
SMOTE
■ After SMOTE sample size becomes
◻Fraud 21,334
◻Legitimate 21,334
Results:
◻Accuracy: 98.7%
◻Precision: 8.95%
◻Recall: 82.9%
◻AUC: 94.5%
RANDOM FOREST CLASSIFIER
AFTER SMOTE
■ A random forest classifier fits a multiple number of
decision tree classifiers on various subsamples of
the dataset
■ Then calculates the average to improve the
accuracy of the prediction and control over fitting.
■ We have used the random forest classifier with the
Gini impurity
■ It is a measure of how many times a randomly
selected element from the dataset would be
incorrectly labeled.
RANDOM FOREST CLASSIFIER AFTER
SMOTE
RANDOM FOREST CLASSIFIER
AFTER SMOTE
■ Results
Accuracy: 99.94%
Precision: 85.43%
Recall: 79.27%
AUC: 93.50%
LOGISTIC REGRESSION WITH
UNDERSAMPLING
■ Applying the undersampling to make the
number of samples from the majority class
equal to the number of samples in minority
class
■ Sample size: becomes
◻Number of Fraud cases: 492
◻Number of normal cases:492
LOGISTIC REGRESSION WITH
UNDERSAMPLING
Confusion Matrix from Test Data
LOGISTIC REGRESSION WITH
UNDERSAMPLING
Same model when tested on the original
dataset without Undersampling
LOGISTIC REGRESSION WITH
UNDERSAMPLING
ROC curve for the complete original training
dataset with logistic regression model trained by
undersampled data. The AUC is 0.96.
APPLICATIONS
● These techniques of detecting the fraud
transactions can be implemented to make
the transactions through the credit cards
more secure.
● The learned models can be deployed on
high-performance computers to monitor
all the transactions in real time and stop
the ongoing transaction while also alerting
the owner about the steps required to
further eliminate the risk
CONCLUSION
The best AUC for the ROC curve were
achieved by:
Logistic Regression with SMOTE:0.95
AUC
Logistic regression with undersampling:
0.96 AUC
These are the methods that performed
better than other methods.
References
[1] G. O. [1]Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with
Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining
(CIDM), IEEE, 2015
[2] [2]http://mlg.ulb.ac.be(URL).
[3] [3]http://www.businesswire.com/news/home/20150804007054/en/Global-Card-Fraud-Losses-Reach-16.31-Billion
[4] [4]http://www.kaggle.com(URL).
[5] [5]The Nilson Report
[6] [6]http://www.thinksaveretire.com/2015/09/14/how-credit-card-fraud-detection-works/(URL).
[7] [7]SMOTE: Synthetic Minority Over-sampling Technique. Nitesh V. Chawla chawla@csee.usf.edu. Department of
Computer Science and Engineering
[8] [8]Area Under the Precision-Recall Curve: Point Estimates and Confidence Intervals Kendrick Boyd1 , Kevin H. Eng
, and C. David Page
[9] [9]http://scikit-learn.org/stable/auto_examples/model_selection/plot_precision_recall.html(URL).
[10] [11]Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.
[11] [10]Jones E, Oliphant E, Peterson P, et al. SciPy: Open Source Scientific Tools for Python, 2001-,
http://www.scipy.org/ [Online; accessed 2017-03-30]
[12] [11]Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. The NumPy Array: A Structure for Efficient
Numerical Computation, Computing in Science & Engineering, 13, 22-30 (2011), DOI:10.1109/MCSE.2011.37

More Related Content

What's hot

Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detectionkalpesh1908
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAndrea Dal Pozzolo
 
CREDIT CARD FRAUD DETECTION
CREDIT CARD FRAUD DETECTION CREDIT CARD FRAUD DETECTION
CREDIT CARD FRAUD DETECTION K Srinivas Rao
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learningdataalcott
 
Is Machine learning useful for Fraud Prevention?
Is Machine learning useful for Fraud Prevention?Is Machine learning useful for Fraud Prevention?
Is Machine learning useful for Fraud Prevention?Andrea Dal Pozzolo
 
Fraud detection with Machine Learning
Fraud detection with Machine LearningFraud detection with Machine Learning
Fraud detection with Machine LearningScaleway
 
Detecting fraud with Python and machine learning
Detecting fraud with Python and machine learningDetecting fraud with Python and machine learning
Detecting fraud with Python and machine learningwgyn
 
Credit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In DatabricksCredit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In DatabricksDatabricks
 
Credit card payment_fraud_detection
Credit card payment_fraud_detectionCredit card payment_fraud_detection
Credit card payment_fraud_detectionPEIPEI HAN
 
Comparative study of various approaches for transaction Fraud Detection using...
Comparative study of various approaches for transaction Fraud Detection using...Comparative study of various approaches for transaction Fraud Detection using...
Comparative study of various approaches for transaction Fraud Detection using...Pratibha Singh
 
Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)k.surya kumar
 
Machine Learning for Fraud Detection
Machine Learning for Fraud DetectionMachine Learning for Fraud Detection
Machine Learning for Fraud DetectionNitesh Kumar
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsHariteja Bodepudi
 
A Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine LearningA Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine Learningijtsrd
 
Credit Card Fraud Detection
Credit Card Fraud DetectionCredit Card Fraud Detection
Credit Card Fraud Detectionijtsrd
 
PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014Sri Ambati
 
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...Melissa Moody
 
Credit card fraud detection using python machine learning
Credit card fraud detection using python machine learningCredit card fraud detection using python machine learning
Credit card fraud detection using python machine learningSandeep Garg
 

What's hot (20)

Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detection
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud Detection
 
CREDIT CARD FRAUD DETECTION
CREDIT CARD FRAUD DETECTION CREDIT CARD FRAUD DETECTION
CREDIT CARD FRAUD DETECTION
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learning
 
Is Machine learning useful for Fraud Prevention?
Is Machine learning useful for Fraud Prevention?Is Machine learning useful for Fraud Prevention?
Is Machine learning useful for Fraud Prevention?
 
Fraud detection
Fraud detectionFraud detection
Fraud detection
 
Fraud detection with Machine Learning
Fraud detection with Machine LearningFraud detection with Machine Learning
Fraud detection with Machine Learning
 
Detecting fraud with Python and machine learning
Detecting fraud with Python and machine learningDetecting fraud with Python and machine learning
Detecting fraud with Python and machine learning
 
Credit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In DatabricksCredit Card Fraud Detection Using ML In Databricks
Credit Card Fraud Detection Using ML In Databricks
 
Credit card payment_fraud_detection
Credit card payment_fraud_detectionCredit card payment_fraud_detection
Credit card payment_fraud_detection
 
Comparative study of various approaches for transaction Fraud Detection using...
Comparative study of various approaches for transaction Fraud Detection using...Comparative study of various approaches for transaction Fraud Detection using...
Comparative study of various approaches for transaction Fraud Detection using...
 
Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)Credit card fraud detection methods using Data-mining.pptx (2)
Credit card fraud detection methods using Data-mining.pptx (2)
 
Machine Learning for Fraud Detection
Machine Learning for Fraud DetectionMachine Learning for Fraud Detection
Machine Learning for Fraud Detection
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
 
A Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine LearningA Study on Credit Card Fraud Detection using Machine Learning
A Study on Credit Card Fraud Detection using Machine Learning
 
CREDIT_CARD.ppt
CREDIT_CARD.pptCREDIT_CARD.ppt
CREDIT_CARD.ppt
 
Credit Card Fraud Detection
Credit Card Fraud DetectionCredit Card Fraud Detection
Credit Card Fraud Detection
 
PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014
 
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
 
Credit card fraud detection using python machine learning
Credit card fraud detection using python machine learningCredit card fraud detection using python machine learning
Credit card fraud detection using python machine learning
 

Similar to Credit Card Fraudulent Transaction Detection Research Paper

MIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_BhatiaMIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_BhatiaRahul Bhatia
 
IRJET- Credit Card Fraud Detection using Machine Learning
IRJET- Credit Card Fraud Detection using Machine LearningIRJET- Credit Card Fraud Detection using Machine Learning
IRJET- Credit Card Fraud Detection using Machine LearningIRJET Journal
 
Leveragin research, behavioural and demeographic data
Leveragin research, behavioural and demeographic dataLeveragin research, behavioural and demeographic data
Leveragin research, behavioural and demeographic dataMRS
 
credit card fraud detection
credit card fraud detectioncredit card fraud detection
credit card fraud detectionjagan477830
 
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdf
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdfTanvi_Sharma_Shruti_Garg_pre.pdf.pdf
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdfShrutiGarg649495
 
Share Credit_Card_Fraud_Detection_ML_MP (1).pptx
Share Credit_Card_Fraud_Detection_ML_MP (1).pptxShare Credit_Card_Fraud_Detection_ML_MP (1).pptx
Share Credit_Card_Fraud_Detection_ML_MP (1).pptxyatintaneja6
 
Analytics for large-scale time series and event data
Analytics for large-scale time series and event dataAnalytics for large-scale time series and event data
Analytics for large-scale time series and event dataAnodot
 
malware detection ppt for vtu project and other final year project
malware detection ppt for vtu project and other final year projectmalware detection ppt for vtu project and other final year project
malware detection ppt for vtu project and other final year projectNaveenAd4
 
IRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET Journal
 
Data Mining to Classify Telco Churners
Data Mining to Classify Telco ChurnersData Mining to Classify Telco Churners
Data Mining to Classify Telco ChurnersMohitMhapuskar
 
Ad Click Prediction - Paper review
Ad Click Prediction - Paper reviewAd Click Prediction - Paper review
Ad Click Prediction - Paper reviewMazen Aly
 
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET Journal
 
Injection Attack detection using ML for
Injection Attack detection using ML  forInjection Attack detection using ML  for
Injection Attack detection using ML forKhazane Hassan
 
Know Your Data: The stats behind your alerts
Know Your Data: The stats behind your alertsKnow Your Data: The stats behind your alerts
Know Your Data: The stats behind your alertsAll Things Open
 
OSMC 2023 | Know your data: The stats behind your alerts by Dave McAllister
OSMC 2023 | Know your data: The stats behind your alerts by Dave McAllisterOSMC 2023 | Know your data: The stats behind your alerts by Dave McAllister
OSMC 2023 | Know your data: The stats behind your alerts by Dave McAllisterNETWAYS
 
DATI, AI E ROBOTICA @POLITO
DATI, AI E ROBOTICA @POLITODATI, AI E ROBOTICA @POLITO
DATI, AI E ROBOTICA @POLITOMarcoMellia
 
AI Class Topic 2: Step-by-step Process for AI development
AI Class Topic 2: Step-by-step Process for AI developmentAI Class Topic 2: Step-by-step Process for AI development
AI Class Topic 2: Step-by-step Process for AI developmentValue Amplify Consulting
 

Similar to Credit Card Fraudulent Transaction Detection Research Paper (20)

Credit Card Fraud Detection_ Mansi_Choudhary.pptx
Credit Card Fraud Detection_ Mansi_Choudhary.pptxCredit Card Fraud Detection_ Mansi_Choudhary.pptx
Credit Card Fraud Detection_ Mansi_Choudhary.pptx
 
MIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_BhatiaMIS637_Final_Project_Rahul_Bhatia
MIS637_Final_Project_Rahul_Bhatia
 
IRJET- Credit Card Fraud Detection using Machine Learning
IRJET- Credit Card Fraud Detection using Machine LearningIRJET- Credit Card Fraud Detection using Machine Learning
IRJET- Credit Card Fraud Detection using Machine Learning
 
Leveragin research, behavioural and demeographic data
Leveragin research, behavioural and demeographic dataLeveragin research, behavioural and demeographic data
Leveragin research, behavioural and demeographic data
 
credit card fraud detection
credit card fraud detectioncredit card fraud detection
credit card fraud detection
 
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdf
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdfTanvi_Sharma_Shruti_Garg_pre.pdf.pdf
Tanvi_Sharma_Shruti_Garg_pre.pdf.pdf
 
Share Credit_Card_Fraud_Detection_ML_MP (1).pptx
Share Credit_Card_Fraud_Detection_ML_MP (1).pptxShare Credit_Card_Fraud_Detection_ML_MP (1).pptx
Share Credit_Card_Fraud_Detection_ML_MP (1).pptx
 
Analytics for large-scale time series and event data
Analytics for large-scale time series and event dataAnalytics for large-scale time series and event data
Analytics for large-scale time series and event data
 
malware detection ppt for vtu project and other final year project
malware detection ppt for vtu project and other final year projectmalware detection ppt for vtu project and other final year project
malware detection ppt for vtu project and other final year project
 
IRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection Analysis
 
machineLearningTypingTool_Rev1
machineLearningTypingTool_Rev1machineLearningTypingTool_Rev1
machineLearningTypingTool_Rev1
 
Data Mining to Classify Telco Churners
Data Mining to Classify Telco ChurnersData Mining to Classify Telco Churners
Data Mining to Classify Telco Churners
 
Ad Click Prediction - Paper review
Ad Click Prediction - Paper reviewAd Click Prediction - Paper review
Ad Click Prediction - Paper review
 
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
IRJET- Prediction of Crime Rate Analysis using Supervised Classification Mach...
 
Injection Attack detection using ML for
Injection Attack detection using ML  forInjection Attack detection using ML  for
Injection Attack detection using ML for
 
credit card.pptx
credit card.pptxcredit card.pptx
credit card.pptx
 
Know Your Data: The stats behind your alerts
Know Your Data: The stats behind your alertsKnow Your Data: The stats behind your alerts
Know Your Data: The stats behind your alerts
 
OSMC 2023 | Know your data: The stats behind your alerts by Dave McAllister
OSMC 2023 | Know your data: The stats behind your alerts by Dave McAllisterOSMC 2023 | Know your data: The stats behind your alerts by Dave McAllister
OSMC 2023 | Know your data: The stats behind your alerts by Dave McAllister
 
DATI, AI E ROBOTICA @POLITO
DATI, AI E ROBOTICA @POLITODATI, AI E ROBOTICA @POLITO
DATI, AI E ROBOTICA @POLITO
 
AI Class Topic 2: Step-by-step Process for AI development
AI Class Topic 2: Step-by-step Process for AI developmentAI Class Topic 2: Step-by-step Process for AI development
AI Class Topic 2: Step-by-step Process for AI development
 

Recently uploaded

Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 

Recently uploaded (20)

Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 

Credit Card Fraudulent Transaction Detection Research Paper

  • 1. Credit Card Fraudulent Transaction Detection Garvit Burad Dr. Brijendra Gupta Scholar Associate Professor Author Co-Author Department of Computer Engineering Poornima Institute of Engineering & Technology
  • 2. The Problem $21.84 Billion Lost worldwide due to Credit Card Frauds in 2015 Less than 0.2% Transactions are Fraud Hence very hard to predict Our Dataset had 2,84,807 Authentic Transactions 492 Fraudulent Transactions 0.172% of Total Transactions
  • 3. How Frauds are Recognized The methods that are generally employed by the credit card companies. ● Location: Purchase made from different location ● Items you buy: If you deviate from your regular buying pattern or time ● Frequency: Make a large number of transactions in short period of time ● Amount: Suddenly if the costly items are purchased
  • 4. Challenges Since the data is highly skewed Authentic Fraud
  • 5. Challenges ■ Normal machine learning algorithms like ◻Logistic Regression, SVM Classifiers etc. Would give 99%+ Accuracy But we can get 99.8% accuracy even if we classify all the Frauds as Legitimate
  • 6. Understanding the Data ■ The dataset contains 31 features. ■Out of which V1 to V28 have been transformed with PCA for privacy of users ■ The other three features are time, frequency and the class either 1 for fraud or 0 otherwise
  • 7. Approaching the Problem ■ Given the class imbalance, the confusion matrix accuracy would not be good criteria. ■ So we used Area Under Precision Recall Curve (AUPC) and Receiver operating characteristic curve (ROC) ■ A high area under the curve represents both high recall and high precision ■ High precision relates to a low false positive rate, and high recall relates to a low false negative rate ■ The ROC plots the true positive (TP) rate against the false positive (FP) rate at various threshold settings
  • 8. Dividing the dataset into Test & Train To validate our model's prediction Divide dataset into ◻75% for Training (2,13,605 rows) ◻25% for Testing (71,202 rows)
  • 9. Resampling Methods ■ The resampling methods are used to adjust the class distribution of the data as the minority class is not equally represented. ■There are two methods to perform the resampling. ◻Oversampling ◻Undersampling 1. Preprocessing 2. Filtering 3. Classification 4. Decision Trees
  • 10. Feature Engineering ■ To select only those features which matter the most ■ Those features can be identified by ◻finding the correlation factor among the features ◻studying the domain of the problem ◻by transforming some existing features to make them more worthy 1. Preprocessing 2. Filtering 3. Classification 4. Decision Trees
  • 12. Logistic Regression ■ Logistic regression is a binary classifier based on the regression model, where the dependable variable is categorical. ■ In logistic regression, we find logit(P) = a + bX, 1. Preprocessing 2. Filtering 3. Classification 4. Decision Trees
  • 14. Logistic Regression we can observe that our logistic regression classifier classifies 98% as true negative and 92% as true positive but the precision for the fraud class is 0.06. ---Classification Report--- precision -recall 0 1.00 0.98 1 0.06 0.92
  • 15. LOGISTIC REGRESSION WITH SMOTE ■ SMOTE stands for Synthetic Minority Oversampling Technique to an input dataset ■ It works by generating new instances of data from the existing by taking feature space of each target class and its nearest neighbors ■ This is a statistical technique to increase the number of samples in minority class in the dataset to make it balanced.
  • 16. LOGISTIC REGRESSION WITH SMOTE ■ After SMOTE sample size becomes ◻Fraud 21,334 ◻Legitimate 21,334 Results: ◻Accuracy: 98.7% ◻Precision: 8.95% ◻Recall: 82.9% ◻AUC: 94.5%
  • 17. RANDOM FOREST CLASSIFIER AFTER SMOTE ■ A random forest classifier fits a multiple number of decision tree classifiers on various subsamples of the dataset ■ Then calculates the average to improve the accuracy of the prediction and control over fitting. ■ We have used the random forest classifier with the Gini impurity ■ It is a measure of how many times a randomly selected element from the dataset would be incorrectly labeled.
  • 19. RANDOM FOREST CLASSIFIER AFTER SMOTE ■ Results Accuracy: 99.94% Precision: 85.43% Recall: 79.27% AUC: 93.50%
  • 20. LOGISTIC REGRESSION WITH UNDERSAMPLING ■ Applying the undersampling to make the number of samples from the majority class equal to the number of samples in minority class ■ Sample size: becomes ◻Number of Fraud cases: 492 ◻Number of normal cases:492
  • 22. LOGISTIC REGRESSION WITH UNDERSAMPLING Same model when tested on the original dataset without Undersampling
  • 23. LOGISTIC REGRESSION WITH UNDERSAMPLING ROC curve for the complete original training dataset with logistic regression model trained by undersampled data. The AUC is 0.96.
  • 24. APPLICATIONS ● These techniques of detecting the fraud transactions can be implemented to make the transactions through the credit cards more secure. ● The learned models can be deployed on high-performance computers to monitor all the transactions in real time and stop the ongoing transaction while also alerting the owner about the steps required to further eliminate the risk
  • 25. CONCLUSION The best AUC for the ROC curve were achieved by: Logistic Regression with SMOTE:0.95 AUC Logistic regression with undersampling: 0.96 AUC These are the methods that performed better than other methods.
  • 26. References [1] G. O. [1]Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015 [2] [2]http://mlg.ulb.ac.be(URL). [3] [3]http://www.businesswire.com/news/home/20150804007054/en/Global-Card-Fraud-Losses-Reach-16.31-Billion [4] [4]http://www.kaggle.com(URL). [5] [5]The Nilson Report [6] [6]http://www.thinksaveretire.com/2015/09/14/how-credit-card-fraud-detection-works/(URL). [7] [7]SMOTE: Synthetic Minority Over-sampling Technique. Nitesh V. Chawla chawla@csee.usf.edu. Department of Computer Science and Engineering [8] [8]Area Under the Precision-Recall Curve: Point Estimates and Confidence Intervals Kendrick Boyd1 , Kevin H. Eng , and C. David Page [9] [9]http://scikit-learn.org/stable/auto_examples/model_selection/plot_precision_recall.html(URL). [10] [11]Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011. [11] [10]Jones E, Oliphant E, Peterson P, et al. SciPy: Open Source Scientific Tools for Python, 2001-, http://www.scipy.org/ [Online; accessed 2017-03-30] [12] [11]Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. The NumPy Array: A Structure for Efficient Numerical Computation, Computing in Science & Engineering, 13, 22-30 (2011), DOI:10.1109/MCSE.2011.37