Data mining for credit card fraud: A comparative study
Siddhartha Bhattacharyya, Sanjeev Jha, Kurian Tharakunnel,
Christopher Westland (2011)
DOR BAHDUR BUDHATHOKI SID:45057
NARENDRA SHARMA SID:45040
Abstract
• This paper evaluates two advanced data mining approaches, support
vector machines and random forests, together with the well-known
logistic regression, as part of an attempt to better detect (and thus control
and prosecute) credit card fraud. The study is based on real-life data of
transactions from an international credit card operation.
Introduction
• Data mining- The practice of examining large pre-existing databases in order to
generate new information. It helps turn raw data into useful
information and supports knowledge discovery and predictive analysis (applying past
outcomes to predict future ones).
• Credit card fraud- There are two types of credit card fraud:
1) Application fraud- Obtaining a new card from an issuing company using false
information.
2) Behavioral fraud- Includes mail theft, stolen cards, counterfeit cards, and
"cardholder not present" fraud.
Methods
• Three data mining techniques are used to predict credit card fraud.
1) Logistic Regression (LR)- Appropriate when the dependent variable is
categorical; here the dependent variable, fraud, is binary.
2) Support Vector Machines (SVM)- Statistical learning techniques that have
been found very successful in a variety of tasks. SVMs are linear classifiers
that work in a high-dimensional feature space without incurring any
additional computational complexity.
3) Random Forest (RF)- An ensemble of classification tree models.
Ensembles work well when the individual members are dissimilar, and random
forests obtain this variation among the individual trees.
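To make the first technique concrete, here is a minimal sketch of logistic regression for a binary fraud label, written in plain Python. It is an illustration only, not the setup used in the paper: the toy data, the single feature, the function names (`fit_logistic`, `predict_proba`) and the learning-rate and epoch values are all assumptions chosen for this example.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit one weight per feature plus a bias by batch gradient descent on log loss."""
    n_features = len(xs[0])
    n = len(xs)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # derivative of log loss with respect to the score
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that transaction x is fraudulent."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Hypothetical toy data: one scaled feature per transaction, label 1 = fraud.
X = [[0.1], [0.2], [0.3], [0.4], [1.6], [1.7], [1.8], [1.9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(X, y)
low_risk = predict_proba(w, b, [0.1])
high_risk = predict_proba(w, b, [1.9])
```

The model outputs a probability rather than a hard label, so a cut-off (0.5 here, by assumption) is still needed to flag a transaction as fraud.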
Results
• This section presents results from the experiments comparing the
performance of Logistic Regression (LR), Random Forests (RF) and Support
Vector Machines (SVM) models developed from training data carrying
varying levels of fraud cases.
Discussion
• This paper examined the performance of two advanced data mining
techniques, random forests and support vector machines, together with
logistic regression, for credit card fraud detection.
• A real-life dataset on credit card transactions from the January 2006–
January 2007 period was used in their evaluation.
• Random forests and SVM are two approaches that have gained prominence
in recent years with noted superior performance across a range of
applications. To date, their use for credit card fraud prediction has been
limited.
Discussion
• They use data undersampling, a simple approach which has been noted to
perform well, and examine the performance of the three techniques with
varying levels of data undersampling. For performance assessment, they
use a test dataset with a much lower fraud rate (0.5%) than in the training
datasets with different levels of undersampling.
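The undersampling idea can be sketched as follows: keep every fraud case and draw only as many legitimate cases as needed to reach a chosen fraud rate in the training set. This is a hedged illustration, not the authors' procedure; the `undersample` function, the `is_fraud` field, the synthetic records and the 10% target are assumptions made for this example.

```python
import random


def undersample(transactions, target_fraud_rate, seed=0):
    """Keep all fraud cases and a random subset of legitimate cases so that
    fraud makes up roughly `target_fraud_rate` of the returned set."""
    if not 0 < target_fraud_rate < 1:
        raise ValueError("target_fraud_rate must be between 0 and 1")
    fraud = [t for t in transactions if t["is_fraud"]]
    legit = [t for t in transactions if not t["is_fraud"]]
    # Number of legitimate cases needed for the requested fraud share.
    n_legit = round(len(fraud) * (1 - target_fraud_rate) / target_fraud_rate)
    n_legit = min(n_legit, len(legit))
    rng = random.Random(seed)
    sample = fraud + rng.sample(legit, n_legit)
    rng.shuffle(sample)
    return sample


# Hypothetical data: 10,000 transactions with a 0.5% fraud rate.
data = [{"id": i, "is_fraud": i < 50} for i in range(10_000)]

train = undersample(data, target_fraud_rate=0.10)
train_fraud_rate = sum(t["is_fraud"] for t in train) / len(train)
```

Only the training data is rebalanced in this sketch; a held-out test set would keep its original, much lower fraud rate so that measured performance reflects realistic conditions.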
Thank You
