Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Example-Dependent Cost-Sensitive Credit Card Fraud Detection

5,260 views

Published on

Credit card fraud is a growing problem that affects card holders around the world. Fraud detection has been an interesting topic in machine learning. Nevertheless, current state of the art credit card fraud detection algorithms miss to include the real costs of credit card fraud as a measure to evaluate algorithms. In this paper a new comparison measure that realistically represents the monetary gains and losses due to fraud detection is proposed. Moreover, using the proposed cost measure a cost sensitive method based on Bayes minimum risk is presented. This method is compared with state of the art algorithms and shows improvements up to 23% measured by cost. The results of this paper are based on real life transactional data provided by a large European card processing company.

Published in: Technology
  • If you are looking for trusted essay writing service I highly recommend ⇒⇒⇒WRITE-MY-PAPER.net ⇐⇐⇐ The service I received was great. I got an A on my final paper which really helped my grade. Knowing that I can count on them in the future has really helped relieve the stress, anxiety and workload. I recommend everyone to give them a try. You'll be glad you did.
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Check the source ⇒ www.HelpWriting.net ⇐ This site is really helped me out gave me relief from headaches. Good luck!
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Yes you are right. There are many research paper writing services available now. But almost services are fake and illegal. Only a genuine service will treat their customer with quality research papers. ⇒ www.WritePaper.info ⇐
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • DOWNLOAD THIS BOOKS INTO AVAILABLE FORMAT (2019 Update) ......................................................................................................................... ......................................................................................................................... Download Full PDF EBOOK here { https://soo.gd/irt2 } ......................................................................................................................... Download Full EPUB Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download Full doc Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download PDF EBOOK here { https://soo.gd/irt2 } ......................................................................................................................... Download EPUB Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download doc Ebook here { https://soo.gd/irt2 } ......................................................................................................................... ......................................................................................................................... ................................................................................................................................... eBook is an electronic version of a traditional print book THIS can be read by using a personal computer or by using an eBook reader. (An eBook reader can be a software application for use on a computer such as Microsoft's free Reader application, or a book-sized computer THIS is used solely as a reading device such as Nuvomedia's Rocket eBook.) Users can purchase an eBook on diskette or CD, but the most popular method of getting an eBook is to purchase a downloadable file of the eBook (or other reading material) from a Web site (such as Barnes and Noble) to be read from the user's computer or reading device. Generally, an eBook can be downloaded in five minutes or less ......................................................................................................................... .............. Browse by Genre Available eBooks .............................................................................................................................. Art, Biography, Business, Chick Lit, Children's, Christian, Classics, Comics, Contemporary, Cookbooks, Manga, Memoir, Music, Mystery, Non Fiction, Paranormal, Philosophy, Poetry, Psychology, Religion, Romance, Science, Science Fiction, Self Help, Suspense, Spirituality, Sports, Thriller, Travel, Young Adult, Crime, Ebooks, Fantasy, Fiction, Graphic Novels, Historical Fiction, History, Horror, Humor And Comedy, ......................................................................................................................... ......................................................................................................................... .....BEST SELLER FOR EBOOK RECOMMEND............................................................. ......................................................................................................................... Blowout: Corrupted Democracy, Rogue State Russia, and the Richest, Most Destructive Industry on Earth,-- The Ride of a Lifetime: Lessons Learned from 15 Years as CEO of the Walt Disney Company,-- Call Sign Chaos: Learning to Lead,-- StrengthsFinder 2.0,-- Stillness Is the Key,-- She Said: Breaking the Sexual Harassment Story THIS Helped Ignite a Movement,-- Atomic Habits: An Easy & Proven Way to Build Good Habits & Break Bad Ones,-- Everything Is Figureoutable,-- What It Takes: Lessons in the Pursuit of Excellence,-- Rich Dad Poor Dad: What the Rich Teach Their Kids About Money THIS the Poor and Middle Class Do Not!,-- The Total Money Makeover: Classic Edition: A Proven Plan for Financial Fitness,-- Shut Up and Listen!: Hard Business Truths THIS Will Help You Succeed, ......................................................................................................................... .........................................................................................................................
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • DOWNLOAD THIS BOOKS INTO AVAILABLE FORMAT (2019 Update) ......................................................................................................................... ......................................................................................................................... Download Full PDF EBOOK here { https://soo.gd/irt2 } ......................................................................................................................... Download Full EPUB Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download Full doc Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download PDF EBOOK here { https://soo.gd/irt2 } ......................................................................................................................... Download EPUB Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download doc Ebook here { https://soo.gd/irt2 } ......................................................................................................................... ......................................................................................................................... ................................................................................................................................... eBook is an electronic version of a traditional print book THIS can be read by using a personal computer or by using an eBook reader. (An eBook reader can be a software application for use on a computer such as Microsoft's free Reader application, or a book-sized computer THIS is used solely as a reading device such as Nuvomedia's Rocket eBook.) Users can purchase an eBook on diskette or CD, but the most popular method of getting an eBook is to purchase a downloadable file of the eBook (or other reading material) from a Web site (such as Barnes and Noble) to be read from the user's computer or reading device. Generally, an eBook can be downloaded in five minutes or less ......................................................................................................................... .............. Browse by Genre Available eBooks .............................................................................................................................. Art, Biography, Business, Chick Lit, Children's, Christian, Classics, Comics, Contemporary, Cookbooks, Manga, Memoir, Music, Mystery, Non Fiction, Paranormal, Philosophy, Poetry, Psychology, Religion, Romance, Science, Science Fiction, Self Help, Suspense, Spirituality, Sports, Thriller, Travel, Young Adult, Crime, Ebooks, Fantasy, Fiction, Graphic Novels, Historical Fiction, History, Horror, Humor And Comedy, ......................................................................................................................... ......................................................................................................................... .....BEST SELLER FOR EBOOK RECOMMEND............................................................. ......................................................................................................................... Blowout: Corrupted Democracy, Rogue State Russia, and the Richest, Most Destructive Industry on Earth,-- The Ride of a Lifetime: Lessons Learned from 15 Years as CEO of the Walt Disney Company,-- Call Sign Chaos: Learning to Lead,-- StrengthsFinder 2.0,-- Stillness Is the Key,-- She Said: Breaking the Sexual Harassment Story THIS Helped Ignite a Movement,-- Atomic Habits: An Easy & Proven Way to Build Good Habits & Break Bad Ones,-- Everything Is Figureoutable,-- What It Takes: Lessons in the Pursuit of Excellence,-- Rich Dad Poor Dad: What the Rich Teach Their Kids About Money THIS the Poor and Middle Class Do Not!,-- The Total Money Makeover: Classic Edition: A Proven Plan for Financial Fitness,-- Shut Up and Listen!: Hard Business Truths THIS Will Help You Succeed, ......................................................................................................................... .........................................................................................................................
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

Example-Dependent Cost-Sensitive Credit Card Fraud Detection

  1. 1. Example-Dependent Cost-Sensitive Credit Card Fraud Detection March 21st, 2014 Alejandro Correa Bahnsen with Djamila Aouada, SnT Björn Ottersten, SnT
  2. 2. Introduction € 500 € 600 € 700 € 800 2007 2008 2009 2010 2011E 2012E Europe fraud evolution Internettransactions(millions of euros) 2
  3. 3. Introduction $- $1.0 $2.0 $3.0 $4.0 $5.0 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 US fraud evolution Online revenue lost due to fraud (Billions of dollars) 3
  4. 4. • Increasing fraud levels around the world • Different technologies and legal requirements makes it harder to control • Lack of collaboration between academia and practitioners, leading to solutions that fail to incorporate practical issues of credit card fraud detection: • Financial comparison measures • Huge class imbalance • Response time measure in milliseconds Introduction 4
  5. 5. • Introduction • Database • Evaluation • Bayes Minimum Risk • Experiments • Probability Calibration • Other applications • Conclusions & Future Work Agenda 5
  6. 6. Simplify transaction flow Fraud?? Network 6
  7. 7. Data • Larger European card processing company • Jan2012 – Jun2013 card present transactions • 1,638,772 Transactions • 3,444 Frauds • 0.21%Fraud rate • 205,542 EUR lost due to fraud on test dataset Jun13 May13 Apr13 Mar13 Feb13 Jan13 … … … Mar12 Feb12 Jan12 Test Train 7
  8. 8. • Raw attributes • Other attributes: Age, country of residence, postal code, type of card Data TRXID Client ID Date Amount Location Type Merchant Group Fraud 1 1 2/1/12 6:00 580 Ger Internet Airlines No 2 1 2/1/12 6:15 120 Eng Present Car Rent No 3 2 2/1/12 8:20 12 Bel Present Hotel Yes 4 1 3/1/12 4:15 60 Esp ATM ATM No 5 2 3/1/12 9:18 8 Fra Present Retail No 6 1 3/1/12 9:55 1210 Ita Internet Airlines Yes 8
  9. 9. • Derived attributes Data Trx ID Client ID Date Amount Location Type Merchant Group Fraud No. of Trx – same client– last 6 hour Sum – same client– last 7 days 1 1 2/1/12 6:00 580 Ger Internet Airlines No 0 0 2 1 2/1/12 6:15 120 Eng Present Car Renting No 1 580 3 2 2/1/12 8:20 12 Bel Present Hotel Yes 0 0 4 1 3/1/12 4:15 60 Esp ATM ATM No 0 700 5 2 3/1/12 9:18 8 Fra Present Retail No 0 12 6 1 3/1/12 9:55 1210 Ita Internet Airlines Yes 1 760 By Group Last Function Client None hour Count Credit Card Transaction Type day Sum(Amount) Merchant week Avg(Amount) Merchant Category month Merchant Country 3 months – Combination of following criteria: 9
  10. 10. Date of transaction 04/03/2012 - 03:14 07/03/2012 - 00:47 07/03/2012 - 02:57 08/03/2012 - 02:08 14/03/2012 - 22:15 25/03/2012 - 05:03 26/03/2012 - 21:51 28/03/2012 - 03:41 𝐴𝑟𝑖𝑡ℎ𝑚𝑒𝑡𝑖𝑐 𝑀𝑒𝑎𝑛 = 1 𝑛 𝑡 𝑃𝑒𝑟𝑖𝑜𝑑𝑖𝑐 𝑀𝑒𝑎𝑛 = tan _2−1 sin(𝑡) , cos(𝑡) 𝑃𝑒𝑟𝑖𝑜𝑑𝑖𝑐 𝑆𝑡𝑑 = 𝑙𝑛 1 1 𝑛 sin 𝑡 2 + 1 𝑛 cos 𝑡 2 𝑡 ~ 𝑣𝑜𝑛𝑚𝑖𝑠𝑒𝑠 𝑘 ≈ 1 𝑠𝑡𝑑 𝑃 −𝑧𝑡 < 𝑡 < 𝑧𝑡 = 0.95 -1 -1 24h 6h 12h 18h Data 10
  11. 11. Date of transaction 04/03/2012 - 03:14 07/03/2012 - 00:47 07/03/2012 - 02:57 08/03/2012 - 02:08 14/03/2012 - 22:15 25/03/2012 - 05:03 26/03/2012 - 21:51 28/03/2012 - 03:41 -1 -1 24h 6h 12h 18h 02/04/2012 - 02:02 03/04/2012 - 12:10 new features Inside CI(0.95) last 30 days Inside CI(0.95) last 7 days Inside CI(0.5) last 30 days Inside CI(0.5) last 7 days Data 11
  12. 12. • Misclassification = 1 − 𝑇𝑃+𝑇𝑁 𝑇𝑃+𝑇𝑁+𝐹𝑃+𝐹𝑁 • Recall = 𝑇𝑃 𝑇𝑃+𝐹𝑁 • Precision = 𝑇𝑃 𝑇𝑃+𝐹𝑃 • F-Score = 2 𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 ∗ 𝑅𝑒𝑐𝑎𝑙𝑙 𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛+𝑅𝑒𝑐𝑎𝑙𝑙 Evaluation True Class (𝑦𝑖) Fraud (𝑦𝑖=1) Legitimate (𝑦𝑖=0) Predicted class (𝑝𝑖) Fraud (𝑝𝑖=1) TP FP Legitimate (𝑝𝑖=0) FN TN • Confusion matrix 12
  13. 13. • Motivation: • Equal misclassification results • Frauds carry different cost Evaluation - Financialmeasure TRX ID Amount Fraud 1 580 No 2 120 No 3 12 Yes 4 60 No 5 8 No 6 1210 Yes Miss-Class 2 / 6 Cost 1222 Prediction (Fraud?) No No Yes No Yes No 2 / 6 1212 Prediction (Fraud?) No No No No Yes Yes 2 / 6 14 Prediction (Fraud?) No No No No No No Algorithm 1 Algorithm 3Algorithm 2 13
  14. 14. • Cost matrix where: Evaluation - Financialmeasure True Class (𝑦𝑖) Fraud (𝑦𝑖=1) Legitimate (𝑦𝑖=0) Predicted class (𝑝𝑖) Fraud (𝑐𝑖=1) 𝐶 𝑇𝑃𝑖 = Ca 𝐶 𝐹𝑃𝑖 = Ca Legitimate (𝑐𝑖=0) 𝐶 𝐹𝑁𝑖 = Amt(i) 𝐶 𝑇𝑁𝑖 = 0 Ca Administrative costs Amt Amount of transaction i 𝐶𝑜𝑠𝑡 = 𝑦𝑖 𝑐𝑖 𝐶 𝑎 + 1 − 𝑐𝑖 𝐴𝑚𝑡𝑖 + (1 − 𝑦𝑖)𝑐𝑖 𝐶 𝑎 𝑚 𝑖=1 14 • Evaluation measure
  15. 15. • Introduction • Database • Evaluation • Bayes Minimum Risk • Experiments • Probability Calibration • Other applications • Conclusions & Future Work Agenda 15
  16. 16. • Decision model based on quantifying tradeoffs between various decisions using probabilities and the costs that accompany such decisions • Risk of classification Bayes Minimum Risk 16
  17. 17. • If then 𝑡 𝐵𝑀𝑅 𝑖 = 𝐶 𝑎 𝐴𝑚𝑡𝑖 Bayes Minimum Risk 17 • Example-dependent threshold
  18. 18. • Estimation of the fraud probabilities using • Decision Trees • Logistic Regression • Random Forest • Datasets Experiments Database Transactions Frauds Losses Total 1,638,772 0.21% 860,448 Train 815,368 0.21% 416,369 Validation 412,137 0.22% 238,537 Test 411,267 0.21% 205,542 18
  19. 19. Experiments 0.00 0.05 0.10 0.15 0.20 0.25 0 50,000 100,000 150,000 200,000 250,000 No Model RF DT LR Cost(Euros) Cost F1-Score 19
  20. 20. Experiments 0.00 0.05 0.10 0.15 0.20 0.25 - 50,000 100,000 150,000 200,000 250,000 No Model RF BMR DT BMR LR BMR RF DT LR Cost(Euros) Cost F1-Score 20 • Cost its reduced when using BRM • The F1-Score is also reduced
  21. 21. Probability Calibration • When using the output of a binary classier as a basis for decision making, there is a need for a probability that not only separates well between positive and negative examples, but that also assesses the real probability of the event 21
  22. 22. Probability Calibration • Reliability Diagram 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 P(pf) P(pf|x) base LR RF DT 22
  23. 23. Probability Calibration • ROC Convex Hull calibration ROC CurveClass (y) Prob (p) 0 0.0 1 0.1 0 0.2 0 0.3 1 0.4 0 0.5 1 0.6 1 0.7 0 0.8 1 0.9 1 1.0 23
  24. 24. Probability Calibration • ROC Convex Hull calibration ROC ConvexHull Curve Class (y) Prob (p) Cal Prob 0.0 0 0 0.1 1 0.333 0.2 0 0.333 0.3 0 0.333 0.4 1 0.5 0.5 0 0.5 0.6 1 0.666 0.7 1 0.666 0.8 0 0.666 0.9 1 1 1.0 1 1 the calibratedprobabilitiesare extracted by first group the probabilitiesaccording to the pointsin the ROCCH curve, and then make the calibratedprobabilitiesbe the slope(T) for each group T. 24
  25. 25. Probability Calibration • Reliability Diagram 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 P(pf) P(pf|x) base RF DT LR 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1P(pf) P(pf|x) base Cal RF Cal DT Cal LR 25
  26. 26. • Extra 1.5% decrease in cost by using calibrated probabilities Experiments 0.00 0.05 0.10 0.15 0.20 0.25 - 50,000 100,000 150,000 200,000 250,000 No Model RF BMR CAL BMR DT BMR CAL BMR LR BMR CAL BMR RF DT LR Cost(Euros) Cost F1-Score 26
  27. 27. • Introduction • Database • Evaluation • Bayes Minimum Risk • Experiments • Probability Calibration • Other applications • Conclusions & Future Work Agenda 27
  28. 28. Other Applications • Direct Marketing: Banking LTD offers http://archive.ics.uci.edu/ml/datasets/Bank+Marketing • Credit Scoring: 2011 Kaggle competition Give Me Some Credit http://www.kaggle.com/c/GiveMeSomeCredit/ • Credit Scoring: 2009 Pacific-Asia Knowledge Discovery and Data Mining conference (PAKDD)competition http://sede.neurotech.com.br:443/PAKDD2009/ 28
  29. 29. Other Applications • Direct Marketing: Banking LTD offers where int(i) is the expected interest gains of customer i • Datasets Cost Matrix True Class (𝑦𝑖) Accept (𝑦𝑖=1) Decline (𝑦𝑖=0) Predicted class (𝑝𝑖) Accept (𝑐𝑖=1) 𝐶 𝑇𝑃𝑖 = Ca 𝐶 𝐹𝑃𝑖 = Ca Decline (𝑐𝑖=0) 𝐶 𝐹𝑁𝑖 = Int(i) 𝐶 𝑇𝑁𝑖 = 0 29 Database Examples Acceptance Int Total 47,562 12.56% 394,211 Train 19,119 12.64% 156,676 Validation 11,809 12.78% 97,498 Test 11,815 12.23% 97,594
  30. 30. Other Applications • Direct Marketing: Banking LTD offers 0.00 0.10 0.20 0.30 0.40 - 4,000 8,000 12,000 16,000 No Model RF BMR CAL BMR DT BMR CAL BMR LR BMR CAL BMR RF DT LR Cost(Euros) Cost F1-Score 95,594 30 • Extra 13.4%decrease in cost by using calibrated probabilities
  31. 31. Other Applications • Credit Scoring where 𝑙𝑔𝑑 is the loss given default, 𝐶𝑙𝑖 is the credit line of client i, 𝑟𝑖 is the expected profit of client i, and 𝐶 𝐹𝑃 𝑎 is the expected cost of lending the money to an alternativeborrower. • Datasets Kaggle Credit Dataset PAKDD Credit Dataset Cost Matrix True Class (𝑦𝑖) Accept (𝑦𝑖=1) Decline (𝑦𝑖=0) Predicted class (𝑝𝑖) Accept (𝑐𝑖=1) 𝐶 𝑇𝑃𝑖 = 0 𝐶 𝐹𝑃𝑖 = 𝑟𝑖 + 𝐶 𝐹𝑃𝑎 Decline (𝑐𝑖=0) 𝐶 𝐹𝑁𝑖 = 𝐶𝑙 𝑖 ∗ 𝑙𝑔𝑑 𝐶 𝑇𝑁𝑖 = 0 31 Database Examples Default Total 112,915 6.74% Train 45,358 6.83% Validation 33,850 6.67% Test 33,707 6.71% Database Examples Default Total 38,969 19.88% Train 15,614 19.98% Validation 11,711 20.02% Test 11,644 19.63%
  32. 32. Other Applications • Credit Scoring Kaggle Credit Dataset PAKDD Credit Dataset 0.00 0.10 0.20 0.30 0.40 - 5.00 10.00 15.00 20.00 25.00 RF BMR CAL BMR Cost(MillionsEuros) Cost F1-Score 0.00 0.10 0.20 0.30 0.40 - 0.20 0.40 0.60 0.80 1.00 RF BMR CAL BMRCost(MillionsEuros) Cost F1-Score 32 • Extra 0.9% decrease in cost by using calibrated probabilities
  33. 33. Conclusion • Selecting models based on traditional statisticsdoes not give the best results in terms of cost • Models should be evaluated taking into account real financial costsof the application • Algorithms should be developed to incorporate those real financial costs • Calibration of probabilities yields to further decrease in cost 33
  34. 34. Future work • Example Dependent Cost Sensitive Decision Trees • Example-Dependent Cost-Sensitive Calibration Method • Applications: • Corporate credit risk • Involuntary & Voluntary Churn in TV subscription market 34
  35. 35. References • Correa Bahnsen, A., Stojanovic, A., Aouada, D., & Ottersten, B. (2013). Cost Sensitive Credit Card Fraud Detection using Bayes Minimum Risk. In International Conference on Machine Learning and Applications. Miami, USA: IEEE. • Correa Bahnsen, A., Stojanovic, A., Aouada, D., & Ottersten, B. (2014). Improving Credit Card Fraud Detection with Calibrated Probabilities. In SIAM International Conference on Data Mining. Philadelphia, USA: SIAM. • Correa Bahnsen, A., Aouada, D., & Ottersten, B. (2014). Example-Dependent Cost- Sensitive Credit Scoring using Bayes Minimum Risk. Submitted to ECAI 2014. 35
  36. 36. Contact information Alejandro Correa Bahnsen University of Luxembourg Luxembourg al.bahnsen@gmail.com http://www.linkedin.com/in/albahnsen http://www.slideshare.net/albahnsen 36

×