Distributed Data Mining In Credit Card Fraud Detection
Upcoming SlideShare
Loading in...5
×

Like this? Share it with your network

Share

Distributed Data Mining In Credit Card Fraud Detection

  • 690 views
Uploaded on

final Year Projects, Final Year Projects in Chennai, Software Projects, Embedded Projects, Microcontrollers Projects, DSP Projects, VLSI Projects, Matlab Projects, Java Projects, .NET Projects,......

final Year Projects, Final Year Projects in Chennai, Software Projects, Embedded Projects, Microcontrollers Projects, DSP Projects, VLSI Projects, Matlab Projects, Java Projects, .NET Projects, IEEE Projects, IEEE 2009 Projects, IEEE 2009 Projects, Software, IEEE 2009 Projects, Embedded, Software IEEE 2009 Projects, Embedded IEEE 2009 Projects, Final Year Project Titles, Final Year Project Reports, Final Year Project Review, Robotics Projects, Mechanical Projects, Electrical Projects, Power Electronics Projects, Power System Projects, Model Projects, Java Projects, J2EE Projects, Engineering Projects, Student Projects, Engineering College Projects, MCA Projects, BE Projects, BTech Projects, ME Projects, MTech Projects, Wireless Networks Projects, Network Security Projects, Networking Projects, final year projects, ieee projects, student projects, college projects, ieee projects in chennai, java projects, software ieee projects, embedded ieee projects, "ieee2009projects", "final year projects", "ieee projects", "Engineering Projects", "Final Year Projects in Chennai", "Final year Projects at Chennai", Java Projects, ASP.NET Projects, VB.NET Projects, C# Projects, Visual C++ Projects, Matlab Projects, NS2 Projects, C Projects, Microcontroller Projects, ATMEL Projects, PIC Projects, ARM Projects, DSP Projects, VLSI Projects, FPGA Projects, CPLD Projects, Power Electronics Projects, Electrical Projects, Robotics Projects, Solor Projects, MEMS Projects, J2EE Projects, J2ME Projects, AJAX Projects, Structs Projects, EJB Projects, Real Time Projects, Live Projects, Student Projects, Engineering Projects, MCA Projects, MBA Projects, College Projects, BE Projects, BTech Projects, ME Projects, MTech Projects, M.Sc Projects, Final Year Java Projects, Final Year ASP.NET Projects, Final Year VB.NET Projects, Final Year C# Projects, Final Year Visual C++ Projects, Final Year Matlab Projects, Final Year NS2 Projects, Final Year C Projects, Final Year Microcontroller Projects, Final Year ATMEL Projects, Final Year PIC Projects, Final Year ARM Projects, Final Year DSP Projects, Final Year VLSI Projects, Final Year FPGA Projects, Final Year CPLD Projects, Final Year Power Electronics Projects, Final Year Electrical Projects, Final Year Robotics Projects, Final Year Solor Projects, Final Year MEMS Projects, Final Year J2EE Projects, Final Year J2ME Projects, Final Year AJAX Projects, Final Year Structs Projects, Final Year EJB Projects, Final Year Real Time Projects, Final Year Live Projects, Final Year Student Projects, Final Year Engineering Projects, Final Year MCA Projects, Final Year MBA Projects, Final Year College Projects, Final Year BE Projects, Final Year BTech Projects, Final Year ME Projects, Final Year MTech Projects, Final Year M.Sc Projects, IEEE Java Projects, ASP.NET Projects, VB.NET Projects, C# Projects, Visual C++ Projects, Matlab Projects, NS2 Projects, C Projects, Microcontroller Projects, ATMEL Projects, PIC Projects, ARM Projects, DSP Projects, VLSI Projects, FPGA Projects, CPLD Projects, Power Electronics Projects, Electrical Projects, Robotics Projects, Solor Projects, MEMS Projects, J2EE Projects, J2ME Projects, AJAX Projects, Structs Projects, EJB Projects, Real Time Projects, Live Projects, Student Projects, Engineering Projects, MCA Projects, MBA Projects, College Projects, BE Projects, BTech Projects, ME Projects, MTech Projects, M.Sc Projects, IEEE 2009 Java Projects, IEEE 2009 ASP.NET Projects, IEEE 2009 VB.NET Projects, IEEE 2009 C# Projects, IEEE 2009 Visual C++ Projects, IEEE 2009 Matlab Projects, IEEE 2009 NS2 Projects, IEEE 2009 C Projects, IEEE 2009 Microcontroller Projects, IEEE 2009 ATMEL Projects, IEEE 2009 PIC Projects, IEEE 2009 ARM Projects, IEEE 2009 DSP Projects, IEEE 2009 VLSI Projects, IEEE 2009 FPGA Projects, IEEE 2009 CPLD Projects, IEEE 2009 Power Electronics Projects, IEEE 2009 Electrical Projects, IEEE 2009 Robotics Projects, IEEE 2009 Solor Projects, IEEE 2009 MEMS Projects, IEEE 2009 J2EE P

More in: Technology , Education
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
    Be the first to like this
No Downloads

Views

Total Views
690
On Slideshare
690
From Embeds
0
Number of Embeds
0

Actions

Shares
Downloads
17
Comments
0
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. DISTRIBUTED DATA MINING IN CREDIT CARD FRAUD DETECTION INTRODUCTION Credit card transactions grow in number, taking a larger share of any country’s payment system and this is turn has led to a higher rate of stolen account numbers and subsequent losses by banks. Hence, improved fraud detection has become essential to maintain the viability of the country’s payment system. Banks have used early fraud warning systems for some years. Large- scale data-mining techniques can improve on the state of the art in commercial practice. Scalable techniques to analyze massive amounts of transaction data that efficiently compute fraud detectors in a timely manner is an important problem, especially for e- commerce. Besides scalability and efficiency, the fraud-detection task exhibits technical problems that include skewed distributions of training data and non-uniform cost per error, both of which have not been widely studied in the knowledge-discovery and datamining community. In this project, a deep survey is made and evaluates a number of techniques that address these three main issues concurrently. Our proposed methods of combining multiple learned fraud detectors under a “cost model” are general and demonstrably useful; our empirical results demonstrate that we can significantly reduce loss due to fraud through distributed data mining of fraud models. DATA MINING AND MACHINE LEARNING The aim of data mining is to extract knowledge from large amounts of data. This knowledge is nontrivial and hidden in the data. Machine learning is often used in data mining.
  • 2. DATA MINING: A DEFINITION Art/Science of uncovering non-trivial, valuable information from a large database Emphasis on: Non-obvious (difficult) Useful (cost vs benefit) Large (automatic) Yet, no rules, provided that the process is efficient in time, space and human resources. Data Mining is the process of finding interesting trends or patterns in large datasets in order to guide future decisions. Related to exploratory data analysis (area of statistics) and knowledge discovery (area in artificial intelligence, machine learning). Data Mining is characterized by having VERY LARGE datasets. DATA MINING VS. MACHINE LEARNING Size: Databases are usually very large so algorithms must scale well Design Purpose: Databases are not usually designed for data mining (but for other purposes), and thus, may not have convenient attributes Errors and Noise: Databases almost always contain errors The aim of machine learning is to adapt to new circumstances, to detect and extrapolate. A distinction can be made between unsupervised and supervised machine learning algorithms.
  • 3. PROPOSED SYSTEM In today’s increasingly electronic society and with the rapid advances of electronic commerce on the Internet, the use of credit cards for purchases has become convenient and necessary. Credit card transactions have become the de facto standard for Internet and Webbased e-commerce. The US government estimates that credit cards accounted for approximately US $13 billion in Internet sales during 1998. This figure is expected to grow rapidly each year. However, the growing number of credit card transactions provides more opportunity for thieves to steal credit card numbers and subsequently commit fraud. When banks lose money because of credit card fraud, cardholders pay for all of that loss through higher interest rates, higher fees, and reduced benefits. Hence, it is in both the banks’ and cardholders’ interest to reduce illegitimate use of credit cards by early fraud detection. For many years, the credit card industry has studied computing models for automated detection systems; recently, these models have been the subject of academic research, especially with respect to e- commerce. The credit card fraud-detection domain presents a number of challenging issues for data mining: There are millions of credit card transactions processed each day. Mining such massive amounts of data requires highly efficient techniques that scale. The data are highly skewed—many more transactions are legitimate than fraudulent. Typical accuracy-based mining techniques can generate highly accurate fraud detectors by simply predicting that all transactions are legitimate, although this is equivalent to not detecting fraud at all.
  • 4. Each transaction record has a different dollar amount and thus has a variable potential loss, rather than a fixed misclassification cost per error type, as is commonly assumed in cost-based mining techniques. Our approach addresses the efficiency and scalability issues in several ways. We divide large data set of labeled transactions (either fraudulent or legitimate) into smaller subsets, apply mining techniques to generate classifiers in parallel, and combine the resultant base models by metalearning from the classifiers’ behavior to generate a metaclassifier. Our approach treats the classifiers as black boxes so that we can employ a variety of learning algorithms. Besides extensibility, combining multiple models computed over all available data produces metaclassifiers that can offset the loss of predictive performance that usually occurs when mining from data subsets or sampling. Furthermore, when we use the learned classifiers (for example, during transaction authorization), the base classifiers can execute in parallel, with the metaclassifier then combining their results. So, our approach is highly efficient in generating these models and also relatively efficient in applying them. Another parallel approach focuses on parallelizing a particular algorithm on a particular parallel architecture. However, a new algorithm or architecture requires a substantial amount of parallel- programming work. Although our architecture and algorithm-independent approach is not as efficient as some fine-grained parallelization approaches, it lets users plug different off-the-shelf learning programs into a parallel and distributed environment with relative ease and eliminates the need for expensive parallel hardware.
  • 5. We are going to use the ADACost algorithm. SOFTWARE TOOLS • ASP .NET • Oracle Database HARDWARE TOOLS • Pentium Server with Client