Team 7
Lalit Jain
Lipsa Panda
Sameer Goel
• Anomalies are data points that deviate from what is standard, normal, or expected, or that do not conform to an expected pattern.
This seems easy, so why even worry about it?
It is easy only if the following three conditions are met:
1. You have labeled training data.
2. The anomalous and normal classes are balanced (say, at least 1:5).
3. The data is not autocorrelated, that is, one data point does not depend on earlier data points. This often breaks in time series data.
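The third condition can be checked roughly by estimating the lag-1 autocorrelation of a series. The following is a minimal sketch, not part of the original slides; the function name `lag_autocorrelation` and the two toy series are assumptions made for illustration.

```python
def lag_autocorrelation(series, lag=1):
    """Sample autocorrelation of a numeric series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    if variance_sum == 0:
        return 0.0  # a constant series carries no correlation signal
    covariance_sum = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


# Hypothetical series: a steady trend (each point depends on the previous one)
# and a strictly alternating signal.
trend = [float(i) for i in range(20)]
alternating = [1.0, -1.0] * 10

print(lag_autocorrelation(trend))
print(lag_autocorrelation(alternating))
```

Values close to +1 or -1 suggest that consecutive points are not independent, so condition 3 probably does not hold.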
• Anomalies can be classified as Point, Collective, or Contextual.
• Point Anomaly
  • An individual data instance can be considered anomalous with respect to the rest of the data (e.g. a purchase with a large transaction value).
• Collective Anomaly
  • A collection of related data instances is anomalous with respect to the entire data set, but the individual values are not (e.g. a broken rhythm in an ECG).
• Contextual Anomaly
  • A data instance is anomalous in a specific context, but not otherwise: it is an anomaly only if it occurs at a certain time or in a certain region (e.g. a large spike in the middle of the night).
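The difference between a point anomaly and a contextual anomaly can be illustrated with a simple z-score rule. This is only a sketch under stated assumptions: the transaction amounts, the "day"/"night" context labels, and the 2.5 threshold are all made up for illustration, and collective anomalies are not covered.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standard scores of the values relative to their own mean and spread."""
    mu = mean(values)
    sigma = pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


def point_anomalies(values, threshold=2.5):
    """Indices of values that are extreme with respect to ALL the data."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]


def contextual_anomalies(values, contexts, threshold=2.5):
    """Indices of values that are extreme only within their own context."""
    groups = {}
    for i, context in enumerate(contexts):
        groups.setdefault(context, []).append(i)
    flagged = []
    for indices in groups.values():
        scores = z_scores([values[i] for i in indices])
        flagged.extend(i for i, z in zip(indices, scores) if abs(z) > threshold)
    return sorted(flagged)


# Hypothetical amounts: daytime purchases near 100, night purchases near 5,
# plus one night purchase of 90 (normal for daytime, unusual at night).
day = [95, 100, 105, 98, 102, 97, 103, 99, 101, 100, 96, 104]
night = [4, 5, 6, 5, 4, 6, 5, 5, 4, 6, 5, 90]
values = day + night
contexts = ["day"] * len(day) + ["night"] * len(night)

print(point_anomalies(values))                 # global view
print(contextual_anomalies(values, contexts))  # per-context view
print(point_anomalies(values + [5000]))        # one very large purchase
```

The night purchase of 90 is invisible to the global rule but stands out within its own context, while the purchase of 5000 is extreme against all the data.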
Anomaly Detection Technique (diagram)
Application Domains: Intrusion Detection, Fraud Detection, Traffic Analysis
Problem Characteristics: Labels, Anomaly Type, Nature of Data, Output
Research Areas: Machine Learning, Data Mining, Statistics, Information Theory, Spectral Theory
• Our dataset contains transactions made by credit cards in September 2013 by European cardholders; 492 of the 284,807 transactions are frauds.
Note: The dataset was provided to us already preprocessed and PCA-transformed due to confidentiality issues.
Target Variable: Class
0 → Normal Transactions (Non-Fraud)
1 → Fraud Transactions (Fraud)
• The data is highly skewed: the positive class (frauds) accounts for only 0.172% of all transactions.
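The skew can be recomputed directly from the two counts given for the dataset. The short sketch below uses only those counts; the variable names are chosen here for illustration.

```python
# Counts taken from the dataset description above.
TOTAL_TRANSACTIONS = 284_807
FRAUD_TRANSACTIONS = 492

fraud_rate = FRAUD_TRANSACTIONS / TOTAL_TRANSACTIONS
normal_per_fraud = (TOTAL_TRANSACTIONS - FRAUD_TRANSACTIONS) / FRAUD_TRANSACTIONS

print(f"Fraud rate: {fraud_rate:.3%}")
print(f"Normal transactions per fraud: {normal_per_fraud:.0f}")
```

The fraud rate comes out at about 0.17%, which is roughly 578 normal transactions per fraud and very far from the 1:5 balance mentioned earlier.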
1) Data sampling: The training instances are modified to produce a more or less balanced class distribution, which allows classifiers to perform as they would in standard classification. The options are to oversample the minority class, undersample the majority class, or synthesize new minority-class examples.
E.g. SMOTE, ROSE, EasyEnsemble, BalanceCascade, etc.
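The two simplest sampling options, random undersampling and random oversampling, can be sketched with the standard library alone. This is an illustrative sketch rather than the procedure used for the slides; the function names and the toy data are assumptions.

```python
import random


def random_undersample(majority, minority, seed=0):
    """Keep a random subset of the majority class, as large as the minority."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)), list(minority)


def random_oversample(majority, minority, seed=0):
    """Duplicate random minority examples until both classes are the same size."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra


# Hypothetical toy data: 100 normal examples and 5 fraud examples.
normal = list(range(100))
fraud = list(range(1000, 1005))

under_normal, under_fraud = random_undersample(normal, fraud)
over_normal, over_fraud = random_oversample(normal, fraud)
print(len(under_normal), len(under_fraud))
print(len(over_normal), len(over_fraud))
```

Resampling is normally applied to the training split only, so that the evaluation data keeps the real class distribution.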
2) Algorithmic modification: The base learning methods are adapted to be more attuned to class imbalance issues.
3) Cost-sensitive learning: These solutions work at the data level, at the algorithmic level, or at both levels combined. They assign a higher cost to misclassifying examples of the positive class than examples of the negative class, and therefore try to minimize the higher-cost errors.
E.g. CostSensitiveClassifier.
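One common way to make a decision cost-sensitive is to pick the label with the lower expected cost given a predicted fraud probability. The sketch below illustrates that idea only; it is not the CostSensitiveClassifier mentioned above, and the cost values (a missed fraud costing 100 times a false alarm) are assumptions.

```python
def min_cost_decision(p_fraud, cost_fn, cost_fp):
    """Return 1 (fraud) or 0 (normal), whichever has the lower expected cost.

    cost_fn: cost of missing a fraud (false negative)
    cost_fp: cost of flagging a normal transaction (false positive)
    """
    expected_cost_if_predict_normal = p_fraud * cost_fn
    expected_cost_if_predict_fraud = (1 - p_fraud) * cost_fp
    return 1 if expected_cost_if_predict_fraud < expected_cost_if_predict_normal else 0


def cost_threshold(cost_fn, cost_fp):
    """Probability above which predicting fraud is the cheaper choice."""
    return cost_fp / (cost_fp + cost_fn)


# Hypothetical costs: a missed fraud is 100 times worse than a false alarm.
print(cost_threshold(cost_fn=100, cost_fp=1))
print(min_cost_decision(0.05, cost_fn=100, cost_fp=1))
print(min_cost_decision(0.05, cost_fn=1, cost_fp=1))
```

With unequal costs the decision threshold moves well below 0.5, so a transaction with only a 5% fraud probability is still flagged.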
Generating artificial anomalies
• New rare-class examples are generated inside the regions of existing rare-class examples.
• Artificial anomalies are generated around the edges of the sparsely populated data regions.
• Classify synthetic outliers vs. real normal data using active learning.
Synthetic Minority Over-sampling Technique
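The core idea of SMOTE is to create a new minority example on the line segment between an existing minority example and one of its nearest minority neighbours. The following is a simplified sketch of that interpolation step, not the reference implementation from Chawla et al.; it picks base points at random instead of iterating over every minority example, and the function name `smote_sketch` and the toy points are assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours.

    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of the base point among the other minority points.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority class: the four corners of the unit square.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority_points, n_new=10, k=2)
print(new_points)
```

Because every synthetic point lies between two real minority points, the new examples stay inside the region already occupied by the minority class.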
The model looks highly accurate, with an accuracy of ~89%.
However, for anomaly detection we should consider the following metrics:
The Area Under the ROC Curve (AUC) is a good general statistic. It is equal to the probability that a random positive example will be ranked above a random negative example.
The F1 Score is the harmonic mean of precision and recall. It is commonly used in text processing when an aggregate measure is sought.
Cohen's Kappa is an evaluation statistic that takes into account how much agreement would be expected by chance.
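The three metrics can be written out from their definitions. The sketch below is illustrative; the function names are chosen here, and the confusion-matrix counts describe a hypothetical model that labels every transaction in the dataset as normal (0 frauds caught, 492 missed, 284,315 normal transactions correctly passed).

```python
def auc(scores, labels):
    """Probability that a random positive is scored above a random negative
    (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(tp, fp, fn, tn):
    """Agreement between predictions and labels, corrected for chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return 0.0 if expected == 1 else (observed - expected) / (1 - expected)


# Hypothetical "always predict normal" model on the dataset's class counts.
tp, fp, fn, tn = 0, 0, 492, 284_315
accuracy = (tp + tn) / (tp + fp + fn + tn)
print(f"accuracy={accuracy:.4f}")
print(f"f1={f1_score(tp, fp, fn)}")
print(f"kappa={cohen_kappa(tp, fp, fn, tn)}")

# Toy ranking example for AUC (scores and labels are made up).
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

A model that never flags fraud scores above 99.8% accuracy on this data, yet its F1 Score and Cohen's Kappa are both zero, which is why accuracy alone is misleading here.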
We varied the classification threshold over the range 0 to 0.5 and checked the AUC at each value.
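The slides do not show how the threshold sweep was implemented. One possible reading, sketched below, turns predicted scores into hard 0/1 labels at each threshold and computes the AUC of those hard labels, which for a single threshold equals (TPR + 1 - FPR) / 2. The function name, the scores, and the labels are all made up for illustration.

```python
def sweep_thresholds(scores, labels, thresholds):
    """For each threshold, return (threshold, TPR, FPR, single-threshold AUC).

    Assumes both classes are present in labels.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rows = []
    for t in thresholds:
        preds = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
        tpr = tp / n_pos
        fpr = fp / n_neg
        rows.append((t, tpr, fpr, (tpr + 1 - fpr) / 2))
    return rows


# Hypothetical predicted fraud scores and true labels.
scores = [0.05, 0.15, 0.2, 0.35, 0.45, 0.6, 0.7, 0.9]
labels = [0, 0, 0, 0, 1, 0, 1, 1]
thresholds = [i / 10 for i in range(6)]  # 0.0, 0.1, ..., 0.5

rows = sweep_thresholds(scores, labels, thresholds)
for t, tpr, fpr, auc_at_t in rows:
    print(f"threshold={t:.1f} TPR={tpr:.2f} FPR={fpr:.2f} AUC={auc_at_t:.2f}")

best = max(rows, key=lambda row: row[3])
```

On imbalanced data the best threshold is often below the default of 0.5, because lowering it trades a few extra false alarms for more detected frauds.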
1 → Fraud Transactions
0 → Non-Fraud Transactions
We were able to predict 98% of credit card frauds while maintaining high precision and recall.
Demo
1. Live Credit Card Fraud Detection – (SMOTE)
2. Single Transaction – (One Class SVM)
3. Batch Execution
1. G.E.A.P.A. Batista, R.C. Prati, M.C. Monard, A study of the behaviour of several methods for balancing machine learning training data, SIGKDD Explorations 6 (1) (2004) 20–29. doi: 10.1145/1007730.1007735; N.V. Chawla, K.W. Bowyer, L.O. Hall, W.P. Kegelmeyer, SMOTE: synthetic minority over-sampling technique, Journal of Artificial Intelligence Research 16 (2002) 321–357. doi: 10.1613/jair.953.
2. B. Zadrozny, C. Elkan, Learning and making decisions when costs and probabilities are both unknown, in: Proceedings of the 7th International Conference on Knowledge Discovery and Data Mining (KDD'01), 2001, pp. 204–213.
3. P. Domingos, MetaCost: a general method for making classifiers cost-sensitive, in: Proceedings of the 5th International Conference on Knowledge Discovery and Data Mining (KDD'99), 1999, pp. 155–164; B. Zadrozny, J. Langford, N. Abe, Cost-sensitive learning by cost-proportionate example weighting, in: Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM'03), 2003, pp. 435–442.
4. https://www.analyticsvidhya.com/blog/2016/03/practical-guide-deal-imbalanced-classification-problems/
Anomaly detection- Credit Card Fraud Detection
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATIONPRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
PRESENTATION ABOUT PRINCIPLE OF COSMATIC EVALUATION
 

Anomaly detection- Credit Card Fraud Detection

  • 1. Team 7: Lalit Jain, Lipsa Panda, Sameer Goel
  • 2. Anomalies are data points that deviate from what is standard, normal, or expected, or that do not conform to an expected pattern. This seems easy, so why even worry about it?
  • 3. The answer is yes if the following three conditions are met: 1. You have labeled training data. 2. Anomalous and normal classes are balanced (say, at least 1:5). 3. Data is not autocorrelated (that is, one data point does not depend on earlier data points; this condition often breaks in time series data).
  • 4. Anomalies can be classified as Point, Collective, or Contextual. Point Anomaly: an individual data instance that can be considered anomalous with respect to the rest of the data (e.g. a purchase with a large transaction value). Collective Anomaly: a collection of related data instances that is anomalous with respect to the entire data set, even though the individual values are not (e.g. a break in rhythm in an ECG). Contextual Anomaly: a data instance that is anomalous in a specific context but not otherwise, such as at a certain time or in a certain region (e.g. a large spike in the middle of the night).
  • 5. Application Domains: Intrusion Detection, Fraud Detection, Traffic Analysis. Problem Characteristics: Labels, Anomaly Type, Nature of Data, Output. Anomaly Detection Technique. Research Areas: Machine Learning, Data Mining, Statistics, Information Theory, Spectral Theory.
  • 6. Our dataset contains transactions made with credit cards in September 2013 by European cardholders, with 492 frauds out of 284,807 transactions. Note: the dataset was provided to us already preprocessed and PCA-transformed due to confidentiality issues. Target variable: Class. 0 → Normal Transactions (Non-Fraud); 1 → Fraud Transactions (Fraud).
  • 7. The data is highly skewed: the positive class (frauds) accounts for only 0.172% of all transactions.
  • 8. 1) Data sampling: the training instances are modified to produce a more or less balanced class distribution, which allows classifiers to perform in a similar manner to standard classification. Options: oversample the minority class, undersample the majority class, or synthesize new minority-class examples. E.g. SMOTE, ROSE, EasyEnsemble, BalanceCascade, etc.
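The two simplest sampling options on this slide, random oversampling and random undersampling, can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in the deck; the function names and the assumption that the minority class is labeled `1` and is the smaller class are ours.

```python
import random

def _split_indices(y, minority):
    """Return (minority_idx, majority_idx) for a list of labels."""
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    majority_idx = [i for i, label in enumerate(y) if label != minority]
    return minority_idx, majority_idx

def random_oversample(X, y, minority=1, seed=0):
    """Duplicate minority rows (sampled with replacement) until classes match."""
    rng = random.Random(seed)
    minority_idx, majority_idx = _split_indices(y, minority)
    n_extra = max(0, len(majority_idx) - len(minority_idx))
    extra = [rng.choice(minority_idx) for _ in range(n_extra)]
    keep = majority_idx + minority_idx + extra
    rng.shuffle(keep)
    return [X[i] for i in keep], [y[i] for i in keep]

def random_undersample(X, y, minority=1, seed=0):
    """Drop majority rows at random until classes match."""
    rng = random.Random(seed)
    minority_idx, majority_idx = _split_indices(y, minority)
    n_keep = min(len(majority_idx), len(minority_idx))
    keep = minority_idx + rng.sample(majority_idx, n_keep)
    rng.shuffle(keep)
    return [X[i] for i in keep], [y[i] for i in keep]
```

Resampling should be applied to the training split only; resampling before the train/test split would leak duplicated rows into the evaluation data.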
  • 9. 2) Algorithmic modification: this approach adapts the base learning methods so that they are better attuned to class imbalance. 3) Cost-sensitive learning: this type of solution works at the data level, at the algorithmic level, or at both levels combined. It assigns higher costs to misclassifying examples of the positive class than to misclassifying examples of the negative class, and therefore tries to minimize the higher-cost errors. E.g. CostSensitiveClassifier.
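One simple way to make a probabilistic classifier cost-sensitive is to move its decision threshold according to the misclassification costs. The sketch below assumes that correct predictions cost nothing, that `cost_fp` and `cost_fn` are the costs of a false alarm and a missed fraud, and that the model outputs calibrated fraud probabilities; the names and the example costs are illustrative and do not come from the deck.

```python
def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability above which flagging fraud has the lower expected cost.

    Flag when p * cost_fn >= (1 - p) * cost_fp,
    i.e. when p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)

def predict_with_costs(probs, cost_fp=1.0, cost_fn=10.0):
    """Label each fraud probability using the cost-derived threshold."""
    threshold = cost_sensitive_threshold(cost_fp, cost_fn)
    return [1 if p >= threshold else 0 for p in probs]
```

With a missed fraud costing nine times as much as a false alarm, the threshold drops from 0.5 to 0.1, so more borderline transactions are flagged.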
  • 10. Generating artificial anomalies: Synthetic Minority Over-sampling Technique (SMOTE): new rare class examples are generated inside the regions of existing rare class examples. Artificial anomalies are generated around the edges of the sparsely populated data regions. Classify synthetic outliers vs. real normal data using active learning.
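The core idea of SMOTE, generating new rare-class examples between existing ones, can be sketched in a few lines. This is a simplified standard-library illustration, not a replacement for a tested library implementation; the function name `smote_sketch` and the defaults `k=3` and `seed=0` are our assumptions, and it needs at least two minority points.

```python
import math
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

Because every synthetic point lies on a line segment between two real minority points, the new examples stay inside the region already occupied by the rare class.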
  • 11.
  • 12. The model looks highly accurate, with an accuracy of ~89%. However, for anomaly detection we should consider the following metrics. The Area Under the ROC Curve (AUC) is a good general statistic; it is equal to the probability that a random positive example will be ranked above a random negative example. The F1 Score is the harmonic mean of precision and recall; it is commonly used in text processing when an aggregate measure is sought. Cohen’s Kappa is an evaluation statistic that takes into account how much agreement would be expected by chance.
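The three metrics on this slide can be computed directly from their definitions. The sketch below uses only the standard library and assumes binary labels with `1` for fraud; the function names are ours, and the AUC is computed by brute force over all positive/negative pairs, which is fine for illustration but slow on large data.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def cohen_kappa(y_true, y_pred):
    """Observed agreement corrected for the agreement expected by chance."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0

def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative
    (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A classifier that labels every transaction as non-fraud on a set with 2 frauds in 100 reaches 98% accuracy, yet its F1 score and Cohen's Kappa are both 0, which is why accuracy alone is misleading on skewed data.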
  • 13.
  • 14. Varying the threshold over the range 0 to 0.5 and checking the AUC at each setting.
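A threshold sweep of this kind can be sketched as below. The slide does not show which quantities were recorded at each threshold, so this example assumes precision and recall of the thresholded predictions; the function name, the step size of 0.1, and the toy scores are illustrative only.

```python
def sweep_thresholds(y_true, scores, thresholds):
    """Return (threshold, precision, recall) for each candidate threshold."""
    rows = []
    for t in thresholds:
        y_pred = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 1)
        fp = sum(1 for a, p in zip(y_true, y_pred) if a == 0 and p == 1)
        fn = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        rows.append((t, precision, recall))
    return rows

# Candidate thresholds 0.0, 0.1, ..., 0.5, matching the range on the slide.
thresholds = [i / 10 for i in range(6)]
```

Lower thresholds flag more transactions, which raises recall at the expense of precision; the sweep makes that trade-off visible so a threshold can be chosen deliberately.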
  • 15. 1 → Fraud Transactions; 0 → Non-Fraud Transactions. We were able to predict 98% of credit card fraud while maintaining high precision and recall.
  • 16.
  • 17.
  • 18.
  • 19. Demo: 1. Live Credit Card Fraud Detection (SMOTE). 2. Single Transaction (One-Class SVM). 3. Batch Execution.
  • 20. 1. G.E.A.P.A. Batista, R.C. Prati, M.C. Monard, A study of the behaviour of several methods for balancing machine learning training data, SIGKDD Explorations 6 (1) (2004) 20–29, doi: 10.1145/1007730.1007735; N.V. Chawla, K.W. Bowyer, L.O. Hall, W.P. Kegelmeyer, SMOTE: synthetic minority over-sampling technique, Journal of Artificial Intelligence Research 16 (2002) 321–357, doi: 10.1613/jair.953. 2. B. Zadrozny, C. Elkan, Learning and making decisions when costs and probabilities are both unknown, in: Proceedings of the 7th International Conference on Knowledge Discovery and Data Mining (KDD’01), 2001, pp. 204–213. 3. P. Domingos, MetaCost: a general method for making classifiers cost-sensitive, in: Proceedings of the 5th International Conference on Knowledge Discovery and Data Mining (KDD’99), 1999, pp. 155–164; B. Zadrozny, J. Langford, N. Abe, Cost-sensitive learning by cost-proportionate example weighting, in: Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM’03), 2003, pp. 435–442. 4. https://www.analyticsvidhya.com/blog/2016/03/practical-guide-deal-imbalanced-classification-problems/