SlideShare a Scribd company logo
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1359
Credit Card Fraud Detection Using Machine Learning & Data Science
Ishika Sharma1 Shivjyoti Dalai2, Venktesh Tiwari3, Ishwari Singh 4, Seema Kharb5
1,2,3 Students, Computer Science Engineering, SRM University, Sonipat
4Asst. Professor, Dept. of Computer Science Engineering, SRM University, Haryana,
5Asst. Professor, Dept. of Computer Science Engineering, SRM University, Haryana, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - A method for 'Credit Card Fraud Detection' is
created in this study. As the number of scammers grows every
day. Credit cards are used for fraudulent transactions, and
there are several sorts of fraud. As a result, various techniques
such as Logistic Regression, Random Forest, and Naive Bayes
are utilized to tackle this problem. This transaction is
evaluated individually, and whateverworksbestiscarried out.
The primary purpose is to detect fraud by filtering the
aforementioned strategies in order to achieve a better
outcome.
Key Words: Credit Card, Fraud Detection,RandomForest,
Naïve Bayes, Logistic Regression.
1. INTRODUCTION
Credit card fraud is a broad word for theft and fraud
perpetrated using or utilizing a credit card at the moment of
payment. The goal may be to buy something without paying
for it or withdraw money from an account without
permission. Identity theft is often accompanied by credit
card fraud. According to the Federal Trade Commission of
the United States, the rate of identity theft remained steady
during the mid-2000s, but it jumped by 21% in 2008. Even
though credit card fraud, the crime most people connect
with ID theft, fell to a fraction of total ID theft complaints in
2000, roughly 10 million transactions, or one out of every
1300, were fraudulent. In addition, 0.05 percent (5 out of
10,000) of all monthly active accounts were fake. Today,
fraud detection systems keep track of a twelfth of one
percent of all transactions performed, resulting in billions of
dollars in losses. Credit card fraud is one of the most serious
issues facing businesses today. However, to successfully
detect fraud, it is necessary first to comprehend the
processes of fraud execution. Fraudsters use a variety of
methods to perpetrate credit cardfraud.CreditCardFraudis
described as "when an individual uses another person's
credit card for personal reasons while the card owner and
the card issuer are unaware that the card is being used."
Theft of the actual card or the critical data linked with the
account, such as the card account number or other
information that must be given to a merchant during a valid
transaction, is where card fraud begins. Card numbers,
usually the Primary Account Number (PAN), are often
reproduced on the card, and the data is stored in machine-
readable format on a magnetic stripe on the reverse.
2. METHODOLOGY
This part should provide the method and analysis used in
your research project. Using keywords from your title in the
first few phrases is a simple and effective method to follow.
A. Data Collection
The data-gathering phase is the first step in the project; this
dataset comprises a collectionoftransactions,someofwhich
are real and others are fraudulent. The data-gatheringphase
is the first step in the project; this dataset includes a
collection of transactions, some of which are real and others
that are fraudulent. The data-gatheringphaseisthefirststep
in the project; this dataset comprises a collection of
transactions, some of which are real and others are
fraudulent.
B. Credit Card Dataset
A credit card transaction data set was gathered via Kaggle,
and it comprises a total of 2,84,808 credit card transactions
from a European bank. It divides transactions into "positive
class" and "negative class." The data set is highly skewed,
with roughly 0.172 percent of transactions being fraudulent
and the remainder being legitimate; this indicates that just
492 of the 2,84,808 transactions are fraudulent, and the rest
are genuine ones. So, we oversampled to balance the data
set, resulting in 60% of fraud transactions and 40% genuine
ones.
C. Preprocessing of Dataset
Selected data is formatted, cleaned, and sampled in this
module. The following are some of the data pre-processing
steps:
a) Formatting: The chosen data might not be in the correct
format. We may prefer data in a file format over a relational
database or vice versa.
b) Cleaning is the process of removing or correcting missing
data. The dataset may contain recordsthat areincompleteor
have null values. Such records must be deleted.
c) Sampling: The class distribution in credit card
transactions is uneven because the number of frauds in the
dataset is fewer than the total number of transactions. As a
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1360
result, the sampling approach is utilized to tackle this
problem.
D. Loading of Dataset
The dataset is loaded after it has been pre-processed.
Various library functions can be used to load the dataset. In
this case, we used the read CSV method of Python's Pandas
library to load a dataset in CSV or Microsoft Excel format; in
terms of python, it is called a DATAFRAME. dataset =
pd.read_csv('creditcard.csv')
E. Splitting of Dataset
To compensate for the dataset's imbalance, we used the
ADASYN oversampling technique, which oversamples both
the number of fraudulent and genuine transactions to a
specific number, resulting in a positive and negative range
that is nearly equal. After the dataset is oversampled, the
samples are split into Train and Test data. A suitable ratio is
to be performed for the model (Usually, 70% for Train data
and 30% for Test data are chosen, anyone can choose their
ration). The train dataset can be further split into train data
and validation data.
F. Building Model
After the data has been split into train and test data which is
70% and 30%, respectively, the training data is now utilized
for the model building. The dataset contains 31 features,out
of which 30 features or columns are the independent
features, and the last column called the CLASS column, is the
dependent feature. So here, the dataset is split into four
categories: xtrain, ytrain, xtest, and ytest, representing
independent training features, dependent training features,
independent test features, and dependent test features.
G. Algorithms
a) Logistic Regression -Regression is a regression model
that analyses therelationshipbetweenmultipleindependent
variables and has a categorical dependent variable. There
are many different logistic regression models, including
binary, multiple, and binomial logistic models. The Binary
Logistic Regression model calculates the likelihood of a
binary response based on one or more predictors.
Fig 1- Logistic Regression expression
The above equation represents the logistic regression in
mathematical form.
b) Random Forest - Random Forest can be used to rank the
importance of variables in a regression or classification
problem in a natural way. Random forest is a tree-based
algorithm that createsseveral treesandcombinesthe results
to improve the model's generalization ability. An ensemble
method is a technique for combining trees. Ensembling is
nothing more than putting togethera groupofweak learners
(individual trees) to create a stronglearner.RandomForests
can be used to solve problems involving regression and
classification. The dependent variable in regression
problems is continuous. The dependent variable in
classification problems is categorical.
Fig 2- Random Forest expression
c) Naïve Bayes - A Bayesian classifier is a statistical method
that uses Bayes' theorem to calculate the probability that a
feature belongs to a specific class. It is referred to as naive
because it assumes that the possibilities of individual
components are independent of one another, which is
extremely unlikely to occur inthereal world.Theprobability
of an event occurring is calculated by considering the
likelihood of another event occurring. It's possibletowriteit
as:
Fig 3- Naïve Bayes expression
Where the posterior probability of target class c P(c|X) is
calculated from P(c), P(X|c), and P(X).
H. Training of Model
After building the model, the model is trained using Train
data and validation data. The model is trained using the
library function fit () function.
I. Evaluating the Model The model can be evaluated by using
various metrics. These are
a) Interpreting Loss and Validationloss -Lossistheresult
of a bad prediction. A loss is a number indicating how bad
the model's prediction was on a single example. Loss can be
validation loss and training loss.
b) Interpreting Accuracy and Validation Accuracy -
Validation accuracy and accuracy need to be converged in a
good model.
c) Confusion Matrix - A confusion matrix summarizes
classification problem prediction results. The correct and
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1361
incorrect predictions are totaled and broken down by class
using count values. The confusion matrix's key is this. The
confusion matrix depicts the various ways in which your
classification model becomes perplexed when making
predictions. It reveals the number of errors made by a
classifier and the types of errors made.
Here,
•Class 1: Positive
•Class 2: Negative Classification Rate/ Accuracy
Classification Rate or Accuracy is given by the relation:
Fig 4- Accuracy Expression
Recall - Recall can be defined as the ratio of the total number
of correctly classified positive examples divided by the total
number of positive examples. High Recall indicates the class
is correctly recognized (a small number of FN).
Fig 5- Recall Expression
Precision - To get the precision value, we divide the total
number of correctly classified positive examples by thetotal
number of predicted positive examples. High precision
indicates that an example labeled as positive is positive (a
small number of FP).
Fig 6- Precision Expression
J. Saving the Model
After building the model, the model is saved to our device.
The model can be saved in .pkl format or .h5 format. To save
the model in .pkl format, python provides usa librarynamed
Pickle, and to save it in .h5 format, the Tensorflow library is
used. I have used the Pickle library and saved the models in
.pkl format.
3. MODEL AND ANALYSIS
Fig 8- RandomForest Classifier
Fig 9: Naïve Bayes model
4. RESULTS AND DISCUSSION
The accuracy results we got from the three algorithms are
shown in the following table below.
Table- Accuracy Results
Fig 7- Logistic Regression
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1362
Training vs. Test Data in Dataset
Fig 10- Training vs. Test Data
Normal vs. Fraud Transaction after Oversampling
Fig 11- Fraudulent vs. non-Fraudulent
Correlation matrix
Fig12- Correlation Matrix
4. CONCLUSIONS
Various machine learning algorithms for detecting fraud in
credit card transactions were reviewed in this paper. The
accuracy, precision, and specificity metrics are used to
evaluate the performance of this technique. To classify the
transaction as fraudulent or authorized, I used three
supervised learning techniques: Logistic Regression,
Random Forest, and Naive Bayes. Using feedback and
delayed supervised training, these classifiers were trained
on a delayed supervised sample dataset of almost 284807
transaction records. Due to the massive imbalance, the
dataset was subjected to an Oversampling technique, which
resulted in the number of fraud and normal transactions
being nearly equal. The training and test data were tested
using the three Models, and the results were obtained. The
accuracy of the Random Forest, Logistic Regression, and
Naive Bayes was 99.27%, 91.20%, and89.40%, respectively.
From the Above project, itcanbeconcludedthattheRandom
Forest model is somewhat trustworthy, and its accuracy
could be improved further with a larger and more balanced
dataset. If some other algorithms can be combined with this
one to form a Hybrid Algorithm, the results will be even
better.
ACKNOWLEDGEMENT
The success and outcome of this project required a lot of
guidance and assistance from many people, and we are
highly privileged to have got this all along with the
completion of our project. All that we have done is only due
to such supervision and assistance, and we will not forget to
thank them.
We are extremely grateful to Dr. Paramjit S. Jaswal, Vice-
Chancellor, SRM University, and Dr.PuneetGoswami,Head
of the Department, Department of Computer Science and
Engineering, for providing all the required resources for the
completion of my seminar.
Our heartfelt gratitude to our guide Dr. Ishwari Singh, for
their valuable suggestion and guidance in preparing the
research paper. Last but not least, we would express our
obligation to all the people who have worked extensively on
the topic and make the content available for free to all the
aspiring people who want to grow in their community. I
would say this report can be helpful to any aspiring student
who wants to gain an overall idea about how high-
performance computing works in practical life.
REFERENCES
[1] T. Mohana Priya, Dr. M. Punithavalli & Dr. R. Rajesh
Kanna, Machine Learning Algorithm forDevelopmentof
Enhanced Support Vector MachineTechniquetoPredict
Stress, Global Journal of Computer Science and
Technology: C Software & Data Engineering, Volume20,
Issue 2, No. 2020, pp 12-20
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1363
[2] Ganesh Kumar and P.Vasanth Sena, “Novel Artificial
Neural Networks and Logistic Approach for Detecting
Credit Card Deceit,” International Journal of Computer
Science and Network Security, Vol. 15, issue 9, Sep.
2015, pp. 222-234
[3] Gyusoo Kim and Seulgi Lee, “2014 Payment Research”,
Bank of Korea, Vol. 2015, No. 1, Jan. 2015.
[4] Chengwei Liu, Yixiang Chan, Syed Hasnain Alam Kazmi,
Hao Fu, “Financial Fraud Detection Model: Based on
Random Forest,” International Journal ofEconomicsand
Finance, Vol. 7, Issue. 7, pp. 178-188, 2015.
[5] Hitesh D. Bambhava, Prof. Jayeshkumar Pitroda, Prof.
Jaydev J. Bhavsar (2013), “A Comparative Study on
Bamboo Scaffolding And Metal Scaffolding in
Construction Industry Using Statistical Methods,"
International Journal of Engineering Trends and
Technology (IJETT) – Volume 4, Issue 6, June 2013,
Pg.2330-2337.
[6] P. Ganesh Prabhu, D. Ambika, "Study on Behaviour of
Workers in Construction Industry to Improve
Production Efficiency," International Journal of Civil,
Structural, Environmental and Infrastructure
Engineering Research and Development (IJCSEIERD),
Vol. 3, Issue 1, Mar 2013, 59-66
[7] Manideep, A. P. S., and Seema Kharb. "A Comparative
Analysis of Machine Learning Prediction Techniquesfor
Crop Yield Prediction in India." Turkish Journal of
Computer andMathematicsEducation(TURCOMAT) 13.2
(2022): 120-133.

More Related Content

Similar to Credit Card Fraud Detection Using Machine Learning & Data Science

A Comparative Study on Credit Card Fraud Detection
A Comparative Study on Credit Card Fraud DetectionA Comparative Study on Credit Card Fraud Detection
A Comparative Study on Credit Card Fraud Detection
IRJET Journal
 
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
IRJET Journal
 
Online Transaction Fraud Detection System Based on Machine Learning
Online Transaction Fraud Detection System Based on Machine LearningOnline Transaction Fraud Detection System Based on Machine Learning
Online Transaction Fraud Detection System Based on Machine Learning
IRJET Journal
 
ATM fraud detection system using machine learning algorithms
ATM fraud detection system using machine learning algorithmsATM fraud detection system using machine learning algorithms
ATM fraud detection system using machine learning algorithms
IRJET Journal
 
A Comparative Study for Credit Card Fraud Detection System using Machine Lear...
A Comparative Study for Credit Card Fraud Detection System using Machine Lear...A Comparative Study for Credit Card Fraud Detection System using Machine Lear...
A Comparative Study for Credit Card Fraud Detection System using Machine Lear...
IRJET Journal
 
A Research Paper on Credit Card Fraud Detection
A Research Paper on Credit Card Fraud DetectionA Research Paper on Credit Card Fraud Detection
A Research Paper on Credit Card Fraud Detection
IRJET Journal
 
Tax Prediction Using Machine Learning
Tax Prediction Using Machine LearningTax Prediction Using Machine Learning
Tax Prediction Using Machine Learning
IRJET Journal
 
In Banking Loan Approval Prediction Using Machine Learning
In Banking Loan Approval Prediction Using Machine LearningIn Banking Loan Approval Prediction Using Machine Learning
In Banking Loan Approval Prediction Using Machine Learning
IRJET Journal
 
IRJET- Credit Card Fraud Detection using Random Forest
IRJET-  	  Credit Card Fraud Detection using Random ForestIRJET-  	  Credit Card Fraud Detection using Random Forest
IRJET- Credit Card Fraud Detection using Random Forest
IRJET Journal
 
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
IRJET Journal
 
A predictive system for detection of bankruptcy using machine learning techni...
A predictive system for detection of bankruptcy using machine learning techni...A predictive system for detection of bankruptcy using machine learning techni...
A predictive system for detection of bankruptcy using machine learning techni...
IJDKP
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learning
dataalcott
 
IRJET- Survey on Credit Card Security System for Bank Transaction using N...
IRJET-  	  Survey on Credit Card Security System for Bank Transaction using N...IRJET-  	  Survey on Credit Card Security System for Bank Transaction using N...
IRJET- Survey on Credit Card Security System for Bank Transaction using N...
IRJET Journal
 
B05840510
B05840510B05840510
B05840510
IOSR-JEN
 
B05840510
B05840510B05840510
B05840510
IOSR-JEN
 
MACHINE LEARNING CLASSIFIERS TO ANALYZE CREDIT RISK
MACHINE LEARNING CLASSIFIERS TO ANALYZE CREDIT RISKMACHINE LEARNING CLASSIFIERS TO ANALYZE CREDIT RISK
MACHINE LEARNING CLASSIFIERS TO ANALYZE CREDIT RISK
IRJET Journal
 
IRJET- Competitive Analysis of Attacks on Social Media
IRJET-  	 Competitive Analysis of Attacks on Social MediaIRJET-  	 Competitive Analysis of Attacks on Social Media
IRJET- Competitive Analysis of Attacks on Social Media
IRJET Journal
 
IRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection Analysis
IRJET Journal
 
IRJET- Financial Analysis using Data Mining
IRJET- Financial Analysis using Data MiningIRJET- Financial Analysis using Data Mining
IRJET- Financial Analysis using Data Mining
IRJET Journal
 
Detection of credit card fraud
Detection of credit card fraudDetection of credit card fraud
Detection of credit card fraud
Bastiaan Frerix
 

Similar to Credit Card Fraud Detection Using Machine Learning & Data Science (20)

A Comparative Study on Credit Card Fraud Detection
A Comparative Study on Credit Card Fraud DetectionA Comparative Study on Credit Card Fraud Detection
A Comparative Study on Credit Card Fraud Detection
 
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti...
 
Online Transaction Fraud Detection System Based on Machine Learning
Online Transaction Fraud Detection System Based on Machine LearningOnline Transaction Fraud Detection System Based on Machine Learning
Online Transaction Fraud Detection System Based on Machine Learning
 
ATM fraud detection system using machine learning algorithms
ATM fraud detection system using machine learning algorithmsATM fraud detection system using machine learning algorithms
ATM fraud detection system using machine learning algorithms
 
A Comparative Study for Credit Card Fraud Detection System using Machine Lear...
A Comparative Study for Credit Card Fraud Detection System using Machine Lear...A Comparative Study for Credit Card Fraud Detection System using Machine Lear...
A Comparative Study for Credit Card Fraud Detection System using Machine Lear...
 
A Research Paper on Credit Card Fraud Detection
A Research Paper on Credit Card Fraud DetectionA Research Paper on Credit Card Fraud Detection
A Research Paper on Credit Card Fraud Detection
 
Tax Prediction Using Machine Learning
Tax Prediction Using Machine LearningTax Prediction Using Machine Learning
Tax Prediction Using Machine Learning
 
In Banking Loan Approval Prediction Using Machine Learning
In Banking Loan Approval Prediction Using Machine LearningIn Banking Loan Approval Prediction Using Machine Learning
In Banking Loan Approval Prediction Using Machine Learning
 
IRJET- Credit Card Fraud Detection using Random Forest
IRJET-  	  Credit Card Fraud Detection using Random ForestIRJET-  	  Credit Card Fraud Detection using Random Forest
IRJET- Credit Card Fraud Detection using Random Forest
 
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
 
A predictive system for detection of bankruptcy using machine learning techni...
A predictive system for detection of bankruptcy using machine learning techni...A predictive system for detection of bankruptcy using machine learning techni...
A predictive system for detection of bankruptcy using machine learning techni...
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learning
 
IRJET- Survey on Credit Card Security System for Bank Transaction using N...
IRJET-  	  Survey on Credit Card Security System for Bank Transaction using N...IRJET-  	  Survey on Credit Card Security System for Bank Transaction using N...
IRJET- Survey on Credit Card Security System for Bank Transaction using N...
 
B05840510
B05840510B05840510
B05840510
 
B05840510
B05840510B05840510
B05840510
 
MACHINE LEARNING CLASSIFIERS TO ANALYZE CREDIT RISK
MACHINE LEARNING CLASSIFIERS TO ANALYZE CREDIT RISKMACHINE LEARNING CLASSIFIERS TO ANALYZE CREDIT RISK
MACHINE LEARNING CLASSIFIERS TO ANALYZE CREDIT RISK
 
IRJET- Competitive Analysis of Attacks on Social Media
IRJET-  	 Competitive Analysis of Attacks on Social MediaIRJET-  	 Competitive Analysis of Attacks on Social Media
IRJET- Competitive Analysis of Attacks on Social Media
 
IRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection AnalysisIRJET- Credit Card Fraud Detection Analysis
IRJET- Credit Card Fraud Detection Analysis
 
IRJET- Financial Analysis using Data Mining
IRJET- Financial Analysis using Data MiningIRJET- Financial Analysis using Data Mining
IRJET- Financial Analysis using Data Mining
 
Detection of credit card fraud
Detection of credit card fraudDetection of credit card fraud
Detection of credit card fraud
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
IRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
IRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
IRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
IRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
IRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
IRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
IRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
IRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
BrazilAccount1
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
Robbie Edward Sayers
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
AafreenAbuthahir2
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
Kamal Acharya
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
AhmedHussein950959
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
MdTanvirMahtab2
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Dr.Costas Sachpazis
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
Kerry Sado
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
space technology lecture notes on satellite
space technology lecture notes on satellitespace technology lecture notes on satellite
space technology lecture notes on satellite
ongomchris
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
seandesed
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 

Recently uploaded (20)

AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
space technology lecture notes on satellite
space technology lecture notes on satellitespace technology lecture notes on satellite
space technology lecture notes on satellite
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
 

Credit Card Fraud Detection Using Machine Learning & Data Science

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1359 Credit Card Fraud Detection Using Machine Learning & Data Science Ishika Sharma1 Shivjyoti Dalai2, Venktesh Tiwari3, Ishwari Singh 4, Seema Kharb5 1,2,3 Students, Computer Science Engineering, SRM University, Sonipat 4Asst. Professor, Dept. of Computer Science Engineering, SRM University, Haryana, 5Asst. Professor, Dept. of Computer Science Engineering, SRM University, Haryana, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - A method for 'Credit Card Fraud Detection' is created in this study. As the number of scammers grows every day. Credit cards are used for fraudulent transactions, and there are several sorts of fraud. As a result, various techniques such as Logistic Regression, Random Forest, and Naive Bayes are utilized to tackle this problem. This transaction is evaluated individually, and whateverworksbestiscarried out. The primary purpose is to detect fraud by filtering the aforementioned strategies in order to achieve a better outcome. Key Words: Credit Card, Fraud Detection,RandomForest, Naïve Bayes, Logistic Regression. 1. INTRODUCTION Credit card fraud is a broad word for theft and fraud perpetrated using or utilizing a credit card at the moment of payment. The goal may be to buy something without paying for it or withdraw money from an account without permission. Identity theft is often accompanied by credit card fraud. According to the Federal Trade Commission of the United States, the rate of identity theft remained steady during the mid-2000s, but it jumped by 21% in 2008. Even though credit card fraud, the crime most people connect with ID theft, fell to a fraction of total ID theft complaints in 2000, roughly 10 million transactions, or one out of every 1300, were fraudulent. In addition, 0.05 percent (5 out of 10,000) of all monthly active accounts were fake. Today, fraud detection systems keep track of a twelfth of one percent of all transactions performed, resulting in billions of dollars in losses. Credit card fraud is one of the most serious issues facing businesses today. However, to successfully detect fraud, it is necessary first to comprehend the processes of fraud execution. Fraudsters use a variety of methods to perpetrate credit cardfraud.CreditCardFraudis described as "when an individual uses another person's credit card for personal reasons while the card owner and the card issuer are unaware that the card is being used." Theft of the actual card or the critical data linked with the account, such as the card account number or other information that must be given to a merchant during a valid transaction, is where card fraud begins. Card numbers, usually the Primary Account Number (PAN), are often reproduced on the card, and the data is stored in machine- readable format on a magnetic stripe on the reverse. 2. METHODOLOGY This part should provide the method and analysis used in your research project. Using keywords from your title in the first few phrases is a simple and effective method to follow. A. Data Collection The data-gathering phase is the first step in the project; this dataset comprises a collectionoftransactions,someofwhich are real and others are fraudulent. The data-gatheringphase is the first step in the project; this dataset includes a collection of transactions, some of which are real and others that are fraudulent. The data-gatheringphaseisthefirststep in the project; this dataset comprises a collection of transactions, some of which are real and others are fraudulent. B. Credit Card Dataset A credit card transaction data set was gathered via Kaggle, and it comprises a total of 2,84,808 credit card transactions from a European bank. It divides transactions into "positive class" and "negative class." The data set is highly skewed, with roughly 0.172 percent of transactions being fraudulent and the remainder being legitimate; this indicates that just 492 of the 2,84,808 transactions are fraudulent, and the rest are genuine ones. So, we oversampled to balance the data set, resulting in 60% of fraud transactions and 40% genuine ones. C. Preprocessing of Dataset Selected data is formatted, cleaned, and sampled in this module. The following are some of the data pre-processing steps: a) Formatting: The chosen data might not be in the correct format. We may prefer data in a file format over a relational database or vice versa. b) Cleaning is the process of removing or correcting missing data. The dataset may contain recordsthat areincompleteor have null values. Such records must be deleted. c) Sampling: The class distribution in credit card transactions is uneven because the number of frauds in the dataset is fewer than the total number of transactions. As a
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1360 result, the sampling approach is utilized to tackle this problem. D. Loading of Dataset The dataset is loaded after it has been pre-processed. Various library functions can be used to load the dataset. In this case, we used the read CSV method of Python's Pandas library to load a dataset in CSV or Microsoft Excel format; in terms of python, it is called a DATAFRAME. dataset = pd.read_csv('creditcard.csv') E. Splitting of Dataset To compensate for the dataset's imbalance, we used the ADASYN oversampling technique, which oversamples both the number of fraudulent and genuine transactions to a specific number, resulting in a positive and negative range that is nearly equal. After the dataset is oversampled, the samples are split into Train and Test data. A suitable ratio is to be performed for the model (Usually, 70% for Train data and 30% for Test data are chosen, anyone can choose their ration). The train dataset can be further split into train data and validation data. F. Building Model After the data has been split into train and test data which is 70% and 30%, respectively, the training data is now utilized for the model building. The dataset contains 31 features,out of which 30 features or columns are the independent features, and the last column called the CLASS column, is the dependent feature. So here, the dataset is split into four categories: xtrain, ytrain, xtest, and ytest, representing independent training features, dependent training features, independent test features, and dependent test features. G. Algorithms a) Logistic Regression -Regression is a regression model that analyses therelationshipbetweenmultipleindependent variables and has a categorical dependent variable. There are many different logistic regression models, including binary, multiple, and binomial logistic models. The Binary Logistic Regression model calculates the likelihood of a binary response based on one or more predictors. Fig 1- Logistic Regression expression The above equation represents the logistic regression in mathematical form. b) Random Forest - Random Forest can be used to rank the importance of variables in a regression or classification problem in a natural way. Random forest is a tree-based algorithm that createsseveral treesandcombinesthe results to improve the model's generalization ability. An ensemble method is a technique for combining trees. Ensembling is nothing more than putting togethera groupofweak learners (individual trees) to create a stronglearner.RandomForests can be used to solve problems involving regression and classification. The dependent variable in regression problems is continuous. The dependent variable in classification problems is categorical. Fig 2- Random Forest expression c) Naïve Bayes - A Bayesian classifier is a statistical method that uses Bayes' theorem to calculate the probability that a feature belongs to a specific class. It is referred to as naive because it assumes that the possibilities of individual components are independent of one another, which is extremely unlikely to occur inthereal world.Theprobability of an event occurring is calculated by considering the likelihood of another event occurring. It's possibletowriteit as: Fig 3- Naïve Bayes expression Where the posterior probability of target class c P(c|X) is calculated from P(c), P(X|c), and P(X). H. Training of Model After building the model, the model is trained using Train data and validation data. The model is trained using the library function fit () function. I. Evaluating the Model The model can be evaluated by using various metrics. These are a) Interpreting Loss and Validationloss -Lossistheresult of a bad prediction. A loss is a number indicating how bad the model's prediction was on a single example. Loss can be validation loss and training loss. b) Interpreting Accuracy and Validation Accuracy - Validation accuracy and accuracy need to be converged in a good model. c) Confusion Matrix - A confusion matrix summarizes classification problem prediction results. The correct and
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1361 incorrect predictions are totaled and broken down by class using count values. The confusion matrix's key is this. The confusion matrix depicts the various ways in which your classification model becomes perplexed when making predictions. It reveals the number of errors made by a classifier and the types of errors made. Here, •Class 1: Positive •Class 2: Negative Classification Rate/ Accuracy Classification Rate or Accuracy is given by the relation: Fig 4- Accuracy Expression Recall - Recall can be defined as the ratio of the total number of correctly classified positive examples divided by the total number of positive examples. High Recall indicates the class is correctly recognized (a small number of FN). Fig 5- Recall Expression Precision - To get the precision value, we divide the total number of correctly classified positive examples by thetotal number of predicted positive examples. High precision indicates that an example labeled as positive is positive (a small number of FP). Fig 6- Precision Expression J. Saving the Model After building the model, the model is saved to our device. The model can be saved in .pkl format or .h5 format. To save the model in .pkl format, python provides usa librarynamed Pickle, and to save it in .h5 format, the Tensorflow library is used. I have used the Pickle library and saved the models in .pkl format. 3. MODEL AND ANALYSIS Fig 8- RandomForest Classifier Fig 9: Naïve Bayes model 4. RESULTS AND DISCUSSION The accuracy results we got from the three algorithms are shown in the following table below. Table- Accuracy Results Fig 7- Logistic Regression
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1362 Training vs. Test Data in Dataset Fig 10- Training vs. Test Data Normal vs. Fraud Transaction after Oversampling Fig 11- Fraudulent vs. non-Fraudulent Correlation matrix Fig12- Correlation Matrix 4. CONCLUSIONS Various machine learning algorithms for detecting fraud in credit card transactions were reviewed in this paper. The accuracy, precision, and specificity metrics are used to evaluate the performance of this technique. To classify the transaction as fraudulent or authorized, I used three supervised learning techniques: Logistic Regression, Random Forest, and Naive Bayes. Using feedback and delayed supervised training, these classifiers were trained on a delayed supervised sample dataset of almost 284807 transaction records. Due to the massive imbalance, the dataset was subjected to an Oversampling technique, which resulted in the number of fraud and normal transactions being nearly equal. The training and test data were tested using the three Models, and the results were obtained. The accuracy of the Random Forest, Logistic Regression, and Naive Bayes was 99.27%, 91.20%, and89.40%, respectively. From the Above project, itcanbeconcludedthattheRandom Forest model is somewhat trustworthy, and its accuracy could be improved further with a larger and more balanced dataset. If some other algorithms can be combined with this one to form a Hybrid Algorithm, the results will be even better. ACKNOWLEDGEMENT The success and outcome of this project required a lot of guidance and assistance from many people, and we are highly privileged to have got this all along with the completion of our project. All that we have done is only due to such supervision and assistance, and we will not forget to thank them. We are extremely grateful to Dr. Paramjit S. Jaswal, Vice- Chancellor, SRM University, and Dr.PuneetGoswami,Head of the Department, Department of Computer Science and Engineering, for providing all the required resources for the completion of my seminar. Our heartfelt gratitude to our guide Dr. Ishwari Singh, for their valuable suggestion and guidance in preparing the research paper. Last but not least, we would express our obligation to all the people who have worked extensively on the topic and make the content available for free to all the aspiring people who want to grow in their community. I would say this report can be helpful to any aspiring student who wants to gain an overall idea about how high- performance computing works in practical life. REFERENCES [1] T. Mohana Priya, Dr. M. Punithavalli & Dr. R. Rajesh Kanna, Machine Learning Algorithm forDevelopmentof Enhanced Support Vector MachineTechniquetoPredict Stress, Global Journal of Computer Science and Technology: C Software & Data Engineering, Volume20, Issue 2, No. 2020, pp 12-20
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 06 | Jun 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1363 [2] Ganesh Kumar and P.Vasanth Sena, “Novel Artificial Neural Networks and Logistic Approach for Detecting Credit Card Deceit,” International Journal of Computer Science and Network Security, Vol. 15, issue 9, Sep. 2015, pp. 222-234 [3] Gyusoo Kim and Seulgi Lee, “2014 Payment Research”, Bank of Korea, Vol. 2015, No. 1, Jan. 2015. [4] Chengwei Liu, Yixiang Chan, Syed Hasnain Alam Kazmi, Hao Fu, “Financial Fraud Detection Model: Based on Random Forest,” International Journal ofEconomicsand Finance, Vol. 7, Issue. 7, pp. 178-188, 2015. [5] Hitesh D. Bambhava, Prof. Jayeshkumar Pitroda, Prof. Jaydev J. Bhavsar (2013), “A Comparative Study on Bamboo Scaffolding And Metal Scaffolding in Construction Industry Using Statistical Methods," International Journal of Engineering Trends and Technology (IJETT) – Volume 4, Issue 6, June 2013, Pg.2330-2337. [6] P. Ganesh Prabhu, D. Ambika, "Study on Behaviour of Workers in Construction Industry to Improve Production Efficiency," International Journal of Civil, Structural, Environmental and Infrastructure Engineering Research and Development (IJCSEIERD), Vol. 3, Issue 1, Mar 2013, 59-66 [7] Manideep, A. P. S., and Seema Kharb. "A Comparative Analysis of Machine Learning Prediction Techniquesfor Crop Yield Prediction in India." Turkish Journal of Computer andMathematicsEducation(TURCOMAT) 13.2 (2022): 120-133.