1. INSURANCE FRAUD DETECTION
USING MACHINE LEARNING
Submitted To
Ms. Minakshi Halder
Submitted By
Jahanvi Maheshwari
(0827EC201011)
Acropolis Institute of Technology & Research
3. INTRODUCTION
The insurance industry comprises more than a thousand
companies worldwide and collects more than one trillion
dollars in premiums each year. Vehicle insurance fraud is
the most prominent type of insurance fraud, and it is often
committed through fake accident claims. This project focuses
on detecting vehicle insurance fraud using machine learning
techniques.
4. Introduction to Machine Learning
• Machine learning is a subset of artificial intelligence that is
mainly concerned with developing algorithms that allow a
computer to learn from data and past experience on its own.
• Machine learning uses data to detect patterns in a given
dataset.
• It can learn from past data and improve automatically.
6. Insurance Fraud Detection
• Fraud is one of the largest and best-known problems that insurers face.
Fraudulent claims can be very expensive for an insurer, so it is
important to know which claims are genuine and which are not.
• Ideally, an insurance agent would have the capacity to investigate each case
and conclude whether it is genuine or not. However, this process is both
time-consuming and costly. Sourcing and funding the skilled labor required to
review each of the thousands of claims that are filed every day is simply
infeasible.
• This is where machine learning helps. Once suitable data is fed to the
system, a trained model can quickly estimate whether a claim is genuine or
not.
9. Exploratory Data Analysis
Exploratory Data Analysis refers to the critical process of performing
initial investigations on data so as to discover patterns, spot
anomalies, test hypotheses, and check assumptions with the help
of summary statistics and graphical representations.
It is good practice to understand the data first and try to gather as
many insights from it as possible.
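
As a rough illustration of the summary statistics mentioned above, the sketch below computes a few of them for a tiny, made-up list of claims. The column names (`claim_amount`, `fraud`) and all values are assumptions for the example only; they are not taken from the project's dataset, which would normally be loaded from a file.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical toy records; a real analysis would load the actual claims data.
claims = [
    {"claim_amount": 5200.0, "fraud": "N"},
    {"claim_amount": 48000.0, "fraud": "Y"},
    {"claim_amount": 3100.0, "fraud": "N"},
    {"claim_amount": None, "fraud": "N"},  # a missing value to spot
    {"claim_amount": 7600.0, "fraud": "N"},
]


def summarize(rows, column):
    """Return basic summary statistics for one numeric column."""
    values = [row[column] for row in rows if row[column] is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


def class_balance(rows, label="fraud"):
    """Return the share of each class label, to reveal class imbalance."""
    counts = Counter(row[label] for row in rows)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


print(summarize(claims, "claim_amount"))
print(class_balance(claims))
```

A mean far above the median, as in this toy data, hints at a few very large claims, which is the kind of anomaly this step is meant to surface.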
12. MODEL EVALUATION
• Model evaluation assesses the quality and
performance of machine learning models.
• Use appropriate metrics such as accuracy, precision,
recall, or regression-specific metrics to quantify a
model's performance.
• Model evaluation should align with the goals and
requirements of the specific application.
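
The metrics named above can be computed directly from true and predicted labels. The following is a minimal sketch, assuming binary labels where 1 marks a fraudulent claim; the label lists are invented for illustration and are not results from this project.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall (0.0 when undefined)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels: 2 fraudulent claims out of 10.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_genuine = [0] * 10  # a "model" that never flags fraud

print(evaluate(y_true, always_genuine))
```

In this toy case a model that never flags fraud still reaches 80% accuracy while its recall is zero, which is why accuracy alone can be misleading when fraudulent claims are rare.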
13. Support Vector Machine
● A support vector machine separates objects of different classes
with a decision surface (or hyperplane).
● Support vectors are the data points that lie closest to that
decision surface.
ACCURACY OF SVM
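
To make the two ideas in the bullets above concrete, here is a small sketch of a linear decision surface and of the points that lie closest to it. The weights `w`, the bias `b`, and the points are hand-picked assumptions; nothing here is trained. A real SVM would learn the hyperplane that maximizes the margin to these closest points.

```python
import math


def decision_value(w, b, x):
    """Signed value of w . x + b; its sign gives the predicted side."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_side(w, b, x):
    """Classify a point as +1 or -1 depending on its side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def distance_to_hyperplane(w, b, x):
    """Geometric distance from point x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, b, x)) / norm


def closest_points(w, b, points):
    """Return the points nearest to the hyperplane (support-vector candidates)."""
    distances = [distance_to_hyperplane(w, b, p) for p in points]
    smallest = min(distances)
    return [p for p, d in zip(points, distances) if math.isclose(d, smallest)]


# Hypothetical hyperplane x1 + x2 - 3 = 0 and four toy points.
w, b = [1.0, 1.0], -3.0
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (4.0, 3.0)]

print([predict_side(w, b, p) for p in points])
print(closest_points(w, b, points))
```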
14. DECISION TREE
• A decision tree is a tree-like model for decision-making.
• It is used in classification and regression tasks.
• It makes sequential decisions based on feature
values to reach an outcome.
ACCURACY OF DECISION TREE
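
The "sequential decisions based on feature values" can be pictured as walking down a tree until a leaf is reached. The sketch below uses a hand-written tree; its features (`claim_amount`, `witnesses`), thresholds, and labels are invented for illustration and were not learned from any data. A trained decision tree would choose these splits automatically.

```python
# Hypothetical tree: inner nodes test one feature, leaves carry a label.
TREE = {
    "feature": "claim_amount",
    "threshold": 10000.0,
    "left": {"label": "genuine"},  # claim_amount < 10000
    "right": {  # claim_amount >= 10000
        "feature": "witnesses",
        "threshold": 1,
        "left": {"label": "fraud"},  # no witnesses
        "right": {"label": "genuine"},  # at least one witness
    },
}


def predict_tree(node, sample):
    """Follow one branch per decision until a leaf label is reached."""
    while "label" not in node:
        value = sample[node["feature"]]
        node = node["left"] if value < node["threshold"] else node["right"]
    return node["label"]


print(predict_tree(TREE, {"claim_amount": 20000.0, "witnesses": 0}))
```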
15. LOGISTIC REGRESSION
• Logistic regression is a statistical method used for
classification tasks in machine learning.
• It is particularly well suited to binary classification, where the
goal is to predict one of two possible outcomes, such as
yes/no or pass/fail.
• Logistic regression can also be extended to handle multi-class
classification problems, where there are more than two
possible outcomes.
• The core of logistic regression is the sigmoid function, which
maps input values to a probability between 0 and 1.
ACCURACY OF LOGISTIC REGRESSION
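
The sigmoid function mentioned above is sigma(z) = 1 / (1 + e^(-z)). The sketch below shows how it turns a weighted sum of features into a probability and then into a binary label. The weights, bias, and 0.5 threshold are assumptions chosen for the example; in practice the weights are learned from training data.

```python
import math


def sigmoid(z):
    """Map any real number to the range 0..1 without overflowing."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, features):
    """Probability of the positive class for one sample."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict_label(weights, bias, features, threshold=0.5):
    """Return 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0


# Hypothetical parameters, not fitted to any data.
weights, bias = [1.0, -2.0], 0.5

print(sigmoid(0.0))
print(predict_label(weights, bias, [3.0, 1.0]))
```

Lowering the threshold flags more claims as positive, which trades precision for recall.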
16. Model Selection
● After the models are built and evaluated, they are
compared with each other on the basis of their
performance.
● Using the evaluation metrics, we select the model
that gives the most accurate predictions.
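
One simple way to carry out this comparison is to collect each model's metrics and pick the model with the best value of the metric that matters most. The sketch below assumes that choice is made on a single metric; the scores are placeholder numbers for illustration only and are not results from this project.

```python
def select_best(scores, metric):
    """Return the name of the model with the highest value of `metric`."""
    if not scores:
        raise ValueError("no models to compare")
    return max(scores, key=lambda name: scores[name][metric])


# Placeholder values, NOT measured results.
example_scores = {
    "svm": {"precision": 0.50, "recall": 0.40},
    "decision_tree": {"precision": 0.45, "recall": 0.60},
    "logistic_regression": {"precision": 0.55, "recall": 0.50},
}

print(select_best(example_scores, "recall"))
print(select_best(example_scores, "precision"))
```

As the two calls show, the "best" model can change with the metric, so the metric should be chosen to match the goals of the application.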
17. Model Deployment
• Anaconda for ML: Anaconda is a versatile platform for
data science and ML, featuring an integrated environment
and libraries.
• Model Deployment Significance: Model deployment is
the critical step of making ML models operational for
real-world applications.
• Anaconda Deployment Tools: Anaconda offers tools
like Anaconda Enterprise and the Conda package manager
for creating and managing deployment environments.
• Deployment Considerations: Key deployment
considerations include scalability, performance,
monitoring, security, compatibility, and dependency
management.
18. Application Development
• The application folder contains an HTML file, in this case index.html.
This HTML file is used to take inputs from the end user.
• Alongside index.html, there is a file that loads the previously exported
model (the model.pkl file) and, based on the end-user input from
index.html, returns the predicted value.
• The remaining file is requirement.txt, which lists all the libraries used
during model building.
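
The load-and-predict step described above could look roughly like the sketch below. It is written under several assumptions: the "model" is a plain dictionary of invented weights rather than a trained estimator, the form field names (`claim_amount`, `witnesses`) are hypothetical, and the web layer that receives the form from index.html is left out because the slides do not name the framework used.

```python
import pickle
import tempfile
from pathlib import Path


def export_model(model, path):
    """Save the model object to disk, e.g. as model.pkl."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Load a previously exported model. Only unpickle trusted files."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict_from_form(model, form):
    """Turn string form inputs into features and return a label."""
    features = [float(form[name]) for name in model["feature_names"]]
    score = sum(w * x for w, x in zip(model["weights"], features))
    score += model["bias"]
    return "fraud" if score >= 0 else "genuine"


def demo(form):
    """Export a hypothetical model, reload it, and predict for one form."""
    # Invented parameters standing in for a trained model.
    model = {
        "feature_names": ["claim_amount", "witnesses"],
        "weights": [0.0001, -1.0],
        "bias": -1.0,
    }
    with tempfile.TemporaryDirectory() as folder:
        path = Path(folder) / "model.pkl"
        export_model(model, path)
        return predict_from_form(load_model(path), form)


print(demo({"claim_amount": "20000", "witnesses": "0"}))
```

Form values arrive as strings, so they are converted to numbers before prediction. The same set of libraries listed in requirement.txt should be installed wherever the pickled model is loaded.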