1. Guru Nanak Dev Engineering College,
Bidar
(Department of Information Science & Engineering)
HEALTH INSURANCE COST PREDICTION
USING REGRESSION MODELS
Major Project
Arjun Singh (3GN18IS007)
Gourishanker (3GN18IS009)
Prabhakar (3GN18IS015)
Sai Krishna (3GN18IS029)
Under the guidance of,
Prof. Sangameshwar Kawdi
2. AIM
• The main aim of this project is to predict, as closely as possible, the
health insurance cost of a citizen based on the collected data.
• The model aims to predict the insurance amount with maximum accuracy
by implementing and comparing several different algorithms.
3. Objective
• To implement efficient algorithms that provide higher accuracy in
predicting the right insurance amount.
• To compare different regression models and identify the one that
achieves the most accurate outcome.
4. Problem Statement
• The premium for a health insurance policy varies from person to person,
because many factors affect it. Take age: a young person is much less
likely to have major health problems than an older person, so treating an
older person is more expensive. That is why an older person is required to
pay a higher premium than a young person. An accurate prediction model
that takes such factors and daily habits into account is essential, so that
people get a realistic idea of their health insurance cost.
5. Literature Survey
Paper title : Predict Health Insurance Cost by using Machine
Learning and DNN Regression Models. (Publisher: IEEE,
source: https://ieeexplore.ieee.org/document/703922)
Major Observations:
• Regression analysis allows us to quantify the
relationship between an outcome and its
associated variables. Many techniques for
performing statistical predictions have been
developed, but in this project three models
were tested and compared: Multiple Linear
Regression (MLR), Decision Tree Regression,
and Gradient Boosting Regression.
Paper title : Health Insurance Amount Prediction.
(Publisher: International Journal of Engineering Research &
Technology (IJERT))
Major Observations:
• In this paper, a method was developed, using
large-scale health insurance claims data, to
predict the number of hospitalization days in
a population. The authors utilized a regression
decision tree algorithm, along with insurance
claim data from 242,075 individuals over
three years, to provide predictions. The
proposed method performs well in the
general population as well as in
subpopulations.
6. Hardware & Software
Requirements:
Hardware Requirements:
Standard Pentium Series Processor
Minimum 4 GB RAM
256 GB HDD Storage capacity.
Software Requirements:
Windows 7
Chrome or Any Web Browser
Text Editor
Anaconda Software
7. Important Methods &
Approaches:
Listed below are the different regression models used:
1. Multiple Linear Regression.
2. Decision Tree Regression.
3. Gradient Boosting Regression.
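The three models above can be trained and compared with scikit-learn. The sketch below is illustrative only: the feature names (age, BMI, children) and the synthetic cost formula are assumptions modeled on typical public insurance datasets, not this project's actual data or code.

```python
# Sketch: fitting and comparing the three regression models on
# synthetic insurance-like data, scored by R^2 on a held-out test set.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 1000
age = rng.integers(18, 65, n)
bmi = rng.normal(28, 5, n)
children = rng.integers(0, 4, n)
# Synthetic "charges": older, higher-BMI people cost more, plus noise.
charges = 250 * age + 120 * bmi + 400 * children + rng.normal(0, 1000, n)

X = np.column_stack([age, bmi, children])
X_tr, X_te, y_tr, y_te = train_test_split(X, charges, random_state=0)

models = {
    "Multiple Linear Regression": LinearRegression(),
    "Decision Tree Regression": DecisionTreeRegressor(max_depth=5, random_state=0),
    "Gradient Boosting Regression": GradientBoostingRegressor(random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: R^2 = {r2_score(y_te, model.predict(X_te)):.3f}")
```

Ranking the models by test-set R² in this way is one concrete realization of the comparison described in the objective.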
8. What is regression?
Regression analysis is primarily used for two conceptually distinct purposes. First,
regression analysis is widely used for prediction and forecasting, where its use
has substantial overlap with the field of machine learning. Second, in some
situations regression analysis can be used to infer causal relationships between
the independent and dependent variables. Importantly, regressions by
themselves only reveal relationships between a dependent variable and a
collection of independent variables in a fixed dataset. To use regressions for
prediction or to infer causal relationships, respectively, a researcher must carefully
justify why existing relationships have predictive power for a new context or why
a relationship between two variables has a causal interpretation. The latter is
especially important when researchers hope to estimate causal relationships
using observational data.
9. Multiple Linear Regression?
Multiple linear regression (MLR), also known simply as multiple regression,
is a statistical technique that uses several explanatory variables to predict
the outcome of a response variable. The goal of multiple linear regression
is to model the linear relationship between the explanatory (independent)
variables and the response (dependent) variable. In essence, multiple
regression is the extension of ordinary least-squares
(OLS) regression because it involves more than one explanatory variable.
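A minimal sketch of MLR with scikit-learn, on synthetic data where the true weights are known, shows the fitted coefficients recovering them. The data and weights below are invented for illustration.

```python
# Sketch: MLR extends OLS to several explanatory variables.
# With low noise, the fitted coefficients should approximate the
# true weights [2.0, -1.0, 0.5] and the intercept 3.0 used to build y.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 3))                       # three explanatory variables
true_coefs = np.array([2.0, -1.0, 0.5])
y = X @ true_coefs + 3.0 + rng.normal(0, 0.1, n)  # linear target, small noise

mlr = LinearRegression().fit(X, y)
print("coefficients:", mlr.coef_.round(2))
print("intercept:", round(mlr.intercept_, 2))
```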
10. Key Takeaways
Multiple linear regression (MLR), also known simply as multiple
regression, is a statistical technique that uses several explanatory
variables to predict the outcome of a response variable.
Multiple regression is an extension of simple linear (OLS) regression,
which uses just one explanatory variable.
MLR is used extensively in econometrics and financial inference.
11. Decision Tree Regression?
Decision tree builds regression or classification models in the form of a
tree structure. It breaks down a dataset into smaller and smaller
subsets while at the same time an associated decision tree is
incrementally developed. The final result is a tree with decision
nodes and leaf nodes. A decision node (e.g., Outlook) has two or
more branches (e.g., Sunny, Overcast and Rainy), each representing
values for the attribute tested. Leaf node (e.g., Hours Played)
represents a decision on the numerical target. The topmost decision
node in a tree, which corresponds to the best predictor, is called the
root node. Decision trees can handle both categorical and numerical data.
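The splitting behavior can be seen on a tiny invented dataset: a depth-1 tree makes one split at the root and predicts the mean of each resulting leaf.

```python
# Sketch: a shallow regression tree partitions the feature space and
# predicts a constant (the leaf mean) within each partition.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[10.], [20.], [30.], [40.], [50.], [60.]])
y = np.array([100., 110., 105., 300., 310., 305.])  # jump between x=30 and x=40

tree = DecisionTreeRegressor(max_depth=1).fit(X, y)  # one split: root + 2 leaves
print(tree.predict([[25.], [45.]]))  # each side gets its leaf mean: 105 and 305
```

The single split lands between x=30 and x=40 because that boundary minimizes the squared error within each side.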
12. Gradient Boosting
Regression?
Gradient boosting is a machine learning technique used in regression
and classification tasks, among others. It gives a prediction model in
the form of an ensemble of weak prediction models, which are
typically decision trees. When a decision tree is the weak learner, the
resulting algorithm is called gradient-boosted trees; it usually
outperforms random forest. A gradient-boosted trees model is built in
a stage-wise fashion as in other boosting methods, but it generalizes
the other methods by allowing optimization of an arbitrary
differentiable loss function.
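The stage-wise construction can be observed directly: scikit-learn's `staged_predict` returns the ensemble's predictions after each boosting stage, so the training error can be watched falling as trees are added. The nonlinear toy target below is an illustrative assumption.

```python
# Sketch: gradient boosting adds trees stage by stage, each fitting the
# residual errors of the ensemble so far; more stages reduce training error.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(300, 1))
y = 5 * np.sin(X[:, 0]) + rng.normal(0, 0.3, 300)  # nonlinear target

gbr = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1,
                                max_depth=2, random_state=0).fit(X, y)
errors = [mean_squared_error(y, p) for p in gbr.staged_predict(X)]
print(f"train MSE after 1 stage: {errors[0]:.2f}, after 200 stages: {errors[-1]:.2f}")
```

Each weak learner here is a depth-2 tree; lowering `learning_rate` shrinks each tree's contribution and typically needs more stages to reach the same error.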