
Sinistres routiers
Projet PI_BI
BIBLIOGRAPHIC PROJECT
Prepared by :
FRIDHI AHMED BEN FRAJ GHAILEN
BEDHIAFI RADHWEN OKKEZ HADIL
GHABI YASSINE CHAABENE YASSMINE
Supervised by : Mme Dorsaf Ben Hassen
Academic year 2020-2021

Table of contents :
I. Introduction
II. Problem statement
III. Market study
1. Advantages
2. Drawbacks
IV. Objective
V. Functionality
1. Functional requirements
2. Non-Functional requirements
VI. Methodology
a) Identification
b) Design
c) Implementation
d) Permanent improvement
VII. Data Model
3. Star schemas offer the following benefits
4. Star schema vs Snowflake schema
5. Constellation Model
6. DW sinistres routiers
VIII. Data integration with Talend
1. Dimension table « Assure »
2. Dimension table « contrat »
3. Dimension table « lieu »
4. Dimension table « vehicule »
5. Table Fact
IX. Business Understanding
a) Business objectives
b) Assessing the situation
c) Data Mining objectives
d) Project Plan
X. Data Understanding
a) Collecting initial data
b) Describing data
c) Data Exploration
d) Verifying data quality
XI. Data Preparation
a) Selecting Data
b) Cleaning data
c) Construct required data
XII. Modeling
XIII. Evaluation
a) Evaluate the results
b) Review Process
c) Determine next steps
XIV. Deployment
XV. Dashboards
XVI. Web application
XVII. Conclusion

I. Introduction :
Tunisia has the second-worst traffic death rate per capita in North Africa. According to figures
reported by the National Observatory for Road Safety (ONSR), 5877 accidents took place in
2018, claiming the lives of 1205 people and injuring 8869 others. These accidents cause a great
financial loss for Tunisian insurance companies.
Figure 1
II. Problem statement :
Our society experiences a very high rate of tragedies and losses that affect people's daily
lives. Road accidents are one of the causes, and they produce shocking numbers year after
year, as the accompanying figure shows.
We question the effectiveness of the measures taken to reduce this phenomenon and propose
an effective solution.
Our approach is based on an analytical view of the data stored in the insurance companies'
databases, in order to recommend well-founded actions.

III. Market study :
We acknowledge the continuous effort made by the insurance companies to reduce the rate of
road accidents, notably through the National Institute of Statistics (NIS), a public
establishment under the Ministry of Development and International Cooperation.
This establishment takes part in producing and analyzing official statistics in
Tunisia.
Figure 2
In addition, it makes it possible to visualize the required data and analyses, such as the
evolution of the number of accidents by location, by cause, and by number of injuries and
deaths.

Figure 3
Figure 4

Figure 5
1. Advantages :
These figures show that the desired analyses are feasible. Each year, the general insurance
committee publishes a very detailed written report that presents the evolution of road
accidents in relation to several factors.
2. Drawbacks :
The report is detailed and provides a great deal of information, but it is difficult to
extract from it the insights needed to make the right decisions.
In addition, each report covers a single year, which results in a scattered
history.

IV. Objective :
Figure 6
V. Functionality :
1. Functional requirements :
The functional specifications define the requirements that the analytical system must
satisfy.
In our case, we describe the requirements in detail in order to build
a global use case diagram.
Our analytical system must satisfy these requirements:

• Allow the user to visualize the different classifications of clients and vehicles.
• Provide the insurance companies with information to classify the behavior of clients.
• Provide the insurance companies with detailed information about high-risk
vehicles.
• Identify the weak clauses in contracts so that they can be improved.
• Identify the locations with a high rate of road accidents.
• Provide an easy visualization of the analyzed data through a dashboard.
2. Non-Functional requirements :
Non-functional requirements describe how efficiently a system should function. They refer to
the general qualities that provide a good user experience.
Our analytical system must satisfy these requirements:
Security: Ensure that the system is protected from unauthorized access.
Reliability: Ensure that the system will work without failure.
Performance: Ensure that the system delivers the expected qualities, such as
responsiveness.
VI. Methodology :
GIMSI Definition
GIMSI stands for Generalization, Information, Method and Measurement, System and
Systemic, Individuality and Initiative.
It defines a cooperative methodological design framework that formalizes the
conditions for the success of a BI project centered on the dashboard.

a) Identification :
What is the context?
1-Company environment
Analysis of the economic environment and the company's strategy in order to define the
perimeter and scope of the project.
2-Identification of the company
Analysis of the structures of the company to identify the processes, activities and actors
involved.
b) Design :
What should be done?
3-Definition of objectives
Selection of tactical objectives for each team
4-Construction of the dashboard
Definition of the dashboard of each team
5-Choice of indicators
Choice of indicators according to the objectives chosen.
6-Collection of information
Identification of the information needed to construct the indicators.
7-The dashboard system
Construction of the dashboard system, control of overall consistency
c) Implementation :
How to do it?

8-The choice of software packages
Development of the selection grid for the choice of suitable software packages
9-Integration and deployment
Implementation of software packages, deployment to the company
d) Permanent improvement :
Does the system still meet expectations?
10-Audit
Continuous monitoring of the system
• Methodology choice :
GIMSI was chosen because it aims to:
• Boost value creation with a cross-functional orientation.
• Place the needs of the actor in a decision-making situation at the heart of the process, in
order to fully account for the risk-taking inherent in new ways of operating companies.
• Help break down the wall that still exists between operational technological
solutions and user expectations.

VII. Data Model :
Figure 7 :
Star schemas offer the simplest structure for organizing data in a data warehouse. The center
of a star schema consists of one or more fact tables that index a series of dimension tables.
In our model, the fact table "fact sinistres" references the dimension tables dim Assure,
dim vehicule, dim lieu, dim date de sinistre and dim contrat.
3. Star schemas offer the following benefits :
• Queries are simpler: Because all of the data connects through the fact table, the
multiple dimension tables are treated as one large table of information, and that
makes queries simpler and easier to perform.
• Easier business insights reporting: Star schemas simplify the process of pulling
business reports like as-of-as and period-over-period reports.

• Better-performing queries: By removing the bottlenecks of a highly normalized
schema, query speed increases, and the performance of read-only commands
improves.
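To illustrate the "simpler queries" benefit, the following minimal sketch builds a tiny star schema in an in-memory SQLite database and joins the fact table directly to two dimensions. The table names follow the model above, but the columns saison, marque and montant, as well as the sample rows, are assumptions made for illustration and not the project's actual schema.

```python
import sqlite3


def build_demo_warehouse():
    """Create a tiny star schema (hypothetical columns and sample rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_date (id_date INTEGER PRIMARY KEY, saison TEXT);
        CREATE TABLE dim_vehicule (codevehicule INTEGER PRIMARY KEY, marque TEXT);
        CREATE TABLE fact_sinistres (
            id_date INTEGER REFERENCES dim_date(id_date),
            codevehicule INTEGER REFERENCES dim_vehicule(codevehicule),
            montant REAL
        );
        INSERT INTO dim_date VALUES (1, 'hiver'), (2, 'ete');
        INSERT INTO dim_vehicule VALUES (10, 'A'), (20, 'B');
        INSERT INTO fact_sinistres VALUES (1, 10, 100.0), (1, 10, 50.0), (2, 20, 30.0);
    """)
    return conn


def claims_by_season_and_brand(conn):
    """Aggregate the fact table with exactly one join per dimension."""
    query = """
        SELECT d.saison, v.marque, COUNT(*) AS nb_sinistres, SUM(f.montant) AS total
        FROM fact_sinistres AS f
        JOIN dim_date AS d ON f.id_date = d.id_date
        JOIN dim_vehicule AS v ON f.codevehicule = v.codevehicule
        GROUP BY d.saison, v.marque
        ORDER BY d.saison, v.marque
    """
    return conn.execute(query).fetchall()
```

Each dimension is reached with a single join from the fact table, so adding another dimension to the report only adds one more JOIN clause.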
4. Star schema vs Snowflake schema :
1-Star schema dimension tables are not normalized, whereas snowflake schema dimension
tables are normalized.
2-Snowflake schemas will use less space to store dimension tables but are more complex.
3-Star schemas will only join the fact table with the dimension tables, leading to simpler,
faster SQL queries.
4-Snowflake schemas have no redundant data, so they're easier to maintain.
5-Snowflake schemas are good for data warehouses, star schemas are better for datamarts
with simple relationships.
5. Constellation Model :
A fact constellation is a schema for representing a multidimensional model. It is a collection
of multiple fact tables that share some dimension tables. It can be viewed as a collection
of several star schemas and is therefore also known as a galaxy schema. It is one of the most
widely used schemas for data warehouse design, and it is much more complex than the star and
snowflake schemas. Complex systems require fact constellations.

6. DW sinistres routiers :
Figure 8

VIII. Data integration with Talend:
1. Dimension table « Assure » :
Figure 9
Figure 10

tMSSqlInput 'assure': reads data and extracts fields based on a query from a Microsoft
SQL Server database or a Microsoft Azure SQL database.
tMap: transforms and routes data from one or more sources to one or more
destinations.
tUniqRow: compares the entries and removes duplicates from
the input stream.
tMSSqlOutput 'dimassure': inserts the processed data into the database table dimassure.
getUsCity: randomly returns a city name from a list of known US cities.
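The logic of this job (read, remove duplicates, map, write) can be sketched in plain Python as follows. This is only an illustration of what the components do, not the Talend job itself; the field names code, nom and codeassure are hypothetical.

```python
def unique_rows(rows, key_fields):
    """Keep the first row seen for each distinct key (de-duplication step)."""
    seen = set()
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            yield row


def map_row(row):
    """Rename and clean fields (mapping step); field names are hypothetical."""
    return {
        "codeassure": row["code"],
        "nom": row["nom"].strip().upper(),
    }


def build_dimension(source_rows):
    """Chain the two steps to produce the rows of the dimension table."""
    return [map_row(row) for row in unique_rows(source_rows, ["code"])]
```

In this sketch the duplicates are removed on the key column before mapping, so only one row per insured person reaches the dimension table.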
2. Dimension table « contrat » :
Figure 11

Figure 12
Figure 13
Numeric.sequence: returns an incremented numeric identifier.
formatDate: returns a date expression formatted according to the specified date pattern.
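For readers more familiar with Python, the behavior of these two routines can be approximated as below. The helper names make_sequence and format_date are hypothetical, and Python's strftime patterns (for example %d/%m/%Y) differ from the Java-style patterns used in Talend.

```python
from datetime import date
from itertools import count


def make_sequence(start=1, step=1):
    """Return a function that yields the next identifier on each call."""
    counter = count(start, step)
    return lambda: next(counter)


def format_date(pattern, value):
    """Format a date according to a strftime pattern."""
    return value.strftime(pattern)
```

A sequence of this kind is typically used to generate surrogate keys for the rows of a dimension table.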

3. Dimension table « lieu » :
Figure 14
4. Dimension table « vehicule » :
Figure 15

Figure 16
This job uses the tMap component to perform an inner join between the 'vehicule' table and
the 'marque' table, matching the code marque column of 'vehicule' with the code column of 'marque'.
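The inner join performed by this job can be sketched as a dictionary lookup. The field names code_marque, code and libelle are assumptions based on the description above, not the actual column names.

```python
def inner_join(vehicules, marques):
    """Keep only the vehicles whose brand code exists in the brand table."""
    brand_by_code = {m["code"]: m for m in marques}
    joined = []
    for vehicule in vehicules:
        brand = brand_by_code.get(vehicule["code_marque"])
        if brand is not None:  # unmatched vehicles are dropped (inner join)
            joined.append({**vehicule, "libelle_marque": brand["libelle"]})
    return joined
```

Because the join is an inner join, a vehicle whose brand code is missing from the brand table does not appear in the resulting dimension.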

5. Table Fact :
Figure 17
Figure 18

Figure 19
The dimension tables are connected to the fact table.
The primary key of the fact table is composed of foreign keys to the dimensions
(codePolice, codevehicule, codeassure, id_lieu, id_date, id_natureSinistre). The fact table also
holds the measures needed to build the main key performance indicators (KPIs),
for example 'pourcentage de responsabilité', 'calculsinistre'...
IX. Business Understanding
a) Business objectives
Throughout this project, we aim to satisfy our client, the General Council of Insurance.
The Council has set the following primary objectives:
-Identify the seasons in which road accidents occur.
-Identify the party responsible for a road accident.
-Identify the vehicles involved in road accidents.

b) Assessing the situation
In order to achieve our objectives, we need to list all resources available to our project.
The resources are:
-Personnel: We will need 6 engineers who are familiar with data mining.
-Data: The General Council of Insurance will provide us with the necessary data from all
insurance companies.
-Hardware: We will need six computers and access to the internet.
-Software: We will work with the Google Colaboratory application.
c) Data Mining objectives
The data mining objectives will be divided into 2 classes.
We will perform an additional analysis where we aim to satisfy the following objectives:
-Determine in which season road accidents occurred the most.
-Determine which attribute has the highest correlation with road accidents.
Also, we will perform a predictive analysis where we aim to satisfy the following objectives:
-Predict the nature of the road accident.
-Predict the rate of responsibility of a road accident.
-Predict the brand of the vehicles that frequently cause a road accident.
d) Project Plan
In order to achieve the business objectives and data mining objectives, we will need the
following tools:
-Talend Integration Tool: This tool permits us to integrate the data and produce the
dimension tables and the fact tables.

-Google Colaboratory: This tool permits us to analyze the datasets extracted from the
integration phase through algorithms written in Python.
-Power BI: This tool allows us to visualize the extracted and analyzed datasets and
supports the reporting phase.
-PyCharm: This tool allows us to deploy our dashboard in a web application.
X. Data Understanding
a) Collecting initial data
The General Council of Insurance (CGA) provided us with the database containing
all the data required to achieve our objectives.
b) Describing data
The database was a SQL Server database with 10 tables.
Most of the tables had more than a thousand rows and multiple columns.
The attributes of the tables were of different types.
Figure 20
Figure 21

c) Data Exploration
To explore the acquired data, we ran simple queries and
visualizations in order to discover relationships among the data. We found
links between tables that helped us achieve our objectives.
Figure 22
Figure 23
Figure 24

Figure 26
d) Verifying data quality :
1) The ratio of data to error :
The number of data errors was minimal.
2) Number of empty values :
The number of empty values was significant: more than ten thousand values
were missing.
3) Data storage cost :
The data storage cost was acceptable, as the volume of the datasets was small.
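A check such as the count of empty values can be sketched as follows. The report does not show how the count was obtained, so this is an assumed approach; rows are represented as dictionaries and a value counts as empty when it is absent, None, or a blank string.

```python
def count_missing(rows, columns):
    """Return the number of empty values for each column."""
    missing = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            value = row.get(column)
            if value is None or str(value).strip() == "":
                missing[column] += 1
    return missing
```

Summing the values of the returned dictionary gives the total number of empty values in the table.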
Figure 25
Figure 27

XI. Data Preparation
a) Selecting Data
In order to achieve our objectives, we will need four tables from the database.
The “Assure” table.
The “Sinistres” table.
The “MarqueVehicules” table.
The “Compagnie” table.
The selection of these four tables is justified by the requirements of our
objectives.
Figure 28
Figure 29

b) Cleaning data :
In order to obtain the optimal dataset for analysis, we performed some cleaning operations.
These operations remove unnecessary and unused data in order to raise
the data quality.
Additionally, we used the amputation technique (dropping unwanted rows and columns) as well as
transformations that allow us to apply the required algorithms.
Example of amputation:
Figure 30
Figure 31

Example of transformations:
c) Construct required data :
In order to give our analysis more depth, we generated a new dataset.
This dataset revolves around the date: we extracted the date of each road accident,
derived new attributes from it and generated new columns such as the “Saison” column.
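The derivation of date attributes can be sketched as below. This is only an illustration under stated assumptions: the month-to-season mapping (meteorological seasons) and the labels other than “hiver” are assumptions, as are the column names “annee” and “mois” and the sample dates.

```python
from datetime import date


def season_of(d: date) -> str:
    """Map a date to a season label (assumed meteorological seasons)."""
    if d.month in (12, 1, 2):
        return "hiver"
    if d.month in (3, 4, 5):
        return "printemps"
    if d.month in (6, 7, 8):
        return "ete"
    return "automne"


# Hypothetical accident dates standing in for the real claims table.
accident_dates = [date(2018, 1, 15), date(2018, 3, 2), date(2017, 7, 30), date(2016, 10, 5)]

# One row per accident, with the derived date attributes as new columns.
rows = [
    {"date": d, "annee": d.year, "mois": d.month, "Saison": season_of(d)}
    for d in accident_dates
]

for row in rows:
    print(row)
```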
Figure 32
Figure 33

XII. Modeling
For the additional analysis, we have used three algorithms.
K-Means Algorithm.
AHC Algorithm.
ACP Algorithm.
1. K-Means Algorithm :
1. Selecting model technique and building model :
We start by importing the dataset and eliminating the unused column.
We then apply the K-Means algorithm, initially choosing K equal to 3.
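As a rough sketch of this step with scikit-learn, assuming synthetic two-dimensional points in place of the project dataset (the data, `n_init` and `random_state` are illustrative choices, not the project's settings):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in: three well-separated groups of 2-D points.
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2)) for c in (0.0, 5.0, 10.0)])

# K is fixed to 3, as in the text above.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(X)

print(np.bincount(labels))      # number of points per cluster
print(model.cluster_centers_)   # coordinates of the 3 centroids
```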
Figure 34
Figure 35

2. Assess Model :
The results suggest that the model suits our objective: it produced 3 clusters,
which satisfies the requirements.
Cluster (1) corresponds to the season “hiver”.
The other clusters correspond to the other seasons.
We can therefore say that “hiver” is the most common season for road accidents.
2. AHC Algorithm :
a) Selecting model technique and building model :
Figure 36

From this dendrogram, we can determine the number of classes to work with, which is equal to
two.
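A minimal sketch of agglomerative hierarchical clustering with SciPy is given below, assuming synthetic data and Ward linkage (the linkage method used in the project is not stated, so it is an assumption here). The tree is cut into two classes, matching the reading of the dendrogram.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic stand-in: two separated groups of 20 points each.
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(5.0, 0.3, size=(20, 2)),
])

# Merge history of the hierarchy; scipy.cluster.hierarchy.dendrogram(Z)
# would draw the corresponding tree (it needs matplotlib for plotting).
Z = linkage(X, method="ward")

# Cut the tree so that at most 2 classes remain.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```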
b) Assess model :
AHC did not perform as we wanted, so this model cannot be relied on;
we proceed with the K-Means algorithm results.
3. ACP Algorithm :
a) Selecting model technique and building model :
We start off by importing the dataset.
Figure 37
Figure 38

Figure 39
We then prepare the data through several operations. We must explicitly center and
reduce (standardize) the variables in order to perform a standardized ACP with PCA.
We use the StandardScaler class for this operation.
Figure 40

Centering and reducing puts all the variables on the same scale, so that the
variance/covariance matrix computed afterwards is not dominated by the variables with the largest units.
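The standardization step can be sketched as follows, assuming a synthetic numeric table whose columns have very different scales (the data is illustrative only):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in: 3 numeric columns with very different means and spreads.
X = rng.normal(loc=[10.0, 200.0, -3.0], scale=[1.0, 50.0, 0.1], size=(100, 3))

# Center (subtract the column mean) and reduce (divide by the column std).
Z = StandardScaler().fit_transform(X)

print(np.round(Z.mean(axis=0), 6))  # column means after scaling
print(np.round(Z.std(axis=0), 6))   # column standard deviations after scaling
```

StandardScaler uses the population standard deviation (`ddof=0`), which is why `Z.std(axis=0)` with NumPy's default matches it.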
Figure 41
We then verify the properties of the transformed data.
We can see that the means are now zero and the standard deviations are equal to one.
We then instantiate the PCA object and launch the computation.
Figure 42

We display the number of generated components, which is equal to 10.
Figure 43
We then look at the eigenvalues and the scree plot. The property “explained_variance_” gives
the variance associated with each factorial axis.
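A hedged sketch of the instantiation, fitting and eigenvalue inspection with scikit-learn follows. The 10-column synthetic table is an assumption chosen only so that 10 components are generated; it is not the project dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in with 10 numeric columns, two of them strongly correlated.
X = rng.normal(size=(200, 10))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)

Z = StandardScaler().fit_transform(X)

pca = PCA()                      # no n_components: keep every component
coords = pca.fit_transform(Z)    # coordinates of the rows on the factorial axes

print(pca.n_components_)                        # number of generated components
print(pca.explained_variance_)                  # variance carried by each axis
print(pca.explained_variance_ratio_.cumsum())   # cumulative share, as read on a scree plot
```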
Figure 44

b) Assess model :
We move on to the quality of the representation, using the squared cosine method.
In order to know the quality of the representation of the attributes on the axes, we first need
to calculate the squared distance of each attribute from the origin, which also
corresponds to its contribution to the total inertia.
We can notice that “dateOuvertureSinistre” and “annee” are the most pertinent attributes.
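The squared cosine computation can be sketched as below. As an assumption, the sketch applies it to the rows of a synthetic standardized table projected on the factorial axes; the same ratio (squared coordinate over squared distance to the origin) is what the method relies on, but the exact points used in the project are not shown in the text.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic standardized table: 50 rows, 4 numeric columns.
Z = StandardScaler().fit_transform(rng.normal(size=(50, 4)))
coords = PCA().fit_transform(Z)          # coordinates on the factorial axes

d2 = (Z ** 2).sum(axis=1)                # squared distance of each point to the origin
cos2 = coords ** 2 / d2[:, None]         # quality of representation on each axis
inertia_share = d2 / d2.sum()            # contribution of each point to the total inertia

print(np.round(cos2[:5], 3))
print(np.round(inertia_share[:5], 3))
```

When all axes are kept, the squared cosines of a point sum to 1, which gives a quick sanity check of the computation.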
Figure 45
Figure 46

We finish this task with the representation of the variables. We first need the eigenvectors to
analyze the variables; they are given by the “components_” field of the fitted PCA object.
Figure 47
We can see from this visualization that “dateOuvertureSinistre” is indeed correlated
with “annee”. This global relation between variables is determined by “sinistre_id”.
For the predictive analysis, we have used two algorithms.
Random Forest Algorithm.
KNN Algorithm.

1. Random Forest Algorithm
- First Model (Predict the rate of responsibility of a road accident)
a) Selecting model technique :
Figure 48
In order to proceed with the Random Forest algorithm, we need to transform the categorical
data into numerical data.
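One possible way to perform this transformation is sketched below with scikit-learn's OrdinalEncoder. The encoder actually used in the project is not stated, and the table and its column names (“marque”, “energie”, “puissance”) are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical extract; column names are illustrative only.
df = pd.DataFrame({
    "marque": ["Volkswagen", "Peugeot", "Volkswagen", "Renault"],
    "energie": ["Gasoil", "Essence", "Gasoil", "Essence"],
    "puissance": [5, 6, 4, 5],
})

categorical_cols = ["marque", "energie"]

# Each category is replaced by an integer code (categories are sorted alphabetically).
encoder = OrdinalEncoder()
codes = encoder.fit_transform(df[categorical_cols]).astype(int)

df_encoded = df.copy()
for i, col in enumerate(categorical_cols):
    df_encoded[col] = codes[:, i]

print(df_encoded)
print(encoder.categories_)  # mapping from integer code back to the original label
```

Keeping the fitted encoder makes it possible to apply exactly the same mapping to new inputs later, for example at prediction time.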

Figure 49
Transforming the dataset
Figure 50

b) Generating test design :
We split the data into a 65% training set and a 35% test set.
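This split can be sketched with scikit-learn as follows. The arrays are synthetic, and `random_state` and `stratify` are illustrative assumptions rather than settings taken from the project.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 samples, 2 features, balanced binary target.
X = np.arange(200).reshape(100, 2)
y = np.array([0, 1] * 50)

# 35% of the rows go to the test set, the remaining 65% to the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```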
Figure 51
c) Build model :
Applying Random Forest.
Figure 52
Improving the performance of the Random Forest algorithm
Figure 53
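The two steps above (fitting a Random Forest, then trying to improve it) can be sketched as below. The improvement method used in the project is not described, so the grid search, the parameter grid and the synthetic dataset are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the encoded claims dataset.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.35, random_state=0)

# Baseline model with fixed hyperparameters.
baseline = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Assumed tuning step: small grid search with 3-fold cross-validation on the training set.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)

print(baseline.score(X_test, y_test))                    # accuracy of the baseline
print(search.best_params_, search.score(X_test, y_test))  # best setting and its accuracy
```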

d) Assess model :
Interpreting the results of the classification report and the confusion matrix.
Figure 54
The precision is high (equal to 0.80), which means that, among all the samples our model
assigned to a class, a large share truly belonged to it.
The recall is also high (equal to 0.80), which means that our model retrieved a large share
of the samples that truly belong to each class.
The accuracy is also high, which means that our model correctly predicted the actual class
for most samples across all the class types.
The F1 score (equal to 0.80) measures recall and precision at the same time. It uses the
harmonic mean instead of the arithmetic mean, which penalizes extreme values more.
The support (equal to 13643), which is the number of samples of the true
values in each class, is high in all classes.
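To make these definitions concrete, here is a small worked sketch in plain Python. The labels are invented for the example and are unrelated to the project results; the helper name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, F1 and support for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    support = sum(1 for t in y_true if t == positive)
    return precision, recall, f1, support


# Invented labels: 4 true positives in the data, the model finds only 2 of them.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0]

precision, recall, f1, support = precision_recall_f1(y_true, y_pred)
arithmetic_mean = (precision + recall) / 2

print(precision, recall, f1, support, arithmetic_mean)
```

In this example the precision is perfect but the recall is only one half, and the F1 score falls below the arithmetic mean of the two, which is the penalizing effect of the harmonic mean mentioned above.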
- Second Model (Predict the nature of the road accident)

In order to proceed with the Random Forest algorithm, we need to transform the categorical
data into numerical data.
Figure 56
Figure 55

Then, we continue with Features’ visualization.
Figure 57
We can notice that the attribute “Nature de sinistre corporelle” is insignificant compared with
“Nature de sinistre materiel” when classified by “Sinistre_Id”.
The values of the attributes “Sinistre corporelle” and “Sinistre materiel” are proportional
when classified by “Code_Assure”, with a significant difference in values.
Figure 58

Figure 59
Figure 60

c) Build model :
Applying Random Forest.
Figure 61
Figure 62

Our model has a precision score of 88.66% on the test set and 100% on the training set.
Figure 63
d) Assess model :
Interpreting the results of the classification report and the confusion matrix.

Figure 64
Figure 65

The precision is high in all classes, which means that, among all the samples our model
assigned to a class, a large share truly belonged to it.
The recall is also high in all classes, which means that our model retrieved a large share
of the samples that truly belong to each class.
The F1 score measures recall and precision at the same time. It uses the harmonic mean
instead of the arithmetic mean, which penalizes extreme values more.
The support, which is the number of samples of the true values in each class, is high in all classes.
2. KNN Algorithm :
In order to achieve our predictive goal, we have used the KNN algorithm.
We start by importing the dataset to work with.
Figure 66

In order to work with KNN algorithm, we must encode the categorical data into numerical data.
Figure 67
Figure 68
c) Build model :
We check the accuracy of the algorithm in a loop over the number K, from 1 to 30,
in order to determine the optimal value of K before proceeding.
The optimal value of K is 20. We proceed with the algorithm.
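The search for K described above can be sketched as follows, assuming a synthetic dataset and accuracy on a held-out split as the selection criterion (the exact criterion and data used in the project are not shown, so the best K found here has no reason to be 20).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the encoded dataset.
X, y = make_classification(n_samples=400, n_features=6, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.35, random_state=0)

# Fit one KNN model per value of K from 1 to 30 and record its accuracy.
scores = {}
for k in range(1, 31):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores[k] = knn.score(X_test, y_test)

# Keep the value of K with the highest accuracy (smallest K in case of a tie).
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

Choosing K on the same split that is later used for the final report tends to give optimistic scores; a separate validation split or cross-validation is a common alternative.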

Figure 69
d) Assess model :
In the assessment step, we interpret the results given by our classification report.
Figure 70

The precision is high (equal to 0.98), which means that, among all the samples our model
assigned to a class, a large share truly belonged to it.
The recall is also high (equal to 0.99), which means that our model retrieved a large share
of the samples that truly belong to each class.
The F1 score (equal to 0.98) measures recall and precision at the same time. It uses the
harmonic mean instead of the arithmetic mean, which penalizes extreme values more.
The support, which is the number of samples of the true values in each class, is high
(equal to 6083).
XIII. Evaluation
a) Evaluate the results :
Considering our predictive goals, the models created to achieve them and their high
accuracy, we can say that the results satisfy our business criteria.
b) Review Process :
While creating these models, we tried to build the optimal dataset.
The models created with the different algorithms produced different accuracy and precision
values, so we chose the model that had the highest values and that respects our business
goal criteria.
Looking back at the phases of creating the optimal models, we note that it is necessary to try
different algorithms and to adjust their parameters.
c) Determine next steps :
We aim to improve our models by adjusting the algorithms and cleaning the datasets by
removing unnecessary data in order to enhance the precision and the effectiveness of our models.

XIV. Deployment
A. Deployment plan :
In order to deploy our solution, we used the “PyCharm” application.
We start by importing our dataset and inserting our algorithms.
We then generate our prediction functions from our models and algorithms.
Finally, we generate the HTML file that provides the layout and the form containing the
fields for the input values, and we display the results of the prediction.
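A prediction function of the kind described above might look like the following sketch. Everything in it is an assumption made for illustration: the tiny training table, the field names “marque” and “puissance”, the labels, and the helper name `predict_from_form`. The web layer (HTML form and routing) is deliberately left out, since the framework used is not stated.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical training rows: (brand, fiscal power, nature of the claim).
raw = [
    ("Volkswagen", 5, "materiel"), ("Peugeot", 6, "materiel"),
    ("Renault", 4, "corporel"), ("Volkswagen", 4, "materiel"),
    ("Renault", 5, "corporel"), ("Peugeot", 5, "materiel"),
]

# Encode the categorical field; unseen brands are mapped to -1 instead of raising an error.
brand_encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
brand_codes = brand_encoder.fit_transform([[r[0]] for r in raw]).ravel()

X = np.column_stack([brand_codes, [r[1] for r in raw]])
y = [r[2] for r in raw]
model = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)


def predict_from_form(form: dict) -> str:
    """Convert raw form fields (strings) into model input and return the predicted label."""
    brand_code = brand_encoder.transform([[form["marque"]]])[0, 0]
    power = float(form["puissance"])
    return str(model.predict([[brand_code, power]])[0])


print(predict_from_form({"marque": "Renault", "puissance": "4"}))
```

Reusing the encoder fitted on the training data inside the prediction function keeps the form inputs consistent with what the model saw during training.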
B. Monitoring and maintenance plan :
While monitoring our deployed solution and keeping it up to date, the problems
we encountered were the types of the data and the way the algorithm responds to them.
Figure 71

Figure 72
To avoid this kind of technical problem, we encoded our data according to the requirements of the
algorithm, whether the data is categorical or numerical.
Figure 73

We can acknowledge the influence of this operation by observing the effectiveness of our
model.
Figure 74
To summarize our project and the work that has been done, we started with the business
understanding: we aimed to meet our client's requirements and goal, which is reducing the ratio
of road accidents. This goal was divided into multiple objectives related to our initial goal.
The division of the initial goal resulted in three business objectives centered on identifying the
factors that cause a road accident.
The set of business objectives was then transformed into data mining goals, which
eventually produced two classes: an additional analysis and a predictive analysis.
Going through the data mining process, we decided to work with multiple algorithms in order
to obtain the optimal results for our goals.

The selection of the algorithms was not random: we tried multiple algorithms and
eventually chose the most suitable ones, which were K-Means and ACP
for the additional analysis and Random Forest for the predictive analysis.
The results were satisfying: the accuracy of the models met the requirements
set.
Moving on to the evaluation of our data mining results, we found that these results
met the success criteria of our business, and we were able to determine the dominant factors
of a road accident.
To conclude, the client set a primary goal, which was reducing the ratio of road accidents.
Throughout our work we followed a strict methodology and eventually produced
satisfying results from an analytical point of view.
XV. Dashboards :
Figure 75

Figure 76
- Interpretation
This graph shows that the year with the highest number of claims is 2018 (0.22M); the count
falls to 0.01M for 2016.
- Impact for the company
Identify the main causes of this reduction.
- Actions to take
The CGA should take the causes of this reduction into consideration.
Figure 77

- Interpretation
This graph shows that Volkswagen is the most damaged car brand; the number of claims then
decreases for the other brands.
- Impact on the business
These results allow insurance companies to identify the most damaged car brands.
- Actions to take
Apply an additional charge to the 4 most affected brands.
Figure 78
- Interpretation
This graph shows that the number of material claims is greater than the number of bodily claims.
It makes it possible to know the nature of the claims.
- Actions to take
Insurance companies should ask these customers to take the necessary precautions.

Figure 79
- Interpretation
This graph shows that Ami Assurance is the Tunisian insurance company with the highest
number of claims (about 52K), followed by the mutual insurer Takafel (around 260) and ASTREE
(180 claims).
- Impact for the company
This graph makes it possible to identify the insurance companies that have the
highest number of claims, in order to make the right decisions to minimize these claims.
- Actions to take
We must identify the main causes of this high number of claims and then take the necessary
decisions to stop this phenomenon.

Figure 80
Figure 81
- Interpretation
This graph shows that March is the most damaged month (56K claims), with February in
second place; the curve then decreases until month 10.
It makes it possible to identify the most disaster-stricken months.
- Actions to take
Insurance companies should advise their clients to drive safely.

Figure 82
- Interpretation
This graph shows that spring is the most damaged season and winter is in second place.
It makes it possible to identify the most disaster-stricken seasons.
- Actions to take
Look for the climatic and infrastructure factors that make spring the most disaster-stricken
season and take the necessary precautions.
Figure 83

- Interpretation
This graph shows that TUNIS is the most damaged place, with 338 claims, followed by
Nabeul with around 174 claims.
This graph makes it possible to identify the most affected places.
- Actions to take
Insurance companies must advise the customers who drive in the most disaster-stricken
places to drive safely.
Figure 84

Figure 85
- Interpretation
This graph shows the 5 most affected customers, with between 3.6K and 8.2K claims.
It makes it possible to know the most affected customers.
- Actions to take
Penalize these customers or change the contract formula with them.
Figure 86

- Interpretation
This graph shows that claims in which the customer's responsibility is 100% are the most
numerous (282.47K).
This graph makes it possible to know whether the customers are responsible for the accident or not.
Figure 87
- Interpretation
This graph shows that policyholders aged 44 are the customers who make the most claims
(around 6.5K), followed by policyholders aged 60.
It makes it possible to identify the age category of the policyholders who make the most claims.
- Actions to take
Insurance companies should ask these policyholders to drive cautiously.

Figure 88
- Interpretation
This graph shows that men make more claims than women, up to 205.46K.
This information helps the CGA to know which sex is most affected.
- Actions to take
Agencies should advise male policyholders to drive safely.
Figure 89

Figure 90
- Interpretation
This graph shows that Volkswagen is the most damaged car brand.
These results allow insurance companies to identify the most damaged car brands.
- Action to take
Apply an additional charge to the 5 most affected brands.
Figure 91
- Interpretation
This graph shows that cars with a fiscal power equal to 5 are the most damaged, with about
8.89K claims.
This graph makes it possible to identify the fiscal power of the most damaged cars.
- Action to take
Reformulate the contract by adding an additional charge for cars of fiscal power 5, 6 and 4.

Figure 92
- Interpretation
This graph shows that the most damaged vehicles are those running on Gasoil (diesel).
This graph makes it possible to identify the energy of the most damaged vehicles.
- Action to take
Insurance companies must advise the customers who have diesel vehicles to take the right
precautions.
Figure 93
- Interpretation
This graph shows that the most damaged vehicles have Tunisian registrations.
This graph makes it possible to identify the types of registrations of the most damaged vehicles.
- Action to take
Insurance companies must advise the customers who have vehicles with a Tunisian (TU)
registration to take the right precautions.
XVI. Web application

XVII. Conclusion :
Every year, more than 1,500 Tunisians lose their lives in car accidents and traffic
incidents, in addition to the material damage. For that reason, Tunisian insurance companies are
trying to reduce these events, which will also result in an increase in profit, by establishing a
decision-support system that will help them reach their goal.
