Sinistres routiers
Projet PI_BI
PROJET BIBLIOGRAPHIQUE
Prepared by :
FRIDHI AHMED
BEN FRAJ GHAILEN
BEDHIAFI RADHWEN
OKKEZ HADIL
GHABI YASSINE
CHAABENE YASSMINE
Supervised by : Mme Dorsaf Ben Hassen
Academic year 2020-2021
Table of contents :
I. Introduction
II. Problem statement
III. Market study
1. Advantages
2. Drawbacks
IV. Objective
V. Functionality
1. Functional requirements
2. Non-Functional requirements
VI. Methodology
a) Identification
b) Design
c) Implementation
d) Permanent improvement
VII. Data Model
3. Star schemas offer the following benefits
4. Star schema vs Snowflake schema
5. Constellation Model
6. DW sinistres routiers
VIII. Data integration with Talend
1. Dimension table « Assure »
2. Dimension table « contrat »
3. Dimension table « lieu »
4. Dimension table « vehicule »
5. Fact table
IX. Business Understanding
a) Business objectives
b) Assessing the situation
c) Data Mining objectives
d) Project Plan
X. Data Understanding
a) Collecting initial data
b) Describing data
c) Data Exploration
d) Verifying data quality
XI. Data Preparation
a) Selecting Data
b) Cleaning data
c) Construct required data
XII. Modeling
XIII. Evaluation
a) Evaluate the results
b) Review Process
c) Determine next steps
XIV. Deployment
XV. Dashboards
XVI. Web application
XVII. Conclusion
I. Introduction :
Tunisia had the second-worst traffic death rate per capita in North Africa. According to figures
reported by the National Observatory for Road Safety (ONSR), 5877 accidents took place in
2018, killing 1205 people and injuring 8869 others. These accidents cause great financial
losses for Tunisian insurance companies.
Figure 1
II. Problem statement :
Our society is witnessing a very high rate of tragedies and losses that affect people's
daily lives. One of the causes is road accidents, which produce shocking numbers
year after year, as the figure shows.
We question the effectiveness of the measures taken so far to reduce this
phenomenon, and we propose an effective solution.
Our approach is based on an analytical view of the data stored in the insurance companies'
databases, in order to recommend insightful actions.
III. Market study :
We acknowledge the efforts of the insurance companies and their
continual attempts to reduce the rate of road accidents through the National Institute of
Statistics (NIS), a public establishment under the Ministry of Development and
International Cooperation.
This establishment takes part in the production and analysis of official statistics in
Tunisia.
Figure 2
In addition, it makes it possible to visualize the desired data and analyses, such as the
evolution of the number of accidents by location, by cause, and by injuries and deaths.
Figure 3
Figure 4
Figure 5
1. Advantages :
Through these figures, we can see that any desired analysis is feasible. Each year
the general insurance committee produces a very detailed written report which presents the
evolution of road accidents in relation to several factors.
2. Drawbacks :
The report is detailed and provides a great deal of information, but it is
complicated to extract the insights needed to make the right decisions.
In addition, each report covers a single year, so the history is scattered across
several reports.
IV. Objective :
Figure 6
V. Functionality :
1. Functional requirements :
The functional specifications define the different requirements that the analytical system must
satisfy.
In our case, we describe these requirements in detail in order to build
a global use case diagram.
Our analytical system must satisfy these requirements:
• Allow the user to visualize the different classifications of clients and vehicles.
• Provide information that lets the insurance companies classify the behavior of clients.
• Provide the insurance companies with details on high-risk
vehicles.
• Identify the weak clauses in the contracts in order to improve them.
• Identify the locations with a high rate of road accidents.
• Provide an easy visualization of the analyzed data through a dashboard.
2. Non-Functional requirements :
Non-functional requirements describe how well a system should function. They refer to
the general qualities that provide a good user experience.
Our analytical system must satisfy these requirements:
Security: Ensure that the system is protected from unauthorized access.
Reliability: Ensure that the system works without failure.
Performance: Ensure that the system delivers the expected qualities, such as
responsiveness.
VI. Methodology :
GIMSI Definition
GIMSI stands for Generalization, Information, Method and Measurement, System and
Systemic, Individuality and Initiative.
It defines a cooperative methodological design framework in order to better formalize the
conditions for the success of a BI project centered on the dashboard.
a) Identification :
What is the context?
1-Company environment
Analysis of the economic environment and the company's strategy in order to define the
perimeter and scope of the project.
2-Identification of the company
Analysis of the structures of the company to identify the processes, activities and actors
involved.
b) Design :
What should be done?
3-Definition of objectives
Selection of tactical objectives for each team
4-Construction of the dashboard
Definition of the dashboard of each team
5-Choice of indicators
Choice of indicators according to the objectives chosen.
6-Collection of information
Identification of the information needed to construct the indicators.
7-The dashboard system
Construction of the dashboard system, control of overall consistency
c) Implementation :
How to do it?
8-The choice of software packages
Development of the selection grid for the choice of suitable software packages
9-Integration and deployment
Implementation of software packages, deployment to the company
d) Permanent improvement :
Does the system still meet expectations?
10-Audit
Continuous monitoring of the system
• Methodology choice :
We chose GIMSI because it helps to:
-Boost value creation across the organization (a transversal orientation).
-Place the needs of the decision-maker at the heart of the process, in order
to fully account for the risk-taking inherent in new ways of operating companies.
-Break down the wall that still exists between operational technological
solutions and user expectations.
VII. Data Model :
Figure 7 :
Star schemas offer the simplest structure for organizing data in a data warehouse. The center
of a star schema consists of one or more fact tables that index a series of dimension
tables. In our case, the fact table "fact sinistres" references the dimension tables
dim Assure, dim vehicule, dim lieu, dim date de sinistre and dim contrat.
3. Star schemas offer the following benefits :
• Queries are simpler: Because all of the data connects through the fact table, the
multiple dimension tables are treated as one large table of information, and that
makes queries simpler and easier to perform.
• Easier business insights reporting: Star schemas simplify the process of pulling
business reports like as-of-as and period-over-period reports.
• Better-performing queries: By removing the bottlenecks of a highly normalized
schema, query speed increases, and the performance of read-only commands
improves.
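To illustrate the first benefit, here is a minimal sketch in Python using the standard sqlite3 module. The table names, the columns montant, region and marque, and the sample rows are assumptions made for illustration; they are not the actual warehouse schema, which is shown in the figures.

```python
import sqlite3

# Hypothetical, simplified star schema: one fact table and two dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_lieu (id_lieu INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE dim_vehicule (codevehicule INTEGER PRIMARY KEY, marque TEXT);
CREATE TABLE fact_sinistres (
    id_lieu INTEGER REFERENCES dim_lieu(id_lieu),
    codevehicule INTEGER REFERENCES dim_vehicule(codevehicule),
    montant REAL
);
INSERT INTO dim_lieu VALUES (1, 'Tunis'), (2, 'Sfax');
INSERT INTO dim_vehicule VALUES (10, 'MarqueA'), (20, 'MarqueB');
INSERT INTO fact_sinistres VALUES (1, 10, 500.0), (1, 20, 300.0), (2, 10, 200.0);
""")

# Every dimension is exactly one join away from the fact table.
rows = conn.execute("""
    SELECT l.region, v.marque, SUM(f.montant) AS total
    FROM fact_sinistres AS f
    JOIN dim_lieu AS l ON l.id_lieu = f.id_lieu
    JOIN dim_vehicule AS v ON v.codevehicule = f.codevehicule
    GROUP BY l.region, v.marque
    ORDER BY l.region, v.marque
""").fetchall()
print(rows)
```

Adding another dimension to the report only adds one more join on the fact table, which is what keeps star-schema queries short.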
4. Star schema vs Snowflake schema :
1-Star schema dimension tables are not normalized, whereas snowflake schema dimension tables are
normalized.
2-Snowflake schemas will use less space to store dimension tables but are more complex.
3-Star schemas will only join the fact table with the dimension tables, leading to simpler,
faster SQL queries.
4-Snowflake schemas have no redundant data, so they're easier to maintain.
5-Snowflake schemas are good for data warehouses, star schemas are better for datamarts
with simple relationships.
5. Constellation Model :
A fact constellation is a schema for representing a multidimensional model. It is a collection of
multiple fact tables that share some common dimension tables. It can be viewed as a collection
of several star schemas and is therefore also known as a galaxy schema. It is one of the widely used
schemas for data warehouse design, and it is much more complex than the star and snowflake
schemas. Complex systems require fact constellations.
6. DW sinistres routiers :
Figure 8
VIII. Data integration with Talend:
1. Dimension table « Assure » :
Figure 9
Figure 10
tMSSqlInput 'assure': reads data and extracts fields based on a query from a Microsoft
SQL Server database or a Microsoft Azure SQL database.
tMap: transforms and routes data from one or more sources to one or more
destinations.
tUniqRow: compares the entries and removes duplicates from
the input stream.
tMSSqlOutput 'dimassure': inserts the incoming data into the database table dimassure.
getUsCity: randomly returns a city from a list of known cities.
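The job itself is built graphically in Talend, but the chain of components described above (read, map, deduplicate, write) can be sketched in plain Python. The sample rows and the columns nom and ville are hypothetical, and the output step simply collects rows in a list instead of writing to SQL Server.

```python
# Hypothetical source rows standing in for the 'assure' table.
source_rows = [
    {"codeassure": 1, "nom": "Ali", "ville": " tunis "},
    {"codeassure": 2, "nom": "Sana", "ville": "Sfax"},
    {"codeassure": 1, "nom": "Ali", "ville": " tunis "},  # duplicate entry
]


def map_row(row):
    """Mapping step (tMap analogue): keep and tidy the fields we need."""
    return {
        "codeassure": row["codeassure"],
        "nom": row["nom"].strip(),
        "ville": row["ville"].strip().title(),
    }


def unique_rows(rows, key):
    """Deduplication step (tUniqRow analogue): keep the first row per key."""
    seen = set()
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            yield row


# Output step (tMSSqlOutput analogue): here we only collect the rows.
dim_assure = list(unique_rows(map(map_row, source_rows), key="codeassure"))
print(dim_assure)
```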
2. Dimension table « contrat » :
Figure 11
Figure 12
Figure 13
Numeric.sequence: returns an incremented numeric identifier.
formatDate: returns a date expression formatted according to the specified date pattern.
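For readers more familiar with Python than with Talend routines, the two routines above behave roughly like the following sketch. The field names id_contrat and date_effet and the sample dates are hypothetical. Note also that Talend date patterns follow Java conventions, whereas Python uses strftime directives.

```python
from datetime import date
from itertools import count

# Analogue of Numeric.sequence: a counter that yields 1, 2, 3, ...
sequence = count(start=1)


def format_date(d, pattern="%d/%m/%Y"):
    """Analogue of formatDate: render a date according to a pattern."""
    return d.strftime(pattern)


contract_dates = [date(2020, 1, 15), date(2020, 11, 3)]  # hypothetical data
dim_contrat = [
    {"id_contrat": next(sequence), "date_effet": format_date(d)}
    for d in contract_dates
]
print(dim_contrat)
```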
3. Dimension table « lieu » :
Figure 14
4. Dimension table « vehicule » :
Figure 15
Figure 16
This job uses the tMap component to perform an inner join between the 'vehicule' table
and the 'marque' table, matching the code marque column of 'vehicule' with the code column of 'marque'.
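The effect of this inner join can be sketched in plain Python as follows. Only the join columns come from the description above; the other column names (codevehicule, libelle) and the sample rows are hypothetical.

```python
# Hypothetical rows; only the join columns are taken from the text.
vehicules = [
    {"codevehicule": 100, "code_marque": 1},
    {"codevehicule": 101, "code_marque": 2},
    {"codevehicule": 102, "code_marque": 9},  # no matching brand
]
marques = [
    {"code": 1, "libelle": "MarqueA"},
    {"code": 2, "libelle": "MarqueB"},
]

# Index the lookup table by its join key.
marque_by_code = {m["code"]: m for m in marques}

# Inner join: a vehicle is kept only if its brand code has a match.
dim_vehicule = [
    {**v, "marque": marque_by_code[v["code_marque"]]["libelle"]}
    for v in vehicules
    if v["code_marque"] in marque_by_code
]
print(dim_vehicule)
```

With an inner join, vehicle 102 is dropped because its brand code has no match in the lookup table.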
5. Fact table :
Figure 17
Figure 18
Figure 19
The dimension tables are connected to the fact table.
The primary key of the fact table is made up of foreign keys to the dimensions
(codePolice, codevehicule, codeassure, id_lieu, id_date, id_natureSinistre). The fact table also holds the
key metrics needed to build the main key performance indicators (KPIs),
for example 'pourcentage de responsabilité', 'calculsinistre'...
IX. Business Understanding
a) Business objectives
Throughout this project, we aim to satisfy our client, the General Council of Insurance.
The Council has set the following primary objectives:
-Identify the seasons in which road accidents occur.
-Identify the party responsible for a road accident.
-Identify the vehicles involved in road accidents.
b) Assessing the situation
In order to achieve our objectives, we need to list all resources available to our project.
The resources are:
-Personnel: We will need 6 engineers who are familiar with data mining.
-Data: The General Council of Insurance will provide us with the necessary data from all
insurance companies.
-Hardware: We will need six computers and access to the internet.
-Software: We will work with the Google Colaboratory application.
c) Data Mining objectives
The data mining objectives are divided into 2 classes.
We will perform an additional analysis with the following objectives:
-Determine in which season road accidents occur the most.
-Determine which attribute has the highest correlation with road accidents.
We will also perform a predictive analysis with the following objectives:
-Predict the nature of the road accident.
-Predict the rate of responsibility in a road accident.
-Predict the brands of the vehicles that frequently cause road accidents.
d) Project Plan
In order to achieve the business objectives and data mining objectives, we will need the
following tools:
-Talend Integration Tool: This tool allows us to integrate the data and produce the
dimension tables and the fact tables.
-Google Colaboratory: This tool allows us to analyze the datasets produced by the
integration phase with algorithms written in Python.
-Power BI: This tool allows us to visualize the extracted and analyzed datasets and
handles the reporting phase.
-PyCharm: This tool allows us to deploy our dashboard on a web application.
X. Data Understanding
a) Collecting initial data
The CGA, or General Council of Insurance, has provided us with the database containing
all the data required to achieve our objectives.
b) Describing data
The database was a SQL Server database with 10 tables.
Most of the tables had more than a thousand rows and multiple columns.
The attributes of the tables were of different types.
Figure 20
Figure 21
c) Data Exploration
After acquiring the data, we ran simple queries and
visualizations in order to discover relationships among the data. In doing so, we found an
association between tables that helped us achieve our objectives.
Figure 22
Figure 23
Figure 24
Figure 26
d) Verifying data quality :
1) The ratio of data to error :
The number of data errors was minimal.
2) Number of empty values :
The number of empty values was significant: roughly ten thousand values or more
were missing.
3) Data storage cost :
The data storage cost was acceptable, since the volume of the datasets was not significant.
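As an illustration of the second check, the number of empty values per column can be counted as in the sketch below. The records and column names are hypothetical, and treating both None and blank strings as empty is an assumption.

```python
from collections import Counter

# Hypothetical records; None and blank strings both count as empty here.
records = [
    {"codeassure": 1, "ville": "Tunis", "marque": None},
    {"codeassure": 2, "ville": "", "marque": "MarqueA"},
    {"codeassure": 3, "ville": None, "marque": None},
]


def is_empty(value):
    """Return True for None and for strings that contain only whitespace."""
    return value is None or (isinstance(value, str) and value.strip() == "")


# Count the empty cells column by column.
missing_per_column = Counter(
    column
    for record in records
    for column, value in record.items()
    if is_empty(value)
)
total_missing = sum(missing_per_column.values())
print(dict(missing_per_column), total_missing)
```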
Figure 25
Figure 27
XI. Data Preparation
a) Selecting Data
In order to achieve our objectives, we will need four tables from the database.
The “Assure” table.
The “Sinistres” table.
The “MarqueVehicules” table.
The “Compagnie” table.
We selected these four tables because they contain the data needed to achieve our
objectives.
Figure 28
Figure 29
b) Cleaning data :
In order to obtain the optimal dataset for analysis, we performed some cleaning operations.
These operations remove unnecessary and unused data in order to raise
the data quality.
Additionally, we used the amputation technique to remove unwanted data, as well as
transformations that make the data usable by the required algorithms.
Example of amputation:
Figure 30
Figure 31
Example of transformations:
c) Construct required data :
In order to give our analysis more depth, we generated a new dataset.
This dataset revolves around the date: we extracted the date of each road accident,
split it into new attributes, and generated new columns such as the “Saison” column.
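The derivation of date attributes described above can be sketched as follows. This is a minimal illustration, not the project's actual transformation: the month-to-season mapping, the season labels other than “hiver”, and the column names “mois” and “jour” are assumptions.

```python
from datetime import date

# Hypothetical month-to-season mapping (meteorological seasons).
# "hiver" appears in the report; the other three labels are assumed.
SEASON_BY_MONTH = {
    12: "hiver", 1: "hiver", 2: "hiver",
    3: "printemps", 4: "printemps", 5: "printemps",
    6: "ete", 7: "ete", 8: "ete",
    9: "automne", 10: "automne", 11: "automne",
}


def derive_date_attributes(accident_date: date) -> dict:
    """Split one accident date into the extra columns used for analysis."""
    return {
        "annee": accident_date.year,
        "mois": accident_date.month,    # assumed column name
        "jour": accident_date.day,      # assumed column name
        "Saison": SEASON_BY_MONTH[accident_date.month],
    }


# Two illustrative dates, not taken from the CGA database.
rows = [derive_date_attributes(d) for d in (date(2018, 1, 15), date(2018, 7, 3))]
print(rows)
```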
Figure 32
Figure 33
XII. Modeling
For the additional analysis, we have used three algorithms.
K-Means Algorithm.
AHC (hierarchical clustering) Algorithm.
ACP (principal component analysis, PCA) Algorithm.
1. K-Means Algorithm :
1. Selecting model technique and building model :
We start off by importing the dataset and eliminating the unused column.
We then apply the K-Means algorithm, initially choosing K = 3.
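A minimal sketch of this step with scikit-learn's `KMeans`, assuming that library was used (the report does not name it here). The data is synthetic, and `n_init` and `random_state` are illustrative choices; only K = 3 comes from the text.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the prepared dataset: three well-separated groups.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=center, scale=0.3, size=(50, 2))
    for center in ([0, 0], [5, 5], [0, 5])
])

# K = 3, as chosen in the report; the other parameters are assumptions.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(X)

print(np.bincount(labels))       # number of points assigned to each cluster
print(model.cluster_centers_)    # one centroid per cluster
```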
Figure 34
Figure 35
2. Assess Model :
The results show that the model suits our objective: it produced 3 clusters, which
satisfies the requirements.
Cluster (1) corresponds to the season “hiver” (winter).
The other clusters correspond to the other seasons.
We can conclude that “hiver” is the most common season for road accidents.
2. AHC Algorithm :
a) Selecting model technique and building model :
Figure 36
From this dendrogram, we can determine the number of classes to work with, which is
two.
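As an illustration of cutting a hierarchical tree into two classes, here is a small sketch using SciPy. The Ward linkage criterion and the synthetic data are assumptions; the report does not state which linkage or library was used.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Synthetic stand-in data: two well-separated groups of 30 points each.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=[0, 0], scale=0.4, size=(30, 2)),
    rng.normal(loc=[6, 6], scale=0.4, size=(30, 2)),
])

# Build the merge tree; Ward linkage is an assumed choice.
Z = linkage(X, method="ward")

# Cut the tree so that at most two classes remain (labels start at 1).
classes = fcluster(Z, t=2, criterion="maxclust")
print(np.unique(classes, return_counts=True))
```

The matrix `Z` is also what `scipy.cluster.hierarchy.dendrogram` takes as input to draw the tree.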
b) Assess model :
AHC did not perform as we wanted, so this model cannot be relied on; we proceed with
the K-Means results.
3. ACP Algorithm :
a) Selecting model technique and building model :
We start off by importing the dataset.
Figure 37
Figure 38
Figure 39
We proceed by preparing the data through multiple operations. We must explicitly center and
scale the variables in order to run a standardized ACP with PCA.
We use the StandardScaler class for this operation.
Figure 40
Centering and scaling put all the variables on the same scale, so that we eventually obtain a
variance/covariance matrix of comparable variables.
Figure 41
We then verify the properties of the transformed data.
For each variable, the mean is now zero and the standard deviation is one.
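The check described above can be reproduced on synthetic data with scikit-learn's `StandardScaler`. This is a sketch: the three columns and their scales are invented for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic columns with very different locations and scales.
rng = np.random.default_rng(2)
X = rng.normal(loc=[10.0, 200.0, -3.0], scale=[1.0, 50.0, 0.1], size=(100, 3))

# Center each column and divide it by its standard deviation.
Z = StandardScaler().fit_transform(X)

# After the transform, each column has mean ~0 and standard deviation ~1.
print(np.round(Z.mean(axis=0), 6))
print(np.round(Z.std(axis=0), 6))
```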
We then instantiate the PCA object and launch the computation.
Figure 42
We display the number of generated components, which is equal to 10.
Figure 43
We proceed with the eigenvalues and the scree plot: the property “explained_variance_” gives
the variance associated with each factorial axis.
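A minimal sketch of reading the eigenvalues from a fitted scikit-learn `PCA` object, on synthetic data with 10 columns to mirror the 10 components mentioned above. The values plotted in a scree plot are the entries of `explained_variance_`; the plot itself is omitted here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data: 10 columns, two of them strongly correlated.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 10))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)

Z = StandardScaler().fit_transform(X)
pca = PCA(svd_solver="full")
pca.fit(Z)

print(pca.n_components_)                 # number of generated components
eigenvalues = pca.explained_variance_    # variance carried by each axis
ratios = pca.explained_variance_ratio_   # same, as a share of the total
for k, (ev, share) in enumerate(zip(eigenvalues, ratios), start=1):
    print(f"axis {k}: variance={ev:.3f}  share={share:.1%}")
```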
Figure 44
b) Assess model :
We move on to the quality of the representation, using the squared cosine method.
To know how well the attributes are represented on the axes, we first need to
compute the squared distance of each attribute from the origin, which
corresponds to its contribution to the total inertia.
We can notice that “dateOuvertureSinistre” and “annee” are the most pertinent attributes.
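The squared cosine computation can be sketched as follows. The sketch makes an assumption: the points being assessed are the rows of a standardized matrix, whereas the report applies the idea to its attributes. The data is synthetic.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic standardized data: 50 points described by 4 variables.
rng = np.random.default_rng(4)
Z = StandardScaler().fit_transform(rng.normal(size=(50, 4)))

pca = PCA(svd_solver="full")
coords = pca.fit_transform(Z)             # coordinates on the factorial axes

# Squared distance of each point to the origin; summed over all points,
# this is each point's share of the (unnormalized) total inertia.
d2 = (Z ** 2).sum(axis=1)

# cos2[i, k]: share of point i's squared distance captured by axis k.
cos2 = coords ** 2 / d2[:, np.newaxis]
print(np.round(cos2[:3], 3))
```

When all axes are kept, the squared cosines of each point sum to 1, which gives a quick sanity check on the computation.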
Figure 45
Figure 46
We finish this task with the representation of the variables. To analyze the variables, we first
need the eigenvectors, which are provided by the “components_” field.
Figure 47
We can see from this visualization that “dateOuvertureSinistre” is indeed correlated
with “annee”. This global relation between variables is determined by “sinistre_id”.
For the predictive analysis, we have used two algorithms.
Random Forest Algorithm.
KNN Algorithm.
1. Random Forest Algorithm
• First Model (Predict the rate of responsibility of a road accident)
a) Selecting model technique :
We start off by importing the dataset.
Figure 48
In order to proceed with the Random Forest algorithm, we need to transform the categorical
data into numerical data.
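One simple way to turn a categorical column into numbers is label encoding, sketched below in plain Python. The helper name `fit_label_encoding` and the sample values are hypothetical; the report does not say which encoder was actually used.

```python
def fit_label_encoding(values):
    """Map each distinct category to an integer code (sorted for reproducibility)."""
    return {category: code for code, category in enumerate(sorted(set(values)))}


# Hypothetical categorical column; the values are illustrative only.
nature = ["materiel", "corporelle", "materiel", "materiel"]

mapping = fit_label_encoding(nature)
encoded = [mapping[value] for value in nature]
print(mapping, encoded)
```

Integer codes impose an arbitrary order on the categories. Tree-based models usually tolerate this, while distance-based models such as KNN may be better served by one-hot encoding.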
Figure 49
Transforming the dataset
Figure 50
b) Generating test design :
We split the data into a 35% test set and a 65% training set.
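A 65/35 split like this one can be sketched with scikit-learn's `train_test_split`. The arrays are placeholders and `random_state` is an arbitrary seed added for reproducibility.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder features and labels: 100 samples, 2 features.
X = np.arange(200).reshape(100, 2)
y = np.arange(100) % 2

# test_size=0.35 keeps 35% of the rows for testing and 65% for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, random_state=42
)
print(X_train.shape, X_test.shape)
```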
Figure 51
c) Build model :
Applying Random Forest.
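Fitting and evaluating a random forest can be sketched as follows, assuming scikit-learn's `RandomForestClassifier`. The data and target are synthetic, and the hyperparameters are illustrative defaults rather than the values tuned in the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic data: the target depends only on the sign of the first feature.
rng = np.random.default_rng(5)
X = rng.normal(size=(400, 4))
y = (X[:, 0] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, random_state=0
)

# n_estimators and random_state are assumed values for the illustration.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
y_pred = forest.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```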
Figure 52
Improving the performance of the Random Forest algorithm
Figure 53
d) Assess model :
Interpreting the results of the classification report and the confusion matrix.
Figure 54
The precision is high (0.80), which means that, of all the samples our model predicted as
positive, a large share were actually positive.
The recall is also high (0.80), which means our model found a large share of the
actual positive samples.
The accuracy is also high, which means our model correctly predicted the actual class
for most samples across all classes.
The F1 score (0.80) combines recall and precision in a single measure. It uses the
harmonic mean instead of the arithmetic mean, which penalizes extreme values more.
The support (13643) is the number of samples that truly belong to each class; it is high in
all classes.
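The relationship between these metrics can be made concrete with a short computation. The counts below are invented so that precision and recall both equal 0.80; they are not the project's confusion matrix.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    return precision, recall, f1


# Illustrative counts chosen so that precision = recall = 0.80.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
print(p, r, round(f1, 2))

# Harmonic vs arithmetic mean when one of the two metrics is extreme.
p2, r2 = 1.0, 0.1
harmonic = 2 * p2 * r2 / (p2 + r2)
arithmetic = (p2 + r2) / 2
print(round(harmonic, 3), round(arithmetic, 3))
```

In the second case the harmonic mean stays close to the weaker metric, while the arithmetic mean would hide it.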
• Second Model (Predict the nature of the road accident)
a) Selecting model technique :
We start off by importing the dataset.
In order to proceed with the Random Forest algorithm, we need to transform the categorical
data into numerical data.
Figure 56
Figure 55
Then, we continue with the visualization of the features.
Figure 57
We can notice that, when classified by “Sinistre_Id”, the attribute “Nature de sinistre
corporelle” is insignificant compared with “Nature de sinistre materiel”.
When classified by “Code_Assure”, the values of the attributes “Sinistre corporelle” and
“Sinistre materiel” are proportional, with a significant difference in values.
Figure 58
Figure 59
b) Generating test design :
We split the data into a 40% test set and a 60% training set.
Figure 60
c) Build model :
Applying Random Forest.
Figure 61
Figure 62
Our model has a precision score of 88.66% on the test set and 100% on the training set.
Figure 63
d) Assess model :
Interpreting the results of the classification report and the confusion matrix.
Figure 64
Figure 65
The precision is high in all classes, which means that, of all the samples our model
predicted as belonging to a class, a large share actually belonged to it.
The recall is also high in all classes, which means our model found a large share of the
actual members of each class.
The accuracy is also high, which means our model correctly predicted the actual class
for most samples across all classes.
The F1 score combines recall and precision in a single measure. It uses the harmonic mean
instead of the arithmetic mean, which penalizes extreme values more.
The support is the number of samples that truly belong to each class; it is high in all classes.
2. KNN Algorithm :
a) Selecting model technique :
In order to achieve our predictive goal, we have used the KNN Algorithm.
We start off by importing the dataset to work with.
Figure 66
In order to work with the KNN algorithm, we must encode the categorical data into numerical data.
Figure 67
b) Generating test design :
We split the data into a 35% test set and a 65% training set.
Figure 68
c) Build model :
We measure the accuracy of the algorithm in a loop over the number of neighbors K, from 1 to 30,
in order to determine the optimal value of K.
The optimal value for K is 20. We proceed with the algorithm using this value.
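The search over K can be sketched as follows with scikit-learn's `KNeighborsClassifier`. The data is synthetic, so the best K found here has no relation to the value of 20 reported above; scoring on the test set mirrors the description in the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data with a simple linear decision boundary.
rng = np.random.default_rng(6)
X = rng.normal(size=(300, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, random_state=0
)

# Try every K from 1 to 30 inclusive and record the accuracy of each.
scores = {}
for k in range(1, 31):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    scores[k] = knn.score(X_test, y_test)

best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

Choosing K on the same data used for the final evaluation tends to give an optimistic score; a separate validation split or cross-validation is a common alternative.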
Figure 69
d) Assess model :
In the assessment step, we interpret the results of our classification report.
Figure 70
The precision is high (0.98), which means that, of all the samples our model predicted as
positive, a large share were actually positive.
The recall is also high (0.99), which means our model found a large share of the
actual positive samples.
The accuracy is also high, which means our model correctly predicted the actual class
for most samples across all classes.
The F1 score (0.98) combines recall and precision in a single measure. It uses the harmonic
mean instead of the arithmetic mean, which penalizes extreme values more.
The support (6083) is the number of samples that truly belong to each class; it is
high.
XIII. Evaluation
a) Evaluate the results :
Considering our predictive goals and the high accuracy of the models created to achieve them,
we can say that the results satisfy our business criteria.
b) Review Process :
In the process of creating these models, we tried to build the optimal dataset.
The models created with different algorithms produced different accuracy and precision
values, so we chose the model that had the highest values and that respects our business
goals criteria.
Looking back at the phases of creating the optimal models, we note that it is necessary to try
different algorithms and to adjust the parameters of each algorithm.
c) Determine next steps :
We aim to improve our models by adjusting the algorithms and cleaning the datasets by
removing unnecessary data in order to enhance the precision and the effectiveness of our model.
XIV. Deployment
A. Deployment plan :
In order to deploy our solution, we used the “PyCharm” application.
We start off by importing our dataset and inserting our algorithms.
We then generate our prediction functions from our models and algorithms.
Finally, we generate the HTML file that provides the layout and the form containing the
fields for the input values, and we display the results of the prediction.
B. Monitoring and maintenance plan :
While monitoring our deployed solution and keeping it up to date, the problems we
encountered concerned the type of the data and how the algorithm responds to it.
Figure 71
Figure 72
To avoid this kind of technical problem, we encoded our data according to the requirements of
the algorithm, whether the data is categorical or numerical.
Figure 73
We can see the influence of this operation in the effectiveness of our
model.
Figure 74
To summarize our project and the work that has been done: we started with business
understanding, where we aimed to meet our client's requirements and goal, which is reducing the
ratio of road accidents. This goal was divided into multiple objectives related to our initial goal.
The division of the initial goal resulted in three business objectives, which concern identifying
the factors causing a road accident.
The business objectives were then translated into data mining goals, which
eventually produced two classes: an additional analysis and a predictive analysis.
Going through the data mining process, we decided to work with multiple algorithms in order
to obtain the optimal results for our goals.
The selection of the algorithms was not random: we tried multiple algorithms and
eventually chose the most suitable ones, which were K-Means and ACP
for the additional analysis and Random Forest for the predictive analysis.
The results were satisfying: the accuracy of the models met the requirements
set.
Moving on to the evaluation of our data mining results, we found that the preceding results
met the success criteria of our business, and we were able to determine the dominant factors
of a road accident.
To conclude, the client set a primary goal, which was reducing the ratio of road accidents.
Throughout our work we followed a strict methodology and eventually produced
satisfying results from an analytical point of view.
XV. Dashboards :
Figure 75
Figure 76
• Interpretation
From this graph we see that 2018 is the year with the highest number of claims (0.22M), and
that the number falls to 0.01M in 2016.
• Impact on the business
Identify the main causes of this reduction.
• Actions to take
The CGA should take the causes of this reduction into consideration.
Figure 77
• Interpretation
From this graph we can see that Volkswagen is the car brand with the most claims; the number
of claims then decreases for the following brands.
• Impact on the business
These results allow insurance companies to identify the car brands with the most claims.
• Actions to take
Apply an additional premium to the 4 most affected brands.
Figure 78
• Interpretation
From this graph we see that the number of material claims is greater than that of bodily claims.
• Impact on the business
This graph shows the nature of the claims.
• Actions to take
Insurance companies should ask these customers to take the necessary precautions.
Figure 79
• Interpretation
From this graph we see that the Tunisian insurance company Ami Assurance has the highest
number of claims (about 52K), followed by the mutual insurer Takafel (around 260) and ASTREE
(about 180 claims).
• Impact on the business
This graph makes it possible to single out the insurance companies with the highest number of
claims, in order to make the right decisions to minimize these claims.
• Actions to take
We must identify the main causes of this high number of claims and then take the necessary
decisions to stop this phenomenon.
Figure 80
Figure 81
• Interpretation
From this graph we see that March ranks first as the month with the most claims (56K), with
February in second place; the curve then decreases until month 10.
• Impact on the business
Identify the months with the most claims.
• Actions to take
Insurance companies should advise their clients to drive safely.
Figure 82
• Interpretation
This graph shows that spring is the season with the most claims, with winter in second place.
• Impact on the business
Identify the seasons with the most claims.
• Actions to take
Look for the climatic and infrastructure factors that make spring the season with the most
claims, and take the necessary precautions.
Figure 83
• Interpretation
From this graph we see that Tunis is the place with the most claims (338), with Nabeul in
second place (around 174 claims).
• Impact on the business
This graph makes it possible to identify the most affected places.
• Actions to take
Insurance companies should advise customers who drive in the most affected
places to drive safely.
Figure 84
Figure 85
• Interpretation
This graph shows the 5 customers with the most claims, ranging from 3.6K to 8.2K claims.
• Impact on the business
Know the most affected customers.
• Actions to take
Penalize these customers or change the contract formula with them.
Figure 86
• Interpretation
From this graph we see that claims with a customer responsibility percentage of 100 are the
most numerous (282.47K).
• Impact on the business
This graph shows whether the customers are responsible for the accident or not.
Figure 87
• Interpretation
This graph shows that policyholders aged 44 make the most claims (around 6.5K),
followed by policyholders aged 60.
• Impact on the business
Identify the age category of the policyholders who make the most claims.
• Actions to take
Insurance companies should ask these policyholders to drive cautiously.
Figure 88
• Interpretation
From this graph we see that men make more claims than women, up to 205.46K.
• Impact on the business
This information helps the CGA to know which sex is most affected.
• Actions to take
Agencies should advise male policyholders to drive safely.
Figure 89
Figure 90
• Interpretation
From this graph we can see that Volkswagen is the car brand with the most claims.
• Impact on the business
These results allow insurance companies to identify the car brands with the most claims.
• Action to take
Apply an additional premium to the 5 most affected brands.
Figure 91
• Interpretation
From this graph we see that cars with a fiscal power of 5 have the most claims (about
8.89K).
• Impact on the business
This graph makes it possible to identify the fiscal power of the cars with the most claims.
• Action to take
Reformulate the contract by adding an additional premium for cars of fiscal power 5, 6 and 4.
Figure 92
• Interpretation
From this graph we can see that the vehicles with the most claims run on diesel (“Gasoil”).
• Impact on the business
This graph makes it possible to identify the energy type of the vehicles with the most claims.
• Action to take
Insurance companies should advise customers who have diesel vehicles to take the right
precautions.
Figure 93
• Interpretation
From this graph we see that the vehicles with the most claims have Tunisian registrations.
• Impact on the business
This graph makes it possible to identify the registration types of the vehicles with the most claims.
• Action to take
Insurance companies should advise customers whose vehicles have a “Tu” type registration to
take the right precautions.
XVI. Web application
XVII. Conclusion :
Every year, more than 1,500 Tunisians lose their lives in car accidents and traffic
incidents, in addition to the material damage. For that reason, Tunisian insurance companies are
trying to reduce these events, which would also increase their profit, by establishing a
decision-support system that will help them reach their goal.

Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 

rapport-pi_bi-final (1).docx

  • 1. Sinistres routiers Projet PI_BI PROJET BIBLIOGRAPHIQUE Realized by : FRIDHI AHMED BEN FRAJ GHAILEN BEDHIAFI RADHWEN OKKEZ HADIL GHABI YASSINE CHAABENE YASSMINE Supervised by : Mme Dorsaf Ben Hassen Academic year 2020-2021
  • 2. Summary :
    I. Introduction
    II. Problem statement
    III. Market study
      1. Advantages
      2. Drawbacks
    IV. Objective
    V. Functionality
      1. Functional requirements
      2. Non-Functional requirements
    VI. Methodology
      a) Identification
      b) Design
      c) Implementation
      d) Permanent improvement
    VII. Data Model
      3. Star schemas offer the following benefits
      4. Star schema vs Snowflake schema
      5. Constellation Model
      6. DW sinistres routiers
    VIII. Data integration with Talend
      1. Dimension table « Assure »
      2. Dimension table « contrat »
      3. Dimension table « lieu »
      4. Dimension table « vehicule »
  • 3. Summary (continued) :
      5. Fact table
    IX. Business Understanding
      a) Business objectives
      b) Assessing the situation
      c) Data Mining objectives
      d) Project Plan
    X. Data Understanding
      a) Collecting initial data
      b) Describing data
      c) Data Exploration
      d) Verifying data quality
    XI. Data Preparation
      a) Selecting data
      b) Cleaning data
      c) Construct required data
    XII. Modeling
    XIII. Evaluation
      a) Evaluate the results
      b) Review process
      c) Determine next steps
    XIV. Deployment
    XV. Dashboards
    XVI. Web application
    XVII. Conclusion
  • 4-6. List of Figures : Figure 1 to Figure 93 (no captions listed).
  • 7. I. Introduction : Tunisia had the second-worst traffic death rate per capita in North Africa. According to figures reported by the National Observatory for Road Safety (ONSR), 5877 accidents took place in 2018, killing 1205 people and injuring 8869 others, which causes great financial losses for Tunisian insurance companies. Figure 1
    II. Problem statement : Our society suffers a very high number of tragedies and losses that affect people's daily lives. Road accidents are one of the causes, and they produce shocking numbers year after year, as Figure 1 shows. We question the effectiveness of the measures taken to reduce this phenomenon and propose an effective solution. Our approach is based on an analytical view of the data stored in the insurance companies' databases, in order to recommend well-founded actions.
  • 8. III. Market study : We acknowledge the efforts of the insurance companies and their continual attempts to reduce the rate of road accidents through the National Institute of Statistics (NIS), a public establishment under the Ministry of Development and International Cooperation. This establishment takes part in the production and analysis of official statistics in Tunisia. Figure 2
    In addition, it makes it possible to visualize the data and analyses of interest, such as the evolution of the number of accidents by location, by cause, and by injuries and deaths.
  • 10. Figure 5
    1. Advantages : These figures show that the desired analyses are feasible. Each year the general insurance committee publishes a very detailed written report that presents the evolution of road accidents in relation to several factors.
    2. Drawbacks : The report is detailed and provides a great deal of information, but it is difficult to extract from it the evidence needed to make the right decisions. In addition, each report covers a single year, so the history is scattered across separate documents.
  • 11. IV. Objective : Figure 6
    V. Functionality :
    1. Functional requirements : The functional specifications define the requirements that the analytical system must satisfy. In our case, we describe these requirements in detail in order to build a global use case diagram. Our analytical system must satisfy these requirements:
  • 12.
    • Allow the user to visualize the different classifications of clients and vehicles.
    • Provide information that lets the insurance companies classify the behavior of clients.
    • Provide the insurance companies with details on high-risk vehicles.
    • Identify the weak policies in the contracts so that they can be improved.
    • Identify the locations with a high rate of road accidents.
    • Provide an easy visualization of the analyzed data through a dashboard.
    2. Non-Functional requirements : Non-functional requirements describe how efficiently a system should function. They refer to the general qualities that provide a good user experience. Our analytical system must satisfy these requirements:
    Security: Ensure that the system is protected from unauthorized access.
    Reliability: Ensure that the system works without failure.
    Performance: Ensure that the system delivers the expected qualities, such as responsiveness.
    VI. Methodology : GIMSI
    Definition : GIMSI stands for Generalization, Information, Method and Measurement, System and Systemic, Individuality and Initiative. It defines a cooperative methodological design framework that formalizes the conditions for the success of a BI project centered on the dashboard.
  • 13. a) Identification : What is the context?
    1-Company environment: Analysis of the economic environment and the company's strategy in order to define the perimeter and scope of the project.
    2-Identification of the company: Analysis of the structures of the company to identify the processes, activities and actors involved.
    b) Design : What should be done?
    3-Definition of objectives: Selection of tactical objectives for each team.
    4-Construction of the dashboard: Definition of the dashboard of each team.
    5-Choice of indicators: Choice of indicators according to the objectives chosen.
    6-Collection of information: Identification of the information needed to construct the indicators.
    7-The dashboard system: Construction of the dashboard system, control of overall consistency.
    c) Implementation : How to do it?
  • 14. 8-The choice of software packages: Development of the selection grid for choosing suitable software packages.
    9-Integration and deployment: Implementation of the software packages, deployment to the company.
    d) Permanent (continuous) improvement : Does the system still meet expectations?
    10-Audit: Continuous monitoring of the system.
    Methodology choice :
    • Boost value creation in a cross-functional way.
    • Place the needs of the decision-making actor at the heart of the process, in order to fully account for the risk-taking inherent in new ways of operating companies.
    • Help break down the wall that still exists between operational technological solutions and user expectations.
  • 15. VII. Data Model : Figure 7
    Star schemas offer the simplest structure for organizing data in a data warehouse. The center of a star schema consists of one or more fact tables (here, "fact sinistres") that index a series of dimension tables (dim Assure, dim vehicule, dim lieu, dim date de sinistre, dim contrat).
    3. Star schemas offer the following benefits :
    • Queries are simpler: Because all of the data connects through the fact table, the multiple dimension tables are treated as one large table of information, which makes queries simpler and easier to perform.
    • Easier business insights reporting: Star schemas simplify the process of pulling business reports such as as-of and period-over-period reports.
  • 16.
    • Better-performing queries: By removing the bottlenecks of a highly normalized schema, query speed increases and the performance of read-only commands improves.
    4. Star schema vs Snowflake schema :
    1-Star schema dimension tables are not normalized; snowflake schema dimension tables are normalized.
    2-Snowflake schemas use less space to store dimension tables but are more complex.
    3-Star schemas only join the fact table with the dimension tables, leading to simpler, faster SQL queries.
    4-Snowflake schemas have no redundant data, so they are easier to maintain.
    5-Snowflake schemas are good for data warehouses; star schemas are better for data marts with simple relationships.
    5. Constellation Model : A fact constellation is a schema for representing a multidimensional model. It is a collection of multiple fact tables that share some common dimension tables. It can be viewed as a collection of several star schemas and is therefore also known as a galaxy schema. It is one of the widely used schemas for data warehouse design, and it is much more complex than the star and snowflake schemas. Complex systems require fact constellations.
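    The "simpler queries" benefit of a star schema can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for the project's SQL Server warehouse; the reduced tables, the `ville` and `montant` columns, and the sample rows are assumptions made for illustration, not the report's actual schema.

```python
import sqlite3

# Hypothetical, simplified star schema: one fact table and two dimensions.
# Table contents and the columns "ville" and "montant" are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_vehicule (codevehicule INTEGER PRIMARY KEY, marque TEXT);
CREATE TABLE dim_lieu (id_lieu INTEGER PRIMARY KEY, ville TEXT);
CREATE TABLE fact_sinistres (
    id_sinistre INTEGER PRIMARY KEY,
    codevehicule INTEGER REFERENCES dim_vehicule(codevehicule),
    id_lieu INTEGER REFERENCES dim_lieu(id_lieu),
    montant REAL
);
INSERT INTO dim_vehicule VALUES (1, 'MarqueA'), (2, 'MarqueB');
INSERT INTO dim_lieu VALUES (10, 'Tunis'), (20, 'Sfax');
INSERT INTO fact_sinistres VALUES
    (1, 1, 10, 1200.0), (2, 1, 20, 800.0), (3, 2, 10, 500.0);
""")


def claims_by_city(connection):
    """Return (city, claim count, total amount) using one fact-to-dimension join."""
    query = """
        SELECT l.ville, COUNT(*), SUM(f.montant)
        FROM fact_sinistres AS f
        JOIN dim_lieu AS l ON l.id_lieu = f.id_lieu
        GROUP BY l.ville
        ORDER BY l.ville
    """
    return connection.execute(query).fetchall()


result = claims_by_city(conn)
```

    Each additional dimension adds exactly one join against the fact table, whereas a snowflake schema would need a chain of joins through normalized sub-dimensions.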
  • 17. 6. DW sinistres routiers (road-accident claims data warehouse) : Figure 8
  • 18. VIII. Data integration with Talend :
    1. Dimension table « Assure » : Figure 9 Figure 10
  • 19. tMSSqlInput 'assure': Reads data and extracts fields based on a query from a Microsoft SQL Server database or a Microsoft Azure SQL database.
    tMap: Transforms and routes data from one or more sources to one or more destinations.
    tUniqRow: Compares the entries and removes duplicates from the input stream.
    tMSSqlOutput 'dimassure': Inserts the data into the dimassure database table, from which useful information can later be extracted.
    getUsCity: Randomly returns a city name from a list of known cities.
    2. Dimension table « contrat » : Figure 11
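    The read, map, de-duplicate and write chain described above can be sketched in plain Python. This is not Talend code; the field names (`code`, `nom`, `codeassure`) and the sample rows are hypothetical, and an in-memory list stands in for the SQL Server source and target.

```python
def map_row(row):
    """tMap-like step: rename and normalize fields (hypothetical column names)."""
    return {
        "codeassure": row["code"],
        "nom": row["nom"].strip().upper(),
    }


def unique_rows(rows, key):
    """tUniqRow-like step: keep only the first row seen for each key value."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


# Stand-in for the rows a tMSSqlInput component would read from 'assure'.
source_rows = [
    {"code": 1, "nom": " ali "},
    {"code": 2, "nom": "Sara"},
    {"code": 1, "nom": "Ali"},  # duplicate key, dropped by unique_rows
]

# Stand-in for the rows a tMSSqlOutput component would write to 'dimassure'.
dim_assure = unique_rows((map_row(r) for r in source_rows), key="codeassure")
```

    De-duplicating after the mapping step means rows are compared on their cleaned values, which matches the component order listed above.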
  • 20. Figure 12 Figure 13
    Numeric.sequence: Returns an incremented numeric identifier.
    formatDate: Returns a date expression formatted according to the specified date pattern.
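    The behavior of these two routines can be approximated in Python as follows. This is a sketch of the idea only: `make_sequence` and `format_date` are hypothetical helper names, and the pattern syntax is Python's strftime, whereas Talend routines take Java-style date patterns.

```python
from datetime import date
from itertools import count


def make_sequence(start=1, step=1):
    """Return a function that yields the next numeric identifier on each call."""
    counter = count(start, step)
    return lambda: next(counter)


def format_date(pattern, value):
    """Format a date with a strftime pattern (Talend uses Java-style patterns)."""
    return value.strftime(pattern)


next_contract_id = make_sequence()
ids = [next_contract_id() for _ in range(3)]
formatted = format_date("%Y-%m-%d", date(2021, 3, 5))
```

    A generated sequence of this kind is a common way to give each dimension row a surrogate key that does not depend on the source system's identifiers.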
  • 21. 3. Dimension table « lieu » : Figure 14
    4. Dimension table « vehicule » : Figure 15
  • 22. Figure 16
    This job uses the tMap component to perform an inner join between the 'vehicule' table and the 'marque' table, matching the code marque column of 'vehicule' against the code column of 'marque'.
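    The inner join performed by this job can be sketched in Python with a dictionary lookup. The key names (`code_marque`, `code`, `libelle`, `codevehicule`) and the sample rows are assumptions for illustration; the actual column names are those shown in the Talend job.

```python
def inner_join(vehicules, marques):
    """Join vehicle rows to brand rows where code_marque equals the brand code."""
    marque_by_code = {m["code"]: m for m in marques}
    joined = []
    for v in vehicules:
        marque = marque_by_code.get(v["code_marque"])
        if marque is not None:  # inner join: drop vehicles with no matching brand
            joined.append(
                {"codevehicule": v["codevehicule"], "marque": marque["libelle"]}
            )
    return joined


# Hypothetical sample data standing in for the 'vehicule' and 'marque' tables.
vehicules = [
    {"codevehicule": 1, "code_marque": "A"},
    {"codevehicule": 2, "code_marque": "B"},
    {"codevehicule": 3, "code_marque": "Z"},  # no matching brand, excluded
]
marques = [
    {"code": "A", "libelle": "MarqueA"},
    {"code": "B", "libelle": "MarqueB"},
]

dim_vehicule = inner_join(vehicules, marques)
```

    With an inner join, vehicles whose brand code is missing from the lookup table are silently dropped, so it is worth comparing row counts before and after the join.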
  • 23. 5. Fact table : Figure 17 Figure 18
  • 24. Figure 19
    The dimension tables are connected to the fact table. The primary key of the fact table is made up of foreign keys to the dimensions (codePolice, codevehicule, codeassure, id_lieu, id_date, id_natureSinistre). The fact table also holds the measures needed to build the main key performance indicators (KPIs), for example 'pourcentage de responsabilité' (percentage of responsibility) and 'calculsinistre'.
    IX. Business Understanding
    a) Business objectives : Throughout this project we aim to satisfy our client, the General Council of Insurance. The Council has set the following primary objectives:
    -Identify the seasons in which road accidents occur.
    -Identify the party responsible for a road accident.
    -Identify the vehicles involved in road accidents.
  • 25. b) Assessing the situation
    In order to achieve our objectives, we need to list all the resources available to our project:
    -Personnel: We will need 6 engineers who are familiar with data mining.
    -Data: The General Council of Insurance will provide us with the necessary data from all insurance companies.
    -Hardware: We will need six computers and access to the internet.
    -Software: We will work with the Google Colaboratory application.
    c) Data Mining objectives
    The data mining objectives are divided into 2 classes. We will perform an additional analysis with the following objectives:
    -Determine in which season road accidents occur the most.
    -Determine which attribute has the highest correlation with a road accident.
    We will also perform a predictive analysis with the following objectives:
    -Predict the nature of the road accident.
    -Predict the rate of responsibility in a road accident.
    -Predict the brand of the vehicles that frequently cause a road accident.
    d) Project Plan
    In order to achieve the business objectives and the data mining objectives, we will need the following tools:
    -Talend Integration Tool: This tool lets us integrate the data and produce the dimension tables and the fact table.
  • 26. -Google Colaboratory: This tool lets us analyze the datasets produced by the integration phase with algorithms written in Python.
    -Power BI: This tool lets us visualize the extracted and analyzed datasets and covers the reporting phase.
    -PyCharm: This tool lets us deploy our dashboard in a web application.
    X. Data Understanding
    a) Collecting initial data
    The CGA, or General Council of Insurance, has provided us with the database containing all the data required to achieve our objectives.
    b) Describing data
    The database was a SQL Server database with 10 tables. Most of the tables had more than a thousand rows and multiple columns. The attributes of the tables were of different types.
    Figure 20, Figure 21
  • 27. c) Data Exploration
    After acquiring the data, we ran simple queries and visualizations in order to discover relationships among the data. We found an aggregation between tables that helped us achieve our objectives.
    Figure 22, Figure 23, Figure 24
  • 28. Figure 26
    d) Verifying data quality:
    1) The ratio of data to errors: The number of data errors was minimal.
    2) Number of empty values: The number of empty values was significant, with more than ten thousand missing values.
    3) Data storage cost: The data storage cost was acceptable, as the volume of the datasets was not significant.
    Figure 25, Figure 27
  • 29. XI. Data Preparation
    a) Selecting Data
    In order to achieve our objectives, we will need four tables from the database:
    -The “Assure” table.
    -The “Sinistres” table.
    -The “MarqueVehicules” table.
    -The “Compagnie” table.
    These four tables were selected because they contain the data required to achieve our objectives.
    Figure 28, Figure 29
  • 30. b) Cleaning data:
    In order to obtain the optimal dataset for analysis, we performed several cleaning operations. These operations remove unnecessary and unused data and thereby raise the data quality. We used the amputation technique (deleting unwanted data) as well as transformations that make the data usable by the required algorithms.
    Example of amputation: Figure 30, Figure 31
  • 31. Example of transformations:
    c) Construct required data:
    In order to give our analysis more depth, we generated a new dataset built around the date. We extracted the date of each road accident, derived new attributes from it, and generated new columns such as the “Saison” column.
    Figure 32, Figure 33
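    Deriving a “Saison” column from an accident date can be sketched as follows. The report does not state how seasons were delimited, so the whole-month boundaries, the season labels other than “hiver”, and the attribute names `mois` and `jour` are assumptions made for illustration.

```python
from datetime import date


def saison(d: date) -> str:
    """Map a date to a season label using whole months (an assumption)."""
    if d.month in (12, 1, 2):
        return "hiver"
    if d.month in (3, 4, 5):
        return "printemps"
    if d.month in (6, 7, 8):
        return "ete"
    return "automne"


def date_attributes(d: date) -> dict:
    """Build the new date-related columns for one accident date."""
    return {"annee": d.year, "mois": d.month, "jour": d.day, "Saison": saison(d)}


row = date_attributes(date(2018, 3, 14))
```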
  • 32. XII. Modeling
    For the additional analysis, we used three algorithms:
    -K-Means algorithm.
    -AHC algorithm (agglomerative hierarchical clustering).
    -ACP algorithm (principal component analysis, PCA).
    1. K-Means Algorithm:
    1. Selecting model technique and building model:
    We start by importing the dataset and eliminating the unused column. We then apply the K-Means algorithm, initially with K equal to 3.
    Figure 34, Figure 35
  • 33. 2. Assess model:
    The results show that the model suits our objective: it produced 3 clusters, which satisfies the requirements. Cluster (1) corresponds to the season “hiver” (winter), and the other clusters correspond to the other seasons. We can therefore say that “hiver” is the most common season for road accidents.
    2. AHC Algorithm:
    a) Selecting model technique and building model:
    Figure 36
  • 34. From this dendrogram, we can determine the number of classes to work with, which is equal to two.
    b) Assess model:
    AHC did not perform as we wanted, so this model cannot be trusted. We proceed with the results of the K-Means algorithm.
    3. ACP Algorithm:
    a) Selecting model technique and building model:
    We start by importing the dataset.
    Figure 37, Figure 38
  • 35. Figure 39
    We proceed by preparing the data through multiple operations. To perform a standardized ACP with the PCA class, we must explicitly center and scale the variables. We use the StandardScaler class for this operation.
    Figure 40
  • 36. Centering and scaling put all the variables on the same scale, so that we eventually obtain a variance/covariance matrix.
    Figure 41
    We then verify the properties of the new data set: the mean is now zero and the standard deviation is now one.
    Instantiating the model and running the computation.
    Figure 42
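    The check described above (zero mean and unit standard deviation after centering and scaling) can be reproduced by hand on one column. This is a minimal sketch with made-up values; it uses the population standard deviation, and the helper name `standardize` is hypothetical.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Center a column on its mean and scale it by its population std."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("a constant column cannot be scaled")
    return [(v - mu) / sigma for v in values]


# Made-up values standing in for one numeric attribute.
z = standardize([2016, 2017, 2018, 2019])
z_mean, z_std = fmean(z), pstdev(z)
```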
  • 37. Displaying the number of generated components, which is equal to 10.
    Figure 43
    Moving on to the eigenvalues and the scree plot, the property “explained_variance” gives us the variance associated with each factorial axis.
    Figure 44
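    To read a scree plot, it helps to turn the variance of each axis into a share of the total. The sketch below shows that arithmetic on made-up eigenvalues; the numbers and the helper names are assumptions and do not come from the project's results.

```python
def explained_variance_ratio(eigenvalues):
    """Share of the total variance carried by each factorial axis."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def cumulative(ratios):
    """Running total of the shares, used to decide how many axes to keep."""
    out, running = [], 0.0
    for r in ratios:
        running += r
        out.append(running)
    return out


eigenvalues = [4.0, 3.0, 2.0, 1.0]  # made-up values for illustration
ratios = explained_variance_ratio(eigenvalues)
cumulated = cumulative(ratios)
```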
  • 38. b) Assess model:
    We now turn to the quality of the representation, using the squared cosine method. To measure how well each attribute is represented on the axes, we first compute the squared distance of each attribute from the origin, which corresponds to its contribution to the total inertia. We can notice that “dateOuvertureSinistre” and “annee” are the most pertinent attributes.
    Figure 45, Figure 46
  • 39. We finish this task with the representation of the variables. To analyze the variables we first need the eigenvectors, which are produced by the operation “champ.components”.
    Figure 47
    This visualization shows that “dateOuvertureSinistre” is indeed correlated with “annee”. This global relation between the variables is determined by “sinistre_id”.
    For the predictive analysis, we used two algorithms:
    -Random Forest algorithm.
    -KNN algorithm.
  • 40. 1. Random Forest Algorithm
    - First Model (Predict the rate of responsibility in a road accident)
    a) Selecting model technique:
    We start by importing the dataset.
    Figure 48
    In order to proceed with the Random Forest algorithm, we need to transform the categorical data into numerical data.
  • 41. Figure 49
    Transforming the dataset
    Figure 50
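    Transforming categorical data into numerical data can be as simple as assigning an integer code to each distinct label. The report does not say which encoder was used, so the sketch below is only one possible approach; the helper name, the column name `energie` and the value "Essence" are assumptions.

```python
def fit_label_encoder(values):
    """Assign an integer code to each distinct label, in sorted order."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}


# Hypothetical categorical column.
energie = ["Gasoil", "Essence", "Gasoil", "Essence"]
mapping = fit_label_encoder(energie)
encoded = [mapping[v] for v in energie]
```

    The mapping must be built once on the training data and reused as-is at prediction time, otherwise the same label could receive different codes.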
  • 42. b) Generating test design:
    We split the data into a 35% test set and a 65% training set.
    Figure 51
    c) Build model:
    Applying Random Forest.
    Figure 52
    Improving the performance of the Random Forest algorithm.
    Figure 53
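    The 35% / 65% split used for this first model can be illustrated with the standard library alone. This is a sketch of the idea and not the project's code: the helper `split_rows`, the fixed seed and the 100 dummy rows are assumptions.

```python
import random


def split_rows(rows, test_size=0.35, seed=0):
    """Shuffle a copy of the rows and cut it into (train, test) parts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


train_rows, test_rows = split_rows(range(100), test_size=0.35)
```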
  • 43. d) Assess model:
    Interpreting the results of the classification report and the confusion matrix.
    Figure 54
    The precision is high (equal to 0.80), which means that among all the samples our model predicted as positive, a large share were actually positive.
    The recall is also high (equal to 0.80), which means our model found a large share of the actual positive samples.
    The accuracy is also high, which means our model correctly predicted the actual class across all class types.
    The F1 score (equal to 0.80) measures recall and precision at the same time. It uses the harmonic mean instead of the arithmetic mean, which penalizes extreme values more.
    The support (equal to 13643), which is the number of true samples in each class, is high in all classes.
    - Second Model (Predict the nature of the road accident)
    a) Selecting model technique:
    We start by importing the dataset.
  • 44. In order to proceed with the Random Forest algorithm, we need to transform the categorical data into numerical data.
    Figure 56, Figure 55
  • 45. We then continue with the visualization of the features.
    Figure 57
    We can notice that the attribute “Nature de sinistre corporelle” is insignificant compared with “Nature de sinistre materiel” when grouped by “Sinistre_Id”.
    When grouped by “Code_Assure”, the values of the attributes “Sinistre corporelle” and “Sinistre materiel” are proportional, with a significant difference in magnitude.
    Figure 58
  • 46. Figure 59
    b) Generating test design:
    We split the data into a 40% test set and a 60% training set.
    Figure 60
  • 47. c) Build model:
    Applying Random Forest.
    Figure 61, Figure 62
  • 48. Our model has a precision score of 88.66% on the test set and 100% on the training set.
    Figure 63
    d) Assess model:
    Interpreting the results of the classification report and the confusion matrix.
  • 50. The precision is high in all classes, which means that among all the samples our model predicted as positive, a large share were actually positive.
    The recall is also high in all classes, which means our model found a large share of the actual positive samples.
    The accuracy is also high, which means our model correctly predicted the actual class across all class types.
    The F1 score measures recall and precision at the same time. It uses the harmonic mean instead of the arithmetic mean, which penalizes extreme values more.
    The support, which is the number of true samples in each class, is high in all classes.
    2. KNN Algorithm:
    a) Selecting model technique:
    In order to achieve our predictive goal, we used the KNN algorithm. We start by importing the dataset to work with.
    Figure 66
  • 51. In order to work with the KNN algorithm, we must encode the categorical data into numerical data.
    Figure 67
    b) Generating test design:
    We split the data into a 35% test set and a 65% training set.
    Figure 68
    c) Build model:
    We check the accuracy of the algorithm in a loop over the number of neighbors K, from 1 to 30, in order to determine the optimal value of K. The optimal value of K is 20. We proceed with the algorithm using this value.
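    The search for the optimal K can be sketched with a small hand-written KNN classifier. The toy points, the helper names and the reduced range of K (1 to 5 instead of 1 to 30, because the toy training set only has six points) are assumptions made for illustration; this is not the project's code.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the majority class among the k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test points whose predicted class is correct."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


# Toy data: two well-separated groups of points.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(0.5, 0.5), (5.5, 5.5)]
test_y = [0, 1]

# Score every candidate k and keep the one with the best accuracy.
scores = {k: accuracy(train_X, train_y, test_X, test_y, k) for k in range(1, 6)}
best = max(scores, key=scores.get)
```

    On real data the scores usually differ from one K to the next, and the retained K is the one with the highest accuracy on held-out data.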
  • 52. Figure 69
    d) Assess model:
    In the assessment step, we interpret the results using our classification report.
    Figure 70
  • 53. The precision is high (equal to 0.98), which means that among all the samples our model predicted as positive, a large share were actually positive.
    The recall is also high (equal to 0.99), which means our model found a large share of the actual positive samples.
    The accuracy is also high, which means our model correctly predicted the actual class across all class types.
    The F1 score (equal to 0.98) measures recall and precision at the same time. It uses the harmonic mean instead of the arithmetic mean, which penalizes extreme values more.
    The support, which is the number of true samples in each class, is high (equal to 6083).
    XIII. Evaluation
    a) Evaluate the results:
    Given our predictive goals and the high accuracy of the models built to reach them, we can say that the results satisfy our business criteria.
    b) Review process:
    While creating these models, we tried to build the optimal dataset. The models produced by the different algorithms gave different accuracy and precision values, so we chose the model that had the highest values and that respects our business criteria. Looking back at the modeling phases, the main lesson is that one must try different algorithms and adjust their parameters.
    c) Determine next steps:
    We aim to improve our models by adjusting the algorithms and by cleaning the datasets further, removing unnecessary data, in order to enhance the precision and effectiveness of our models.
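    The precision, recall and F1 values quoted in the classification reports of the Modeling section follow from counts of true positives, false positives and false negatives. The sketch below shows the arithmetic; the counts are made up (chosen so that the first case yields 0.80) and are not taken from the project's confusion matrices.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)  # share of predicted positives that are right
    recall = tp / (tp + fn)     # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Balanced case: precision, recall and F1 all come out close to 0.80.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=20)

# Unbalanced case: perfect precision but poor recall.
p2, r2, f1_low = precision_recall_f1(tp=20, fp=0, fn=80)
arithmetic_mean = (p2 + r2) / 2
```

    In the second case the arithmetic mean of precision and recall is 0.6 while the F1 score is only about 0.33, which illustrates how the harmonic mean penalizes extreme values.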
  • 54. XIV. Deployment
    A. Deployment plan:
    To deploy our solution, we used the “PyCharm” application. We start by importing our dataset and our algorithms. We then build our prediction functions from our models. Finally, we generate the HTML file that provides the layout and the form with the fields for the input values, and we display the results of the prediction.
    B. Monitoring and maintenance plan:
    While monitoring our deployed solution and keeping it up to date, the problems we encountered concerned the type of the data and how the algorithm responds to it.
    Figure 71
  • 55. Figure 72
    To avoid this kind of technical problem, we encoded our data according to the requirements of the algorithm, whether the data is categorical or numerical.
    Figure 73
  • 56. We can see the influence of this operation by observing the effectiveness of our model.
    Figure 74
    To summarize our project and the work carried out: we started with business understanding, where we aimed to meet our client's requirements and goal, which is to reduce the ratio of road accidents. This initial goal was divided into three business objectives, all related to identifying the factors that cause a road accident.
    The business objectives were then translated into data mining goals, which fall into two classes: an additional analysis and a predictive analysis.
    During the data mining process, we decided to work with multiple algorithms in order to obtain the optimal results for our goals.
  • 57. The selection of the algorithms was not random: we tried multiple algorithms and eventually chose the most suitable ones, which were K-Means and ACP for the additional analysis and Random Forest for the predictive analysis. The results were satisfying, and the accuracy of the models met the requirements set.
    Moving to the evaluation of our data mining results, we found that these results met the success criteria of our business, and we were able to determine the dominant factors of a road accident.
    To conclude, the client set a primary goal, which was to reduce the ratio of road accidents. Throughout our work we followed a strict methodology and eventually produced satisfying results from an analytical point of view.
    XV. Dashboards:
    Figure 75
  • 58. Figure 76
    - Interpretation: From this graph we see that the year with the highest number of claims is 2018 (0.22M); the number drops to 0.01M for 2016.
    - Impact for the company: Identify the main causes of this reduction.
    - Actions to take: The CGA takes the causes of this reduction into consideration.
    Figure 77
  • 59. - Interpretation: From this graph we can see that Volkswagen is the car brand with the most claims; the number then decreases for the other brands.
    - Impact on the business: These results allow insurance companies to identify the car brands with the most claims.
    - Actions to take: Add an additional charge for the 4 most affected brands.
    Figure 78
    - Interpretation: From this graph we see that the number of material claims is greater than the number of bodily injury claims.
    - Impact on the business: This graph shows the nature of the claims.
    - Actions to take: Insurance companies ask these customers to take the necessary precautions.
  • 60. Figure 79
    - Interpretation: From this graph we see that Ami Assurance is the Tunisian insurance company with the highest number of claims (about 52K), followed by the mutual insurer Takafel (around 260) and ASTREE (180 claims).
    - Impact for the company: This graph makes it possible to single out the insurance companies with the highest number of claims, in order to make the right decisions to reduce these claims.
    - Actions to take: We must identify the main causes of this high number of claims and then take the necessary decisions to stop this phenomenon.
  • 61. Figure 80, Figure 81
    - Interpretation: From this graph we see that March is the month with the most claims (56K) and February is in 2nd place; the curve then decreases until month 10, which has the fewest claims.
    - Impact on the business: Identify the months with the most claims.
    - Actions to take: Insurance companies advise their clients to drive safely.
  • 62. Figure 82
    - Interpretation: This graph shows that spring is the season with the most claims and winter is in 2nd place.
    - Impact on the business: Identify the seasons with the most claims.
    - Actions to take: Look for the climatic and infrastructure factors that make spring the season with the most claims, and take the necessary precautions.
    Figure 83
  • 63. - Interpretation: From this graph we see that TUNIS is the place with the most claims (338), with Nabeul in second place (around 174 claims).
    - Impact on the business: This graph makes it possible to identify the most affected places.
    - Actions to take: Insurance companies must advise customers who drive in the most affected places to drive safely.
    Figure 84
  • 64. Figure 85
    - Interpretation: From this graph we see the 5 customers with the most claims, ranging from 3.6K to 8.2K claims.
    - Impact on the business: Know the most affected customers.
    - Actions to take: Penalize these customers or change the contract terms with them.
    Figure 86
  • 65. - Interpretation: From this graph we see that claims with a customer responsibility percentage of 100 are the most numerous (282.47K).
    - Impact on the business: This graph shows whether or not the customers are responsible for the accident.
    Figure 87
    - Interpretation: This graph shows that policyholders aged 44 make the most claims (around 6.5K), followed by policyholders aged 60.
    - Impact on the business: Identify the age category of the policyholders who make the most claims.
    - Actions to take: Insurance companies ask these policyholders to drive cautiously.
  • 66. Figure 88
    - Interpretation: From this graph we see that men make more claims than women, with up to 205.46K claims.
    - Impact on the business: This information helps the CGA to know which sex is most affected.
    - Actions to take: Agencies should advise male policyholders to drive safely.
    Figure 89
  • 67. Figure 90
    - Interpretation: From this graph we can see that Volkswagen is the car brand with the most claims.
    - Impact on the business: These results allow insurance companies to identify the car brands with the most claims.
    - Action to take: Add an additional charge for the 5 most affected brands.
    Figure 91
    - Interpretation: From this graph we see that cars with a fiscal horsepower of 5 have the most claims (about 8.89K).
    - Impact on the business: This graph makes it possible to identify the fiscal horsepower of the cars with the most claims.
    - Action to take: Revise the contract by adding an additional charge for cars with a fiscal horsepower of 5, 6 or 4.
  • 68. Figure 92
    - Interpretation: From this graph we can see that the vehicles with the most claims run on diesel (“Gasoil”).
    - Impact on the business: This graph makes it possible to identify the energy type of the vehicles with the most claims.
    - Action to take: Insurance companies must advise customers who have diesel vehicles to take the right precautions.
    Figure 93
    - Interpretation: From this graph we see that the vehicles with the most claims have Tunisian registrations.
  • 69. - Impact on the business: This graph makes it possible to identify the registration types of the vehicles with the most claims.
    - Action to take: Insurance companies must advise customers who have vehicles of the imma Tu type to take the right precautions.
    XVI. Web application
  • 70. XVII. Conclusion:
    Every year more than 1,500 lives are lost in Tunisia due to car accidents and traffic incidents, in addition to the material damage. For that reason, Tunisian insurance companies are trying to reduce these events, which will also increase their profit, by establishing a decision support system that helps them reach this goal.