Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Actuariat de l’Assurance Non-Vie # 9
A. Charpentier (UQAM & Université de Rennes 1)
ENSAE ParisTech, Octobre 2015 - Janvier 2016.
http://freakonometrics.hypotheses.org
@freakonometrics 1
A Grab Bag of Ratemaking Topics
• collective model vs. individual model
• the high-dimensional case
• variable selection
• model selection
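Individual Model or Collective Model? The Tweedie Distribution
Consider a Tweedie distribution with variance function power $p \in (1, 2)$, mean $\mu$ and scale parameter $\phi$. It is a compound Poisson model, with
• $N \sim \mathcal{P}(\lambda)$ with $\lambda = \dfrac{\phi\,\mu^{2-p}}{2-p}$
• $Y_i \sim \mathcal{G}(\alpha, \beta)$ with $\alpha = -\dfrac{p-2}{p-1}$ and $\beta = \dfrac{\phi\,\mu^{1-p}}{p-1}$
Conversely, consider a compound Poisson model with $N \sim \mathcal{P}(\lambda)$ and $Y_i \sim \mathcal{G}(\alpha, \beta)$. Then
• the variance function power is $p = \dfrac{\alpha+2}{\alpha+1}$
• the mean is $\mu = \dfrac{\lambda\alpha}{\beta}$
• the scale parameter is $\phi = \dfrac{[\lambda\alpha]^{\frac{\alpha+2}{\alpha+1}-1}\,\beta^{2-\frac{\alpha+2}{\alpha+1}}}{\alpha+1}$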
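As a quick sanity check of the two mappings above, the following R sketch converts Tweedie parameters into compound Poisson-Gamma parameters and back, then simulates the aggregate loss. The function names `tweedie_to_cp` and `cp_to_tweedie` and the numerical values are illustrative assumptions, not part of the original slides. Here $\beta$ is read as the rate of the Gamma distribution, which is what the formula $\mu = \lambda\alpha/\beta$ implies.

```r
# Tweedie (p, mu, phi) -> compound Poisson-Gamma (lambda, alpha, beta),
# with the parametrization written on the slide (beta is a Gamma rate).
tweedie_to_cp <- function(p, mu, phi) {
  stopifnot(p > 1, p < 2, mu > 0, phi > 0)
  list(lambda = phi * mu^(2 - p) / (2 - p),
       alpha  = (2 - p) / (p - 1),          # same as -(p - 2) / (p - 1)
       beta   = phi * mu^(1 - p) / (p - 1))
}

# Compound Poisson-Gamma (lambda, alpha, beta) -> Tweedie (p, mu, phi)
cp_to_tweedie <- function(lambda, alpha, beta) {
  p <- (alpha + 2) / (alpha + 1)
  list(p   = p,
       mu  = lambda * alpha / beta,
       phi = (lambda * alpha)^(p - 1) * beta^(2 - p) / (alpha + 1))
}

# Hypothetical parameter values, for illustration only
cp   <- tweedie_to_cp(p = 1.5, mu = 100, phi = 2)
back <- cp_to_tweedie(cp$lambda, cp$alpha, cp$beta)

# Simulate S = Y_1 + ... + Y_N and compare its empirical mean with mu
set.seed(1)
N <- rpois(10000, cp$lambda)
S <- sapply(N, function(m) sum(rgamma(m, shape = cp$alpha, rate = cp$beta)))
c(mu = back$mu, empirical_mean = mean(S))
```

The round trip should return the starting values of $p$, $\mu$ and $\phi$, and the simulated mean should be close to $\mu$.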
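Individual Model or Collective Model? Tweedie Regression
In the context of regression,
$N_i \sim \mathcal{P}(\lambda_i)$ with $\lambda_i = \exp[X_i^T \beta_\lambda]$
$Y_{j,i} \sim \mathcal{G}(\mu_i, \phi)$ with $\mu_i = \exp[X_i^T \beta_\mu]$
Then $S_i = Y_{1,i} + \cdots + Y_{N_i,i}$ has a Tweedie distribution, where
• the variance function power is $p = \dfrac{\phi+2}{\phi+1}$
• the mean is $\lambda_i \mu_i$
• the scale parameter is $\dfrac{\lambda_i^{\frac{1}{\phi+1}-1}}{\mu_i^{\frac{\phi}{\phi+1}}}\left(\dfrac{\phi}{1+\phi}\right)$
There are $1 + 2\dim(X)$ degrees of freedom.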
Individual Model or Collective Model? Tweedie Regression
Remark: the scale parameter should not depend on $i$.
A Tweedie regression has
• variance function power $p \in (1, 2)$
• mean $\mu_i = \exp[X_i^T \beta_{\text{Tweedie}}]$
• scale parameter $\phi$
There are $2 + \dim(X)$ degrees of freedom.
Two-Part Model: Frequency and Individual Cost
Consider the following datasets, for third-party liability (RC). For the claim frequency,
> freq = merge(contrat, nombre_RC)
for the individual claim costs,
> # keep RC claims with a strictly positive cost
> sinistre_RC = sinistre[(sinistre$garantie == "1RC") & (sinistre$cout > 0), ]
> sinistre_RC = merge(sinistre_RC, contrat)
and for the costs aggregated by policy,
> agg_RC = aggregate(sinistre_RC$cout, by = list(sinistre_RC$nocontrat), FUN = 'sum')
> names(agg_RC) = c('nocontrat', 'cout_RC')
> global_RC = merge(contrat, agg_RC, all.x = TRUE)
> # policies without any claim get a zero aggregate cost
> global_RC$cout_RC[is.na(global_RC$cout_RC)] = 0
Two-Part Model: Frequency and Individual Cost
> library(splines)
> # Poisson model for the claim frequency, with the exposure as offset
> reg_f = glm(nb_RC ~ zone + bs(ageconducteur) + carburant, offset = log(exposition), data = freq, family = poisson)
> # Gamma model for the individual claim costs
> reg_c = glm(cout ~ zone + bs(ageconducteur) + carburant, data = sinistre_RC, family = Gamma(link = "log"))
Single Model: Cost per Policy
> library(tweedie)
> library(statmod)
> # Tweedie model for the aggregate cost; link.power = 0 is the log link
> reg_a = glm(cout_RC ~ zone + bs(ageconducteur) + carburant, offset = log(exposition), data = global_RC, family = tweedie(var.power = 1.5, link.power = 0))
Comparing the Premiums
> # annual premiums: set the exposure to one year
> freq2 = freq
> freq2$exposition = 1
> P_f = predict(reg_f, newdata = freq2, type = "response")
> P_c = predict(reg_c, newdata = freq2, type = "response")
> prime1 = P_f * P_c
> k = 1.5
> reg_a = glm(cout_RC ~ zone + bs(ageconducteur) + carburant, offset = log(exposition), data = global_RC, family = tweedie(var.power = k, link.power = 0))
> prime2 = predict(reg_a, newdata = freq2, type = "response")
> # one arrow per policy, from the two-part premium to the Tweedie premium
> arrows(1:100, prime1[1:100], 1:100, prime2[1:100], length = .1)
Impact of the Tweedie Power on Pure Premiums
[Figure: comparison of pure premiums, policyholders no. 1, no. 2 and no. 3 (DO)]
'Optimizing' the Tweedie Parameter
> # deviance of the Tweedie regression as a function of the variance power k
> dev = function(k){
+   reg = glm(cout_RC ~ zone + bs(ageconducteur) + carburant, data = global_RC, family = tweedie(var.power = k, link.power = 0), offset = log(exposition))
+   reg$deviance
+ }
Ratemaking and Massive Data (Big Data)
Classical problems with massive data:
• many explanatory variables, $k$ large, $X^TX$ may not be invertible
• large volumes of data, e.g. telematics data
• non-quantitative data, e.g. text, location, etc.
The Fascination with Unbiased Estimators
In mathematical statistics, unbiased estimators are popular because they have several attractive properties. But could we not consider biased estimators, which are potentially better?
Consider an i.i.d. sample $\{y_1, \cdots, y_n\}$ with distribution $\mathcal{N}(\mu, \sigma^2)$. Define $\hat\theta = \alpha \overline{Y}$. What is the optimal $\alpha$ to get the best estimator of $\mu$?
• bias: $\text{bias}(\hat\theta) = E(\hat\theta) - \mu = (\alpha - 1)\mu$
• variance: $\text{Var}(\hat\theta) = \dfrac{\alpha^2\sigma^2}{n}$
• mse: $\text{mse}(\hat\theta) = (\alpha-1)^2\mu^2 + \dfrac{\alpha^2\sigma^2}{n}$
The optimal value is $\alpha^\star = \dfrac{\mu^2}{\mu^2 + \dfrac{\sigma^2}{n}} < 1$.
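The closed-form value of $\alpha^\star$ can be checked numerically. The R sketch below minimizes the mean squared error over $\alpha$ and compares the result with the formula. The values of `mu`, `sigma` and `n` are hypothetical and chosen only for illustration.

```r
# Hypothetical values, chosen only for illustration
mu <- 2; sigma <- 3; n <- 10

# Mean squared error of alpha * Ybar: squared bias + variance
mse <- function(alpha) (alpha - 1)^2 * mu^2 + alpha^2 * sigma^2 / n

# Closed-form minimizer from the slide
alpha_star <- mu^2 / (mu^2 + sigma^2 / n)

# Numerical minimizer over (0, 2), for comparison
alpha_num <- optimize(mse, interval = c(0, 2))$minimum

c(closed_form = alpha_star, numerical = alpha_num)
c(mse_unbiased = mse(1), mse_shrunk = mse(alpha_star))
```

Note that $\alpha^\star$ depends on the unknown $\mu$ and $\sigma^2$, so this shrinkage factor cannot be used directly in practice. It only shows that a biased estimator can have a smaller mean squared error than the unbiased one.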
Linear Model
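Consider some linear model $y_i = x_i^T\beta + \varepsilon_i$ for all $i = 1, \cdots, n$.
Assume that the $\varepsilon_i$ are i.i.d. with $E(\varepsilon) = 0$ (and finite variance). Write
$$\underbrace{\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}}_{y,\ n\times 1} = \underbrace{\begin{pmatrix} 1 & x_{1,1} & \cdots & x_{1,k} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n,1} & \cdots & x_{n,k} \end{pmatrix}}_{X,\ n\times(k+1)} \underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}}_{\beta,\ (k+1)\times 1} + \underbrace{\begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}}_{\varepsilon,\ n\times 1}.$$
Assuming $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$, the maximum likelihood estimator of $\beta$ is
$$\hat\beta = \text{argmin}\{\|y - X\beta\|_{\ell_2}^2\} = (X^TX)^{-1}X^Ty$$
... under the assumption that $X^TX$ is a full-rank matrix.
What if $X^TX$ cannot be inverted? Then $\hat\beta = [X^TX]^{-1}X^Ty$ does not exist, but $\hat\beta_\lambda = [X^TX + \lambda I]^{-1}X^Ty$ always exists if $\lambda > 0$.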
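To illustrate the last point, here is a minimal R sketch with a perfectly collinear design, so that $X^TX$ is rank deficient while the penalized system can still be solved. The simulated data and the value of `lambda` are assumptions made for the example.

```r
set.seed(1)
n  <- 20
x1 <- rnorm(n)
x2 <- 2 * x1                  # perfectly collinear with x1
X  <- cbind(1, x1, x2)        # n x 3 design matrix, of rank 2
y  <- 1 + x1 + rnorm(n)

XtX <- crossprod(X)           # t(X) %*% X
rank_XtX <- qr(XtX)$rank      # smaller than ncol(X): XtX is singular

# The ordinary least squares system has no unique solution here,
# but adding lambda * I makes the matrix positive definite.
lambda <- 0.1
beta_ridge <- solve(XtX + lambda * diag(ncol(X)), crossprod(X, y))
beta_ridge
```

Since $X^TX$ is positive semi-definite, every eigenvalue of $X^TX + \lambda I$ is at least $\lambda > 0$, which is why the penalized matrix is always invertible.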
Ridge Regression
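The estimator $\hat\beta = [X^TX + \lambda I]^{-1}X^Ty$ is the Ridge estimate, obtained as the solution of
$$\hat\beta = \underset{\beta}{\text{argmin}} \left\{ \sum_{i=1}^n [y_i - \beta_0 - x_i^T\beta]^2 + \lambda \underbrace{\|\beta\|_{\ell_2}^2}_{\mathbf{1}^T\beta^2} \right\}$$
for some tuning parameter $\lambda$. One can also write
$$\hat\beta = \underset{\beta;\ \|\beta\|_{\ell_2} \le s}{\text{argmin}} \{\|Y - X\beta\|_{\ell_2}^2\}$$
Remark: we solve $\hat\beta = \underset{\beta}{\text{argmin}}\{\text{objective}(\beta)\}$ where
$$\text{objective}(\beta) = \underbrace{\mathcal{L}(\beta)}_{\text{training loss}} + \underbrace{\mathcal{R}(\beta)}_{\text{regularization}}$$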
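The decomposition of the objective into a training loss and a regularization term can be written directly in R. The sketch below, on simulated and centered data (so that no intercept is needed, a simplifying assumption), checks that the closed-form Ridge estimate agrees with a generic numerical minimization of the objective. All names and values are illustrative.

```r
set.seed(1)
n <- 50; k <- 3
X <- scale(matrix(rnorm(n * k), n, k), center = TRUE, scale = FALSE)
y <- as.vector(X %*% c(1, -2, 0.5) + rnorm(n))
y <- y - mean(y)              # centered response, so no intercept is needed
lambda <- 5

# objective = training loss + regularization
loss      <- function(beta) sum((y - X %*% beta)^2)
penalty   <- function(beta) lambda * sum(beta^2)
objective <- function(beta) loss(beta) + penalty(beta)
gradient  <- function(beta) {
  as.vector(-2 * crossprod(X, y - X %*% beta) + 2 * lambda * beta)
}

# Closed-form Ridge estimate and generic numerical minimizer
beta_closed <- as.vector(solve(crossprod(X) + lambda * diag(k), crossprod(X, y)))
beta_optim  <- optim(rep(0, k), objective, gradient, method = "BFGS")$par

# Unpenalized least squares, for comparison of the norms
beta_ols <- as.vector(solve(crossprod(X), crossprod(X, y)))
rbind(closed_form = beta_closed, numerical = beta_optim, ols = beta_ols)
```

The Ridge coefficients have a smaller Euclidean norm than the least squares ones, which is the shrinkage effect of the penalty.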
Going further on sparsity issues
In several applications, $k$ can be (very) large, but a lot of features are just noise: $\beta_j = 0$ for many $j$'s. Let $s$ denote the number of relevant features, with $s \ll k$, cf. Hastie, Tibshirani & Wainwright (2015),
$$s = \text{card}\{\mathcal{S}\} \text{ where } \mathcal{S} = \{j;\ \beta_j \neq 0\}$$
The model is now $y = X_{\mathcal{S}}\beta_{\mathcal{S}} + \varepsilon$, where $X_{\mathcal{S}}^TX_{\mathcal{S}}$ is a full-rank matrix.
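Going further on sparsity issues
Define $\|a\|_{\ell_0} = \sum_i \mathbf{1}(|a_i| > 0)$. Here $\dim(\beta) = s$.
We wish we could solve
$$\hat\beta = \underset{\beta;\ \|\beta\|_{\ell_0} \le s}{\text{argmin}} \{\|Y - X\beta\|_{\ell_2}^2\}$$
Problem: it is usually not possible to enumerate all possible constraints, since $\binom{k}{s}$ subsets of coefficients should be considered here (with $k$ (very) large).
Idea: solve the dual problem
$$\hat\beta = \underset{\beta;\ \|Y - X\beta\|_{\ell_2}^2 \le h}{\text{argmin}} \{\|\beta\|_{\ell_0}\}$$
where we might convexify the $\ell_0$ norm, $\|\cdot\|_{\ell_0}$.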
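To get a feeling for the combinatorial problem, the R sketch below counts the candidate supports and runs an exhaustive best-subset search on a deliberately tiny simulated example. The sizes, coefficients and variable names are assumptions made for illustration; the search is only feasible because $k$ is very small here.

```r
# Number of candidate supports of size s among k features
n_small <- choose(20, 5)       # 15504 subsets
n_large <- choose(1000, 10)    # of order 1e23: exhaustive search is hopeless

# Exhaustive best-subset search on a tiny simulated problem (k = 6, s = 2)
set.seed(1)
n <- 100; k <- 6; s <- 2
X <- matrix(rnorm(n * k), n, k)
y <- 3 * X[, 2] - 2 * X[, 5] + rnorm(n)   # only features 2 and 5 matter

subsets <- combn(k, s)                     # one column per candidate support
rss <- apply(subsets, 2, function(S) {
  sum(lm.fit(X[, S, drop = FALSE], y)$residuals^2)
})
best <- subsets[, which.min(rss)]          # support with the smallest RSS
best
```

With only 15 candidate pairs the search is immediate, but the count grows so fast with $k$ that a convex surrogate of the $\ell_0$ constraint becomes necessary.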
Regularization $\ell_0$, $\ell_1$ and $\ell_2$
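Going further on sparsity issues
On $[-1, +1]^k$, the convex hull of $\|\beta\|_{\ell_0}$ is $\|\beta\|_{\ell_1}$.
On $[-a, +a]^k$, the convex hull of $\|\beta\|_{\ell_0}$ is $a^{-1}\|\beta\|_{\ell_1}$.
Hence,
$$\hat\beta = \underset{\beta;\ \|\beta\|_{\ell_1} \le \tilde{s}}{\text{argmin}} \{\|Y - X\beta\|_{\ell_2}^2\}$$
is equivalent (Kuhn-Tucker theorem) to the Lagrangian optimization problem
$$\hat\beta = \text{argmin}\{\|Y - X\beta\|_{\ell_2}^2 + \lambda\|\beta\|_{\ell_1}\}$$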
LASSO: Least Absolute Shrinkage and Selection Operator
$$\hat\beta \in \text{argmin}\{\|Y - X\beta\|_{\ell_2}^2 + \lambda\|\beta\|_{\ell_1}\}$$
is a convex problem (several algorithms are available, e.g. MM (majorize-minimize) and coordinate descent, see Hunter (2003)), but not strictly convex (the minimizer need not be unique). Nevertheless, the predictions $\hat{y} = x^T\hat\beta$ are unique.
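Coordinate descent, one of the algorithms mentioned above, can be sketched in a few lines of R. For the objective $\|Y - X\beta\|^2 + \lambda\|\beta\|_1$, each coordinate update is a soft-thresholding step with threshold $\lambda/2$. The function names `soft_threshold` and `lasso_cd`, the fixed number of sweeps and the simulated data are assumptions made for this illustration; this is not the implementation used by `glmnet`.

```r
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Cyclic coordinate descent for ||y - X b||^2 + lambda * ||b||_1
lasso_cd <- function(X, y, lambda, n_iter = 200) {
  beta <- rep(0, ncol(X))
  for (it in seq_len(n_iter)) {
    for (j in seq_len(ncol(X))) {
      # residual when feature j is left out of the fit
      r_j <- y - X[, -j, drop = FALSE] %*% beta[-j]
      # exact minimizer in beta_j, the other coordinates being fixed
      beta[j] <- soft_threshold(sum(X[, j] * r_j), lambda / 2) / sum(X[, j]^2)
    }
  }
  beta
}

objective <- function(beta, X, y, lambda) {
  sum((y - X %*% beta)^2) + lambda * sum(abs(beta))
}

set.seed(1)
n <- 100; k <- 5
X <- matrix(rnorm(n * k), n, k)
y <- as.vector(2 * X[, 1] - X[, 3] + rnorm(n))

beta_ols   <- as.vector(solve(crossprod(X), crossprod(X, y)))
beta_zero  <- lasso_cd(X, y, lambda = 0)    # no penalty: should match OLS
beta_lasso <- lasso_cd(X, y, lambda = 60)   # large penalty: weak coefficients
                                            # are typically set exactly to 0
beta_null  <- lasso_cd(X, y, lambda = 2 * max(abs(crossprod(X, y))))  # all zero
rbind(ols = beta_ols, lasso = beta_lasso)
```

A fixed number of sweeps keeps the sketch short; a production implementation would rather stop when the coefficients no longer change.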
Optimal LASSO Penalty
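Use cross validation, e.g. K-fold,
$$\hat\beta_{(-k)}(\lambda) = \text{argmin}\left\{\sum_{i \notin I_k} [y_i - x_i^T\beta]^2 + \lambda\|\beta\|_{\ell_1}\right\}$$
then compute the sum of the squared errors on the held-out fold,
$$Q_k(\lambda) = \sum_{i \in I_k} [y_i - x_i^T\hat\beta_{(-k)}(\lambda)]^2$$
and finally solve
$$\lambda^\star = \text{argmin}\left\{\overline{Q}(\lambda) = \frac{1}{K}\sum_k Q_k(\lambda)\right\}$$
Note that this might overfit, so Hastie, Tibshirani & Friedman (2009) suggest the largest $\lambda$ such that
$$\overline{Q}(\lambda) \le \overline{Q}(\lambda^\star) + \text{se}[\lambda^\star] \quad\text{with}\quad \text{se}[\lambda]^2 = \frac{1}{K^2}\sum_{k=1}^K [Q_k(\lambda) - \overline{Q}(\lambda)]^2$$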
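The procedure above can be sketched in base R as follows. The LASSO solver is a small coordinate descent routine written for the example (an assumption, not the `glmnet` implementation), the data are simulated without intercept, and the grid of penalties is arbitrary. The standard error follows the formula given on the slide.

```r
soft <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Coordinate descent for ||y - X b||^2 + lambda * ||b||_1 (illustrative solver)
lasso_cd <- function(X, y, lambda, n_iter = 100) {
  beta <- rep(0, ncol(X))
  for (it in seq_len(n_iter)) for (j in seq_len(ncol(X))) {
    r_j <- y - X[, -j, drop = FALSE] %*% beta[-j]
    beta[j] <- soft(sum(X[, j] * r_j), lambda / 2) / sum(X[, j]^2)
  }
  beta
}

set.seed(1)
n <- 120; k <- 8; K <- 5
X <- matrix(rnorm(n * k), n, k)
y <- as.vector(2 * X[, 1] - 1.5 * X[, 2] + rnorm(n))
fold <- sample(rep(1:K, length.out = n))          # fold label of each observation
lambdas <- exp(seq(log(0.1), log(400), length.out = 30))

# Q[kk, l]: squared prediction error on fold kk, fitted without fold kk,
# for the penalty lambdas[l]
Q <- sapply(lambdas, function(lam) sapply(1:K, function(kk) {
  test <- fold == kk
  b <- lasso_cd(X[!test, , drop = FALSE], y[!test], lam)
  sum((y[test] - X[test, , drop = FALSE] %*% b)^2)
}))

Q_bar <- colMeans(Q)                               # average over the K folds
se    <- sqrt(colSums(sweep(Q, 2, Q_bar)^2) / K^2) # standard error, as on the slide

i_min      <- which.min(Q_bar)
lambda_min <- lambdas[i_min]
# largest penalty whose error is within one standard error of the minimum
lambda_1se <- max(lambdas[Q_bar <= Q_bar[i_min] + se[i_min]])
c(lambda_min = lambda_min, lambda_1se = lambda_1se)
```

By construction the one-standard-error penalty is at least as large as the minimizing one, hence it gives a sparser or equally sparse model.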
> freq = merge(contrat, nombre_RC)
> freq = merge(freq, nombre_DO)
> freq[, 10] = as.factor(freq[, 10])
> # design matrix: three numeric covariates plus diesel and zone indicators
> mx = cbind(freq[, c(4, 5, 6)], freq[, 9] == "D", freq[, 3] %in% c("A", "B", "C"))
> colnames(mx) = c(names(freq)[c(4, 5, 6)], "diesel", "zone")
> # standardize each column
> for(i in 1:ncol(mx)) mx[, i] = (mx[, i] - mean(mx[, i])) / sd(mx[, i])
> names(mx)
[1] puissance agevehicule ageconducteur diesel zone
> library(glmnet)
> # LASSO path for the RC claim frequency, with the log-exposure as offset
> fit = glmnet(x = as.matrix(mx), y = freq[, 11], offset = log(freq[, 2]), family = "poisson")
> plot(fit, xvar = "lambda", label = TRUE)
[Figure: LASSO, RC claim frequency. Coefficient paths (variables 1, 3, 4, 5) against Log Lambda]
LASSO, RC claim frequency
> plot(fit, label = TRUE)
> # cross-validated choice of the penalty
> cvfit = cv.glmnet(x = as.matrix(mx), y = freq[, 11], offset = log(freq[, 2]), family = "poisson")
> plot(cvfit)
> cvfit$lambda.min
[1] 0.0002845703
> log(cvfit$lambda.min)
[1] -8.16453
• Cross validation curve + error bars
[Figure: coefficient paths (variables 1, 3, 4, 5) against the L1 Norm]
[Figure: cross-validation curve with error bars, Poisson Deviance against log(Lambda)]
> freq = merge(contrat, nombre_RC)
> freq = merge(freq, nombre_DO)
> freq[, 10] = as.factor(freq[, 10])
> # design matrix: three numeric covariates plus diesel and zone indicators
> mx = cbind(freq[, c(4, 5, 6)], freq[, 9] == "D", freq[, 3] %in% c("A", "B", "C"))
> colnames(mx) = c(names(freq)[c(4, 5, 6)], "diesel", "zone")
> # standardize each column
> for(i in 1:ncol(mx)) mx[, i] = (mx[, i] - mean(mx[, i])) / sd(mx[, i])
> names(mx)
[1] puissance agevehicule ageconducteur diesel zone
> library(glmnet)
> # LASSO path for the DO claim frequency, with the log-exposure as offset
> fit = glmnet(x = as.matrix(mx), y = freq[, 12], offset = log(freq[, 2]), family = "poisson")
> plot(fit, xvar = "lambda", label = TRUE)
[Figure: LASSO, DO claim frequency. Coefficient paths (variables 1, 2, 4, 5) against Log Lambda]
LASSO, DO claim frequency
> plot(fit, label = TRUE)
> # cross-validated choice of the penalty
> cvfit = cv.glmnet(x = as.matrix(mx), y = freq[, 12], offset = log(freq[, 2]), family = "poisson")
> plot(cvfit)
> cvfit$lambda.min
[1] 0.0004744917
> log(cvfit$lambda.min)
[1] -7.653266
• Cross validation curve + error bars
[Figure: coefficient paths (variables 1, 2, 4, 5) against the L1 Norm]
[Figure: cross-validation curve with error bars, Poisson Deviance against log(Lambda)]
Model Selection and Gini/Lorenz (on incomes)
Consider an ordered sample $\{y_1, \cdots, y_n\}$; then the Lorenz curve is
$$\{F_i, L_i\} \text{ with } F_i = \frac{i}{n} \text{ and } L_i = \frac{\sum_{j=1}^{i} y_j}{\sum_{j=1}^{n} y_j}$$
The theoretical curve, given a distribution $F$, is
$$u \mapsto L(u) = \frac{\int_{-\infty}^{F^{-1}(u)} t\,dF(t)}{\int_{-\infty}^{+\infty} t\,dF(t)}$$
see Gastwirth (1972, econpapers.repec.org)
Model Selection and Gini/Lorenz (on incomes)
> library(ineq)
> set.seed(1)
> (x <- sort(rlnorm(5, 0, 1)))
[1] 0.4336018 0.5344838 1.2015872 1.3902836 4.9297132
> # empirical Lorenz curve
> Lc.sim <- Lc(x)
> plot(Lc.sim)
> points((1:4)/5, (cumsum(x)/sum(x))[1:4], pch = 19, col = "blue")
> # theoretical Lorenz curve of the lognormal distribution
> lines(Lc.lognorm, parameter = 1, lty = 2)
[Figure: Lorenz curve, L(p) against p]
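The points of the empirical Lorenz curve can also be computed without the `ineq` package. The base R sketch below uses the five values printed on the slide and, for comparison, the standard closed form of the lognormal Lorenz curve, $L(u) = \Phi(\Phi^{-1}(u) - \sigma)$. The function name `lorenz_lognormal` is an assumption made for the example.

```r
# Sample printed on the slide (five lognormal draws, already sorted)
x <- sort(c(0.4336018, 0.5344838, 1.2015872, 1.3902836, 4.9297132))
n <- length(x)

F_i <- (1:n) / n               # cumulative share of the population
L_i <- cumsum(x) / sum(x)      # cumulative share of the total
empirical <- cbind(F_i, L_i)

# Theoretical Lorenz curve of a lognormal distribution with log-sd sigma
lorenz_lognormal <- function(u, sigma = 1) pnorm(qnorm(u) - sigma)
theoretical <- lorenz_lognormal(F_i[-n])

empirical
```

The first four rows of `empirical` are the blue points added to the plot in the code above.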
Model Selection and Gini/Lorenz (on incomes)
The Gini index is the ratio of the areas $\dfrac{A}{A+B}$. Thus,
$$G = \frac{2}{n(n-1)\overline{x}} \sum_{i=1}^{n} i\cdot x_{i:n} - \frac{n+1}{n-1} = \frac{1}{E(Y)}\int_0^\infty F(y)(1-F(y))\,dy$$
> Gini(x)
[1] 0.4640003
[Figure: Lorenz curve, L(p) against p, with the areas A and B]
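The sample formula can be evaluated directly in base R on the five values printed earlier. The sketch below computes the expression with the $n-1$ correction, as written above, and the same statistic without the correction. The uncorrected version is the one that reproduces the printed value 0.4640003, which suggests (an assumption on my part) that `Gini(x)` was called without finite-sample correction. Variable names are illustrative.

```r
x <- sort(c(0.4336018, 0.5344838, 1.2015872, 1.3902836, 4.9297132))
n <- length(x)

# Formula written on the slide, with the (n - 1) correction
G_slide <- 2 * sum((1:n) * x) / (n * (n - 1) * mean(x)) - (n + 1) / (n - 1)

# Same statistic without the small-sample correction
G_plain <- 2 * sum((1:n) * x) / (n^2 * mean(x)) - (n + 1) / n

c(with_correction = G_slide, without_correction = G_plain)
```

The two versions differ by the factor $n/(n-1)$, which is negligible for a large portfolio but visible with only five observations.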
Comparing Models
Consider an ordered sample $\{y_1, \cdots, y_n\}$ of incomes, with $y_1 \le y_2 \le \cdots \le y_n$; then the Lorenz curve is
$$\{F_i, L_i\} \text{ with } F_i = \frac{i}{n} \text{ and } L_i = \frac{\sum_{j=1}^{i} y_j}{\sum_{j=1}^{n} y_j}$$
We have observed losses $y_i$ and premiums $\pi(x_i)$. Consider a sample ordered by the model, see Frees, Meyers & Cummins (2014), $\pi(x_1) \ge \pi(x_2) \ge \cdots \ge \pi(x_n)$, then plot
$$\{F_i, L_i\} \text{ with } F_i = \frac{i}{n} \text{ and } L_i = \frac{\sum_{j=1}^{i} y_j}{\sum_{j=1}^{n} y_j}$$
[Figure, two panels: Income (%) against Proportion (%), from poorest to richest; Losses (%) against Proportion (%), from more risky to less risky]
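The model-ordered curve can be sketched in base R on a simulated portfolio. Policies are sorted by decreasing premium, and the cumulative share of observed losses is accumulated in that order. The function name `ordered_lorenz`, the simulated losses and the two competing "models" are assumptions made purely for illustration.

```r
# Ordered Lorenz curve: policies sorted by decreasing premium, as on the slide
ordered_lorenz <- function(y, premium) {
  o <- order(premium, decreasing = TRUE)
  data.frame(F = seq_along(y) / length(y), L = cumsum(y[o]) / sum(y))
}

# Simulated portfolio (hypothetical, for illustration only)
set.seed(1)
n    <- 1000
risk <- rgamma(n, shape = 2, rate = 2)       # latent expected loss per policy
y    <- rpois(n, risk) * rexp(n, rate = 1)   # observed losses, with mean equal to risk

curve_informative <- ordered_lorenz(y, premium = risk)      # model ranking risks well
curve_random      <- ordered_lorenz(y, premium = runif(n))  # model with no information

# Average height of each curve: a better ranking concentrates losses early
c(informative = mean(curve_informative$L), random = mean(curve_random$L))
```

A model that ranks risks well puts most of the losses among the first policies, so its curve lies further above the diagonal than the curve of an uninformative model.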
Model Choice and Comparison in Ratemaking
See Frees et al. (2010) or Tevet (2013).
@freakonometrics 30

More Related Content

What's hot (20)

HdR
HdRHdR
HdR
 
Slides ineq-3b
Slides ineq-3bSlides ineq-3b
Slides ineq-3b
 
Slides erasmus
Slides erasmusSlides erasmus
Slides erasmus
 
Slides toulouse
Slides toulouseSlides toulouse
Slides toulouse
 
Slides erm
Slides ermSlides erm
Slides erm
 
Quantile and Expectile Regression
Quantile and Expectile RegressionQuantile and Expectile Regression
Quantile and Expectile Regression
 
Proba stats-r1-2017
Proba stats-r1-2017Proba stats-r1-2017
Proba stats-r1-2017
 
Slides ensae 11bis
Slides ensae 11bisSlides ensae 11bis
Slides ensae 11bis
 
Lundi 16h15-copules-charpentier
Lundi 16h15-copules-charpentierLundi 16h15-copules-charpentier
Lundi 16h15-copules-charpentier
 
Slides ensae 10
Slides ensae 10Slides ensae 10
Slides ensae 10
 
Graduate Econometrics Course, part 4, 2017
Graduate Econometrics Course, part 4, 2017Graduate Econometrics Course, part 4, 2017
Graduate Econometrics Course, part 4, 2017
 
Econometrics 2017-graduate-3
Econometrics 2017-graduate-3Econometrics 2017-graduate-3
Econometrics 2017-graduate-3
 
Classification
ClassificationClassification
Classification
 
Pricing Game, 100% Data Sciences
Pricing Game, 100% Data SciencesPricing Game, 100% Data Sciences
Pricing Game, 100% Data Sciences
 
Slides guanauato
Slides guanauatoSlides guanauato
Slides guanauato
 
Berlin
BerlinBerlin
Berlin
 
Slides amsterdam-2013
Slides amsterdam-2013Slides amsterdam-2013
Slides amsterdam-2013
 
Slides univ-van-amsterdam
Slides univ-van-amsterdamSlides univ-van-amsterdam
Slides univ-van-amsterdam
 
Inequality #4
Inequality #4Inequality #4
Inequality #4
 
Slides econ-lm
Slides econ-lmSlides econ-lm
Slides econ-lm
 

Viewers also liked (16)

Slides ensae-2016-6
Slides ensae-2016-6Slides ensae-2016-6
Slides ensae-2016-6
 
Slides ensae-2016-5
Slides ensae-2016-5Slides ensae-2016-5
Slides ensae-2016-5
 
Slides ensae-2016-7
Slides ensae-2016-7Slides ensae-2016-7
Slides ensae-2016-7
 
Slides ensae-2016-4
Slides ensae-2016-4Slides ensae-2016-4
Slides ensae-2016-4
 
Slides networks-2017-2
Slides networks-2017-2Slides networks-2017-2
Slides networks-2017-2
 
Slides simplexe
Slides simplexeSlides simplexe
Slides simplexe
 
Slides ads ia
Slides ads iaSlides ads ia
Slides ads ia
 
IA-advanced-R
IA-advanced-RIA-advanced-R
IA-advanced-R
 
15 03 16_data sciences pour l'actuariat_f. soulie fogelman
15 03 16_data sciences pour l'actuariat_f. soulie fogelman15 03 16_data sciences pour l'actuariat_f. soulie fogelman
15 03 16_data sciences pour l'actuariat_f. soulie fogelman
 
Slides 2040-6
Slides 2040-6Slides 2040-6
Slides 2040-6
 
Slides inequality 2017
Slides inequality 2017Slides inequality 2017
Slides inequality 2017
 
Julien slides - séminaire quantact
Julien slides - séminaire quantactJulien slides - séminaire quantact
Julien slides - séminaire quantact
 
Slides arbres-ubuntu
Slides arbres-ubuntuSlides arbres-ubuntu
Slides arbres-ubuntu
 
So a webinar-2013-2
So a webinar-2013-2So a webinar-2013-2
So a webinar-2013-2
 
Slides act6420-e2014-ts-1
Slides act6420-e2014-ts-1Slides act6420-e2014-ts-1
Slides act6420-e2014-ts-1
 
Slides 2040-4
Slides 2040-4Slides 2040-4
Slides 2040-4
 

Similar to Slides ensae-2016-9

Hands-On Algorithms for Predictive Modeling
Hands-On Algorithms for Predictive ModelingHands-On Algorithms for Predictive Modeling
Hands-On Algorithms for Predictive ModelingArthur Charpentier
 
Bayesian Experimental Design for Stochastic Kinetic Models
Bayesian Experimental Design for Stochastic Kinetic ModelsBayesian Experimental Design for Stochastic Kinetic Models
Bayesian Experimental Design for Stochastic Kinetic ModelsColin Gillespie
 
Machine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & InsuranceMachine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & InsuranceArthur Charpentier
 
A kernel-free particle method: Smile Problem Resolved
A kernel-free particle method: Smile Problem ResolvedA kernel-free particle method: Smile Problem Resolved
A kernel-free particle method: Smile Problem ResolvedKaiju Capital Management
 
Options Portfolio Selection
Options Portfolio SelectionOptions Portfolio Selection
Options Portfolio Selectionguasoni
 
Slides sales-forecasting-session2-web
Slides sales-forecasting-session2-webSlides sales-forecasting-session2-web
Slides sales-forecasting-session2-webArthur Charpentier
 
Research internship on optimal stochastic theory with financial application u...
Research internship on optimal stochastic theory with financial application u...Research internship on optimal stochastic theory with financial application u...
Research internship on optimal stochastic theory with financial application u...Asma Ben Slimene
 
Presentation on stochastic control problem with financial applications (Merto...
Presentation on stochastic control problem with financial applications (Merto...Presentation on stochastic control problem with financial applications (Merto...
Presentation on stochastic control problem with financial applications (Merto...Asma Ben Slimene
 
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920Karl Rudeen
 

Similar to Slides ensae-2016-9 (20)

Hands-On Algorithms for Predictive Modeling
Hands-On Algorithms for Predictive ModelingHands-On Algorithms for Predictive Modeling
Hands-On Algorithms for Predictive Modeling
 
Slides ACTINFO 2016
Slides ACTINFO 2016Slides ACTINFO 2016
Slides ACTINFO 2016
 
Bayesian Experimental Design for Stochastic Kinetic Models
Bayesian Experimental Design for Stochastic Kinetic ModelsBayesian Experimental Design for Stochastic Kinetic Models
Bayesian Experimental Design for Stochastic Kinetic Models
 
Slides ub-2
Slides ub-2Slides ub-2
Slides ub-2
 
Machine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & InsuranceMachine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & Insurance
 
A kernel-free particle method: Smile Problem Resolved
A kernel-free particle method: Smile Problem ResolvedA kernel-free particle method: Smile Problem Resolved
A kernel-free particle method: Smile Problem Resolved
 
Side 2019 #3
Side 2019 #3Side 2019 #3
Side 2019 #3
 
Slides ineq-4
Slides ineq-4Slides ineq-4
Slides ineq-4
 
Slides lyon-anr
Slides lyon-anrSlides lyon-anr
Slides lyon-anr
 
Slides ub-3
Slides ub-3Slides ub-3
Slides ub-3
 
Basic concepts and how to measure price volatility
Basic concepts and how to measure price volatility Basic concepts and how to measure price volatility
Basic concepts and how to measure price volatility
 
Options Portfolio Selection
Options Portfolio SelectionOptions Portfolio Selection
Options Portfolio Selection
 
kactl.pdf
kactl.pdfkactl.pdf
kactl.pdf
 
Slides mevt
Slides mevtSlides mevt
Slides mevt
 
Slides sales-forecasting-session2-web
Slides sales-forecasting-session2-webSlides sales-forecasting-session2-web
Slides sales-forecasting-session2-web
 
Research internship on optimal stochastic theory with financial application u...
Research internship on optimal stochastic theory with financial application u...Research internship on optimal stochastic theory with financial application u...
Research internship on optimal stochastic theory with financial application u...
 
Presentation on stochastic control problem with financial applications (Merto...
Presentation on stochastic control problem with financial applications (Merto...Presentation on stochastic control problem with financial applications (Merto...
Presentation on stochastic control problem with financial applications (Merto...
 
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
 
Project Paper
Project PaperProject Paper
Project Paper
 
PhDretreat2011
PhDretreat2011PhDretreat2011
PhDretreat2011
 

More from Arthur Charpentier (20)

Family History and Life Insurance
Family History and Life InsuranceFamily History and Life Insurance
Family History and Life Insurance
 
ACT6100 introduction
ACT6100 introductionACT6100 introduction
ACT6100 introduction
 
Family History and Life Insurance (UConn actuarial seminar)
Family History and Life Insurance (UConn actuarial seminar)Family History and Life Insurance (UConn actuarial seminar)
Family History and Life Insurance (UConn actuarial seminar)
 
Control epidemics
Control epidemics Control epidemics
Control epidemics
 
STT5100 Automne 2020, introduction
STT5100 Automne 2020, introductionSTT5100 Automne 2020, introduction
STT5100 Automne 2020, introduction
 
Family History and Life Insurance
Family History and Life InsuranceFamily History and Life Insurance
Family History and Life Insurance
 
Reinforcement Learning in Economics and Finance
Reinforcement Learning in Economics and FinanceReinforcement Learning in Economics and Finance
Reinforcement Learning in Economics and Finance
 
Optimal Control and COVID-19
Optimal Control and COVID-19Optimal Control and COVID-19
Optimal Control and COVID-19
 
Slides OICA 2020
Slides OICA 2020Slides OICA 2020
Slides OICA 2020
 
Lausanne 2019 #3
Lausanne 2019 #3Lausanne 2019 #3
Lausanne 2019 #3
 
Lausanne 2019 #4
Lausanne 2019 #4Lausanne 2019 #4
Lausanne 2019 #4
 
Lausanne 2019 #2
Lausanne 2019 #2Lausanne 2019 #2
Lausanne 2019 #2
 
Lausanne 2019 #1
Lausanne 2019 #1Lausanne 2019 #1
Lausanne 2019 #1
 
Side 2019 #10
Side 2019 #10Side 2019 #10
Side 2019 #10
 
Side 2019 #11
Side 2019 #11Side 2019 #11
Side 2019 #11
 
Side 2019 #12
Side 2019 #12Side 2019 #12
Side 2019 #12
 
Side 2019 #9
Side 2019 #9Side 2019 #9
Side 2019 #9
 
Side 2019 #8
Side 2019 #8Side 2019 #8
Side 2019 #8
 
Side 2019 #7
Side 2019 #7Side 2019 #7
Side 2019 #7
 
Side 2019 #6
Side 2019 #6Side 2019 #6
Side 2019 #6
 

Recently uploaded

The Economic History of the U.S. Lecture 17.pdf
The Economic History of the U.S. Lecture 17.pdfThe Economic History of the U.S. Lecture 17.pdf
The Economic History of the U.S. Lecture 17.pdfGale Pooley
 
The Economic History of the U.S. Lecture 18.pdf
The Economic History of the U.S. Lecture 18.pdfThe Economic History of the U.S. Lecture 18.pdf
The Economic History of the U.S. Lecture 18.pdfGale Pooley
 
Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024Bladex
 
VIP Call Girls Thane Sia 8617697112 Independent Escort Service Thane
VIP Call Girls Thane Sia 8617697112 Independent Escort Service ThaneVIP Call Girls Thane Sia 8617697112 Independent Escort Service Thane
VIP Call Girls Thane Sia 8617697112 Independent Escort Service ThaneCall girls in Ahmedabad High profile
 
06_Joeri Van Speybroek_Dell_MeetupDora&Cybersecurity.pdf
06_Joeri Van Speybroek_Dell_MeetupDora&Cybersecurity.pdf06_Joeri Van Speybroek_Dell_MeetupDora&Cybersecurity.pdf
06_Joeri Van Speybroek_Dell_MeetupDora&Cybersecurity.pdfFinTech Belgium
 
The Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdfThe Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdfGale Pooley
 
How Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingHow Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingAggregage
 
20240417-Calibre-April-2024-Investor-Presentation.pdf
20240417-Calibre-April-2024-Investor-Presentation.pdf20240417-Calibre-April-2024-Investor-Presentation.pdf
20240417-Calibre-April-2024-Investor-Presentation.pdfAdnet Communications
 
Stock Market Brief Deck for 4/24/24 .pdf
Stock Market Brief Deck for 4/24/24 .pdfStock Market Brief Deck for 4/24/24 .pdf
Stock Market Brief Deck for 4/24/24 .pdfMichael Silva
 
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure service
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure serviceCall US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure service
Call US 📞 9892124323 ✅ Kurla Call Girls In Kurla ( Mumbai ) secure servicePooja Nehwal
 
Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...
Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...
Independent Call Girl Number in Kurla Mumbai📲 Pooja Nehwal 9892124323 💞 Full ...Pooja Nehwal
 
Log your LOA pain with Pension Lab's brilliant campaign
Log your LOA pain with Pension Lab's brilliant campaignLog your LOA pain with Pension Lab's brilliant campaign
Log your LOA pain with Pension Lab's brilliant campaignHenry Tapper
 
(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(DIYA) Bhumkar Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Instant Issue Debit Cards - School Designs
Instant Issue Debit Cards - School DesignsInstant Issue Debit Cards - School Designs
Instant Issue Debit Cards - School Designsegoetzinger
 
VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130
VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130
VIP Call Girls Service Dilsukhnagar Hyderabad Call +91-8250192130Suhani Kapoor
 
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance CompanyInterimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance CompanyTyöeläkeyhtiö Elo
 
(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANIKA) Budhwar Peth Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Call Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Maya Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
00_Main ppt_MeetupDORA&CyberSecurity.pptx
00_Main ppt_MeetupDORA&CyberSecurity.pptx00_Main ppt_MeetupDORA&CyberSecurity.pptx
00_Main ppt_MeetupDORA&CyberSecurity.pptxFinTech Belgium
 

Recently uploaded (20)

The Economic History of the U.S. Lecture 17.pdf
The Economic History of the U.S. Lecture 17.pdfThe Economic History of the U.S. Lecture 17.pdf
The Economic History of the U.S. Lecture 17.pdf
 
The Economic History of the U.S. Lecture 18.pdf
The Economic History of the U.S. Lecture 18.pdfThe Economic History of the U.S. Lecture 18.pdf
The Economic History of the U.S. Lecture 18.pdf
 
Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024
 
VIP Call Girls Thane Sia 8617697112 Independent Escort Service Thane
VIP Call Girls Thane Sia 8617697112 Independent Escort Service ThaneVIP Call Girls Thane Sia 8617697112 Independent Escort Service Thane
VIP Call Girls Thane Sia 8617697112 Independent Escort Service Thane
 
06_Joeri Van Speybroek_Dell_MeetupDora&Cybersecurity.pdf
06_Joeri Van Speybroek_Dell_MeetupDora&Cybersecurity.pdf06_Joeri Van Speybroek_Dell_MeetupDora&Cybersecurity.pdf
06_Joeri Van Speybroek_Dell_MeetupDora&Cybersecurity.pdf
 
The Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdfThe Economic History of the U.S. Lecture 19.pdf

Slides ensae-2016-9

  • 1. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie (Non-Life Insurance Actuarial Science), # 9. A. Charpentier (UQAM & Université de Rennes 1), ENSAE ParisTech, October 2015 - January 2016. http://freakonometrics.hypotheses.org @freakonometrics
  • 2. Miscellany on Ratemaking
      • collective model vs. individual model
      • the high-dimensional case
      • variable selection
      • model selection
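  • 3. Individual model or collective model? The Tweedie distribution
      Consider a Tweedie distribution with variance function power $p \in (1, 2)$, mean $\mu$ and scale parameter $\phi$. It is a compound Poisson model with
      • $N \sim \mathcal{P}(\lambda)$ with $\lambda = \dfrac{\phi\mu^{2-p}}{2-p}$
      • $Y_i \sim \mathcal{G}(\alpha, \beta)$ with $\alpha = -\dfrac{p-2}{p-1}$ and $\beta = \dfrac{\phi\mu^{1-p}}{p-1}$
      Here $\beta$ is the rate of the Gamma distribution, so that $E[Y_i] = \alpha/\beta$.
      Conversely, consider a compound Poisson model with $N \sim \mathcal{P}(\lambda)$ and $Y_i \sim \mathcal{G}(\alpha, \beta)$. Then
      • variance function power is $p = \dfrac{\alpha+2}{\alpha+1}$
      • mean is $\mu = \dfrac{\lambda\alpha}{\beta}$
      • scale parameter is $\phi = \dfrac{[\lambda\alpha]^{\frac{\alpha+2}{\alpha+1}-1}\,\beta^{2-\frac{\alpha+2}{\alpha+1}}}{\alpha+1}$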
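The two parameterizations on this slide can be checked numerically. The sketch below is only an illustration in base R: the function names `tweedie_to_cp` and `cp_to_tweedie` are hypothetical helpers, not part of the course code, and the numerical values are arbitrary. It maps $(p, \mu, \phi)$ to $(\lambda, \alpha, \beta)$ and back, and also computes the compound Poisson variance $\lambda\alpha(\alpha+1)/\beta^2$, which under the formulas as written equals $\mu^p/\phi$.

```r
# Hypothetical helpers implementing the formulas of slide 3 as written.
tweedie_to_cp <- function(p, mu, phi) {
  list(lambda = phi * mu^(2 - p) / (2 - p),   # Poisson intensity
       alpha  = (2 - p) / (p - 1),            # Gamma shape
       beta   = phi * mu^(1 - p) / (p - 1))   # Gamma rate
}

cp_to_tweedie <- function(lambda, alpha, beta) {
  p <- (alpha + 2) / (alpha + 1)
  list(p   = p,
       mu  = lambda * alpha / beta,
       phi = (lambda * alpha)^(p - 1) * beta^(2 - p) / (alpha + 1))
}

# Arbitrary illustrative values
cp   <- tweedie_to_cp(p = 1.5, mu = 100, phi = 2)
back <- do.call(cp_to_tweedie, cp)

# Variance of the compound sum: lambda * E[Y^2] = lambda * alpha * (alpha + 1) / beta^2
var_S <- cp$lambda * cp$alpha * (cp$alpha + 1) / cp$beta^2
```

The round trip returns the initial $(p, \mu, \phi)$, which confirms that the two sets of formulas are inverse to each other.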
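  • 4. Individual model or collective model? Tweedie regression
      In the context of regression,
      $N_i \sim \mathcal{P}(\lambda_i)$ with $\lambda_i = \exp[X_i^T\beta_\lambda]$
      $Y_{j,i} \sim \mathcal{G}(\mu_i, \phi)$ with $\mu_i = \exp[X_i^T\beta_\mu]$
      where $\mu_i$ is the mean and $\phi$ the shape of the Gamma distribution. Then $S_i = Y_{1,i} + \cdots + Y_{N,i}$ has a Tweedie distribution:
      • variance function power is $p = \dfrac{\phi+2}{\phi+1}$
      • mean is $\lambda_i\mu_i$
      • scale parameter is $\dfrac{\lambda_i^{\frac{1}{\phi+1}}}{\mu_i^{\frac{\phi}{\phi+1}}}\cdot\dfrac{\phi}{1+\phi}$, obtained from the previous slide with $\alpha = \phi$ and $\beta = \phi/\mu_i$
      There are $1 + 2\dim(X)$ degrees of freedom.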
  • 5. Individual model or collective model? Tweedie regression
      Remark: the scale parameter should not depend on $i$.
      A Tweedie regression is specified by
      • variance function power $p \in (1, 2)$
      • mean $\mu_i = \exp[X_i^T\beta_{\text{Tweedie}}]$
      • scale parameter $\phi$
      There are $2 + \dim(X)$ degrees of freedom.
  • 6. Two-part model: frequency and individual cost
      Consider the following datasets for third-party liability (RC). For the frequency,
          freq = merge(contrat, nombre_RC)
      for the individual costs,
          sinistre_RC = sinistre[(sinistre$garantie == "1RC") & (sinistre$cout > 0), ]
          sinistre_RC = merge(sinistre_RC, contrat)
      and for the costs aggregated by policy,
          agg_RC = aggregate(sinistre_RC$cout, by = list(sinistre_RC$nocontrat), FUN = 'sum')
          names(agg_RC) = c('nocontrat', 'cout_RC')
          global_RC = merge(contrat, agg_RC, all.x = TRUE)
          # policies without any claim get an aggregate cost of zero
          global_RC$cout_RC[is.na(global_RC$cout_RC)] = 0
  • 7. Two-part model: frequency and individual cost
          library(splines)
          # Poisson regression for the claim frequency, with exposure as offset
          reg_f = glm(nb_RC ~ zone + bs(ageconducteur) + carburant, offset = log(exposition),
                      data = freq, family = poisson)
          # Gamma regression (log link) for the individual claim cost
          reg_c = glm(cout ~ zone + bs(ageconducteur) + carburant,
                      data = sinistre_RC, family = Gamma(link = "log"))
      Single model: cost per policy
          library(tweedie)
          library(statmod)
          # Tweedie regression (log link) for the aggregate cost per policy
          reg_a = glm(cout_RC ~ zone + bs(ageconducteur) + carburant, offset = log(exposition),
                      data = global_RC, family = tweedie(var.power = 1.5, link.power = 0))
  • 8. Comparing premiums
          freq2 = freq
          freq2$exposition = 1   # premiums for one unit of exposure
          P_f = predict(reg_f, newdata = freq2, type = "response")
          P_c = predict(reg_c, newdata = freq2, type = "response")
          prime1 = P_f * P_c

          k = 1.5
          reg_a = glm(cout_RC ~ zone + bs(ageconducteur) + carburant, offset = log(exposition),
                      data = global_RC, family = tweedie(var.power = k, link.power = 0))
          prime2 = predict(reg_a, newdata = freq2, type = "response")

          arrows(1:100, prime1[1:100], 1:100, prime2[1:100], length = .1)
  • 9. Impact of the Tweedie power on pure premiums
      [Figure: two panels plotted against the Tweedie power parameter.]
  • 10. Impact of the Tweedie power on pure premiums
      Comparison of pure premiums for policyholders no. 1, no. 2 and no. 3 (DO)
  • 11. ‘Optimization’ of the Tweedie parameter
          # deviance of the Tweedie regression as a function of the variance power k
          dev = function(k){
            reg = glm(cout_RC ~ zone + bs(ageconducteur) + carburant, data = global_RC,
                      family = tweedie(var.power = k, link.power = 0), offset = log(exposition))
            reg$deviance
          }
  • 12. Ratemaking and massive data (Big Data)
      Classical problems with massive data:
      • many explanatory variables, $k$ large, $X^TX$ may not be invertible
      • large volumes of data, e.g. telematics data
      • non-quantitative data, e.g. text, location, etc.
  • 13. The fascination with unbiased estimators
      In mathematical statistics, unbiased estimators are popular because they have several attractive properties. But could we not consider biased estimators, which are potentially better?
      Consider an i.i.d. sample $\{y_1, \cdots, y_n\}$ with distribution $\mathcal{N}(\mu, \sigma^2)$. Define $\hat\theta = \alpha\overline{Y}$. What is the optimal $\alpha$ to get the best estimator of $\mu$?
      • bias: $\text{bias}(\hat\theta) = E(\hat\theta) - \mu = (\alpha - 1)\mu$
      • variance: $\text{Var}(\hat\theta) = \dfrac{\alpha^2\sigma^2}{n}$
      • mse: $\text{mse}(\hat\theta) = (\alpha - 1)^2\mu^2 + \dfrac{\alpha^2\sigma^2}{n}$
      Setting the derivative of the mse with respect to $\alpha$ to zero, the optimal value is $\alpha^\star = \dfrac{\mu^2}{\mu^2 + \dfrac{\sigma^2}{n}} < 1$.
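As a quick numerical check of the closed form above, the following base R sketch minimizes the mse over $\alpha$ with `optimize()` and compares the result with $\alpha^\star$. The values of `mu`, `sigma` and `n` are arbitrary assumptions chosen for illustration, and `mse_alpha` is a hypothetical helper name.

```r
# mse of the shrunk estimator alpha * mean(Y), as on the slide
mse_alpha <- function(alpha, mu, sigma, n) {
  (alpha - 1)^2 * mu^2 + alpha^2 * sigma^2 / n
}

mu <- 2; sigma <- 3; n <- 10          # arbitrary illustrative values

alpha_star <- mu^2 / (mu^2 + sigma^2 / n)     # closed-form optimum
alpha_num  <- optimize(mse_alpha, interval = c(0, 2),
                       mu = mu, sigma = sigma, n = n)$minimum

mse_unbiased <- mse_alpha(1, mu, sigma, n)           # alpha = 1: the usual unbiased mean
mse_shrunk   <- mse_alpha(alpha_star, mu, sigma, n)  # biased, but smaller mse
```

In practice $\alpha^\star$ depends on the unknown $\mu$ and $\sigma$, so this is an argument for considering biased estimators rather than a ready-to-use estimator.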
  • 14. Linear Model
      Consider some linear model $y_i = x_i^T\beta + \varepsilon_i$ for all $i = 1, \cdots, n$. Assume that the $\varepsilon_i$ are i.i.d. with $E(\varepsilon) = 0$ (and finite variance). Write
      $$\underbrace{\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}}_{y,\ n\times 1} = \underbrace{\begin{pmatrix} 1 & x_{1,1} & \cdots & x_{1,k} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n,1} & \cdots & x_{n,k} \end{pmatrix}}_{X,\ n\times(k+1)} \underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}}_{\beta,\ (k+1)\times 1} + \underbrace{\begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}}_{\varepsilon,\ n\times 1}$$
      Assuming $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$, the maximum likelihood estimator of $\beta$ is
      $$\hat\beta = \text{argmin}\{\|y - X\beta\|_{\ell_2}^2\} = (X^TX)^{-1}X^Ty$$
      under the assumption that $X^TX$ is a full-rank matrix.
      What if $X^TX$ cannot be inverted? Then $\hat\beta = [X^TX]^{-1}X^Ty$ does not exist, but $\hat\beta_\lambda = [X^TX + \lambda I]^{-1}X^Ty$ always exists if $\lambda > 0$.
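  • 15. Ridge Regression
      The estimator $\hat\beta = [X^TX + \lambda I]^{-1}X^Ty$ is the Ridge estimate, obtained as the solution of
      $$\hat\beta = \underset{\beta}{\text{argmin}}\left\{\sum_{i=1}^n [y_i - \beta_0 - x_i^T\beta]^2 + \lambda\underbrace{\|\beta\|_{\ell_2}^2}_{\mathbf{1}^T\beta^2}\right\}$$
      for some tuning parameter $\lambda$. One can also write
      $$\hat\beta = \underset{\beta;\ \|\beta\|_{\ell_2} \le s}{\text{argmin}}\{\|Y - X\beta\|_{\ell_2}^2\}$$
      Remark: note that we solve $\hat\beta = \underset{\beta}{\text{argmin}}\{\text{objective}(\beta)\}$ where
      $$\text{objective}(\beta) = \underbrace{\mathcal{L}(\beta)}_{\text{training loss}} + \underbrace{\mathcal{R}(\beta)}_{\text{regularization}}$$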
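A minimal base R sketch of the ridge estimate on simulated data may help here. Everything in it is an assumption made for illustration: the data are simulated, the third covariate is built as an exact linear combination of the first two so that $X^TX$ is singular, and `ridge` is a hypothetical helper. Centering $y$ and the columns of $X$ is one way to leave the intercept $\beta_0$ unpenalized.

```r
set.seed(42)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2                       # exact collinearity: X'X is not invertible
X  <- cbind(x1, x2, x3)
y  <- 1 + 2 * x1 - x2 + rnorm(n)

# Center the data so that no intercept needs to be penalized
X <- scale(X, center = TRUE, scale = FALSE)
y <- y - mean(y)

XtX      <- crossprod(X)            # X'X
rank_XtX <- qr(XtX)$rank            # numerical rank, smaller than ncol(X)

# Ridge estimate [X'X + lambda I]^{-1} X'y
ridge <- function(X, y, lambda) {
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))
}

beta_ridge_1  <- ridge(X, y, lambda = 1)
beta_ridge_10 <- ridge(X, y, lambda = 10)   # stronger penalty, smaller norm
```

Calling `solve(XtX)` directly would be expected to fail or be numerically meaningless on this design, while the penalized system is well conditioned for any $\lambda > 0$.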
  • 16. Going further on sparsity issues
      In several applications, $k$ can be (very) large, but a lot of features are just noise: $\beta_j = 0$ for many $j$'s. Let $s$ denote the number of relevant features, with $s \ll k$, cf. Hastie, Tibshirani & Wainwright (2015),
      $$s = \text{card}\{\mathcal{S}\} \text{ where } \mathcal{S} = \{j;\ \beta_j \neq 0\}$$
      The model is now $y = X_{\mathcal{S}}\beta_{\mathcal{S}} + \varepsilon$, where $X_{\mathcal{S}}^TX_{\mathcal{S}}$ is a full-rank matrix.
  • 17. Going further on sparsity issues
      Define $\|a\|_{\ell_0} = \sum_i \mathbf{1}(|a_i| > 0)$. Here $\dim(\beta) = s$.
      We wish we could solve
      $$\hat\beta = \underset{\beta;\ \|\beta\|_{\ell_0} \le s}{\text{argmin}}\{\|Y - X\beta\|_{\ell_2}^2\}$$
      Problem: it is usually not possible to enumerate all possible constraints, since there are $\binom{k}{s}$ ways to choose the $s$ nonzero coefficients (with $k$ (very) large).
      Idea: solve the dual problem
      $$\hat\beta = \underset{\beta;\ \|Y - X\beta\|_{\ell_2} \le h}{\text{argmin}}\{\|\beta\|_{\ell_0}\}$$
      where we might convexify the $\ell_0$ norm, $\|\cdot\|_{\ell_0}$.
  • 18. Regularization $\ell_0$, $\ell_1$ and $\ell_2$
      [Figure]
  • 19. Going further on sparsity issues
      On $[-1, +1]^k$, the convex hull of $\|\beta\|_{\ell_0}$ is $\|\beta\|_{\ell_1}$.
      On $[-a, +a]^k$, the convex hull of $\|\beta\|_{\ell_0}$ is $a^{-1}\|\beta\|_{\ell_1}$.
      Hence,
      $$\hat\beta = \underset{\beta;\ \|\beta\|_{\ell_1} \le \tilde{s}}{\text{argmin}}\{\|Y - X\beta\|_{\ell_2}^2\}$$
      is equivalent (Kuhn-Tucker theorem) to the Lagrangian optimization problem
      $$\hat\beta = \text{argmin}\{\|Y - X\beta\|_{\ell_2}^2 + \lambda\|\beta\|_{\ell_1}\}$$
  • 20. LASSO: Least Absolute Shrinkage and Selection Operator
      $$\hat\beta \in \text{argmin}\{\|Y - X\beta\|_{\ell_2}^2 + \lambda\|\beta\|_{\ell_1}\}$$
      is a convex problem (several algorithms are available), but not a strictly convex one, so the minimizer need not be unique. Nevertheless, the predictions $\hat{y} = x^T\hat\beta$ are unique.
      Algorithms: MM (minimize by majorization), coordinate descent; see Hunter (2003).
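Coordinate descent for this objective can be sketched in a few lines of base R. This is an illustrative implementation on simulated data, not the algorithm used by `glmnet`: `soft_threshold` and `lasso_cd` are hypothetical helpers, the penalty level `lambda = 50` is arbitrary, and the objective is exactly $\|y - X\beta\|_{\ell_2}^2 + \lambda\|\beta\|_{\ell_1}$ (no factor $1/2$ or $1/n$), which is why the threshold is $\lambda/2$.

```r
# Soft-thresholding operator: shrink z towards 0 by t
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Cyclic coordinate descent for ||y - X b||^2 + lambda * ||b||_1
lasso_cd <- function(X, y, lambda, n_iter = 500, tol = 1e-10) {
  k      <- ncol(X)
  beta   <- rep(0, k)
  col_ss <- colSums(X^2)                      # x_j' x_j for each column
  for (it in seq_len(n_iter)) {
    beta_old <- beta
    for (j in seq_len(k)) {
      # partial residual, leaving coordinate j out
      r_j <- y - X[, -j, drop = FALSE] %*% beta[-j]
      beta[j] <- soft_threshold(sum(X[, j] * r_j), lambda / 2) / col_ss[j]
    }
    if (max(abs(beta - beta_old)) < tol) break
  }
  beta
}

# Simulated example (assumption): 5 standardized features, 2 of them relevant
set.seed(1)
n <- 200; k <- 5
X <- scale(matrix(rnorm(n * k), n, k))
y <- as.vector(X %*% c(2, 0, 0, -1, 0) + rnorm(n))
y <- y - mean(y)

lambda   <- 50
beta_hat <- lasso_cd(X, y, lambda)

# Gradient of the quadratic part, used to check the optimality conditions
grad <- as.vector(2 * crossprod(X, y - X %*% beta_hat))

# Smallest penalty for which every coefficient is exactly zero
lambda_max <- 2 * max(abs(crossprod(X, y)))
beta_null  <- lasso_cd(X, y, lambda_max + 1)
```

At the solution, `grad[j]` equals `lambda * sign(beta_hat[j])` on the active coordinates and is bounded by `lambda` in absolute value elsewhere, which is the Kuhn-Tucker characterization mentioned on the previous slide.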
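  • 21. Optimal LASSO Penalty
      Use cross validation, e.g. $K$-fold,
      $$\hat\beta_{(-k)}(\lambda) = \text{argmin}\left\{\sum_{i \notin \mathcal{I}_k} [y_i - x_i^T\beta]^2 + \lambda\|\beta\|_{\ell_1}\right\}$$
      then compute the sum of the squared errors,
      $$Q_k(\lambda) = \sum_{i \in \mathcal{I}_k} [y_i - x_i^T\hat\beta_{(-k)}(\lambda)]^2$$
      and finally solve
      $$\lambda^\star = \text{argmin}\left\{\overline{Q}(\lambda) = \frac{1}{K}\sum_k Q_k(\lambda)\right\}$$
      Note that this might overfit, so Hastie, Tibshirani & Friedman (2009) suggest the largest $\lambda$ such that
      $$\overline{Q}(\lambda) \le \overline{Q}(\lambda^\star) + \text{se}[\lambda^\star] \quad\text{with}\quad \text{se}[\lambda]^2 = \frac{1}{K^2}\sum_{k=1}^K [Q_k(\lambda) - \overline{Q}(\lambda)]^2$$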
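The selection rule above only needs the matrix of fold-wise errors $Q_k(\lambda)$. The base R sketch below applies it to a small hand-made matrix: the numbers in `Q` and the grid `lambdas` are made up for illustration, and `select_lambda` is a hypothetical helper (in `glmnet`, the analogous quantities are returned by `cv.glmnet` as `lambda.min` and `lambda.1se`).

```r
# Q: K x L matrix, Q[k, l] = squared error on fold k for lambdas[l]
select_lambda <- function(Q, lambdas) {
  K     <- nrow(Q)
  Q_bar <- colMeans(Q)                               # average error per lambda
  se    <- sqrt(colSums(sweep(Q, 2, Q_bar)^2) / K^2) # se[lambda] as defined above
  i_min <- which.min(Q_bar)
  ok    <- Q_bar <= Q_bar[i_min] + se[i_min]         # within one se of the minimum
  list(lambda_min = lambdas[i_min],
       lambda_1se = max(lambdas[ok]))
}

# Made-up fold errors for K = 3 folds and 4 candidate penalties
lambdas <- c(0.01, 0.1, 1, 10)
Q <- rbind(c(10,  9,  9.5, 20),
           c(12, 11, 11.2, 22),
           c(11, 10, 10.3, 21))

sel <- select_lambda(Q, lambdas)
```

On this toy matrix the average error is smallest at $\lambda = 0.1$, while $\lambda = 1$ is the largest penalty whose average error stays within one standard error of that minimum.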
  • 22. LASSO, RC frequency
          freq = merge(contrat, nombre_RC)
          freq = merge(freq, nombre_DO)
          freq[, 10] = as.factor(freq[, 10])
          mx = cbind(freq[, c(4, 5, 6)], freq[, 9] == "D", freq[, 3] %in% c("A", "B", "C"))
          colnames(mx) = c(names(freq)[c(4, 5, 6)], "diesel", "zone")
          # standardize every covariate
          for(i in 1:ncol(mx)) mx[, i] = (mx[, i] - mean(mx[, i])) / sd(mx[, i])
          names(mx)
          # [1] puissance agevehicule ageconducteur diesel zone
          library(glmnet)
          fit = glmnet(x = as.matrix(mx), y = freq[, 11], offset = log(freq[, 2]), family = "poisson")
          plot(fit, xvar = "lambda", label = TRUE)
      [Figure: LASSO coefficient paths against Log Lambda, RC frequency.]
  • 23. LASSO, RC frequency
          plot(fit, label = TRUE)
          cvfit = cv.glmnet(x = as.matrix(mx), y = freq[, 11], offset = log(freq[, 2]), family = "poisson")
          plot(cvfit)
          cvfit$lambda.min
          # [1] 0.0002845703
          log(cvfit$lambda.min)
          # [1] -8.16453
      • Cross validation curve + error bars
      [Figures: coefficient paths against the L1 norm; cross-validated Poisson deviance against log(Lambda).]
  • 24. LASSO, DO frequency
          freq = merge(contrat, nombre_RC)
          freq = merge(freq, nombre_DO)
          freq[, 10] = as.factor(freq[, 10])
          mx = cbind(freq[, c(4, 5, 6)], freq[, 9] == "D", freq[, 3] %in% c("A", "B", "C"))
          colnames(mx) = c(names(freq)[c(4, 5, 6)], "diesel", "zone")
          # standardize every covariate
          for(i in 1:ncol(mx)) mx[, i] = (mx[, i] - mean(mx[, i])) / sd(mx[, i])
          names(mx)
          # [1] puissance agevehicule ageconducteur diesel zone
          library(glmnet)
          fit = glmnet(x = as.matrix(mx), y = freq[, 12], offset = log(freq[, 2]), family = "poisson")
          plot(fit, xvar = "lambda", label = TRUE)
      [Figure: LASSO coefficient paths against Log Lambda, DO frequency.]
  • 25. LASSO, DO frequency
          plot(fit, label = TRUE)
          cvfit = cv.glmnet(x = as.matrix(mx), y = freq[, 12], offset = log(freq[, 2]), family = "poisson")
          plot(cvfit)
          cvfit$lambda.min
          # [1] 0.0004744917
          log(cvfit$lambda.min)
          # [1] -7.653266
      • Cross validation curve + error bars
      [Figures: coefficient paths against the L1 norm; cross-validated Poisson deviance against log(Lambda).]
  • 26. Model Selection and Gini/Lorenz (on incomes)
      Consider an ordered sample $\{y_1, \cdots, y_n\}$. The Lorenz curve is $\{F_i, L_i\}$ with
      $$F_i = \frac{i}{n} \quad\text{and}\quad L_i = \frac{\sum_{j=1}^i y_j}{\sum_{j=1}^n y_j}$$
      The theoretical curve, given a distribution $F$, is
      $$u \mapsto L(u) = \frac{\int_{-\infty}^{F^{-1}(u)} t\,dF(t)}{\int_{-\infty}^{+\infty} t\,dF(t)}$$
      see Gastwirth (1972, econpapers.repec.org)
  • 27. Model Selection and Gini/Lorenz (on incomes)
          library(ineq)
          set.seed(1)
          (x <- sort(rlnorm(5, 0, 1)))
          # [1] 0.4336018 0.5344838 1.2015872 1.3902836 4.9297132
          Lc.sim <- Lc(x)
          plot(Lc.sim)
          points((1:4)/5, (cumsum(x)/sum(x))[1:4], pch = 19, col = "blue")
          lines(Lc.lognorm, parameter = 1, lty = 2)
      [Figure: Lorenz curve, L(p) against p.]
  • 28. Model Selection and Gini/Lorenz (on incomes)
      The Gini index is the ratio of the areas $\dfrac{A}{A + B}$. Thus,
      $$G = \frac{2}{n(n-1)\overline{x}}\sum_{i=1}^n i \cdot x_{i:n} - \frac{n+1}{n-1} = \frac{1}{E(Y)}\int_0^\infty F(y)(1 - F(y))\,dy$$
          Gini(x)
          # [1] 0.4640003
      [Figure: Lorenz curve, L(p) against p, with areas A and B.]
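The empirical Lorenz points and the Gini index can also be computed by hand in base R, without the `ineq` package. The sketch below reuses the five sorted values printed on the previous slide (typed in directly, so no random generation is involved). The variable names are illustrative. One observation, stated as such: the formula written above carries the finite-sample factor $n/(n-1)$, while the value 0.4640003 printed on the slide corresponds to the version without that factor.

```r
# Sorted sample printed on the slide
x <- c(0.4336018, 0.5344838, 1.2015872, 1.3902836, 4.9297132)
n <- length(x)

# Empirical Lorenz curve {F_i, L_i}
F_i <- (1:n) / n
L_i <- cumsum(x) / sum(x)

# Gini index with the finite-sample correction, as in the formula above
gini_corrected <- 2 * sum((1:n) * x) / (n * (n - 1) * mean(x)) - (n + 1) / (n - 1)

# Same index without the n / (n - 1) correction
gini_plain <- gini_corrected * (n - 1) / n
```

Here `gini_plain` reproduces the 0.4640003 shown on the slide, and `gini_corrected` is that value multiplied by $5/4$.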
  • 29. Comparing Models
      Consider an ordered sample $\{y_1, \cdots, y_n\}$ of incomes, with $y_1 \le y_2 \le \cdots \le y_n$. The Lorenz curve is $\{F_i, L_i\}$ with
      $$F_i = \frac{i}{n} \quad\text{and}\quad L_i = \frac{\sum_{j=1}^i y_j}{\sum_{j=1}^n y_j}$$
      We have observed losses $y_i$ and premiums $\hat\pi(x_i)$. Consider a sample ordered by the model, see Frees, Meyers & Cummins (2014), $\hat\pi(x_1) \ge \hat\pi(x_2) \ge \cdots \ge \hat\pi(x_n)$, then plot $\{F_i, L_i\}$ with
      $$F_i = \frac{i}{n} \quad\text{and}\quad L_i = \frac{\sum_{j=1}^i y_j}{\sum_{j=1}^n y_j}$$
      where $y_j$ is now the loss of the policy ranked $j$-th by premium.
      [Figures: Lorenz curve of incomes, Proportion (%) against Income (%), poorest ← → richest; Lorenz curve of losses, Proportion (%) against Losses (%), more risky ← → less risky.]
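The model-ordered curve described above can be sketched as follows in base R. The helper `ordered_lorenz`, its column names and the toy vectors `losses` and `premiums` are assumptions made for illustration. Policies are sorted from the highest to the lowest premium and the observed losses are cumulated in that order.

```r
# Cumulative share of losses when policies are ranked by decreasing premium
ordered_lorenz <- function(losses, premiums) {
  ord <- order(premiums, decreasing = TRUE)     # most risky (per the model) first
  n   <- length(losses)
  data.frame(prop  = (1:n) / n,                         # F_i
             share = cumsum(losses[ord]) / sum(losses)) # L_i
}

# Toy portfolio (made-up numbers): 4 policies
losses   <- c(10, 0, 5, 5)
premiums <- c(4, 1, 3, 2)

curve_pts <- ordered_lorenz(losses, premiums)
# plot(curve_pts$prop, curve_pts$share, type = "b")   # optional display
```

A model that ranks risks well concentrates the losses among the first policies, so its curve rises quickly; two competing premium models can be compared by overlaying their curves computed on the same losses.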
  • 30. Model choice and comparison in ratemaking
      See Frees et al. (2010) or Tevet (2013).