0
Upcoming SlideShare
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Standard text messaging rates apply

# Slides laval stat avril 2011

2,544

Published on

0 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

• Be the first to like this

Views
Total Views
2,544
On Slideshare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
18
0
Likes
0
Embeds 0
No embeds

No notes for slide

### Transcript

• 1. Arthur CHARPENTIER, transformed kernels and beta kernels Beta kernels and transformed kernels applications to copulas and quantiles Arthur Charpentier Université Rennes 1 arthur.charpentier@univ-rennes1.fr http ://freakonometrics.blog.free.fr/ Statistical seminar, Université Laval, Québec, April 2011 1
• 2. Arthur CHARPENTIER, transformed kernels and beta kernels Agenda• General introduction and kernel based estimationCopula density estimation• Kernel based estimation and bias• Beta kernel estimation• Transforming observationsQuantile estimation• Parametric estimation• Semiparametric estimation, extreme value theory• Nonparametric estimation• Transforming observations 2
• 3. Arthur CHARPENTIER, transformed kernels and beta kernels Moving histogram to estimate a densityA natural way to estimate a density at x from an i.i.d. sample {X1 , · · · , Xn } is tocount (and then normalized) how many observations are in a neighborhood of x,e.g. |x − Xi | ≤ h, 3
• 4. Arthur CHARPENTIER, transformed kernels and beta kernels Kernel based estimationInstead of a step function 1(|x − Xi | ≤ h) consider so smoother functions, n n 1 1 x − xi fh (x) = Kh (x − xi ) = K , n i=1 nh i=1 hwhere K(·) is a kernel i.e. a non-negative real-valued integrable function such +∞that K(u) du = 1 so that fh (·) is a density , and K(·) is symmetric, i.e. −∞K(−u) = K(u). 4
• 5. Arthur CHARPENTIER, transformed kernels and beta kernels 5
• 6. Arthur CHARPENTIER, transformed kernels and beta kernelsStandard kernels are 1• uniform (rectangular) K(u) = 1{|u|≤1} 2• triangular K(u) = (1 − |u|) 1{|u|≤1} 3• Epanechnikov K(u) = (1 − u2 ) 1{|u|≤1} 4 1 1 2• Gaussian K(u) = √ exp − u 2π 2 6
• 7. Arthur CHARPENTIER, transformed kernels and beta kernelsStandard kernels are 1• uniform (rectangular) K(u) = 1{|u|≤1} 2• triangular K(u) = (1 − |u|) 1{|u|≤1} 3• Epanechnikov K(u) = (1 − u2 ) 1{|u|≤1} 4 1 1• Gaussian K(u) = √ exp − u2 2π 2 7
• 8. Arthur CHARPENTIER, transformed kernels and beta kernels word about copulasA 2-dimensional copula is a 2-dimensional cumulative distribution functionrestricted to [0, 1]2 with standard uniform margins. Copula (cumulative distribution function) Level curves of the copula Copula density Level curves of the copulaIf C is twice diﬀerentiable, let c denote the density of the copula. 8
• 9. Arthur CHARPENTIER, transformed kernels and beta kernelsSklar theorem : Let C be a copula, and FX and FY two marginal distributions,then F (x, y) = C(FX (x), FY (y)) is a bivariate distribution function, withF ∈ F(FX , FY ).Conversely, if F ∈ F(FX , FY ), there exists C such thatF (x, y) = C(FX (x), FY (y). Further, if FX and FY are continuous, then C isunique, and given by −1 −1 C(u, v) = F (FX (u), FY (v)) for all (u, v) ∈ [0, 1] × [0, 1]We will then deﬁne the copula of F , or the copula of (X, Y ). 9
• 10. Arthur CHARPENTIER, transformed kernels and beta kernels MotivationExample Loss-ALAE : consider the following dataset, were the Xi ’s are lossamount (paid to the insured) and the Yi ’s are allocated expenses. Denote by Ri ˆand Si the respective ranks of Xi and Yi . Set Ui = Ri /n = FX (Xi ) and ˆVi = Si /n = FY (Yi ).Figure 1 shows the log-log scatterplot (log Xi , log Yi ), and the associate copulabased scatterplot (Ui , Vi ).Figure 2 is simply an histogram of the (Ui , Vi ), which is a nonparametricestimation of the copula density.Note that the histogram suggests strong dependence in upper tails (theinteresting part in an insurance/reinsurance context). 10
• 11. Arthur CHARPENTIER, transformed kernels and beta kernels Log!log scatterplot, Loss!ALAE Copula type scatterplot, Loss!ALAE 1.0 5 0.8 Probability level ALAE 4 0.6 log(ALAE) 0.4 3 0.2 2 0.0 1 1 2 3 4 5 6 0.0 0.2 0.4 0.6 0.8 1.0 log(LOSS) Probability level LOSS Figure 1 – Loss-ALAE, scatterplots (log-log and copula type). 11
• 12. Arthur CHARPENTIER, transformed kernels and beta kernels Figure 2 – Loss-ALAE, histogram of copula type transformation. 12
• 13. Arthur CHARPENTIER, transformed kernels and beta kernels Why nonparametrics, instead of parametrics ?In parametric estimation, assume the the copula density cθ belongs to some givenfamily C = {cθ , θ ∈ Θ}. The tail behavior will crucially depend on the tailbehavior of the copulas in CExample : Table below show the probability that both X and Y exceed high −1 −1thresholds (X > FX (p) and Y > FY (p)), for usual copula families, whereparameter θ is obtained using maximum likelihood techniques. p Clayton Frank Gaussian Gumbel Clayton∗ max/min 90% 1.93500% 2.73715% 4.73767% 4.82614% 5.66878% 2.93 95% 0.51020% 0.78464% 1.99195% 2.30085% 2.78677% 5.46 99% 0.02134% 0.03566% 0.27337% 0.44246% 0.55102% 25.82 99.9% 0.00021% 0.00037% 0.01653% 0.04385% 0.05499% 261.85Probability of exceedances, for given parametric copulas, τ = 0.5. 13
• 14. Arthur CHARPENTIER, transformed kernels and beta kernels −1 −1Figure 3 shows the graphical evolution of p → P X > FX (p), Y > FY (p) . If theoriginal model is an multivariate student vector (X, Y ), the associated probability is theupper line. If either marginal distributions are misﬁtted (e.g. Gaussian assumption), orthe dependence structure is mispeciﬁed (e.g. Gaussian assumption), probabilities arealways underestimated. 14
• 15. Arthur CHARPENTIER, transformed kernels and beta kernels 0.30 Joint probability of exceeding high quantiles Ratios of exceeding probability 1.0 Student dependence structure, Student margins Gaussian dependence structure, Student margins Student dependence structure, Gaussian margins 0.25 Gaussian dependence structure, Gaussian margins 0.8 0.20 0.6 0.15 0.4 0.10 Misfitting dependence structure 0.2 Misfitting margins 0.05 Misfit margins and dependence 0.0 0.00 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 Quantile levels −1 −1Figure 3 – p → P X > FX (p), Y > FY (p) when (X, Y ) is a Student t randomvector, and when either margins or the dependence structure is mispectiﬁed. 15
• 16. Arthur CHARPENTIER, transformed kernels and beta kernels Kernel estimation for bounded support densityConsider a kernel based estimation of density f , n 1 x − Xi f (x) = K , nh h i=1where K is a kernel function, given a n sample X1 , ..., Xn of positive random variable(Xi ∈ [0, ∞[). Let K denote a symmetric kernel, then 1 E(f (0)) = f (0) + O(h) 2 16
• 17. Arthur CHARPENTIER, transformed kernels and beta kernels Exponential random variables Exponential random variables 1.2 1.2 1.0 1.0 0.8 0.8 Density Density 0.6 0.6 0.4 0.4 0.2 0.2 0.0 0.0 !1 0 1 2 3 4 5 6 !1 0 1 2 3 4 5 6 Figure 4 – Density estimation of an exponential density, 100 points. 17
• 18. Arthur CHARPENTIER, transformed kernels and beta kernels 1.2 1.0 1.0 0.8 0.8 0.6Density Density 0.6 0.4 0.4 0.2 0.2 0.0 0.0 0 2 4 6 !0.2 0.0 0.1 0.2 0.3 0.4 0.5 Figure 5 – Density estimation of an exponential density, 10, 000 points. 18
• 19. Arthur CHARPENTIER, transformed kernels and beta kernels 1.2 Kernel based estimation of the uniform density on [0,1] Kernel based estimation of the uniform density on [0,1] 1.2 1.0 1.0 0.8 0.8 Density Density 0.6 0.6 0.4 0.4 0.2 0.2 0.0 0.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 6 – Density estimation of an uniform density on [0, 1]. 19
• 20. Arthur CHARPENTIER, transformed kernels and beta kernels How to get a proper estimation on the borderSeveral techniques have been introduce to get a better estimation on the border,– boundary kernel (Müller (1991))– mirror image modiﬁcation (Deheuvels & Hominal (1989), Schuster (1985))– transformed kernel (Devroye & Györfi (1981), Wand, Marron & Ruppert (1991))In the particular case of densities on [0, 1],– Beta kernel (Brown & Chen (1999), Chen (1999, 2000)),– average of histograms inspired by Dersko (1998). 20
• 21. Arthur CHARPENTIER, transformed kernels and beta kernels Local kernel density estimatorsThe idea is that the bandwidth h(x) can be diﬀerent for each point x at which f (x) isestimated. Hence, n 1 x − Xi f (x, h(x)) = K , nh(x) h(x) i=1(see e.g. Loftsgaarden & Quesenberry (1965)). Variable kernel density estimatorsThe idea is that the bandwidth h can be replaced by n values α(Xi ). Hence, n 1 1 x − Xi f (x, α) = K , n α(Xi ) α(Xi ) i=1(see e.g. Abramson (1982)). 21
• 22. Arthur CHARPENTIER, transformed kernels and beta kernels The transformed Kernel estimateThe idea was developed in Devroye & Györfi (1981) for univariate densities.Consider a transformation T : R → [0, 1] strictly increasing, continuously diﬀerentiable,one-to-one, with a continuously diﬀerentiable inverse.Set Y = T (X). Then Y has density fY (y) = fX (T −1 (y)) · (T −1 ) (y).If fY is estimated by fY , then fX is estimated by fX (x) = fY (T (x)) · T (x). 22
• 23. Arthur CHARPENTIER, transformed kernels and beta kernels 0.5 Density estimation transformed kernel Density estimation transformed kernel 2.0 0.4 1.5 0.3 1.0 0.2 0.5 0.1 0.0 0.0 −3 −2 −1 0 1 2 3 0.0 0.2 0.4 0.6 0.8 1.0 Figure 7 – The transform kernel principle (with a Φ−1 -transformation). 23
• 24. Arthur CHARPENTIER, transformed kernels and beta kernels The Beta Kernel estimateThe Beta-kernel based estimator of a density with support [0, 1] at point x, is obtainedusing beta kernels, which yields n 1 u 1−u f (x) = K Xi , + 1, +1 n b b i=1where K(·, α, β) denotes the density of the Beta distribution with parameters α and β, Γ(α + β) α−1 K(x, α, β) = x (1 − x)β−1 1{x∈[0,1]} . Γ(α)Γ(β) 24
• 25. Arthur CHARPENTIER, transformed kernels and beta kernels Gaussian Kernel approach 10 Density estimation using beta kernels 8 2.0 6 4 1.5 2 0 0.0 0.2 0.4 0.6 0.8 1.0 1.0 Beta Kernel approach 10 0.5 8 6 0.0 4 0.0 0.2 0.4 0.6 0.8 1.0 2 0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 8 – The beta-kernel estimate. 25
• 26. Arthur CHARPENTIER, transformed kernels and beta kernels Copula density estimation : the boundary problemLet (U1 , V1 ), ..., (Un , Vn ) denote a sample with support [0, 1]2 , and with density c(u, v),which is assumed to be twice continuously diﬀerentiable on (0, 1)2 .If K denotes a symmetric kernel, with support [−1, +1], then for all(u, v) ∈ [0, 1] × [0, 1], in any corners (e.g. (0, 0)) 1 1 1 E(c(0, 0, h)) = · c(u, v) − [c1 (0, 0) + c2 (0, 0)] ωK(ω)dω · h + o(h). 4 2 0on the interior of the borders (e.g. u = 0 and v ∈ (0, 1)), 1 1 E(c(0, v, h)) = · c(u, v) − [c1 (0, v)] ωK(ω)dω · h + o(h). 2 0and in the interior ((u, v) ∈ (0, 1) × (0, 1)), 1 1 E(c(u, v, h)) = c(u, v) + [c1,1 (u, v) + c2,2 (u, v)] ω 2 K(ω)dω · h2 + o(h2 ). 2 −1On borders, there is a multiplicative bias and the order of the bias is O(h) (while it isO(h2 ) in the interior). 26
• 27. Arthur CHARPENTIER, transformed kernels and beta kernelsIf K denotes a symmetric kernel, with support [−1, +1], then for all(u, v) ∈ [0, 1] × [0, 1], in any corners (e.g. (0, 0)) 1 2 1 1 V ar(c(0, 0, h)) = c(0, 0) K(ω)2 dω · +o . 0 nh2 nh2on the interior of the borders (e.g. u = 0 and v ∈ (0, 1)), 1 1 1 1 V ar(c(0, v, h)) = c(0, v) K(ω)2 dω K(ω)2 dω · +o . −1 0 nh2 nh2and in the interior ((u, v) ∈ (0, 1) × (0, 1)), 1 2 1 1 V ar(c(u, v, h)) = c(u, v) K(ω)2 dω d· +o . −1 nh2 nh2 27
• 28. Arthur CHARPENTIER, transformed kernels and beta kernels Estimation of Frank copula 1.0 5 0.8 4 0.6 3 0.4 2 0.8 0.6 0.2 1 0.4 0.2 0.0 0 0.2 0.4 0.6 0.8 0.0 0.2 0.4 0.6 0.8 1.0 Figure 9 – Theoretical density of Frank copula. 28
• 29. Arthur CHARPENTIER, transformed kernels and beta kernels Estimation of Frank copula 1.0 5 0.8 4 0.6 3 0.4 2 0.8 0.6 0.2 1 0.4 0.2 0.0 0 0.2 0.4 0.6 0.8 0.0 0.2 0.4 0.6 0.8 1.0Figure 10 – Estimated density of Frank copula, using standard Gaussian (inde-pendent) kernels, h = h∗ . 29
• 30. Arthur CHARPENTIER, transformed kernels and beta kernels Transformed kernel techniqueConsider the kernel estimator of the density of the (Xi , Yi ) = (G−1 (Ui ), G−1 (Vi ))’s,where G is a strictly increasing distribution function, with a diﬀerentiable density.Since density f is continuous, twice diﬀerentiable, and bounded above, for all(x, y) ∈ R2 , n 1 x − Xi y − Yi f (x, y) = K K , nh2 h h i=1satisﬁes E(f (x, y)) = f (x, y) + O(h2 ),as long as ωK(ω) = 0. And the variance is 2 f (x, y) 1 V ar(f (x, y)) = K(ω)2 dω +o , nh2 nh2and asymptotic normality can be obtained, 2 √ L nh2 f (x, y) − E(f (x, y)) → N (0, f (x, y) K(ω)2 dω ). 30
• 31. Arthur CHARPENTIER, transformed kernels and beta kernelsSince f (x, y) = g(x)g(y)c[G(x), G(y)]. (1)can be inverted in f (G−1 (u), G−1 (v)) c(u, v) = , (u, v) ∈ [0, 1] × [0, 1], (2) g(G−1 (u))g(G−1 (v))one gets, substituting f in (2) n 1 G−1 (u) − G−1 (Ui ) G−1 (v) − G−1 (Vi )c(u, v) = K , , nh · g(G−1 (u)) · g(G−1 (v)) h h i=1 (3)Therefore, o(h) E(c(u, v, h)) = c(u, v) + −1 (u))g(G−1 (v)) . g(G 31
• 32. Arthur CHARPENTIER, transformed kernels and beta kernelsSimilarly, 2 1 c(u, v) V ar(c(u, v, h)) = K(ω)2 dω g(G−1 (u))g(G−1 (v)) nh2 1 1 + o . g(G−1 (u))2 g(G−1 (v))2 nh2 32
• 33. Arthur CHARPENTIER, transformed kernels and beta kernels Estimation of Frank copula 5 0.8 4 0.6 3 0.4 2 0.8 0.6 0.2 1 0.4 0.2 0 0.2 0.4 0.6 0.8 0.2 0.4 0.6 0.8Figure 11 – Estimated density of Frank copula, using a Gaussian kernel, after aGaussian normalization. 33
• 34. Arthur CHARPENTIER, transformed kernels and beta kernels Estimation of Frank copula 5 0.8 4 0.6 3 0.4 2 0.8 0.6 0.2 1 0.4 0.2 0 0.2 0.4 0.6 0.8 0.2 0.4 0.6 0.8Figure 12 – Estimated density of Frank copula, using a Gaussian kernel, after aStudent normalization, with 5 degrees of freedom. 34
• 35. Arthur CHARPENTIER, transformed kernels and beta kernels Estimation of Frank copula 5 0.8 4 0.6 3 0.4 2 0.8 0.6 0.2 1 0.4 0.2 0 0.2 0.4 0.6 0.8 0.2 0.4 0.6 0.8Figure 13 – Estimated density of Frank copula, using a Gaussian kernel, after aStudent normalization, with 3 degrees of freedom. 35
• 36. Arthur CHARPENTIER, transformed kernels and beta kernels Bivariate Beta kernelsThe Beta-kernel based estimator of the copula density at point (u, v), is obtained usingproduct beta kernels, which yields n 1 u 1−u v 1−v c(u, v) = K Xi , + 1, + 1 · K Yi , + 1, +1 , n b b b b i=1where K(·, α, β) denotes the density of the Beta distribution with parameters α and β. 36
• 37. Arthur CHARPENTIER, transformed kernels and beta kernels Beta (independent) bivariate kernel , x=0.0, y=0.0 Beta (independent) bivariate kernel , x=0.2, y=0.0 Beta (independent) bivariate kernel , x=0.5, y=0.0 Beta (independent) bivariate kernel , x=0.0, y=0.2 Beta (independent) bivariate kernel , x=0.2, y=0.2 Beta (independent) bivariate kernel , x=0.5, y=0.2 Beta (independent) bivariate kernel , x=0.0, y=0.5 Beta (independent) bivariate kernel , x=0.2, y=0.5 Beta (independent) bivariate kernel , x=0.5, y=0.5Figure 14 – Shape of bivariate Beta kernels K(·, x/b+1, (1−x)/b+1)×K(·, y/b+1, (1 − y)/b + 1) for b = 0.2. 37
• 38. Arthur CHARPENTIER, transformed kernels and beta kernelsAssume that the copula density c is twice diﬀerentiable on [0, 1] × [0, 1]. Let(u, v) ∈ [0, 1] × [0, 1]. The bias of c(u, v) is of order b, i.e. E(c(u, v)) = c(u, v) + Q(u, v) · b + o(b),where the bias Q(u, v) is 1Q(u, v) = (1 − 2u)c1 (u, v) + (1 − 2v)c2 (u, v) + [u(1 − u)c1,1 (u, v) + v(1 − v)c2,2 (u, v)] . 2The bias here is O(b) (everywhere) while it is O(h2 ) using standard kernels.Assume that the copula density c is twice diﬀerentiable on [0, 1] × [0, 1]. Let(u, v) ∈ [0, 1] × [0, 1]. The variance of c(u, v) is in corners, e.g. (0, 0), 1 V ar(c(0, 0)) = [c(0, 0) + o(n−1 )], nb2in the interior of borders, e.g. u = 0 and v ∈ (0, 1) 1 V ar(c(0, v)) = [c(0, v) + o(n−1 )], 2nb3/2 πv(1 − v) 38
• 39. Arthur CHARPENTIER, transformed kernels and beta kernelsand in the interior,(u, v) ∈ (0, 1) × (0, 1) 1 V ar(c(u, v)) = [c(u, v) + o(n−1 )]. 4nbπ v(1 − v)u(1 − u)Remark From those properties, an (asymptotically) optimal bandwidth b can bededuced, maximizing asymptotic mean squared error, 1/3 1 1 b∗ ≡ · . 16πnQ(u, v)2 v(1 − v)u(1 − u)Note (see Charpentier, Fermanian & Scaillet (2005)) that all those results can beobtained in dimension d ≥ 2.Example For n = 100 simulated data, from Frank copula, the optimal bandwidth isb ∼ 0.05. 39
• 40. Arthur CHARPENTIER, transformed kernels and beta kernels Estimation of the copula density (Beta kernel, b=0.1) Estimation of the copula density (Beta kernel, b=0.1) 1.0 3.0 0.8 2.5 0.6 2.0 1.5 0.4 0.8 1.0 0.6 0.2 0.4 0.5 0.2 0.0 0.0 0.2 0.4 0.6 0.8 0.0 0.2 0.4 0.6 0.8 1.0 Figure 15 – Estimated density of Frank copula, Beta kernels, b = 0.1 40
• 41. Arthur CHARPENTIER, transformed kernels and beta kernels Estimation of the copula density (Beta kernel, b=0.05) Estimation of the copula density (Beta kernel, b=0.05) 1.0 3.0 0.8 2.5 0.6 2.0 1.5 0.4 0.8 1.0 0.6 0.2 0.4 0.5 0.2 0.0 0.0 0.2 0.4 0.6 0.8 0.0 0.2 0.4 0.6 0.8 1.0 Figure 16 – Estimated density of Frank copula, Beta kernels, b = 0.05 41
• 42. Arthur CHARPENTIER, transformed kernels and beta kernels Standard Gaussian kernel estimator, n=100 Standard Gaussian kernel estimator, n=1000 Standard Gaussian kernel estimator, n=10000 4 4 4 3 3 3 Density of the estimator Density of the estimator Density of the estimator 2 2 2 1 1 1 0 0 0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Estimation of the density on the diagonal Estimation of the density on the diagonal Estimation of the density on the diagonal Figure 17 – Density estimation on the diagonal, standard kernel. 42
• 43. Arthur CHARPENTIER, transformed kernels and beta kernels Transformed kernel estimator (Gaussian), n=100 Transformed kernel estimator (Gaussian), n=1000 Transformed kernel estimator (Gaussian), n=10000 4 4 4 3 3 3 Density of the estimator Density of the estimator Density of the estimator 2 2 2 1 1 1 0 0 0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Estimation of the density on the diagonal Estimation of the density on the diagonal Estimation of the density on the diagonal Figure 18 – Density estimation on the diagonal, transformed kernel. 43
• 44. Arthur CHARPENTIER, transformed kernels and beta kernels Beta kernel estimator, b=0.05, n=100 Beta kernel estimator, b=0.02, n=1000 Beta kernel estimator, b=0.005, n=10000 4 4 4 3 3 3 Density of the estimator Density of the estimator Density of the estimator 2 2 2 1 1 1 0 0 0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Estimation of the density on the diagonal Estimation of the density on the diagonal Estimation of the density on the diagonal Figure 19 – Density estimation on the diagonal, Beta kernel. 44
• 45. Arthur CHARPENTIER, transformed kernels and beta kernels Copula density estimationGijbels & Mielniczuk (1990) : given an i.i.d. sample, a natural estimate for thenormed density is obtained using the transformed sample(FX (X1 ), FY (Y1 )), ..., (FX (Xn ), FY (Yn )), where FX and FY are the empiricaldistribution function of the marginal distribution. The copula density can beconstructed as some density estimate based on this sample (Behnen, Husková &Neuhaus (1985) investigated the kernel method).The natural kernel type estimator c of c is n 1 u − FX (Xi ) v − FY (Yi ) c(u, v) = K , , (u, v) ∈ [0, 1]. nh2 h h i=1“this estimator is not consistent in the points on the boundary of the unit square.” 45
• 46. Arthur CHARPENTIER, transformed kernels and beta kernels Copula density estimation and pseudo-observationsExample : in linear regression, residuals are pseudo observations. εi = H(Xi , Yi ) = Yi − α − βXi εi = Hn (Xi , Yi ) = Yi − αn − βn XiExample : when dealing with copulas, ranks Ui , Vi yield pseudo-observations. (Ui , Vi ) = H(Xi , Yi ) = (FX (Xi ), FY (Yi )) (Ui , Vi ) = Hn (Xi , Yi ) = (FX (Xi ), FY (Yi ))(see Genest & Rivest (1993)).More formally, let X 1 , ..., X n denote a series of observations of X (∈ X), stationaryand ergodic.Let H : X → Rd and set εi = H(X i ) (non-observable).If H is estimated by Hn then εi = Hn (X i ) are called pseudo-observations. 46
• 47. Arthur CHARPENTIER, transformed kernels and beta kernelsLet Kn denote the empirical distribution function of those pseudo-observations, n 1 Kn (t) = I(εi ≤ t) where t ∈ Rd . n i=1Further, if K denotes the distribution function of ε = H(X), then deﬁne the empiricalprocess based on pseudo-observations, √ Kn (t) = n Kn (t) − K(t)As proved in Ghoudi & Rémillard (1998, 2004), this empirical process convergesweakly.Figure ?? shows scatterplots when margins are known (i.e. (FX (Xi ), FY (Yi ))’s), and ˆ ˆwhen margins are estimated (i.e. (FX (Xi ), FY (Yi )’s). Note that the pseudo sample ismore “uniform”, in the sense of a lower discrepancy (as in Quasi Monte Carlotechniques, see e.g. Niederreiter (1992)). 47
• 48. Arthur CHARPENTIER, transformed kernels and beta kernels Scatterplot of observations (Xi,Yi) Scatterplot of pseudo−observations (Ui,Vi) 1.0 1.0 0.8 0.8 Value of the Vis (ranks) Value of the Yis 0.6 0.6 0.4 0.4 0.2 0.2 0.0 0.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Value of the Xis Value of the Uis (ranks)Figure 20 – Observations and pseudo-observation, 500 simulated observations ˆ ˆfrom Frank copula (Xi , Yi ) and the associate pseudo-sample (FX (Xi ), FY (Yi )). 48
• 49. Arthur CHARPENTIER, transformed kernels and beta kernelsBecause samples are more “uniform” using ranks and pseudo-observations, the varianceof the estimator of the density, at some given point (u, v) ∈ (0, 1) × (0, 1) is usuallysmaller. For instance, Figure 21 shows the impact of considering pseudo observations, ˆ ˆi.e. substituting FX and FY to unknown marginal distributions FX and FY . The dottedline shows the density of c(u, v) from a n = 100 sample (Ui , Vi ) (from Frank copula), ˆ ˆ ˆand the straight line shows the density of c(u, v) from the sample (FU (Ui ), FV (Vi )) (i.e. ˆranks of observations). 49
• 50. Arthur CHARPENTIER, transformed kernels and beta kernels Impact of pseudo−observations (n=100) 2.0 1.5 Density of the estimator 1.0 0.5 0.0 0 1 2 3 4 Distribution of the estimation of the density c(u,v) Figure 21 – The impact of estimating from pseudo-observations. 50
• 51. Arthur CHARPENTIER, transformed kernels and beta kernels Roots of ‘transformed kernel’ 51
• 52. Arthur CHARPENTIER, transformed kernels and beta kernels 52
• 53. Arthur CHARPENTIER, transformed kernels and beta kernels 53
• 54. Arthur CHARPENTIER, transformed kernels and beta kernels 54
• 55. Arthur CHARPENTIER, transformed kernels and beta kernels 55
• 56. Arthur CHARPENTIER, transformed kernels and beta kernels Using a parametric approach −1If FX ∈ F = {Fθ , θ ∈ Θ} (assumed to be continuous), qX (α) = Fθ (α), and thus, anatural estimator is qX (α) = F −1 (α), (4) θwhere θ is an estimator of θ (maximum likelihood, moments estimator...). 56
• 57. Arthur CHARPENTIER, transformed kernels and beta kernels Using the Gaussian distributionA natural idea (that can be found in classical ﬁnancial models) is to assume Gaussiandistributions : if X ∼ N (µ, σ), then the α-quantile is simply q(α) = µ + Φ−1 (α)σ,where Φ−1 (α) is obtained in statistical tables (or any statistical software), e.g.u = −1.64 if α = 90%, or u = −1.96 if α = 95%.Deﬁnition1Given a n sample {X1 , · · · , Xn }, the (Gaussian) parametric estimation of the α-quantileis qn (α) = µ + Φ−1 (α)σ, 57
• 58. Arthur CHARPENTIER, transformed kernels and beta kernels Using a parametric modelsActually, is the Gaussian model does not ﬁt very well, it is still possible to use GaussianapproximationIf the variance is ﬁnite, (X − E(X))/σ might be closer to the Gaussian distribution, andthus, consider the so-called Cornish-Fisher approximation, i.e. Q(X, α) ∼ E(X) + zα V (X), (5)where 2 −1 ζ1 −1 ζ2 −1 −1 ζ1zα = Φ 2 (α) + [Φ (α) − 1] + 3 [Φ (α) − 3Φ (α)] − [2Φ−1 (α)3 − 5Φ−1 (α)], 6 24 36where ζ1 is the skewness of X, and ζ2 is the excess kurtosis, i.e. i.e. E([X − E(X)]3 ) E([X − E(X)]4 ) ζ1 = and ζ2 = − 3. (6) E([X − E(X)]2 )3/2 E([X − E(X)]2 )2 58
• 59. Arthur CHARPENTIER, transformed kernels and beta kernels Using a parametric modelsDeﬁnition2Given a n sample {X1 , · · · , Xn }, the Cornish-Fisher estimation of the α-quantile is n n 1 1 2 qn (α) = µ + zα σ, where µ = Xi and σ = (Xi − µ) , n n−1 i=1 i=1and 2 −1 ζ1 −1 ζ2 −1 ζ1 zα = Φ (α)+ [Φ (α) −1]+ [Φ (α) −3Φ (α)]− [2Φ−1 (α)3 −5Φ−1 (α)], (7) 2 3 −1 6 24 36where ζ1 is the natural estimator for the skewness of X , and ζ2 is the natural estimator of √ n n(n − 1) n i=1 (Xi − µ)3the excess kurtosis, i.e. ζ1 = and n−2 n (Xi − µ)2 3/2 i=1 n n−1 n i=1 (Xi − µ)4 ζ2 = (n + 1)ζ2 + 6 where ζ2 = 2 − 3. (n − 2)(n − 3) n (Xi − µ)2 i=1 59
• 60. Arthur CHARPENTIER, transformed kernels and beta kernels Parametrics estimator and error model Density, theoritical versus empirical Density, theoritical versus empirical 0.8 0.3 0.6 Theoritical Student Theoritical lognormal Fitted lStudent Fitted lognormal 0.2 Fitted Gaussian Fitted gamma 0.4 0.1 0.2 0.0 0.0 0 1 2 3 4 5 −4 −2 0 2 4 Figure 22 – Estimation of Value-at-Risk, model error. 60
• 61. Arthur CHARPENTIER, transformed kernels and beta kernels Using a semiparametric modelsGiven a n-sample {Y1 , . . . , Yn }, let Y1:n ≤ Y2:n ≤ . . .≤ Yn:n denotes the associated orderstatistics.If u large enough, Y − u given Y > u has a Generalized Pareto distribution withparameters ξ and β ( Pickands-Balkema-de Haan theorem).If u = Yn−k:n for k large enough, and if ξ> 0, denote by βk and ξk maximum likelihoodestimators of the Genralized Pareto distribution of sample{Yn−k+1:n − Yn−k:n , ..., Yn:n − Yn−k:n }, βk n −ξk Q(Y, α) = Yn−k:n + (1 − α) −1 , (8) ξk kAn alternative is to use Hill’s estimator if ξ > 0, k n −ξk 1 Q(Y, α) = Yn−k:n (1 − α) , ξk = log Yn+1−i:n − log Yn−k:n . (9) k k i=1 61
• 62. Arthur CHARPENTIER, transformed kernels and beta kernels On nonparametric estimation for quantiles −1For continuous distribution q(α) = FX (α), thus, a natural idea would be to consider −1q(α) = FX (α), for some nonparametric estimation of FX .Deﬁnition3The empirical cumulative distribution function Fn , based on sample {X1 , . . . , Xn } is n 1Fn (x) = 1(Xi ≤ x). n i=1Deﬁnition4The kernel based cumulative distribution function, based on sample {X1 , . . . , Xn } is n x n 1 Xi − t 1 Xi − x Fn (x) = k dt = K nh −∞ h n h i=1 i=1 xwhere K(x) = k(t)dt, k being a kernel and h the bandwidth. −∞ 62
• 63. Arthur CHARPENTIER, transformed kernels and beta kernels Smoothing nonparametric estimatorsTwo techniques have been considered to smooth estimation of quantiles, either implicit,or explicit.• consider a linear combinaison of order statistics,The classical empirical quantile estimate is simply −1 i Qn (p) = Fn = Xi:n = X[np]:n where [·] denotes the integer part. (10) nThe estimator is simple to obtain, but depends only on one observation. A naturalextention will be to use - at least - two observations, if np is not an integer. Theweighted empirical quantile estimate is then deﬁned as Qn (p) = (1 − γ) X[np]:n + γX[np]+1:n where γ = np − [np]. 63
• 64. Arthur CHARPENTIER, transformed kernels and beta kernels The quantile function in R The quantile function in R 7 q q qq qq qq type=1 q q 8 type=1 qq type=3 qq qqq 6 type=3 type=5 qq type=5 qqqq qq type=7 qq type=7 q qqqqq qqqq 6 5 qqqq qq quantile level quantile level qqq q qqqq q qq q q qqq q qqq q qqqq qq 4 qqq q qq 4 qq qqqq qq qqq q 3 qq qqq qq q q 2 qq 2 qq q q qq 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 probability level probability level Figure 23 – Several quantile estimators in R. 64
• 65. Arthur CHARPENTIER, transformed kernels and beta kernels Smoothing nonparametric estimatorsIn order to increase eﬃciency, L-statistics can be considered i.e. n n 1 −1 i −1 Qn (p) = Wi,n,p Xi:n = Wi,n,p Fn = Fn (t) k (p, h, t) dt (11) n 0 i=1 i=1where Fn is the empirical distribution function of FX , where k is a kernel and h abandwidth. This expression can be written equivalently n i n i i−1 n t−p n −p n −p Qn (p) = k dt X(i) = IK − IK X(i) (i−1) h h h i=1 n i=1 (12) xwhere again IK (x) = k (t) dt. The idea is to give more weight to order statistics −∞X(i) such that i is closed to pn. 65
• 66. Arthur CHARPENTIER, transformed kernels and beta kernels 3 2 1 0 0.0 0.2 0.4 0.6 0.8 1.0 quantile (probability) level Figure 24 – Quantile estimator as wieghted sum of order statistics. 66
• 67. Arthur CHARPENTIER, transformed kernels and beta kernels 3 2 1 0 0.0 0.2 0.4 0.6 0.8 1.0 quantile (probability) level Figure 25 – Quantile estimator as wieghted sum of order statistics. 67
• 68. Arthur CHARPENTIER, transformed kernels and beta kernels 3 2 1 0 0.0 0.2 0.4 0.6 0.8 1.0 quantile (probability) level Figure 26 – Quantile estimator as wieghted sum of order statistics. 68
• 69. Arthur CHARPENTIER, transformed kernels and beta kernels 3 2 1 0 0.0 0.2 0.4 0.6 0.8 1.0 quantile (probability) level Figure 27 – Quantile estimator as wieghted sum of order statistics. 69
• 70. Arthur CHARPENTIER, transformed kernels and beta kernels 3 2 1 0 0.0 0.2 0.4 0.6 0.8 1.0 quantile (probability) level Figure 28 – Quantile estimator as wieghted sum of order statistics. 70
• 71. Arthur CHARPENTIER, transformed kernels and beta kernels Smoothing nonparametric estimatorsE.g. the so-called Harrell-Davis estimator is deﬁned as n i n Γ(n + 1) Qn (p) = y (n+1)p−1 (1 − y)(n+1)q−1 Xi:n , (i−1) Γ((n + 1)p)Γ((n + 1)q) i=1 n• ﬁnd a smooth estimator for FX , and then ﬁnd (numerically) the inverse,The α-quantile is deﬁned as the solution of FX ◦ qX (α) = α.If Fn denotes a continuous estimate of F , then a natural estimate for qX (α) is qn (α)such that Fn ◦ qn (α) = α, obtained using e.g. Gauss-Newton algorithm. 71
• 72. Arthur CHARPENTIER, transformed kernels and beta kernels Improving Beta kernel estimatorsProblem : the convergence is not uniform, and there is large second order bias onborders, i.e. 0 and 1.Chen (1999) proposed a modiﬁed Beta 2 kernel estimator, based on   k t , 1−t (u)  b b , if t ∈ [2b, 1 − 2b]  k2 (u; b; t) = k 1−t (u) , if t ∈ [0, 2b)  ρb (t), b  , if t ∈ (1 − 2b, 1]  kt ,ρ (1−t) (u) b b twhere ρb (t) = 2b2 + 2.5 − 4b4 + 6b2 + 2.25 − t2 − . b 72
• 73. Arthur CHARPENTIER, transformed kernels and beta kernels Non-consistency of Beta kernel estimatorsProblem : k(0, α, β) = k(1, α, β) = 0. So if there are point mass at 0 or 1, the estimatorbecomes inconsistent, i.e. 1 x 1−x fb (x) = k Xi , 1 + , 1 + , x ∈ [0, 1] n b b 1 x 1−x = k Xi , 1 + , 1 + , x ∈ [0, 1] n b b Xi =0,1 n − n0 − n1 1 x 1−x = k Xi , 1 + ,1 + , x ∈ [0, 1] n n − n0 − n1 b b Xi =0,1 ≈ (1 − P(X = 0) − P(X = 1)) · f0 (x), x ∈ [0, 1]and therefore Fb (x) ≈ (1 − P(X = 0) − P(X = 1)) · F0 (x), and we may have problemﬁnding a 95% or 99% quantile since the total mass will be lower. 73
• 74. Arthur CHARPENTIER, transformed kernels and beta kernels Non-consistency of Beta kernel estimatorsGouriéroux & Monfort (2007) proposed (1) fb (x) fb (x) = 1 , for all x ∈ [0, 1]. 0 fb (t)dtIt is called macro-β since the correction is performed globally.Gouriéroux & Monfort (2007) proposed n (2) 1 kβ (Xi ; b; x) fb (x) = 1 , for all x ∈ [0, 1]. n kβ (Xi ; b; t)dt i=1 0It is called micro-β since the correction is performed locally. 74
• 75. Arthur CHARPENTIER, transformed kernels and beta kernels Transforming observations ?In the context of density estimation, Devroye and Györfi (1985) suggested to use aso-called transformed kernel estimateGiven a random variable Y , if H is a strictly increasing function, then the p-quantile ofH(Y ) is equal to H(q(Y ; p)).An idea is to transform initial observations {X1 , · · · , Xn } into a sample {Y1 , · · · , Yn }where Yi = H(Xi ), and then to use a beta-kernel based estimator, if H : R → [0, 1].Then qn (X; p) = H −1 (qn (Y ; p)).In the context of density estimation fX (x) = fY (H(x))H (x). As mentioned inDevroye and Györfi (1985) (p 245), “for a transformed histogram histogramestimate, the optimal H gives a uniform [0, 1] density and should therefore be equal toH(x) = F (x), for all x”. 75
• 76. Arthur CHARPENTIER, transformed kernels and beta kernels Transforming observations ? a monte carlo studyAssume that sample {X1 , · · · , Xn } have been generated from Fθ0 (from a famillyF = (Fθ , θ ∈ Θ). 4 transformations will be considered– H = Fθ (based on a maximum likelihood procedure)– H = Fθ0 (theoritical optimal transformation)– H = Fθ with θ < θ0 (heavier tails)– H = Fθ with θ > θ0 (lower tails) 76
• 77. Arthur CHARPENTIER, transformed kernels and beta kernels 1.0 1.0 qq qq q q qq q q qqqq qq qq qq q q q qq q q qq q qq q q qq qq qq q q q qq q q q qq q qq q q q qq qq qqq qq q q q q q qq qq qqq qq qq q qq qq 0.8 0.8 qq qq qq qq qqq q qq qq qq qq qq q qq qq q qq q q q qq qq q q qq q qTransformed observations Transformed observations q qq qq qq q q q qqq qq q q q q q q q q qq qq qq qq q qqqq qq qq qq q q q qqq qq qq q qq qq q q q q q 0.6 0.6 qq q q q q q qq qq qq q qqq qq qq q qq qq qqq qqq qqq q qq q qq qqq q qq qq qq qq q qq qq qqq qq q qq qq qq q qq qq qq q q q q q q qq qq qqq q q q q q q q qq qq q q qq qq q 0.4 0.4 q qq q q qq q qq q q q qqq qq q qq qq q qq qq qq qq q q q q q q q q qq qq q qq qq qq qq qq qq q q q q q q q qq qq qq qq qq q q q q q q q q qqq qq q q qq q qq 0.2 0.2 q qq qq q q q q q q qq q q q qqq qq q q q q q q qq q q q qq q q q q qq q q qq qq qq q qq qq qq qq q q q q q qq q q q qq qq qq qq qq qq qq qq qq qq qq qq qq qq q q qq q 0.0 0.0 q q qq qq qq 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 29 – F (Xi ) versus Fθ (Xi ), i.e. P P plot. ˆ 77
• 78. Arthur CHARPENTIER, transformed kernels and beta kernels 1.4 1.4 1.2 1.2Estimated density Estimated density 1.0 1.0 q 0.8 0.8 0.6 0.6 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 30 – Nonparametric estimation of the density of the Fθ (Xi )’s. ˆ 78
• 79. Arthur CHARPENTIER, transformed kernels and beta kernels Estimated optimal transformation Estimated optimal transformation 4.0 q q 5 3.5 q q 3.0 4 q q qQuantile Quantile q q 2.5 q q 3 q q q q q 2.0 q q q q q q q q q q q q q q 2 q q 1.5 qq qq qq qq qq q qq q qq q qq qq qq qq qq q qq qq qq 1.0 q qq qqq qq qqq qq qqq 1 0.80 0.85 0.90 0.95 1.00 0.80 0.85 0.90 0.95 1.00 Probability level Probability level −1 Figure 31 – Nonparametric estimation of the quantile function, Fθ (q). ˆ 79
• 80. Arthur CHARPENTIER, transformed kernels and beta kernels 1.0 1.0 q q q qq q q q qq q qq qq qq qq qq q q qq qq q q qq q q q qqq qq qq qq q q q qq qq qq q q q qq q qq qq q q q q q qqq qq q q qq qq q q q q qq qq 0.8 0.8 qqq q qq qq qq qq qq qq q qq q qq q qq qq q q q q q qq q q qq qTransformed observations Transformed observations q qq qq qq q qq qqq qq q q q qq q qq q qq q qq q qq qq qq q qq qq qq qqq qq qq qq q q q q qq qq qq q q q q qq 0.6 0.6 q qq q q q qq qq q qq qq q q qq qq qq q q q q q qqq qqq qqq qqq q qq q qq q qq qq qq qq qqq q q qq qq q qq qq qq qq q qq qq q q qq q qqq qq q qq qq q qq qq qq qq q 0.4 0.4 q qqqq qq q q q q q q qq q qq qq q qq qq qq qq qq q qq qqq qq q q q q q q q qq q q qq q qq qq qq q qq qq q q q q qq q qq qq qq q q qq q q qq qq q qq qq 0.2 0.2 q qq q q qqq qq qq q qq qq q q q qq q q qq q qq q qq q q qqq q q q qq qq qq qq qq qq qq qq qq qq qq q qq qq q q qq qq q q qq qq qq qq qq qq qq q qq qq qq qq q qq q q q q 0.0 0.0 q qqq q qq q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 32 – F (Xi ) versus Fθ0 (Xi ), i.e. P P plot. 80
• 81. Arthur CHARPENTIER, transformed kernels and beta kernels 1.4 1.4 1.2 1.2Estimated density Estimated density 1.0 1.0 q 0.8 0.8 0.6 0.6 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 33 – Nonparametric estimation of the density of the Fθ0 (Xi )’s. 81
• 82. Arthur CHARPENTIER, transformed kernels and beta kernels Estimated optimal transformation Estimated optimal transformation q q 4 4 q q q qQuantile Quantile q q 3 3 q q q q q q q q q q q q q q q 2 q q q 2 q q q q q q q qq q q q q qq qq qq q qq qq q qq q qq q qq qq qq qq q qq qq qqq qqq 1 qqq qqq 1 0.80 0.85 0.90 0.95 1.00 0.80 0.85 0.90 0.95 1.00 Probability level Probability level −1 Figure 34 – Nonparametric estimation of the quantile function, Fθ0 (q). 82
• 83. Arthur CHARPENTIER, transformed kernels and beta kernels 1.0 1.0 q qq q qq qq q q q qq qq qq q qq q qq q q qq q q qq qq q qq qq qq qq q q q q q qq qq qq qq qq q q q q qq qq qqq qq qq q q q qq q q qq qq 0.8 0.8 q qq q qq qq qqq qq qq qq qq q qq q q qq qq q q q qq qq qq qq q qq qTransformed observations Transformed observations q qq qq qq qq qq q q qqq q qq q qq qq qq q qq q qq q q q q qqq qq qq q q q q q q q qq qq qq qq q q q q qq 0.6 0.6 q q qq qq qq q q qq qqq q qq qq q q q qqq qqq qq qq q q qqq qqq qqq qqq qq q q qq qq qq qq qq qq qqq q q q qq qq qq qq qq q q q qq qq qq q q q qq q q qq qq q qqq qqq qqq qq qq q qq qq q 0.4 0.4 q qq qq q qq qq q qq q qq qq q qq qq q q q qq qq qq q q q q qq q qq qq qq q q qq q qq qq q q q q qq qq qq qqq qq q q q q q qqq qq qq q q qq 0.2 0.2 q q qq q q qq qq qq qq qq qq qq qq q qq q q q q q q qqq qq q q q q qq qq qq qq qq qq qq qq qq q qq q qq qq q q q qq qq q q qq q q q q qq qq qq qq q q q qq q q q 0.0 0.0 q qqq q qq q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 35 – F (Xi ) versus Fθ (Xi ), i.e. P P plot, θ < θ0 (heavier tails). 83
• 84. Arthur CHARPENTIER, transformed kernels and beta kernels 1.4 1.4 1.2 1.2Estimated density Estimated density 1.0 1.0 q 0.8 0.8 0.6 0.6 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 36 – Estimation of the density of the Fθ (Xi )’s, θ < θ0 (heavier tails). 84
• 85. Arthur CHARPENTIER, transformed kernels and beta kernels Estimated optimal transformation Estimated optimal transformation 12 q q 12 10 10 q q 8 8Quantile Quantile q q 6 q q 6 q q q q q q 4 q q 4 q q q q q q q q qq qq q q qq q qq q qq qq qq 2 qqq qqq 2 qqq qqq qqq qqq qq qqqq qq qqqq qqqqq qqqqq 0.80 0.85 0.90 0.95 1.00 0.80 0.85 0.90 0.95 1.00 Probability level Probability level −1 Figure 37 – Estimation of quantile function, Fθ (q), θ < θ0 (heavier tails). 85
• 86. Arthur CHARPENTIER, transformed kernels and beta kernels 1.0 1.0 qqqq qqq qq q qqq q q q q q qqq q qqq qqq q q q q q q q qq qqq qq qq qq q q q q qq qq q q q q q q q qq q qq q q q q qq q qqq qq q qq qq q q q qq qq 0.8 0.8 q qq qq qq q q q q qqq qq qq qq q qq qq q qq q qq qq qq qTransformed observations Transformed observations q q q q qq qqq qq q q q qq qq q q q qq q qq q qq qq q q q qq qq qq qq q q qq q q q qq q q q qq qq q 0.6 0.6 q qqq qqq q q q q q qq q q q q q qq qq q q q qq q qq q qqq qqq qqq qqq qq q qq q qq q q q qq qq qq qqq q q q q q q qq qq qq qq q q q q q q q q qqq qq q q qq qq qq qq q 0.4 0.4 q qq qq qq qq qqq qq qq q q q q q q qqq q qq q q q q q q q q q qq q qq qq qq q q q q q q qq qq qq q qq qq qq qq q qq qq qq qq qq q qq qq q q qqq qq qq q qq qq qq qq qq 0.2 0.2 qq q q q qqq qq q q q q qq q q q q q q q qq q qq qq q qq q qq qq qq q qq qq qq qq qq qq qq qq qq q qq q qq qqq qq qq qq q qq q q q qq qq q qqq qq qqq q qqq q q q 0.0 0.0 q q qqq q q q qq q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 38 – F (Xi ) versus Fθ (Xi ), i.e. P P plot, θ > θ0 (lighter tails). 86
• 87. Arthur CHARPENTIER, transformed kernels and beta kernels 1.4 1.4 1.2 1.2Estimated density Estimated density 1.0 1.0 q 0.8 0.8 0.6 0.6 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Figure 39 – Estimation of density of Fθ (Xi )’s, θ > θ0 (lighter tails). 87
• 88. Arthur CHARPENTIER, transformed kernels and beta kernels Estimated optimal transformation Estimated optimal transformation q q 3.5 3.5 q q 3.0 3.0 q q q q 2.5 2.5Quantile Quantile q q q q q q q q 2.0 q q 2.0 q q q q q q q q q q q q q q q q 1.5 q q 1.5 qq qq q qq q qq qq qq qq qq qq qq q qq q qq q qq q qq 1.0 qq qq 1.0 qq qq qq qq 0.80 0.85 0.90 0.95 1.00 0.80 0.85 0.90 0.95 1.00 Probability level Probability level −1 Figure 40 – Estimation of quantile function, Fθ (q), θ > θ0 (lighter tails). 88
• 89. Arthur CHARPENTIER, transformed kernels and beta kernels A universal distribution for lossesBuch-Larsen,Nielsen, Guillen, & Bolancé (2005) considered the Champernownegeneralized distribution to model insurance claims, i.e. positive variables, (y + c)α − cα Fα,M,c (y) = where α > 0, c ≥ 0 and M > 0. (y + c)α + (M + c)α − 2cαThe associated density is then α (y + c)α−1 ((M + c)α − cα ) fα,M,c (y) = . α α ((y + c) + (M + c) − 2c α )2 89
• 90. Arthur CHARPENTIER, transformed kernels and beta kernels A Monte Carlo study to compare those nonparametric estimatorsAs in Buch-Larsen,Nielsen, Guillen, & Bolancé (2005), 4 distributions wereconsidered– normal distribution,– Weibull distribution,– log-normal distribution,– mixture of Pareto and log-normal distributions, 90
• 91. Arthur CHARPENTIER, transformed kernels and beta kernels Box−plot for the 11 quantile estimators Density of quantile estimators (mixture longnormal/pareto) R benchmark qqqqqq qq qqq qqqqq q qqqq q qq q qq qq qq qq qq q qqq qqqqq qq qqq q q qq q qq qq qqq q q qq qqq q qqq q q q q q q qqq q qq q q q qq q q qq q qq qq q q q q q 0.30 E Epanechnikov qq qqqq qqqqqq qqqqq q qqq q qqq qqq qq qq qq qq qq qq qqqqqqqq q qq qqq q qqq q qq q qq qqq q qqq qq q qq q q q qq q qq q q q q q qq q qq qq q q q q q q q 0.25 HD Harrell Davis q qqqqq qqqqq q qqq qq qq qqqq qqq q qqq q qq qq qqqqq qqq q q q q qq q q q qqqqqq qq q qq q q qq qqq qq q q qq qq q q q q q qq q q q qq q qq q q q q q q q q q q qq q q Benchmark (R estimator) HD (Harrell−Davis) PDG Padgett 0.20 q q q q qqqq q q qqqqq q qqq qq q qq qqq q qq qqq q qq qq q q qq q q density of estimators PRK (Park) B1 (Beta 1) PRK Park qqqqqqq q q qqqq qqqq qqqq qqqq qqq qqq qq q qq q q q qq qq qq qq q qq q qq q q qqqqq q q q q q q q q q q q q qqq q q q q qq q q q qq q q B2 (Beta 2) 0.15 Beta1 qqqqq q q qq q q qq qqq qqq qqq qqq qqq q qq qqqqq q q qqqq q q qqqqqq qqq q q q qq qq q q q q q q q qq q qq q q qq q q q qq q q q q qq q q q q qq q qq q q q 0.10 MACRO Beta1 q q qqq q q qqq q qqq q qqq qqq qq qq qq qq q qqqqqq q q q qqqqq q qqq qq qqq q q qq qq q qq qqq qq q q qq q q qq q MICRO Beta1 q qqqqq q qq qqq qqqq qq qq qqqq qqq q q qq qq q q qqqq qqqqq q qq q qqqq qqqqq q q qq q qq qqqq qqqqq q qq q qqq qq qqq qqqq q qq q q q q q q qq q q 0.05 Beta2 qq qqqqq q qqqqq qqqqq qqqq qqqq qq qq qq qq q qqqq q qq q q q qq q qq q qqq q q qqq qq q q q qqq qq q qqq qq q q q q q q qq q 0.00 MACRO Beta2 q qqqq q qqqq q qqqq q qqqq qqq qqq qq qq q qqqqqqqqqq qqqqqqqq q qqq qq q q q qqq qqq q q qqq q q qqq q q qqq q q q q q q MICRO Beta2 qqqqq q qqqq q qqqq qqqq qqq qqq qq q qqq qq q qqq qqqq qqq q q qqqq qqq qq qq qqqq q qq qq qqq qq qqqq q q q qq q q q q qq q 5 10 15 20 q Estimated value−at−risk 5 10 15 20Figure 41 – Distribution of the 95% quantile of the mixture distribution, n = 200,and associated box-plots. 91
• 92. Arthur CHARPENTIER, transformed kernels and beta kernels MSE ratio, normal distribution, HD (Harrell−Davis) MSE ratio, normal distribution, B1 (Beta1) MSE ratio, normal distribution, MACB1 (MACRO−Beta1) 2.0 2.0 2.0 n= 50 n= 50 n= 50 n=100 n=100 n=100 n=200 n=200 n=200 n=500 n=500 n=500 1.5 q 1.5 q 1.5 q q q q MSE ratio MSE ratio MSE ratio q 1.0 1.0 1.0 q q q q qq q q q q q q q q q q q q q q q q q q q q 0.5 0.5 0.5 0.0 0.0 0.0 q q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p) MSE ratio, normal distribution, PRK (Park) MSE ratio, normal distribution, B1 (Beta1) MSE ratio, normal distribution, MACB1 (MACRO−Beta1) q q 2.0 2.0 2.0 q q q q n= 50 n= 50 n= 50 n=100 n=100 n=100 n=200 n=200 n=200 n=500 n=500 n=500 1.5 q 1.5 q 1.5 q q q q q q MSE ratio MSE ratio MSE ratio q 1.0 1.0 1.0 q q q qq q q q q q q q q q q q q q 0.5 0.5 0.5 0.0 0.0 0.0 q q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p) Figure 42 – Comparing MSE for 6 estimators, the normal distribution case. 92
• 93. Arthur CHARPENTIER, transformed kernels and beta kernels MSE ratio, Weibull distribution, HD (Harrell−Davis) MSE ratio, Weibull distribution, MACB1 (MACRO−Beta1) MSE ratio, Weibull distribution, MICB1 (MICRO−Beta1) 2.0 2.0 2.0 n= 50 n= 50 n= 50 n=100 n=100 n=100 n=200 n=200 n=200 n=500 n=500 n=500 1.5 q 1.5 q 1.5 MSE ratio q MSE ratio MSE ratio q q q q q 1.0 1.0 1.0 qq q q q q q q q q q q q q q q q q q q q q q q q 0.5 0.5 0.5 0.0 0.0 0.0 q q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p) MSE ratio, Weibull distribution, PRK (Park) MSE ratio, Weibull distribution, MACB1 (MACRO−Beta1) MSE ratio, Weibull distribution, MICB1 (MICRO−Beta1) q 2.0 2.0 2.0 n= 50 q n= 50 n= 50 q n=100 q n=100 n=100 n=200 n=200 n=200 q n=500 n=500 n=500 1.5 q 1.5 q 1.5 q MSE ratio MSE ratio MSE ratio q q q q q 1.0 1.0 1.0 q q q q q q q q q q q q q q q q q q 0.5 0.5 0.5 0.0 0.0 0.0 q q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p)Figure 43 – Comparing MSE for 6 estimators, the Weibull distribution case. 93
• 94. Arthur CHARPENTIER, transformed kernels and beta kernels MSE ratio, lognormal distribution, HD (Harrell−Davis) MSE ratio, lognormal distribution, MACB1 (MACRO−Beta1) MSE ratio, lognormal distribution, MICB1 (MICRO−Beta1) MSE ratio, lognormal distribution, B1 (Beta1) 2.0 2.0 2.0 2.0 n= 50 n= 50 n= 50 n= 50 n=100 n=100 n=100 n=100 n=200 n=200 n=200 n=200 n=500 n=500 n=500 n=500 1.5 q 1.5 q 1.5 q 1.5 q q q q qMSE ratio MSE ratio MSE ratio MSE ratio q q q q 1.0 1.0 1.0 1.0 q qq q q q q q q q q q q q q q q q q q q q q q q q q q q q q q 0.5 0.5 0.5 0.5 0.0 0.0 0.0 0.0 q q q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p) MSE ratio, lognormal distribution, PRK (Park) MSE ratio, lognormal distribution, MACB2 (MACRO−Beta2) MSE ratio, lognormal distribution, MICB2 (MICRO−Beta2) MSE ratio, lognormal distribution, B2 (Beta2) 2.0 2.0 2.0 2.0 q n= 50 n= 50 n= 50 n= 50 n=100 n=100 n=100 n=100 n=200 n=200 n=200 n=200 n=500 n=500 n=500 n=500 1.5 q 1.5 q 1.5 q 1.5 q q q q q qMSE ratio MSE ratio MSE ratio MSE ratio 1.0 1.0 1.0 1.0 q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q 0.5 0.5 0.5 0.5 0.0 0.0 0.0 0.0 q q q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p)Figure 44 – Comparing MSE for 9 estimators, the lognormal distribution case. 94
• 95. Arthur CHARPENTIER, transformed kernels and beta kernels q MSE ratio, mixture distribution, HD (Harrell−Davis) MSE ratio, mixture distribution, MACB1 (MACRO−Beta1) MSE ratio, mixture distribution, MICB1 (MICRO−Beta1) MSE ratio, mixture distribution, B1 (Beta1) q 2.0 2.0 2.0 2.0 n= 50 n= 50 n= 50 n= 50 n=100 n=100 n=100 n=100 n=200 n=200 n=200 n=200 n=500 n=500 n=500 n=500 1.5 q 1.5 q 1.5 q 1.5 q q q qMSE ratio MSE ratio MSE ratio MSE ratio q q 1.0 1.0 1.0 1.0 q q q q q q q q q q q q q q q q q q q q q q q q q q q q 0.5 0.5 0.5 0.5 0.0 0.0 0.0 0.0 q q q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p) MSE ratio, mixture distribution, PRK (Park) MSE ratio, mixture distribution, MACB2 (MACRO−Beta2) MSE ratio, mixture distribution, MICB2 (MICRO−Beta2) MSE ratio, mixture distribution, B2 (Beta2) 2.0 2.0 2.0 2.0 n= 50 n= 50 n= 50 n= 50 n=100 n=100 n=100 n=100 n=200 n=200 n=200 n=200 n=500 n=500 n=500 n=500 1.5 q 1.5 q 1.5 q 1.5 q q q q q q q qMSE ratio MSE ratio MSE ratio MSE ratio q q q q q q q q q q 1.0 1.0 1.0 1.0 q q q q q q q q q q q q q q q q q q 0.5 0.5 0.5 0.5 0.0 0.0 0.0 0.0 q q q q 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p) Probability, confidence levels (p) Figure 45 – Comparing MSE for 9 estimators, the mixture distribution case. 95
• 96. Arthur CHARPENTIER, transformed kernels and beta kernels Some referencesCharpentier, A. , Fermanian, J-D. & Scaillet, O. (2006). The estimation ofcopulas : theory and practice. In : Copula Methods in Derivatives and RiskManagement : From Credit Risk to Market Risk. Risk BookCharpentier, A. & Oulidi, A. (2008). Beta Kernel estimation for Value-At-Risk ofheavy-tailed distributions. in revision Journal of Computational Statistics and DataAnalysis.Charpentier, A. & Oulidi, A. (2007). Estimating allocations for Value-at-Riskportfolio optimzation. to appear in Mathematical Methods in Operations Research.Chen, S. X. (1999). A Beta Kernel Estimator for Density Functions. ComputationalStatistics & Data Analysis, 31, 131-145.Devroye, L. & Györfi, L. (1981). Nonparametric Density Estimation : The L1 View.Wiley.Gouriéroux, C., & Montfort, A. 2006. (Non) Consistency of the Beta KernelEstimator for Recovery Rate Distribution. CREST-DP 2006-32. 96