Successfully reported this slideshow.
Upcoming SlideShare
×

# Estimating Tail Parameters

281 views

Published on

When fitting loss data (insurance) to a distribution, often the parameters that provide a good overall fit will understate the density in the tail.
This method allows one to split the distribution into 2 portions, and use a Pareto distribution to fit the tail.
Presented at the CAS Spring Meeting in Seattle, May 2016.

Published in: Economy & Finance
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

• Be the first to like this

### Estimating Tail Parameters

1. 1. Parameters for a Loss Distribution Estimating the Tail CAS Spring Meeting Seattle, Washington May 17, 2016 Alejandro Ortega, FCAS, CFA
2. 2. 2 The Problem • Estimate the Distribution of Losses for 2016 • We can use this to measure economic capital or regulatory capital (Solvency II) • Test Different Reinsurance Plans • Could be a book of Business – Auto, Marine, Property, or an entire company
3. 3. 3 Data • Analysis Date – 13 Nov 2015 • Premium 2010 – 2014 • Loss Data 2010 – 2014 • Data as of Sep 30, 2015 • Why don’t we use 2015 data
4. 4. 4 Summary of Loss Data Year Number of Claims Amount Paid 2010 330 3,057,507 2011 312 3,177,057 2012 256 2,849,844 2013 272 3,571,991 2014 367 4,680,122
5. 5. 5 Summary of Data Assumptions • Size of the book is unchanged • If it has, we would calculate Frequency of Claims, instead of number • Loss Trend is zero – 0% • If it isn’t, we need to adjust the historical data • If we have a sufficiently large book, or market data we can use a specific loss trend • Otherwise, a general inflation estimate can be used
6. 6. 6 Summary of Data Amount Paid 1,000 2,000 3,000 4,000 5,000 2010 2011 2012 2013 2014 Amount Paid
7. 7. 7 Summary of Data Number of Claims 0 100 200 300 400 Number of Claims
8. 8. 8 Year Number of Claims Amount Paid Severity 2010 330 3,057,507 9,265 2011 312 3,177,057 10,183 2012 256 2,849,844 11,132 2013 272 3,571,991 13,132 2014 367 4,680,122 12,752 • Severity appears to be increasing • Speak to Underwriting, Product and Claims Summary of Data
9. 9. 9 Summary of Data Number of Claims (Quarterly) Mean 77 Median 77 Min 49 Max 114 Std Dev 15 • There does not appear to be a trend • We assumed exposures is constant over this period 0 20 40 60 80 100 120 2010Q1 2011Q1 2012Q1 2013Q1 2014Q1 Number of Claims Claims Mean
10. 10. 10 Summary of Data Number of Claims (Quarterly) Mean 77 Median 77 Min 49 Max 114 Std Dev 15 • Rolling Averages are helpful for looking for trends • Use annual periods to minimize seasonality effects 0 20 40 60 80 100 120 2010Q1 2011Q1 2012Q1 2013Q1 2014Q1 Number of Claims Claims Mean Rolling 8
11. 11. 11 Steps • Estimate Frequency Distribution • Generally, estimate frequency and apply to projected exposures • Estimate Severity Distribution • First the Meat (belly) of the distribution • Then the Tail
12. 12. 12 Frequency Distributions • Negative Binomial • Poisson • Overdispersed Poison • Binomial
13. 13. Frequency Parameters 13 Selected Parameters 37 0.68 Actual Data Mean 76.9 Standard Deviation 15.4 Negative Binomial
14. 14. Frequency Parameters 14 Selected Parameters 37 0.68 Estimated Mean 77.3 Standard Deviation 15.5 Actual Data Mean 76.9 Standard Deviation 15.4
15. 15. 15 Frequency Distribution Simulation 1 0 40 80 120 2010Q1 2011Q1 2012Q1 2013Q1 2014Q1 Actual Claims Simulated Negative Binomial
16. 16. 16 Frequency Distribution Simulation 2 0 40 80 120 2010Q1 2011Q1 2012Q1 2013Q1 2014Q1 Actual Claims Simulated Negative Binomial
17. 17. 17 Frequency Distribution Simulation 3 0 40 80 120 2010Q1 2011Q1 2012Q1 2013Q1 2014Q1 Actual Claims Simulated Negative Binomial
18. 18. 18 Frequency Distribution Simulation 4 0 40 80 120 2010Q1 2011Q1 2012Q1 2013Q1 2014Q1 Actual Claims Simulated Negative Binomial
19. 19. 19 Steps • Estimate Frequency Distribution • Generally, estimate frequency and apply to projected exposures • Estimate Severity Distribution • First the Meat (belly) of the distribution • Then the Tail
20. 20. 20 Severity Distributions • Weibull • Gamma • Normal • LogNormal • Exponential • Pareto
21. 21. 21 Severity - Parameters • Parameter Estimation • Maximum Likelihood Estimators • Graph Distribution of Losses with Estimated Distribution • Graphical representation confirms if the modeled distribution is a good fit
22. 22. 22 Steps • Estimate Frequency Distribution • Generally, estimate frequency and apply to projected exposures • Estimate Severity Distribution • First the Meat (belly) of the distribution • Then the Tail
23. 23. Excess Mean For all Losses greater than 𝑢; the average amount by which they exceed 𝑢 𝐸 𝑋 − 𝑢 𝑋 > 𝑢 • For any 𝑢 When this increases with respect to 𝑢 you have a fat tailed distribution 23
24. 24. Excess Mean - Normal 24 93 - 250 500 - 500 1,000 1,500 2,000 u Excess Mean Normal
25. 25. Excess Mean - Exponential 25 200 - 250 500 - 500 1,000 1,500 2,000 u Excess Mean Exponential
26. 26. Excess Mean - Weibull 26 454 - 500 1,000 1,500 2,000 - 500 1,000 1,500 2,000 u Excess Mean Weibull
27. 27. Excess Mean - LogNormal 27 622 - 500 1,000 1,500 2,000 - 500 1,000 1,500 2,000 u Excess Mean Lognormal
28. 28. Excess Mean - Pareto 28 Linear 832 - 500 1,000 1,500 2,000 0 500 1000 1500 2000 u Excess Mean Pareto
29. 29. Fat Tailed Embrechts: Any Distribution with a Fat Tail In the Limit, has a Pareto distribution 29 The girth (size, fatness) of the tail is controlled by the parameter 𝜉 (Xi) 0.5 < 𝜉 Variance does Not Exist 1 < 𝜉 Mean Does not Exist
30. 30. Fat Tailed Embrechts: Any Distribution with a Fat Tail In the Limit, has a Pareto distribution 30 1 < 𝜉 < 2 Insurance 2 < 𝜉 < 5 Finance (eg. stocks)
31. 31. Pareto Distribution 𝐹𝑢 𝑥 = 1 − 1 + 𝜉𝑥 𝛽 − 1 𝜉 𝑆 𝑢 𝑥 = 1 + 𝜉𝑥 𝛽 − 1 𝜉 31
32. 32. Data - Severity 32 Claim Quarter Amount Amount with Trend 1 2010Q1 14,101 14,101 2 2010Q1 1,824 1,824 3 2010Q1 688 688 … … ... … 1534 2014Q4 25 25 1535 2014Q4 32,574 32,574 1536 2014Q4 15,380 15,380 1537 2014Q4 1,016 1,016
33. 33. Data - Severity Largest Claims 33 Claim Quarter Amount 1305 2014Q2 126,434 415 2011Q1 135,387 1392 2014Q3 149,925 1055 2013Q3 153,900 1423 2014Q3 166,335 225 2010Q3 181,881 1381 2014Q3 254,864 1310 2014Q2 510,060
34. 34. Severity 34 0% 20% 40% 60% 80% 100% - 200,000 400,000 600,000 Empirical
35. 35. Survival Function – Log Log Scale 35 𝑆 𝑥 = 1 − 𝐹(𝑥) 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 12.50% 25.00% 50.00% 100.00% 1 8 128 2,048 32,768 524,288 Empirical Log Log Scale
36. 36. Survival Function – Log Log Scale 36 𝑆 𝑥 = 1 − 𝐹(𝑥) Look for the Turn The Tail of the Pareto Distribution is a Line on a log log graph 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 12.50% 10,000 80,000 Empirical Log Log Scale
37. 37. Survival Function – Log Log Scale 37 𝑆 𝑥 = 1 − 𝐹(𝑥) 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 40,000 320,000 Empirical Log Log Scale The Tail of the Pareto Distribution is a Line on a log log graph
38. 38. Survival Function – Log Log Scale 38 𝑆 𝑥 = 1 − 𝐹(𝑥) 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 40,000 320,000 Empirical Log Log Scale The Tail of the Pareto Distribution is a Line on a log log graph
39. 39. Survival Function – Log Log Scale 39 Don’t pick a line based on just the end points 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 40,000 320,000 Empirical Log Log Scale
40. 40. Survival Function – Log Log Scale 40 Don’t pick a line based on just the end points 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 40,000 320,000 Empirical Log Log Scale
41. 41. Select Parameters - Severity 𝑢 = 50,000 𝐹 𝑢 = 95.26% 𝑆 𝑢 = 4.74% You can try different 𝑢’s to see if the results are similar (or different) 𝑢 should be deep enough in the tail, that the graph is a line Want enough points past 𝑢, so we have confidence in 𝐹(𝑢) 41
42. 42. Select 𝑢, determine 𝐹(𝑢) 42 Claim Quarter Amount Empirical 𝐹(𝑥) 1348 2014Q3 49,302 95.12% 592 2011Q4 49,479 95.19% 105 2010Q2 49,885 95.25% 50,000 95.26% 686 2012Q1 50,862 95.32% 682 2012Q1 51,085 95.38% 639 2011Q4 51,160 95.45% 𝐹 𝑥 = 𝑟𝑎𝑛𝑘𝑖𝑛𝑔 𝒏 + 𝟏 𝑛 =Number of Claims
43. 43. First Parameter Selection Select 𝜉 first, then 𝛽 The selected 𝜉 is too high 43 𝑢 = 50,000 𝑆 𝑢 = 4.74% 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 40,000 320,000 Empirical F(x) Pareto S(x) 1.00 = 2,000 0.01% 0.02% 0.04% 0.08% 0.16% 0.31% 0.63% 1.25% 2.50% 5.00% 40,000 320,000 1.00 = 1,500 Empirical Log Log Scale
44. 44. Second Parameter Selection Better. The slope is steeper, but it’s still too shallow 44 𝑢 = 50,000 𝑆 𝑢 = 4.74% 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 40,000 320,000 Empirical F(x) Pareto S(x) 1.00 = 2,000 0.01% 0.02% 0.04% 0.08% 0.16% 0.31% 0.63% 1.25% 2.50% 5.00% 40,000 320,000 0.50 = 6,000 Empirical Log Log Scale
45. 45. Third Parameter Selection Now, it drops too quickly 𝜉 ∈ [0.2,0.5] 45 𝑢 = 50,000 𝑆 𝑢 = 4.74% 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 40,000 320,000 Empirical F(x) Pareto S(x) 1.00 = 2,000 0.01% 0.02% 0.04% 0.08% 0.16% 0.31% 0.63% 1.25% 2.50% 5.00% 40,000 320,000 0.20 = 15,000 Empirical Log Log Scale
46. 46. Fourth Selection of Parameters This is much better Let’s look a little lower and higher 46 𝑢 = 50,000 𝑆 𝑢 = 4.74% 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 40,000 320,000 Empirical F(x) Pareto S(x) 1.00 = 2,000 0.01% 0.02% 0.04% 0.08% 0.16% 0.31% 0.63% 1.25% 2.50% 5.00% 40,000 320,000 0.35 = 9,000 Empirical Log Log Scale
47. 47. Fifth Selection of Parameters This is also quite good 47 𝑢 = 50,000 𝑆 𝑢 = 4.74% 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 40,000 320,000 Empirical F(x) Pareto S(x) 1.00 = 2,000 0.01% 0.02% 0.04% 0.08% 0.16% 0.31% 0.63% 1.25% 2.50% 5.00% 40,000 320,000 0.30 = 10,500 Empirical Log Log Scale
48. 48. Sixth Selection of Parameters Excellent Fit from 40k to about 120k, not as good for the last 7 points or so 48 𝑢 = 50,000 𝑆 𝑢 = 4.74% 0.01% 0.02% 0.04% 0.08% 0.16% 0.31% 0.63% 1.25% 2.50% 5.00% 40,000 320,000 0.40 = 8,200 Empirical Log Log Scale 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25% 40,000 320,000 Empirical F(x) Pareto S(x) 1.00 = 2,000 Empirical Log Log Scale
49. 49. 0.01% 0.02% 0.04% 0.08% 0.16% 0.31% 0.63% 1.25% 2.50% 5.00% 40,000 320,000 0.35 = 9,000 Empirical Log Log Scale Final Selection of Parameters 300k was the size of the expected largest claim 49 𝑢 = 50,000 𝑆 𝑢 = 4.74% 0.01% 0.02% 0.05% 0.10% 0.20% 0.39% 0.78% 1.56% 3.13% 6.25%40,000 320,000 S(x) emprico - logarithmo S(x) empirico S(x) Pareto
50. 50. Largest Expected Loss 𝝃 𝜷 Largest Loss 99.93%ile 0.30 10,500 277,000 0.35 9,000 304,000 0.40 8,200 358,000 50
51. 51. Resumen Severidad We estimate a distribution for the bely • Up to 95.26% For the Tail, we use Pareto with the following parameters: • 𝑢 = 50,000 • 𝐹 𝑢 = 95.26% • 𝜉 = 0.35 • 𝛽 = 9,000 51
52. 52. 52 Steps • Estimate Frequency Distribution • Generally, estimate frequency and apply to projected exposures • Estimate Severity Distribution • First the Meat (belly) of the distribution • Then the Tail
53. 53. Assumptions Book is Homogenous • Homes, Apartments Useful Exposure Base • Cars, Homes, Revenue Make Loss Trend Adjustments • Market Inflation, or more detailed, if available Model Risk • Exposures, Inflation 53
54. 54. Thank You