Analytics, Big Data and The Cloud II Conference - Kiribatu Labs
Learn how insurers predict risk and how you can apply it to your predictive analytics project.

Published in: Technology, Economy & Finance

Transcript

  • 1. Learn how insurers predict risk and how you can apply it to your predictive analytics project. Pawel Brzeminski, Founder & CEO, pawel@kirbatulabs.com. May 15, 2013. Analytics, Big Data, and The Cloud II, Edmonton.
  • 2. The Company. Kiribatu Labs: Discovering Knowledge Assets. Kiribatu is a predictive analytics company, founded in 2009, with 6 employees. We serve the Canadian financial sector, predominantly Property & Casualty insurance.
  • 3. Predictive analytics, huh? Goal-driven ANALYSIS of a large data set to PREDICT human behavior.
  • 4. If speed was important to you… YOUR insurance premium is calculated by methods designed 40-50 years ago. VS.
  • 5. Risk assessment in Insurance
      • A vast majority of Canadian insurers (May 2013) still use outdated premium rating formulas created in the 1960s-1970s.
      • Only a handful of Canadian insurance companies are sophisticated predictive analytics users.
      • Leaders are decimating their competition.
  • 6. Where to start? How to identify an opportunity for a predictive analytics project? Image source: By Phil McElhinney from London (Jeremy Wariner) (http://creativecommons.org/licenses/by-sa/2.0)
  • 7. Questions to ask while starting
      • Data is already collected (or can be easily acquired): transactional data, customer data, sensor-generated data, usage data, etc.
      • There is a clear objective to predict something: future price, failure rate, customer risk, customer profitability, customer retention, etc.
      • Well-defined functional settings are a great place to start: we focused on optimizing a Risk Sharing Pool (RSP) problem.
      • Typically the SMEs (Subject Matter Experts) are making decisions based on their experience and "gut feeling": senior underwriters in our case.
      • Significant ROI is expected: investment in analytics can be small, but usually it is not trivial.
  • 8. Example
      • The Risk Sharing Pool is a construct used by Canadian insurers to optimize their risk assessment.
      • Insurers put their highest risks (primary driver and a vehicle) in the pool to avoid paying for the claims, but they forfeit the premium.
      • Insurers retain the risks they deem profitable on their book of business, so they can collect the premium and make a profit.
  • 9. Challenge: Can we effectively predict future claims on policies? The model would need to predict claims that will occur up to 12 months in advance.
  • 10. Introducing Underwriting Score
      • The predictive model generates an Underwriting (UW) Score.
      • The UW Score is a number between 1 and 1000.
      • High UW Score = high profitability = low risk.
      • Low UW Score = low profitability = high risk.
      • Highly accurate predictor of future claims on a policy.
      • The UW Score will be used to assess which risks are placed in the pool and which are not.
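
The pool decision described on slide 10 could be sketched as a simple threshold rule. This is only an illustration, not Kiribatu's method: the cutoff value `POOL_CUTOFF` and the function name `place_in_pool` are hypothetical, and the slides do not say how a cutoff would be chosen.

```python
# Hypothetical cutoff; the slides give no actual threshold.
POOL_CUTOFF = 200


def place_in_pool(uw_score: int, cutoff: int = POOL_CUTOFF) -> bool:
    """Return True if a risk would be ceded to the Risk Sharing Pool.

    Low UW Score = high risk, so scores below the cutoff go to the pool.
    """
    if not 1 <= uw_score <= 1000:
        raise ValueError("UW Score must be between 1 and 1000")
    return uw_score < cutoff


# Made-up scores for three policies.
scores = {"policy_a": 950, "policy_b": 120, "policy_c": 200}
pooled = [name for name, score in scores.items() if place_in_pool(score)]
print(pooled)  # ['policy_b']
```

Risks that are not pooled stay on the insurer's book, where the premium is collected.
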
  • 11. 4 Key Modeling Steps: Data Preparation, Rating Factor Analysis, Model Development, Gain Assessment.
  • 12. Data Preparation
      • Policy & claims data profiling, understanding and verification
      • Data cleansing (filling missing values, outlier removal)
      • Data transformation
      • Data normalization (inflation & claim development factors)
      • Data enrichment with 3rd party data (demographic, econometric: Census Canada, VICC, CLEAR, etc.)
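
Outlier removal during data cleansing can start with simple plausibility rules. The sketch below flags implausible annual driving distances; the 100,000 km cap and the record field names are assumptions for illustration, not values from the talk. As a worked check, a reported 200,000 km per year comes to roughly 548 km per day, every day of the year.

```python
DAYS_PER_YEAR = 365
MAX_PLAUSIBLE_KM_PER_YEAR = 100_000  # assumed cap, not from the slides


def km_per_day(annual_km: float) -> float:
    """Average daily distance implied by a reported annual distance."""
    return annual_km / DAYS_PER_YEAR


def is_plausible(annual_km: float, cap: float = MAX_PLAUSIBLE_KM_PER_YEAR) -> bool:
    """Accept only non-negative distances at or below the cap."""
    return 0 <= annual_km <= cap


# Hypothetical policy records.
records = [
    {"driver": "A", "annual_km": 15_000},
    {"driver": "B", "annual_km": 200_000},
]
clean = [r for r in records if is_plausible(r["annual_km"])]

print(round(km_per_day(200_000)))  # 548
print([r["driver"] for r in clean])  # ['A']
```
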
  • 13. Rating Factor Analysis
      • Statistical analysis of each data element for its propensity to claim
      • Rating factors with high correlations are included in the final predictive model(s)
      • Often, new powerful rating factors are discovered in this step (very useful for Underwriting)
  • 14. Model Development
      • Algorithm selection (genetic algorithms, neural networks, logistic regression, SVM)
      • Time-wise training and testing data set split
      • Model parameterization, generation and evaluation
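
A time-wise split trains on older records and tests on newer ones, so the model is evaluated on data from a period it has not seen. The following is a minimal sketch; the field name `effective_date`, the sample records, and the cutoff date are all hypothetical.

```python
from datetime import date


def time_wise_split(records, cutoff):
    """Train on records dated before the cutoff, test on the rest."""
    train = [r for r in records if r["effective_date"] < cutoff]
    test = [r for r in records if r["effective_date"] >= cutoff]
    return train, test


# Made-up policy transactions.
records = [
    {"policy": "P1", "effective_date": date(2010, 3, 1), "claim": 0},
    {"policy": "P2", "effective_date": date(2011, 7, 15), "claim": 1},
    {"policy": "P3", "effective_date": date(2012, 1, 10), "claim": 0},
]
train, test = time_wise_split(records, cutoff=date(2012, 1, 1))
print(len(train), len(test))  # 2 1
```

Unlike a random split, this keeps every training record strictly earlier than every test record, which is closer to how a model predicting claims up to 12 months ahead would be used.
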
  • 15. RSP Gain Assessment
      • Calculation of UW Scores on test data set
      • Retrospective underwriting gain assessment on historical data sets
  • 16. Results: UW Score = 1000 – Risk Score. Source: “Improving P&C Insurance Risk Management and Policy Pricing with Predictive Analytics”, Pawel Brzeminski, September 2011, http://www.kiribatulabs.com/resources.php.
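
The relation on slide 16 can be written as a one-line conversion. In this sketch the function name is hypothetical, and the accepted Risk Score range of 0 to 999 is an inference from the stated UW Score range of 1 to 1000, not something the slides spell out.

```python
def uw_score(risk_score: float) -> float:
    """Convert a Risk Score to a UW Score: UW Score = 1000 - Risk Score."""
    # Assumed range, inferred from the UW Score range of 1 to 1000.
    if not 0 <= risk_score <= 999:
        raise ValueError("Risk Score expected in the range 0 to 999")
    return 1000 - risk_score


print(uw_score(150))  # 850
print(uw_score(999))  # 1
```
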
  • 17. 4 Key Challenges
      • Extremely low correlations / data set imbalance: 98% of policy transactions do not have any claims, 2% have claims.
      • Bad, bad data: drivers driving 200,000 km per year (that's driving over 500 km per day for 365 days a year).
      • Over-fitting: certain features do not generalize very well in a time-wise data split.
      • Data sparsity: Motor Vehicle Abstract (MVA) data that contains convictions, suspensions and reinstatements is not always available.
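
One common way to handle a 98% / 2% class imbalance is inverse-frequency instance weighting, where each class receives the same total weight. The slides do not say which weighting scheme was actually used, so this is a generic sketch with a hypothetical function name and synthetic labels.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * n_c): n samples, k classes, n_c in class c."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


# Synthetic labels: 98 transactions without a claim (0), 2 with a claim (1).
labels = [0] * 98 + [1] * 2
weights = inverse_frequency_weights(labels)

print(weights[1])            # 25.0
print(round(weights[0], 3))  # 0.51
```

With these weights the 2 claim transactions together count as much as the 98 claim-free ones, so a model cannot score well simply by predicting "no claim" everywhere.
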
  • 18. 5 Key Breakthroughs
      • Policy transactions collapsed into single vectors: individual risk assessment for each vehicle on the policy.
      • Instance sampling and weighting: dealing with dataset imbalance and bad data.
      • Custom model quality metric: aggregation of the highest claims in the top 5% of all transactions really moved the needle.
      • Risk assessment per insurance coverage: different data elements are important for each coverage; for instance, liability coverage and comprehensive coverage are completely different products and behave very differently.
      • Prediction of profitability: include written premiums in the 2nd level model.
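
The custom quality metric is only described in one line, so its exact definition is unknown. One possible reading is the share of total claim dollars that falls in the 5% of transactions the model ranks as riskiest (lowest UW Score). The sketch below implements that reading; the function name and the sample data are hypothetical.

```python
def top_fraction_claim_capture(uw_scores, claim_amounts, fraction=0.05):
    """Share of total claim dollars found in the riskiest `fraction` of transactions."""
    if len(uw_scores) != len(claim_amounts):
        raise ValueError("uw_scores and claim_amounts must have equal length")
    total = sum(claim_amounts)
    if total == 0:
        return 0.0
    # Always look at at least one transaction.
    n_top = max(1, int(len(uw_scores) * fraction))
    # Lowest UW Score = highest predicted risk, so sort ascending by score.
    ranked = sorted(zip(uw_scores, claim_amounts), key=lambda pair: pair[0])
    captured = sum(amount for _, amount in ranked[:n_top])
    return captured / total


# 20 made-up transactions, so the top 5% is a single transaction.
uw = [50] + [900] * 18 + [300]
claims = [8000.0] + [0.0] * 18 + [2000.0]
print(top_fraction_claim_capture(uw, claims))  # 0.8
```

A higher value means the model concentrates more of the eventual claim cost in the small group of risks it would send to the pool.
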
  • 19. Homework: Where can I apply predictive analytics in my business? Questions? Always happy to have a coffee. Pawel Brzeminski, Founder & CEO, pawel@kirbatulabs.com, 780-232-2634, http://ca.linkedin.com/pub/pawel-brzeminski/0/523/555, @pawelwb