EMB America + Salford Systems
Getting the best of Two Worlds
   Who is EMB?

   Insurance industry predictive modeling
    applications

   EMBLEM- our GLM tool

   How we have used CART with EMBLEM

   Case studies

   Other areas of expected synergies
   Global network of p&c insurance consultants
    servicing clients throughout the world

   (insert globe)
   Predictive Modeling
   Ratemaking & Profitability Analysis
   Underwriting & Credit Scoring
   Enterprise Risk Management, Pro Forma, Business
    Planning
   Retention & Conversion Modeling
   New Program Development
   Competitive Analysis
   Reinsurance Program Analysis
   Reserve Analysis & Opinion Letters
   Software Development & Software Support
   Expert Witness Testimony
   Regulatory Support & Law Analysis
   EMB’s suite of software products covers all
    aspects of personal and commercial lines of
    insurance
    ◦   EMBLEM
    ◦   Rate Assessor
    ◦   Classifier
    ◦   Igloo Professional
    ◦   ExtrEMB
    ◦   ResQ Professional
    ◦   PrisEMB
    ◦   RePro
   We use EMBLEM, a GLM tool, for our
    predictive modeling needs

   Why?
   Primary application:
    ◦ Estimating the cost of the product insurers sell (insurance), in two steps:

       Reserving= estimating the cost of outstanding insurance claims
       Pricing= estimating the cost of future insurance coverage

   Secondary applications
    ◦ Retention Modeling= probability that a policyholder will renew

    ◦ Conversion Modeling= probability that a prospective policyholder
      will purchase a policy

    ◦ Price Optimization

    ◦ Claim fraud detection

    ◦ Marketing
   Goal is to develop a unique rate for every risk
    ◦ Don’t think in terms of good/bad risks

    ◦ State Farm/Allstate vs GEICO/Progressive

    ◦ Quickly exhausts the data
       Credibility/ variability/ stability

   Risks are described by the predictor variables, not the
    target.
    ◦ Need to have a mapping of the predictor variable levels to a target
      value- not the other way around

       Other way around makes it difficult to derive impact of individual
        predictor variables

       Important because actual data often does not describe all possible
        combinations of potential customers
   Highly regulated marketplace
    ◦ Restrictions
      Predictors that can and cannot be used
         Credit scores

      Rules on values for the predictors
         Ages 65+ relativities cannot be >110% of ages 40-60
         Maximum rate change between adjacent territories

      Rules on predictor order and magnitude of importance
         CA Sequential Analysis (driving record>annual mileage>years held
          license)

    ◦ Regulatory Approval
      Rates need to be supported

      Black box methodologies will not be accepted
   Response variable is continuous/discrete function

   (insert graph)
    ◦ Gamma consistent with severity modeling, or even Inverse
      Gaussian

   (insert graph)
    ◦ Poisson consistent with frequency modeling

   No single trial/outcome
    ◦ Trial is measured in terms of time

    ◦ Actual policy length varies tremendously because of changes
       Marital status
       New car
       Moved
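The point that a trial is measured in time can be illustrated with a minimal sketch. It assumes an intercept-only Poisson frequency model, for which the maximum-likelihood estimate is total claims divided by total exposure. The policy records and function names are hypothetical and are not taken from the slides.

```python
# Hypothetical policy records: (exposure in policy-years, claim count).
policies = [
    (1.00, 0),   # full-year policy, no claims
    (0.25, 1),   # cancelled after three months, one claim
    (0.50, 0),   # mid-term change (e.g. new car) split the record
    (0.75, 1),
]

def poisson_frequency(records):
    """Intercept-only Poisson MLE: claims per unit of exposure."""
    total_exposure = sum(exposure for exposure, _ in records)
    total_claims = sum(claims for _, claims in records)
    return total_claims / total_exposure

def naive_frequency(records):
    """Treats every record as one trial, ignoring how long it was in force."""
    return sum(claims for _, claims in records) / len(records)

frequency = poisson_frequency(policies)
naive = naive_frequency(policies)
print(f"exposure-weighted: {frequency:.3f}, per-record: {naive:.3f}")
```

Counting each record as one trial understates the claim frequency here because short records carry the same weight as full-year ones.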
   In 1996, EMB designed EMBLEM to provide access to GLM for
    statisticians and non-statisticians pricing personal and
    commercial insurance

   EMBLEM revolutionized the use of GLMs, enabling analysis that
    was previously either impossible or too time-consuming to be
    worth attempting

   EMBLEM is now used by over 100 insurance companies globally:
    ◦ 18 of the top 20 personal auto writers in the UK
    ◦ 50 companies in the US including 8 of the top 10 personal auto writers

   Fastest GLM tool with the capability to model millions of
    observations in seconds with a host of diagnostic tools:
    ◦ Graphical, practical, statistical, automated.

    ◦ Stand-alone software package that can be integrated with a variety of
      external software including SAS®

    ◦ Microsoft® Visual Basic® for Applications provides ultimate flexibility
   GLM characteristics work to our advantage
    ◦ Exponential family does an excellent job of describing
      the underlying components of insurance losses

    ◦ Output of the model is in the form of Beta parameters
      which can easily be converted to rate relativities

    ◦ EMBLEM is not automated
      User has complete control over the model structure

      Complete diagnostic tools to assist the modeler with
       decisions
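As a rough illustration of converting Beta parameters to rate relativities, the sketch below assumes a log-link GLM, where exponentiating a coefficient gives the multiplicative factor relative to the base level. The factor levels and coefficient values are hypothetical; the slides do not state which link function is used.

```python
import math

# Hypothetical fitted coefficients from a log-link GLM; the base level has beta = 0.
betas = {
    "age_17_25": 0.405,
    "age_26_64": 0.0,      # base level
    "age_65_plus": 0.095,
}

def to_relativities(coefficients):
    """Under a log link, exp(beta) is the multiplicative factor vs the base level."""
    return {level: math.exp(beta) for level, beta in coefficients.items()}

relativities = to_relativities(betas)
for level, rel in relativities.items():
    print(f"{level}: {rel:.3f}")
```

The base level always maps to a relativity of 1.0, which is what makes the output straightforward to lay out as a rating table.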
   In terms of estimating the cost of insurance:
    ◦ UK has embraced predictive modeling
       Experienced with its techniques

       Knowledgeable with the factors that tend to be predictive

    ◦ US is learning about predictive modeling
       Saturation with big players in personal lines marketplace

         Companies not using predictive modeling techniques are being adversely
          selected against

         Now expanding dimensionality of databases

       Still fairly new concept in commercial lines marketplace

         Big players are using techniques but historical rating structures are
          hindering the rapid expansion
   Result?
    ◦ UK is expanding into secondary applications
       Retention modeling

       Conversion modeling

       Price optimization

       Claim fraud detection

    ◦ Because Predictive Modeling has been around for some time in the
      UK, the datasets are getting larger in terms of the number of
      predictors to evaluate

    ◦ Experienced US companies are beginning to evaluate the
      secondary applications

    ◦ Marketing is used in a manner similar to other industries
   How does CART fit into this?
    ◦ As we transition into the secondary applications we move
      from modeling a continuous function to a binary function

       Tree-based techniques can add value to the analysis

   Retention and Conversion modeling
    ◦ Accept/ Reject target variable

    ◦ A smooth response surface is desirable

    ◦ Price optimization integrates these with premium models

   Marketing and Fraud detection
    ◦ Classic tree applications
   Using CART and EMBLEM
    ◦ Goal is to play off of the strengths of each tool

   CART strengths
    ◦ Automatic separation of relevant from irrelevant predictors

    ◦ Easily rank-orders variable importance

    ◦ Automatic interaction detection (requires additional work)

    ◦ Captures multiple structures within a dataset rather than a
      single dominant structure

    ◦ Can handle missing values and is impervious to outliers
   EMBLEM Strengths

    ◦ User has control over the model structure

    ◦ Ease of communication/conceptualization: the effect
      of each explanatory variable is transparent

    ◦ Provides predicted response values for new data
      points
   CART
    ◦ Factor selection

    ◦ Interaction detection

    ◦ Model validation

   EMBLEM
    ◦ Model structure

    ◦ Incorporating time/seasonality trend effects

    ◦ Implementation of results
   CART and EMBLEM are both excellent tools that
    produce consistent results in similar
    situations

    ◦ This is not an exercise of seeing which is better

   The purpose of this discussion is to show how
    efficiencies can be gained in the modeling
    process

    ◦ As datasets get larger in terms of the number of
      predictors, time becomes a crucial element
   Retention modeling assignment

    ◦ 97,227 observations

      Each observation represents one trial/outcome

      Split 50/50 between training/test datasets

    ◦ 11 predictors

      Grand total number of levels:147
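A 50/50 training/test split of this size could be drawn as in the sketch below. The random shuffle, the seed, and the function name are assumptions for illustration; the slides do not say how the split was made.

```python
import random

def split_half(observations, seed=0):
    """Shuffle a copy of the observations and cut it at the midpoint."""
    shuffled = list(observations)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]

# Row indices stand in for the 97,227 observations.
train, test = split_half(range(97227))
print(len(train), len(test))
```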
   Modeling Process
    ◦ Started with Forward Entry Regression

       Automated process
       Used Chi-Squared statistic for testing significance
       Took about 30 minutes to run

    ◦ Significant factors (8)

         Rating Area
         Vehicle Category
         Age
         NCD
         Driver Restriction
         Vehicle Age
         Change Over Last Year’s Premium
         Market Competitiveness
   Build a model with no factors and add based
    on prespecified criteria regarding
    improvement in model fit:

   (insert table)

   Add the factor that performed the best on the
    Chi-Squared test. (Policyholder Age)

   Iterate the process with the new base model until
    no further factors meet the entry criteria
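The forward entry loop described above can be sketched as follows. This is not EMBLEM's implementation. It assumes each candidate factor costs one degree of freedom (real rating factors with many levels would need the critical value for levels minus one degrees of freedom), and it uses a toy deviance function in place of an actual GLM fit. All names and numbers are hypothetical.

```python
CHI2_CRIT_1DF_5PCT = 3.841  # 5% critical value of chi-squared with 1 degree of freedom

def forward_entry(candidates, deviance, critical_value=CHI2_CRIT_1DF_5PCT):
    """Greedy forward selection driven by the drop in deviance.

    `deviance(factors)` is a caller-supplied function returning the deviance
    of a model fitted with the given tuple of factors.
    """
    selected = []
    remaining = list(candidates)
    current = deviance(tuple(selected))   # start from the model with no factors
    while remaining:
        # Score each remaining factor by the deviance reduction it would give.
        drops = {f: current - deviance(tuple(selected + [f])) for f in remaining}
        best = max(drops, key=drops.get)
        if drops[best] <= critical_value:
            break  # no remaining factor passes the test
        selected.append(best)
        remaining.remove(best)
        current -= drops[best]
    return selected

# Toy stand-in for model fitting: each factor removes a fixed amount of deviance.
TOY_DROPS = {"age": 120.0, "ncd": 45.0, "vehicle_age": 9.5, "colour": 1.2}

def toy_deviance(factors):
    return 1000.0 - sum(TOY_DROPS[f] for f in factors)

chosen = forward_entry(TOY_DROPS, toy_deviance)
print(chosen)
```

Each pass refits one model per remaining factor, so the cost grows quickly with the number of predictors.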
   Compared results with CART/ TreeNet

    ◦ Significant factors were essentially the same

    ◦ Model predictiveness was the same (ROC=0.7)

   Interactions

    ◦ No significant interactions were found by EMBLEM or
      CART

   Test Dataset

    ◦ ROC=0.7
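Assuming the "ROC" figures quoted here refer to the area under the ROC curve, the measure can be computed as the share of (renewed, not renewed) pairs that the model ranks correctly. The sketch below uses hypothetical outcomes and scores, not the case-study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical renewal outcomes (1 = renewed) and predicted probabilities.
renewed = [1, 1, 1, 0, 1, 0, 0, 1]
predicted = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.5, 0.55]
auc = roc_auc(renewed, predicted)
print(f"AUC = {auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic, so a rank-based formula would be preferable on datasets of the size discussed here.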
   Retention modeling assignment

    ◦ 198,386 observations

      Each observation represented one trial/outcome

      Split 50/50 between training/test datasets

    ◦ 135 predictors

      Grand total number of levels: approx 3,752
   Forward Entry Regression
    ◦ Found 57 predictors to be significant

    ◦ Took a weekend to run

   Comparison to CART/ TreeNet
    ◦ Found 24 significant predictors

    ◦ Top 15 based on variable importance were also found by
      EMBLEM

    ◦ Correlations with the rest of the predictors

   Through the modeling process we reduced the
    number of predictors to 26
   Interactions

    ◦ We relied on indications from CART/ TreeNet

    ◦ 6 interactions were identified and included in the
      model

   EMBLEM Results

    ◦ Training ROC= .862

    ◦ Test ROC= .85
   Variable importance

   Segmentation

   Super-Profiling
   CART excels at identifying different segments in data

   CART may also help determine where to segment data

   Segmentation is a useful alternative to fitting many
    interactions

    ◦ Example: In an automobile insurance renewal problem, a CART
      analysis showed several occurrences of a split between those
      policyholders with just one year's duration and those with a
      greater duration

   This suggests segmenting the data into two parts:
    ◦ Policies renewing with one year duration

    ◦ Policies renewing with more than one year
   After a GLM is constructed, use CART
    to model the residuals to see if any patterns
    exist

    ◦ If a pattern is discovered, go back to the model
      structure and incorporate the findings

    ◦ Test to see if model structure was inadvertently
      over-simplified
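The idea of searching GLM residuals for leftover structure can be sketched with a single CART-style regression split. This is a simplified stand-in for a full tree, written in plain Python. The residual values and the duration predictor are hypothetical.

```python
def best_residual_split(values, residuals):
    """Find the threshold on one predictor that most reduces the residual SSE.

    This is a single CART-style regression split (a stump), not a full tree.
    Returns (threshold, sse_reduction), or (None, 0.0) if no split helps.
    """
    def sse(items):
        if not items:
            return 0.0
        mean = sum(items) / len(items)
        return sum((r - mean) ** 2 for r in items)

    total = sse(residuals)
    best_threshold, best_gain = None, 0.0
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [r for v, r in zip(values, residuals) if v <= threshold]
        right = [r for v, r in zip(values, residuals) if v > threshold]
        gain = total - sse(left) - sse(right)
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain

# Hypothetical data: policy duration in years and GLM residuals (actual - fitted).
duration = [1, 1, 1, 2, 3, 4, 5, 6]
residual = [-0.30, -0.25, -0.35, 0.10, 0.12, 0.08, 0.11, 0.09]
threshold, gain = best_residual_split(duration, residual)
print(threshold, round(gain, 4))
```

A large reduction at one threshold suggests the GLM structure is missing an effect at that point, for example a separate term or segment for first-year policies.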

More Related Content

What's hot

Best Billing Practices - MedicalBillersandCoders
Best Billing Practices - MedicalBillersandCodersBest Billing Practices - MedicalBillersandCoders
Best Billing Practices - MedicalBillersandCodersMedicalBillersandCoders
 
OpenSpan Claims Automation Service
OpenSpan Claims Automation ServiceOpenSpan Claims Automation Service
OpenSpan Claims Automation ServiceFrank Wagman
 
1555 track 1 huang_using his mac
1555 track 1 huang_using his mac1555 track 1 huang_using his mac
1555 track 1 huang_using his macRising Media, Inc.
 
Insurance Innovation Award - Investec Life
Insurance Innovation Award - Investec LifeInsurance Innovation Award - Investec Life
Insurance Innovation Award - Investec LifeThe Digital Insurer
 
Insurance Innovation Award - Investec Life
Insurance Innovation Award - Investec LifeInsurance Innovation Award - Investec Life
Insurance Innovation Award - Investec LifeThe Digital Insurer
 
Bracing for disruption: Building agility in life sciences and healthcare
Bracing for disruption: Building agility in life sciences and healthcareBracing for disruption: Building agility in life sciences and healthcare
Bracing for disruption: Building agility in life sciences and healthcareSlalom
 
Inside Counsel White Paper
Inside Counsel White PaperInside Counsel White Paper
Inside Counsel White Paperlarrylieb
 
Solutions for Behavioral Health
Solutions for Behavioral HealthSolutions for Behavioral Health
Solutions for Behavioral HealthNextGen Healthcare
 
Bid to Win workshop for Seedbed - October 2015
Bid to Win workshop for Seedbed - October 2015Bid to Win workshop for Seedbed - October 2015
Bid to Win workshop for Seedbed - October 2015Matt Spry
 
About ItegraL
About ItegraLAbout ItegraL
About ItegraLbevering
 
How to Justify a Change in Your ALLL
How to Justify a Change in Your ALLLHow to Justify a Change in Your ALLL
How to Justify a Change in Your ALLLLibby Bierman
 

What's hot (18)

More intelligent processes - choices and results
More intelligent processes - choices and resultsMore intelligent processes - choices and results
More intelligent processes - choices and results
 
Best Billing Practices - MedicalBillersandCoders
Best Billing Practices - MedicalBillersandCodersBest Billing Practices - MedicalBillersandCoders
Best Billing Practices - MedicalBillersandCoders
 
OpenSpan Claims Automation Service
OpenSpan Claims Automation ServiceOpenSpan Claims Automation Service
OpenSpan Claims Automation Service
 
General Insurance
General InsuranceGeneral Insurance
General Insurance
 
Centricity EDI Product Overview
Centricity EDI Product OverviewCentricity EDI Product Overview
Centricity EDI Product Overview
 
HealthCo: Centricity Clearinghouse Overview
HealthCo: Centricity Clearinghouse OverviewHealthCo: Centricity Clearinghouse Overview
HealthCo: Centricity Clearinghouse Overview
 
1555 track 1 huang_using his mac
1555 track 1 huang_using his mac1555 track 1 huang_using his mac
1555 track 1 huang_using his mac
 
Insurance Innovation Award - Investec Life
Insurance Innovation Award - Investec LifeInsurance Innovation Award - Investec Life
Insurance Innovation Award - Investec Life
 
Insurance Innovation Award - Investec Life
Insurance Innovation Award - Investec LifeInsurance Innovation Award - Investec Life
Insurance Innovation Award - Investec Life
 
Bracing for disruption: Building agility in life sciences and healthcare
Bracing for disruption: Building agility in life sciences and healthcareBracing for disruption: Building agility in life sciences and healthcare
Bracing for disruption: Building agility in life sciences and healthcare
 
1645 track 3 porter
1645 track 3 porter1645 track 3 porter
1645 track 3 porter
 
Inside Counsel White Paper
Inside Counsel White PaperInside Counsel White Paper
Inside Counsel White Paper
 
Solutions for Behavioral Health
Solutions for Behavioral HealthSolutions for Behavioral Health
Solutions for Behavioral Health
 
Bid to Win workshop for Seedbed - October 2015
Bid to Win workshop for Seedbed - October 2015Bid to Win workshop for Seedbed - October 2015
Bid to Win workshop for Seedbed - October 2015
 
About ItegraL
About ItegraLAbout ItegraL
About ItegraL
 
005
005005
005
 
CDO_public
CDO_publicCDO_public
CDO_public
 
How to Justify a Change in Your ALLL
How to Justify a Change in Your ALLLHow to Justify a Change in Your ALLL
How to Justify a Change in Your ALLL
 

Viewers also liked

Olrac SPS Predictive Insurance Solutions
Olrac SPS Predictive Insurance SolutionsOlrac SPS Predictive Insurance Solutions
Olrac SPS Predictive Insurance SolutionsJustin Shanks
 
Feeling Lucky? Multi-armed Bandits for Ordering Judgements in Pooling-based E...
Feeling Lucky? Multi-armed Bandits for Ordering Judgements in Pooling-based E...Feeling Lucky? Multi-armed Bandits for Ordering Judgements in Pooling-based E...
Feeling Lucky? Multi-armed Bandits for Ordering Judgements in Pooling-based E...David Losada
 
The Robos Are Coming - How AI will revolutionize Insurance 0117
The Robos Are Coming - How AI will revolutionize Insurance 0117The Robos Are Coming - How AI will revolutionize Insurance 0117
The Robos Are Coming - How AI will revolutionize Insurance 0117Graham Clark
 
How to use R in different professions: R for Car Insurance Product (Speaker: ...
How to use R in different professions: R for Car Insurance Product (Speaker: ...How to use R in different professions: R for Car Insurance Product (Speaker: ...
How to use R in different professions: R for Car Insurance Product (Speaker: ...Zurich_R_User_Group
 
Conductrics bandit basicsemetrics1016
Conductrics bandit basicsemetrics1016Conductrics bandit basicsemetrics1016
Conductrics bandit basicsemetrics1016Matt Gershoff
 
Multi-Armed Bandits:
 Intro, examples and tricks
Multi-Armed Bandits:
 Intro, examples and tricksMulti-Armed Bandits:
 Intro, examples and tricks
Multi-Armed Bandits:
 Intro, examples and tricksIlias Flaounas
 
Uplift Modeling Workshop
Uplift Modeling WorkshopUplift Modeling Workshop
Uplift Modeling Workshopodsc
 
Bajaj Allianz Business Plan
Bajaj Allianz Business PlanBajaj Allianz Business Plan
Bajaj Allianz Business Planyourskarthikeyan
 
Advanced Pricing in General Insurance
Advanced Pricing in General InsuranceAdvanced Pricing in General Insurance
Advanced Pricing in General InsuranceSyed Danish Ali
 
Insurance pricing
Insurance pricingInsurance pricing
Insurance pricingLincy PT
 

Viewers also liked (13)

Commercial Insurance for Municipalities in NH
Commercial Insurance for Municipalities in NHCommercial Insurance for Municipalities in NH
Commercial Insurance for Municipalities in NH
 
Olrac SPS Predictive Insurance Solutions
Olrac SPS Predictive Insurance SolutionsOlrac SPS Predictive Insurance Solutions
Olrac SPS Predictive Insurance Solutions
 
Feeling Lucky? Multi-armed Bandits for Ordering Judgements in Pooling-based E...
Feeling Lucky? Multi-armed Bandits for Ordering Judgements in Pooling-based E...Feeling Lucky? Multi-armed Bandits for Ordering Judgements in Pooling-based E...
Feeling Lucky? Multi-armed Bandits for Ordering Judgements in Pooling-based E...
 
The Robos Are Coming - How AI will revolutionize Insurance 0117
The Robos Are Coming - How AI will revolutionize Insurance 0117The Robos Are Coming - How AI will revolutionize Insurance 0117
The Robos Are Coming - How AI will revolutionize Insurance 0117
 
How to use R in different professions: R for Car Insurance Product (Speaker: ...
How to use R in different professions: R for Car Insurance Product (Speaker: ...How to use R in different professions: R for Car Insurance Product (Speaker: ...
How to use R in different professions: R for Car Insurance Product (Speaker: ...
 
Conductrics bandit basicsemetrics1016
Conductrics bandit basicsemetrics1016Conductrics bandit basicsemetrics1016
Conductrics bandit basicsemetrics1016
 
Multi-Armed Bandits:
 Intro, examples and tricks
Multi-Armed Bandits:
 Intro, examples and tricksMulti-Armed Bandits:
 Intro, examples and tricks
Multi-Armed Bandits:
 Intro, examples and tricks
 
Uplift Modeling Workshop
Uplift Modeling WorkshopUplift Modeling Workshop
Uplift Modeling Workshop
 
Bajaj Allianz Business Plan
Bajaj Allianz Business PlanBajaj Allianz Business Plan
Bajaj Allianz Business Plan
 
Advanced Pricing in General Insurance
Advanced Pricing in General InsuranceAdvanced Pricing in General Insurance
Advanced Pricing in General Insurance
 
Actuarial Analytics in R
Actuarial Analytics in RActuarial Analytics in R
Actuarial Analytics in R
 
Princing insurance contracts with R
Princing insurance contracts with RPrincing insurance contracts with R
Princing insurance contracts with R
 
Insurance pricing
Insurance pricingInsurance pricing
Insurance pricing
 

Similar to Combining Linear and Non Linear Modeling Techniques

Xavier Conort, DataScience SG Meetup - Challenges in insurance pricing
Xavier Conort, DataScience SG Meetup - Challenges in insurance pricingXavier Conort, DataScience SG Meetup - Challenges in insurance pricing
Xavier Conort, DataScience SG Meetup - Challenges in insurance pricingKai Xin Thia
 
churn_detection.pptx
churn_detection.pptxchurn_detection.pptx
churn_detection.pptxDhanuDhanu49
 
Supercharge your AB testing with automated causal inference - Community Works...
Supercharge your AB testing with automated causal inference - Community Works...Supercharge your AB testing with automated causal inference - Community Works...
Supercharge your AB testing with automated causal inference - Community Works...Egor Kraev
 
Quant Foundry Labs - Low Probability Defaults
Quant Foundry Labs - Low Probability DefaultsQuant Foundry Labs - Low Probability Defaults
Quant Foundry Labs - Low Probability DefaultsDavidkerrkelly
 
Loan Approval Prediction
Loan Approval PredictionLoan Approval Prediction
Loan Approval PredictionIRJET Journal
 
Webinar: Three levels of Business Models, Rita McGrath Slides
Webinar: Three levels of Business Models, Rita McGrath SlidesWebinar: Three levels of Business Models, Rita McGrath Slides
Webinar: Three levels of Business Models, Rita McGrath SlidesEngage // Innovate
 
Managing uncertainty in ai performance target setting
Managing uncertainty in ai performance target settingManaging uncertainty in ai performance target setting
Managing uncertainty in ai performance target settingNoelle Ibrahim
 
Managing uncertainty in ai performance target setting
Managing uncertainty in ai performance target settingManaging uncertainty in ai performance target setting
Managing uncertainty in ai performance target settingNoelle Ibrahim
 
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...BigML, Inc
 
Customer churn classification using machine learning techniques
Customer churn classification using machine learning techniquesCustomer churn classification using machine learning techniques
Customer churn classification using machine learning techniquesSindhujanDhayalan
 
Data Mining on Customer Churn Classification
Data Mining on Customer Churn ClassificationData Mining on Customer Churn Classification
Data Mining on Customer Churn ClassificationKaushik Rajan
 
Hair_EOMA_1e_Chap001_PPT.pptx
Hair_EOMA_1e_Chap001_PPT.pptxHair_EOMA_1e_Chap001_PPT.pptx
Hair_EOMA_1e_Chap001_PPT.pptxAsadAli104515
 
Dynamic Gradient Scoring Model (DGSM)
Dynamic Gradient Scoring Model (DGSM)Dynamic Gradient Scoring Model (DGSM)
Dynamic Gradient Scoring Model (DGSM)Aditya Yadav
 
Fraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko
Fraud Detection in Insurance with Machine Learning for WARTA - Artur SuchwalkoFraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko
Fraud Detection in Insurance with Machine Learning for WARTA - Artur SuchwalkoInstitute of Contemporary Sciences
 
PEGA Decision strategy manager (DSM)
PEGA Decision strategy manager (DSM)PEGA Decision strategy manager (DSM)
PEGA Decision strategy manager (DSM)bhaskarvittal
 
Documentation on bigmarket copy
Documentation on bigmarket   copyDocumentation on bigmarket   copy
Documentation on bigmarket copyswamypotharaveni
 
credit card fraud detection
credit card fraud detectioncredit card fraud detection
credit card fraud detectionjagan477830
 

Similar to Combining Linear and Non Linear Modeling Techniques (20)

Xavier Conort, DataScience SG Meetup - Challenges in insurance pricing
Xavier Conort, DataScience SG Meetup - Challenges in insurance pricingXavier Conort, DataScience SG Meetup - Challenges in insurance pricing
Xavier Conort, DataScience SG Meetup - Challenges in insurance pricing
 
churn_detection.pptx
churn_detection.pptxchurn_detection.pptx
churn_detection.pptx
 
Supercharge your AB testing with automated causal inference - Community Works...
Supercharge your AB testing with automated causal inference - Community Works...Supercharge your AB testing with automated causal inference - Community Works...
Supercharge your AB testing with automated causal inference - Community Works...
 
Quant Foundry Labs - Low Probability Defaults
Quant Foundry Labs - Low Probability DefaultsQuant Foundry Labs - Low Probability Defaults
Quant Foundry Labs - Low Probability Defaults
 
Loan Approval Prediction
Loan Approval PredictionLoan Approval Prediction
Loan Approval Prediction
 
Webinar: Three levels of Business Models, Rita McGrath Slides
Webinar: Three levels of Business Models, Rita McGrath SlidesWebinar: Three levels of Business Models, Rita McGrath Slides
Webinar: Three levels of Business Models, Rita McGrath Slides
 
Managing uncertainty in ai performance target setting
Managing uncertainty in ai performance target settingManaging uncertainty in ai performance target setting
Managing uncertainty in ai performance target setting
 
Managing uncertainty in ai performance target setting
Managing uncertainty in ai performance target settingManaging uncertainty in ai performance target setting
Managing uncertainty in ai performance target setting
 
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac...
 
Customer churn classification using machine learning techniques
Customer churn classification using machine learning techniquesCustomer churn classification using machine learning techniques
Customer churn classification using machine learning techniques
 
Data Mining on Customer Churn Classification
Data Mining on Customer Churn ClassificationData Mining on Customer Churn Classification
Data Mining on Customer Churn Classification
 
Hair_EOMA_1e_Chap001_PPT.pptx
Hair_EOMA_1e_Chap001_PPT.pptxHair_EOMA_1e_Chap001_PPT.pptx
Hair_EOMA_1e_Chap001_PPT.pptx
 
Models ABC
Models ABCModels ABC
Models ABC
 
Dynamic Gradient Scoring Model (DGSM)
Dynamic Gradient Scoring Model (DGSM)Dynamic Gradient Scoring Model (DGSM)
Dynamic Gradient Scoring Model (DGSM)
 
machineLearningTypingTool_Rev1
machineLearningTypingTool_Rev1machineLearningTypingTool_Rev1
machineLearningTypingTool_Rev1
 
Fraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko
Fraud Detection in Insurance with Machine Learning for WARTA - Artur SuchwalkoFraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko
Fraud Detection in Insurance with Machine Learning for WARTA - Artur Suchwalko
 
PEGA Decision strategy manager (DSM)
PEGA Decision strategy manager (DSM)PEGA Decision strategy manager (DSM)
PEGA Decision strategy manager (DSM)
 
Documentation on bigmarket copy
Documentation on bigmarket   copyDocumentation on bigmarket   copy
Documentation on bigmarket copy
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
credit card fraud detection
credit card fraud detectioncredit card fraud detection
credit card fraud detection
 

More from Salford Systems

Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4Salford Systems
 
Improve Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsImprove Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsSalford Systems
 
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Salford Systems
 
Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Salford Systems
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningSalford Systems
 
Introduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerIntroduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerSalford Systems
 
9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like YouSalford Systems
 
Statistically Significant Quotes To Remember
Statistically Significant Quotes To RememberStatistically Significant Quotes To Remember
Statistically Significant Quotes To RememberSalford Systems
 
Using CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetUsing CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetSalford Systems
 
CART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideCART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideSalford Systems
 
Evolution of regression ols to gps to mars
Evolution of regression   ols to gps to marsEvolution of regression   ols to gps to mars
Evolution of regression ols to gps to marsSalford Systems
 
Data Mining for Higher Education
Data Mining for Higher EducationData Mining for Higher Education
Data Mining for Higher EducationSalford Systems
 
Comparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingComparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingSalford Systems
 
Molecular data mining tool advances in hiv
Molecular data mining tool  advances in hivMolecular data mining tool  advances in hiv
Molecular data mining tool advances in hivSalford Systems
 
TreeNet Tree Ensembles & CART Decision Trees: A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees:  A Winning CombinationTreeNet Tree Ensembles & CART Decision Trees:  A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees: A Winning CombinationSalford Systems
 
SPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARSSPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARSSalford Systems
 
Hybrid cart logit model 1998
Hybrid cart logit model 1998Hybrid cart logit model 1998
Hybrid cart logit model 1998Salford Systems
 
Session Logs Tutorial for SPM
Session Logs Tutorial for SPMSession Logs Tutorial for SPM
Session Logs Tutorial for SPMSalford Systems
 
Some of the new features in SPM 7
Some of the new features in SPM 7Some of the new features in SPM 7
Some of the new features in SPM 7Salford Systems
 

Combining Linear and Non Linear Modeling Techniques

  • 1. EMB America + Salford Systems Getting the best of Two Worlds
  • 2. Who is EMB?
    - Insurance industry predictive modeling applications
    - EMBLEM - our GLM tool
    - How we have used CART with EMBLEM
    - Case studies
    - Other areas of expected synergies
  • 3. Global network of P&C insurance consultants servicing clients throughout the world
    - (insert globe)
  • 4. Predictive Modeling
    - Ratemaking & Profitability Analysis
    - Underwriting & Credit Scoring
    - Enterprise Risk Management, Pro Forma, Business Planning
    - Retention & Conversion Modeling
    - New Program Development
    - Competitive Analysis
    - Reinsurance Program Analysis
    - Reserve Analysis & Opinion Letters
    - Software Development & Software Support
    - Expert Witness Testimony
    - Regulatory Support & Law Analysis
  • 5. EMB’s suite of software products covers all aspects of personal and commercial lines of insurance
      ◦ EMBLEM
      ◦ Rate Assessor
      ◦ Classifier
      ◦ Igloo Professional
      ◦ ExtrEMB
      ◦ ResQ Professional
      ◦ PrisEMB
      ◦ RePro
  • 6. We use EMBLEM, a GLM tool, for our predictive modeling needs
    - Why?
  • 7. Primary application:
      ◦ Estimating the cost of the product they sell (insurance), in two steps:
        ▪ Reserving = estimating the cost of outstanding insurance claims
        ▪ Pricing = estimating the cost of future insurance coverage
    - Secondary applications
      ◦ Retention Modeling = probability that a policyholder will renew
      ◦ Conversion Modeling = probability that a prospective policyholder will purchase a policy
      ◦ Price Optimization
      ◦ Claim fraud detection
      ◦ Marketing
  • 8. Goal is to develop a unique rate for every risk
      ◦ Don’t think in terms of good/bad risks
      ◦ State Farm/Allstate vs GEICO/Progressive
      ◦ Quickly exhausts the data
        ▪ Credibility/variability/stability
    - Risks are described by the predictor variables, not the target.
      ◦ Need to have a mapping of the predictor variable levels to a target value, not the other way around
        ▪ Other way around makes it difficult to derive impact of individual predictor variables
          · Important because actual data often does not describe all possible combinations of potential customers
  • 9. Highly regulated marketplace
      ◦ Restrictions
        ▪ Predictors that can and cannot be used
          · Credit scores
        ▪ Rules on values for the predictors
          · Ages 65+ relativities cannot be >110% of ages 40-60
          · Maximum rate change between adjacent territories
        ▪ Rules on predictor order and magnitude of importance
          · CA Sequential Analysis (driving record > annual mileage > years held license)
      ◦ Regulatory Approval
        ▪ Rates need to be supported
        ▪ Black box methodologies will not be accepted
  • 10. Response variable is continuous/discrete function
    - (insert graph)
      ◦ Gamma consistent with severity modeling, or even Inverse Gaussian
    - (insert graph)
      ◦ Poisson consistent with frequency modeling
    - No single trial/outcome
      ◦ Trial is measured in terms of time
      ◦ Actual policy length varies tremendously because of changes
        ▪ Marital status
        ▪ New car
        ▪ Moved
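The point that a trial is measured in time can be made concrete with a small sketch. It assumes claim counts follow a Poisson distribution whose mean is a constant annual rate multiplied by each policy's exposure in years; under that assumption the maximum-likelihood rate is total claims divided by total exposure. The function names and the five policies below are hypothetical and chosen only for illustration; this is not how EMBLEM is implemented.

```python
import math


def poisson_rate_mle(claims, exposures):
    """Maximum-likelihood annual claim rate for Poisson counts with exposure."""
    total_exposure = sum(exposures)
    if total_exposure <= 0:
        raise ValueError("total exposure must be positive")
    return sum(claims) / total_exposure


def poisson_log_likelihood(rate, claims, exposures):
    """Log-likelihood of a constant annual rate; each mean is rate * exposure."""
    ll = 0.0
    for n, t in zip(claims, exposures):
        mu = rate * t
        ll += n * math.log(mu) - mu - math.lgamma(n + 1)
    return ll


# Hypothetical policies: claim counts and the fraction of a year each was in force.
claims = [0, 1, 0, 2, 0]
exposures = [1.0, 0.5, 0.25, 1.0, 0.75]

rate = poisson_rate_mle(claims, exposures)
print(f"estimated annual claim frequency: {rate:.4f}")
```

Weighting by exposure matters here because a policy in force for three months carries a quarter of the information of a full-year policy; simply averaging the claim counts would ignore that.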
  • 11. In 1996, EMB designed EMBLEM to provide access to GLM for statisticians and non-statisticians pricing personal and commercial insurance
    - EMBLEM revolutionized the use of GLMs, enabling analysis that was previously either impossible or too time-consuming to be worth attempting
    - EMBLEM is now used by over 100 insurance companies globally:
      ◦ 18 of the top 20 personal auto writers in the UK
      ◦ 50 companies in the US including 8 of the top 10 personal auto writers
    - Fastest GLM tool with the capability to model millions of observations in seconds with a host of diagnostic tools:
      ◦ Graphical, practical, statistical, automated.
      ◦ Stand-alone software package that can be integrated with a variety of external software including SAS®
      ◦ Microsoft® Visual Basic® for Applications provides ultimate flexibility
  • 12. GLM characteristics work to our advantage
      ◦ Exponential family does an excellent job of describing the underlying components of insurance losses
      ◦ Output of the model is in the form of Beta parameters which can easily be converted to rate relativities
      ◦ EMBLEM is not automated
        ▪ User has complete control over the model structure
        ▪ Complete diagnostic tools to assist the modeler with decisions
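As an illustration of converting Beta parameters to rate relativities, the sketch below assumes a log-link GLM, in which case a level's relativity against the base level is the exponential of the difference in coefficients. The slides do not state the link function, so this is an assumption, and the factor levels and coefficient values are hypothetical.

```python
import math


def relativities(betas, base_level):
    """Convert log-link GLM coefficients into multiplicative rate relativities.

    betas maps each factor level to its fitted coefficient; the base level
    is rescaled to a relativity of exactly 1.0.
    """
    base = betas[base_level]
    return {level: math.exp(beta - base) for level, beta in betas.items()}


# Hypothetical coefficients for a policyholder-age factor.
age_betas = {"17-24": 0.47, "25-39": 0.0, "40-59": -0.11, "60+": 0.05}
age_rels = relativities(age_betas, base_level="25-39")

for level, rel in age_rels.items():
    print(f"{level:>6}: {rel:.3f}")

# Under a multiplicative rating plan, a premium is the base rate times the
# relativity of each factor level that describes the risk.
base_rate = 500.0  # hypothetical base premium
premium_young_driver = base_rate * age_rels["17-24"]
```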
  • 13. In terms of estimating the cost of insurance:
      ◦ UK has embraced predictive modeling
        ▪ Experienced with its techniques
        ▪ Knowledgeable about the factors that tend to be predictive
      ◦ US is learning about predictive modeling
        ▪ Saturation with big players in personal lines marketplace
          · Companies not using predictive modeling techniques are being adversely selected against
          · Now expanding dimensionality of databases
        ▪ Still fairly new concept in commercial lines marketplace
          · Big players are using techniques but historical rating structures are hindering the rapid expansion
  • 14. Result?
      ◦ UK is expanding into secondary applications
        ▪ Retention modeling
        ▪ Conversion modeling
        ▪ Price optimization
        ▪ Claim fraud detection
      ◦ Because Predictive Modeling has been around for some time in the UK, the datasets are getting larger in terms of the number of predictors to evaluate
      ◦ Experienced US companies are beginning to evaluate the secondary applications
      ◦ Marketing is used in a manner similar to other industries
  • 15. How does CART fit into this?
      ◦ As we transition into the secondary applications we move from modeling a continuous function to a binary function
    - Tree-based techniques can add value to the analysis
    - Retention and Conversion modeling
      ◦ Accept/Reject target variable
      ◦ Desirable smooth surface
      ◦ Price optimization integrates these with premium models
    - Marketing and Fraud detection
      ◦ Classic tree applications
  • 16. Using CART and EMBLEM
      ◦ Goal is to play off of the strengths of each tool
    - CART strengths
      ◦ Automatic separation of relevant from irrelevant predictors
      ◦ Easily rank-orders variable importance
      ◦ Automatic interaction detection (requires additional work)
      ◦ Captures multiple structures within a dataset rather than a single dominant structure
      ◦ Can handle missing values and is impervious to outliers
  • 17. EMBLEM Strengths
      ◦ User has control over the model structure
      ◦ Ease of communication/conceptualization: the effect of each explanatory variable is transparent
      ◦ Provides predicted response values for new data points
  • 18. CART
      ◦ Factor selection
      ◦ Interaction detection
      ◦ Model validation
    - EMBLEM
      ◦ Model structure
      ◦ Incorporating time/seasonality trend effects
      ◦ Implementation of results
  • 19. Both CART and EMBLEM are excellent tools, and both produce consistent results in similar situations
      ◦ This is not an exercise of seeing which is better
    - The purpose of this discussion is to show how efficiencies can be gained in the modeling process
      ◦ As datasets get larger in terms of the number of predictors, time becomes a crucial element
  • 20. Retention modeling assignment
      ◦ 97,227 observations
        ▪ Each observation represents one trial/outcome
        ▪ Split 50/50 between training/test datasets
      ◦ 11 predictors
        ▪ Grand total number of levels: 147
  • 21. Modeling Process
      ◦ Started with Forward Entry Regression
        ▪ Automated process
        ▪ Used Chi-Squared statistic for testing significance
        ▪ Took about 30 minutes to run
      ◦ Significant factors (8)
        ▪ Rating Area
        ▪ Vehicle Category
        ▪ Age
        ▪ NCD
        ▪ Driver Restriction
        ▪ Vehicle Age
        ▪ Change Over Last Year’s Premium
        ▪ Market Competitiveness
  • 22. Build a model with no factors and add based on prespecified criteria regarding improvement in model fit:
    - (insert table)
    - Add the factor that performed the best on the Chi Square test. (Policyholder Age)
    - Iterate process with the new base model until no further factors indicated removal
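The forward-entry loop described on this slide can be sketched as follows. This is a minimal illustration, not EMBLEM's procedure: it assumes the significance test is a likelihood-ratio chi-square test on the drop in deviance, with degrees of freedom equal to the number of parameters a factor adds. The `toy_deviance` function, the factor names, and all numbers are hypothetical stand-ins for refitting a real GLM at each step.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return math.exp(-h) * total
    total = math.erfc(math.sqrt(h))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * h ** (j - 0.5) / math.gamma(j + 0.5)
    return total


def forward_select(candidates, deviance_of, df_of, alpha=0.05):
    """Start with no factors; repeatedly add the most significant one."""
    selected = []
    remaining = set(candidates)
    current = deviance_of(frozenset())
    while remaining:
        best = None
        for factor in sorted(remaining):
            drop = current - deviance_of(frozenset(selected + [factor]))
            p_value = chi2_sf(drop, df_of[factor])
            if best is None or p_value < best[0]:
                best = (p_value, factor, drop)
        p_value, factor, drop = best
        if p_value >= alpha:
            break  # no remaining factor improves the fit significantly
        selected.append(factor)
        remaining.remove(factor)
        current -= drop
    return selected


# Hypothetical deviance reductions and parameter counts per factor.
gains = {"age": 120.0, "ncd": 60.0, "colour": 2.0}
dfs = {"age": 5, "ncd": 3, "colour": 4}


def toy_deviance(factors):
    """Stand-in for fitting a GLM and returning its deviance."""
    return 1000.0 - sum(gains[f] for f in factors)


chosen = forward_select(gains, toy_deviance, dfs)
print(chosen)
```

Each pass refits one model per remaining factor, so the cost grows quickly with the number of candidate predictors.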
  • 23. Compared results with CART/TreeNet
      ◦ Significant factors were essentially the same
      ◦ Model predictiveness was the same (ROC=0.7)
    - Interactions
      ◦ No significant interactions were found by EMBLEM or CART
    - Test Dataset
      ◦ ROC=0.7
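Assuming the ROC figures quoted here refer to the area under the ROC curve, that statistic can be computed directly as the probability that a randomly chosen renewing policy receives a higher predicted score than a randomly chosen lapsing one, with ties counted as one half. The sketch below uses this pairwise definition; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels are 1 (event, e.g. renewed) or 0 (non-event); scores are the
    model's predicted probabilities. Runs in O(n_pos * n_neg) time, which
    is fine for illustration but slow for large datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test-set outcomes and predicted renewal probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the 0.7 and 0.85 figures in the two case studies.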
  • 24. Retention modeling assignment
      ◦ 198,386 observations
        ▪ Each observation represented one trial/outcome
        ▪ Split 50/50 between training/test datasets
      ◦ 135 predictors
        ▪ Grand total number of levels: approx 3,752
  • 25. Forward Entry Regression
      ◦ Found 57 predictors to be significant
      ◦ Took a weekend to run
    - Comparison to CART/TreeNet
      ◦ Found 24 significant predictors
      ◦ Top 15 based on variable importance were also found by EMBLEM
      ◦ Correlations with the rest of the predictors
    - Through the modeling process we reduced the number of predictors to 26
  • 26. Interactions
      ◦ We relied on indications from CART/TreeNet
      ◦ 6 interactions were identified and included in the model
    - EMBLEM Results
      ◦ Training ROC = 0.862
      ◦ Test ROC = 0.85
  • 27. Variable importance
    - Segmentation
    - Super-Profiling
  • 28. CART excels at identifying different segments in data
    - CART may also help determine where to segment data
    - Segmentation is a useful alternative to fitting many interactions
      ◦ Example: In an automobile insurance renewal problem, a CART analysis showed several occurrences of a split between those policyholders with just one year’s duration and those with a greater duration
        ▪ This suggests segmenting the data into two parts:
      ◦ Policies renewing with one year duration
      ◦ Policies renewing with more than one year
  • 29. After a GLM model is constructed, use CART to model the residuals to see if any patterns exist
      ◦ If a pattern is discovered, go back to the model structure and incorporate the findings
      ◦ Test to see if model structure was inadvertently over-simplified
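The residual check can be illustrated with a single regression-tree split, which is the basic building block of CART rather than the full algorithm. The sketch searches one numeric predictor for the threshold that most reduces the squared error of the GLM residuals; a large gap between the two segment means hints at structure the GLM missed. The function name, the policy durations, and the residual values are all hypothetical.

```python
def best_split(x, residuals):
    """Find the threshold on x that minimizes squared error of the residuals.

    Returns (threshold, sse_after_split, left_mean, right_mean), or None if
    x has no two distinct values to split between.
    """
    pairs = sorted(zip(x, residuals))
    n = len(pairs)
    total = sum(r for _, r in pairs)
    total_sq = sum(r * r for _, r in pairs)
    best = None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        right_sum = total - left_sum
        # SSE of two segments, each predicted by its own mean.
        sse = total_sq - left_sum ** 2 / i - right_sum ** 2 / (n - i)
        if best is None or sse < best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, sse, left_sum / i, right_sum / (n - i))
    return best


# Hypothetical data: policy duration in years and GLM residuals
# (observed renewal indicator minus predicted renewal probability).
duration = [1, 1, 1, 2, 3, 4, 5, 6]
residual = [-0.20, -0.25, -0.15, 0.05, 0.10, 0.00, 0.05, 0.10]

threshold, sse, left_mean, right_mean = best_split(duration, residual)
print(f"split at duration < {threshold}: "
      f"mean residual {left_mean:.3f} vs {right_mean:.3f}")
```

In this made-up example the split falls between one-year and longer-duration policies, which is the kind of finding that would send the modeler back to add a duration effect or to segment the data.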