Agency Performance Prediction
Miraj Vashi
11-Dec-2016
Contents
• Business Case
• Insights Required & Business Benefits
• A Bit About Domain…
• Data Preprocessing
• Modelling
– Approach
– Evaluation Metric
– Outcome
– Best Model Comparison
– Model Interpretation & Key Challenges
Business Case
Azure Insurance Group operates property and casualty (P&C) insurance, life
insurance and insurance brokerage companies. Azure sells policies through
direct and indirect sales channels. For indirect selling, Azure has tie-ups with
1600+ agencies across 6 states. Azure is interested in classifying existing
agencies into predefined performance categories, within a supervised predictive
framework, based on each agency's past performance. Specifically, Azure expects
to better understand which agencies are likely to bring more growth in the
Personal Line (PL) of business.
Insights Required & Business Benefits
What Insights Are Required?
Classify each agency into one of the following categories
– GROW: Business from the agency is likely to grow > 5% in 2014
– STABLE: Business from the agency is likely to stay flat with growth in the range [-5%,5%] in 2014
– LOSS: Business from the agency is likely to shrink by more than 5% (growth < -5%) in 2014
Note: Business growth is measured as the % growth in Average Monthly Written Premium Amount achieved by the agency for a given year
Potential Business Benefits
• Improved understanding of Agency Performance - at a micro level & macro level
– How is an individual agency likely to perform?
– How are all agencies in a state likely to perform?
• Optimized utilization of Agency Development Funds
A Bit About Domain…
What is Insurance?
• A risk management tool for the customer (individual or business), allowing them to transfer the risk of financial loss to the insurance company
• In exchange for a constant stream of premiums, insurance companies offer to pay consumers a sum of money upon the occurrence of a predetermined event, such as a natural catastrophe, a car crash or a death
• Broadly, from a business perspective, insurance is classified as Life or Non-Life (General)
Insurance Type (diagram labels): Insurance; Life Insurance; General Insurance; Property & Casualty Insurance; Medical Insurance; Motor Vehicle Insurance; Marine Insurance; Fire Insurance; Homeowner's Insurance
Data Preprocessing
What Data Was Provided By Azure?
• 213K+ observations with 49 dimensions
• Each observation represents yearly aggregated data for one combination of agency, year, state and product
• Key attribute summary:
– 1624 agencies
– 11 years of data (2005-2015)
– 6 states
– 29 products
– 2 product lines
• No target class in the data!
Attribute Analysis
Each input attribute was assessed from 3 different angles:
• Business meaning: What does it mean?
• Domain Expertise Based Predictive Importance: Can it help in predicting agency performance?
• Sparsity: Does it have enough values?
Data Preprocessing (Cont…)
Key Preprocessing Challenges (challenge category, followed by its resolution):
1. Missing Values
– Highly sparse attributes were identified and dropped
– Missing values encoded as "99999" or "Unknown" were converted to NA during file read in R
2. Unwanted Data
– Agencies appointed as late as 2014, for which a 2014 growth rate cannot be calculated, were removed
– The scope of analysis is Personal Line (PL) data, so Commercial Line (CL) data was filtered out
3. Unavailable Data
– New attributes were created for all quantity and revenue attributes, averaging them over the number of months for which data is available
4. Incomplete Data
– 2005 and 2015 data were removed, as data were available for only 8 and 5 months respectively
5. Repeating Data
– Agency-specific attributes were detached from the raw data, processed separately and later merged with the main data
6. Format Of Data For Modelling
– All quantity and revenue attributes were aggregated by AGENCY_ID and YEAR
– Each important attribute was expanded with AGENCY_ID in rows and a year identifier in columns, e.g. the WrittenPremAmount column was converted to 2006_WrittenPremAmount, 2007_WrittenPremAmount, ...
7. No Target Class Present
– A lag variable was created for Written Premium Amount
– The growth rate for each agency was calculated for all years (2006-2014)
– Each agency was assigned a class label based on its 2014 growth rate:
• GROW class := 2014 growth rate > 5%
• STABLE class := 2014 growth rate in the range [-5%, 5%]
• LOSS class := 2014 growth rate < -5%
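The target-class construction described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the R code the deck refers to; the function names, the dictionary layout and the sample premium figures are all hypothetical, and only the thresholds (+5% and -5%) and the use of the previous year's value as the lag come from the slides.

```python
from typing import Dict, Optional


def average_monthly(total: float, months_available: int) -> float:
    """Average a yearly total over the months for which data exists."""
    return total / months_available


def growth_rate(current: Optional[float], previous: Optional[float]) -> Optional[float]:
    """Percentage growth of `current` over `previous` (the lag value).

    Returns None when either value is missing or the lag is zero,
    because no growth rate can be calculated in those cases.
    """
    if current is None or previous is None or previous == 0:
        return None
    return (current - previous) / previous * 100.0


def performance_class(growth_pct: float) -> str:
    """Map a growth percentage to GROW / STABLE / LOSS using the +/-5% cut-offs."""
    if growth_pct > 5.0:
        return "GROW"
    if growth_pct < -5.0:
        return "LOSS"
    return "STABLE"


def label_agencies(premium_by_year: Dict[str, Dict[int, float]],
                   target_year: int = 2014) -> Dict[str, str]:
    """Label each agency from its growth in `target_year` versus the prior year.

    Agencies without a computable growth rate are left out, mirroring the
    removal of agencies appointed too late to have a 2014 growth rate.
    """
    labels = {}
    for agency_id, by_year in premium_by_year.items():
        rate = growth_rate(by_year.get(target_year), by_year.get(target_year - 1))
        if rate is not None:
            labels[agency_id] = performance_class(rate)
    return labels


# Hypothetical average monthly written premium per agency and year.
premiums = {
    "A1": {2013: 1000.0, 2014: 1100.0},  # +10%
    "A2": {2013: 1000.0, 2014: 1020.0},  # +2%
    "A3": {2013: 1000.0, 2014: 900.0},   # -10%
    "A4": {2014: 500.0},                 # no 2013 value, so no label
}
labels = label_agencies(premiums)
```

With the sample values, A1, A2 and A3 fall into GROW, STABLE and LOSS respectively, and A4 is dropped because it has no prior-year premium.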
Modelling - Approach
• Important features were identified using the Boruta package (11 attributes dropped)
• As this is a classification problem, the following algorithms were used:
– CART
– C5.0
– Random Forest
– K Nearest Neighbours
– Artificial Neural Network
– Support Vector Machine
– GBM
– Ensemble-Stacking
• Many algorithms were tried on three flavours of data:
– ASIS Data
– ASIS Data + Range transformation
– ASIS Data + Range transformation + Important Features
• 10-fold cross-validation (repeated 3 to 10 times) was performed to get an initial best estimate of the hyper-parameters ("caret" package)
• One or more rounds of grid search were used to fine-tune the hyper-parameter values ("caret" package)
• Cost-sensitive learning was used in CART and SVM
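Two of the steps above, the range transformation and repeated 10-fold cross-validation, can be illustrated with a small stdlib-only Python sketch. It assumes that "range transformation" means min-max scaling to [0, 1], which the slides do not spell out, and it is not the caret implementation the deck used; all function names here are hypothetical.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def fit_range(column: Sequence[float]) -> Tuple[float, float]:
    """Learn the minimum and maximum of a column (on training data only)."""
    return min(column), max(column)


def apply_range(column: Sequence[float], lo: float, hi: float) -> List[float]:
    """Scale values with a previously fitted minimum and maximum.

    Training values land in [0, 1]; a constant column maps to 0.0.
    """
    span = hi - lo
    if span == 0:
        return [0.0 for _ in column]
    return [(x - lo) / span for x in column]


def repeated_kfold(n_rows: int, k: int = 10, repeats: int = 3,
                   seed: int = 42) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for `repeats` shuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_rows))
        rng.shuffle(indices)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [indices[i::k] for i in range(k)]
        for held_out in range(k):
            test = folds[held_out]
            train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
            yield train, test


lo, hi = fit_range([2.0, 4.0, 6.0])
scaled = apply_range([2.0, 4.0, 6.0], lo, hi)
splits = list(repeated_kfold(n_rows=20, k=10, repeats=3))
```

Fitting the minimum and maximum on the training fold and reusing them on the held-out fold keeps information from the held-out rows out of the scaling step.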
Modelling – Evaluation Metric
• Interesting Insight:
– Only ~40% of the agencies achieved >0% growth in 2014
– Of that 40%, only ~50% grew > 5%. The same is reflected in the 2014 growth class distribution:
GROW | STABLE | LOSS
21% | 37% | 42%
• Azure is interested in identifying agencies in the GROW class as accurately as possible
• Model Evaluation Metric:
– Higher recall for the GROW class AND
– Optimal F1 to balance the recall-precision tradeoff
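For reference, recall, precision and F1 for a single class such as GROW can be computed as in the sketch below. It uses the standard definitions; the function name and the five sample labels are hypothetical and are not taken from the Azure data.

```python
from typing import Dict, Optional, Sequence


def class_metrics(actual: Sequence[str], predicted: Sequence[str],
                  positive: str = "GROW") -> Dict[str, Optional[float]]:
    """Recall, precision and F1 for one class, treating all others as negative."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)

    # A metric is undefined (None) when its denominator is zero.
    recall = tp / (tp + fn) if (tp + fn) else None
    precision = tp / (tp + fp) if (tp + fp) else None
    if recall is None or precision is None or (recall + precision) == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision, "f1": f1}


# Hypothetical labels: one GROW agency found, one missed, two false alarms.
actual = ["GROW", "GROW", "STABLE", "LOSS", "LOSS"]
predicted = ["GROW", "LOSS", "GROW", "LOSS", "GROW"]
metrics = class_metrics(actual, predicted)
```

Favouring recall accepts more false alarms in exchange for missing fewer GROW agencies, and F1 keeps that trade from becoming one-sided.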
Modelling - Outcome
Modelling – Best Model Comparison
Best Model Vs. Baseline Model:
• In the absence of a model, or as a baseline model, the best estimate of the 2014 Performance Class is the mode of the 2014 Performance Class attribute.
• The baseline model would predict the "LOSS" class for all agencies because, with 42% of observations, "LOSS" is the most frequent class
Model Metric | Baseline Predictive Model | Best Predictive Model
GROW Class – Recall | 0 | 0.80
GROW Class – Precision | 0 | 0.35
GROW Class – F1 | NA | 0.49
Overall Accuracy | 41.78% | 49.20%
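The baseline row and the reported F1 can be sanity-checked with a short Python sketch. The 100-label sample below is hypothetical and only mirrors the rounded 21% / 37% / 42% class distribution, so its baseline accuracy comes out at 42% rather than the reported 41.78%; the precision and recall plugged into the F1 formula are the values from the table.

```python
from collections import Counter
from typing import List, Optional


def majority_baseline(labels: List[str]) -> str:
    """Return the most frequent class (the mode), used as a constant prediction."""
    return Counter(labels).most_common(1)[0][0]


def f1_from(precision: float, recall: float) -> Optional[float]:
    """Harmonic mean of precision and recall; None when both are zero."""
    if precision + recall == 0:
        return None
    return 2 * precision * recall / (precision + recall)


# Hypothetical sample matching the rounded class distribution on the slides.
labels = ["GROW"] * 21 + ["STABLE"] * 37 + ["LOSS"] * 42
mode = majority_baseline(labels)
predictions = [mode] * len(labels)

accuracy = sum(a == p for a, p in zip(labels, predictions)) / len(labels)
grow_hits = sum(1 for a, p in zip(labels, predictions) if a == "GROW" and p == "GROW")
grow_recall = grow_hits / labels.count("GROW")

# F1 implied by the reported precision (0.35) and recall (0.80).
reported_f1 = f1_from(0.35, 0.80)
```

Because the baseline never predicts GROW, its GROW recall is 0 and its GROW precision and F1 are undefined, while the reported precision and recall of the best model imply an F1 of about 0.49, matching the table.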
Modelling – Model Interpretation & Key Challenges
Model Interpretation
• If an agency actually grows > 5% in 2014:
– The Best Predictive Model correctly labels it as "GROW" in 4 out of 5 cases (recall 0.80)
• If the Best Predictive Model has labelled an agency as "GROW":
– In about 1 in 3 cases the agency will actually grow > 5% in 2014 (precision 0.35)
– In about 2 in 3 cases the agency will turn out to be STABLE or LOSS in 2014
Key Challenges:
• GROW is a minority class in the data. The class distribution is imbalanced and skewed toward the "LOSS" class
• For the majority of algorithms, learning is skewed toward predicting the LOSS class correctly, which is not what Azure is interested in
• The data has a lot of variance, so it is difficult to obtain test data that is truly representative of the training data
• There is not enough data to overcome the class imbalance and the variance in the data