SlideShare a Scribd company logo
1 of 31
LA CITY WORKER
COMPENSATION ANALYSIS
ZOE HUANG, JIAYING GU, DANNIEL WINARTO, BAILI LU ,
RINI MUKHERJEE
DANNIEL
WINARTO
JIAYING
GU
ZOE
HUANG
BAILI
LU
RINI
MUKHERJEE
LA CITY WORKER
COMPENSATION ANALYSIS
Agenda
Project Overview & Objective
Overview:
1. Dataset contains 3 years record of claimant of LA workers.
2. Entire Dataset contains 39 different Excel files.
3. Response Variable is the claim amount.
Objective:
1. Figure out high risk factors and their patterns.
2. Build a predictive model.
Exploratory Analysis
Histogram of Log Amount
Exploratory Analysis
Average Amount for each Fiscal Year
Exploratory Analysis
Total Amount for each Fiscal Year
Exploratory Analysis
Average Payment Amount Group by Nature of Injury
Average Payment Amount Group by Claimant Cause
Exploratory Analysis
Exploratory Analysis
Average Payment Amount Group by Claimant Type
Exploratory Analysis
Average Payment Amount Group by Age
Joining Datasets
JoiningDatasets
Datasets Joining —— Aggregate datasets to avoid one - many relationship
Payment data is important to our analysis.
We combined the past 3 years payment data
together to form a large data set.
For each claim, there might be several
payment. But all other datasets we chose can
all be uniquely identified by claim_id. So we
aggregate payment information by claim_id
so it can easily join with all other datasets.
This also reduce the number of the
observations from 300K to 30K.
Incident Date Incident Month
Incident Hour
Birth Date Age
Incident Date – Hire Date Hire Year
Creating New Variables
Data Cleaning
More than 100 variables:
- Delete missing percent >
90%
More than 2400 levels:
- Reduce level
Data Cleaning
Categorical Variables Level Reduction methodology:
1. Zip Code
Splitted based on the quantile of claim amount, into 5
groups
2. Incident Hour
Splitted every 4 hours into 5 groups
“Midnight - 5AM”, “6AM - 11AM”, “Noon - 5PM”, 7PM -
Midnight”
3. Occupation group
Splitted based on the quantile claim amount, into 5 groups
4. Organization Code group
Splitted based on the quantile claim amount, into 5 groups
Data Cleaning
Building Model
We split the dataset:
80 % training
20% testing
We integrate the levels of categorical variables
(litigated, nature of injury, fiscal year, zip code and
claimant type)
The predictive model is significant
The Adjusted R-Squared is 0.9054
Building Model
- All the levels of litigated, nature of
injury, fiscal year and zip code group
are significant
- Only part of the levels of claimant
type are significant
Building Model
Since the trend of many variables is
seriously right-skewed, we do the log
transformation for the dependent variable
and fit the linear model
Building Model
We integrate the levels of categorical variables
(litigated, nature of injury, fiscal year, zip code and
claimant type)
The predictive model is significant
The Adjusted R-Squared is 0.9054
Building Model
Log.Amount =
10.089 +
C1 x Litigated[0/1] +
C2 x Nature of Injury Group +
C3 x Fiscal Year Group +
C4 x Zip Group+
C5 x Claimant Type
Where:
C1 = [-.120, 0]
C2 = [.035,.012, 0]
C3 = [-.034, .165, .212, .202, .178, 0]
C4 = [-5.218, -3.580, -2.314, -.923, 0]
C5 = [-.189, .120, -.120]
Model Prediction > THE MODEL WELL PREDICT THE TESTING DATA
> Most of our predicted results are off within 50% of the actual data,
meaning that our model can capture the true values.
Business Insight
List of Importance (stepwise
selection) :
Zip Group
Fiscal Year Group
Litigated
Claimant Type
Nature of Injury Group
Business Insight - Zip Code
Zip Group is the most significant variable, we found out by applying stepwise regression
methodology.
This finding is intuitively make sense, since some areas may have condition(eg. Road quality,
rainfall intensity, average income) that increase chance of the worker getting injured (Based on
the assumption that generally people tend to live close where they work).
Example of most freq in each zip group:
Z1 = 91350 -> Santa Clarita
Z2 = 91342 -> Sylmar
Z3 = 93065 -> Simi Valley
Z4 = 93551 -> Palmdale
Z5 = 91709 -> Chino hills
We should pay attention more on this area to reduce the claimant frequency in the future
Business Insight - Fiscal Year
Claims that happened in fiscal year
2012/2013 have the highest amount.
All those that happened before 2011 are
lower than after 2011.
After 2013, the average amount per
claim decreased. 2013-2014 fiscal year
Is higher than 2014-2015 fiscal year.
Business Insight - Litigated
Claims that are litigated have higher amount than
those that are not litigated.
Meaning that in order to cut cost, reduce the
number of litigated claims would help.
Business Insight - Claimant Type
Error claims have the lowest amount, and
future medical claims have the highest.
Incident only claims are lower than medical
only.
In order to cut cost, special attention
should be applied to future medical
claims.
Business Insight - Nature of Injury
Multiple injuries claims have the
highest average amount,
occ disease or cum injury comes next
and the lowest is specific injury.
Cost could be reduced by paying
attention to claims of multiple injuries.
1. Since our model suggests that Zip Group, Fiscal Year Group ,Litigated ,Claimant Type,
Nature of Injury Group are risky factors for claim amount. LA City could cut cost by
focusing on these.
2. There are other variables that we considered to put into our model, but because of
existence of missing values or wrong values, we excluded them from our model. If
these variables are maintained better in the future, they can be contributed to the
model. For example, employment type.
3. If LA City would like to obtain more accurate predictive results, we will recommend to
gather more numerical data in the future.
Suggestions & Takeaways
LA CITY WORKER
COMPENSATION ANALYSIS
ZOE HUANG, JIAYING GU, DANNIEL WINARTO, BAILI LU ,
RINI MUKHERJEE
DANNIEL
WINARTO
JIAYING
GU
ZOE
HUANG
BAILI
LU
RINI
MUKHERJEE
THANK YOU
&
QUESTIONS ARE WELCOME

More Related Content

Viewers also liked

The arab deaf week on the road to inclusion by dr. ghassan shahrour
The arab deaf week on the road to inclusion by dr. ghassan shahrourThe arab deaf week on the road to inclusion by dr. ghassan shahrour
The arab deaf week on the road to inclusion by dr. ghassan shahrourGhassan Shahrour
 
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis iwan_rg
 
Γλώσσα - κείμενα
Γλώσσα -  κείμεναΓλώσσα -  κείμενα
Γλώσσα - κείμεναdrallis
 
Innovation in the Classroom: Keeping Digital Innovation Alive in Schools
Innovation in the Classroom: Keeping Digital Innovation Alive in SchoolsInnovation in the Classroom: Keeping Digital Innovation Alive in Schools
Innovation in the Classroom: Keeping Digital Innovation Alive in SchoolsAreej Al-Wabil
 
المشروع القومى لتمكين الصم بإستخدام التكنولوجيا
المشروع القومى لتمكين الصم بإستخدام التكنولوجياالمشروع القومى لتمكين الصم بإستخدام التكنولوجيا
المشروع القومى لتمكين الصم بإستخدام التكنولوجياAdel Khalifa, PhD
 
Towards Robust and Safe Autonomous Drones
Towards Robust and Safe Autonomous DronesTowards Robust and Safe Autonomous Drones
Towards Robust and Safe Autonomous DronesSERENEWorkshop
 
Sound engineer cv pfd 3
Sound engineer cv pfd 3Sound engineer cv pfd 3
Sound engineer cv pfd 3Paul Chivers
 
TRATADOS COMERCIALES DE MÉXICO CON EL MUNDO
TRATADOS COMERCIALES DE MÉXICO CON EL MUNDO TRATADOS COMERCIALES DE MÉXICO CON EL MUNDO
TRATADOS COMERCIALES DE MÉXICO CON EL MUNDO PAOLA_M_NARANJO
 

Viewers also liked (10)

The arab deaf week on the road to inclusion by dr. ghassan shahrour
The arab deaf week on the road to inclusion by dr. ghassan shahrourThe arab deaf week on the road to inclusion by dr. ghassan shahrour
The arab deaf week on the road to inclusion by dr. ghassan shahrour
 
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
P05- DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis
 
Sociedad de la información
Sociedad de la informaciónSociedad de la información
Sociedad de la información
 
Γλώσσα - κείμενα
Γλώσσα -  κείμεναΓλώσσα -  κείμενα
Γλώσσα - κείμενα
 
Innovation in the Classroom: Keeping Digital Innovation Alive in Schools
Innovation in the Classroom: Keeping Digital Innovation Alive in SchoolsInnovation in the Classroom: Keeping Digital Innovation Alive in Schools
Innovation in the Classroom: Keeping Digital Innovation Alive in Schools
 
المشروع القومى لتمكين الصم بإستخدام التكنولوجيا
المشروع القومى لتمكين الصم بإستخدام التكنولوجياالمشروع القومى لتمكين الصم بإستخدام التكنولوجيا
المشروع القومى لتمكين الصم بإستخدام التكنولوجيا
 
paragangliomas
paragangliomas paragangliomas
paragangliomas
 
Towards Robust and Safe Autonomous Drones
Towards Robust and Safe Autonomous DronesTowards Robust and Safe Autonomous Drones
Towards Robust and Safe Autonomous Drones
 
Sound engineer cv pfd 3
Sound engineer cv pfd 3Sound engineer cv pfd 3
Sound engineer cv pfd 3
 
TRATADOS COMERCIALES DE MÉXICO CON EL MUNDO
TRATADOS COMERCIALES DE MÉXICO CON EL MUNDO TRATADOS COMERCIALES DE MÉXICO CON EL MUNDO
TRATADOS COMERCIALES DE MÉXICO CON EL MUNDO
 

Similar to La city presentation

Workers Compensation Program Overview
Workers Compensation Program OverviewWorkers Compensation Program Overview
Workers Compensation Program OverviewLuke Slabaugh
 
Peer Risk Model for Cyber Security Risk
Peer Risk Model for Cyber Security RiskPeer Risk Model for Cyber Security Risk
Peer Risk Model for Cyber Security RiskThomas Lee
 
Loan Analysis Predicting Defaulters
Loan Analysis Predicting DefaultersLoan Analysis Predicting Defaulters
Loan Analysis Predicting DefaultersIRJET Journal
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network ModelEric Esajian
 
CollectionOptimization
CollectionOptimizationCollectionOptimization
CollectionOptimizationMike Nguyen
 
Churn in the Telecommunications Industry
Churn in the Telecommunications IndustryChurn in the Telecommunications Industry
Churn in the Telecommunications Industryskewdlogix
 
Reduction in customer complaints - Mortgage Industry
Reduction in customer complaints - Mortgage IndustryReduction in customer complaints - Mortgage Industry
Reduction in customer complaints - Mortgage IndustryPranov Mishra
 
Predictive analytics-white-paper
Predictive analytics-white-paperPredictive analytics-white-paper
Predictive analytics-white-paperShubhashish Biswas
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptxAniket Patil
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptxpatilaniket2418
 
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...55296
 
Gmid associates services portfolio bank
Gmid associates  services portfolio bankGmid associates  services portfolio bank
Gmid associates services portfolio bankPankaj Jha
 
201206 Tech Decisions: Finding Profits
201206 Tech Decisions: Finding Profits201206 Tech Decisions: Finding Profits
201206 Tech Decisions: Finding ProfitsSteven Callahan
 
9 Key Concepts
9 Key Concepts9 Key Concepts
9 Key Conceptsmikewells
 
Bank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionBank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionIRJET Journal
 
Pinpoint Predictive- InsurTech Innovation Award 2022
Pinpoint Predictive- InsurTech Innovation Award 2022Pinpoint Predictive- InsurTech Innovation Award 2022
Pinpoint Predictive- InsurTech Innovation Award 2022The Digital Insurer
 
PredictiveMetrics' Predictive Scoring for Collections Capabilities
PredictiveMetrics' Predictive Scoring for Collections CapabilitiesPredictiveMetrics' Predictive Scoring for Collections Capabilities
PredictiveMetrics' Predictive Scoring for Collections CapabilitiesPredictiveMetrics, Inc.
 
Analytics to the Rescue: Better Loss Prevention through Modeling
Analytics to the Rescue: Better Loss Prevention through ModelingAnalytics to the Rescue: Better Loss Prevention through Modeling
Analytics to the Rescue: Better Loss Prevention through ModelingCognizant
 
Cyber loss model for all industries
Cyber loss model for all industriesCyber loss model for all industries
Cyber loss model for all industriesThomas Lee
 

Similar to La city presentation (20)

Workers Compensation Program Overview
Workers Compensation Program OverviewWorkers Compensation Program Overview
Workers Compensation Program Overview
 
Peer Risk Model for Cyber Security Risk
Peer Risk Model for Cyber Security RiskPeer Risk Model for Cyber Security Risk
Peer Risk Model for Cyber Security Risk
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Loan Analysis Predicting Defaulters
Loan Analysis Predicting DefaultersLoan Analysis Predicting Defaulters
Loan Analysis Predicting Defaulters
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network Model
 
CollectionOptimization
CollectionOptimizationCollectionOptimization
CollectionOptimization
 
Churn in the Telecommunications Industry
Churn in the Telecommunications IndustryChurn in the Telecommunications Industry
Churn in the Telecommunications Industry
 
Reduction in customer complaints - Mortgage Industry
Reduction in customer complaints - Mortgage IndustryReduction in customer complaints - Mortgage Industry
Reduction in customer complaints - Mortgage Industry
 
Predictive analytics-white-paper
Predictive analytics-white-paperPredictive analytics-white-paper
Predictive analytics-white-paper
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptx
 
Customer_Churn_prediction.pptx
Customer_Churn_prediction.pptxCustomer_Churn_prediction.pptx
Customer_Churn_prediction.pptx
 
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
7. Plan, perform, and evaluate samples for substantive procedures IPPTChap009...
 
Gmid associates services portfolio bank
Gmid associates  services portfolio bankGmid associates  services portfolio bank
Gmid associates services portfolio bank
 
201206 Tech Decisions: Finding Profits
201206 Tech Decisions: Finding Profits201206 Tech Decisions: Finding Profits
201206 Tech Decisions: Finding Profits
 
9 Key Concepts
9 Key Concepts9 Key Concepts
9 Key Concepts
 
Bank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim PredictionBank Customer Segmentation & Insurance Claim Prediction
Bank Customer Segmentation & Insurance Claim Prediction
 
Pinpoint Predictive- InsurTech Innovation Award 2022
Pinpoint Predictive- InsurTech Innovation Award 2022Pinpoint Predictive- InsurTech Innovation Award 2022
Pinpoint Predictive- InsurTech Innovation Award 2022
 
PredictiveMetrics' Predictive Scoring for Collections Capabilities
PredictiveMetrics' Predictive Scoring for Collections CapabilitiesPredictiveMetrics' Predictive Scoring for Collections Capabilities
PredictiveMetrics' Predictive Scoring for Collections Capabilities
 
Analytics to the Rescue: Better Loss Prevention through Modeling
Analytics to the Rescue: Better Loss Prevention through ModelingAnalytics to the Rescue: Better Loss Prevention through Modeling
Analytics to the Rescue: Better Loss Prevention through Modeling
 
Cyber loss model for all industries
Cyber loss model for all industriesCyber loss model for all industries
Cyber loss model for all industries
 

Recently uploaded

Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 

Recently uploaded (20)

Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 

La city presentation

  • 1. LA CITY WORKER COMPENSATION ANALYSIS ZOE HUANG, JIAYING GU, DANNIEL WINARTO, BAILI LU , RINI MUKHERJEE DANNIEL WINARTO JIAYING GU ZOE HUANG BAILI LU RINI MUKHERJEE LA CITY WORKER COMPENSATION ANALYSIS
  • 3. Project Overview & Objective Overview: 1. Dataset contains 3 years record of claimant of LA workers. 2. Entire Dataset contains 39 different Excel files. 3. Response Variable is the claim amount. Objective: 1. Figure out high risk factors and their patterns. 2. Build a predictive model.
  • 5. Exploratory Analysis Average Amount for each Fiscal Year
  • 6. Exploratory Analysis Total Amount for each Fiscal Year
  • 7. Exploratory Analysis Average Payment Amount Group by Nature of Injury
  • 8. Average Payment Amount Group by Claimant Cause Exploratory Analysis
  • 9. Exploratory Analysis Average Payment Amount Group by Claimant Type
  • 13. Datasets Joining —— Aggregate datasets to avoid one - many relationship Payment data is important to our analysis. We combined the past 3 years payment data together to form a large data set. For each claim, there might be several payment. But all other datasets we chose can all be uniquely identified by claim_id. So we aggregate payment information by claim_id so it can easily join with all other datasets. This also reduce the number of the observations from 300K to 30K.
  • 14. Incident Date Incident Month Incident Hour Birth Date Age Incident Date – Hire Date Hire Year Creating New Variables
  • 15. Data Cleaning More than 100 variables: - Delete missing percent > 90% More than 2400 levels: - Reduce level
  • 16. Data Cleaning Categorical Variables Level Reduction methodology: 1. Zip Code Splitted based on the quantile of claim amount, into 5 groups 2. Incident Hour Splitted every 4 hours into 5 groups “Midnight - 5AM”, “6AM - 11AM”, “Noon - 5PM”, 7PM - Midnight” 3. Occupation group Splitted based on the quantile claim amount, into 5 groups 4. Organization Code group Splitted based on the quantile claim amount, into 5 groups
  • 18. Building Model We split the dataset: 80 % training 20% testing We integrate the levels of categorical variables (litigated, nature of injury, fiscal year, zip code and claimant type) The predictive model is significant The Adjusted R-Squared is 0.9054
  • 19. Building Model - All the levels of litigated, nature of injury, fiscal year and zip code group are significant - Only part of the levels of claimant type are significant
  • 20. Building Model Since the trend of many variables is seriously right-skewed, we do the log transformation for the dependent variable and fit the linear model
  • 21. Building Model We integrate the levels of categorical variables (litigated, nature of injury, fiscal year, zip code and claimant type) The predictive model is significant The Adjusted R-Squared is 0.9054
  • 22. Building Model Log.Amount = 10.089 + C1 x Litigated[0/1] + C2 x Nature of Injury Group + C3 x Fiscal Year Group + C4 x Zip Group+ C5 x Claimant Type Where: C1 = [-.120, 0] C2 = [.035,.012, 0] C3 = [-.034, .165, .212, .202, .178, 0] C4 = [-5.218, -3.580, -2.314, -.923, 0] C5 = [-.189, .120, -.120]
  • 23. Model Prediction > THE MODEL WELL PREDICT THE TESTING DATA > Most of our predicted results are off within 50% of the actual data, meaning that our model can capture the true values.
  • 24. Business Insight List of Importance (stepwise selection) : Zip Group Fiscal Year Group Litigated Claimant Type Nature of Injury Group
  • 25. Business Insight - Zip Code Zip Group is the most significant variable, we found out by applying stepwise regression methodology. This finding is intuitively make sense, since some areas may have condition(eg. Road quality, rainfall intensity, average income) that increase chance of the worker getting injured (Based on the assumption that generally people tend to live close where they work). Example of most freq in each zip group: Z1 = 91350 -> Santa Clarita Z2 = 91342 -> Sylmar Z3 = 93065 -> Simi Valley Z4 = 93551 -> Palmdale Z5 = 91709 -> Chino hills We should pay attention more on this area to reduce the claimant frequency in the future
  • 26. Business Insight - Fiscal Year Claims that happened in fiscal year 2012/2013 have the highest amount. All those that happened before 2011 are lower than after 2011. After 2013, the average amount per claim decreased. 2013-2014 fiscal year Is higher than 2014-2015 fiscal year.
  • 27. Business Insight - Litigated Claims that are litigated have higher amount than those that are not litigated. Meaning that in order to cut cost, reduce the number of litigated claims would help.
  • 28. Business Insight - Claimant Type Error claims have the lowest amount, and future medical claims have the highest. Incident only claims are lower than medical only. In order to cut cost, special attention should be applied to future medical claims.
  • 29. Business Insight - Nature of Injury Multiple injuries claims have the highest average amount, occ disease or cum injury comes next and the lowest is specific injury. Cost could be reduced by paying attention to claims of multiple injuries.
  • 30. 1. Since our model suggests that Zip Group, Fiscal Year Group ,Litigated ,Claimant Type, Nature of Injury Group are risky factors for claim amount. LA City could cut cost by focusing on these. 2. There are other variables that we considered to put into our model, but because of existence of missing values or wrong values, we excluded them from our model. If these variables are maintained better in the future, they can be contributed to the model. For example, employment type. 3. If LA City would like to obtain more accurate predictive results, we will recommend to gather more numerical data in the future. Suggestions & Takeaways
  • 31. LA CITY WORKER COMPENSATION ANALYSIS ZOE HUANG, JIAYING GU, DANNIEL WINARTO, BAILI LU , RINI MUKHERJEE DANNIEL WINARTO JIAYING GU ZOE HUANG BAILI LU RINI MUKHERJEE THANK YOU & QUESTIONS ARE WELCOME