Attrition Model and Remaining Lifetime

What is Survival Analysis?
Survival analysis is a set of statistical methods for time-dependent outcomes. It is typically used when the research question centers on whether or when an event occurs, for example: when does a customer attrite?
Life Table and Empirical Hazard
The life table follows customers over their lifetime with the company, where the risk of attriting depends on tenure; the empirical hazard presents that risk graphically.
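The life-table arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: it uses the dummy counts quoted on the Life Table slide of the transcript (taking "3.1 million" as exactly 3,100,000, which is an assumption), and the names `LifeTableRow` and `build_life_table` are hypothetical. Customers censored in a tenure year are counted as at risk for that year, which matches the percentages on the slide.

```python
from dataclasses import dataclass


@dataclass
class LifeTableRow:
    tenure: int    # tenure in years at the start of the interval
    at_risk: int   # customers who reached this tenure
    attrited: int  # customers who attrited at this tenure
    censored: int  # customers right-censored at this tenure

    @property
    def hazard(self) -> float:
        # Empirical hazard: attritors divided by customers at risk.
        return self.attrited / self.at_risk


def build_life_table(initial_at_risk, counts):
    """Build rows from (attrited, censored) counts per tenure year, starting at 0."""
    rows = []
    at_risk = initial_at_risk
    for tenure, (attrited, censored) in enumerate(counts):
        rows.append(LifeTableRow(tenure, at_risk, attrited, censored))
        # Attritors and censored customers both leave the risk set.
        at_risk -= attrited + censored
    return rows


# Dummy counts from the Life Table slide (3.1 million assumed exact).
table = build_life_table(3_100_000, [(500, 160_000), (650, 161_000)])
for row in table:
    print(f"t={row.tenure}  at risk={row.at_risk:,}  hazard={row.hazard:.4%}")
```

Under these assumptions the risk set at tenure 1 is 2,939,500, which the slide rounds to 2.9 million, and the two hazards print as 0.0161% and 0.0221%.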
Flexible Hazard Modeling Methodology
The hazard model is based on logistic regression, was built using discrete-time survival data, and possesses a simple parametric form.
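To make "discrete-time survival data" concrete, the sketch below expands each customer into one row per tenure year (a person-period layout) with a binary attrition flag, which is the shape a logistic regression needs. The five customers are invented for illustration, and `to_person_period` and `fit_saturated_hazard` are hypothetical names. Instead of running an optimizer, it uses a known special case: for a logistic model with a separate intercept per tenure year and no covariates, the maximum-likelihood hazard is simply attritors divided by customers at risk. A model with covariates, as in the presentation, would need a proper logistic regression routine, which is not shown here.

```python
import math
from collections import defaultdict


def to_person_period(customers):
    """Expand (last_tenure, attrited) records into one row per customer-year.

    Each output row is (customer_id, tenure, event), where event is 1 only
    in the year the customer attrited. Right-censored customers contribute
    rows with event 0 for every year they were observed.
    """
    rows = []
    for cust_id, (last_tenure, attrited) in enumerate(customers):
        for t in range(last_tenure + 1):
            event = 1 if (attrited and t == last_tenure) else 0
            rows.append((cust_id, t, event))
    return rows


def fit_saturated_hazard(rows):
    """Hazard per tenure year for an intercept-per-year logistic model."""
    events = defaultdict(int)
    at_risk = defaultdict(int)
    for _, t, event in rows:
        at_risk[t] += 1
        events[t] += event
    return {t: events[t] / at_risk[t] for t in sorted(at_risk)}


def logit(p):
    """Log-odds; the per-year intercepts of the model are logit(hazard)."""
    return math.log(p / (1 - p))


# Hypothetical customers: (last observed tenure in years, attrited?)
customers = [(0, True), (2, True), (2, False), (1, False), (1, True)]
rows = to_person_period(customers)
hazards = fit_saturated_hazard(rows)
intercepts = {t: logit(h) for t, h in hazards.items()}
print(hazards)
```

The five customers expand to eleven person-period rows, and the fitted hazards are 0.2, 0.25, and 0.5 for tenure years 0, 1, and 2.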
How the Model was Used
The model was scored to produce a predictive score for each customer attriting. The mean residual life was then calculated, which estimates the mean remaining lifetime of a customer.
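The presentation does not state the formula behind the mean residual life, so the following is a sketch under one common discrete-time definition rather than the presenter's method. Given the scored hazards for the next K time intervals, the probability of surviving through each interval is the running product of (1 - hazard), and the restricted mean residual lifetime is the sum of those survival probabilities, i.e., the expected number of future intervals survived within the K-interval horizon. The function name and the example hazards are hypothetical.

```python
def restricted_mean_residual_life(future_hazards):
    """Expected number of future intervals survived, truncated at the horizon.

    future_hazards[k] is the scored probability of attriting in the k-th
    future interval, given the customer has not attrited before it.
    """
    survival = 1.0          # probability of still being a customer
    expected_periods = 0.0
    for h in future_hazards:
        survival *= 1.0 - h           # survive this interval as well
        expected_periods += survival  # each survived interval adds one period
    return expected_periods


# A customer with a 50% hazard in each of the next three years.
print(restricted_mean_residual_life([0.5, 0.5, 0.5]))  # 0.5 + 0.25 + 0.125
```

The result is "restricted" because it cannot exceed the number of intervals scored; a customer with zero hazard throughout would simply get the horizon length back.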
Future Considerations
Competing risks (mutually exclusive reasons for attriting), time-independent variables such as attrition risk, and seasonality should be considered for future versions of the model. Visit http://www.saraconsultingllc.com to learn more about the presenter.

Published in: Business, Technology


  1. 1. Attrition Model and Mean Remaining Lifetime. Author: Melinda Richmond. Date: May 6, 2009. Disclaimer: Dummy Data.
  2. 2. Executive Summary
       • What is Survival Analysis? Survival analysis is a set of statistical methods for time-dependent outcomes. It is typically used when the research question centers on whether or when an event occurs, for example: when does a customer attrite?
       • Life Table and Empirical Hazard: The life table follows customers over their lifetime with the company, where the risk of attriting depends on tenure; the empirical hazard presents that risk graphically.
       • Flexible Hazard Modeling Methodology: The hazard model is based on logistic regression, was built using discrete-time survival data, and possesses a simple parametric form.
       • How the Model was Used: The model was scored to produce a predictive score for each customer attriting. The mean residual life was then calculated, which estimates the mean remaining lifetime of a customer.
       • Future Considerations: Competing risks (mutually exclusive reasons for attriting), time-independent variables such as attrition risk, and seasonality should be considered for future versions of the model.
  3. 3. What is Survival Analysis?
       • Research question: Has a customer of COMPANY ABC attrited and, if so, when did the customer attrite?
       • Consider two COMPANY ABC customers, A and B. Customer A attrited during the study period with thirty-nine years of tenure. Customer B did not attrite during the study period, so the event time for this customer is unknown.
       • Customer A has a time-dependent outcome (attrited) with a known discrete event time, T = thirty-nine years of tenure. Customer B has a time-dependent outcome (not attrited) with an unknown discrete event time, i.e., the tenure at which the customer will attrite is unknown. This is known as right-censoring. In both cases the research question is answered for the observable time (the study period, see page 4): we know whether the customer attrited and, if so, when.
       • In addition to the outcome and the event time, whether known or censored, we need to know which other variables help to explain why a customer attrites. These are known as time-dependent covariates. For example: What was the age of the customer who attrited after thirty-nine years of tenure? Was this customer in paid-up status?
       • When the values of the covariates change over time, the data for each customer consist of many individual time series.
  4. 4. Window of Opportunity for Survival Analysis Data [Figure: timeline divided into PRIOR, STUDY PERIOD (Jan 2006 to Dec 2006), and AFTER, with example accounts marked Account Opened, Attrited, or Right-Censored.]
  5. 5. Life Table
       • All 3.1 million customers have had a tenure of t = 0 years. Of these, 500 (0.0161%) attrited as new customers, and 160,000 new customers had not attrited by the end of the study period, i.e., were right-censored.
       • The remaining 2.9 million customers have had a tenure of t = 1 year. Of these, 650 (0.0221%) attrited as first-year customers, and 161,000 first-year customers had not attrited by the end of the study period, i.e., were right-censored.
  6. 6. Empirical Hazard
       • Where the curve increases, customers become more likely to leave COMPANY ABC as tenure grows; where the curve decreases, they become less likely to leave.
  7. 7. Flexible Hazard Modeling Methodology
       • A hazard model based on logistic regression was built using discrete-time survival data; the model possesses a simple parametric form.
       • This model handles the nonlinear and irregular shape of the hazard as well as time-dependent covariates.
       • The hazard function is the conditional probability that the event occurs at time t, given the covariates and given that it has not occurred before t. It determines the probability distribution of the event time and so controls the occurrence of the outcome.
       • h(t|x) = Pr(T = t | T ≥ t, x), estimated empirically as #Attritors / #At risk
       • Pr(T = 0 | x) = h(0|x)
       • Pr(T = 1 | x) = [1 − h(0|x)] h(1|x)
       • Pr(T = 2 | x) = [1 − h(0|x)] [1 − h(1|x)] h(2|x)
       • Pr(T = n | x) = [1 − h(0|x)] [1 − h(1|x)] … [1 − h(n−1|x)] h(n|x)
       • It follows that the hazard can be modeled with the binary logistic regression model given on page 8.
  8. 8. Logistic Regression Model
  9. 9. How the Model was Used
       • The hazard estimates from the model were scored on 2007 data. The score measures the likelihood of a customer attriting. The graph on page 10 compares the actual number of attritors to the predicted number of attritors.
       • The average remaining lifetime of a customer is computed by scoring future time intervals with the hazard model. The graph on page 11 displays the average remaining lifetime.
       • Here the hazard is the conditional probability that the customer attrites at time t, given that the customer has not yet attrited.
  10. 10. Fitted Model for Number of Attritors
  11. 11. Fitted Model for Restricted Mean Residual Lifetime
  12. 12. Future Considerations
       • This write-up does not consider the reason a customer attrites. However, if the mutually exclusive reasons for attriting are known, one could model competing risks using a multinomial logistic approach instead of a binary logistic approach.
       • Attrition risk could be classified as large, medium, or small. These classes are independent of time, so one could build three separate models and obtain an equation for each risk class.
       • This write-up does not consider seasonality. However, we know that certain observations may show a seasonality effect.
