"How to Data Model Churn (real life examples)"

Dataiku attended the New York Data Modeling Meetup on Wednesday, August 19, 2015. Pierre Gutierrez (Data Scientist at Dataiku) talked about "How to Data Model Churn (real life examples)".

You can now download the presentation in a PDF file.

Get started with the free community edition of Data Science studio right now: http://www.dataiku.com/dss/trynow/

Published in: Data & Analytics

  1. How to data model churn: real life examples
  2. Quick quiz
     • How many of you are familiar with the churn problem?
     • With machine learning? Logistic regression, random forests, gradient boosted trees? (Not the subject here.)
     • With SQL? (We may see some code later.)
     • What database technology do you use? What about EMC Greenplum or Vertica?
  3. Who I am
     • Senior Data Scientist at Dataiku (worked on churn prediction, fraud detection, bot detection, recommender systems, graph analytics, smart cities, …)
     • Occasional Kaggle competitor
     • Mostly code in Python and SQL
     • Twitter: @prrgutierrez
  4. Churn definition
     • Wikipedia: “Churn rate (sometimes called attrition rate), in its broadest sense, is a measure of the number of individuals or items moving out of a collective group over a specific period of time.”
     • In short: customers leaving
  5. Two types of churn
     • Subscription models:
        • Telco
        • E-gaming (WoW)
        • Ex: Coyote -> 1-year subscription
        -> you know when someone leaves
     • Non-subscription models:
        • E-business (Amazon, Price Minister, Vente Privée)
        • E-gaming (Candy Crush, free MMORPGs)
        -> you approximate when someone has left:
        • Candy Crush: days / weeks
        • MMORPG: 2 months (holidays)
        • Price Minister: months
  6. Two types of churn
     • Blurred separation:
        • Ex: T-Mobile: 1-month subscription -> paying for each call
        • Ex: WoW: 1-month to 6-month subscriptions
        • Banking?
     • Focus: no subscription
        • Can be seen as a generalization where you have to approximate the target
     • Bonus: seller churn
        • Marketplaces
        • Clients that participate in the product's life
        • Forums (Reddit)
        • E-gaming (Korean competitions, guilds, etc.)
  7. Dealing with churn
     • Motivations:
        • Saturated market -> cost of acquiring a new client >>> cost of keeping a client
        • Ex: http://www.bain.com/publications/articles/breaking-the-back-of-customer-churn.aspx
        • Wireline company: 2% to 2.5% churn rate per month
        • With 5 M customers -> 1.32 M churn per year
        • Reducing churn from 2.5% to 2%, lowest estimate: $240 M in 18 months
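
As a rough check on the order of magnitude of the figures on this slide, the sketch below compounds a fixed monthly churn rate over 12 months. It assumes a closed customer base (no new sign-ups) and a constant rate, which the slide does not state; the function name `annual_churn` is made up for the illustration. Under those assumptions a 2.5% monthly rate on 5 M customers gives roughly 1.3 M customers lost per year, in line with the figure quoted above. The $240 M estimate depends on revenue data that is not in the slides and is not reproduced here.

```python
def annual_churn(customers: int, monthly_rate: float, months: int = 12) -> float:
    """Customers lost after `months` if a fixed share of the remaining base leaves each month.

    Assumption (not from the slides): closed base, constant monthly rate.
    """
    remaining = customers * (1.0 - monthly_rate) ** months
    return customers - remaining


base = 5_000_000
lost_high = annual_churn(base, 0.025)  # 2.5% churn per month
lost_low = annual_churn(base, 0.02)    # 2.0% churn per month

print(f"2.5% per month: {lost_high / 1e6:.2f} M customers lost per year")
print(f"2.0% per month: {lost_low / 1e6:.2f} M customers lost per year")
print(f"difference:     {(lost_high - lost_low) / 1e6:.2f} M customers kept")
```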
  8. Dealing with churn
     • Predict churn:
        • One model for performance <- our focus, short term, more ML
        • One model for understanding <- long term, more analytics
     • Act on it (short term):
        • Special offer (telco call, free in-game money, discount coupon, …)
        • Does it work? Feedback loop needed!
        • Model the probability of leaving because of the offer. A/B tests. Multi-armed bandit?
        • Significant LTV for activation?
     • Act on it (long term):
        • Is there a problem in my purchasing funnel?
        • Is the game too hard at some point?
  9. Dealing with churn
     • Candy Crush rumor:
        • Change the probability distribution of candies / bombs
        • Change the difficulty of the game
        • Losing a lot makes the game easier
  10. Modelling churn
      • Machine learning model (classification) -> target:
         • Known in subscription models
         • Unknown in general
      • Step 1: maintain customer status
         • Do you care only about your best customers?
         • Either way, the churn action won't be the same
         • Has a client churned?
           -> target = churner = no purchase / visit since time X
           -> best = purchases / visits more than y times since time Y
         • Can be refined (“new customer”, several classes of best or inactive, reactivated, …)
         • Storage: maintain only the difference!
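
The status rules on this slide can be sketched in a few lines of Python. This is only an illustration: the thresholds (120 days, more than 5 events), the status name "active" for customers who are neither churners nor best, and the function names `customer_status` and `status_changes` are assumptions, not values from the talk. The second function illustrates one reading of "maintain only the difference": store a row only when a customer's status changes.

```python
from datetime import date, timedelta


def customer_status(events, today, churn_days=120, best_days=120, best_threshold=5):
    """Label one customer from the dates of their purchases / visits.

    churner: no activity since time X (today - churn_days)
    best:    more than y (best_threshold) events since time Y (today - best_days)
    active:  anything else (hypothetical catch-all status)
    """
    churn_cutoff = today - timedelta(days=churn_days)
    best_cutoff = today - timedelta(days=best_days)
    if not any(d > churn_cutoff for d in events):
        return "churner"
    recent = sum(1 for d in events if d > best_cutoff)
    return "best" if recent > best_threshold else "active"


def status_changes(previous, current):
    """Keep only the customers whose status differs from the stored one."""
    return {cid: s for cid, s in current.items() if previous.get(cid) != s}


today = date(2015, 8, 19)
activity = {
    "a": [date(2015, 1, 10)],
    "b": [date(2015, 8, 1)],
    "c": [date(2015, 8, day) for day in range(1, 8)],
}
statuses = {cid: customer_status(ev, today) for cid, ev in activity.items()}
stored = {"a": "active", "b": "active"}   # statuses saved at the previous run
changes = status_changes(stored, statuses)
print(statuses)
print(changes)
```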
  11. Modelling churn
      • Machine learning model -> features:
         • Explanatory factors used as input for the model
      • Step 2: maintain customer features
         • Social (gender, age, etc.)
         • Behavioral!
            • Utilization / buying rate
            • Trend in utilization / buying rate
         • Ad hoc features:
            • WoW / social game churn: take into account churn in the friend network
            • Telco: calls to call centers
         • Beware of time dependence!
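
A minimal sketch of the behavioral features named on this slide, a usage rate and its trend, computed from a list of event dates. The 30-day window, the feature names and the function name `behavioural_features` are assumptions made for the example. The `cutoff` argument is one way to respect the time-dependence warning: only events strictly before the cutoff are counted, so nothing from the target period leaks into the features.

```python
from datetime import date, timedelta


def behavioural_features(events, cutoff, window_days=30):
    """Rate and trend of activity, using only events strictly before `cutoff`."""
    recent_start = cutoff - timedelta(days=window_days)
    previous_start = cutoff - timedelta(days=2 * window_days)
    recent = sum(1 for d in events if recent_start <= d < cutoff)
    previous = sum(1 for d in events if previous_start <= d < recent_start)
    return {
        "rate_recent": recent / window_days,       # events per day, last window
        "rate_previous": previous / window_days,   # events per day, window before
        "trend": (recent - previous) / window_days,
    }


events = [
    date(2015, 2, 25), date(2015, 3, 1),                    # previous window
    date(2015, 3, 25), date(2015, 4, 1), date(2015, 4, 10),  # recent window
    date(2015, 5, 1),                                        # after the cutoff: ignored
]
features = behavioural_features(events, cutoff=date(2015, 4, 19))
print(features)
```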
  12. Data model
  13. Computation dependency diagram
  14. Ex: Train and predict scheme
      • Time axis: T – 4 months, T (present time)
      • Data is used for feature generation.
      • Data is used for target creation: activity during the last 4 months
      • Train model using features and target
      • Use model to predict future churn
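
One way to read this scheme, sketched in Python: training features come from the data before T – 4 months, the training target from activity between T – 4 months and T, and the rows to score use features computed up to T. That reading of the diagram, the 120-day approximation of "4 months", the single feature `n_events` and all function names are assumptions made for the illustration. No model is fitted here; the sketch only builds the two datasets.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=120)  # assumption: "4 months" approximated as 120 days


def n_events(events, start, end):
    """Number of events in the half-open interval [start, end)."""
    return sum(1 for d in events if start <= d < end)


def build_train_and_score_sets(activity, present):
    split = present - WINDOW  # T - 4 months
    train, to_score = [], []
    for cid, events in activity.items():
        # Training row: features before the split, target after it.
        x_train = {"n_events": n_events(events, split - WINDOW, split)}
        churned = n_events(events, split, present) == 0
        train.append((cid, x_train, churned))
        # Scoring row: same feature definition, shifted to end at the present.
        x_now = {"n_events": n_events(events, present - WINDOW, present)}
        to_score.append((cid, x_now))
    return train, to_score


activity = {
    "a": [date(2015, 2, 1), date(2015, 3, 1)],
    "b": [date(2015, 3, 15), date(2015, 6, 1)],
}
train, to_score = build_train_and_score_sets(activity, present=date(2015, 8, 19))
print(train)
print(to_score)
```

Keeping the feature definition identical for the training rows and the scoring rows, only shifted in time, is what lets a model trained on the past be applied at T.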
  15. Ex: Train, evaluation and predict scheme
      • Time axis: T – 8 months, T – 4 months, T (present time)
      • Training:
         • Data is used for feature generation.
         • Data is used for target creation: activity during the last 4 months
      • Validation set:
         • Data is used for feature generation
         • Data is used for target creation: activity during the last 4 months
         • Evaluate on the target of the validation set
      • Use model to predict future churn
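
The same idea with a validation step can be sketched by shifting one row-building function in time: training rows are anchored at T – 8 months, validation rows at T – 4 months, and scoring happens at T. This placement of the windows is an interpretation of the diagram. The "model" below is a deliberately trivial stand-in (a threshold on the event count, chosen on the training rows), used only so that the evaluation step has something to evaluate; it is not the method from the talk, and all names and the 120-day window are assumptions.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=120)  # assumption: "4 months" approximated as 120 days


def n_events(events, start, end):
    """Number of events in the half-open interval [start, end)."""
    return sum(1 for d in events if start <= d < end)


def labelled_rows(activity, feature_end):
    """Feature from [feature_end - WINDOW, feature_end), target from the next WINDOW."""
    rows = []
    for events in activity.values():
        x = n_events(events, feature_end - WINDOW, feature_end)
        churned = n_events(events, feature_end, feature_end + WINDOW) == 0
        rows.append((x, churned))
    return rows


def accuracy(rows, threshold):
    """Share of rows where the rule 'churn if x <= threshold' matches the target."""
    return sum((x <= threshold) == y for x, y in rows) / len(rows)


def fit_threshold(rows):
    """Stand-in for model training: pick the threshold with the best training accuracy."""
    return max(sorted({x for x, _ in rows}), key=lambda t: accuracy(rows, t))


present = date(2015, 8, 19)
activity = {
    "a": [date(2014, 9, 1), date(2014, 10, 1), date(2014, 11, 1),
          date(2015, 1, 15), date(2015, 2, 15), date(2015, 5, 1)],
    "b": [date(2014, 10, 10)],
    "c": [date(2014, 11, 20), date(2015, 3, 1)],
    "d": [date(2014, 9, 15), date(2014, 9, 20), date(2015, 2, 1),
          date(2015, 2, 10), date(2015, 6, 1), date(2015, 7, 1)],
}

train_rows = labelled_rows(activity, present - 2 * WINDOW)  # target: T-8m to T-4m
valid_rows = labelled_rows(activity, present - WINDOW)      # target: T-4m to T
threshold = fit_threshold(train_rows)
valid_accuracy = accuracy(valid_rows, threshold)

# Score today's customers with the same feature definition, ending at T.
predicted_churn = {
    cid: n_events(ev, present - WINDOW, present) <= threshold
    for cid, ev in activity.items()
}
print(threshold, valid_accuracy, predicted_churn)
```

Because the validation target lies entirely after the training target, the validation score is measured on a period the stand-in model never saw.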
  16. Thank you for your attention!
