
Mar. 21, 2019

Talk given at PyData London 2018 on using Survival Analysis to forecast Customer Retention

lornaman

- Survival Analysis: a practical application
- Structure of the talk: Me; Tails.com; Lifetime value and retention; Survival analysis (motivation and theory); A survival regression model; Outputs and accuracy measures
- About me: University of York, Maths MMath; University of Bristol, PhD in Random Matrix Theory; Department for Education, Operations Research Analyst (Post-16 Education and Funding); ASI Data Science, Data Science Internship; Tails.com, Data Scientist
- Changing the world of pet food for good. Our proposition is based on a one-to-one relationship with each owner and their dog: the customer visits tails.com and enters their dog's details; the perfect product is blended to meet the pet's individual requirements as a one-off; packaging is personalised with the dog's name and their unique blend details; the order is delivered to the customer, left in a safe place if necessary; the feeding plan automatically updates as the dog ages or after optional owner feedback; auto-replenishment means the owner never runs out or has too much; a free adjustable, personalised feeding scoop makes it easy to feed the right amount every day
- tails.com in numbers: over 85,000 dogs UK-wide; deliver 4 million meals every month; average monthly order costs £24; 3 treat varieties, 15 wet food varieties; over 1 million blends searched in 0.1s to find the optimal blend for your dog; expect sales of well over £20m this year; around 100 employees...
- … and around 25 office dogs
- customer retention and lifetime value
- Why do we care about lifetime value? Lifetime Value helps us make smart decisions on product giveaways, customer refunds, marketing spend and project prioritisation
- Retention and lifetime value: Retention (how long you will be a customer); Frequency (how often you will order); Order Value (how much we make from your orders); Lifetime Value (total profit attributed to you)
- survival analysis
- Motivation: What is the average subscription length? The data is censored: for customers who are still subscribed, the end event has not been observed yet
- Survival analysis in action
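- Survival analysis definitions. T ≥ 0: the time to a subscription end for a randomly chosen customer (the time the customer churned). Survival function: the probability that the customer hasn't churned by time t. Hazard function: the instantaneous rate at which a customer churns at time t, given that they are still active at t. The survival and hazard functions are related to each other (the equation shown on the slide is not preserved in this transcript)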
- package options
- Model choices. Python packages: Lifelines (lightweight, good visualisations, limited model selection; Cameron Davidson-Pilon, github.com/CamDavidsonPilon) and scikit-survival (bigger selection of linear and nonlinear model options; Sebastian Pölsterl, PyCon UK 2017, github.com/sebp/scikit-survival). R packages: survival, KMsurv, OIsurv (decent model selection; lots of tutorials and lectures on the subject use these packages; generally slower to train, and less intuitive to use than the Python options)
- modelling
- Kaplan-Meier estimate of the survival function, using the lifelines package (the code shown on the slide is not preserved in this transcript)
- Kaplan-Meier estimate of the survival function (plot). x-axis: time since subscription start; y-axis: probability still a customer. Annotations: probability customer still active = 50%; expected time active
- Survival regression: Cox Proportional Hazards model. Input features x: everything we know about your dog, you, and your actions in trial. Key assumption: the impact of a factor on survival is multiplicative (it scales the hazard), and that impact is constant over time
- Feature engineering: dealing with categorical data
  - One hot encoding: a category with n options is converted to n - 1 binary features

    | pet_id | breed |
    |---|---|
    | 1 | Jack Russell |
    | 2 | Labrador |
    | 3 | Dalmatian |

    | pet_id | jack_russell | labrador | dalmatian |
    |---|---|---|---|
    | 1 | 1 | 0 | 0 |
    | 2 | 0 | 1 | 0 |
    | 3 | 0 | 0 | 1 |

  - Assign sensible ordering: a logical ordering is given instead of the category, by estimating the median time active based on the population

    | pet_id | breed |
    |---|---|
    | 1 | Jack Russell |
    | 2 | Labrador |
    | 3 | Dalmatian |

    | pet_id | breed_median_days_active |
    |---|---|
    | 1 | xxx |
    | 2 | yyy |
    | 3 | zzz |

  - Better clarity on impact of each individual category
  - Generally more accurate prediction
- Training the model …..
- evaluation
- Concordance index: measure accuracy by successful ordering of pairs of customers. (Diagram: Mowgli and Mr. Patch placed on an axis, starting at 0, of the predicted time a customer will be active)
- Concordance index: measure accuracy by successful ordering of pairs of customers. Mowgli: active for 12 months. Mr. Patch: active for 18 months. Predict Mowgli active for longer: wrong! Predict Mr. Patch active for longer: correct! The concordance index lies between 0 and 1. 1: perfect ordering of pets. 0.5: as good as random ordering. 0: perfect anti-ordering
- Pet-level survival predictions (plot). x-axis: time since subscription start; y-axis: probability still a customer. Annotations: happy customer, active for a long time; uninterested customer, churns quickly. Lines don't intersect, due to the underlying proportional hazards assumption
- python, data and dogs your kind of thing?
- lorna@tails.com https://tails.com/careers
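
The definitions slide introduces the churn time T, the survival function and the hazard function, and says the two functions are related. The slide's own equation is not preserved in this transcript, so what follows is the standard textbook relationship for a continuous T, not necessarily the form the speaker showed. The symbols S, h and Δt are notation chosen here for illustration.

```latex
S(t) = P(T > t), \qquad
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = -\frac{S'(t)}{S(t)}, \qquad
S(t) = \exp\!\left(-\int_0^t h(u)\,du\right)
```

Read left to right: the survival function is the probability of still being a customer after time t, the hazard is the churn rate among customers who have made it to t, and a high hazard over a period shows up as a steep drop in the survival curve.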
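
The Kaplan-Meier slides describe estimating the survival function from subscription data where some customers are still active (censored). The talk used the lifelines package, which provides a `KaplanMeierFitter` class for this. The code from the slide is not preserved, so the block below is only a minimal pure-Python sketch of the estimator itself, with made-up subscription lengths. The function names `kaplan_meier` and `median_survival` and all the data are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at every time where at least one churn was observed.

    durations: time each customer was followed (e.g. months since sign-up)
    observed:  True if the customer churned at that time,
               False if they were still active (censored)
    """
    churns = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # everyone leaving the risk set at t (churned or censored)
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = churns.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk customers who did not churn at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 50% or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # the curve never reaches 50% within the observed data


# Hypothetical subscription lengths in months; False means still subscribed.
durations = [2, 3, 3, 5, 8, 8, 12, 12]
observed = [True, True, False, True, True, False, False, False]

curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"month {t}: P(still a customer) = {s:.3f}")
print("median time active:", median_survival(curve))
```

Censored customers never reduce the survival estimate directly. They only shrink the at-risk count for later times, which is how the estimator uses customers whose end event has not been observed yet.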
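
The feature engineering slide contrasts two ways of handling a categorical feature such as breed: one hot encoding into n - 1 binary columns, and replacing the category with a population-level median of days active. The sketch below illustrates both in plain Python. The rows, the `days_active` values and the helper names are hypothetical, and the median here is taken over raw values for brevity, which ignores censoring (in practice a Kaplan-Meier median per breed would be the safer estimate).

```python
from statistics import median

# Hypothetical rows: (pet_id, breed, days_active)
pets = [
    (1, "Jack Russell", 400),
    (2, "Labrador", 150),
    (3, "Dalmatian", 90),
    (4, "Labrador", 250),
    (5, "Jack Russell", 300),
]


def one_hot(rows, drop_first=True):
    """Map pet_id -> {breed: 0/1}. With drop_first, n breeds give n - 1 columns."""
    breeds = sorted({breed for _, breed, _ in rows})
    kept = breeds[1:] if drop_first else breeds  # the dropped breed is the baseline
    return {pid: {b: int(breed == b) for b in kept} for pid, breed, _ in rows}


def median_days_by_breed(rows):
    """Map breed -> median days active across all pets of that breed."""
    by_breed = {}
    for _, breed, days in rows:
        by_breed.setdefault(breed, []).append(days)
    return {breed: median(days) for breed, days in by_breed.items()}


encoded = one_hot(pets)
breed_median = median_days_by_breed(pets)
# One numeric feature per pet, replacing the breed category.
breed_median_days_active = {pid: breed_median[breed] for pid, breed, _ in pets}

print(encoded[1])
print(breed_median_days_active)
```

With `drop_first=True` the alphabetically first breed becomes the baseline and is represented by all zeros. The median feature should be computed on training data only, otherwise information about the outcome leaks into the inputs.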

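
The survival regression slide states the key assumption of the Cox Proportional Hazards model (a factor's impact is multiplicative and constant over time), and the pet-level predictions slide notes that predicted curves do not intersect. Both follow from the standard form of the model, written out below. The symbols h_0 (baseline hazard), S_0 (baseline survival), β (coefficients) and x_a, x_b (two customers' feature vectors) are notation chosen here, not taken from the slides.

```latex
h(t \mid x) = h_0(t)\,\exp(\beta^\top x), \qquad
\frac{h(t \mid x_a)}{h(t \mid x_b)} = \exp\!\big(\beta^\top (x_a - x_b)\big), \qquad
S(t \mid x) = S_0(t)^{\exp(\beta^\top x)}
```

The hazard ratio between two customers does not depend on t, which is the "constant over time" part of the assumption. Because every customer's curve is the same baseline curve raised to a customer-specific power, a customer with a higher risk score has a lower survival probability at every time point, so the curves cannot cross. In lifelines, a model of this form is fitted with the `CoxPHFitter` class.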
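
The concordance index slides describe scoring a model by how often it orders pairs of customers correctly. The sketch below is one simple way to compute it in plain Python, using the Mowgli and Mr. Patch example from the slide. It makes two simplifying assumptions that are not from the talk: pairs with equal actual durations are skipped, and a pair is only compared when the customer with the shorter duration actually churned. lifelines ships its own `lifelines.utils.concordance_index`, which may treat ties differently.

```python
from itertools import combinations


def concordance_index(durations, predicted, observed):
    """Fraction of comparable customer pairs that the predictions order correctly.

    durations: actual time active for each customer
    predicted: predicted time active (larger means expected to stay longer)
    observed:  True if the customer churned, False if still active (censored)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(durations)), 2):
        # Let `a` be the customer with the shorter actual duration.
        a, b = (i, j) if durations[i] <= durations[j] else (j, i)
        if durations[a] == durations[b] or not observed[a]:
            # Tied durations are skipped; if the shorter one is censored,
            # we cannot know which customer really lasted longer.
            continue
        comparable += 1
        if predicted[a] < predicted[b]:
            concordant += 1.0
        elif predicted[a] == predicted[b]:
            concordant += 0.5  # a tied prediction counts as half right
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# From the slide: Mowgli was active for 12 months, Mr. Patch for 18 months.
actual = [12, 18]
churned = [True, True]

print(concordance_index(actual, [10, 20], churned))  # predicts Mr. Patch active for longer
print(concordance_index(actual, [20, 10], churned))  # predicts Mowgli active for longer
```

The first call corresponds to the "correct" ordering on the slide and the second to the "wrong" one. Over many pairs, the score averages these outcomes into a single number between 0 and 1.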