Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Like this presentation? Why not share!

- Survival analysis by Sanjaya Sahoo 2285 views
- Libro socorro by Daniela Videla 8549 views
- A gentle introduction to survival a... by Angelo Tinazzi 2431 views
- Survival Analysis of Web Users by Data Science London 3131 views
- Survival analysis by Har Jindal 2788 views
- Survival analysis by College of Fisher... 995 views

3,076 views

Published on

http://www.meetup.com/Data-Mining/events/68283282/

No Downloads

Total views

3,076

On SlideShare

0

From Embeds

0

Number of Embeds

6

Shares

0

Downloads

97

Comments

0

Likes

5

No embeds

No notes for slide

- 1. Subscription Survival Analysis SF Data Mining Meetup San Francisco June 2012 Jim Porzak – Senior Director, Business Intelligence
- 2. What We’ll Cover…• Quick introduction to survival analysis.• Rational for marketing & business.• Subscription businesses past & present.• The example data set. How real is it?• Calculating survival, average tenure & LTV.• Applications of Survival & Hazard curves.• Discussion.• Appendix (links & details). 2
- 3. Traditional Survival Analysis• Modeling of “time to event” data – Death in biological organisms (medicine) – Failure of machines (reliability engineering) – Political or sociological change (duration analysis)• These kinds of data share common gotcha: – If an “individual” has not yet died, failed, or changed, what can we say about the expected time to the event? 3
- 4. Trick: Use what we know – everything!• All have started, some have stopped, & some continue on. – “Right censored”• From this kind of data we can calculate probability of stopping. – “Hazard Ratio” = f(tenure, … )• Probability of survival at tenure, T, is just ( 1 – cumulative hazard(T)) 4
- 5. Classical “Bathtub” Hazard CurveSource: http://en.wikipedia.org/wiki/Bathtub_curve 5
- 6. The Subscription Business Model1st Visit Acquisition Conversion Lapse Re-activation Today’s #’s or Win-back Re- activate Engage Win-Back Acquire Retain Retain Convert 6
- 7. Aside: Subscription Stints – Rational & Method• For each member we need to know subscription start &, if terminated, stop dates for each member: Subscriber A• But data often has the accounting view with gaps in the subscription: CC Issue Prod Chg Whatever Subscriber A• From the customer’s perspective, he had one continuous subscription. It’s important to take that view since, in general, the longer a subscriber is active the lower the risk of churn.• A subscription stint algorithm typically ignores any gaps up to, say, 30 days – which is also taken as the boundary between a lapsed subscriber in the “re-activate” vrs “win-back” pools. 7
- 8. Why Subscription Survival?• Expected churn probability of new subscribers – Project future subscriber count • Especially after a successful campaign – Define tenure segment boundaries• Expected tenure of a new subscriber & LTV – As function of duration, product, channel, offer – Evaluate marketing & site tests by value not rates• Projected value of retention efforts – How improved renewal rates increase value• A new way of thinking about subscribers – Retention metrics, as used in marketing, are deceptive (see Appendix slide) 8
- 9. Typical Subscription Survival CurvesLinoff, Gordon (2004) Link in appendix. 9
- 10. Assumptions• Homogeneity of strata – Each includes only one segment of customers• Stability over time – Hardest to satisfy? – Propensity to renew a function of: • General economic conditions • Offer(s) at start of subscription • Engagement efforts by marketing group • Perceived long term value 10
- 11. Example Data Set• 100k records: – CustomerID – SubStartDate – SubEndDate (NULL if subscription active) – Duration = [A, Q, M] for Annual, Quarterly, or Monthly – Product = [B, P] for Basic or Professional – Channel = [R, A, S, B, O] for Referral, Affiliate, Search, Banner, or Other• Randomly generated with day-of-week and week-in-year seasonality; 15% year over year growth for starts.• Randomly generated number of renewals – With different churn rates for A, Q, & M• Mimics general properties of subscription data from Playboy.com, LA Times, 24 Hour Fitness, Chicago Sun Times, Ancestry.com, & Viadeo.com – Additional parameters could be Offer, Promotion Cohort, …• Assumes “no-return” policy, i.e. no early refunds. 11
- 12. Algorithm – Part 1: Hazard Ratio (# terminated during day i) ______________________________________HRi = (# active at beginning of day i) = 1/4 = 0.25Remember N is typically huge in asubscription business so we tend to getsmooth curves. 12
- 13. Algorithm – Part 2 Survival Curve Hazard Ratio Rate USDeluxe Domestic Acom Hazard 0.14 Duration A (N = 204,538) M (N = 196,345) 0.12 0.10Hazard Rate 0.08 0.06 0.04 0.02 0.00 0 200 400 600 800 1000 Tenure (days) Subscribers since 2006-01-01 (N = 400,883 subscribers) Average 2-year Tenure Survival Probability USDeluxe Domestic Acom Subscriber Survival 1.0 Duration A (N = 204,538) M (N = 196,345) 0.8 Probability of Survival 0.6 120 404 0.4 0.2 0.0 0 200 400 600 800 1000 Tenure (days) Subscribers since 2006-01-01 (N = 400,883 subscribers) LTR = (Daily Sales Price) * (Average Tenure) 13
- 14. Application: Average Tenure of Renewal Durations Subscription Stint Survival by Duration 1-Year 1-year 2-Year 2-year 3-Year 3-year Duration Half Life Average Probability Average Probability Average Probability (Months) (days) Tenure of Survival Tenure of Survival Tenure of Survival 1 149 180.9 0.217 NA NA NA NA 3 273 260.9 0.380 354.4 0.131 387.7 0.058 12 456 364.7 0.880 557.8 0.452 629.9 0.159 24 822 363.8 0.991 723.1 0.978 840.3 0.125 14
- 15. Application: Evaluating a Conversion TestWe test a great new design for the web site conversionpage flow & get 1050 conversions for new flow vs. 1000 forold flow - a 5% lift! A: Control Page Flow B: New Page Flow 2-Yr Daily 2-Yr Projected 2- Projected 2- Tenure ASP Value # Conv. Yr Value # Conv. Yr Value M 220 0.15 $33.00 500 $16,500 650 $21,450 Q 350 0.12 $42.00 300 $12,600 275 $11,550 A 560 0.09 $50.40 200 $10,080 125 $6,300Total 1000 $39,180 1050 $39,300Revenue Per Unit (2-Year) $39.18 $37.43 15
- 16. Application: Projected Value of BaseSo far we have looked at survival from start of subscription. Now weneed remaining survival for current subscribers. For example a 1-yearsubscriber: Just reset Ps to 1.0Projected n-year value of your current subscriber base is:For all subscribers, Sum their (daily sales price) * (Area under appropriate survival curve from their current tenure over the next n years) / (Ps (current tenure)) 16
- 17. Application: Thinking in Hazard SpacePicking Tenure Segment BoundariesSimple minded customer segmentation. But very useful! • Newbie: 0 - 100 days ___ • Builder: 101 - ___ days ___ 380 • Established: 381 - 735 days ___ ___ • Core: 736 days & up ___ Hazard Ratio USDeluxe Domestic Acom Hazard Rate 0.14 Duration A (N = 204,538) M (N = 196,345) 0.12 0.10 Hazard Rate 0.08 0.06 0.04 0.02 0.00 0 200 400 600 800 1000 Tenure (days) Subscribers since 2006-01-01 (N = 400,883 subscribers) 17
- 18. Application: Modeling in Hazard SpaceWhat-if Newbie Churn Could be Reduced by 10%? Red: Newbie Hazard down by Red: Newbie 10% 2-year value up by $10 18
- 19. Wrap-up. What we saw.• Introduction to classical survival• Subscribers & calculating their hazard & survival• Applying survival & hazard to real-world questions• See Appendix for links & some more details. 19
- 20. Questions?And, ping me later with questions orcomments: JPorzak@gmail.com 20
- 21. Appendix• Links to learn more• Retention vs Survival Curves
- 22. Subscription Survival Background• On subscription survival for customer intelligence: – Gordon Linoff’s 2004 Article from Intelligent Enterprise (now InformationWeek) http://www.data-miners.com/resources/Customer- Insight-Article.pdf – Will Potts’ technical white paper http://www.data- miners.com/resources/Will%20Survival.pdf – Berry & Linoff’s white paper for SAS, emphasizes forecasting application http://www.data- miners.com/companion/sas/forecastingWP_001.pdf – Chapter 10 in http://amzn.to/mRAVpp• Background on “classical” survival analysis: – http://en.wikipedia.org/wiki/Survival_analysis• R packages used – survival by Terry Therneau – ggplot2, plyr, lubridate by Hadley Wickham, et al – zoo, xts by Achim Zeileis, Jeffrey Ryan, et al 22
- 23. Retention vs Survival Curves• Retention Rate = 1 – Churn Rate• Churn Rate (typically) defined as: (# Subscribers leaving) (Average # subscribers) over some period.• Which means: – Ignores information out of period – Not monotonically decreasing 23
- 24. Traditional vs. Subscription SurvivalPropertity Traditional SubscriptionN Smallish (10^1 – 10^3) Large to huge (10^5 & up)Hazard Assumed continuous Usually spikySurvival curves Often Modeled EmpiricalSurvival Curve(s) Assumed continuous Usually stair step 24

No public clipboards found for this slide

×
### Save the most important slides with Clipping

Clipping is a handy way to collect and organize the most important slides from a presentation. You can keep your great finds in clipboards organized around topics.

Be the first to comment