# My Entry to the DMEF CLV Contest

As part of my master's thesis, "Stochastic Models of Noncontractual Consumer Relationships", I participated in a contest organized by the DMEF to forecast customer lifetime value (CLV). My submitted model finished second out of 25 entries. These slides summarize my approach and the final model.

### My Entry to the DMEF CLV Contest

1. THE DMEF CLV COMPETITION AND HOW I ENDED UP IN 2ND PLACE
2. THE CHALLENGE: non-contractual setting, non-observable status. [Timeline: donations (\$) recorded from 1.1.2002 to 31.8.2006; donations from 31.8.2006 to 31.8.2008 unknown (?\$).]
3. THE CHALLENGE: 21,000 DONORS acquired in the first half of 2002, 54,000 DONATIONS until mid-2006. [Timeline: donations (\$) recorded from 1.1.2002 to 31.8.2006; donations from 31.8.2006 to 31.8.2008 unknown (?\$).]
4. THE GAME PLAN
    - Understand the data set ➙ EDA
    - Split estimation for # transactions and \$ value
    - Implement parametric stochastic models: NBD, Pareto/NBD, BG/NBD, CBG/NBD, ...
    - Benchmark data fit and predictive power
    - Try to improve predictive power
5. THE DATA SET: SAMPLED TIMING PATTERNS. [Figure: "Various Timing Patterns"; one row of donation tick marks per sampled donor ID, on a time scale from 2002 to 2006.]
6. THE DATA SET: TRENDS AT AGGREGATE LEVEL. [Figure: two panels, "Nr of Donations" and "Avg Donation Amount", each plotted over time from 2002 to 2006. Annotations in the figure: 13%, 15%, 14%, +24%, 10%, +12%.]
7. THE DATA SET: TRENDS AT AGGREGATE LEVEL. [Figure: two panels over the years 2002 to 2005. "Percentage of Donors who Have Donated Within that Year", with plotted values 27.8%, 29.5%, 23.5%, 18.8%; "Average Nr of Donations per Active Donor", with plotted values 1.55, 1.46, 1.51, 1.42.]
8. THE DATA SET: INTERTRANSACTION TIMES. [Figure: "Overall Distribution of Intertransaction Times"; count by number of months in between donations (0 to 51), with the bars at 1, 12, and 24 months labeled.]
9. THE MODELS: NBD ASSUMPTIONS (1959)
    - A) The number of transactions follows a Poisson process with rate λ
    - B) Heterogeneity in λ follows a Gamma distribution with shape parameter r and rate parameter α

    "While there is not enough information to reliably estimate the purchase rate for each person, there will generally be enough to estimate the distribution of it over customers."
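Assumptions A and B together imply that the number of transactions in a period of length t follows a negative binomial distribution. The following is a minimal sketch of that relationship, not code from the thesis: it compares the closed-form NBD probability with a simulation of the Poisson-Gamma mixture. The function names and the parameter values `R`, `ALPHA`, and `T` are illustrative assumptions.

```python
import math
import random

def nbd_pmf(x, r, alpha, t):
    """P(X(t) = x) when counts are Poisson(lambda * t) and lambda ~ Gamma(shape=r, rate=alpha)."""
    log_p = (math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
             + r * math.log(alpha / (alpha + t))
             + x * math.log(t / (alpha + t)))
    return math.exp(log_p)

def simulate_count(r, alpha, t, rng):
    """Draw one customer's rate, then count Poisson arrivals in (0, t]."""
    lam = rng.gammavariate(r, 1.0 / alpha)  # gammavariate expects shape and SCALE = 1 / rate
    count = 0
    clock = rng.expovariate(lam)            # exponential waiting time until the first arrival
    while clock <= t:
        count += 1
        clock += rng.expovariate(lam)
    return count

# Illustrative parameter values (assumptions, not the estimates from the slides).
R, ALPHA, T = 2.0, 4.0, 3.0
rng = random.Random(42)
counts = [simulate_count(R, ALPHA, T, rng) for _ in range(20000)]
share_zero = counts.count(0) / len(counts)
print(f"simulated P(X=0) = {share_zero:.3f}, NBD P(X=0) = {nbd_pmf(0, R, ALPHA, T):.3f}")
```

The simulated share of customers without any transaction should be close to the closed-form value, and the simulated mean should be close to r·t/α.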
10. THE MODELS: NBD - ESTIMATION. r = 0.475, α = 489.5; avg IPT (interpurchase time): 2.9 years, med IPT: 6.6 years
11. THE MODELS: PARETO/NBD ASSUMPTIONS (1987)
    - NBD part:
        - A) The number of transactions follows a Poisson process with rate λ
        - B) Heterogeneity in λ follows a Gamma distribution with shape parameter r and rate parameter α
    - Pareto part:
        - C) Customer lifetime is exponentially distributed with death rate μ
        - D) Heterogeneity in μ follows a Gamma distribution with shape parameter s and rate parameter β
    - E) λ and μ are distributed independently
12. THE MODELS: BG/NBD ASSUMPTIONS (2005)
    - A) The number of transactions follows a Poisson process with rate λ
    - B) Heterogeneity in λ follows a Gamma distribution with shape parameter r and rate parameter α
    - C) Directly after each purchase there is a constant drop-out probability p
    - D) Heterogeneity in p follows a Beta distribution with parameters a and b
    - E) λ and p are distributed independently
13. THE MODELS: CBG/NBD ASSUMPTIONS (2007)
    - A) The number of transactions follows a Poisson process with rate λ
    - B) Heterogeneity in λ follows a Gamma distribution with shape parameter r and rate parameter α
    - C) At time zero and directly after each purchase there is a constant drop-out probability p
    - D) Heterogeneity in p follows a Beta distribution with parameters a and b
    - E) λ and p are distributed independently
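The only difference between the BG/NBD and CBG/NBD assumptions is the additional drop-out opportunity at time zero. A minimal simulation sketch of both variants is given here, assuming the standard reading of the assumptions; the function names, the `dropout_at_zero` flag, and all parameter values are hypothetical and chosen for illustration only.

```python
import random

def simulate_purchases(lam, p, horizon, rng, dropout_at_zero):
    """Repeat-purchase times in (0, horizon] for one customer with fixed lam and p.

    dropout_at_zero=False follows the BG/NBD timing of the drop-out coin flip,
    dropout_at_zero=True the CBG/NBD variant with an extra flip at time zero.
    """
    times = []
    if dropout_at_zero and rng.random() < p:
        return times                      # customer is inactive from the start
    clock = rng.expovariate(lam)          # exponential waiting time (Poisson process)
    while clock <= horizon:
        times.append(clock)
        if rng.random() < p:              # drop-out opportunity directly after a purchase
            break
        clock += rng.expovariate(lam)
    return times

def simulate_customer(r, alpha, a, b, horizon, rng, dropout_at_zero):
    """Draw lam ~ Gamma(shape=r, rate=alpha) and p ~ Beta(a, b) independently, then simulate."""
    lam = rng.gammavariate(r, 1.0 / alpha)  # gammavariate expects shape and scale
    p = rng.betavariate(a, b)
    return simulate_purchases(lam, p, horizon, rng, dropout_at_zero)

# Illustrative parameters (assumptions, not estimates from the contest data).
rng = random.Random(1)
bg = [len(simulate_customer(2.0, 4.0, 1.0, 3.0, 10.0, rng, False)) for _ in range(5000)]
cbg = [len(simulate_customer(2.0, 4.0, 1.0, 3.0, 10.0, rng, True)) for _ in range(5000)]
print("share with zero repeat purchases:",
      bg.count(0) / 5000, "(BG/NBD)", cbg.count(0) / 5000, "(CBG/NBD)")
```

With identical parameters, the CBG/NBD variant produces a larger share of customers without repeat purchases, because some customers drop out before their first opportunity to buy again.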
14. THE BENCHMARK: DATA FIT. [Figure: "Actual vs Fitted Frequency of Repeat Transactions"; frequency by number of repeat transactions (0 to 7+) for Observed, NBD, Pareto/NBD, BG/NBD, and CBG/NBD.] Reported fit statistics: χ² = 366.1 (NBD), 391.5 (Pareto/NBD), 487.2 (BG/NBD), 363.7 (CBG/NBD).
15. THE BENCHMARK: PREDICTIVE POWER. [Figure: "Time Split"; the timeline from 2002 to 2006 is divided into a calibration period followed by a validation period.]
16. THE BENCHMARK: PREDICTIVE POWER
    - MSLE = Mean Squared Logarithmic Error
    - RMSE = Root Mean Squared Error
    - MAE = Mean Absolute Error
    - Corr = Correlation
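The four error measures can be sketched in a few lines. The slides give only the names, so the exact forms are assumptions: MSLE is computed on log(1 + x) so that donors with zero donations are allowed, and Corr is taken to be the Pearson correlation. The sample data is hypothetical.

```python
import math

def msle(actual, predicted):
    """Mean squared logarithmic error on log(1 + x) (assumed form)."""
    return sum((math.log1p(a) - math.log1p(p)) ** 2
               for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def corr(actual, predicted):
    """Pearson correlation coefficient (assumed form of 'Corr')."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Hypothetical data: actual vs. predicted number of donations for five donors.
actual = [0, 1, 0, 3, 2]
predicted = [0.2, 0.8, 0.1, 2.0, 2.5]
for name, metric in [("MSLE", msle), ("RMSE", rmse), ("MAE", mae), ("Corr", corr)]:
    print(f"{name}: {metric(actual, predicted):.4f}")
```

Lower values are better for MSLE, RMSE, and MAE, while a higher value is better for Corr.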
17. THE PROBLEM: A SIMPLE LINEAR MODEL
18. THE APPROACH: INVESTIGATE THE ERRORS. [Figure: two panels, "Timing Patterns for the 10 Worst Underestimated Donors" and "Timing Patterns for the 10 Worst Overestimated Donors"; each shows one row of donation tick marks per donor across the calibration period and the validation period.]
19. REGULARITY: IT'S NOT JUST ABOUT RECENCY AND FREQUENCY. Two users can have the same recency and frequency, yet one of them is more likely to be active after T.
20. THE POISSON PROCESS: PROBLEMATIC IMPLICATIONS. Poisson implies exponentially distributed IPT (interpurchase times):
    - Mode zero: the most likely time of purchase is immediately after a purchase. No dead period.
    - Memoryless property: no regularity within timing patterns. Succeeding interpurchase times are assumed to be uncorrelated.
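Both implications follow directly from the exponential density of the interpurchase time T with rate λ. As a brief derivation, the density is strictly decreasing, so its mode is at zero:

$$
f(t) = \lambda e^{-\lambda t}, \quad t \ge 0, \qquad f'(t) = -\lambda^{2} e^{-\lambda t} < 0 .
$$

The memoryless property states that the time already waited carries no information about the remaining waiting time:

$$
P(T > s + t \mid T > s) = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(T > t) .
$$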
21. THE SOLUTION: CBG/CNBD-K ASSUMPTIONS (2008)
    - A) While active, transactions occur with Erlang-k (rate parameter λ) distributed waiting times
    - B) Heterogeneity in λ follows a Gamma distribution with shape parameter r and rate parameter α
    - C) Directly after each purchase there is a constant drop-out probability p
    - D) Heterogeneity in p follows a Beta distribution with parameters a and b
    - E) λ and p are distributed independently
22. THE SOLUTION: ERLANG-K. [Figure: four panels of simulated timing patterns, labeled Erlang 1, Erlang 2, Erlang 3, and Erlang 100, each on a time axis from 0 to 5.]
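The effect of k on regularity can be illustrated with a short simulation. This is a sketch under stated assumptions, not the code behind the slide: an Erlang-k waiting time is drawn as the sum of k exponential stages, and the stage rate is scaled by k (an illustrative choice) so that every k has the same mean waiting time. The coefficient of variation then shrinks as 1/sqrt(k).

```python
import random
import statistics

def erlang_wait(k, lam, rng):
    """One Erlang-k waiting time: the sum of k independent Exponential(lam) stages."""
    return sum(rng.expovariate(lam) for _ in range(k))

def coefficient_of_variation(samples):
    """Standard deviation relative to the mean; lower values mean more regular timing."""
    return statistics.pstdev(samples) / statistics.fmean(samples)

rng = random.Random(7)
for k in (1, 2, 3, 100):
    # Scale the stage rate by k so that every k has a mean waiting time of 1.
    waits = [erlang_wait(k, float(k), rng) for _ in range(5000)]
    print(f"k={k:>3}: mean={statistics.fmean(waits):.3f}, "
          f"CV={coefficient_of_variation(waits):.3f}, "
          f"theory 1/sqrt(k)={k ** -0.5:.3f}")
```

For k = 1 the waiting times are exponential, as in the Poisson process; for large k they become almost deterministic.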
23. THE SOLUTION: CBG/CNBD-K - 2008
24. REGULARITY MEASURES: ESTIMATING 'K'. [Figure: two panels. "Distribution of Estimated Gamma Shape Parameters", over shape parameter r from 0 to 10, with markers at r=1 (exponential IPTs) and r=2 (Erlang 2 IPTs). "Regularity Measure M", density over M from 0.0 to 1.0, showing the actual distribution of M, the distribution of M for r=2, and the distribution of M for r=1.]
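The slide does not define M, so the following sketch rests on an assumption: M is taken to be the regularity measure of Wheat and Morrison, M = IPT₁ / (IPT₁ + IPT₂) for two consecutive interpurchase times. Under Gamma-distributed IPTs with shape r, this M follows a Beta(r, r) distribution, which is uniform for r = 1. The moment-based estimator of r is an illustrative choice and not necessarily the estimator used in the thesis.

```python
import random
import statistics

def regularity_m(ipt1, ipt2):
    """Assumed definition of M: share of the first of two consecutive interpurchase times."""
    return ipt1 / (ipt1 + ipt2)

def estimate_shape(m_values):
    """Method-of-moments estimate of r, using Var(M) = 1 / (4 * (2r + 1)) for M ~ Beta(r, r)."""
    variance = statistics.pvariance(m_values)
    return (1.0 / (4.0 * variance) - 1.0) / 2.0

rng = random.Random(3)
for true_r in (1, 2, 3):
    # gammavariate(shape, scale): Gamma IPTs with integer shape are Erlang distributed.
    ms = [regularity_m(rng.gammavariate(true_r, 1.0), rng.gammavariate(true_r, 1.0))
          for _ in range(20000)]
    print(f"true r={true_r}: estimated r={estimate_shape(ms):.2f}")
```

The scale of the IPTs cancels out in M, so the measure depends only on the shape parameter, which is what makes it usable for choosing k.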
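25. THE BENCHMARK

    | Model      | MSLE   | RMSE  | MAE   | Corr  | SUM  |
    |------------|--------|-------|-------|-------|------|
    | LM         | 0.0863 | 0.642 | 0.262 | 0.644 | -31% |
    | Pareto/NBD | 0.0977 | 0.653 | 0.359 | 0.628 | +22% |
    | BG/NBD     | 0.0963 | 0.651 | 0.362 | 0.640 | +19% |
    | CBG/NBD    | 0.0959 | 0.650 | 0.360 | 0.639 | +19% |
    | CBG/CNBD-2 | 0.0831 | 0.632 | 0.293 | 0.660 | -11% |
    | CBG/CNBD-3 | 0.0816 | 0.637 | 0.275 | 0.663 | -24% |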
26. THE CONTEST: PARTICIPANTS
    - Companies: DataLab, Targetbase, Hewlett-Packard, SAS, Alliance Data, Thinkanalytics, LLC, DK Shiffet & Assoc Ltd.
    - US Universities: U Pennsylvania, U Connecticut, UT Dallas, U Washington, OK State, Old Dominion U, Georgia State, SUNY New Paltz, U Wisconsin W
    - International Universities: U Frankfurt, Tech Uni Munich, Leuven, PUC Chile, U Duisburg-Essen, Comenius U, BU Vienna
27. THE CONTEST: MODELS
    - Ad Hoc
    - Linear Regression
    - Hierarchical Bayesian
    - BG/NBD, MBG-NBD, CBG-NBD, Pareto/NBD
    - Bayesian Seemingly Unrelated Regressions
    - Probit / logistic regression
    - Tobit
    - ARIMA
    - ArtXP Time Series
    - Support Vector Machines
    - Trees
    - Kohonen Networks
    - Feedforward Neural Networks
    - Stochastic Microanalytical Simulations

    No Markov chain models though.
28. THE CONTEST: OUTCOME. TASK 1: CUSTOMER EQUITY