Uplift models
Paulius Šarka
Image from http://www.supplychaincowboy.com/epic-warehouse-warning-signs/
tl;dr
Uplift model: directly models the impact of a treatment
Contents
1. Context for uplift models
2. What is an uplift model
3. Building an uplift model
Image from https://www.facebook.com/thechurn/
1. Context for uplift models
The X and the y
Train: X, y → m
Apply: X, m → p
The treatment
p
Image from https://mashable.com/2011/10/03/anti-spam-guide/
The effect
Image from http://scientificmarketer.com/2007/02/fundamental-campaign-segmentation.html/
Classical targets p
Uplift targets
Image from http://unitedforkliftllc.com/parts.html
2. What is an uplift model
Uplift model
Train: X, y, t → m
Apply: X, m → δ
Uplift model
Train: X, y, t → m
Apply: X, m → δ
t = 1 if targeted else 0
δ = P(churn | t = 1) - P(churn | t = 0)
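A minimal sketch of the definition of δ, assuming users were assigned to target and control at random. It estimates the population-level uplift only (per-user δ needs a model), and the toy data below is made up for illustration.

```python
def average_uplift(y, t):
    """Response rate among targeted users (t=1) minus the rate among control users (t=0)."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    if not treated or not control:
        raise ValueError("need at least one targeted and one control user")
    return sum(treated) / len(treated) - sum(control) / len(control)


# Toy data (made up): y = 1 means the user churned.
y = [1, 0, 0, 0, 1, 1, 0, 1]
t = [1, 1, 1, 1, 0, 0, 0, 0]

delta = average_uplift(y, t)  # 1/4 - 3/4 = -0.5: targeting reduced churn
```

A negative δ is the good outcome here, because the response being measured is churn.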
The effect (again)
Minimal Viable Pipeline
1. Some action ("Pls stay!")
2. Collect results (y, t)
3. Create features (X)
4. Train uplift model (m)
5. Validate model
6. Apply uplift model (δ)
7. More action ("Pls stay!")
8. More results
9. Profit!
Validation
● The true value of δ is unobservable (each user is either targeted or not, never both), so the usual metrics have no direct equivalents:
✘ MSE
✘ Accuracy & Precision
✘ Log loss
✘ ROC curve
● But some have!
✓ Lift curve
Lift curve of a classifier
1. Compute test set predictions
2. Sort users by predicted score
3. Plot a cumulative sum of
positive responses
Data needed: p, y
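The three steps above can be sketched in plain Python. The function name follows the `lift_curve(p, y)` call used later in the deck; the implementation is an illustrative assumption, not code from any particular library.

```python
def lift_curve(p, y):
    """Cumulative count of positive responses, users sorted by descending score."""
    order = sorted(range(len(p)), key=lambda i: p[i], reverse=True)
    curve, total = [], 0
    for i in order:
        total += y[i]
        curve.append(total)
    return curve


# Toy test-set predictions and outcomes (made up).
p = [0.9, 0.2, 0.7, 0.4]
y = [1, 0, 1, 0]

print(lift_curve(p, y))  # [1, 2, 2, 2]
```

A good classifier puts the positives first, so the curve rises steeply and then flattens.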
Uplift curve of an uplift model
1. Compute test set predictions
2. Sort users by predicted score
3. Plot a cumulative sum of
positive target responses minus
positive control responses
Data needed: δ, y, t
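A sketch of the same idea for an uplift model, again in plain Python. It assumes target and control groups of roughly equal size; with unequal groups the control responses would need rescaling, which is left out here. The name matches the `uplift_curve(δ, y, t)` call used later in the deck, but the body is an illustrative assumption.

```python
def uplift_curve(delta, y, t):
    """Cumulative (target positives - control positives), sorted by descending predicted uplift."""
    order = sorted(range(len(delta)), key=lambda i: delta[i], reverse=True)
    curve, total = [], 0
    for i in order:
        if y[i] == 1:
            # A positive response counts +1 in the target group, -1 in control.
            total += 1 if t[i] == 1 else -1
        curve.append(total)
    return curve


# Toy test set (made up): predicted uplift, observed response, treatment flag.
delta = [0.3, 0.1, -0.2, 0.25, 0.0, -0.1]
y = [1, 0, 1, 1, 1, 0]
t = [1, 1, 1, 0, 0, 0]

curve = uplift_curve(delta, y, t)  # [1, 0, 0, -1, -1, 0]
```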
One more time!
Uplift model | Binary classifier
P(y=1|X, t=1) - P(y=1|X, t=0) | P(y=1|X)
ulf.fit(X, y, t) | clf.fit(X, y)
ulf.predict_uplift(X) | clf.predict_proba(X)
uplift_curve(δ, y, t) | lift_curve(p, y)
Image from https://www.pinterest.co.uk/pin/144537469272890024/
3. Building an uplift model
Poor man’s uplift model
Split the dataset in two:
X_t, X_c = X[t == 1], X[t == 0]
y_t, y_c = y[t == 1], y[t == 0]
Train two models:
clf_t.fit(X_t, y_t); clf_c.fit(X_c, y_c)
Subtract:
δ = clf_t.predict_proba(X)[:, 1] - clf_c.predict_proba(X)[:, 1]
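The two-model recipe can be wrapped behind the `fit(X, y, t)` / `predict_uplift(X)` interface used earlier in the deck. This is a sketch assuming NumPy arrays and scikit-learn; `TwoModelUplift` is a hypothetical name, the choice of `LogisticRegression` is arbitrary, and the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


class TwoModelUplift:
    """Hypothetical wrapper: one classifier per group, uplift = difference of probabilities."""

    def __init__(self):
        self.clf_t = LogisticRegression()  # fitted on targeted users
        self.clf_c = LogisticRegression()  # fitted on control users

    def fit(self, X, y, t):
        X, y, t = np.asarray(X), np.asarray(y), np.asarray(t)
        self.clf_t.fit(X[t == 1], y[t == 1])
        self.clf_c.fit(X[t == 0], y[t == 0])
        return self

    def predict_uplift(self, X):
        X = np.asarray(X)
        # Column 1 of predict_proba holds P(y = 1 | X) for 0/1 labels.
        return self.clf_t.predict_proba(X)[:, 1] - self.clf_c.predict_proba(X)[:, 1]


# Synthetic data (made up): treatment raises the response rate
# only for users whose first feature is positive.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
t = rng.integers(0, 2, size=n)
p = 0.3 + 0.3 * t * (X[:, 0] > 0)
y = (rng.random(n) < p).astype(int)

ulf = TwoModelUplift().fit(X, y, t)
delta = ulf.predict_uplift(X)
```

Each classifier is tuned to predict the response, not the difference between responses, which is one reason the deck calls this the poor man's approach.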
Better: hack a decision tree
● Find a way to feed t
● Modify the splitting rule
Image from https://www.linkedin.com/pulse/how-does-id3-algorithm-works-decision-trees-sagarnil-das/
Better: hack a decision tree
➕ Positive uplift to the left
➖ Negative uplift to the right*
*conditions apply
» More ideas in the papers!
Image from https://www.linkedin.com/pulse/how-does-id3-algorithm-works-decision-trees-sagarnil-das/
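One way to picture a modified splitting rule: score a candidate split by how far apart the uplift in the two children ends up. The sketch below uses the absolute difference in uplift as the score, which is only one simple choice; the papers under Resources discuss more refined criteria. The function names and toy data are assumptions made for illustration.

```python
def uplift(y, t):
    """Response rate of targeted minus control users, or None if a group is empty."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    if not treated or not control:
        return None
    return sum(treated) / len(treated) - sum(control) / len(control)


def split_score(x, y, t, threshold):
    """Absolute difference in uplift between the children x <= threshold and x > threshold."""
    left = [i for i, xi in enumerate(x) if xi <= threshold]
    right = [i for i, xi in enumerate(x) if xi > threshold]
    u_left = uplift([y[i] for i in left], [t[i] for i in left])
    u_right = uplift([y[i] for i in right], [t[i] for i in right])
    if u_left is None or u_right is None:
        return 0.0  # a child without both groups cannot be scored
    return abs(u_left - u_right)


def best_threshold(x, y, t):
    """Threshold on a single feature with the highest split score, or None if x is constant."""
    candidates = sorted(set(x))[:-1]
    if not candidates:
        return None
    return max(candidates, key=lambda th: split_score(x, y, t, th))


# Toy data (made up): treatment helps when x <= 2 and hurts when x > 2.
x = [1, 1, 2, 2, 3, 3, 4, 4]
t = [1, 0, 1, 0, 1, 0, 1, 0]
y = [1, 0, 1, 0, 0, 1, 0, 1]

threshold = best_threshold(x, y, t)  # 2
```

Note that the treatment flag t enters only through the score, never as a feature to split on.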
Also
✓ Bagging
✓ Random forest
✓ Feature importances*
✓ Per-user feature importances*
* probably
Resources
1999, N.Radcliffe, P.Surry, Differential Response Analysis: Modeling True Responses by Isolating the Effect of a Single Action
2002, B.Hansotia, B.Rukstales, Incremental Value Modeling
2007, N.Radcliffe, Using Control Groups to Target on Predicted Lift: Building and Assessing Uplift Models
2010, P.Rzepakowski, S.Jaroszewicz, Decision trees for uplift modeling
2011, N.Radcliffe, P.Surry, Real-World Uplift Modelling with Significance-Based Uplift Trees
2012, P.Rzepakowski, S.Jaroszewicz, Decision trees for uplift modeling with single and multiple treatments
2015, L.Guelman, M.Guillen, M.Perez-Marin, Uplift Random Forests
2015, M.Soltys, S.Jaroszewicz, P.Rzepakowski, Ensemble methods for uplift modeling
2017, W.Verbeke, C.Bravo, B.Baesens, Profit Driven Business Analytics: A Practitioner's Guide to Transforming Big Data into Added Value (Chapter 4)
Wiki page: https://en.wikipedia.org/wiki/Uplift_modelling
R package: http://cran.r-project.org/web/packages/uplift/index.html
Some Python code: https://github.com/psarka/uplift
