Uplift Modelling as a Tool for Making Causal Inferences at Shopify - Mojan Hamed
The document presents a talk by Mojan Hamed on uplift modeling, which focuses on identifying individuals whose behavior can be positively influenced through targeted marketing efforts. It details methodologies for estimating the causal effect of treatments on individuals and discusses various approaches for modeling, including the use of decision trees and binary outcomes. The aim is to improve marketing effectiveness by tailoring strategies to those most likely to respond incrementally to campaigns.
“most targeted marketing activity today, even when measured on the basis of incremental impact, is targeted on the basis of non-incremental models.”
Radcliffe & Surry [1]
Response model: “who will respond positively?”
Uplift model: “who will respond positively, that would not have if not treated?”
Thus uplift modelling focuses on which individuals’ behaviour you can influence.
target on the basis of incrementality
we may be tempted to use a response (propensity) model
[Diagram: training population, pilot campaign, model (e.g. logistic regression), test population]
● run a randomized experiment with the treatment
○ can infer from data if unconfoundedness condition is met (out of scope)
● train a model that can estimate the causal effect (e.g. CATE)
● evaluate model on incremental gains
● extend model to score individuals on an ongoing basis
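The first step above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the workflow used in the talk: the function names, the 10% baseline conversion rate and the 5-point treatment effect are all hypothetical, and the outcomes are simulated. Random assignment is what allows the difference in mean outcome between the two groups to be read as an average treatment effect.

```python
import random

def assign_treatment(n, p_treat=0.5, seed=0):
    """Randomly assign each of n individuals to treatment (1) or control (0)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_treat else 0 for _ in range(n)]

def average_treatment_effect(outcomes, treatment):
    """Difference in mean outcome between the treated and control groups."""
    treated = [y for y, w in zip(outcomes, treatment) if w == 1]
    control = [y for y, w in zip(outcomes, treatment) if w == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

# Hypothetical simulation: 10% baseline conversion, treatment adds 5 points.
rng = random.Random(1)
treatment = assign_treatment(20000)
outcomes = [1 if rng.random() < (0.15 if w else 0.10) else 0 for w in treatment]
ate = average_treatment_effect(outcomes, treatment)
print(f"estimated average treatment effect: {ate:.3f}")
```

The estimate is an average over the whole population; the remaining steps are about breaking it down by individual features.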
Conditional Average Treatment Effect (CATE)
[Equation labels: $X_i$ is the vector of features; $Y_i$ is the individual outcome]
This is the
“What would a person have done in an alternative universe where we didn’t treat them?”
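For reference, the CATE can be written in the notation the talk uses for its models. This is the standard textbook definition, given here as an assumption about what the slide showed rather than a copy of it; the symbol $\tau$ is introduced only for illustration.

```latex
\tau(x) = E\left[\, Y_i(\text{treatment}) - Y_i(\text{control}) \mid X_i = x \,\right]
```

For any one individual only one of $Y_i(\text{treatment})$ and $Y_i(\text{control})$ is ever observed, which is why the quantity has to be estimated from groups of similar individuals in a randomized experiment.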
Three methods
Two-model approach
Train one model for $E[Y_i(\text{treatment}) \mid X_i]$ and another for $E[Y_i(\text{control}) \mid X_i]$
Well known implementation: CausalLift
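[Diagram: the experiment population is split into treatment and control groups; Model A is trained on the treatment group and Model B on the control group]
Model A: $E[Y_i(\text{treatment}) \mid X_i]$
Model B: $E[Y_i(\text{control}) \mid X_i]$
Uplift: Model A - Model B
* to score individuals going forward, take the Δ of the two models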
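A minimal sketch of the two-model idea, under simplifying assumptions: each "model" here is just the mean outcome per customer segment, standing in for a real classifier such as logistic regression. The function names, the segment labels and the data are hypothetical, and this is not the CausalLift API.

```python
from collections import defaultdict

def fit_segment_rates(rows):
    """Toy 'model': mean outcome per segment (a stand-in for a classifier)."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [sum of outcomes, count]
    for segment, outcome in rows:
        totals[segment][0] += outcome
        totals[segment][1] += 1
    return {seg: s / n for seg, (s, n) in totals.items()}

def two_model_uplift(data):
    """data: rows of (segment, treated, outcome) from a randomized experiment."""
    data = list(data)
    model_a = fit_segment_rates((x, y) for x, w, y in data if w == 1)  # treatment
    model_b = fit_segment_rates((x, y) for x, w, y in data if w == 0)  # control
    # Uplift score = Model A prediction - Model B prediction.
    return {x: model_a[x] - model_b[x] for x in model_a if x in model_b}

# Hypothetical experiment data: (segment, treated, outcome).
data = [
    ("new", 1, 1), ("new", 1, 1), ("new", 1, 0), ("new", 1, 1),
    ("new", 0, 0), ("new", 0, 1), ("new", 0, 0), ("new", 0, 0),
    ("loyal", 1, 1), ("loyal", 1, 1), ("loyal", 1, 0), ("loyal", 1, 1),
    ("loyal", 0, 1), ("loyal", 0, 1), ("loyal", 0, 1), ("loyal", 0, 0),
]
uplift = two_model_uplift(data)
print(uplift)
```

In this toy data both segments respond at the same rate when treated, so a response model would rank them equally, while the uplift scores show that only the "new" segment responds incrementally.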
● Very simple, intuitive
● Can work well for lower complexity problems
● Disadvantages:
○ Behaviour of the uplift may differ from individual classifiers
○ Fitting to the main effect may miss “weaker” uplift signal
○ Variable selection and weighting differ between models
Class transformation
● Requires a balanced binary outcome variable
● Derive a transformed outcome and train a model to optimize for it
○ Well known implementation: Pylift
* W is the treatment indicator (0 or 1) and p = P(W = 1)
* to score individuals going forward, simply fit a single model
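A commonly used form of the transformed outcome, assumed here rather than taken from the talk, is Y* = Y (W - p) / (p (1 - p)). Under random assignment, its conditional mean equals the uplift, so a single model trained on Y* scores uplift directly. The sketch below uses a per-segment mean as a stand-in for a real regressor; the function names and data are hypothetical and this is not the Pylift API.

```python
from collections import defaultdict

def transformed_outcome(y, w, p):
    """Y* = Y * (W - p) / (p * (1 - p)); its conditional mean is the uplift."""
    return y * (w - p) / (p * (1 - p))

def fit_mean_by_segment(rows):
    """Toy single model: predict the mean transformed outcome per segment."""
    totals = defaultdict(lambda: [0.0, 0])  # segment -> [sum of targets, count]
    for segment, target in rows:
        totals[segment][0] += target
        totals[segment][1] += 1
    return {seg: s / n for seg, (s, n) in totals.items()}

# Hypothetical experiment data: (segment, treated, outcome).
data = [
    ("new", 1, 1), ("new", 1, 1), ("new", 1, 0), ("new", 1, 1),
    ("new", 0, 0), ("new", 0, 1), ("new", 0, 0), ("new", 0, 0),
    ("loyal", 1, 1), ("loyal", 1, 1), ("loyal", 1, 0), ("loyal", 1, 1),
    ("loyal", 0, 1), ("loyal", 0, 1), ("loyal", 0, 1), ("loyal", 0, 0),
]
p = sum(w for _, w, _ in data) / len(data)  # share of individuals treated
model = fit_mean_by_segment(
    (x, transformed_outcome(y, w, p)) for x, w, y in data
)
print(model)
```

With p = 0.5 the transformed outcome is +2 for treated responders, -2 for control responders and 0 otherwise, so averaging it within a segment recovers the treated response rate minus the control response rate.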
Uplift decision trees
● in general, the goal of a decision tree is to:
○ minimize the Δ in size between splits
○ maximize the Δ in value between splits (homogeneity)
● for uplift we add one additional criterion:
○ maximize the Δ of control & treatment between splits
● aka “difference of differences”
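The "difference of differences" criterion can be illustrated with a deliberately simplified split search: for each candidate threshold, compute the uplift (treated response rate minus control response rate) on each side and keep the threshold where the two uplifts differ most. This is a sketch under stated assumptions; it ignores the split-size criterion, the function names and data are hypothetical, and the papers in the reference list describe more refined criteria.

```python
def uplift(rows):
    """Treated response rate minus control response rate for rows of (w, y)."""
    treated = [y for w, y in rows if w == 1]
    control = [y for w, y in rows if w == 0]
    if not treated or not control:
        return None  # uplift is undefined without both groups
    return sum(treated) / len(treated) - sum(control) / len(control)

def best_split(data, thresholds):
    """data: rows of (x, w, y). Returns (threshold, gain) maximizing
    |uplift(left) - uplift(right)|, or None if no valid split exists."""
    best = None
    for t in thresholds:
        left = [(w, y) for x, w, y in data if x <= t]
        right = [(w, y) for x, w, y in data if x > t]
        u_left, u_right = uplift(left), uplift(right)
        if u_left is None or u_right is None:
            continue
        gain = abs(u_left - u_right)
        if best is None or gain > best[1]:
            best = (t, gain)
    return best

# Hypothetical data: (feature value, treated, outcome).
# Individuals with x <= 2 respond only when treated; the rest are unaffected.
data = [
    (1, 1, 1), (1, 1, 1), (1, 0, 0), (1, 0, 0),
    (2, 1, 1), (2, 1, 1), (2, 0, 0), (2, 0, 0),
    (3, 1, 1), (3, 1, 0), (3, 0, 1), (3, 0, 0),
    (4, 1, 1), (4, 1, 0), (4, 0, 1), (4, 0, 0),
]
split = best_split(data, thresholds=[1, 2, 3])
print(split)
```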
we are attempting to validate the counterfactual, an event
that never happened, so there is no “ground truth”.
first plot your curve, using a random ordering of individuals
[Plot: cumulative incremental response (y-axis) vs. percent of population targeted (x-axis); series: random]
then, plot the uplift curve by ordering individuals by descending incremental gain
[Plot: cumulative incremental response (y-axis) vs. percent of population targeted (x-axis); series: random, class transformation]
you can compare many methods on the same plot!
[Plot: cumulative incremental response (y-axis) vs. percent of population targeted (x-axis); series: random, class transformation, two-model approach]
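The curves described above can be computed without a ground truth for any individual. The sketch below uses one common, Qini-style definition of cumulative incremental response, which is an assumption here rather than the exact formula from the talk: among the top-scored individuals, count treated responders and subtract control responders scaled to the size of the treated group. The function name and data are hypothetical.

```python
def cumulative_incremental_response(scores, treatment, outcomes, n_points=10):
    """Return [(fraction_targeted, incremental_responses)] when individuals
    are targeted in order of descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    curve = [(0.0, 0.0)]
    for step in range(1, n_points + 1):
        k = round(len(order) * step / n_points)
        top = order[:k]
        n_t = sum(treatment[i] for i in top)
        n_c = len(top) - n_t
        r_t = sum(outcomes[i] for i in top if treatment[i] == 1)
        r_c = sum(outcomes[i] for i in top if treatment[i] == 0)
        # Scale control responders to the size of the treated group;
        # with no controls in the slice there is nothing to subtract.
        gain = r_t - r_c * n_t / n_c if n_c else float(r_t)
        curve.append((step / n_points, gain))
    return curve

# Hypothetical scored experiment data, one entry per individual.
scores    = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
treatment = [1,   0,   1,   0,   1,   0,   1,   0]
outcomes  = [1,   0,   1,   0,   1,   1,   0,   0]
curve = cumulative_incremental_response(scores, treatment, outcomes, n_points=4)
for fraction, gain in curve:
    print(f"{fraction:.2f} of population targeted -> {gain:.1f} incremental responses")
```

Passing randomly generated scores to the same function gives the baseline curve, and calling it once per model's scores gives one curve per method for plotting on the same axes.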
if stakeholder buy-in is an issue, you can run an experiment that compares targeting randomly with targeting based on uplift
the uplift-based targeting should come out ahead with statistical significance, since it is targeting to maximize incremental lift
[1] Real-World Uplift Modelling with Significance-Based Uplift Trees
[2] Causal Inference and Uplift Modeling
[3] Uplift Modeling in Direct Marketing
[4] Decision trees for uplift modeling with single and multiple treatments
[1] Pylift
[2] CausalLift