Open Data Science Conference, Boston 2015
@opendatasci
Victor S.Y. Lo
May, 2015
Machine Learning Based Personalization Using Uplift Analytics: Examples and Applications
Uplift Modeling Workshop
Outline
• Why do we need Uplift modeling? (10 min)
• Various methods for Uplift modeling (30 min)
• Break (5 min)
• Direct response vs. Uplift modeling (10 min)
• Prescriptive Analytics for Multiple Treatments (20 min)
• Q&A (10 min)
Disclaimer:
This presentation does not represent
the views or opinions of Fidelity
Investments
Response Modeling
[Figure: response rate by decile; x-axis: Decile (1-10); y-axis: Response rate (0%-10%); average = 2.5%]
Top decile lift (over random) = 4 times
Top 3 deciles lift = 2.6 times
Big Lift
Modelers: VERY SUCCESSFUL MODEL!
Campaign Results
            Top 3 Deciles   Random
Treatment   6.7%            2.5%
Control     6.7%            2.5%
Lift        0.0%            0.0%
No Lift
Marketers: VERY DISAPPOINTING!
Modelers: Not my problem, it is the mail design!
So, Who is Right?
What’s wrong with this picture?
[Figure: "A successful response model", chart by decile (1-10)]
[Figure: "A successful marketing campaign", Treatment Response Rate vs. Control Response Rate for Test 1, Test 2, and Total; data labels as extracted: 3.3%, 2.7%, 3.0%, 2.3%, 1.7%, 2.0%]
[Figure: Incidence of Treatment Responders by decile (1-10): 14%, 7%, 4%, 2%, 1%, 1%, 0%, 0%, 0%, 0%]
[Figure: cumulative gains chart; x-axis: Pct of Treatment Group (0%-100%); y-axis: Pct of Treatment Responders (0%-100%); series: CUME Pct of Responders, Random]
DM LIFT?
Motivation
• Based on the following campaign result, which of the customer groups is the best for future targeting?
Response Rate By Age and Treatment/Control
Age     Treatment   Control   Difference
<35     0.5%        0.2%      0.3%
35-60   2.5%        0.5%      2.0%
>60     3.5%        2.5%      1.0%
• >60 has the highest response rate – treatment-only focus (common practice)
• 35-60 has the highest Lift (most positively influenced by the treatment)
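The comparison on this slide can be reproduced in a few lines. This is only an illustrative sketch: the dictionary layout and the function name `lift_by_segment` are assumptions, and the rates are the ones shown in the table above.

```python
# Response rates in percent, taken from the table above: segment -> (treatment, control).
rates = {
    "<35": (0.5, 0.2),
    "35-60": (2.5, 0.5),
    ">60": (3.5, 2.5),
}

def lift_by_segment(rates):
    """Return lift (treatment minus control response rate) for each segment."""
    return {seg: round(t - c, 4) for seg, (t, c) in rates.items()}

lifts = lift_by_segment(rates)
best_by_response = max(rates, key=lambda seg: rates[seg][0])  # treatment-only view
best_by_lift = max(lifts, key=lifts.get)                      # uplift view

print("Highest treatment response rate:", best_by_response)
print("Highest lift:", best_by_lift)
```

The two rankings disagree, which is the point of the slide: ranking on the treatment response rate alone picks a different segment than ranking on treatment minus control.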
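Framework for Causal and Association Analysis
[Figure: 2x2 grid; vertical axis: From Association to Causality; horizontal axis: Granularity (Population / Sub-population vs. Personalized)]
• Causal, Population / Sub-population: Causal Inference (Lift Analysis, Average Treatment Effect)
• Causal, Personalized: Uplift Modeling (Heterogeneous Treatment Effect, Effect Modification)
• Association, Population / Sub-population: Reporting / Summary Statistics
• Association, Personalized: Response Modeling / Propensity Modeling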
The Uplift Model Objective
• Maximize the Treatment responders while minimizing the control “responders”
[Figure (hypothetical data): two panels by decile (1-10), each showing Treatment "Responders" vs. Control "Responders" with the true lift marked. Left: a standard response model. Right: an uplift response model (ideal).]
Uplift Approaches
[Figure: two flow diagrams, Traditional Approach and Uplift Modeling, each with the boxes Previous campaign data (Control, Treatment), Training data set, Holdout data set, and Model. Source: Lo (2002)]
Uplift model solutions
0. Baseline results: Standard response model –
treatment-only (as a benchmark)
1. Two Model Approach: Take difference of two
models, Treatment Minus Control
2. Treatment Dummy Approach: Single combined
model using treatment interactions
3. Four Quadrant Method
Method 1: Two Model Approach: Treatment - Control
• Model 1 predicts P(R | Treatment)
• Model Sample = Treatment Group
• Model 2 predicts P(R | no Treatment)
• Model Sample = Control Group
• Final prediction of lift = Treatment Response Score – Control Response Score
• Pros: simple concept, familiar execution (x2)
• Cons: indirectly models uplift, the difference may be only noise, 2x the work, scales may not be comparable, 2x the error, variable reduction done on indirect dependent vars
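A minimal sketch of the two-model idea follows. To stay dependency-free it swaps in a toy "model" (the observed response rate per segment) for the response models a practitioner would normally fit. The records, segment names, and function names are all hypothetical.

```python
from collections import defaultdict

def fit_rate_model(rows):
    """Toy 'response model': response rate per segment (stand-in for a real classifier)."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responders, total]
    for segment, responded in rows:
        counts[segment][0] += responded
        counts[segment][1] += 1
    return {seg: r / n for seg, (r, n) in counts.items()}

def two_model_uplift(treatment_rows, control_rows):
    """Score = P(R | Treatment, x) - P(R | no Treatment, x), from two separate models."""
    model_t = fit_rate_model(treatment_rows)  # Model 1: treatment group only
    model_c = fit_rate_model(control_rows)    # Model 2: control group only
    # Assumption: a segment never seen in control gets a control score of 0.
    return {seg: model_t[seg] - model_c.get(seg, 0.0) for seg in model_t}

# Hypothetical campaign records: (segment, responded 0/1)
treatment = [("A", 1)] * 30 + [("A", 0)] * 70 + [("B", 1)] * 40 + [("B", 0)] * 60
control = [("A", 1)] * 10 + [("A", 0)] * 90 + [("B", 1)] * 35 + [("B", 0)] * 65

uplift = two_model_uplift(treatment, control)
ranked = sorted(uplift, key=uplift.get, reverse=True)
print(uplift, ranked)
```

In this made-up data, segment B has the higher treatment response rate (40% vs. 30%), yet segment A ranks first on uplift because its control rate is much lower.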
Method 2: Treatment Dummy Approach, Lo (2002)
• 1. Estimate both E(Y_i | X_i; treatment) and E(Y_i | X_i; control) and use a dummy T_i to differentiate between treatment and control:
• Linear logistic regression:
$$P_i = E(Y_i \mid X_i) = \frac{\exp(\alpha + \beta' X_i + \gamma T_i + \delta' X_i T_i)}{1 + \exp(\alpha + \beta' X_i + \gamma T_i + \delta' X_i T_i)}$$
• 2. Predict the lift value (treatment minus control) for each individual:
$$\mathrm{Lift}_i = P_i \mid \text{treatment} - P_i \mid \text{control} = \frac{\exp(\alpha + \gamma + \beta' X_i + \delta' X_i)}{1 + \exp(\alpha + \gamma + \beta' X_i + \delta' X_i)} - \frac{\exp(\alpha + \beta' X_i)}{1 + \exp(\alpha + \beta' X_i)}$$
• Pros: simple concept, tests for presence of interaction effects
• Cons: multicollinearity issues
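Once the logistic regression has been fitted, step 2 is a direct evaluation of the lift formula. The sketch below assumes the coefficients are already available; the numeric values and the helper names (`sigmoid`, `dot`, `predicted_lift`) are hypothetical and only illustrate the arithmetic.

```python
import math

def sigmoid(z):
    """exp(z) / (1 + exp(z)), written in its equivalent 1 / (1 + exp(-z)) form."""
    return 1.0 / (1.0 + math.exp(-z))

def dot(w, x):
    """Inner product of a coefficient vector and a predictor vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

def predicted_lift(x, alpha, beta, gamma, delta):
    """Lift_i = P_i | treatment - P_i | control under the treatment dummy model."""
    p_treatment = sigmoid(alpha + gamma + dot(beta, x) + dot(delta, x))  # T_i = 1
    p_control = sigmoid(alpha + dot(beta, x))                            # T_i = 0
    return p_treatment - p_control

# Hypothetical fitted coefficients for two predictors
alpha, beta, gamma, delta = -3.0, [0.5, 0.2], 0.4, [0.3, -0.1]
lift_example = predicted_lift([1.0, 2.0], alpha, beta, gamma, delta)
print(lift_example)
```

If gamma and every element of delta are zero, the treatment has no modeled effect and the predicted lift is exactly zero for every individual, which is a quick sanity check on any implementation.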
Method 3: Four Quadrant Method
• Model predicts probability of being in one of four categories
• Dependent variable outcome (nominal) = TR, CR, TN, or CN
• Model Population = Treatment & Control groups together
              Response
Treatment     Yes     No
Yes           TR      TN
No            CR      CN
• Prediction of lift:
$$Z(x) = \frac{1}{2}\left[\frac{P(TR \mid x)}{P(T)} + \frac{P(CN \mid x)}{P(C)} - \frac{P(TN \mid x)}{P(T)} - \frac{P(CR \mid x)}{P(C)}\right]$$
• Pros: only one model required; more “success cases” to model after
• Cons: not that intuitive…
Lai (2006) generalized by Kane, Lo, Zheng (2014)
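The score Z(x) is simple to evaluate once a four-class model has produced the class probabilities. The sketch below assumes those probabilities are given; the function name, argument names, and example values are hypothetical.

```python
def four_quadrant_score(p_tr, p_cn, p_tn, p_cr, p_t):
    """Z(x) = 1/2 [P(TR|x)/P(T) + P(CN|x)/P(C) - P(TN|x)/P(T) - P(CR|x)/P(C)]."""
    p_c = 1.0 - p_t  # P(C): share of the control group in the modeling population
    return 0.5 * (p_tr / p_t + p_cn / p_c - p_tn / p_t - p_cr / p_c)

# Hypothetical class probabilities for one individual, with a 50-50 treatment/control split
z = four_quadrant_score(p_tr=0.10, p_cn=0.45, p_tn=0.40, p_cr=0.05, p_t=0.5)

# Same individual if the treatment made no difference to the response probabilities
z_no_effect = four_quadrant_score(p_tr=0.05, p_cn=0.45, p_tn=0.45, p_cr=0.05, p_t=0.5)
print(z, z_no_effect)
```

In the first call the implied response rates are 0.10/0.5 = 0.2 under treatment and 0.05/0.5 = 0.1 under control, and Z(x) comes out at their difference. This holds when assignment is randomized so that P(T | x) = P(T).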
Gini and Top 15% Gini in Holdout Sample
Source: Kane, Lo, and Zheng (2014)
Simulated Example: Charity Donation
• 80-20% split between treatment and control
• Randomly split into training (300K) and holdout (200K)
• Predictors available:
• Age of donor
• Frequency – # times a donation was made in the past
• Spent – average $ donation in the past
• Recency – year of the last donation
• Income
• Wealth
Holdout Sample Performance
Lift Chart on Simulated Data
Theoretical model: Two logistics for treatment and control
[Figure: lift chart over 20 groups (1-20); y-axis from -0.1 to 0.6; series: Baseline, Two model, Lo (2002), Four Quadrant (KLZ), Random]
Gains Chart on Simulated Data
[Figure: gains chart; both axes 0%-100%; series: Baseline, Random, Two Model Approach, Treatment Dummy Approach, Four Quadrant (KLZ)]
                                          Gini     Gini 15%   Gini repeatability (R^2)
Baseline                                  5.6420   0.5412     0.7311
Method 1: Two Model approach              6.0384   0.7779     0.7830
Method 2: Lo (2002), Treatment Dummy      6.0353   0.7766     0.7836
Method 3: Four Quadrant Method (or KLZ)   5.9063   0.7484     0.7884
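A holdout lift chart of this kind can be built by ranking the holdout sample on the model score, cutting it into equal-sized groups, and taking the treatment response rate minus the control response rate within each group. The sketch below shows only that step; it does not reproduce the Gini measures in the table, whose exact definition is in Kane, Lo, and Zheng (2014). The record layout, function names, and data are hypothetical.

```python
def lift_by_bin(records, n_bins=20):
    """records: (score, treated 0/1, responded 0/1). Per-bin lift, best-scored bin first."""
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    size = len(ranked) / n_bins
    lifts = []
    for b in range(n_bins):
        chunk = ranked[int(b * size):int((b + 1) * size)]
        treated = [r[2] for r in chunk if r[1] == 1]
        control = [r[2] for r in chunk if r[1] == 0]
        if not treated or not control:
            lifts.append(None)  # lift is undefined without both groups in the bin
            continue
        lifts.append(sum(treated) / len(treated) - sum(control) / len(control))
    return lifts

def make_rows(score, treated, n, responders):
    """Hypothetical helper: n records with the given score, of which `responders` respond."""
    return [(score, treated, 1)] * responders + [(score, treated, 0)] * (n - responders)

# Hypothetical holdout: high-scored customers respond more when treated, low-scored do not
holdout = (make_rows(0.9, 1, 10, 6) + make_rows(0.9, 0, 10, 2)
           + make_rows(0.1, 1, 10, 2) + make_rows(0.1, 0, 10, 2))
bin_lifts = lift_by_bin(holdout, n_bins=2)
print(bin_lifts)
```

A good uplift model should show high lift in the first bins and lift near zero (or negative) in the last ones, while random ordering should give roughly the same lift in every bin.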
Online Merchandise Data
From blog.minethatdata.com, with women’s merchandise
online visit as response
50-50% split between treatment and control (43K in total)
Randomly split into training (70%) and holdout (30%)
Predictors available:
• Recency
• Dollar spent last year
• Merchandise purchased last year (men’s, women’s, both)
• Urban, suburban, or rural
• Channel – web, phone, or both for purchase last year
Holdout Sample Performance
Lift Chart on Email Online Merchandise Data
[Figure: lift chart over 20 groups (1-20); y-axis from -0.06 to 0.12; series: Baseline, Lo (2002) trt dummy, Two model approach, Four Quadrant (KLZ), Random]
Gains Chart on Email Online Merchandise Data
[Figure: gains chart; both axes 0%-100%; series: Baseline, Random, Two Model Approach, Treatment Dummy Approach, Four Quadrant (KLZ)]
                                          Gini     Gini 15%   Gini repeatability (R^2)
Baseline                                  1.8556   -0.0240    0.2071
Method 1: Two Model approach              2.0074   0.0786     0.2941
Method 2: Lo (2002), Treatment Dummy      2.4392   0.0431     0.2945
Method 3: Four Quadrant Method (or KLZ)   2.3703   0.2288     0.3290
Ideal Conditions for Uplift Modeling
• A randomized control group is withheld!
• Treatment does not cause all “responses,” i.e. control response rate > 0
• Natural Response is not highly correlated to Lift
• Lift Signal-to-Noise ratio (Lift/control rate) is large enough
Case I: Direct Response versus Uplift
Direct Response vs. Uplift Modeling
• Retailer couponing
• E-mail click-through
Decision Tree of Campaign and Customers
Any Customer
  Treatment (T)
    Direct Response (D)
      Response (R)
    No Direct Response (D^c)
      Response (R)
      No Response (N)
  Control (C)
    Response (R)
    No Response (N)
Decision Tree of Campaign and Customers (branch probabilities)
Treatment (T) to Direct Response (D): $P(D \mid T, x)$
Treatment (T) to No Direct Response (D^c): $1 - P(D \mid T, x)$
$$\mathrm{Lift}(x) = \underbrace{P(D \mid T, x) + P(R \mid T, D^c, x)\,[1 - P(D \mid T, x)]}_{P(R \mid T, x)} - P(R \mid C, x)$$
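The lift formula combines three component probabilities, each of which could come from its own model. The sketch below only evaluates the formula for given inputs; the function name, argument names, and example values are hypothetical. As in the decision tree, a direct response is counted as a response.

```python
def direct_response_lift(p_direct, p_resp_given_no_direct, p_resp_control):
    """Lift(x) = P(D|T,x) + P(R|T,D^c,x) * (1 - P(D|T,x)) - P(R|C,x)."""
    # Overall treatment response probability P(R|T,x): direct responders plus
    # those who respond without a direct response.
    p_resp_treatment = p_direct + p_resp_given_no_direct * (1.0 - p_direct)
    return p_resp_treatment - p_resp_control

# Hypothetical probabilities for one customer
lift = direct_response_lift(p_direct=0.05, p_resp_given_no_direct=0.10, p_resp_control=0.12)
print(lift)
```

With these made-up numbers the treatment response probability is 0.145, so the lift over the 0.12 control probability is 0.025, much smaller than the treatment response probability itself would suggest.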
Holdout Sample Validation of Simulated Data
[Figure: Lift in Response Rate by Semi-decile (1-20); y-axis from 0 to 0.6; series: direct response only, uplift, direct response + uplift, Baseline, Random]
Story of A Mathematician
Case II: Optimization of Multiple Treatments - From Predictive Analytics to Prescriptive Analytics
From Random Selection to Optimization
[Figure: 2x2 grid with axes Improving Targeting and Improving Treatment]
1) Random Targeting: No Targeting, Single Treatment: A (or B)
2) Target Selection: Individual Level Targeting - Model-based, A (or B)
3) One Size Fits All: No Targeting, Single Best Treatment for all individuals (Best of A and B)
4) Optimal Treatment for Each Individual: A, B
Integer Program Formulation
$$\text{Maximize } \sum_{i=1}^{n} \sum_{j=1}^{m} \Delta p_{ij}\, x_{ij}$$
Subject to:
$$\sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij}\, x_{ij} \le B \quad \text{(Budget Constraint)}$$
$$\sum_{j=1}^{m} x_{ij} \le 1, \quad i = 1, \dots, n,$$
$$x_{ij} = 0 \text{ or } 1, \quad i = 1, \dots, n;\ j = 1, \dots, m,$$
where $\Delta p_{ij}$ = estimated lift value for individual i and treatment j, $x_{ij}$ (decision variable) = 1 if treatment j is assigned to individual i and 0 otherwise; and $c_{ij}$ = cost of promoting treatment j to i.
E.g., size of target population = 30 million, # treatment combinations = 10, then # decision variables = 300 million, and total # possible combinations without constraints = $2^{300{,}000{,}000}$!
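For a handful of individuals the integer program can be solved by brute force, which makes the formulation concrete and shows why it does not scale. The sketch below enumerates every plan in which each individual gets at most one treatment, (m + 1)^n plans in total. The function name and the lift and cost values are hypothetical.

```python
from itertools import product

def solve_small_assignment(lift, cost, budget):
    """Exhaustively solve the integer program for a tiny instance.

    lift[i][j] and cost[i][j] are indexed by individual i and treatment j.
    A plan assigns each individual one treatment index or None (no treatment).
    """
    n, m = len(lift), len(lift[0])
    best_value, best_plan = 0.0, tuple([None] * n)
    for plan in product([None, *range(m)], repeat=n):
        total_cost = sum(cost[i][j] for i, j in enumerate(plan) if j is not None)
        if total_cost > budget:
            continue  # violates the budget constraint
        value = sum(lift[i][j] for i, j in enumerate(plan) if j is not None)
        if value > best_value:
            best_value, best_plan = value, plan
    return best_value, best_plan

# Hypothetical lifts for 3 individuals and 2 treatments, unit costs, budget for 2 contacts
lift = [[0.05, 0.02], [0.01, 0.08], [0.03, 0.03]]
cost = [[1, 1], [1, 1], [1, 1]]
best_value, best_plan = solve_small_assignment(lift, cost, budget=2)
print(best_value, best_plan)
```

Here 3 individuals and 2 treatments give only 27 plans. At the scale quoted on the slide, enumeration is out of the question, which motivates the heuristic that follows.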
A Heuristic Algorithm
1. Perform cluster analysis of the m model-based
lift scores in the holdout sample
2. Compute cluster-level lift score for each
treatment, using sample mean differences
3. Apply cluster solution to new data (for a future
marketing program)
4. Solve a linear programming model to optimize
treatment assignment at the cluster-level
Source: Lo and Pachamanova (2015)
Becomes A Much Simpler Optimization Problem
$$\text{Maximize } \sum_{c=1}^{C} \sum_{j=1}^{m} \Delta p_{cj}\, x_{cj}$$
Subject to:
$$\sum_{c=1}^{C} \sum_{j=1}^{m} c_{j}\, x_{cj} \le \text{Budget} \quad \text{(Budget Constraint)}$$
$$\sum_{j=1}^{m} x_{cj} \le N_c, \quad c = 1, \dots, C \quad \text{(Cluster Size Constraint)},$$
and
$$x_{cj} \ge 0, \quad c = 1, \dots, C;\ j = 1, \dots, m,$$
where $x_{cj}$ = # individuals in cluster c to receive treatment j, $c_j$ = cost of treatment j for each individual.
Can be solved by Excel Solver
Online Retail Example
Goal: Optimization of men’s and women’s merchandise
A 10-cluster solution
Linear Programming Solution from Excel Solver
Cluster   Cluster size   Obs. lift in      Obs. lift in        Cost per        Decision var:     Decision var:       Total treated
          in new data    response: Men's   response: Women's   treatment ($)   number of men's   number of women's   by cluster
Overall                  0.07408           0.043863
1         4,180          0.1587            0.0224              1               4,180             -                   4,180
2         5,650          0.0652            -0.0055             1               -                 -                   -
4         60,220         0.0658            0.0628              1               2,340             -                   2,340
5         12,370         0.1290            0.0618              1               12,370            -                   12,370
6         8,940          0.0672            0.0760              1               -                 8,940               8,940
7         29,240         0.0519            0.0213              1               -                 -                   -
8         28,070         0.0868            0.0254              1               28,070            -                   28,070
9         4,100          0.2249            0.0239              1               4,100             -                   4,100
10        37,080         0.0572            0.0426              1               -                 -                   -
Total     189,850                                              obj value       5,773             680                 6,453
                                                               cost            $51,060           $8,940              $60,000
Budget    $60,000
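The allocation in this table can be cross-checked without a solver. When every treatment has the same unit cost and each individual receives at most one treatment, the cluster-level linear program reduces to a greedy fill: take the best treatment in each cluster, then spend the budget on clusters in descending order of lift. The sketch below uses that shortcut, which is an assumption that fails when costs differ by treatment (a general LP solver is needed then). The function and variable names are hypothetical; the cluster sizes and lifts are the ones in the table.

```python
def allocate_by_cluster(clusters, budget, cost=1.0):
    """clusters: {id: (size, {treatment: lift})}. Returns {(id, treatment): count}.

    Greedy fill, valid only when every treatment has the same unit cost.
    """
    candidates = []
    for cid, (size, lifts) in clusters.items():
        treatment = max(lifts, key=lifts.get)  # best treatment for this cluster
        if lifts[treatment] > 0:
            candidates.append((lifts[treatment], cid, treatment, size))
    candidates.sort(reverse=True)  # highest lift first
    remaining = budget / cost      # number of individuals the budget can cover
    plan = {}
    for lift, cid, treatment, size in candidates:
        if remaining <= 0:
            break
        n = min(size, remaining)
        plan[(cid, treatment)] = n
        remaining -= n
    return plan

# Cluster sizes and observed lifts from the table above
clusters = {
    1: (4180, {"men": 0.1587, "women": 0.0224}),
    2: (5650, {"men": 0.0652, "women": -0.0055}),
    4: (60220, {"men": 0.0658, "women": 0.0628}),
    5: (12370, {"men": 0.1290, "women": 0.0618}),
    6: (8940, {"men": 0.0672, "women": 0.0760}),
    7: (29240, {"men": 0.0519, "women": 0.0213}),
    8: (28070, {"men": 0.0868, "women": 0.0254}),
    9: (4100, {"men": 0.2249, "women": 0.0239}),
    10: (37080, {"men": 0.0572, "women": 0.0426}),
}
plan = allocate_by_cluster(clusters, budget=60000)
total_lift = sum(clusters[cid][1][t] * n for (cid, t), n in plan.items())
print(plan, round(total_lift))
```

The greedy plan matches the decision variables in the table, including the partial allocation of 2,340 in cluster 4 that exhausts the $60,000 budget. Its objective value is roughly 6,451 with the rounded lifts shown, close to the 6,453 reported.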
Stochastic Optimization
Lift estimates can have a high degree of uncertainty; stochastic optimization solutions take the uncertainty into account:
• Stochastic Programming
• Robust Optimization
• Mean Variance Optimization
Mean Variance Optimization Example
Conclusion
• Uplift is a very impactful emerging subfield
• Deserves more R&D
• Extensions are plenty (Lo (2008)):
• Multiple treatments
• Optimization
• Non-randomized experiments
• Direct tracking
• Applications in other fields
• E.g. Potter (2013), Yong (2015)
References
Cai, T., Tian, L., Wong, P., and Wei, L.J. (2011), “Analysis of Randomized Comparative Clinical Trial Data for Personalized Treatment,” Biostatistics, 12:2, p.270-282.
Collins, F.S. (2010), The Language of Life: DNA and the Revolution in Personalized Medicine, HarperCollins.
Conrady, S. and Jouffe, L. (2011). “Causal Inference and Direct Effects,” Bayesia and Conrady Applied Science, at http://www.conradyscience.com/index.php/causality
Freedman, D. (2010). Statistical Methods and Causal Inference. Cambridge.
Hamburg, M.A. and Collins, F.S. (2010). “The path to personalized medicine.” The New England Journal of Medicine, 363;4, p.301-304.
Haughton, D. and Haughton, J. (2011). Living Standards Analytics, Springer.
Holland, C. (2005). Breakthrough Business Results with MVT, Wiley.
Kane, K., Lo, V.S.Y., and Zheng, J. (2014) “Mining for the Truly Responsive Customers and Prospects Using True-Lift Modeling: Comparison of New and Existing
Methods.” Journal of Marketing Analytics, v.2, Issue 4, p.218-238.
Lai, Lilly Y.-T. (2006) Influential Marketing: A New Direct Marketing Strategy Addressing the Existence of Voluntary Buyers. Master of Science thesis, Simon Fraser
University School of Computing Science, Burnaby, BC, Canada.
Lo, V.S.Y. (2002) “The True Lift Model – A Novel Data Mining Approach to Response Modeling in Database Marketing.” SIGKDD Explorations 4, Issue 2, p.78-86, at:
http://www.acm.org/sigs/sigkdd/explorations/issues/4-2-2002-12/lo.pdf
Lo, V.S.Y. (2008), “New Opportunities in Marketing Data Mining," in Encyclopedia of Data Warehousing and Mining, Wang (2008) ed., 2nd edition, Idea Group Publishing.
Lo, V.S.Y. and D. Pachamanova (2015), “A Practical Approach to Treatment Optimization While Accounting for Estimation Risk,” Technical Report.
McKinney, R.E. et al. (1998), “A randomized study of combined zidovudine-lamivudine versus didanosine monotherapy in children with symptomatic therapy-naïve HIV-1 infection,” J. of Pediatrics, 133, no.4, p.500-508.
Mehr, I.J. (2000), “Pharmacogenomics and Industry Change,” Applied Clinical Trials, 9, no.5, p.34,36.
Morgan, S.L. and Winship C. (2007). Counterfactuals and Causal Inference. Cambridge University Press.
Pearl, J. (2000), Causality. Cambridge University Press.
Potter, Daniel (2013) Pinpointing the Persuadables: Convincing the Right Voters to Support Barack Obama. Presented at Predictive Analytics World; Oct, Boston, MA;
http://www.predictiveanalyticsworld.com/patimes/pinpointing-the-persuadables-convincing-the-right-voters-to-support-barack-obama/ (available with free subscription).
Radcliffe, N.J. and Surry, P. (1999). “Differential response analysis: modeling true response by isolating the effect of a single action,” Proceedings of Credit Scoring and
Credit Control VI, Credit Research Centre, U. of Edinburgh Management School.
Radcliffe, N.J. (2007). “Using Control Groups to Target on Predicted Lift,” DMA Analytics Annual Journal, Spring, p.14-21.
Robins, J.M. and Hernan, M.A. (2009), “Estimation of the Causal Effects of Time-Varying Exposures,” In Fitzmaurice G., Davidian, M., Verbeke, G., and Molenberghs, G.
eds. (2009) Longitudinal Data Analysis, Chapman & Hall/CRC, p.553 – 399.
Rosenbaum, P.R. (2002), Observational Studies. Springer.
Rosenbaum, P.R. (2010), Design of Observational Studies. Springer.
Rubin, D.B. (2006), Matched Sampling for Causal Effects. Cambridge University Press.
Rubin, D.B. (2008), “For Objective Causal Inference, Design Trumps Analysis,” The Annals of Applied Statistics, p.808-840.
Rubin, D.B. and Waterman, R.P. (2006), “Estimating the Causal Effects of Marketing Interventions Using Propensity Score Methodology,” Statistical Science, p.206-222.
Russek-Cohen, E. and Simon, R.M. (1997), “Evaluating treatments when a gender by treatment interaction may exist,” Statistics in Medicine, 16, issue 4, p.455-464.
Signorovitch, J. (2007), “Estimation and Evaluation of Regression for Patient-Specific Efficacy,” Harvard School of Public Health working paper.
Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search, 2nd edition, MIT Press.
Wikipedia (2010), “Uplift Modeling,” at http://en.wikipedia.org/wiki/Uplift_modelling
Yong, Florence H. (2015), “Quantitative Methods for Stratified Medicine,” PhD Dissertation, Department of Biostatistics, Harvard T.H. Chan School of Public Health.

More Related Content

What's hot

Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
MLconf
 
Concept Drift: Monitoring Model Quality In Streaming ML Applications
Concept Drift: Monitoring Model Quality In Streaming ML ApplicationsConcept Drift: Monitoring Model Quality In Streaming ML Applications
Concept Drift: Monitoring Model Quality In Streaming ML Applications
Lightbend
 
Causal Inference Introduction.pdf
Causal Inference Introduction.pdfCausal Inference Introduction.pdf
Causal Inference Introduction.pdf
Yuna Koyama
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
Jaya Kawale
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
PyData
 
CounterFactual Explanations.pdf
CounterFactual Explanations.pdfCounterFactual Explanations.pdf
CounterFactual Explanations.pdf
Bong-Ho Lee
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
Benjamin Le
 
Tutorial on People Recommendations in Social Networks - ACM RecSys 2013,Hong...
Tutorial on People Recommendations in Social Networks -  ACM RecSys 2013,Hong...Tutorial on People Recommendations in Social Networks -  ACM RecSys 2013,Hong...
Tutorial on People Recommendations in Social Networks - ACM RecSys 2013,Hong...
Anmol Bhasin
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
Yves Raimond
 
Netflix talk at ML Platform meetup Sep 2019
Netflix talk at ML Platform meetup Sep 2019Netflix talk at ML Platform meetup Sep 2019
Netflix talk at ML Platform meetup Sep 2019
Faisal Siddiqi
 
MLOps in action
MLOps in actionMLOps in action
MLOps in action
Pieter de Bruin
 
Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)
NamHyuk Ahn
 
DoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolDoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End tool
Amit Sharma
 
Live Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn PredictionLive Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn Prediction
MapR Technologies
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry Perspective
Justin Basilico
 
Handling concept drift in data stream mining
Handling concept drift in data stream miningHandling concept drift in data stream mining
Handling concept drift in data stream mining
Manuel Martín
 
Tutorial on Bias in Rec Sys @ UMAP2020
Tutorial on Bias in Rec Sys @ UMAP2020Tutorial on Bias in Rec Sys @ UMAP2020
Tutorial on Bias in Rec Sys @ UMAP2020
Mirko Marras
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Justin Basilico
 
MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.
Knoldus Inc.
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
Förderverein Technische Fakultät
 

What's hot (20)

Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Concept Drift: Monitoring Model Quality In Streaming ML Applications
Concept Drift: Monitoring Model Quality In Streaming ML ApplicationsConcept Drift: Monitoring Model Quality In Streaming ML Applications
Concept Drift: Monitoring Model Quality In Streaming ML Applications
 
Causal Inference Introduction.pdf
Causal Inference Introduction.pdfCausal Inference Introduction.pdf
Causal Inference Introduction.pdf
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
 
CounterFactual Explanations.pdf
CounterFactual Explanations.pdfCounterFactual Explanations.pdf
CounterFactual Explanations.pdf
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
 
Tutorial on People Recommendations in Social Networks - ACM RecSys 2013,Hong...
Tutorial on People Recommendations in Social Networks -  ACM RecSys 2013,Hong...Tutorial on People Recommendations in Social Networks -  ACM RecSys 2013,Hong...
Tutorial on People Recommendations in Social Networks - ACM RecSys 2013,Hong...
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Netflix talk at ML Platform meetup Sep 2019
Netflix talk at ML Platform meetup Sep 2019Netflix talk at ML Platform meetup Sep 2019
Netflix talk at ML Platform meetup Sep 2019
 
MLOps in action
MLOps in actionMLOps in action
MLOps in action
 
Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)
 
DoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End toolDoWhy Python library for causal inference: An End-to-End tool
DoWhy Python library for causal inference: An End-to-End tool
 
Live Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn PredictionLive Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn Prediction
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry Perspective
 
Handling concept drift in data stream mining
Handling concept drift in data stream miningHandling concept drift in data stream mining
Handling concept drift in data stream mining
 
Tutorial on Bias in Rec Sys @ UMAP2020
Tutorial on Bias in Rec Sys @ UMAP2020Tutorial on Bias in Rec Sys @ UMAP2020
Tutorial on Bias in Rec Sys @ UMAP2020
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender Systems
 
MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.MLOps Bridging the gap between Data Scientists and Ops.
MLOps Bridging the gap between Data Scientists and Ops.
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
 

Similar to Uplift Modeling Workshop

Meetup_FGVA_Uplift @ Dataiku
Meetup_FGVA_Uplift @ DataikuMeetup_FGVA_Uplift @ Dataiku
Meetup_FGVA_Uplift @ Dataiku
Johan-André Jeanville
 
ABTest-20231020.pptx
ABTest-20231020.pptxABTest-20231020.pptx
ABTest-20231020.pptx
Michael Ming Lei
 
Causality in Python PyCon 2021 ISRAEL
Causality in Python PyCon 2021 ISRAELCausality in Python PyCon 2021 ISRAEL
Causality in Python PyCon 2021 ISRAEL
Hanan Shteingart
 
Optimizing marketing campaigns using experimental designs
Optimizing marketing campaigns using experimental designsOptimizing marketing campaigns using experimental designs
Optimizing marketing campaigns using experimental designs
Pankaj Sharma
 
Lecture 7 Hypothesis Testing Two Sample
Lecture 7 Hypothesis Testing Two SampleLecture 7 Hypothesis Testing Two Sample
Lecture 7 Hypothesis Testing Two SampleAhmadullah
 
Sample size calculation - a brief overview
Sample size calculation - a brief overviewSample size calculation - a brief overview
Sample size calculation - a brief overview
Azmi Mohd Tamil
 
A05 Continuous One Variable Stat Tests
A05 Continuous One Variable Stat TestsA05 Continuous One Variable Stat Tests
A05 Continuous One Variable Stat TestsLeanleaders.org
 
A05 Continuous One Variable Stat Tests
A05 Continuous One Variable Stat TestsA05 Continuous One Variable Stat Tests
A05 Continuous One Variable Stat TestsLeanleaders.org
 
Building Institutional Capacity in Thailand to Design and Implement Climate P...
Building Institutional Capacity in Thailand to Design and Implement Climate P...Building Institutional Capacity in Thailand to Design and Implement Climate P...
Building Institutional Capacity in Thailand to Design and Implement Climate P...
UNDP Climate
 
Network meta-analysis & models for inconsistency
Network meta-analysis & models for inconsistencyNetwork meta-analysis & models for inconsistency
Network meta-analysis & models for inconsistency
cheweb1
 
Accurate Campaign Targeting Using Classification Algorithms
Accurate Campaign Targeting Using Classification AlgorithmsAccurate Campaign Targeting Using Classification Algorithms
Accurate Campaign Targeting Using Classification Algorithms
Jieming Wei
 
Setting up an A/B-testing framework
Setting up an A/B-testing frameworkSetting up an A/B-testing framework
Setting up an A/B-testing framework
Agnes van Belle
 
Medical Segmentation Decathalon
Medical Segmentation DecathalonMedical Segmentation Decathalon
Medical Segmentation Decathalon
imgcommcall
 
Statistics pres 10 27 2015 roy sabo
Statistics pres 10 27 2015   roy saboStatistics pres 10 27 2015   roy sabo
Statistics pres 10 27 2015 roy sabo
tjcarter
 
Market Research using SPSS _ Edu4Sure Sept 2023.ppt
Market Research using SPSS _ Edu4Sure Sept 2023.pptMarket Research using SPSS _ Edu4Sure Sept 2023.ppt
Market Research using SPSS _ Edu4Sure Sept 2023.ppt
Edu4Sure
 
Critical Checks for Pharmaceuticals and Healthcare: Validating Your Data Inte...
Critical Checks for Pharmaceuticals and Healthcare: Validating Your Data Inte...Critical Checks for Pharmaceuticals and Healthcare: Validating Your Data Inte...
Critical Checks for Pharmaceuticals and Healthcare: Validating Your Data Inte...
Minitab, LLC
 
E00 program-level modeling and simulation experiences
E00   program-level modeling and simulation experiencesE00   program-level modeling and simulation experiences
E00 program-level modeling and simulation experiencestherealreverendbayes
 
PMED Transition Workshop - Dynamic Treatment Regimes via Reward Ignorant Mode...
PMED Transition Workshop - Dynamic Treatment Regimes via Reward Ignorant Mode...PMED Transition Workshop - Dynamic Treatment Regimes via Reward Ignorant Mode...
PMED Transition Workshop - Dynamic Treatment Regimes via Reward Ignorant Mode...
The Statistical and Applied Mathematical Sciences Institute
 

Similar to Uplift Modeling Workshop (20)

report
reportreport
report
 
Meetup_FGVA_Uplift @ Dataiku
Meetup_FGVA_Uplift @ DataikuMeetup_FGVA_Uplift @ Dataiku
Meetup_FGVA_Uplift @ Dataiku
 
ABTest-20231020.pptx
ABTest-20231020.pptxABTest-20231020.pptx
ABTest-20231020.pptx
 
Causality in Python PyCon 2021 ISRAEL
Causality in Python PyCon 2021 ISRAELCausality in Python PyCon 2021 ISRAEL
Causality in Python PyCon 2021 ISRAEL
 
Optimizing marketing campaigns using experimental designs
Optimizing marketing campaigns using experimental designsOptimizing marketing campaigns using experimental designs
Optimizing marketing campaigns using experimental designs
 
Lecture 7 Hypothesis Testing Two Sample
Lecture 7 Hypothesis Testing Two SampleLecture 7 Hypothesis Testing Two Sample
Lecture 7 Hypothesis Testing Two Sample
 
Sample size calculation - a brief overview
Sample size calculation - a brief overviewSample size calculation - a brief overview
Sample size calculation - a brief overview
 
A05 Continuous One Variable Stat Tests
A05 Continuous One Variable Stat TestsA05 Continuous One Variable Stat Tests
A05 Continuous One Variable Stat Tests
 
A05 Continuous One Variable Stat Tests
A05 Continuous One Variable Stat TestsA05 Continuous One Variable Stat Tests
A05 Continuous One Variable Stat Tests
 
Building Institutional Capacity in Thailand to Design and Implement Climate P...
Building Institutional Capacity in Thailand to Design and Implement Climate P...Building Institutional Capacity in Thailand to Design and Implement Climate P...
Building Institutional Capacity in Thailand to Design and Implement Climate P...
 
Network meta-analysis & models for inconsistency
Network meta-analysis & models for inconsistencyNetwork meta-analysis & models for inconsistency
Network meta-analysis & models for inconsistency
 
Presentation
PresentationPresentation
Presentation
 
Accurate Campaign Targeting Using Classification Algorithms
Accurate Campaign Targeting Using Classification AlgorithmsAccurate Campaign Targeting Using Classification Algorithms
Accurate Campaign Targeting Using Classification Algorithms
 
Setting up an A/B-testing framework
Setting up an A/B-testing frameworkSetting up an A/B-testing framework
Setting up an A/B-testing framework
 
Medical Segmentation Decathalon
Medical Segmentation DecathalonMedical Segmentation Decathalon
Medical Segmentation Decathalon
 
Statistics pres 10 27 2015 roy sabo
Statistics pres 10 27 2015   roy saboStatistics pres 10 27 2015   roy sabo
Statistics pres 10 27 2015 roy sabo
 
Market Research using SPSS _ Edu4Sure Sept 2023.ppt
Market Research using SPSS _ Edu4Sure Sept 2023.pptMarket Research using SPSS _ Edu4Sure Sept 2023.ppt
Market Research using SPSS _ Edu4Sure Sept 2023.ppt
 
Critical Checks for Pharmaceuticals and Healthcare: Validating Your Data Inte...
Critical Checks for Pharmaceuticals and Healthcare: Validating Your Data Inte...Critical Checks for Pharmaceuticals and Healthcare: Validating Your Data Inte...
Critical Checks for Pharmaceuticals and Healthcare: Validating Your Data Inte...
 
E00 program-level modeling and simulation experiences
E00   program-level modeling and simulation experiencesE00   program-level modeling and simulation experiences
E00 program-level modeling and simulation experiences
 
PMED Transition Workshop - Dynamic Treatment Regimes via Reward Ignorant Mode...
PMED Transition Workshop - Dynamic Treatment Regimes via Reward Ignorant Mode...PMED Transition Workshop - Dynamic Treatment Regimes via Reward Ignorant Mode...
PMED Transition Workshop - Dynamic Treatment Regimes via Reward Ignorant Mode...
 

Uplift Modeling Workshop

  • 1. Open Data Science Conference, Boston 2015 (@opendatasci). Victor S.Y. Lo, May 2015. Machine Learning Based Personalization Using Uplift Analytics: Examples and Applications. Uplift Modeling Workshop
  • 2. Outline
      • Why do we need uplift modeling? (10 min)
      • Various methods for uplift modeling (30 min)
      • Break (5 min)
      • Direct response vs. uplift modeling (10 min)
      • Prescriptive analytics for multiple treatments (20 min)
      • Q&A (10 min)
  • 3. Disclaimer: This presentation does not represent the views or opinions of Fidelity Investments
  • 4. Response Modeling. [Chart: response rate by decile (deciles 1 to 10); average response rate 2.5%.] Top decile lift (over random) = 4 times. Top 3 deciles lift = 2.6 times. Big lift. Modelers: VERY SUCCESSFUL MODEL!
  • 5. Campaign Results

                    Top 3 Deciles   Random
        Treatment   6.7%            2.5%
        Control     6.7%            2.5%
        Lift        0.0%            0.0%

      No lift. Marketers: VERY DISAPPOINTING! Modelers: Not my problem, it is the mail design!
  • 6. So, Who is Right?
  • 7. What’s wrong with this picture?
      • A successful response model. [Chart: incidence of treatment responders by decile: 14%, 7%, 4%, 2%, 1%, 1%, 0%, 0%, 0%, 0%. Gains chart: cumulative pct of treatment responders vs. pct of treatment group, compared with random.] DM lift?
      • A successful marketing campaign. [Chart: treatment response rate and control response rate for Test 1, Test 2, and Total; plotted values 3.3%, 2.7%, 3.0% and 2.3%, 1.7%, 2.0%.] DM lift?
  • 8. Motivation: Based on the following campaign result, which of the customer groups is the best for future targeting?

      Response Rate by Age and Treatment/Control

        Age      Treatment   Control   Difference
        <35      0.5%        0.2%      0.3%
        35-60    2.5%        0.5%      2.0%
        >60      3.5%        2.5%      1.0%

      • >60 has the highest response rate (treatment-only focus, the common practice)
      • 35-60 has the highest lift (most positively influenced by the treatment)
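The comparison on this slide can be reproduced with a few lines of arithmetic. The sketch below uses only the response rates from the age table; the variable names are illustrative and not from the presentation.

```python
# Response rates (in %) taken from the slide's age table.
rates = {
    "<35":   {"treatment": 0.5, "control": 0.2},
    "35-60": {"treatment": 2.5, "control": 0.5},
    ">60":   {"treatment": 3.5, "control": 2.5},
}

# Lift = treatment response rate minus control response rate, per age group.
lift = {age: round(r["treatment"] - r["control"], 1) for age, r in rates.items()}

# Treatment-only view (common practice) vs. lift view.
best_by_response = max(rates, key=lambda age: rates[age]["treatment"])
best_by_lift = max(lift, key=lift.get)

print(lift)
print("Highest treatment response rate:", best_by_response)
print("Highest lift:", best_by_lift)
```

Ranking by treatment response rate alone picks the >60 group, while ranking by lift picks the 35-60 group, which is the point the slide makes.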
  • 9. Framework for Causal and Association Analysis (vertical axis: from association to causality; horizontal axis: granularity)

                       Population / Sub-population                                   Personalized
        Causality      Causal Inference (Lift Analysis, Average Treatment Effect)    Uplift Modeling (Heterogeneous Treatment Effect, Effect Modification)
        Association    Reporting / Summary Statistics                                Response Modeling / Propensity Modeling
  • 10. The Uplift Model Objective: Maximize the treatment responders while minimizing the control “responders”. [Two charts on hypothetical data, each showing treatment “responders” and control “responders” by decile with the true lift marked: a standard response model, and an (ideal) uplift response model.]
  • 11. Uplift Approaches. [Two flow diagrams, “Traditional Approach” and “Uplift Modeling”, each drawn as: previous campaign data (control, treatment) → training data set and holdout data set → model.] Source: Lo (2002)
  • 12. Uplift model solutions
      0. Baseline results: standard response model, treatment-only (as a benchmark)
      1. Two Model Approach: take the difference of two models, treatment minus control
      2. Treatment Dummy Approach: single combined model using treatment interactions
      3. Four Quadrant Method
  • 13. Method 1: Two Model Approach (Treatment - Control)
      • Model 1 predicts P(R | Treatment); model sample = treatment group
      • Model 2 predicts P(R | no Treatment); model sample = control group
      • Final prediction of lift = treatment response score - control response score
      • Pros: simple concept, familiar execution (x2)
      • Cons: indirectly models uplift, the difference may be only noise, 2x the work, scales may not be comparable, 2x the error, variable reduction done on indirect dependent variables
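A minimal sketch of the two-model idea follows. To stay dependency-free it uses an empirical response rate per segment as a stand-in for each response model; in practice each model would be a classifier such as a logistic regression. The segments, records, and function names are hypothetical and not from the presentation.

```python
from collections import defaultdict


def fit_rate_model(rows):
    """Toy 'response model': empirical response rate per segment.

    rows is an iterable of (segment, responded) pairs with responded in {0, 1}.
    """
    counts = defaultdict(lambda: [0, 0])  # segment -> [responders, total]
    for segment, responded in rows:
        counts[segment][0] += responded
        counts[segment][1] += 1
    return {seg: r / n for seg, (r, n) in counts.items()}


# Hypothetical campaign records: (segment, treated, responded).
records = (
    [("young", 1, 1)] * 5 + [("young", 1, 0)] * 95
    + [("young", 0, 1)] * 2 + [("young", 0, 0)] * 98
    + [("mid", 1, 1)] * 25 + [("mid", 1, 0)] * 75
    + [("mid", 0, 1)] * 5 + [("mid", 0, 0)] * 95
)

# Model 1 is fitted on the treatment group only, Model 2 on the control group only.
treatment_model = fit_rate_model((s, y) for s, t, y in records if t == 1)
control_model = fit_rate_model((s, y) for s, t, y in records if t == 0)


def predict_lift(segment):
    """Lift score = treatment response score minus control response score."""
    return treatment_model[segment] - control_model[segment]


for seg in ("young", "mid"):
    print(seg, round(predict_lift(seg), 2))
```

Because the two models are fitted independently, the estimation errors of both enter the difference, which is the "2x the error" concern listed under the cons.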
  • 14. Method 2: Treatment Dummy Approach, Lo (2002)
      1. Estimate both E(Y_i | X_i; treatment) and E(Y_i | X_i; control), using a dummy T_i to differentiate between treatment and control. Linear logistic regression:
         \[ P_i = E(Y_i \mid X_i) = \frac{\exp(\alpha + \beta' X_i + \gamma T_i + \delta' X_i T_i)}{1 + \exp(\alpha + \beta' X_i + \gamma T_i + \delta' X_i T_i)} \]
      2. Predict the lift value (treatment minus control) for each individual:
         \[ \mathrm{Lift}_i = P_i \mid \text{treatment} - P_i \mid \text{control} = \frac{\exp(\alpha + \gamma + \beta' X_i + \delta' X_i)}{1 + \exp(\alpha + \gamma + \beta' X_i + \delta' X_i)} - \frac{\exp(\alpha + \beta' X_i)}{1 + \exp(\alpha + \beta' X_i)} \]
      • Pros: simple concept, tests for presence of interaction effects
      • Cons: multicollinearity issues
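Once the combined logistic regression has been fitted, step 2 is a direct evaluation of the model at T = 1 and T = 0. The sketch below assumes the coefficients are already estimated; the numeric values used in the example call are made up for illustration, and the function names are not from the presentation.

```python
import math


def sigmoid(z):
    """Logistic function exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def predicted_lift(x, alpha, beta, gamma, delta):
    """Lift_i = P_i | treatment (T=1) minus P_i | control (T=0).

    alpha: intercept; beta: main-effect coefficients; gamma: treatment dummy
    coefficient; delta: treatment-by-covariate interaction coefficients.
    """
    p_treatment = sigmoid(alpha + gamma + dot(beta, x) + dot(delta, x))
    p_control = sigmoid(alpha + dot(beta, x))
    return p_treatment - p_control


# Hypothetical fitted coefficients for two covariates.
example = predicted_lift(x=[1.0, 0.5], alpha=-3.0, beta=[0.4, 0.2],
                         gamma=0.3, delta=[0.5, -0.1])
print(round(example, 4))
```

If both gamma and every element of delta are zero, the predicted lift is zero for every individual, so the interaction terms delta are what allow lift to vary with X.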
  • 15. Method 3: Four Quadrant Method
      • Model predicts the probability of being in one of four categories
      • Dependent variable outcome (nominal) = TR, CR, TN, or CN:

                            Response: Yes   Response: No
          Treatment: Yes    TR              TN
          Treatment: No     CR              CN

      • Model population = treatment and control groups together
      • Prediction of lift:
        \[ Z(x) = \frac{1}{2}\left[ \frac{P(TR \mid x)}{P(T)} + \frac{P(CN \mid x)}{P(C)} - \frac{P(TN \mid x)}{P(T)} - \frac{P(CR \mid x)}{P(C)} \right] \]
        Lai (2006), generalized by Kane, Lo, and Zheng (2014)
      • Pros: only one model required; more “success cases” to model after
      • Cons: not that intuitive
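The four-quadrant score is simple to evaluate once a multinomial model has produced the four class probabilities for an individual. The sketch below takes those probabilities as given; the input values are hypothetical, and P(C) is assumed to equal 1 - P(T).

```python
def four_quadrant_score(p_tr, p_cn, p_tn, p_cr, p_t):
    """Z(x) = 1/2 [P(TR|x)/P(T) + P(CN|x)/P(C) - P(TN|x)/P(T) - P(CR|x)/P(C)].

    p_tr, p_cn, p_tn, p_cr: predicted probabilities of the four outcome classes
    for one individual; p_t: overall share of the treatment group.
    """
    p_c = 1.0 - p_t  # assumption: everyone is either treatment or control
    return 0.5 * (p_tr / p_t + p_cn / p_c - p_tn / p_t - p_cr / p_c)


# Hypothetical predicted class probabilities with a 50-50 treatment/control split.
z = four_quadrant_score(p_tr=0.2, p_cn=0.4, p_tn=0.3, p_cr=0.1, p_t=0.5)
print(round(z, 4))
```

In this example P(R | T, x) = 0.2 / 0.5 = 0.4 and P(R | C, x) = 0.1 / 0.5 = 0.2, so the score of 0.2 coincides with the treatment-minus-control difference. That equality holds whenever treatment assignment is independent of x, as in a randomized campaign.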
  • 16. Gini and Top 15% Gini in Holdout Sample. Source: Kane, Lo, and Zheng (2014)
  • 17. Simulated Example: Charity Donation
      • 80-20% split between treatment and control
      • Randomly split into training (300K) and holdout (200K)
      • Predictors available:
          • Age of donor
          • Frequency: # times a donation was made in the past
          • Spent: average $ donation in the past
          • Recency: year of the last donation
          • Income
          • Wealth
  • 18. Holdout Sample Performance: Lift Chart on Simulated Data. Theoretical model: two logistics for treatment and control. [Chart: lift by semi-decile (1 to 20) for Baseline, Two model, Lo (2002), Four Quadrant (KLZ), and Random.]
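A lift chart of this kind can be built from a scored holdout sample. The sketch below assumes the plotted quantity is the treatment response rate minus the control response rate within each score-ranked bin; the function name and the eight sample records are hypothetical, and the example uses 2 bins instead of 20 semi-deciles to stay readable.

```python
def uplift_by_bin(scored, n_bins):
    """Observed uplift per bin of a holdout sample ranked by model score.

    scored: list of (score, treated, responded) with treated/responded in {0, 1}.
    Returns one value per bin (highest scores first): treatment response rate
    minus control response rate, or None if a bin lacks either group.
    """
    ranked = sorted(scored, key=lambda row: row[0], reverse=True)
    size = len(ranked) / n_bins
    result = []
    for b in range(n_bins):
        chunk = ranked[round(b * size):round((b + 1) * size)]
        treated = [y for _, t, y in chunk if t == 1]
        control = [y for _, t, y in chunk if t == 0]
        if not treated or not control:
            result.append(None)
        else:
            result.append(sum(treated) / len(treated) - sum(control) / len(control))
    return result


# Hypothetical scored holdout records.
holdout = [
    (0.9, 1, 1), (0.8, 0, 0), (0.7, 1, 1), (0.6, 0, 1),
    (0.4, 1, 0), (0.3, 0, 0), (0.2, 1, 1), (0.1, 0, 1),
]
print(uplift_by_bin(holdout, n_bins=2))
```

A model that ranks well on uplift should show high values in the first bins and values near zero (or negative) in the last ones.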
  • 19. Gains Chart on Simulated Data. [Chart: cumulative gains for Baseline, Random, Two Model Approach, Treatment Dummy Approach, and Four Quadrant (KLZ).]

                                                   Gini      Gini 15%   Gini repeatability (R^2)
        Baseline                                   5.6420    0.5412     0.7311
        Method 1: Two Model Approach               6.0384    0.7779     0.7830
        Method 2: Lo (2002), Treatment Dummy       6.0353    0.7766     0.7836
        Method 3: Four Quadrant Method (or KLZ)    5.9063    0.7484     0.7884
  • 20. Online Merchandise Data
      • From blog.minethatdata.com, with women’s merchandise online visit as response
      • 50-50% split between treatment and control (43K in total)
      • Randomly split into training (70%) and holdout (30%)
      • Predictors available:
          • Recency
          • Dollar spent last year
          • Merchandise purchased last year (men’s, women’s, both)
          • Urban, suburban, or rural
          • Channel: web, phone, or both for purchase last year
  • 21. Holdout Sample Performance: Lift Chart on Email Online Merchandise Data. [Chart: lift by semi-decile (1 to 20) for Baseline, Lo (2002) treatment dummy, Two model approach, Four Quadrant (KLZ), and Random.]
  • 22. Gains Chart on Email Online Merchandise Data. [Chart: cumulative gains for Baseline, Random, Two Model Approach, Treatment Dummy Approach, and Four Quadrant (KLZ).]

                                                   Gini      Gini 15%   Gini repeatability (R^2)
        Baseline                                   1.8556    -0.0240    0.2071
        Method 1: Two Model Approach               2.0074    0.0786     0.2941
        Method 2: Lo (2002), Treatment Dummy       2.4392    0.0431     0.2945
        Method 3: Four Quadrant Method (or KLZ)    2.3703    0.2288     0.3290
  • 23. Ideal Conditions for Uplift Modeling
      • A randomized control group is withheld!
      • Treatment does not cause all “responses,” i.e. control response rate > 0
      • Natural response is not highly correlated to lift
      • Lift signal-to-noise ratio (lift / control rate) is large enough
  • 24. Case I: Direct Response versus Uplift
  • 25. Direct Response vs. Uplift Modeling
      • Retailer couponing
      • E-mail click-through
  • 26. Decision Tree of Campaign and Customers

        Any Customer
          Treatment (T)
            Direct Response (D)
              Response (R)
            No Direct Response (D^c)
              Response (R)
              No Response (N)
          Control (C)
            Response (R)
            No Response (N)
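  • 27. Decision Tree of Campaign and Customers, with branch probabilities. Tree: Any Customer → Treatment (T) → Direct Response (D) → Response (R), or No Direct Response (D^c) → Response (R) or No Response (N); Any Customer → Control (C) → Response (R) or No Response (N). Under Treatment, the Direct Response branch has probability P(D | T, x) and the No Direct Response branch has probability 1 - P(D | T, x).
      \[ \mathrm{Lift}(x) = \underbrace{P(D \mid T, x) + P(R \mid T, D^c, x)\,[1 - P(D \mid T, x)]}_{P(R \mid T, x)} - P(R \mid C, x) \]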
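The lift formula on this slide combines three component probabilities, each of which would come from its own model. A minimal sketch of the combination step, with hypothetical input values and an illustrative function name:

```python
def lift_with_direct_response(p_direct, p_resp_no_direct, p_resp_control):
    """Lift(x) = P(D|T,x) + P(R|T,D^c,x) * [1 - P(D|T,x)] - P(R|C,x).

    p_direct:          P(D | T, x), probability of a direct response if treated
    p_resp_no_direct:  P(R | T, D^c, x), probability of responding without a
                       direct response, if treated
    p_resp_control:    P(R | C, x), probability of responding if not treated
    """
    # In the tree, a direct response always counts as a response.
    p_resp_treatment = p_direct + p_resp_no_direct * (1.0 - p_direct)
    return p_resp_treatment - p_resp_control


# Hypothetical component probabilities for one customer.
lift = lift_with_direct_response(p_direct=0.10, p_resp_no_direct=0.05,
                                 p_resp_control=0.06)
print(round(lift, 3))
```

Here P(R | T, x) = 0.10 + 0.05 × 0.90 = 0.145, so the lift is 0.145 - 0.06 = 0.085.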
  • 28. Holdout Sample Validation of Simulated Data. [Chart: lift in response rate by semi-decile (1 to 20) for direct response only, uplift, direct response + uplift, Baseline, and Random.]
  • 29. Story of A Mathematician
  • 30. Case II: Optimization of Multiple Treatments - From Predictive Analytics to Prescriptive Analytics
  • 31. From Random Selection to Optimization (two directions: improving targeting and improving treatment)
      1) Random Targeting: no targeting, single treatment, A (or B)
      2) Target Selection: individual-level targeting (model-based), treatment A (or B)
      3) One Size Fits All: no targeting, single best treatment for all individuals (best of A and B)
      4) Optimal Treatment for Each Individual: A or B
  • 32. Integer Program Formulation
      \[ \text{Maximize } \sum_{i=1}^{n} \sum_{j=1}^{m} \Delta p_{ij}\, x_{ij} \]
      subject to:
      \[ \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij}\, x_{ij} \le B \quad \text{(budget constraint)}, \]
      \[ \sum_{j=1}^{m} x_{ij} \le 1, \quad i = 1, \dots, n, \]
      \[ x_{ij} \in \{0, 1\}, \quad i = 1, \dots, n;\ j = 1, \dots, m, \]
      where \Delta p_{ij} = estimated lift value for individual i and treatment j; x_{ij} (decision variable) = 1 if treatment j is assigned to individual i and 0 otherwise; and c_{ij} = cost of promoting treatment j to i.
      E.g., if the size of the target population = 30 million and # treatment combinations = 10, then # decision variables = 300 million, and the total # possible combinations without constraints = 2^{300,000,000}!
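For a handful of individuals the integer program can be solved by plain enumeration, which also makes the combinatorial growth concrete. The sketch below is a brute-force illustration under made-up lift and cost values; it is not the solution method used in the presentation and would not scale beyond toy sizes.

```python
from itertools import product


def solve_small_assignment(lift, cost, budget):
    """Enumerate every assignment with at most one treatment per individual.

    lift[i][j] and cost[i][j] are the estimated lift and cost of giving
    treatment j to individual i. Returns (best objective value, plan), where
    plan[i] is the treatment index for individual i, or None for no treatment.
    """
    n, m = len(lift), len(lift[0])
    best_value, best_plan = 0.0, tuple([None] * n)
    # Each individual gets one of m treatments or nothing: (m + 1) ** n plans.
    for plan in product([None, *range(m)], repeat=n):
        total_cost = sum(cost[i][j] for i, j in enumerate(plan) if j is not None)
        if total_cost > budget:
            continue  # violates the budget constraint
        value = sum(lift[i][j] for i, j in enumerate(plan) if j is not None)
        if value > best_value:
            best_value, best_plan = value, plan
    return best_value, best_plan


# Hypothetical lifts for 3 individuals and 2 treatments, unit costs, budget of 2.
lift = [[0.05, 0.02], [0.01, 0.04], [0.03, 0.03]]
cost = [[1, 1], [1, 1], [1, 1]]
value, plan = solve_small_assignment(lift, cost, budget=2)
print(plan, round(value, 2))
```

Even with the one-treatment-per-individual constraint built into the enumeration, the number of plans is (m + 1)^n, which is why a heuristic is needed at the scale of millions of individuals.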
  • 33. A Heuristic Algorithm
      1. Perform cluster analysis of the m model-based lift scores in the holdout sample
      2. Compute the cluster-level lift score for each treatment, using sample mean differences
      3. Apply the cluster solution to new data (for a future marketing program)
      4. Solve a linear programming model to optimize treatment assignment at the cluster level
      Source: Lo and Pachamanova (2015)
  • 34. Becomes A Much Simpler Optimization Problem
      \[ \text{Maximize } \sum_{c=1}^{C} \sum_{j=1}^{m} \Delta p_{cj}\, x_{cj} \]
      subject to:
      \[ \sum_{c=1}^{C} \sum_{j=1}^{m} c_{j}\, x_{cj} \le \text{Budget} \quad \text{(budget constraint)}, \]
      \[ \sum_{j=1}^{m} x_{cj} \le N_c, \quad c = 1, \dots, C \quad \text{(cluster size constraint)}, \]
      \[ x_{cj} \ge 0, \quad c = 1, \dots, C;\ j = 1, \dots, m, \]
      where x_{cj} = # individuals in cluster c to receive treatment j, and c_j = cost of treatment j for each individual.
      Can be solved by Excel Solver
  • 35. Online Retail Example. Goal: optimization of men’s and women’s merchandise. A 10-cluster solution
  • 36. 36
  • 37. Linear Programming Solution from Excel Solver

        Cluster   Cluster size    Obs. lift in       Obs. lift in        Cost per        Decision var:     Decision var:       Total number
                  in new data     response: men's    response: women's   treatment ($)   number of men's   number of women's   treated by cluster
        Overall                   0.07408            0.043863
        1         4,180           0.1587             0.0224              1               4,180             -                   4,180
        2         5,650           0.0652             -0.0055             1               -                 -                   -
        4         60,220          0.0658             0.0628              1               2,340             -                   2,340
        5         12,370          0.1290             0.0618              1               12,370            -                   12,370
        6         8,940           0.0672             0.0760              1               -                 8,940               8,940
        7         29,240          0.0519             0.0213              1               -                 -                   -
        8         28,070          0.0868             0.0254              1               28,070            -                   28,070
        9         4,100           0.2249             0.0239              1               4,100             -                   4,100
        10        37,080          0.0572             0.0426              1               -                 -                   -
        Total     189,850
        Obj value                                                                        5,773             680                 6,453
        Cost                                                                             $51,060           $8,940              $60,000
        Budget                                                                                                                 $60,000
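When every treatment has the same unit cost, as in this table, the cluster-level linear program can be solved greedily: pick the better treatment in each cluster, then fill the budget in descending order of lift. The sketch below applies that shortcut to the cluster sizes and observed lifts read from the slide. The greedy shortcut is an assumption of this sketch (the slide reports using Excel Solver), and it would not be valid in general if treatments had different costs.

```python
# (cluster id, size in new data, observed lift men's, observed lift women's),
# transcribed from the slide's table; cluster 3 does not appear there.
CLUSTERS = [
    (1, 4180, 0.1587, 0.0224),
    (2, 5650, 0.0652, -0.0055),
    (4, 60220, 0.0658, 0.0628),
    (5, 12370, 0.1290, 0.0618),
    (6, 8940, 0.0672, 0.0760),
    (7, 29240, 0.0519, 0.0213),
    (8, 28070, 0.0868, 0.0254),
    (9, 4100, 0.2249, 0.0239),
    (10, 37080, 0.0572, 0.0426),
]
BUDGET = 60_000          # dollars, from the slide
COST_PER_TREATMENT = 1   # dollars, identical for both treatments on the slide


def greedy_allocation(clusters, budget, cost):
    """Greedy solution of the cluster-level LP, valid when all costs are equal."""
    candidates = []
    for cid, size, lift_men, lift_women in clusters:
        # With equal costs, only the better treatment in a cluster can be optimal.
        lift, treatment = max((lift_men, "men"), (lift_women, "women"))
        if lift > 0:
            candidates.append((lift, cid, size, treatment))
    candidates.sort(reverse=True)  # highest lift per dollar first

    allocation, remaining = {}, budget
    for lift, cid, size, treatment in candidates:
        n = min(size, remaining // cost)
        if n <= 0:
            break
        allocation[(cid, treatment)] = n
        remaining -= n * cost
    return allocation


allocation = greedy_allocation(CLUSTERS, BUDGET, COST_PER_TREATMENT)

lifts = {}
for cid, _, lift_men, lift_women in CLUSTERS:
    lifts[(cid, "men")] = lift_men
    lifts[(cid, "women")] = lift_women
objective = sum(lifts[key] * n for key, n in allocation.items())

print(allocation)
print(round(objective, 1))
```

The resulting allocation matches the decision variables on the slide, including the partial allocation of 2,340 men's treatments in cluster 4. The objective computed from the displayed lifts is about 6,451 rather than the slide's 6,453, presumably because the slide's lifts are rounded to four decimals.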
  • 38. Stochastic Optimization. Lift estimates can have a high degree of uncertainty; stochastic optimization solutions take the uncertainty into account:
      • Stochastic Programming
      • Robust Optimization
      • Mean Variance Optimization
  • 40. Conclusion
      • Uplift is a very impactful emerging subfield
      • Deserves more R&D
      • Extensions are plenty (Lo (2008)):
          • Multiple treatments
          • Optimization
          • Non-randomized experiments
          • Direct tracking
      • Applications in other fields
          • E.g. Potter (2013), Yong (2015)
  • 41. References
      Cai, T., Tian, L., Wong, P., and Wei, L.J. (2011), “Analysis of Randomized Comparative Clinical Trial Data for Personalized Treatment,” Biostatistics, 12:2, p.270-282.
      Collins, F.S. (2010), The Language of Life: DNA and the Revolution in Personalized Medicine, HarperCollins.
      Conrady, S. and Jouffe, L. (2011), “Causal Inference and Direct Effects,” Bayesia and Conrady Applied Science, at http://www.conradyscience.com/index.php/causality
      Freedman, D. (2010), Statistical Methods and Causal Inference. Cambridge.
      Hamburg, M.A. and Collins, F.S. (2010), “The path to personalized medicine,” The New England Journal of Medicine, 363;4, p.301-304.
      Haughton, D. and Haughton, J. (2011), Living Standards Analytics, Springer.
      Holland, C. (2005), Breakthrough Business Results with MVT, Wiley.
      Kane, K., Lo, V.S.Y., and Zheng, J. (2014), “Mining for the Truly Responsive Customers and Prospects Using True-Lift Modeling: Comparison of New and Existing Methods,” Journal of Marketing Analytics, v.2, Issue 4, p.218-238.
      Lai, Lilly Y.-T. (2006), Influential Marketing: A New Direct Marketing Strategy Addressing the Existence of Voluntary Buyers. Master of Science thesis, Simon Fraser University School of Computing Science, Burnaby, BC, Canada.
      Lo, V.S.Y. (2002), “The True Lift Model – A Novel Data Mining Approach to Response Modeling in Database Marketing,” SIGKDD Explorations 4, Issue 2, p.78-86, at: http://www.acm.org/sigs/sigkdd/explorations/issues/4-2-2002-12/lo.pdf
      Lo, V.S.Y. (2008), “New Opportunities in Marketing Data Mining,” in Encyclopedia of Data Warehousing and Mining, Wang (2008) ed., 2nd edition, Idea Group Publishing.
      Lo, V.S.Y. and D. Pachamanova (2015), “A Practical Approach to Treatment Optimization While Accounting for Estimation Risk,” Technical Report.
      McKinney, R.E. et al. (1998), “A randomized study of combined zidovudine-lamivudine versus didanosine monotherapy in children with symptomatic therapy-naïve HIV-1 infection,” J. of Pediatrics, 133, no.4, p.500-508.
      Mehr, I.J. (2000), “Pharmacogenomics and Industry Change,” Applied Clinical Trials, 9, no.5, p.34,36.
      Morgan, S.L. and Winship, C. (2007), Counterfactuals and Causal Inference. Cambridge University Press.
      Pearl, J. (2000), Causality. Cambridge University Press.
      Potter, Daniel (2013), Pinpointing the Persuadables: Convincing the Right Voters to Support Barack Obama. Presented at Predictive Analytics World; Oct, Boston, MA; http://www.predictiveanalyticsworld.com/patimes/pinpointing-the-persuadables-convincing-the-right-voters-to-support-barack-obama/ (available with free subscription).
      Radcliffe, N.J. and Surry, P. (1999), “Differential response analysis: modeling true response by isolating the effect of a single action,” Proceedings of Credit Scoring and Credit Control VI, Credit Research Centre, U. of Edinburgh Management School.
      Radcliffe, N.J. (2007), “Using Control Groups to Target on Predicted Lift,” DMA Analytics Annual Journal, Spring, p.14-21.
      Robins, J.M. and Hernan, M.A. (2009), “Estimation of the Causal Effects of Time-Varying Exposures,” in Fitzmaurice, G., Davidian, M., Verbeke, G., and Molenberghs, G. eds. (2009), Longitudinal Data Analysis, Chapman & Hall/CRC, p.553-599.
      Rosenbaum, P.R. (2002), Observational Studies. Springer.
      Rosenbaum, P.R. (2010), Design of Observational Studies. Springer.
      Rubin, D.B. (2006), Matched Sampling for Causal Effects. Cambridge University Press.
      Rubin, D.B. (2008), “For Objective Causal Inference, Design Trumps Analysis,” The Annals of Applied Statistics, p.808-840.
      Rubin, D.B. and Waterman, R.P. (2006), “Estimating the Causal Effects of Marketing Interventions Using Propensity Score Methodology,” Statistical Science, p.206-222.
      Russek-Cohen, E. and Simon, R.M. (1997), “Evaluating treatments when a gender by treatment interaction may exist,” Statistics in Medicine, 16, issue 4, p.455-464.
      Signorovitch, J. (2007), “Estimation and Evaluation of Regression for Patient-Specific Efficacy,” Harvard School of Public Health working paper.
      Spirtes, P., Glymour, C., and Scheines, R. (2000), Causation, Prediction, and Search, 2nd edition, MIT Press.
      Wikipedia (2010), “Uplift Modeling,” at http://en.wikipedia.org/wiki/Uplift_modelling
      Yong, Florence H. (2015), “Quantitative Methods for Stratified Medicine,” PhD Dissertation, Department of Biostatistics, Harvard T.H. Chan School of Public Health.