Monte Carlo Simulations in Ad-Lift Measurement Using Spark by Prasad Chalasani and Ram Sriharsha

Spark Summit East Talk

  3. 3. Listen to ~100 billion ad opportunities daily; respond with optimal bids within milliseconds; petabytes of data (ad impressions, visits, clicks, conversions)
  5. 5. Predicting user response to ads is a Machine-Learning problem, but quantifying the impact of ad exposure is a Measurement problem.
  7. 7. Spark: existing vs simulated data
      - Most Spark applications process existing big data-sets.
      - Today we're talking about analyzing simulated big data.
  15. 15. Key Conceptual Take-aways
      - Issues in ad lift measurement
      - Proper definition
      - Confidence bounds
      - Bayesian methods for ad lift confidence bounds
      - Gibbs sampling (MCMC: Markov Chain Monte Carlo)
      - Using Spark for:
          - Monte Carlo sampling for confidence bounds
          - Monte Carlo simulations
  16. 16. Application context: ad impact measurement
      - Advertisers want to know the impact of showing ads to users.
  21. 21. Measuring Ad Impact: Two Approaches
      - Observational studies:
          - Compare users who happen to be exposed vs not exposed
          - Bias is a big issue
      - Randomized tests:
          - Randomly expose to test, compare with control (un-exposed)
  22. 22. Ideal Randomized Test
  25. 25. Ideal Randomized Test: Ad lift
  29. 29. Ad Lift: Response Rates
      - If we see $k = 200$ conversions out of $N = 10{,}000$ users, what is a good estimate for the response rate?
      - Estimated response rate $\hat{R} = k/N = 200/10{,}000 = 2\%$ ...
      - But how confident are we?
  32. 32. Response Rate 90% Confidence Bounds
      - $P(R > \hat{R} \mid r = q_5) = 5\%$
      - $P(R < \hat{R} \mid r = q_{95}) = 5\%$
  36. 36. Response-Rate Confidence Bounds
      - How to find $(q_5, q_{95})$?
  39. 39. Response-Rate: Bayesian Confidence Bounds
      - Randomly generate response rates that are consistent with the data. (Sample rates from the posterior distribution given the data.)
      - Find the (0.05, 0.95) quantiles of these rates.
  45. 45. Response-Rate: Bayesian Confidence Bounds
      - Assume an unknown true rate $r$, with a prior distribution $p(r)$
          - assume $p(r) = \mathrm{Beta}(1, 1) = \mathrm{Unif}(0, 1)$
      - Sample from the posterior distribution of the rate $r$
          - conditional on the observed data ($k$ conversions out of $N$):
            $P(r \mid k) \propto P(k \mid r) \cdot p(r) \propto r^{k} (1 - r)^{N-k} \cdot \mathrm{Beta}(1, 1) \propto r^{(k+1)-1} (1 - r)^{(N-k+1)-1} \propto \mathrm{Beta}(k + 1, N - k + 1)$
      - Compute the (0.05, 0.95) quantiles from the generated rates.
  47. 47. Response-Rate: Bayesian Confidence Bounds
      - A simple form of Gibbs Sampling (more later):
          - sample $M$ values of $r$ from the posterior $P(r \mid k) \sim \mathrm{Beta}(k + 1, N - k + 1)$
          - compute the (0.05, 0.95) quantiles

          from numpy.random import beta
          from scipy.stats.mstats import mquantiles

          def conf(N, k, samples=500):
              # draw posterior samples of the rate, then take the 5% and 95% quantiles
              rates = beta(k + 1, N - k + 1, samples)
              return mquantiles(rates, prob=[0.05, 0.95])
  48. 48. Response-Rate: Bayesian Confidence Bounds
  54. 54. Response Rates: Example
      - If we see $k = 200$ conversions out of $N = 10{,}000$ users, what is a good estimate for the response rate?
      - Estimated response rate $\hat{R} = k/N = 200/10{,}000 = 2\%$ ...
      - $\Rightarrow$ 90% confidence region (1.8%, 2.2%)
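The 90% region quoted for this example can be cross-checked without sampling, because the posterior under the uniform prior is Beta(k + 1, N - k + 1) and its quantiles are available in closed form. A minimal sketch, assuming SciPy is installed; the function name `beta_bounds` is ours and does not come from the talk:

```python
from scipy.stats import beta

def beta_bounds(N, k, lo=0.05, hi=0.95):
    """Posterior quantiles of the rate under a uniform Beta(1, 1) prior."""
    posterior = beta(k + 1, N - k + 1)  # frozen Beta(k + 1, N - k + 1)
    return posterior.ppf(lo), posterior.ppf(hi)

# Numbers from the slide: k = 200 conversions out of N = 10,000 users.
q5, q95 = beta_bounds(N=10_000, k=200)
print(f"90% region: ({q5:.2%}, {q95:.2%})")
```

The result should sit close to the region on the slide; the sampled version differs from it only by Monte Carlo noise.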
  55. 55. We've talked about Response Rates... now let's consider Ad Lift.
  59. 59. Ad Lift: Simple Example
      - control: 10,000 users, 200 conversions
      - test: 100,000 users, 2,200 conversions
      - Observed response rates:
          - control: $\hat{R}_c = 200/10{,}000 = 2\%$
          - test: $\hat{R}_t = 2{,}200/100{,}000 = 2.2\%$
      - Estimated lift $\hat{L} = 2.2/2 - 1 = 10\%$
      - This is a great lift!
      - Not so fast! Is this a reliable estimate? Could the true lift $\ell$ be 0%, or even negative?
  64. 64. Ad Lift: Bayesian Confidence Bounds
      - Sampling approach, given observed data: control $(k_c, N_c)$, test $(k_t, N_t)$
          1. Repeat $M$ times:
              - draw control response rate $r_c$ from the posterior $P(r_c \mid k_c) \sim \mathrm{Beta}(k_c + 1, N_c - k_c + 1)$
              - draw test response rate $r_t$ from the posterior $P(r_t \mid k_t) \sim \mathrm{Beta}(k_t + 1, N_t - k_t + 1)$
              - compute lift $L = r_t / r_c - 1$
          2. Compute the (0.05, 0.95) quantiles of the set of $M$ lifts $\{L\}$.
  66. 66. Ad Lift: Bayesian Confidence Intervals
      - control: $N_c = 10{,}000$ users, $k_c = 200$ conversions
      - test: $N_t = 100{,}000$ users, $k_t = 2{,}200$ conversions
      - Observed response rates:
          - control: $\hat{R}_c = 200/10{,}000 = 2\%$
          - test: $\hat{R}_t = 2{,}200/100{,}000 = 2.2\%$
      - Estimated lift $\hat{L} = 2.2/2 - 1 = 10\%$
      - 90% confidence interval: $(-2.7\%, 23.6\%)$
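The two-step sampling approach for lift can be sketched as follows. This is an illustration under the same uniform-prior assumption as the response-rate slides; the function name `lift_bounds`, the default `M`, and the fixed seed are our choices, not the authors':

```python
import numpy as np

def lift_bounds(kc, Nc, kt, Nt, M=100_000, seed=0):
    """5% and 95% quantiles of the lift rt/rc - 1 from M posterior draws."""
    rng = np.random.default_rng(seed)
    rc = rng.beta(kc + 1, Nc - kc + 1, size=M)  # control rate draws
    rt = rng.beta(kt + 1, Nt - kt + 1, size=M)  # test rate draws
    lifts = rt / rc - 1.0                       # one lift per paired draw
    return np.quantile(lifts, [0.05, 0.95])

# Numbers from the slide: control 200 / 10,000, test 2,200 / 100,000.
lo, hi = lift_bounds(kc=200, Nc=10_000, kt=2_200, Nt=100_000)
print(f"90% interval for lift: ({lo:.1%}, {hi:.1%})")
```

The interval should land near the one quoted on the slide, with small differences depending on the seed and `M`. Most of its width comes from the smaller control group.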
  67. 67. Complication 1: Auction win-bias
  71. 71. Ideal Randomized Test
      - Bids on control users are wasted!
  72. 72. A Less Wasteful Randomized Test
  73. 73. A Less Wasteful Randomized Test: Win-bias
      - Cannot simply compare Test Winners (tw) and Control (c):
          - test-winners selection bias: "win bias"
  74. 74. Ad Lift: Proper Definition
  84. 84. Ad Lift Estimation
      - Main ideas:
          - observe test-losers response rate $R_{tL}$
          - observe test win-rate $w$
          - we show one can estimate $R^0_{tw} = \dfrac{R_c - (1 - w) R_{tL}}{w}$
          - compute lift $L = R^1_{tw} / R^0_{tw} - 1$
          - similar to Treatment Effect Under Non-compliance in clinical trials.
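One way to read the estimate above: if randomization makes the control group a mix of would-be winners (fraction $w$) and would-be losers, then $R_c = w R^0_{tw} + (1 - w) R_{tL}$, which rearranges to the formula for $R^0_{tw}$. This reading is an assumption on our part, since the slides that define $R^0_{tw}$ and $R^1_{tw}$ are figures that did not survive extraction; we take $R^1_{tw}$ to be the observed response rate of exposed test winners and $R^0_{tw}$ their unexposed counterfactual. The function name and the input numbers below are hypothetical:

```python
def lift_with_win_bias(r_c, r_tl, r1_tw, w):
    """Lift of test winners, correcting the control rate for win bias.

    r_c   : control response rate
    r_tl  : test-losers response rate
    r1_tw : test-winners response rate (exposed)
    w     : test win-rate, must be in (0, 1]
    """
    if not 0.0 < w <= 1.0:
        raise ValueError("win-rate w must be in (0, 1]")
    # Counterfactual (unexposed) response rate of would-be winners.
    r0_tw = (r_c - (1.0 - w) * r_tl) / w
    return r1_tw / r0_tw - 1.0

# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example = lift_with_win_bias(r_c=0.02, r_tl=0.01, r1_tw=0.033, w=0.5)
print(f"lift = {example:.1%}")
```

With these made-up inputs, comparing winners directly against control (0.033 / 0.02 - 1) would give a much larger figure than the corrected lift. That gap is the win bias.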
  85. 85. Ad Lift Estimation
      - How to compute the 90% confidence interval for $L$?
  93. 93. Ad Lift: Confidence Intervals with Gibbs sampler
      - Bayesian approach (details omitted, see Chickering/Pearl 1997):
          - Assume a random parameter vector $\theta$ consisting of:
              - user latent (potential) behaviors
              - their probabilities
          - Set up a prior distribution on $\theta \sim p(\theta)$ (Dirichlet)
          - Sample $M$ values of the unknown $\theta$ from the posterior with a Gibbs sampler: $P(\theta \mid \mathrm{Data}) \propto P(\mathrm{Data} \mid \theta) \cdot p(\theta)$
          - For each sampled $\theta$, compute the lift $L$ using the above
          - Compute the (0.05, 0.95) quantiles of the sampled $L$ values
  95. 95. Ad Lift: Confidence Intervals
      - Gibbs sampler convergence may depend on the prior distribution:
          - start with multiple (say 100) priors
          - run them all in parallel using Spark.
  101. 101. Uses of Monte Carlo Simulations
      - confidence intervals
      - determine "sufficient" population sizes for reliably estimating
          - response rates
          - lift
      - understand the effect of complex phenomena
      - validate/verify analytical formulas
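As an illustration of the population-size use case, one can simulate many experiments at a known true rate and watch how tightly the estimated rate clusters as $N$ grows. A minimal sketch; the function name `relative_spread`, the trial count, and the 2% true rate are assumptions made for the example:

```python
import numpy as np

def relative_spread(N, rate, trials=20_000, seed=0):
    """Half-width of the central 90% band of k/N, relative to the true rate."""
    rng = np.random.default_rng(seed)
    # Each trial is one simulated experiment with N users.
    estimates = rng.binomial(N, rate, size=trials) / N
    q5, q95 = np.quantile(estimates, [0.05, 0.95])
    return (q95 - q5) / (2 * rate)

spreads = {N: relative_spread(N, rate=0.02) for N in (10_000, 100_000, 1_000_000)}
for N, s in spreads.items():
    print(f"N = {N:>9,}: estimate within about +/-{s:.1%} of the true rate")
```

Reading off the smallest $N$ whose spread is below a chosen tolerance gives a rough "sufficient" population size for that rate.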
  102. 102. Complication 2: Control contamination due to users with multiple cookies
  103. 103. Control Contamination due to Multiple Cookies
  112. 112. Cookie-Contamination Questions
      - How does cookie contamination affect measured lift?
      - Does the cookie distribution matter?
          - everyone has $k$ cookies vs an average of $k$ cookies
      - What is the influence of the control percentage?
      - Simulations are the best way to understand this
  113. 113. Simulations for cookie-contamination
      - A scenario is a combination of parameters:
          - $M$ = # trials for this scenario, usually 10K-1M
          - $n$ = # users, typically 10K-10M
          - $p$ = control percentage (usually 10-50%)
          - $k$ = cookie distribution, expressed as 1 : 100, or 1 : 70, 3 : 30
          - $r$ = (un-contaminated) control user response rate
          - $a$ = true lift, i.e. exposed user response rate = $r \cdot (1 + a)$
      - A scenario file specifies a scenario in each row.
          - could be thousands of scenarios
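A single scenario of this kind might be simulated as below. The contamination model is our assumption, not the authors' code: each cookie is assigned to control independently with probability `p`, a user counts as exposed if at least one of their cookies is in test, a conversion is credited to every cookie of the converting user, and lift is measured at the cookie level. The cookie distribution is passed as a dict such as `{1: 70, 3: 30}` for "1 : 70, 3 : 30". Function names are hypothetical.

```python
import numpy as np

def measured_lift(n, p, cookie_dist, r, a, rng):
    """One trial: cookie-level measured lift under the assumed model."""
    counts = np.array(list(cookie_dist.keys()))
    weights = np.array(list(cookie_dist.values()), dtype=float)
    cookies = rng.choice(counts, size=n, p=weights / weights.sum())
    control = rng.binomial(cookies, p)   # control cookies per user
    test = cookies - control             # test cookies per user
    exposed = test > 0                   # any test cookie exposes the user
    converted = rng.random(n) < np.where(exposed, r * (1 + a), r)
    rate_test = (converted * test).sum() / test.sum()
    rate_control = (converted * control).sum() / control.sum()
    return rate_test / rate_control - 1.0

def run_scenario(M, n, p, cookie_dist, r, a, seed=0):
    """Average measured lift over M independent trials of one scenario."""
    rng = np.random.default_rng(seed)
    return float(np.mean([measured_lift(n, p, cookie_dist, r, a, rng)
                          for _ in range(M)]))

clean = run_scenario(M=20, n=200_000, p=0.5, cookie_dist={1: 100}, r=0.1, a=0.5)
contaminated = run_scenario(M=20, n=200_000, p=0.5,
                            cookie_dist={1: 70, 3: 30}, r=0.1, a=0.5)
print(f"one cookie per user: {clean:.1%}; mixed cookie counts: {contaminated:.1%}")
```

Under this model the single-cookie scenario recovers roughly the true lift `a`, while multi-cookie users pull the measured lift down. Because scenarios are independent, a list of them could presumably be distributed with something like `sc.parallelize(scenarios).map(lambda s: run_scenario(**s)).collect()`, assuming an existing SparkContext named `sc`.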
  114. 114. Scenario Simulations in Spark
