Matching as an Alternative to A/B testing

Published in: Technology, Economy & Finance
  1. Matching as an Alternative to A/B Testing
     Christoph Safferling, Head of Game Analytics, Ubisoft Blue Byte
     Games Industry Analytics Forum, May 9th, 2013
  2. Self-selection in games
     • in games, we routinely change things and want to test whether the change was successful
         • game changes: quest changes, introducing new items, etc.
         • shop configurations: number of items, allocation, prices, etc.
         • ...and many more examples!
     • players self-select into the group that maximises their utility (fun)
     • most game variables are the result of a player's decision: exogeneity, E[ε|X] = 0, is (usually) not given
  3. Treatment effects
     • test the outcome of a treatment effect:
         E[Y|X, D = 1] − E[Y|X, D = 0] = E[Y(1) − Y(0)|X]
       with Y as the outcome, X as the observable data, and D as the treatment dummy
     • we are interested in the average treatment effect on the treated:
         ATT = E[Y(1) − Y(0)|D = 1]
             = E[Y(1)|D = 1] − E[Y(0)|D = 1]
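To make the ATT definition concrete, here is a minimal sketch on a hypothetical toy population in which both potential outcomes are known (which is never the case in real data). The field names and the selection rule (`level >= 6`) are assumptions made for illustration only. It shows how a naive comparison of treated and untreated players can differ from the ATT when players self-select.

```python
from statistics import mean

# Hypothetical toy data: 10 players with levels 1..10.
# Y(0) = 2 * level is the outcome without treatment, the treatment adds exactly 1,
# and only high-level players (level >= 6) select into treatment.
players = []
for level in range(1, 11):
    y0 = 2.0 * level
    y1 = y0 + 1.0
    d = 1 if level >= 6 else 0
    players.append({"level": level, "y0": y0, "y1": y1, "d": d})


def true_att(rows):
    """ATT = E[Y(1) - Y(0) | D = 1]; needs the unobservable Y(0) of the treated."""
    return mean(r["y1"] - r["y0"] for r in rows if r["d"] == 1)


def naive_difference(rows):
    """E[Y | D = 1] - E[Y | D = 0], using only the outcomes one would observe."""
    treated = mean(r["y1"] for r in rows if r["d"] == 1)
    control = mean(r["y0"] for r in rows if r["d"] == 0)
    return treated - control


att = true_att(players)
naive = naive_difference(players)
print("ATT:", att, "naive difference:", naive)
```

In this constructed example the naive difference is far larger than the ATT, because the treated players would have had higher outcomes even without the treatment.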
  4. • E[Y(0)|D = 1] is a counterfactual: unobservable
     • proper control groups (A/B testing!) provide a consistent estimator
     • sometimes, A/B testing is not available/feasible
     • (one) different econometric modeling strategy: the matching estimator
     • reproduce the treatment group among the non-treated: find individuals who differ only in their treatment and their outcomes ("statistical twins")
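The "statistical twins" idea can be sketched with exact matching on a few discrete covariates. This is a minimal illustration, not the estimator used in the talk; the data and the column names (`guild`, `level_band`, `d`, `y`) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def exact_matching_att(rows, keys):
    """Compare each treated row with the mean outcome of untreated rows
    that have identical values on all covariates in `keys`."""
    controls = defaultdict(list)
    for r in rows:
        if r["d"] == 0:
            controls[tuple(r[k] for k in keys)].append(r["y"])

    effects = []
    for r in rows:
        if r["d"] == 1:
            twins = controls.get(tuple(r[k] for k in keys))
            if twins:  # treated rows without a twin are dropped
                effects.append(r["y"] - mean(twins))

    att = mean(effects) if effects else None
    return att, len(effects)


# Hypothetical observations: d is the treatment dummy, y the observed outcome.
rows = [
    {"guild": 1, "level_band": "high", "d": 1, "y": 12.0},
    {"guild": 1, "level_band": "high", "d": 0, "y": 10.0},
    {"guild": 1, "level_band": "high", "d": 0, "y": 8.0},
    {"guild": 0, "level_band": "low", "d": 1, "y": 5.0},
    {"guild": 0, "level_band": "low", "d": 0, "y": 4.0},
    {"guild": 0, "level_band": "high", "d": 1, "y": 9.0},  # no untreated twin
]

att_hat, n_matched = exact_matching_att(rows, ["guild", "level_band"])
print("estimated ATT:", att_hat, "matched treated rows:", n_matched)
```

The untreated twins stand in for the unobservable E[Y(0)|D = 1]; treated rows without a twin contribute nothing to the estimate.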
  5. Assumptions and problems
     • Conditional Independence Assumption: given X, we assume the outcome Y to be independent of the treatment D
       → conditional on observed characteristics, selection bias is removed
     • Common Support is given: 0 < P(D = 1|X) < 1
       → we exclude unmatched observations
     • Curse of Dimensionality: increasing X improves the matching quality, but makes matching more difficult!
       → e.g. for continuous variables: P(X1 = x) = 0
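The common support condition and the curse of dimensionality can be illustrated together. The sketch below estimates P(D = 1|X) as the treated share within each covariate cell and keeps only the rows whose cell satisfies 0 < P < 1. The data and column names are hypothetical, and a cell frequency is only one simple way to estimate the propensity.

```python
from collections import defaultdict


def cell_propensities(rows, keys):
    """Estimate P(D = 1 | X = x) as the treated share within each covariate cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_treated, n_total]
    for r in rows:
        cell = tuple(r[k] for k in keys)
        counts[cell][0] += r["d"]
        counts[cell][1] += 1
    return {cell: t / n for cell, (t, n) in counts.items()}


def on_common_support(rows, keys):
    """Keep rows whose cell contains both treated and untreated observations."""
    p = cell_propensities(rows, keys)
    return [r for r in rows if 0.0 < p[tuple(r[k] for k in keys)] < 1.0]


# Hypothetical players: guild membership, exact level, treatment dummy.
rows = [
    {"guild": 1, "level": 10, "d": 1},
    {"guild": 1, "level": 10, "d": 0},
    {"guild": 1, "level": 20, "d": 1},
    {"guild": 1, "level": 21, "d": 0},
    {"guild": 0, "level": 5, "d": 1},
    {"guild": 0, "level": 5, "d": 0},
    {"guild": 0, "level": 7, "d": 0},
    {"guild": 0, "level": 30, "d": 1},
]

kept_coarse = on_common_support(rows, ["guild"])
kept_fine = on_common_support(rows, ["guild", "level"])
print(len(kept_coarse), "rows kept with one covariate,",
      len(kept_fine), "rows kept with two")
```

Adding the finer covariate makes the remaining matches more alike, but fewer observations survive, which is the trade-off named on the slide.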
  6. Several matching algorithms
     • one-to-one matching estimators
         • with/without replacement
         • nearest-neighbour
         • within-caliper
     • smoothed matching estimators
         • k-nearest neighbour
         • radius matching
     • weighted smoothed matching estimators
         • kernel smoothing
         • local linear regression smoothing
     • Mahalanobis distance matching
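As one instance of the algorithms listed above, here is a minimal sketch of one-to-one nearest-neighbour matching with replacement and a caliper, on a single continuous covariate. The function name, the data, and the caliper value are assumptions for illustration; packages such as psmatch2 offer many more options.

```python
from statistics import mean


def nn_caliper_att(treated, controls, caliper):
    """One-to-one nearest-neighbour matching with replacement.

    treated, controls: lists of (x, y) pairs, x being the matching covariate.
    Treated units whose nearest control is farther away than `caliper` are dropped.
    """
    effects = []
    for x_t, y_t in treated:
        # With replacement: every treated unit searches the full control pool.
        x_c, y_c = min(controls, key=lambda c: abs(c[0] - x_t))
        if abs(x_c - x_t) <= caliper:
            effects.append(y_t - y_c)
    att = mean(effects) if effects else None
    return att, len(effects)


# Hypothetical (covariate, outcome) pairs.
treated = [(10.0, 7.0), (20.0, 12.0), (40.0, 30.0)]
controls = [(9.0, 5.0), (19.0, 9.0), (22.0, 11.0), (30.0, 15.0)]

att_caliper, n_caliper = nn_caliper_att(treated, controls, caliper=2.0)
att_any, n_any = nn_caliper_att(treated, controls, caliper=float("inf"))
print("with caliper:", att_caliper, n_caliper)
print("without caliper:", att_any, n_any)
```

Without a caliper the treated unit at x = 40 is matched to a control at x = 30, a poor twin; the caliper trades sample size for match quality.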
  7. http://xkcd.com/800/
  8. Zeropayments in TSO Russia
     • payment conversion in TSO RU was low
     • one explanation: the payment process is "scary"
     • "zeropayments" guide the player through the payment process, offering a small reward for completing a fake payment
  9. Results of the treatment
     reference: lifetime pay-to-active TSO RU               a
     paid at least once in addition to the zeropayment      5.9a
     paid after their zeropayment                           3.5a
     paid after their zeropayment, not paid before          1.6a
  10. Matching results (tobit)
                          (1)          (2)          (5)          (6)
                          tobit full   tobit2 full  tobit cem    tobit2 cem
      had zero payments   7.376        19.71        -356.3       -350.1
                          (0.974)      (0.931)      (0.270)      (0.276)
      level               315.3∗∗      354.1∗∗      674.4        696.4
                          (0.007)      (0.000)      (0.177)      (0.179)
      level squared       -0.796       -1.441       -9.274       -9.635
                          (0.709)      (0.416)      (0.291)      (0.289)
      uniqueLogins        -26.27∗∗     -28.22∗∗     -33.35       -34.78
                          (0.018)      (0.007)      (0.199)      (0.204)
      rating for week     -407.0†      -400.7†      39.74        42.50
                          (0.076)      (0.076)      (0.915)      (0.908)
      guild               647.9∗∗      651.2∗∗      639.6        627.8
                          (0.012)      (0.011)      (0.388)      (0.400)
      age                 53.18∗∗      52.37∗∗      185.4        171.8
                          (0.024)      (0.025)      (0.264)      (0.288)
      (additional controls, including intercept)
      N                   12376        19522        4114         6894
      pseudo R2           0.162        0.189        0.139        0.158
      p-values in parentheses
  11. Matching results (zero-inflated negbin)
                          (1)          (2)          (5)          (6)
                          zinb full    zinb2 full   zinb cem     zinb2 cem
      had zero payments   0.111        0.110        0.540∗∗      0.538∗∗
                          (0.463)      (0.466)      (0.005)      (0.006)
      level               0.148∗∗      0.150∗∗      -0.153       -0.255†
                          (0.012)      (0.010)      (0.332)      (0.096)
      level squared       -0.00211∗∗   -0.00213∗∗   0.00429      0.00617∗∗
                          (0.036)      (0.032)      (0.155)      (0.035)
      uniqueLogins        -0.0180∗∗    -0.0180∗∗    -0.0308∗∗    -0.0310∗∗
                          (0.007)      (0.006)      (0.005)      (0.005)
      rating for week     0.747∗∗      0.748∗∗      1.662∗∗      1.653∗∗
                          (0.000)      (0.000)      (0.000)      (0.000)
      guild               -0.112       -0.112       0.280        0.297
                          (0.319)      (0.319)      (0.286)      (0.264)
      age                 0.0383∗∗     0.0383∗∗     0.119        0.192†
                          (0.012)      (0.012)      (0.308)      (0.096)
      (additional controls, including intercept and inflate regression)
      N                   12376        19522        4114         6894
      p-values in parentheses
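The "cem" columns in the two results tables refer to samples produced by coarsened exact matching. The sketch below illustrates the idea only: it coarsens a continuous variable into bins, matches exactly on the resulting strata, drops strata without both groups, and averages within-stratum differences weighted by the number of treated. The bin width, column names, and data are assumptions; the tobit and zero-inflated negative binomial regressions from the slides are not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def coarsen(level, width=10):
    """Map a level to a bin index, e.g. 0-9 -> 0, 10-19 -> 1 (assumed bin width)."""
    return level // width


def cem_att(rows, width=10):
    """Coarsened exact matching on (level bin, guild), then a simple ATT estimate."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for r in rows:
        key = (coarsen(r["level"], width), r["guild"])
        strata[key][r["d"]].append(r["y"])

    # Keep only strata that contain both treated and untreated rows.
    matched = {k: g for k, g in strata.items() if g[0] and g[1]}
    n_treated = sum(len(g[1]) for g in matched.values())
    if n_treated == 0:
        return None, 0

    # Within-stratum difference in means, weighted by the number of treated.
    att = sum(len(g[1]) * (mean(g[1]) - mean(g[0]))
              for g in matched.values()) / n_treated
    n_kept = sum(len(g[0]) + len(g[1]) for g in matched.values())
    return att, n_kept


# Hypothetical players: level, guild membership, treatment dummy, outcome.
rows = [
    {"level": 12, "guild": 1, "d": 1, "y": 10.0},
    {"level": 15, "guild": 1, "d": 0, "y": 6.0},
    {"level": 18, "guild": 1, "d": 0, "y": 8.0},
    {"level": 34, "guild": 0, "d": 1, "y": 6.0},
    {"level": 31, "guild": 0, "d": 1, "y": 8.0},
    {"level": 38, "guild": 0, "d": 0, "y": 4.0},
    {"level": 55, "guild": 1, "d": 1, "y": 20.0},  # stratum has no untreated row
    {"level": 3, "guild": 0, "d": 0, "y": 1.0},    # stratum has no treated row
]

att_cem, n_kept = cem_att(rows)
print("estimated ATT:", att_cem, "rows kept:", n_kept, "of", len(rows))
```

Dropping unmatched strata shrinks the sample, in the same way that N is smaller in the cem columns than in the full columns of the tables.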
  12. Further reading
      • Rosenbaum, P. R., Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 (1), pp. 41-55.
      • Heckman, J. J., Ichimura, H., Todd, P. (1997). Matching as an Econometric Evaluation Estimator: Evidence From Evaluating a Job Training Programme. Review of Economic Studies 64, pp. 605-54.
      • Angrist, J. D., Krueger, A. B. (1999). Empirical Strategies in Labor Economics. pp. 1277-1366 in Handbook of Labor Economics, vol. 3, edited by O. C. Ashenfelter and D. Card. Amsterdam: Elsevier.
      • Blackwell, M., Iacus, S., King, G., Porro, G. (2009). cem: Coarsened exact matching in Stata. Stata Journal 9 (4), pp. 524-546.
      • Iacus, S., King, G., Porro, G. (June 2008). Matching for causal inference without balance checking. UNIMI – Research Papers in Economics, Business, and Statistics 1073, Università degli Studi di Milano.
      • Lechner, M. (2002). Some practical issues in the evaluation of heterogeneous labour market programmes by matching methods. Journal of the Royal Statistical Society, Series A, 165, pp. 59-82.
      • Leuven, E., Sianesi, B. (April 2003). Psmatch2: Stata module to perform full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing. S432001 Statistical Software Components, Boston College Department of Economics.
