
Causal inference in online systems: Methods, pitfalls and best practices

9,273 views


From recommending what to buy, which movies to watch, to selecting the news to read, people to follow and jobs to apply for, online systems have become an important part of our daily lives. A natural question to ask is how these socio-technical systems impact our behavior. However, because of the intricate interplay between the outputs of these systems and people's actions, identifying their impact on people's behavior is non-trivial.

Fortunately, there is a rich body of work on causal inference that we can build on. In the first part of the tutorial, I will show the value of counterfactual reasoning for studying socio-technical systems, by demonstrating how predictive modeling based on correlations can be counterproductive. Then, we will discuss different approaches to causal inference, including randomized experiments, natural experiments such as instrumental variables and regression discontinuities, and observational methods such as stratification and matching. Throughout, we will try to make connections with graphical models, machine learning and past work in the social sciences.

The second half will be more hands-on. We will work through a practical example of estimating the causal impact of a recommender system, starting from simple to more complex methods. The goal of the practical exercise will be to appreciate the pitfalls in different approaches to causal reasoning and take away best practices for doing causal inference with messy, real-world data.

Code used is available at: https://github.com/amit-sharma/causal-inference-tutorial/

Published in: Data & Analytics

Causal inference in online systems: Methods, pitfalls and best practices

  Slide 1: amshar@microsoft.com · http://www.github.com/amit-sharma/causal-inference-tutorial
  Slide 6: Use these correlations to make a predictive model. Future Activity -> f(number of friends, logins in past month)
  Slide 19:
      Old Algorithm (A)   New Algorithm (B)
      50/1000 (5%)        54/1000 (5.4%)

  Slide 20:
                      Old Algorithm (A)   New Algorithm (B)
      Low-activity    10/400 (2.5%)       4/200 (2%)
      High-activity   40/600 (6.6%)       50/800 (6.2%)
      [Bar chart: CTR for Low-activity vs. High-activity users]

  Slide 21: Is Algorithm A better?
                                    Old Algorithm (A)   New Algorithm (B)
      CTR for Low-Activity users    10/400 (2.5%)       4/200 (2%)
      CTR for High-Activity users   40/600 (6.6%)       50/800 (6.2%)
      Total CTR                     50/1000 (5%)        54/1000 (5.4%)
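The tables on slides 19-21 are a textbook instance of Simpson's paradox, and the arithmetic is easy to verify. A minimal Python sketch, using exactly the click/impression counts from the slides:

```python
# Click/impression counts from slides 20-21: (clicks, impressions)
data = {
    ("A", "low"): (10, 400), ("A", "high"): (40, 600),
    ("B", "low"): (4, 200),  ("B", "high"): (50, 800),
}

def ctr(clicks, shown):
    return clicks / shown

def total_ctr(algo):
    # Aggregate CTR, pooling both activity strata.
    clicks = sum(c for (a, _), (c, _) in data.items() if a == algo)
    shown = sum(n for (a, _), (_, n) in data.items() if a == algo)
    return ctr(clicks, shown)

print(total_ctr("A"), total_ctr("B"))  # 0.05 vs 0.054: B looks better overall
for stratum in ("low", "high"):
    # ...yet A beats B within each activity stratum.
    print(stratum, ctr(*data[("A", stratum)]), ctr(*data[("B", stratum)]))
```

The reversal happens because activity level confounds the comparison: B was shown disproportionately to low-activity users, who click less regardless of algorithm.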
  Slide 23: Average comment length decreases over time. But for each yearly cohort of users, comment length increases over time.
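Slide 23's cohort effect is the same aggregation trap in a time-series disguise. A sketch with made-up numbers (the counts and lengths below are illustrative, not from the talk's data): each cohort writes longer comments every year, but newer, terser cohorts come to dominate the user base, so the site-wide average falls.

```python
# {observation_year: {join_cohort: (num_users, avg_comment_length)}}
# Synthetic, illustrative numbers only.
data = {
    2010: {2010: (100, 100.0)},
    2011: {2010: (100, 110.0), 2011: (300, 60.0)},
    2012: {2010: (100, 120.0), 2011: (300, 70.0), 2012: (600, 50.0)},
}

def overall_avg(year):
    # User-weighted average comment length across all cohorts active that year.
    cohorts = data[year].values()
    return sum(n * avg for n, avg in cohorts) / sum(n for n, _ in cohorts)

overall = {year: overall_avg(year) for year in data}
print(overall)  # {2010: 100.0, 2011: 72.5, 2012: 63.0} -- falling overall...
# ...even though every cohort's average rises year over year.
```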
  Slide 27: http://plato.stanford.edu/entries/causation-mani/

  Slide 28: http://plato.stanford.edu/entries/causation-counterfactual/
  Slide 41: Dunning (2002), Rosenzweig-Wolpin (2000)
  Slides 55-56: Does new Algorithm B increase CTR for recommendations on Windows Store, compared to old algorithm A?
  Slide 66: Propensity(NewAlgo | User_i) = Logistic(a_cat1, a_cat2, ..., a_catn)
  Compare CTR between users with the same propensity score.
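Slide 66 stratifies users on a logistic-regression propensity score. A self-contained Python sketch of the idea on synthetic data (the coefficients, effect sizes, and bucket width below are all made up for illustration): high-activity users are both more likely to receive the new algorithm and more likely to click, so the naive treated-vs-control CTR gap is inflated, while comparing within propensity strata recovers something close to the true effect.

```python
import math
import random
from collections import defaultdict

random.seed(0)

def propensity(activity):
    # Assumed logistic model of receiving the new algorithm given activity.
    return 1 / (1 + math.exp(-3 * (activity - 0.5)))

# Simulate users: activity is a confounder that drives both treatment and clicks.
users = []
for _ in range(20000):
    activity = random.random()
    treated = random.random() < propensity(activity)
    p_click = 0.1 + 0.3 * activity + (0.05 if treated else 0.0)  # true effect: +5pp
    users.append((activity, treated, random.random() < p_click))

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

# Naive comparison: confounded by activity.
naive = (mean(c for _, t, c in users if t)
         - mean(c for _, t, c in users if not t))

# Stratify on the propensity score, compare CTR within each stratum, then average.
strata = defaultdict(list)
for a, t, c in users:
    strata[round(propensity(a), 1)].append((t, c))
diffs = []
for bucket in strata.values():
    treat = [c for t, c in bucket if t]
    ctrl = [c for t, c in bucket if not t]
    if treat and ctrl:
        diffs.append(mean(treat) - mean(ctrl))
stratified = mean(diffs)

print(round(naive, 3), round(stratified, 3))  # naive is biased upward; stratified is near +0.05
```

In practice the propensity model would be fit (e.g. by logistic regression on the per-category activity features the slide denotes a_cat1 ... a_catn) rather than assumed known, but the stratify-then-compare step is the same.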
  Slide 69: [Diagram: ego network of user u with friends f1..f5, alongside a set of non-friends n1..n5]
  Slide 73: http://tylervigen.com/spurious-correlations
  Slide 75: http://www.github.com/amit-sharma/causal-inference-tutorial · amshar@microsoft.com
  Slide 76: https://www.github.com/amit-sharma/causal-inference-tutorial
  Slide 82:
      > nrow(user_app_visits_A)
      [1] 1000000
      > length(unique(user_app_visits_A$user_id))
      [1] 10000
      > length(unique(user_app_visits_A$product_id))
      [1] 990
      > length(unique(user_app_visits_A$category))
      [1] 10
  Slide 85:
      > user_app_visits_B = read.csv("user_app_visits_B.csv")
      > naive_observational_estimate <- function(user_visits) {
          # Naive observational estimate: simply the fraction of visits
          # that resulted in a recommendation click-through.
          est = summarise(user_visits,
                          naive_estimate = sum(is_rec_visit) / length(is_rec_visit))
          return(est)
        }
      > naive_observational_estimate(user_app_visits_A)
      naive_estimate
      [1] 0.200768
      > naive_observational_estimate(user_app_visits_B)
      naive_estimate
      [1] 0.226467
  Slide 87:
      > stratified_by_activity_estimate(user_app_visits_A)
      Source: local data frame [4 x 2]
        activity_level stratified_estimate
      1              1           0.1248852
      2              2           0.1750483
      3              3           0.2266394
      4              4           0.2763522
      > stratified_by_activity_estimate(user_app_visits_B)
      Source: local data frame [4 x 2]
        activity_level stratified_estimate
      1              1           0.1253469
      2              2           0.1753933
      3              3           0.2257211
      4              4           0.2749867
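The slide shows only the output of stratified_by_activity_estimate, not its body. As a sketch of what such a stratified estimator does, here is a hypothetical Python equivalent (the record layout and field names is_rec_visit / activity_level are assumed from the R session, and the toy data below is invented):

```python
from collections import defaultdict

def stratified_estimate(visits, stratum_key):
    # Group visits by the stratum variable, then compute the
    # recommendation click-through fraction within each stratum.
    groups = defaultdict(lambda: [0, 0])  # stratum -> [clicks, visits]
    for v in visits:
        groups[v[stratum_key]][0] += v["is_rec_visit"]
        groups[v[stratum_key]][1] += 1
    return {s: clicks / n for s, (clicks, n) in sorted(groups.items())}

# Toy usage with invented records:
visits = [
    {"activity_level": 1, "is_rec_visit": 0},
    {"activity_level": 1, "is_rec_visit": 1},
    {"activity_level": 2, "is_rec_visit": 1},
    {"activity_level": 2, "is_rec_visit": 1},
]
print(stratified_estimate(visits, "activity_level"))  # {1: 0.5, 2: 1.0}
```

The same function stratifies by category by passing a different stratum_key, mirroring slide 88.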
  Slide 88:
      > stratified_by_category_estimate(user_app_visits_A)
      Source: local data frame [10 x 2]
        category stratified_estimate
      1        1           0.1758294
      2        2           0.2276829
      3        3           0.2763157
      4        4           0.1239860
      5        5           0.1767163
      ...
      > stratified_by_category_estimate(user_app_visits_B)
      Source: local data frame [10 x 2]
        category stratified_estimate
      1        1           0.2002127
      2        2           0.2517528
      3        3           0.3021371
      4        4           0.1503150
      5        5           0.1999519
      ...
  Slide 93:
      > naive_observational_estimate(user_app_visits_A)
      naive_estimate
      [1] 0.200768
      > ranking_discontinuity_estimate(user_app_visits_A)
      discontinuity_estimate
      [1] 0.121362
  40% of app visits coming from recommendation click-throughs are not causal: they could have happened even without the recommendation system.
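The "40%" on slide 93 follows directly from the two printed estimates: the share of the naive estimate that the regression-discontinuity estimate does not attribute to the recommender. A quick check in Python:

```python
# Estimates printed on slides 85 and 93.
naive_estimate = 0.200768
discontinuity_estimate = 0.121362

# Fraction of recommendation click-throughs the discontinuity
# analysis says would have happened anyway.
non_causal_share = 1 - discontinuity_estimate / naive_estimate
print(round(non_causal_share, 2))  # -> 0.4, i.e. ~40% of rec click-throughs are not causal
```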
  Slide 95: amshar@microsoft.com
