Causal inference in data science

Predictive modeling has led to big successes in making inferences from data. Such models are used extensively, including in systems for recommending items, optimizing content, delivering ads, matching applicants to jobs, identifying health risks and so on. However, predictive models are not well-equipped to answer questions about cause and effect, which form the basis of many practical decision-making scenarios. For example, if a recommendation system is changed or removed, what will be the effect on total customer activity? Which strategy leads to a higher engagement with a product? How can we learn generalizable insights about users from biased data (e.g. that of opt-in users)? Through practical examples, I will show the value of counterfactual reasoning and causal inference for such scenarios, by demonstrating that relying on predictive modeling based on correlations can be counterproductive. I will then present an overview of experimental and observational causal inference methods, that can better inform decision-making through data, and also lead to more robust and generalizable prediction models.

Published in: Data & Analytics

  1. http://www.amitsharma.in http://www.github.com/amit-sharma/causal-inference-tutorial
  6. Use these correlations to make a predictive model: Future Activity -> f(number of friends, logins in past month)
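The predictive model sketched on slide 6 can be illustrated as a simple weighted scoring function. The feature weights below are illustrative assumptions, not values from the talk; in practice they would be fit to historical data.

```python
def predict_future_activity(num_friends, logins_past_month,
                            w_friends=0.4, w_logins=0.6):
    """Score future activity as a weighted sum of correlated features.

    Weights are hypothetical placeholders; a real model would learn them
    (e.g. by regression) from past user activity.
    """
    return w_friends * num_friends + w_logins * logins_past_month

# A highly connected, frequently logging-in user scores higher.
assert predict_future_activity(50, 20) > predict_future_activity(5, 2)
```

The point of the talk is that such a model captures correlations only: it says nothing about whether adding friends *causes* higher future activity.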
  19. Algorithm A: 50/1000 (5%)   Algorithm B: 54/1000 (5.4%)
  20. Low-activity users: Algorithm A 10/400 (2.5%), Algorithm B 4/200 (2%). High-activity users: Algorithm A 40/600 (6.6%), Algorithm B 50/800 (6.2%)
  21. Is Algorithm A better?

      Success rate            Algorithm A      Algorithm B
      Low-activity users      10/400 (2.5%)    4/200 (2%)
      High-activity users     40/600 (6.6%)    50/800 (6.2%)
      Total                   50/1000 (5%)     54/1000 (5.4%)
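The counts on slide 21 exhibit Simpson's paradox: Algorithm A has the higher success rate within each user stratum, yet Algorithm B wins on the aggregate. A short check with exact arithmetic:

```python
from fractions import Fraction

# Success counts from the slide: (successes, trials) per algorithm and stratum.
data = {
    "A": {"low": (10, 400), "high": (40, 600)},
    "B": {"low": (4, 200),  "high": (50, 800)},
}

def rate(successes, trials):
    """Exact success rate as a fraction, avoiding float rounding."""
    return Fraction(successes, trials)

def total_rate(algo):
    s = sum(data[algo][stratum][0] for stratum in ("low", "high"))
    n = sum(data[algo][stratum][1] for stratum in ("low", "high"))
    return rate(s, n)

# A wins within each stratum...
assert rate(*data["A"]["low"]) > rate(*data["B"]["low"])
assert rate(*data["A"]["high"]) > rate(*data["B"]["high"])
# ...yet B wins overall: Simpson's paradox.
assert total_rate("A") < total_rate("B")
```

The reversal arises because Algorithm B was evaluated mostly on high-activity users (800 of 1000 trials), whose baseline success rate is higher; user activity confounds the comparison.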
  23. Average comment length decreases over time, but for each yearly cohort of users, comment length increases over time.
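Slide 23 describes another aggregation reversal: every cohort writes longer comments as it ages, yet the site-wide average falls because newer cohorts start shorter. The numbers below are synthetic, chosen only to reproduce the pattern, not data from the talk:

```python
# cohorts[join_year] = average comment length in observation years 0, 1, 2
# (None before the cohort joins). Values are illustrative, not real data.
cohorts = {
    2010: [100, 110, 120],   # 2010 cohort: lengths grow over time
    2011: [None, 60, 70],    # 2011 cohort joins shorter, also grows
    2012: [None, None, 30],  # 2012 cohort joins shorter still
}

def overall_average(year_index):
    """Site-wide average across all cohorts active in a given year."""
    lengths = [c[year_index] for c in cohorts.values()
               if c[year_index] is not None]
    return sum(lengths) / len(lengths)

averages = [overall_average(i) for i in range(3)]
# Each cohort's length increases, yet the overall average declines,
# because ever-shorter newcomer cohorts dominate the later averages.
assert averages[0] > averages[1] > averages[2]
```

This is the same confounding structure as the algorithm comparison: the cohort composition changes over time, so the aggregate trend misrepresents every individual cohort's trend.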
  27. http://plato.stanford.edu/entries/causation-mani/
  28. http://plato.stanford.edu/entries/causation-counterfactual/
  38. Dunning (2002), Rosenzweig-Wolpin (2000)
  47. http://tylervigen.com/spurious-correlations
  48. http://www.github.com/amit-sharma/causal-inference-tutorial amshar@microsoft.com
