
Causal inference in data science


Predictive modeling has led to big successes in making inferences from data. Such models are used extensively, including in systems for recommending items, optimizing content, delivering ads, matching applicants to jobs, and identifying health risks. However, predictive models are not well equipped to answer questions about cause and effect, which form the basis of many practical decision-making scenarios. For example, if a recommendation system is changed or removed, what will be the effect on total customer activity? Which strategy leads to higher engagement with a product? How can we learn generalizable insights about users from biased data (e.g., that of opt-in users)? Through practical examples, I will show the value of counterfactual reasoning and causal inference for such scenarios by demonstrating that relying on predictive modeling based on correlations can be counterproductive. I will then present an overview of experimental and observational causal inference methods that can better inform decision-making through data and also lead to more robust and generalizable prediction models.



More from Amit Sharma (16)


Causal inference in data science

  1. http://www.amitsharma.in http://www.github.com/amit-sharma/causal-inference-tutorial
  6. Use these correlations to make a predictive model. Future Activity -> f(number of friends, logins in past month)
  19. Total success rate: Algorithm A 50/1000 (5%); Algorithm B 54/1000 (5.4%).
  20. Low-activity users: Algorithm A 10/400 (2.5%); Algorithm B 4/200 (2%). High-activity users: Algorithm A 40/600 (6.7%); Algorithm B 50/800 (6.25%).
  21. Is Algorithm A better?
      Success rate for low-activity users: Algorithm A 10/400 (2.5%); Algorithm B 4/200 (2%).
      Success rate for high-activity users: Algorithm A 40/600 (6.7%); Algorithm B 50/800 (6.25%).
      Total success rate: Algorithm A 50/1000 (5%); Algorithm B 54/1000 (5.4%).
  23. Average comment length decreases over time. But for each yearly cohort of users, comment length increases over time.
  27. http://plato.stanford.edu/entries/causation-mani/
  28. http://plato.stanford.edu/entries/causation-counterfactual/
  38. Dunning (2002), Rosenzweig-Wolpin (2000)
  47. http://tylervigen.com/spurious-correlations
  48. http://www.github.com/amit-sharma/causal-inference-tutorial amshar@microsoft.com
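
The Algorithm A versus Algorithm B counts on slides 19 to 21 can be checked with a few lines of arithmetic. The sketch below is illustrative and not taken from the slides' accompanying tutorial: the function names are hypothetical, and the `standardized_rate` step is one common adjustment that assumes user activity level is the only confounder, which the transcript does not state. It recomputes the per-group and total success rates, then reweights each group's rate by that group's share of all users.

```python
from fractions import Fraction

# (successes, users) per algorithm and activity group, copied from slide 21.
DATA = {
    "A": {"low": (10, 400), "high": (40, 600)},
    "B": {"low": (4, 200), "high": (50, 800)},
}


def group_rate(algo, group):
    """Success rate of one algorithm within one activity group."""
    successes, users = DATA[algo][group]
    return Fraction(successes, users)


def total_rate(algo):
    """Success rate of one algorithm pooled over both activity groups."""
    successes = sum(s for s, _ in DATA[algo].values())
    users = sum(n for _, n in DATA[algo].values())
    return Fraction(successes, users)


def standardized_rate(algo):
    """Group rates reweighted by each group's share of ALL users.

    Assumption (not from the slides): activity level is the only
    confounder, so both algorithms are compared on the same user mix.
    """
    group_sizes = {
        g: sum(DATA[a][g][1] for a in DATA) for g in ("low", "high")
    }
    all_users = sum(group_sizes.values())
    return sum(
        group_rate(algo, g) * Fraction(size, all_users)
        for g, size in group_sizes.items()
    )


if __name__ == "__main__":
    for algo in DATA:
        print(
            algo,
            f"low={float(group_rate(algo, 'low')):.4f}",
            f"high={float(group_rate(algo, 'high')):.4f}",
            f"total={float(total_rate(algo)):.4f}",
            f"standardized={float(standardized_rate(algo)):.4f}",
        )
```

Algorithm A has the higher rate within each activity group, yet Algorithm B has the higher pooled rate because a larger share of its users are high-activity. Under the stated assumption, comparing both algorithms on the same user mix favors Algorithm A again.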
