
Talk: Joint causal inference on observational and experimental data - NIPS 2016 "What If?" workshop poster

Talk at NIPS 2016 "What If?" workshop:
https://arxiv.org/abs/1611.10351



1. Joint causal inference on observational and experimental datasets
Sara Magliacane, Tom Claassen, Joris M. Mooij
s.magliacane@uva.nl
What If?, 10th December 2016
Sara Magliacane (VU, UvA), Joint causal inference, 10-12-2016
2. Part I: Introduction
6. Causal inference: learning causal relations from data
Definition: X causes Y (X → Y) = intervening upon (changing) X changes Y.
We can represent causal relations with a causal DAG (possibly with hidden variables) over, e.g., X = Smoking, Y = Cancer, Z = Genetics.
Causal inference = structure learning of the causal DAG.
Traditionally, causal relations are inferred from interventions. Sometimes, however, interventions are unethical, unfeasible or too expensive.
9. Causal inference from observational and experimental data
Holy Grail of Causal Inference: learn as much causal structure as possible from observations, integrating background knowledge and experimental data.
Current causal inference methods:
Score-based: evaluate models using a penalized likelihood score.
Constraint-based: use statistical independences to express constraints over possible causal models.
Advantage of constraint-based methods: they can handle latent confounders naturally.
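[Editorial example.] Constraint-based methods rely on conditional independence tests of the kind listed above. A minimal sketch, not from the talk, of the standard Fisher z-test on partial correlations, assuming linear-Gaussian data:

```python
import numpy as np

def partial_corr(data, x, y, z=()):
    """Sample partial correlation of columns x and y given the columns in z,
    read off the (pseudo-)inverse of the covariance matrix."""
    idx = [x, y] + list(z)
    prec = np.linalg.pinv(np.cov(data[:, idx], rowvar=False))
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])

def fisher_z(data, x, y, z=()):
    """Fisher z statistic; |z| above ~1.96 rejects independence at the 5% level."""
    n = data.shape[0]
    r = np.clip(partial_corr(data, x, y, z), -0.9999, 0.9999)
    return np.sqrt(n - len(z) - 3) * 0.5 * np.log((1 + r) / (1 - r))

# Chain X -> Y -> W: X and W are dependent, but independent given Y.
rng = np.random.default_rng(0)
X = rng.normal(size=5000)
Y = 2 * X + rng.normal(size=5000)
W = -Y + rng.normal(size=5000)
data = np.column_stack([X, Y, W])
print(abs(fisher_z(data, 0, 2)) > 1.96)  # marginal dependence detected: True
print(abs(partial_corr(data, 0, 2, (1,))))  # near zero once Y is conditioned on
```

Each such accepted or rejected independence then becomes one constraint over the possible causal models.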
11. Joint inference on observational and experimental data
Advantage of score-based methods: they can formulate joint inference on observational and experimental data and learn the targets of interventions, e.g. [Eaton and Murphy, 2007].
[Figure from Eaton and Murphy, 2007: reconstructions of the T-cell signalling pathway (raf, mek12, pip2, erk, akt, p38, jnk, pka, pkc, ...) from the [Sachs et al., 2005] data, comparing edges recovered with known perfect interventions against edges recovered with uncertain interventions; intervention nodes and present/missing edges are marked.]
Goal: can we perform joint inference using constraint-based methods?
12. Part II: Joint Causal Inference
15. Joint Causal Inference: assumptions
Idea: model jointly several observational or experimental datasets {Dr}, r ∈ {1...n}, with zero or more possibly unknown intervention targets.
We assume a unique underlying causal DAG across datasets, defined over system variables {Xj}, j ∈ X (some of which are possibly hidden).
Example: three datasets over X1, X2, X3: dataset D1 (unknown interventions), dataset D2 (unknown interventions), dataset D3 (do X1).
Note: this cannot handle certain intervention types, e.g. perfect interventions.
17. Joint Causal Inference: SCM
We introduce two types of dummy variables in the data:
a regime variable R, indicating which dataset Dr a data point is from;
intervention variables {Ii}, i ∈ I, which are functions of R.
We assume that we can represent the whole system as an acyclic SCM:
R = E_R,
Ii = gi(R) for i ∈ I,
Xj = fj(X_{pa(Xj) ∩ X}, I_{pa(Xj) ∩ I}, Ej) for j ∈ X,
P((Ek)_{k ∈ X ∪ {R}}) = ∏_{k ∈ X ∪ {R}} P(Ek).
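[Editorial example.] An SCM of this form can be sanity-checked by simulation. A minimal sketch with one intervention variable and two system variables; the concrete structure, coefficients, and regime encoding are illustrative assumptions, not from the talk:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_jci(n):
    """Draw n pooled samples from a toy acyclic SCM of the JCI form:
    R is the regime, I1 = g1(R) is an intervention variable, and the
    system variables X1, X2 depend on their parents plus noise."""
    R = rng.integers(0, 3, size=n)        # R = E_R: three regimes/datasets
    I1 = (R == 2).astype(float)           # I1 = g1(R): active only in regime 2
    X1 = 1.5 * I1 + rng.normal(size=n)    # X1 = f1(I1, E1)
    X2 = 2.0 * X1 + rng.normal(size=n)    # X2 = f2(X1, E2)
    return np.column_stack([R, I1, X1, X2])

data = sample_jci(12000)
# Pooling all regimes into a single table is what lets JCI treat R and I1
# as ordinary (dummy) variables during structure learning.
for r in range(3):
    print(r, data[data[:, 0] == r, 2].mean())  # X1 is shifted only in regime 2
```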
18. Joint Causal Inference: single joint causal DAG
We represent the SCM with a single causal DAG C describing all datasets jointly. [In the example: a pooled data table with columns R, I1, I2, X1, X2, X4, covering 4 datasets with 2 interventions, and the corresponding causal DAG C over R, I1, I2, X1, X2, X3, X4.]
We assume Causal Markov and Minimality hold in C.
22. Joint Causal Inference: faithfulness violations
Causal Faithfulness assumption: X ⊥⊥ Y | W =⇒ X ⊥d Y | W [C].
Example: for R → I1 → X1, I1 = g1(R) implies X1 ⊥⊥ I1 | R, but X1 is not d-separated from I1 given R.
Solution: D-separation [Spirtes et al., 2000], which accounts for determinism: X1 ⊥D I1 | R.
Problem: ⊥D is provably complete (for now) only for functionally determined relations.
Solution: restrict the deterministic relations to Ii = gi(R) for all i ∈ I.
D-Faithfulness: X ⊥⊥ Y | W =⇒ X ⊥D Y | W [C].
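[Editorial example.] The faithfulness violation above is easy to see numerically: conditioning on R fixes I1 to a single value, so X1 is trivially independent of I1 given R even though I1 → X1 d-connects them given R. A minimal sketch under the same toy assumptions as before:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 9000
R = rng.integers(0, 3, size=n)           # regime / dataset index
I1 = (R == 2).astype(float)              # I1 = g1(R): deterministic function of R
X1 = 1.5 * I1 + rng.normal(size=n)       # I1 -> X1

# Within any fixed value of R, I1 takes exactly one value, so no
# conditional dependence between X1 and I1 given R is observable.
for r in range(3):
    print(r, np.unique(I1[R == r]))      # a single value per regime
```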
26. Joint Causal Inference
Joint Causal Inference (JCI) = given all the assumptions, reconstruct the causal DAG C from independence test results, e.g.:
X1 ⊥⊥ I1, X1 ⊥⊥ I1 | R, X4 ⊥⊥ X2 | I1, ... =⇒ the causal DAG C over R, I1, I2, X1, X2, X3, X4.
Problem: current constraint-based methods cannot work with JCI, because of the faithfulness violations.
27. Part III: Extending constraint-based methods for JCI
33. A simple strategy for dealing with faithfulness violations
Idea: faithfulness violations → partial inputs.
A simple strategy for dealing with functionally determined relations:
1. Rephrase the constraint-based method in terms of d-separations.
2. For each independence test result, derive sound d-separations:
not (X ⊥⊥ Y | W) =⇒ not (X ⊥d Y | W);
X ∉ Det(W) and Y ∉ Det(W) and X ⊥⊥ Y | W =⇒ X ⊥d Y | Det(W),
where Det(W) = the variables determined by (a subset of) W.
Conjecture: this is sound also for a larger class of deterministic relations.
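[Editorial example.] Det(W) can be computed as a simple fixpoint over the known deterministic relations. A minimal sketch; in JCI the only such relations kept are Ii = gi(R), and the encoding of the rules is an illustrative assumption:

```python
def det_closure(W, rules):
    """Return the set of variables determined by (a subset of) W.
    `rules` lists pairs (determiners, var): `var` is a function of the
    variable set `determiners`.  Iterate to a fixpoint, since newly
    determined variables may determine further ones."""
    det = set(W)
    changed = True
    while changed:
        changed = False
        for determiners, var in rules:
            if determiners <= det and var not in det:
                det.add(var)
                changed = True
    return det

# JCI's restricted deterministic relations: each Ii is a function of R.
rules = [({"R"}, "I1"), ({"R"}, "I2")]
print(sorted(det_closure({"R"}, rules)))   # ['I1', 'I2', 'R']
print(sorted(det_closure({"I1"}, rules)))  # ['I1']
```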
37. Ancestral Causal Inference with Determinism (ACID)
Idea: faithfulness violations → partial inputs.
Traditional constraint-based methods (e.g. PC, FCI, ...):
cannot handle partial inputs;
cannot exploit the rich background knowledge in JCI.
Solution: logic-based methods, e.g. ACI [Magliacane et al., 2016] =⇒ we implement the strategy in an extension of ACI called Ancestral Causal Inference with Determinism (ACID).
39. Rephrasing ACI rules in terms of d-separations
ACI rules (example): for X, Y, W disjoint (sets of) variables,
(X ⊥⊥ Y | W) ∧ (X does not cause W) =⇒ X does not cause Y.
ACID rules (example): for X, Y, W disjoint (sets of) variables,
(X ⊥d Y | W) ∧ (X does not cause W) =⇒ X does not cause Y.
41. ACID-JCI
ACID-JCI = ACID rules + sound d-separations from the strategy + JCI background knowledge.
For example: ∀i ∈ I, ∀j ∈ X: Xj does not cause R, and Xj does not cause Ii
("standard variables cannot cause dummy variables").
[Example: candidate orientations of edges among I1, X1 and R; those in which X1 causes I1 or R are excluded.]
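[Editorial example.] This kind of background knowledge translates directly into hard constraints on the search. A minimal sketch, with variable names assumed for illustration: encode "no system variable causes a dummy variable" as a set of forbidden directed relations and check candidate graphs against it:

```python
# JCI background knowledge: no system variable Xj may cause R
# or any intervention variable Ii.
system = ["X1", "X2", "X3", "X4"]
dummies = ["R", "I1", "I2"]
forbidden = {(xj, d) for xj in system for d in dummies}

def respects_jci(edges):
    """True if no directed edge (cause, effect) in the candidate graph
    violates the JCI background knowledge."""
    return not any(e in forbidden for e in edges)

print(respects_jci({("R", "I1"), ("I1", "X1")}))   # dummy -> system: allowed
print(respects_jci({("X1", "I1")}))                # system -> dummy: rejected
```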
43. Part IV: Evaluation
44. Simulated data accuracy: example
[Precision-recall curves for ancestral ("causes") and non-ancestral ("does not cause") relations, over 2000 randomly generated causal graphs with 4 system variables and 3 interventions.]
ACID-JCI substantially improves accuracy w.r.t. merging separately learnt structures (merged ACI).
50. Conclusion
Joint Causal Inference: a powerful formulation of causal discovery over multiple datasets for constraint-based methods.
A simple strategy for dealing with faithfulness violations due to functionally determined relations.
An implementation, ACID-JCI, that substantially improves accuracy w.r.t. the state of the art.
Future work: improve scalability (currently at most 7 variables in C).
Working paper: https://arxiv.org/abs/1611.10351
Collaborators: Tom Claassen, Joris M. Mooij
51. References
Claassen, T. and Heskes, T. (2011). A logical characterization of constraint-based causal discovery. In UAI 2011, pages 135-144.
Eaton, D. and Murphy, K. (2007). Exact Bayesian structure learning from uncertain interventions. In AISTATS 2007, pages 107-114.
Magliacane, S., Claassen, T., and Mooij, J. M. (2016). Ancestral causal inference. In NIPS 2016.
Mooij, J. M. and Heskes, T. (2013). Cyclic causal discovery from continuous equilibrium data. In UAI 2013, pages 431-439.
Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search. MIT Press, 2nd edition.
52. Ancestral Causal Inference (ACI)
Weighted list of inputs I = {(ij, wj)}, e.g. I = {(Y ⊥⊥ Z | X, 0.2), (Y ⊥⊥ X, 0.1)}.
Any consistent weighting scheme can be used, e.g. frequentist or Bayesian.
For any possible ancestral structure C, define the loss function
Loss(C, I) := sum of wj over all (ij, wj) ∈ I such that ij is not satisfied in C,
where "ij is not satisfied in C" is defined by the ancestral reasoning rules.
For each possible causal relation X → Y, provide the score
Conf(X → Y) = min over C of Loss(C, I + (X does not cause Y, ∞)) − min over C of Loss(C, I + (X → Y, ∞)).
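[Editorial example.] The loss and confidence score on this slide can be sketched concretely. A toy version over two variables only; minimizing over structures with the relation forced or forbidden is equivalent to adding it with infinite weight, and the statement encodings are assumptions for illustration, not the paper's full rule set:

```python
# Toy "ancestral structures" over {X, Y}: each is the set of causal
# relations it asserts.
structures = [frozenset(), frozenset({"X>Y"}), frozenset({"Y>X"})]

def satisfied(stmt, structure):
    """A weighted input is ('causes', rel) or ('not_causes', rel)."""
    kind, rel = stmt
    return rel in structure if kind == "causes" else rel not in structure

def loss(structure, inputs):
    """Sum of the weights of the weighted inputs not satisfied by the structure."""
    return sum(w for stmt, w in inputs if not satisfied(stmt, structure))

def confidence(relation, inputs):
    """Conf(X>Y): min loss among structures without X>Y (i.e. with it
    forbidden at infinite weight) minus min loss among those with it."""
    forbid = min(loss(s, inputs) for s in structures if relation not in s)
    force = min(loss(s, inputs) for s in structures if relation in s)
    return forbid - force

inputs = [(("causes", "X>Y"), 0.8), (("not_causes", "Y>X"), 0.3)]
print(confidence("X>Y", inputs))  # 0.8: the best structure without X>Y pays 0.8
```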
