Joint causal inference on observational and experimental data - NIPS 2016 "What If?" workshop poster
Joint Causal Inference
Sara Magliacane (1,2), Tom Claassen (1,3), Joris M. Mooij (1)
1: University of Amsterdam  2: VU Amsterdam  3: Radboud University Nijmegen
Abstract
We propose Joint Causal Inference (JCI), a powerful formulation of
causal discovery over multiple datasets in which we jointly learn both the
causal structure and the targets of interventions from independence test
results. Our implementation, ACID-JCI, substantially improves the accuracy
of causal predictions over the state of the art.
Causal discovery methods
To answer “what if?” questions we need the causal structure. There are two
main categories of methods that learn causal structures from data:
•Score-based: evaluate models using a penalized likelihood score
•Constraint-based: use statistical independences to express constraints
over possible causal models
Advantage of constraint-based methods:
•can handle latent confounders naturally
Advantage of score-based methods:
•can formulate joint inference on observational and experimental data
and learn the targets of interventions, e.g. [Eaton and Murphy, 2007].
Goal: Can we perform joint inference using constraint-based methods?
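Constraint-based methods consume the outcomes of conditional independence tests. As a minimal illustration (not part of the poster's method), the following sketches a Fisher-z partial-correlation test of X ⊥⊥ Y | W under a Gaussian assumption; all function names and thresholds here are hypothetical choices.

```python
# Hypothetical sketch: a Fisher-z partial-correlation independence test,
# the kind of statistical test a constraint-based method consumes.
import numpy as np
from scipy import stats

def fisher_z_independent(data, x, y, w, alpha=0.01):
    """Test X _||_ Y | W via partial correlation (Gaussian assumption)."""
    sub = data[:, [x, y] + list(w)]
    prec = np.linalg.pinv(np.cov(sub, rowvar=False))      # precision matrix
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])    # partial correlation
    n = data.shape[0]
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(w) - 3)
    p = 2 * (1 - stats.norm.cdf(abs(z)))
    return p > alpha  # True -> no evidence against independence

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = 2 * x + rng.normal(size=2000)     # y depends directly on x
z = y + rng.normal(size=2000)         # z depends on x only through y
data = np.column_stack([x, y, z])
print(fisher_z_independent(data, 0, 1, []))   # x and y are dependent
print(fisher_z_independent(data, 0, 2, [1]))  # x and z, conditioning on y
```

Each such boolean result becomes one constraint over the possible causal models.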
Joint causal inference (JCI)
We propose to jointly model several observational or experimental datasets
{Dr}r∈{1...n}, each with zero or more (possibly unknown) intervention targets.
We assume a single underlying causal DAG, shared across all datasets, defined
over system variables {Xj}j∈X (some of which may be hidden). As a consequence,
certain interventions are not allowed, e.g. perfect interventions.
We introduce two types of dummy variables in the data:
•a regime variable R, indicating which dataset Dr a data point comes from
•intervention variables {Ii}i∈I, which are functions of R.
We assume that we can represent the whole system as an acyclic SCM:
R = ER,
Ii = gi(R), i ∈ I,
Xj = fj(Xpa(Xj)∩X , Ipa(Xj)∩I, Ej), j ∈ X,
P((Ek)k∈X∪{R}) = ∏k∈X∪{R} P(Ek).
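The acyclic SCM above can be sketched in code. The regime-to-intervention mapping below follows the example table on this poster; the structural functions fj and their coefficients are invented purely for illustration.

```python
# Hypothetical sketch of the JCI SCM: the regime variable R picks the
# dataset, intervention variables are deterministic functions of R, and
# the system variables follow the SAME structural equations in all regimes.
import numpy as np

rng = np.random.default_rng(42)

def sample_point():
    R = int(rng.integers(1, 5))       # R = E_R: regime drawn from {1,...,4}
    I1 = 20 if R in (1, 2) else 30    # I1 = g1(R): deterministic in R
    I2 = 1 if R in (2, 4) else 0      # I2 = g2(R): deterministic in R
    E1, E2, E3, E4 = rng.normal(size=4)   # jointly independent noises
    X1 = 0.01 * I1 + E1               # X1 = f1(I1, E1)
    X3 = E3                           # X3: hidden system variable
    X2 = 0.5 * X1 + X3 + I2 + E2      # X2 = f2(X1, X3, I2, E2)
    X4 = X3 + E4                      # X4 = f4(X3, E4): confounded with X2 via X3
    return {"R": R, "I1": I1, "I2": I2, "X1": X1, "X2": X2, "X4": X4}

rows = [sample_point() for _ in range(1000)]
```

Note that X3 is generated but never returned: it plays the role of a latent confounder between X2 and X4.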
We represent this SCM with a causal DAG C, for example:
R | I1 | I2 | X1   | X2   | X4
1 | 20 | 0  | 0.1  | 0.2  | 0.5
1 | 20 | 0  | 0.13 | 0.21 | 0.49
1 | 20 | 0  | …    | …    | …
2 | 20 | 1  | …    | …    | …
3 | 30 | 0  | …    | …    | …
4 | 30 | 1  | …    | …    | …
4 datasets with 2 interventions (the hidden variable X3 does not appear in the data)
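In practice, a pooled table like the one above can be built by concatenating the per-regime datasets and appending the dummy columns. A sketch with pandas, using invented values and assuming the intervention settings per regime are known:

```python
# Hypothetical sketch: pool per-regime datasets and add the JCI dummy
# variables R (regime) and I1, I2 (interventions, functions of R).
import pandas as pd

# Invented per-regime datasets over the observed system variables.
datasets = {
    1: pd.DataFrame({"X1": [0.10, 0.13], "X2": [0.20, 0.21], "X4": [0.50, 0.49]}),
    2: pd.DataFrame({"X1": [0.30], "X2": [0.35], "X4": [0.60]}),
}
# Intervention settings per regime: I_i = g_i(R).
interventions = {1: {"I1": 20, "I2": 0}, 2: {"I1": 20, "I2": 1}}

pooled = []
for r, df in datasets.items():
    df = df.copy()
    df["R"] = r                        # regime dummy variable
    for name, val in interventions[r].items():
        df[name] = val                 # intervention dummies, determined by R
    pooled.append(df)
pooled = pd.concat(pooled, ignore_index=True)
print(pooled[["R", "I1", "I2", "X1", "X2", "X4"]])
```

The resulting single table is what the joint inference method operates on.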
[Figure: causal DAG C over R, I1, I2, X1, X2, X3, X4, representing all 4 datasets]
We assume Causal Markov and Minimality Assumptions hold in C.
Deterministic relations in JCI: R determines each of {Ii}i∈I, and there
are no other deterministic relations. In this setting, D-separation ⊥D is
provably complete; we conjecture it may also be complete more generally.
To allow these deterministic relations, we relax the faithfulness assumption to:
D-Faithfulness assumption: X ⊥⊥ Y | W =⇒ X ⊥D Y | W [C].
Joint Causal Inference = Given all the assumptions, reconstruct the
causal DAG C from independence test results.
Problem: Current constraint-based methods cannot work with JCI.
Extending constraint-based methods for JCI
We propose a simple but effective strategy for dealing with faithfulness
violations due to functionally determined relations, e.g. in JCI.
This strategy can be applied to any constraint-based method, if it can deal
with partial inputs (missing results for certain independence tests).
1. Rephrase a constraint-based method in terms of d-separations instead
of independence test results
2. Decide for each independence test result which d-separations can be
soundly derived, and provide them as input to the method:
•X ⊥̸⊥ Y | W =⇒ X ⊥̸d Y | W
•X ∉ Det(W) and Y ∉ Det(W) and X ⊥⊥ Y | W =⇒ X ⊥d Y | Det(W)
where ⊥d denotes d-separation, ⊥̸d d-connection, and Det(W) the variables
determined by (a subset of) W.
Under Causal Markov, Minimality and D-Faithfulness we show this strategy
is sound. Conjecture: sound also for a larger class of deterministic relations.
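Step 2 above can be sketched directly. In the JCI setting, the only functional relation is that R determines every Ii, so Det(W) is easy to compute; the function names below are hypothetical.

```python
# Hypothetical sketch of step 2: turn independence test results into sound
# d-separation / d-connection inputs, given that R determines each I_i.
INTERVENTION_VARS = {"I1", "I2"}

def det(W):
    """Det(W): variables functionally determined by (a subset of) W."""
    W = set(W)
    return W | (INTERVENTION_VARS if "R" in W else set())

def sound_inputs(x, y, W, independent):
    """Return (kind, statement) pairs that can be soundly derived."""
    if not independent:
        # Dependence implies d-connection (contrapositive of Causal Markov).
        return [("d-connected", (x, y, frozenset(W)))]
    if x not in det(W) and y not in det(W):
        # Independence yields a d-separation, conditioning on Det(W).
        return [("d-separated", (x, y, frozenset(det(W))))]
    return []  # no sound conclusion: leave this test result out of the input

print(sound_inputs("X1", "X2", {"R"}, independent=True))
```

Test results whose conditioning set determines one of the endpoints are simply dropped, which is why the downstream method must tolerate partial inputs.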
Ancestral Causal Inference with Determinism (ACID)
We implement the strategy in ACID, a determinism-tolerant extension of
ACI [Magliacane et al., 2016], a causal discovery method that accurately
reconstructs ancestral relations in the presence of latent confounders.
ACI is based on a set of logical rules, e.g.:
(X ⊥⊥ Y | Z) ∧ (X ⇸ Z) =⇒ X ⇸ Y,
where X ⇸ Y denotes that X does not cause Y.
ACID implements the proposed strategy for dealing with faithfulness
violations and reformulates the rules of ACI in terms of d-separation, e.g.:
(X ⊥d Y | Z) ∧ (X ⇸ Z) =⇒ X ⇸ Y.
ACID-JCI: To improve the identifiability and accuracy of the predictions,
we also add as background knowledge a set of rules encoding the JCI
assumptions on the dummy variables, e.g.:
∀i ∈ I, ∀j ∈ X : (Xj ⇸ R) ∧ (Xj ⇸ Ii).
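The real implementation hands such rules to a logic solver; as a toy illustration only, the following forward-chaining sketch applies the d-separation rule above, seeded with the JCI background knowledge that no system variable causes a dummy variable (written `-/->` for "does not cause"; the example d-separation input is invented).

```python
# Toy forward-chaining sketch (hypothetical, not the actual ACID solver):
# derive "X -/-> Y" (X does not cause Y) facts using the rule
#   (X _|_d Y | Z) and (X -/-> z for all z in Z)  =>  X -/-> Y,
# seeded with the JCI background knowledge Xj -/-> R and Xj -/-> Ii.
SYSTEM = {"X1", "X2", "X4"}
DUMMIES = {"R", "I1", "I2"}

# JCI background knowledge: system variables never cause dummy variables.
not_causes = {(xj, d) for xj in SYSTEM for d in DUMMIES}

# Example d-separation inputs (invented): (X, Y, conditioning set Z).
d_separations = [("X1", "X4", frozenset({"R", "I1", "I2"}))]

changed = True
while changed:           # iterate until no rule fires any more
    changed = False
    for x, y, Z in d_separations:
        if all((x, z) in not_causes for z in Z) and (x, y) not in not_causes:
            not_causes.add((x, y))   # rule fires: conclude X -/-> Y
            changed = True

print(("X1", "X4") in not_causes)
```

The fixed point of this loop is the set of non-ancestral relations derivable from the given inputs plus the background knowledge.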
Preliminary evaluation on simulated data
•Simulated data: 600 randomly generated causal graphs
•PR curves for ancestral (left) and nonancestral (right) relations
•ACID-JCI substantially improves the accuracy compared to learning
the causal structures on each dataset separately and merging them.
References
D. Eaton and K. Murphy. Exact Bayesian structure learning from uncertain interventions.
In AISTATS, pages 107–114, 2007.
S. Magliacane, T. Claassen, and J. M. Mooij. Ancestral Causal Inference. In
NIPS, 2016.
Working paper: https://arxiv.org/abs/1611.10351
SM and JMM were supported by NWO (VIDI grant 639.072.410). SM was also
supported by COMMIT/ under the Data2Semantics project. TC was supported by
NWO grant 612.001.202 (MoCoCaDi) and EU-FP7 grant agreement n. 603016 (MATRICS).
NIPS 2016 Barcelona, Spain; 10-12-2016