
Ancestral Causal Inference - NIPS 2016 poster



Ancestral Causal Inference
Sara Magliacane (1,2), Tom Claassen (1,3), Joris M. Mooij (1)
(1) University of Amsterdam; (2) VU Amsterdam; (3) Radboud University Nijmegen
CausAM (Causality@AmsterdaM)

Notation: X ⤳ Y means "X is a causal ancestor of Y"; ⊥⊥ denotes (conditional) independence and ⊥̸⊥ dependence; [Z] marks a minimal conditioning set.

Main contributions
• Ancestral Causal Inference (ACI), a causal discovery method that is as accurate as the state of the art but much more scalable
• A method for scoring causal relations that approximates their marginal probability

Causal discovery methods
• Score-based methods: evaluate models using a penalized likelihood score
• Constraint-based methods: use statistical independences to express constraints over possible causal models
Advantages of constraint-based over score-based methods:
• latent confounders are handled naturally
• background knowledge is easy to integrate
Disadvantages of constraint-based methods:
• vulnerability to errors in statistical independence tests
• no estimate of confidence in the causal predictions

Causal inference as an optimization problem
To address the vulnerability to errors in statistical tests, Hyttinen et al. [2014] propose HEJ, which formulates causal discovery as an optimization problem:
• Input: a weighted list of statistical independence results I = {(i_j, w_j)}, e.g. I = {(Y ⊥⊥ Z | X, 0.2), (Y ⊥⊥ X, 0.1)}
• For any possible causal structure C, define a loss function that sums the weights of the inputs violated by C:
  loss(C, I) := Σ_{(i_j, w_j) ∈ I : i_j is not satisfied in C} w_j
  where "i_j is not satisfied in C" is defined by causal reasoning rules.
• Causal inference then means finding a causal structure that minimizes the loss:
  C* = arg min_{C ∈ 𝒞} loss(C, I)
Problem: scalability — HEJ is already very slow for 8 random variables.

Ancestral Causal Inference (ACI)
Instead of direct causal relations, ACI uses a more coarse-grained representation: an ancestral structure, i.e. the transitive closure of the direct causal relations among the observed variables of the DAG:
• (reflexivity): X ⤳ X
• (transitivity): X ⤳ Y ∧ Y ⤳ Z ⟹ X ⤳ Z
• (antisymmetry): X ⤳ Y ∧ Y ⤳ X ⟹ X = Y
We reformulate causal discovery as an optimization problem in terms of ancestral structures, which drastically reduces the search space (e.g., for 7 variables: 2.3 × 10^15 → 6 × 10^6 possible structures). This requires new ancestral reasoning rules. For X, Y, Z, W disjoint (sets of) variables (Z ⤳ {X, Y} ∪ W means Z is an ancestor of at least one of them):
1. (X ⊥⊥ Y | W) ∧ ¬(X ⤳ W) ⟹ ¬(X ⤳ Y)
2. (X ⊥⊥ Y | W ∪ [Z]) ⟹ (X ⊥̸⊥ Z | W) ∧ (Z ⤳ {X, Y} ∪ W)
3. (X ⊥̸⊥ Y | W ∪ [Z]) ⟹ (X ⊥̸⊥ Z | W) ∧ ¬(Z ⤳ {X, Y} ∪ W)
4. (X ⊥⊥ Y | W ∪ [Z]) ∧ (X ⊥⊥ Z | W ∪ U) ⟹ (X ⊥⊥ Y | W ∪ U)
5. (Z ⊥̸⊥ X | W) ∧ (Z ⊥̸⊥ Y | W) ∧ (X ⊥⊥ Y | W) ⟹ (X ⊥̸⊥ Y | W ∪ Z)

Possible weighting schemes for inputs
ACI supports two types of weighted input statements: statistical independence results and ancestral relations. We propose two simple weighting schemes:
• a frequentist approach: for any appropriate frequentist statistical test with independence as the null hypothesis, define the weight w = |log p − log α|, where p is the p-value of the test and α the significance level (e.g., 5%);
• a Bayesian approach: the weight of each input i given a data set D is
  w = log [ p(i|D) / p(¬i|D) ] = log [ p(D|i) p(i) / (p(D|¬i) p(¬i)) ],
  where the prior probability p(i) can be used as a tuning parameter.
For a weighted ancestral statement X ⤳ Y, we test the independence of Y and I_X, an indicator variable (0 for observational samples, 1 for samples from the distribution in which X is intervened upon).

A method for scoring causal predictions
• Score the confidence in a predicted statement s (e.g.
X ⤳ Y) as:
  C(s) = min_{C ∈ 𝒞} loss(C, I + (¬s, ∞)) − min_{C ∈ 𝒞} loss(C, I + (s, ∞))
• This approximates the MAP log-odds ratio of s.
• It is asymptotically consistent, provided the input weights are consistent.
• It can be used with any method that solves such an optimization problem.

Simulated data
• Randomly generate 2000 linear acyclic models of n observed variables, with latent variables and Gaussian noise.
• Per model: sample 500 data points and perform independence tests up to order c.

Evaluation on simulated data
We compare ACI, HEJ [Hyttinen et al., 2014] equipped with our scoring method, and bootstrapped versions of FCI and CFCI.
[Figure: precision-recall curves for ancestral (left) and non-ancestral (right) relations, comparing bootstrapped (100) CFCI, bootstrapped (100) FCI, HEJ (c = 1), ACI (c = 1), standard CFCI and standard FCI; the middle panel zooms in on the ancestral curve.]
• ACI is as accurate as HEJ for c = 1, outperforming bootstrapped (C)FCI.
[Figure: execution time in seconds (log scale) against the number of variables (6 to 9) for HEJ and ACI.]
• ACI is orders of magnitude faster than HEJ.
• The difference grows exponentially as the number of variables n increases.
• HEJ is not feasible for 8 variables; ACI scales up to 12 variables.

Application on real data
We apply ACI to reconstruct a signalling network among Raf, Mek, PLCg, PIP2, PIP3, Erk, Akt, PKA, PKC, p38 and JNK from flow cytometry data.
[Figure: networks reconstructed by bootstrapped CFCI (independences up to order c = 1), ACI (ancestral relations only), and ACI (ancestral relations + independences up to order c = 1).]
• ACI can take advantage of weighted ancestral relations from experimental data.
• CFCI cannot, so it predicts much less.
• ACI is consistent with other methods, e.g. [Mooij and Heskes, 2013].

References
Antti Hyttinen, Frederick Eberhardt, and Matti Järvisalo. Constraint-based causal discovery: Conflict resolution with Answer Set Programming. In UAI, 2014.

ACI source code:
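The loss-minimization and confidence-scoring scheme above can be illustrated with a small sketch. The following Python example is not the authors' implementation: it only handles weighted ancestral statements (no independence tests or reasoning rules), enumerates every ancestral structure over three variables by brute force, and computes C(s) by comparing the best achievable loss with s forced false versus forced true. All names (`all_structures`, `confidence`, the example weights) are invented for illustration; the (s, ∞) hard constraint is realized by filtering structures rather than adding an infinite weight.

```python
from itertools import product

# Toy illustration of the ACI loss and confidence score (not the authors' code).
# An "ancestral structure" over VARS is a set of ordered pairs (a, b) meaning
# "a is a causal ancestor of b" (a ~> b); reflexive pairs are left implicit.
VARS = ("X", "Y", "Z")
PAIRS = [(a, b) for a in VARS for b in VARS if a != b]

def is_ancestral(rel):
    """Check antisymmetry (no 2-cycles between distinct variables) and transitivity."""
    for a, b in rel:
        if (b, a) in rel:
            return False          # a ~> b and b ~> a is forbidden
    for a, b in rel:
        for c, d in rel:
            if b == c and (a, d) not in rel:
                return False      # a ~> b and b ~> d require a ~> d
    return True

def all_structures():
    """Enumerate every ancestral structure over VARS by brute force."""
    structs = []
    for bits in product((0, 1), repeat=len(PAIRS)):
        rel = frozenset(p for p, bit in zip(PAIRS, bits) if bit)
        if is_ancestral(rel):
            structs.append(rel)
    return structs

def loss(rel, inputs):
    """Sum the weights of the input statements not satisfied by rel.

    Each input is ((a, b, holds), weight): the claim "a ~> b" is True/False.
    """
    return sum(w for (a, b, holds), w in inputs
               if ((a, b) in rel) != holds)

def confidence(s, inputs, structs):
    """C(s) = (min loss with s forced false) - (min loss with s forced true)."""
    a, b, holds = s
    best_with    = min(loss(r, inputs) for r in structs if ((a, b) in r) == holds)
    best_without = min(loss(r, inputs) for r in structs if ((a, b) in r) != holds)
    return best_without - best_with

# Example: strong evidence for X ~> Y and Y ~> Z, weak evidence against X ~> Z.
inputs = [(("X", "Y", True), 2.0),
          (("Y", "Z", True), 1.5),
          (("X", "Z", False), 0.5)]
structs = all_structures()
print(len(structs))                                   # 19 ancestral structures
print(confidence(("X", "Z", True), inputs, structs))  # 1.0
```

The positive confidence for X ⤳ Z comes purely from the transitivity built into ancestral structures: any structure containing X ⤳ Y and Y ⤳ Z must also contain X ⤳ Z, so denying it forces a larger loss. This mirrors how ACI lets strong inputs outweigh conflicting weak ones instead of failing on the conflict.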