A statistical criterion for reducing indeterminacy in
linear causal modeling
Gianluca Bontempi
Machine Learning Group,
Computer Science Department
ULB, Université Libre de Bruxelles
Boulevard du Triomphe - CP 212
Bruxelles, Belgium
http://mlg.ulb.ac.be
Causality in science
A major goal of scientific activity is to model real
phenomena by studying the dependencies between entities,
objects or, more generally, variables.
Sometimes the goal of the modeling activity is simply
predicting future behaviors. Sometimes the goal is to
understand the causes of a phenomenon (e.g. a disease).
Understanding the causes of a phenomenon means
understanding the mechanisms by which the observed variables
take their values, and predicting how those values would
change if the mechanisms were subject to manipulation.
Applications: understanding which actions to perform on a
system to obtain a desired effect (e.g. understanding the causes
of a tumor, of the activation of a gene, or of different survival
rates in a cohort of patients).
Graphical models
Graphical models are a conventional formalism used to represent
causal dependencies.
[Figure: example causal graphs (a)-(d) over the variables Smoking,
Anxiety, Genetic factor, Allergy, Coughing, Hormonal factor,
Lung cancer, Other cancers and Metastasis.]
Inferring causes from observations
Inferring causal relationships from observational data is an
open challenge in machine learning.
State-of-the-art approaches often rely on algorithms which
detect v-structures in triplets of nodes in order to orient arcs.
Bayesian network techniques search for unshielded colliders,
i.e. patterns where two variables are both direct causes of a
third one without being each a direct cause of the other.
Under the Causal Markov Condition and Faithfulness
assumptions, this structure is statistically distinguishable, and
so-called constraint-based algorithms (notably the PC and
SGS algorithms) rely on conditional independence tests to
at least partially orient a graph.
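The deck does not show the test itself; as a minimal sketch, a conditional independence test of the kind constraint-based algorithms rely on can be implemented, in the Gaussian case, as a partial-correlation test with a Fisher z transform (coefficients and variable names below are illustrative, not from the paper):

```python
import numpy as np
from scipy import stats

def ci_test(x, y, z):
    """Partial-correlation test of x ⟂ y | z: regress x and y on z,
    correlate the residuals, and apply the Fisher z transform."""
    Z = np.column_stack([np.ones(len(z)), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    r = np.corrcoef(rx, ry)[0, 1]
    n = len(x)
    z_stat = np.sqrt(n - 4) * np.arctanh(r)   # n - |z| - 3, with |z| = 1
    p_value = 2 * stats.norm.sf(abs(z_stat))
    return r, p_value

# Chain x1 -> x2 -> x3: x1 and x3 are dependent marginally,
# but independent given x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
x2 = 2.0 * x1 + rng.normal(size=2000)
x3 = -1.5 * x2 + rng.normal(size=2000)

r_marg = np.corrcoef(x1, x3)[0, 1]   # strong marginal correlation
r_cond, p = ci_test(x1, x3, x2)      # partial correlation close to zero
```

A PC-style algorithm uses many such tests to remove edges and orient the remaining ones; as the next slides argue, no such test helps once the triplet is completely connected.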
Indeterminacy
Conditional independence algorithms are bound to fail when
confronted with completely connected triplets: the lack of
independencies makes conditional independence tests
ineffective for inferring the direction of the arrows.
When there are no independencies, the arrows can point in any
direction. In other words, distinct causal patterns are not
distinguishable.
If some causal patterns are not distinguishable in terms of
conditional independence, this doesn’t necessarily mean that
they cannot be (at least partially) distinguished by using other
measures.
Are there statistics or measures that can make the problem
more distinguishable or, in other words, provide information
about which configuration is more probable?
Completely connected triplet
[Figure: completely connected triplet — x1 → x2 (coefficient b1),
x1 → x3 (b3), x2 → x3 (b2), with disturbances w1, w2, w3.]
Contributions
Our paper proposes a criterion to deal with arc orientation even
in the presence of completely linearly connected triplets.
This criterion is then used in a Relevance-Causal (RC)
algorithm, which combines the original causal criterion with a
relevance measure, to infer causal dependencies from
observational data.
A set of simulated experiments on the inference of the causal
structure of linear and nonlinear networks shows the
effectiveness of the proposed approach.
Motivations
We define a data-dependent measure able to reduce the
statistical indistinguishability of completely and linearly
connected triplets.
It is a modification of the covariance formula of a Structural
Equation Model (SEM) which results in a statistic taking
opposite signs for different causal patterns when the
unexplained variations of the variables are of the same
magnitude.
Though this assumption may appear too strong, assumptions of
comparable strength (e.g. the existence of unshielded colliders)
have been commonly used so far in causal inference.
We expect that this alternative approach could shed additional
light on the issue of causality, with a view to extending it
to more general configurations.
Completely connected triplet
[Figure: completely connected triplet — x1 → x2 (coefficient b1),
x1 → x3 (b3), x2 → x3 (b2), with disturbances w1, w2, w3.]
Structural equation models (SEM)
Let us consider the linear causal structure represented by the
completely connected triplet. In algebraic notation this corresponds
to the system of linear equations

x1 = w1
x2 = b1 x1 + w2
x3 = b3 x1 + b2 x2 + w3

where it is assumed that each variable has mean 0, the bi ≠ 0 are
also known as structural coefficients and the disturbances,
assumed independent, are denoted wi.
SEM matrix form
This set of equations can be put in the matrix form

x = A x + w

where x = [x1, x2, x3]^T,

A = | 0   0   0 |
    | b1  0   0 |
    | b3  b2  0 |

and w = [w1, w2, w3]^T. The multivariate variance-covariance
matrix has no zero entries and is given by

Σ = (I − A)^{-1} G ((I − A)^T)^{-1}    (1)

where I is the identity matrix and

G = | σ1²  0    0   |
    | 0    σ2²  0   |
    | 0    0    σ3² |

is the diagonal covariance matrix of the disturbances.
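Equation (1) can be checked numerically: build Σ from A and G, simulate the SEM, and compare with the sample covariance. The coefficient and variance values below are illustrative, not taken from the paper:

```python
import numpy as np

# Hypothetical structural coefficients and disturbance variances.
b1, b2, b3 = 0.8, -0.5, 0.6
sigma = np.array([1.0, 1.2, 0.9])

A = np.array([[0.0, 0.0, 0.0],
              [b1,  0.0, 0.0],
              [b3,  b2,  0.0]])
G = np.diag(sigma ** 2)
M = np.linalg.inv(np.eye(3) - A)      # (I - A)^{-1}

# Implied covariance, equation (1).
Sigma = M @ G @ M.T

# Simulate x = (I - A)^{-1} w and compare with the sample covariance.
rng = np.random.default_rng(1)
W = rng.normal(scale=sigma, size=(200_000, 3))
X = W @ M.T
Sigma_hat = np.cov(X, rowvar=False)
```

As the slide states, every entry of Σ is nonzero for a completely connected triplet, so no (conditional) independence is visible in the covariance structure.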
Structural equation models and causal inference
Structural equation modeling techniques for causal inference
proceed by
1 making some assumptions on the structure underlying the
data,
2 performing the related parameter estimation, usually based on
maximum likelihood and
3 assessing by significance testing the discrepancy between the
observed (sample) covariance matrix Σ̂ and the covariance
matrix Σh implied by the hypothesis.
Conventional SEM is not able to reconstruct the right directionality
of the connections in a completely connected triplet.
Indistinguishability in SEM
Suppose we want to test two alternative hypotheses,
represented by the two directed graphs in the following slide.
Note that the hypothesis on the left is correct, while the
hypothesis on the right reverses the directionality of the link
between x2 and x3 and consequently misses the causal role of
the variable x2.
Let us consider the following question: is it possible to
discriminate between the structures 1 and 2 by simply relying
on parameter estimation (in this case regression fitting)
according to the hypothesized dependencies?
The answer is unfortunately negative: both covariance
matrices Σ1 and Σ2 coincide with Σ̂.
Two hypotheses
[Figure: hypothesis 1 (left): x1 → x2 (b1), x1 → x3 (b3), x2 → x3 (b2);
hypothesis 2 (right): the same triplet with the link between x2 and x3
reversed (x3 → x2).]
Our criterion
We propose here an alternative criterion able to make this
distinction.
The computation of our criterion requires the fitting of the two
hypothetical structures as in conventional SEM.
What is different is that, instead of computing the covariance
term, we consider the term

S = (I − A)^{-1} ((I − A)^T)^{-1}.
NOTA BENE: we will make the assumption that
σ1 = σ2 = σ3 = σ, i.e. that the unexplained variations of the
variables are of comparable magnitude.
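Under this equal-variance assumption, G = σ² I, so S coincides with the covariance Σ of equation (1) up to the scalar σ². A quick numerical check (structural coefficients below are illustrative):

```python
import numpy as np

# Illustrative structural matrix for a completely connected triplet.
A = np.array([[0.0,  0.0, 0.0],
              [0.8,  0.0, 0.0],
              [0.6, -0.5, 0.0]])
M = np.linalg.inv(np.eye(3) - A)

# Criterion term: S = (I - A)^{-1} ((I - A)^T)^{-1}
S = M @ M.T

# With sigma1 = sigma2 = sigma3 = sigma, G = sigma^2 * I and
# Sigma = M G M^T = sigma^2 * S.
sigma = 1.3
Sigma = M @ ((sigma ** 2) * np.eye(3)) @ M.T
```

In other words, dropping G from equation (1) loses no structural information when the disturbance variances are equal; this is what makes S a usable statistic.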
Collider pattern
[Figure: collider pattern — x1 → x2 (b1), x1 → x3 (b3), x2 → x3 (b2),
with disturbances w1, w2, w3; x3 is a collider.]
Two hypotheses
[Figure: hypothesis 1 (left): x1 → x2 (b1), x1 → x3 (b3), x2 → x3 (b2);
hypothesis 2 (right): the same triplet with the link between x2 and x3
reversed (x3 → x2).]
Collider case
Let us suppose that data are generated according to a collider
structure where the node x3 is a collider.
We fit hypothesis 1 to the data by least squares, obtain the
matrix Â1, and compute the term

Ŝ1 = (I − Â1)^{-1} ((I − Â1)^T)^{-1}.

We fit hypothesis 2 to the data by least squares, obtain the
matrix Â2, and compute the term Ŝ2.
We showed that the quantity

C(x1, x2, x3) = Ŝ1[3, 3] − Ŝ2[3, 3] + Ŝ1[2, 2] − Ŝ2[2, 2]

is greater than zero for any sign of the structural coefficients.
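The collider computation can be reproduced end to end with a short sketch: simulate collider data, fit both hypothesised structures by least squares, and compute C. The coefficients are illustrative (all disturbances have unit variance, matching the equal-variance assumption):

```python
import numpy as np

def fit_A(X, parents):
    """Least-squares structural matrix: parents[i] lists the
    column indices of the hypothesised parents of variable i."""
    A = np.zeros((X.shape[1], X.shape[1]))
    for i, pa in enumerate(parents):
        if pa:
            A[i, pa] = np.linalg.lstsq(X[:, pa], X[:, i], rcond=None)[0]
    return A

def S_of(A):
    M = np.linalg.inv(np.eye(A.shape[0]) - A)
    return M @ M.T           # S = (I - A)^{-1} ((I - A)^T)^{-1}

# Collider data: x3 is a direct effect of both x1 and x2.
rng = np.random.default_rng(2)
N = 20_000
x1 = rng.normal(size=N)
x2 = 0.7 * x1 + rng.normal(size=N)
x3 = 0.5 * x1 + 0.9 * x2 + rng.normal(size=N)
X = np.column_stack([x1, x2, x3])

A1 = fit_A(X, [[], [0], [0, 1]])   # hypothesis 1: x1 -> x2, {x1, x2} -> x3
A2 = fit_A(X, [[], [0, 2], [0]])   # hypothesis 2: x2-x3 link reversed
S1, S2 = S_of(A1), S_of(A2)

# The slide's 1-indexed entries [3,3] and [2,2] are [2,2] and [1,1] here.
C = S1[2, 2] - S2[2, 2] + S1[1, 1] - S2[1, 1]
```

With these coefficients, C comes out clearly positive, as the slide's result predicts for a collider.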
Chain pattern
[Figure: chain triplet x1 → x3 → x2 with structural coefficients
b1, b2, b3 and disturbances w1, w2, w3.]
Chain pattern
Let us suppose that data are generated according to a chain
structure where x3 is part of the chain pattern x1 → x3 → x2.
We fit hypothesis 1 to the data and compute the term Ŝ1.
We fit hypothesis 2 to the data and compute the term Ŝ2.
We showed that the quantity

C(x1, x2, x3) = Ŝ1[3, 3] − Ŝ2[3, 3] + Ŝ1[2, 2] − Ŝ2[2, 2]

is less than zero whatever the sign of the structural
coefficients bi.
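The chain case can be checked with the same recipe as the collider: simulate chain data, fit both hypotheses by least squares, and compute C. Coefficients are illustrative; a direct x1 → x2 link keeps the triplet completely connected:

```python
import numpy as np

def fit_A(X, parents):
    """Least-squares structural matrix for hypothesised parent sets."""
    A = np.zeros((X.shape[1], X.shape[1]))
    for i, pa in enumerate(parents):
        if pa:
            A[i, pa] = np.linalg.lstsq(X[:, pa], X[:, i], rcond=None)[0]
    return A

def S_of(A):
    M = np.linalg.inv(np.eye(A.shape[0]) - A)
    return M @ M.T           # S = (I - A)^{-1} ((I - A)^T)^{-1}

# Chain data: x1 -> x3 -> x2, plus a direct x1 -> x2 link.
rng = np.random.default_rng(3)
N = 20_000
x1 = rng.normal(size=N)
x3 = 0.8 * x1 + rng.normal(size=N)
x2 = 0.6 * x3 + 0.5 * x1 + rng.normal(size=N)
X = np.column_stack([x1, x2, x3])

A1 = fit_A(X, [[], [0], [0, 1]])   # hypothesis 1: {x1, x2} -> x3
A2 = fit_A(X, [[], [0, 2], [0]])   # hypothesis 2: {x1, x3} -> x2
S1, S2 = S_of(A1), S_of(A2)
C = S1[2, 2] - S2[2, 2] + S1[1, 1] - S2[1, 1]
```

Here C comes out negative, matching the slide's result: the sign of C separates the collider case from the chain (and fork) cases.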
Fork pattern
[Figure: fork triplet — x3 is a common cause of x1 and x2, with
structural coefficients b1, b2, b3 and disturbances w1, w2, w3.]
Fork pattern
Let us suppose that data are generated according to a fork
structure where x3 is a common cause of x1 and x2.
We fit hypothesis 1 to the data and compute the term Ŝ1.
We fit hypothesis 2 to the data and compute the term Ŝ2.
We showed that the quantity

C(x1, x2, x3) = Ŝ1[3, 3] − Ŝ2[3, 3] + Ŝ1[2, 2] − Ŝ2[2, 2]

is less than zero whatever the sign of the structural
coefficients bi.
Causal statistic
The previous results show that computing the quantity C from
observational data alone can help discriminate between the
collider configuration, where the nodes x1 and x2 are direct
causes of x3 (C > 0), and non-collider configurations, i.e.
fork or chain (C < 0).
In other words, given a completely connected triplet of
variables, the quantity C(x1, x2, x3) returns useful information
about the causal role of x1 and x2 with respect to x3, whatever
the strength or direction of the link between x1 and x2.
The properties of the quantity C encourage its use in an
algorithm to infer directionality from observational data.
RC (Relevance Causal) algorithm
We then propose an RC (Relevance-Causal) algorithm for linear
causal modeling, inspired by the mIMR causal filter selection
algorithm.
It is a forward selection algorithm which, given a set XS of d
already selected variables, updates this set by adding the
(d+1)-th variable which satisfies

x*_{d+1} = arg max_{xk ∈ X − XS} [ (1 − λ) R({XS, xk}; y) + (λ/d) Σ_{xi ∈ XS} C(xi; xk; y) ]    (2)

where λ ∈ [0, 1] weights the R and C contributions, the R
term quantifies the relevance of the subset {XS, xk} and the C
term quantifies the causal role of an input xk with respect to
the set of selected variables xi ∈ XS.
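The selection rule (2) can be sketched in a few lines. This is not the paper's implementation (which relies on the mIMR machinery and leave-one-out assessment): the relevance term R is approximated here by the R² of a linear fit, and C is the triplet statistic defined on the previous slides; all data and coefficients are illustrative.

```python
import numpy as np

def triplet_C(xi, xk, y):
    """Causal statistic C(xi; xk; y): fit both hypothesised triplet
    structures by least squares and compare diagonal entries of
    S = (I - A)^{-1} ((I - A)^T)^{-1}."""
    X = np.column_stack([xi, xk, y])
    def fit(parents):
        A = np.zeros((3, 3))
        for i, pa in enumerate(parents):
            if pa:
                A[i, pa] = np.linalg.lstsq(X[:, pa], X[:, i],
                                           rcond=None)[0]
        M = np.linalg.inv(np.eye(3) - A)
        return M @ M.T
    S1 = fit([[], [0], [0, 1]])
    S2 = fit([[], [0, 2], [0]])
    return S1[2, 2] - S2[2, 2] + S1[1, 1] - S2[1, 1]

def relevance(Xs, y):
    """R^2 of a least-squares fit of y on the candidate subset."""
    Z = np.column_stack([np.ones(len(y)), Xs])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return 1.0 - resid.var() / y.var()

def rc_select(X, y, n_select, lam=0.5):
    """Forward selection maximising
    (1 - lam) * R({XS, xk}; y) + (lam / d) * sum_i C(xi; xk; y)."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_select):
        def score(k):
            R = relevance(X[:, selected + [k]], y)
            Cbar = (np.mean([triplet_C(X[:, i], X[:, k], y)
                             for i in selected])
                    if selected else 0.0)
            return (1 - lam) * R + lam * Cbar
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: x0 and x1 cause y; x2 is an effect of y; x3 is noise.
rng = np.random.default_rng(4)
N = 2000
X = rng.normal(size=(N, 4))
y = 0.8 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(size=N)
X[:, 2] = y + 0.1 * rng.normal(size=N)

sel_wrapper = rc_select(X, y, 1, lam=0.0)   # pure relevance picks x2
sel_rc = rc_select(X, y, 2, lam=0.5)
```

With λ = 0 the rule degenerates into a pure relevance ranking, as noted on the next slide; the purely relevant but non-causal variable x2 then wins the first round.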
Experimental setting
The aim of the experiment is to reverse engineer both linear
and nonlinear scale-free causal networks, i.e. networks where
the distribution of the degree follows a power law, from a
limited amount of observational data.
We consider networks with n = 5000 nodes; the exponent α of
the power law ranges between 2.1 and 3. The inference is done
on the basis of N = 200 observations.
We compare the accuracy of several algorithms in terms of the
mean F-measure (the higher, the better), averaged over 10
runs and over all nodes with a number of parents and
children greater than or equal to two.
Experimental setting
We considered the following algorithms for comparison:
the IAMB algorithm implemented by the Causal Explorer
software which estimates for a given variable the set of
variables belonging to its Markov blanket,
the mIMR causal algorithm,
the mRMR algorithm and
three versions of the RC algorithm with three different values
λ = 0, 0.5, 1.
Note that the RC algorithm with λ = 0 boils down to a
conventional wrapper algorithm based on the leave-one-out
assessment of the variables’ subsets.
Linear inference
α IAMB mIMR mRMR RC0 RC0.5 RC1
2.2 0.375 0.324 0.319 0.386 0.421 0.375
2.3 0.378 0.337 0.333 0.387 0.437 0.401
2.4 0.376 0.342 0.342 0.385 0.441 0.414
2.5 0.348 0.322 0.313 0.358 0.422 0.413
2.6 0.347 0.318 0.311 0.355 0.432 0.414
2.7 0.344 0.321 0.311 0.352 0.424 0.423
2.8 0.324 0.304 0.293 0.334 0.424 0.422
2.9 0.342 0.333 0.321 0.353 0.448 0.459
3.0 0.321 0.319 0.297 0.326 0.426 0.448
F-measure (averaged over all nodes with a number of parents and
children ≥ 2 and over 10 runs) of the accuracy of the inferred
networks.
Nonlinear inference
α IAMB mIMR mRMR RC0 RC0.5 RC1
2.2 0.312 0.310 0.304 0.314 0.356 0.324
2.3 0.317 0.328 0.316 0.320 0.375 0.349
2.4 0.304 0.317 0.304 0.306 0.366 0.351
2.5 0.321 0.327 0.328 0.325 0.379 0.359
2.6 0.306 0.325 0.306 0.309 0.379 0.365
2.7 0.313 0.319 0.303 0.316 0.380 0.359
2.8 0.297 0.326 0.300 0.300 0.392 0.382
2.9 0.310 0.329 0.313 0.313 0.389 0.377
3.0 0.299 0.324 0.300 0.303 0.399 0.392
F-measure (averaged over all nodes with a number of parents and
children ≥ 2 and over 10 runs) of the accuracy of the inferred
network.
Discussion
The results show the potential of the criterion C and of the
RC algorithm in network inference tasks where dependencies
between parents are frequent because of direct links or
common ancestors.
According to the F-measures reported in the tables, the RC
accuracy with λ = 0.5 and λ = 1 is consistently better than
that of the mIMR, mRMR and IAMB algorithms for all the
considered exponents of the degree distribution.
RC outperforms a conventional wrapper approach targeting
only prediction accuracy (λ = 0) when the causal criterion C
is taken into account together with the predictive one
(λ = 0.5).
Conclusions
Causal inference from complex, high-dimensional data is
gaining importance in machine learning and knowledge
discovery.
Currently, most existing algorithms are limited by the fact
that the discovery of causal directionality depends on the
detection of a limited set of distinguishable patterns, like
unshielded colliders.
However the scarcity of data and the intricacy of dependencies
in networks could make the detection of such patterns so rare
that the resulting precision would be unacceptable.
This paper shows that it is possible to identify new statistical
measures helping in reducing indistinguishability under the
assumption of equal variances of the unexplained variations of
the three variables.
Conclusions
Though this assumption may be questioned, we deem it
important to define new statistics that help discriminate
between causal structures for completely connected triplets in
linear causal modeling.
Future work will focus on assessing whether such a statistic
remains useful in reducing indeterminacy when the
equal-variance assumption is not satisfied.
A model-based relevance estimation approach for feature selection in microarr...
 
Machine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionMachine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series Prediction
 
Feature selection and microarray data
Feature selection and microarray dataFeature selection and microarray data
Feature selection and microarray data
 
A Monte Carlo strategy for structure multiple-step-head time series prediction
A Monte Carlo strategy for structure multiple-step-head time series predictionA Monte Carlo strategy for structure multiple-step-head time series prediction
A Monte Carlo strategy for structure multiple-step-head time series prediction
 
Some Take-Home Message about Machine Learning
Some Take-Home Message about Machine LearningSome Take-Home Message about Machine Learning
Some Take-Home Message about Machine Learning
 
FP7 evaluation & selection: the point of view of an evaluator
FP7 evaluation & selection: the point of view of an evaluatorFP7 evaluation & selection: the point of view of an evaluator
FP7 evaluation & selection: the point of view of an evaluator
 
Local modeling in regression and time series prediction
Local modeling in regression and time series predictionLocal modeling in regression and time series prediction
Local modeling in regression and time series prediction
 
Perspective of feature selection in bioinformatics
Perspective of feature selection in bioinformaticsPerspective of feature selection in bioinformatics
Perspective of feature selection in bioinformatics
 
Computational Intelligence for Time Series Prediction
Computational Intelligence for Time Series PredictionComputational Intelligence for Time Series Prediction
Computational Intelligence for Time Series Prediction
 

Recently uploaded

Uber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportUber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportSatyamNeelmani2
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单ewymefz
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsalex933524
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单ewymefz
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...correoyaya
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单ewymefz
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sMAQIB18
 
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...Domenico Conte
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundOppotus
 
Introduction-to-Cybersecurit57hhfcbbcxxx
Introduction-to-Cybersecurit57hhfcbbcxxxIntroduction-to-Cybersecurit57hhfcbbcxxx
Introduction-to-Cybersecurit57hhfcbbcxxxzahraomer517
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhArpitMalhotra16
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxbenishzehra469
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单ewymefz
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .NABLAS株式会社
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单nscud
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单enxupq
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationBoston Institute of Analytics
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIAlejandraGmez176757
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单ocavb
 

Recently uploaded (20)

Uber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportUber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis Report
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage s
 
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
Professional Data Engineer Certification Exam Guide  _  Learn  _  Google Clou...
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Introduction-to-Cybersecurit57hhfcbbcxxx
Introduction-to-Cybersecurit57hhfcbbcxxxIntroduction-to-Cybersecurit57hhfcbbcxxx
Introduction-to-Cybersecurit57hhfcbbcxxx
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 

A statistical criterion for reducing indeterminacy in linear causal modeling

  • 1. A statistical criterion for reducing indeterminacy in linear causal modeling Gianluca Bontempi Machine Learning Group, Computer Science Department ULB, Université Libre de Bruxelles Boulevard de Triomphe - CP 212 Bruxelles, Belgium http://mlg.ulb.ac.be
  • 2. Causality in science A major goal of scientific activity is to model real phenomena by studying the dependencies between entities, objects or, more generally, variables. Sometimes the goal of the modeling activity is simply to predict future behavior; sometimes the goal is to understand the causes of a phenomenon (e.g. a disease). Understanding the causes of a phenomenon means understanding the mechanisms by which the observed variables take their values, and predicting how the values of those variables would change if the mechanisms were subject to manipulation. Applications: understanding which actions to perform on a system to obtain a desired effect (e.g. understanding the causes of a tumor, the causes of the activation of a gene, the causes of different survival rates in a cohort of patients).
  • 3. Graphical models Graphical models are a conventional formalism used to represent causal dependencies. [Figure: example causal graph with nodes Genetic factor, Smoking, Anxiety, Allergy, Coughing, Lung cancer, Other cancers, Hormonal factor and Metastasis; panels (a)–(d).]
  • 4. Inferring causes from observations Inferring causal relationships from observational data is an open challenge in machine learning. State-of-the-art approaches often rely on algorithms which detect v-structures in triplets of nodes in order to orient arcs. Bayesian network techniques search for unshielded colliders, i.e. patterns where two variables are both direct causes of a third one without being each a direct cause of the other. Under the assumptions of the Causal Markov Condition and Faithfulness, this structure is statistically distinguishable, and so-called constraint-based algorithms (notably the PC and SGS algorithms) rely on conditional independence tests to orient a graph, at least partially.
  • 5. Indeterminacy Conditional independence algorithms are destined to fail when confronted with completely connected triplets. In this case the lack of independence makes conditional independence tests ineffective for inferring the direction of the arrows: when there are no independencies, the direction of the arrows can be anything. In other terms, distinct causal patterns are not distinguishable. However, if some causal patterns are not distinguishable in terms of conditional independence, this does not necessarily mean that they cannot be (at least partially) distinguished using other measures. Are there statistics or measures that can help make the problem more distinguishable, or in other terms provide information about which configuration is more probable?
  • 6. Completely connected triplet [Figure: completely connected triplet with nodes x1, x2, x3, structural coefficients b1, b2, b3 and disturbances w1, w2, w3.]
  • 7. Contributions Our paper proposes a criterion to deal with arc orientation even in the presence of completely linearly connected triplets. This criterion is then used in a Relevance-Causal (RC) algorithm, which combines the original causal criterion with a relevance measure to infer causal dependencies from observational data. A set of simulated experiments on the inference of the causal structure of linear and nonlinear networks shows the effectiveness of the proposed approach.
  • 8. Motivations We define a data-dependent measure able to reduce the statistical indistinguishability of completely and linearly connected triplets. It is a modification of the covariance formula of a Structural Equation Model (SEM) which results in a statistic taking opposite signs for different causal patterns when the unexplained variations of the variables are of the same magnitude. Though this assumption may appear too strong, assumptions of comparable strength (e.g. the existence of unshielded colliders) have commonly been used in causal inference so far. We expect that this alternative approach could shed additional light on the issue of causality, with the perspective of extending it to more general configurations.
  • 9. Completely connected triplet [Figure: completely connected triplet with nodes x1, x2, x3, structural coefficients b1, b2, b3 and disturbances w1, w2, w3.]
  • 10. Structural equation models (SEM) Let us consider the linear causal structure represented by the completely connected triplet. In algebraic notation this corresponds to the system of linear equations

        x1 = w1
        x2 = b1 x1 + w2
        x3 = b3 x1 + b2 x2 + w3

    where it is assumed that each variable has mean 0, the bi ≠ 0 are also known as structural coefficients, and the disturbances, assumed to be independent, are denoted by wi.
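The generative system above can be simulated directly; a minimal sketch in NumPy, where the coefficient values and the unit disturbance variance are illustrative choices of ours:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000
b1, b2, b3 = 0.8, 0.5, 0.3   # illustrative structural coefficients
sigma = 1.0                  # disturbance standard deviation (equal for all wi)

# disturbances w1, w2, w3 are independent and zero-mean
w = rng.normal(0.0, sigma, size=(N, 3))
x1 = w[:, 0]
x2 = b1 * x1 + w[:, 1]
x3 = b3 * x1 + b2 * x2 + w[:, 2]
X = np.column_stack([x1, x2, x3])

# the sample covariance has no zero entries: every pair of variables is correlated
print(np.cov(X, rowvar=False).round(2))
```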
  • 11. SEM matrix form This set of equations can be put in the matrix form x = A x + w, where x = [x1, x2, x3]^T,

        A = [ 0   0   0
              b1  0   0
              b3  b2  0 ]

    and w = [w1, w2, w3]^T. The multivariate variance-covariance matrix has no zero entries and is given by

        Σ = (I − A)^{−1} G ((I − A)^T)^{−1}     (1)

    where I is the identity matrix and

        G = [ σ1²  0    0
              0    σ2²  0
              0    0    σ3² ]

    is the diagonal covariance matrix of the disturbances.
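Equation (1) can be checked numerically; a sketch, with illustrative values of ours for the structural coefficients and unit disturbance variances:

```python
import numpy as np

b1, b2, b3 = 0.8, 0.5, 0.3            # illustrative structural coefficients
A = np.array([[0.0, 0.0, 0.0],
              [b1,  0.0, 0.0],
              [b3,  b2,  0.0]])
G = np.diag([1.0, 1.0, 1.0])          # diagonal disturbance covariance matrix

I = np.eye(3)
M = np.linalg.inv(I - A)
Sigma = M @ G @ M.T                   # equation (1)

print(Sigma.round(3))                 # no zero entries
```

For these values Sigma matches the sample covariance of data generated from the corresponding SEM, up to sampling noise.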
  • 12. Structural equation models and causal inference Structural equation modeling techniques for causal inference proceed by (1) making some assumptions on the structure underlying the data, (2) performing the related parameter estimation, usually based on maximum likelihood, and (3) assessing by significance testing the discrepancy between the observed (sample) covariance matrix Σ̂ and the covariance matrix Σh implied by the hypothesis. Conventional SEM is not able to reconstruct the right directionality of the connections in a completely connected triplet.
  • 13. Indistinguishability in SEM Suppose we want to test two alternative hypotheses, represented by two directed graphs. Note that the first hypothesis is correct, while the second reverses the directionality of the link between x2 and x3 and consequently misses the causal role of the variable x2. Let us consider the following question: is it possible to discriminate between structures 1 and 2 by simply relying on parameter estimation (in this case regression fitting) according to the hypothesized dependencies? The answer is unfortunately negative: both covariance matrices Σ1 and Σ2 coincide with Σ̂.
  • 15. Our criterion What we propose here is an alternative criterion able to perform such a distinction. The computation of our criterion requires fitting the two hypothetical structures as in conventional SEM. What is different is that, instead of computing the covariance term, we consider the term S = (I − A)^{−1} ((I − A)^T)^{−1}. NOTA BENE: we make the assumption that σ1 = σ2 = σ3 = σ, i.e. that the unexplained variations of the variables are of comparable magnitude.
  • 18. Collider case Let us suppose that data are generated according to a collider structure where the node x3 is a collider. We fit hypothesis 1 to the data by least squares, obtain the matrix Â1 and compute the term Ŝ1 = (I − Â1)^{−1} ((I − Â1)^T)^{−1}. We fit hypothesis 2 to the data by least squares, obtain the matrix Â2 and compute the term Ŝ2. We showed that the quantity C(x1, x2, x3) = Ŝ1[3, 3] − Ŝ2[3, 3] + Ŝ1[2, 2] − Ŝ2[2, 2] is greater than zero for any sign of the structural coefficients.
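This claim can be checked by simulation. A sketch in NumPy — the `fit_S` helper, the coefficient values and the sample size are our illustrative choices — that fits both hypotheses by ordinary least squares and evaluates C on collider-generated data:

```python
import numpy as np

def fit_S(X, parents):
    """OLS-fit each child on its hypothesised parents, then return
    S = (I - A)^{-1} ((I - A)^T)^{-1} for the estimated A."""
    n = X.shape[1]
    A = np.zeros((n, n))
    for child, pa in parents.items():
        coef, *_ = np.linalg.lstsq(X[:, pa], X[:, child], rcond=None)
        A[child, pa] = coef
    M = np.linalg.inv(np.eye(n) - A)
    return M @ M.T

rng = np.random.default_rng(1)
N = 50_000
b1, b2, b3 = 0.8, 0.5, 0.3                     # illustrative coefficients
x1 = rng.normal(size=N)
x2 = b1 * x1 + rng.normal(size=N)
x3 = b3 * x1 + b2 * x2 + rng.normal(size=N)    # x3 is a collider
X = np.column_stack([x1, x2, x3])

S1 = fit_S(X, {1: [0], 2: [0, 1]})   # hypothesis 1: x1 -> x2, {x1, x2} -> x3
S2 = fit_S(X, {2: [0], 1: [0, 2]})   # hypothesis 2: x1 -> x3, {x1, x3} -> x2

# 1-indexed [3,3] and [2,2] in the slides are [2,2] and [1,1] here
C = S1[2, 2] - S2[2, 2] + S1[1, 1] - S2[1, 1]
print(C)   # positive for the collider pattern
```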
  • 20. Chain pattern Let us suppose that data are generated according to a chain structure where x3 is part of the chain pattern x1 → x3 → x2. We fit hypothesis 1 to the data and compute the term Ŝ1; we fit hypothesis 2 to the data and compute the term Ŝ2. We showed that the quantity C(x1, x2, x3) = Ŝ1[3, 3] − Ŝ2[3, 3] + Ŝ1[2, 2] − Ŝ2[2, 2] is less than zero whatever the sign of the structural coefficients bi.
  • 22. Fork pattern Let us suppose that data are generated according to a fork structure where x3 is a common cause of x1 and x2. We fit hypothesis 1 to the data and compute the term Ŝ1; we fit hypothesis 2 to the data and compute the term Ŝ2. We showed that the quantity C(x1, x2, x3) = Ŝ1[3, 3] − Ŝ2[3, 3] + Ŝ1[2, 2] − Ŝ2[2, 2] is less than zero whatever the sign of the structural coefficients bi.
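Both non-collider cases (the chain and the fork) can be checked by simulation. A sketch, where the `fit_S` helper, the coefficient values and the sample size are our illustrative choices:

```python
import numpy as np

def fit_S(X, parents):
    """OLS-fit each child on its hypothesised parents, then return
    S = (I - A)^{-1} ((I - A)^T)^{-1} for the estimated A."""
    n = X.shape[1]
    A = np.zeros((n, n))
    for child, pa in parents.items():
        coef, *_ = np.linalg.lstsq(X[:, pa], X[:, child], rcond=None)
        A[child, pa] = coef
    M = np.linalg.inv(np.eye(n) - A)
    return M @ M.T

def criterion_C(X):
    S1 = fit_S(X, {1: [0], 2: [0, 1]})   # hypothesis 1: x1 -> x2, {x1, x2} -> x3
    S2 = fit_S(X, {2: [0], 1: [0, 2]})   # hypothesis 2: x1 -> x3, {x1, x3} -> x2
    return S1[2, 2] - S2[2, 2] + S1[1, 1] - S2[1, 1]

rng = np.random.default_rng(2)
N = 50_000
a, b = 0.8, 0.5                               # illustrative coefficients

# chain: x1 -> x3 -> x2
x1 = rng.normal(size=N)
x3 = a * x1 + rng.normal(size=N)
x2 = b * x3 + rng.normal(size=N)
C_chain = criterion_C(np.column_stack([x1, x2, x3]))

# fork: x1 <- x3 -> x2
x3 = rng.normal(size=N)
x1 = a * x3 + rng.normal(size=N)
x2 = b * x3 + rng.normal(size=N)
C_fork = criterion_C(np.column_stack([x1, x2, x3]))

print(C_chain, C_fork)   # both negative for non-collider patterns
```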
  • 23. Causal statistic The previous results show that the computation of the quantity C on the basis of observational data only can help in discriminating between the collider configuration, where the nodes x1 and x2 are direct causes of x3 (C > 0), and non-collider configurations, i.e. fork or chain (C < 0). In other terms, given a completely connected triplet of variables, the quantity C(x1, x2, x3) returns useful information about the causal role of x1 and x2 with respect to x3, whatever the strength or direction of the link between x1 and x2. The properties of the quantity C encourage its use in an algorithm to infer directionality from observational data.
  • 24. RC (Relevance-Causal) algorithm We then propose a RC (Relevance-Causal) algorithm for linear causal modeling, inspired by the mIMR causal filter selection algorithm. It is a forward selection algorithm which, given a set XS of d already selected variables, updates this set by adding the (d+1)-th variable which satisfies

        x*_{d+1} = arg max_{xk ∈ X − XS} [ (1 − λ) R({XS, xk}; y) + (λ/d) Σ_{xi ∈ XS} C(xi, xk, y) ]     (2)

    where λ ∈ [0, 1] weights the R and C contributions, the R term quantifies the relevance of the subset {XS, xk}, and the C term quantifies the causal role of an input xk with respect to the set of selected variables xi ∈ XS.
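A simplified reading of equation (2) as a forward-selection loop. Everything here is our illustrative reconstruction, not the authors' implementation: R is implemented as the R² of an OLS fit rather than a leave-one-out score, and the triplet statistic C reuses least-squares-fitted S terms:

```python
import numpy as np

def fit_S(X, parents):
    """OLS-fit each child on its hypothesised parents, then return
    S = (I - A)^{-1} ((I - A)^T)^{-1} for the estimated A."""
    n = X.shape[1]
    A = np.zeros((n, n))
    for child, pa in parents.items():
        coef, *_ = np.linalg.lstsq(X[:, pa], X[:, child], rcond=None)
        A[child, pa] = coef
    M = np.linalg.inv(np.eye(n) - A)
    return M @ M.T

def causal_C(xi, xk, y):
    """C(xi, xk, y): > 0 suggests xi and xk are direct causes of y."""
    Z = np.column_stack([xi, xk, y])
    S1 = fit_S(Z, {1: [0], 2: [0, 1]})
    S2 = fit_S(Z, {2: [0], 1: [0, 2]})
    return S1[2, 2] - S2[2, 2] + S1[1, 1] - S2[1, 1]

def r_squared(X, cols, y):
    coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
    resid = y - X[:, cols] @ coef
    return 1.0 - resid.var() / y.var()

def rc_forward(X, y, d, lam=0.5):
    """Forward selection maximising (1 - lam) * R + (lam / |S|) * sum of C."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < d:
        def score(k):
            rel = r_squared(X, selected + [k], y)
            if not selected:
                return rel                      # first pick: relevance only
            c = np.mean([causal_C(X[:, i], X[:, k], y) for i in selected])
            return (1.0 - lam) * rel + lam * c
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With λ = 0 the loop degenerates to relevance-only forward selection; with λ > 0 candidates that look like direct causes of y (positive C against the already selected inputs) are favoured.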
  • 25. Experimental setting The aim of the experiment is to reverse-engineer both linear and nonlinear scale-free causal networks, i.e. networks where the degree distribution follows a power law, from a limited amount of observational data. We consider networks with n = 5000 nodes, and the degree exponent α of the power law ranges between 2.1 and 3. The inference is done on the basis of N = 200 observations. We compare the accuracy of several algorithms in terms of the mean F-measure (the higher, the better), averaged over 10 runs and over all nodes with a number of parents and children greater than or equal to two.
  • 26. Experimental setting We considered the following algorithms for comparison: the IAMB algorithm implemented by the Causal Explorer software, which estimates for a given variable the set of variables belonging to its Markov blanket; the mIMR causal algorithm; the mRMR algorithm; and three versions of the RC algorithm with λ = 0, 0.5, 1. Note that the RC algorithm with λ = 0 boils down to a conventional wrapper algorithm based on leave-one-out assessment of variable subsets.
  • 27. Linear inference

        α     IAMB    mIMR    mRMR    RC0     RC0.5   RC1
        2.2   0.375   0.324   0.319   0.386   0.421   0.375
        2.3   0.378   0.337   0.333   0.387   0.437   0.401
        2.4   0.376   0.342   0.342   0.385   0.441   0.414
        2.5   0.348   0.322   0.313   0.358   0.422   0.413
        2.6   0.347   0.318   0.311   0.355   0.432   0.414
        2.7   0.344   0.321   0.311   0.352   0.424   0.423
        2.8   0.324   0.304   0.293   0.334   0.424   0.422
        2.9   0.342   0.333   0.321   0.353   0.448   0.459
        3.0   0.321   0.319   0.297   0.326   0.426   0.448

    F-measure (averaged over all nodes with a number of parents and children ≥ 2 and over 10 runs) of the accuracy of the inferred networks.
  • 28. Nonlinear inference

        α     IAMB    mIMR    mRMR    RC0     RC0.5   RC1
        2.2   0.312   0.310   0.304   0.314   0.356   0.324
        2.3   0.317   0.328   0.316   0.320   0.375   0.349
        2.4   0.304   0.317   0.304   0.306   0.366   0.351
        2.5   0.321   0.327   0.328   0.325   0.379   0.359
        2.6   0.306   0.325   0.306   0.309   0.379   0.365
        2.7   0.313   0.319   0.303   0.316   0.380   0.359
        2.8   0.297   0.326   0.300   0.300   0.392   0.382
        2.9   0.310   0.329   0.313   0.313   0.389   0.377
        3.0   0.299   0.324   0.300   0.303   0.399   0.392

    F-measure (averaged over all nodes with a number of parents and children ≥ 2 and over 10 runs) of the accuracy of the inferred network.
  • 29. Discussion The results show the potential of the criterion C and of the RC algorithm in network inference tasks where dependencies between parents are frequent because of direct links or common ancestors. According to the F-measures reported in the tables, the RC accuracy with λ = 0.5 and λ = 1 is consistently better than that of the mIMR, mRMR and IAMB algorithms for all the considered degree exponents. RC also outperforms a conventional wrapper approach targeting only prediction accuracy (λ = 0) when the causal criterion C is taken into account together with the predictive one (λ = 0.5).
  • 30. Conclusions Causal inference from complex high-dimensional data is taking on growing importance in machine learning and knowledge discovery. Currently, most existing algorithms are limited by the fact that the discovery of causal directionality is subordinated to the detection of a limited set of distinguishable patterns, like unshielded colliders. However, the scarcity of data and the intricacy of dependencies in networks can make the detection of such patterns so rare that the resulting precision would be unacceptable. This paper shows that it is possible to identify new statistical measures that help reduce indistinguishability, under the assumption of equal variances of the unexplained variations of the three variables.
  • 31. Conclusions Though this assumption could be questioned, we deem it important to define new statistics that help discriminate between causal structures for completely connected triplets in linear causal modeling. Future work will focus on assessing whether such a statistic remains useful in reducing indeterminacy when the assumption of equal variances is not satisfied.