Iussp2005 Presentation1
  • Causality has long been debated in both philosophy and science, but the two discussions have been conducted largely independently. One objective of this paper is to restore a fruitful dialogue between philosophy and science. Here we address the question: to what extent can a statistical model say something about causal relations among variables? We attempt an answer by analyzing a special class of statistical models, namely structural models. Take-home message: from a statistical viewpoint, causality can be operationally defined in terms of exogeneity. A closer epistemological analysis of structural models reveals the fundamental role of assumptions, background knowledge and the hypothetico-deductive (H-D) methodology.
  • Main message: We espouse a moderate version of scientific realism, according to which models grant cognitive access to (at least some) unobservable aspects of reality. Modelling consists in abstracting, constructing a simplified representation of a complex reality. The purpose of the model is NOT to be true, BUT to be useful.
  • The aim is to acquire causal knowledge; to do that, we try to make sense of observations. However, collecting data is problematic in several respects: what we decide to observe depends on the research questions; data may be erroneous (voluntary or involuntary errors); time ordering; the definition of abstract concepts; …
  • I am not going to teach statistics; I will assume familiarity with the technicalities and just emphasize a couple of conceptual issues. We consider statistical models to be stochastic representations of the world. The error term represents what is not explained by the model. Data can be analyzed as if they were a realization of a family of distributions M. A model is also made of assumptions.
  • Statistical inference is concerned with two aspects: (i) induction: drawing conclusions about what has not been observed from what has been observed; (ii) learning-by-observing: learning about some aspect of interest, accumulating information as observations accumulate. Structural models make statistical inference operational and meaningful. "Structural" means a representation of the world that is stable under a large class of interventions. Structural modelling means capturing an underlying (i.e. causal) structure of the world.
  • Informally and very briefly: an exogenous variable is a variable for which the mechanism explaining it gives no information on the mechanism of interest. An endogenous variable is a variable for which the mechanism explaining it is of interest. The exogenous variables participate in explaining the mechanism of the endogenous variable. A causal variable is an exogenous variable in a structural conditional model.
  • Consequences: causality is internal to the model, and exogeneity is an operational concept of causality. However, other features of structural models also grant causality. We distinguish temporal and atemporal aspects. These are assumptions about determinism, recursiveness, covariate sufficiency, no confounding, invariance, causal asymmetry, direction of time, …
  • Causality is internal to a model, BUT we do not deduce causes from correlations. A hypothetico-deductive methodology is employed when we have at our disposal enough well-confirmed theories and background knowledge to formulate a prior causal hypothesis. Structural models are hypothetico-deductive models, for which empirical testing is performed in two stages: (i) prior theorizing of out-of-sample information, including in particular the selection of variables deemed to be of interest, the formulation of a causal hypothesis (also called the conceptual hypothesis), etc.; (ii) iteratively: (a) building the statistical model; (b) testing the adequacy between the model and the data, in order to accept or reject the empirical validity of the causal hypothesis.
  • Why the problem arises: causal conclusions drawn from statistical models concern populations as well as individuals, although probability distributions and their parameters are typically defined relative to the population. Populations are made of individuals, so we distinguish two levels of causation: population-level and individual-level. Example to mention: smoking and lung cancer. Methodological issue: what the causal (i.e. exogenous) variables are, whether it is possible at all to provide a sufficient list, and what mechanisms operate among the variables deemed to be causal. These tasks are difficult to achieve because of the heterogeneity of individuals in the same population. Practical issue: can a physician decide whether or not to prescribe a treatment on the basis of a causal model? Behind the practical issue hides the epistemological one: the physician’s decision depends on the relationship between causality at the population level and at the individual level. What we discover about the average relation between smoking and lung cancer, i.e. at the population level, can guide causal attribution in the case of Harry through a simple tool of probabilistic reasoning, namely Bayes’ theorem. In fact, Bayes’ theorem allows us to calculate the posterior probability of the cause for a given individual, provided that the population risk is interpreted as a prior probability for this individual.
  • Structural models are characterized by parameters that are stable over a large class of interventions; in the marginal-conditional decomposition, the conditional part describes the data generating process; this part is structural, i.e. causal. Causality is here defined in terms of exogeneity. However, we have to go beyond the operational concept of exogeneity: a richer and more complex concept of causality requires acknowledging the role of assumptions, of background knowledge and of the H-D methodology. Within these structural models we are allowed to formulate causal statements, i.e. causality is internal to the model. Causality is relative to a structural model, but this is not to deny causality in the world. Rather, it is to emphasize that causal knowledge depends on structural models that mediate epistemic access to causal relations.
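The step from population-level knowledge to an individual case, described in the notes on the population and the individual, can be illustrated with a small calculation. What follows is only a minimal sketch under stated assumptions: the function name `posterior_cause` is hypothetical, the three input numbers are invented purely for illustration (they are not taken from the paper or from any study), and the "cause" is read simply as a hypothesis H whose prior probability is set equal to a population-level figure.

```python
def posterior_cause(prior_cause, p_effect_given_cause, p_effect_given_no_cause):
    """Return P(cause | effect) by Bayes' theorem.

    prior_cause: population-level probability of the cause, interpreted
        as the prior probability for the individual (the assumption made
        in the notes).
    p_effect_given_cause / p_effect_given_no_cause: probability of the
        effect with and without the cause.
    """
    for p in (prior_cause, p_effect_given_cause, p_effect_given_no_cause):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")

    joint_with_cause = prior_cause * p_effect_given_cause
    joint_without_cause = (1.0 - prior_cause) * p_effect_given_no_cause
    evidence = joint_with_cause + joint_without_cause  # P(effect)
    if evidence == 0.0:
        raise ValueError("the effect has probability zero under these inputs")
    return joint_with_cause / evidence


# Hypothetical, illustrative numbers only.
posterior = posterior_cause(
    prior_cause=0.3,               # assumed population figure used as prior
    p_effect_given_cause=0.10,     # assumed P(effect | cause)
    p_effect_given_no_cause=0.01,  # assumed P(effect | no cause)
)
print(f"Posterior probability of the cause: {posterior:.3f}")
```

With these made-up inputs the posterior is 0.03 / 0.037, roughly 0.81. The calculation is only as good as the assumption that the population figure is an appropriate prior for the individual, which is exactly the epistemological issue the notes raise about heterogeneity within a population.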

    1. Statistical Modelling and Causality. Federica Russo*, Michel Mouchart**, Michel Ghins*, Guillaume Wunsch***. * Institut Supérieur de Philosophie, Université Catholique de Louvain; ** Institut de Statistique, Université Catholique de Louvain; *** Institut de Démographie, Université Catholique de Louvain
    2. Structure of the paper <ul><li>Scientific knowledge </li></ul><ul><li>Data </li></ul><ul><li>Causality and statistical modelling </li></ul><ul><ul><li>The statistical model </li></ul></ul><ul><ul><li>Statistical inference and structural models </li></ul></ul><ul><ul><li>Conditional models and exogeneity </li></ul></ul><ul><ul><li>Beyond exogeneity </li></ul></ul><ul><ul><li>Hypothetico-deductive methodology </li></ul></ul><ul><li>The population and the individual </li></ul>
    3. Scientific knowledge <ul><li>We are moderate realists: </li></ul><ul><ul><li>Models grant cognitive access </li></ul></ul><ul><ul><li>to some unobservable parts of reality </li></ul></ul><ul><ul><li>Modelling is constructing a simplified </li></ul></ul><ul><ul><li>representation of a complex reality </li></ul></ul>
    4. Data <ul><li>To acquire causal knowledge </li></ul><ul><li>we try to make sense of observations </li></ul><ul><li>However, collecting data is </li></ul><ul><li>problematic in several respects </li></ul>
    5. Causality and statistical modelling <ul><li>The statistical model </li></ul><ul><ul><li>A stochastic representation of the world </li></ul></ul><ul><ul><li>Analyze data as a realization </li></ul></ul><ul><ul><li>of a family of distributions </li></ul></ul>
    6. Causality and statistical modelling <ul><li>Statistical inference and structural models </li></ul><ul><ul><li>Structural models make statistical inference </li></ul></ul><ul><ul><li>operational and meaningful </li></ul></ul><ul><ul><li>through a learning-by-observing process </li></ul></ul><ul><ul><li>Structural models: a representation of the </li></ul></ul><ul><ul><li>world that is stable under a large class </li></ul></ul><ul><ul><li>of interventions </li></ul></ul>
    7. Causality and statistical modelling <ul><li>Conditional models and exogeneity </li></ul><ul><li>$p(x \mid \theta) = p(z \mid \theta_1)\, p(y \mid z, \theta_2)$ </li></ul>The conditional part is structural and represents the data generating process <ul><ul><li>Z is an exogenous variable in a structural model, </li></ul></ul><ul><ul><li>that is Z is a causal variable </li></ul></ul>
    8. Causality and statistical modelling <ul><ul><li>Causality: exogeneity in a structural model </li></ul></ul><ul><ul><ul><li>operational concept, </li></ul></ul></ul><ul><ul><ul><li>internal to the model </li></ul></ul></ul><ul><li>Beyond exogeneity </li></ul><ul><ul><li>Temporal and atemporal features grant causality </li></ul></ul>
    9. Hypothetico-deductive methodology <ul><li>Theorizing out-of-sample information </li></ul><ul><li>Choice of variables </li></ul><ul><li>Formulation of the causal hypothesis </li></ul><ul><li>Iteratively: </li></ul><ul><ul><li>Building the statistical model </li></ul></ul><ul><ul><li>Testing the adequacy model-data </li></ul></ul>
    10. The population and the individual <ul><li>Methodological issue: </li></ul><ul><ul><li>Detect causal variables, provide a sufficient list </li></ul></ul><ul><ul><li>Describe the causal mechanism </li></ul></ul><ul><li>Practical issue: </li></ul><ul><ul><li>Take decisions about individuals </li></ul></ul><ul><ul><li>based on knowledge about the population </li></ul></ul><ul><li>Epistemological issue: </li></ul><ul><ul><li>Causal knowledge about the population </li></ul></ul><ul><ul><li>guides causal attribution about individuals </li></ul></ul><ul><ul><li>through Bayes’ theorem </li></ul></ul>
    11. To conclude… <ul><li>Causality: exogeneity in a structural model </li></ul><ul><li>Thus defined, causality is internal </li></ul><ul><li>to the model </li></ul><ul><li>Structural models mediate </li></ul><ul><li>epistemic access to causal relations </li></ul>
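
The idea that the conditional part of a structural model is stable under interventions on the exogenous variable can be illustrated with a toy simulation. This is a sketch under strong assumptions, not the authors' method: the linear form y = intercept + slope·z + noise, the Gaussian distributions, the parameter values, and the helper names `simulate` and `ols_slope` are all hypothetical choices made for illustration. The "intervention" is modelled simply as a shift in the mean of the marginal distribution of z, while the conditional mechanism generating y from z is left untouched.

```python
import random


def simulate(n, z_mean, intercept, slope, seed):
    """Draw n pairs (z, y): z from its marginal, y from the conditional given z."""
    rng = random.Random(seed)
    zs = [rng.gauss(z_mean, 1.0) for _ in range(n)]                  # marginal part
    ys = [intercept + slope * z + rng.gauss(0.0, 1.0) for z in zs]  # conditional part
    return zs, ys


def ols_slope(zs, ys):
    """Ordinary least squares slope of y on z."""
    n = len(zs)
    z_bar = sum(zs) / n
    y_bar = sum(ys) / n
    cov = sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys))
    var = sum((z - z_bar) ** 2 for z in zs)
    return cov / var


TRUE_SLOPE = 2.0  # assumed parameter of the conditional model

# Before the intervention: z is centred at 0.
z_before, y_before = simulate(5000, z_mean=0.0, intercept=1.0, slope=TRUE_SLOPE, seed=1)
# After the intervention: only the marginal distribution of z changes.
z_after, y_after = simulate(5000, z_mean=3.0, intercept=1.0, slope=TRUE_SLOPE, seed=2)

slope_before = ols_slope(z_before, y_before)
slope_after = ols_slope(z_after, y_after)

print(f"mean of z before / after: {sum(z_before) / 5000:.2f} / {sum(z_after) / 5000:.2f}")
print(f"estimated slope before / after: {slope_before:.2f} / {slope_after:.2f}")
```

Both slope estimates should land close to the assumed value even though the marginal distribution of z has moved. Note that the invariance holds here by construction; in empirical work it is an assumption that has to be argued for from background knowledge and tested, as the hypothetico-deductive methodology in the slides requires.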
