Causality, Confounding and Control 060619


Published in: Technology

  • [need] Causality remains a problematic issue in applied research. When is a relation causal, and when is it the result of confounding? When should we control for confounding variables? [task] Tackle confounding and control (give an alternative definition of confounding and discuss when to control) within the framework of structural modelling. This framework is far broader than the mere formal specification of a set of equations, as it requires a “philosophical” stance on what a model is and on the meaning of “structural”. [main message] Approaching causality in terms of exogeneity is a viable option if the whole conceptual apparatus of structural models is specified, and yet latent confounders can put exogeneity in serious trouble.
  • Briefly browse over topics
  • I’m not going to teach statistics; I’ll assume familiarity with the technicalities and just emphasize a couple of conceptual issues. We consider statistical models to be stochastic representations of the world. The error term represents what is not explained by the model. Modelling consists in abstracting, that is, constructing a simplified representation of a complex reality. The purpose of the model is NOT to be true, BUT to be useful. Data can be analyzed as if they were a realization of a member of the family of distributions M. A model is also made of assumptions. Some of them are purely statistical in character; others have a fundamental bearing on causality.
  • Present the example. We are interested in the (causal) relation between smoking and cancer. As is well known, other factors can cause cancer, e.g. exposure to asbestos dust. Question: how do we model this relation? (See next slide.)
  • A special class of statistical models, namely structural models, makes statistical inference operational and meaningful. “Structural” means a representation of the world that is stable under a large class of interventions. Structural modelling means capturing an underlying (meaning, causal) structure of the world. To uncover an underlying causal structure we perform a marginal-conditional decomposition of the statistical model, in which only the conditional part is structural and represents the data generating process. In our example of smoking, asbestos and cancer, we would perform this decomposition. Each distribution on the right-hand side represents a data generating process. Only the third component is structural, so A and S are exogenous for the parameter of interest. Informally and very briefly: an exogenous variable is a variable for which the mechanism explaining it gives no information on the mechanism of interest; an endogenous variable is a variable for which the mechanism explaining it is of interest. An exogenous variable is a causal variable in a structural conditional model. Exogeneity is a condition of separability of inference.
  • Consider the smoking-cancer example again. A more realistic picture: smoking and asbestos exposure both depend on SES. The standard definition of confounding has two conditions: (1) the risk groups differ on the (confounding) variable; (2) the variable itself influences the outcome. In the example, A is a confounder under this definition. The alternative definition of confounding: a confounding variable is a common cause of both the putative cause and its outcome. In the example, SES is a confounder under this definition, which also explains why A and S are associated (they have a common cause).
  • Let’s make things more complicated. We are still interested in the relation S → C. Suppose now that (first graph) SES and A are latent or overlooked. Because SES and A are not observed, it might be tempting to collapse the graph into the second graph. This is intuitively right, BUT what happens in the structural conditional model? Formally, the graph on the right is obtained by integrating out the latent variables (trust the statistician who made the calculations!). This leads to a loss of exogeneity because the marginal and the conditional components now share common parameters (highlighted). Therefore S is no longer exogenous. Moral: latent confounders put exogenous variables in trouble, and simplified graphs (on the right) become misleading.
  • So what? If SES is not observed, what can we do? One option is to use another (observed!) variable, for instance K (K might be a proxy of SES). Another is to use an intermediate variable between SES and S, for instance N (N might be the network of peers).
  • Briefly recall the topics covered in the talk
  • Let’s draw some general conclusions. There would be a lot to say; let’s focus on three issues. (1) Background knowledge: its fundamental role in model building (e.g. selection of variables, formulation of working hypotheses …) and in evaluation (is the mechanism a plausible one?). (2) What is the meaning of causality, then? In structural modelling, causality is an epistemic, NOT a metaphysical, concept. Causality concerns our knowledge and representation of causal relations, rather than what a cause *really* is (besides, this raises quite substantial issues in the ontology of the social sciences). (3) A causalist perspective is justified (a) for cognitive goals and (b) for action-oriented goals. Notice that to design effective policies, interventions and treatments (action-oriented goal) we have to *know* (cognitive goal) what causes what.
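
The decomposition and the loss of exogeneity described in the notes above can be sketched in symbols. This is a reconstruction under stated assumptions, not the formulas shown on the slides: the ordering of the factors (A, then S given A, then C given A and S), the parameter names, and the graph SES → A, SES → S, A → C, S → C (with no direct SES → C arrow) are all assumed for illustration.

```latex
% Assumed ordering of the marginal-conditional decomposition.
% Only the third factor is structural; A and S are exogenous for
% \theta_{C \mid A,S} when it shares no information with the other parameters.
\[
p(A, S, C \mid \theta)
  = p(A \mid \theta_{A})\;
    p(S \mid A, \theta_{S \mid A})\;
    p(C \mid A, S, \theta_{C \mid A,S})
\]

% Assumed graph: SES -> A, SES -> S, A -> C, S -> C, with SES and A latent.
% Integrating out the latent variables gives the collapsed model for (S, C):
\[
\begin{aligned}
p(S \mid \cdot)
  &= \int p(S \mid \mathit{SES}, \theta_{S \mid \mathit{SES}})\,
          p(\mathit{SES} \mid \theta_{\mathit{SES}})\, d\mathit{SES} \\
p(C \mid S, \cdot)
  &= \iint p(C \mid A, S, \theta_{C \mid A,S})\,
           p(A \mid \mathit{SES}, \theta_{A \mid \mathit{SES}})\,
           p(\mathit{SES} \mid S, \theta_{\mathit{SES}}, \theta_{S \mid \mathit{SES}})\,
           dA\, d\mathit{SES}
\end{aligned}
\]
```

Under these assumptions, both the marginal model for S and the conditional model for C given S depend on the parameters of SES and of S given SES, so the two components share parameters and S is no longer exogenous in the collapsed model.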

    1. Causality, Confounding, and Control. Michel Mouchart (Institute of Statistics, UcLouvain), Federica Russo (Philosophy, University of Kent), Guillaume Wunsch (Institute of Demography, UcLouvain)
    2. Overview
         • Models and Structure
         • Confounders and Confounding
         • Heterogeneity and Latent Confounders
         • Controlling Latent Confounders
         • Discussion and conclusions
    3. Models and structure (i)
         • What is a statistical model?
             • $M = \{S, P^{\theta} : \theta \in \Theta\}$, with sample space $S$, sampling distribution $P^{\theta}$ and parameter $\theta$
             • A stochastic representation of the world
             • Analyze data as a realization of a member of the family of distributions
    4. Models and structure (ii)
         An example: Cancer (C), Smoking (S) and Asbestos exposure (A)
         [Diagram with nodes A, S, C]
    5. Models and structure (iii)
         The marginal-conditional decomposition: [equation not captured in the transcript]
         • Each distribution represents a data generating process
         • The 3rd component is structural:
         • A and S are exogenous variables for $\theta_{C|A,S}$
         • A and S are causal variables
    6. Confounders and Confounding
         • Cancer (C), Smoking (S), Asbestos (A) and Socio-Economic Status (SES)
         [Diagram with nodes A, SES, S, C]
         • Standard definitions of confounding:
             • A as a confounder
         • Confounders as common cause:
             • SES as a confounder
    7. Heterogeneity and Latent Confounders
         • Non-observability of confounders: problems for exogeneity
         [Two diagrams: the full graph with nodes SES, A, S, C and the collapsed graph with nodes S, C]
    8. Controlling Latent Confounders
         [Diagram with nodes SES, K, S, C, A, N]
    9. To sum up …
         • Exogeneity and confounding are defined only within a structural model
         • Exogeneity is relative to parameters of interest
         • Confounding as a common cause of treatment and outcome
         • Latent variables may play havoc with exogeneity
         • Controlling for latent variables?
    10. Conclusions
         • Background knowledge is essential in structural modelling
         • Meaning of causality:
             • epistemic, not metaphysical
         • An explicit causalist perspective is justified:
             • cognitive and action-oriented goals
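
The point about latent confounders and proxies can be illustrated with a small simulation. This is only a toy sketch and not the authors' analysis: it assumes a linear-Gaussian version of the graph SES → A, SES → S, A → C, S → C, treats C as a continuous score rather than a diagnosis, and uses made-up coefficients. The names `simulate` and `effect_of_s` and the proxy noise level are hypothetical.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Draw from an ASSUMED linear-Gaussian version of the SES/A/S/C graph."""
    rng = np.random.default_rng(seed)
    ses = rng.normal(size=n)                    # socio-economic status (latent in the talk)
    a = -0.8 * ses + rng.normal(size=n)         # asbestos exposure, caused by SES
    s = -0.8 * ses + rng.normal(size=n)         # smoking, caused by SES
    c = 1.0 * s + 1.0 * a + rng.normal(size=n)  # cancer score; true S -> C effect is 1.0
    k = ses + 0.5 * rng.normal(size=n)          # K: noisy observed proxy of SES
    return {"ses": ses, "a": a, "s": s, "c": c, "k": k}


def effect_of_s(data, controls=()):
    """OLS coefficient of S when regressing C on S plus the named controls."""
    columns = [np.ones_like(data["s"]), data["s"]]
    columns += [data[name] for name in controls]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, data["c"], rcond=None)
    return coef[1]  # index 0 is the intercept, index 1 is S


data = simulate()
naive = effect_of_s(data)                        # SES and A both ignored
with_proxy = effect_of_s(data, controls=("k",))  # control for the proxy K
with_ses = effect_of_s(data, controls=("ses",))  # as if SES were observed
print(f"no control: {naive:.2f}, proxy K: {with_proxy:.2f}, SES observed: {with_ses:.2f}")
```

Under these assumed coefficients, the uncontrolled estimate overstates the effect of S (about 1.39 in the population, against a true value of 1.0), controlling for the noisy proxy K removes only part of the bias (about 1.11), and controlling for SES itself recovers the true value. How well a proxy works in practice depends on how closely it tracks the latent confounder.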