Introduction to Directed Acyclic Graphs
Causal Inference Shared Interest Group
January 19th, 2016
Overview/Goals
• Provide introduction on the use of DAGs for confounder
selection in nonexperimental studies
– What DAGs offer; limitations
– Overview of DAG terminology and rules
– Overview of d-separation criteria for assessing open and closed paths
– Share approaches for selecting minimum adjustment set
• Share a few examples of DAGs and when/why other
approaches to covariate selection may fail
• Have a broad discussion on the use of DAGs; hope for
this to be a conversation
Disclaimers/Focus
• Most examples can be found in:
– Modern Epidemiology 3rd edition (Chapter 12)
– Williamson et al. (2014) Introduction to causal diagrams for
confounder selection
– Hernan et al. (2002) Causal knowledge as a prerequisite for
confounding evaluation: an application to birth defects epidemiology
• Will not be covering mathematical proofs supporting use
of DAGs for causal inference or grounding in
counterfactual definition of causation
• Will focus on use of DAGs for confounder selection; many
other uses (e.g. selection bias, instrumental variable selection,
measurement error, mediation/moderation, etc. – see Ch. 12 Modern Epi)
What is a DAG?
• A causal diagram (i.e. graph) depicting the investigator's
assumptions about causal relations among the exposure,
outcome and covariates…that’s it!
• Arrows are used to denote directionality
• Contain only unidirectional arrows (single headed)
• Must be acyclic; contain no feedback loops
• Include all endogenous variables even if unknown (U)
Why DAGs?
• Because we are often interested in estimating causal
effects from observational data
– Not an easy task as sample associations are directly
observable, but causation is not
– Four possible causal structures can contribute to an
association between X and Y
• 1.) X may cause Y; what we are interested in
• 2.) Y may cause X; fairly easy to rule out
• 3.) X and Y may share a common cause we have failed to condition
on; this is classical confounding
• 4.) we have conditioned on a collider (variable affected by X and Y)
• *Associations may also be due to chance (i.e. sampling variability)
Why DAGs?
• DAGs provide a powerful and intuitive tool for deducing the
statistical associations implied in a hypothesized causal
network (without requiring any mathematics)
• Thus, they help us to select which variables we need to
condition on (e.g. control for) to obtain unbiased estimates in
nonexperimental studies
• If our DAG is correct, and all important factors can be
measured, we can identify the minimum adjustment set
necessary to achieve “conditional exchangeability”
– e.g. exposure is essentially randomized within levels of covariates
– And causal inference can be made from the observed data
Why DAGs?
• Approaches to confounder selection based on statistical
associations may fail to identify important confounders and may support
adjusting for non-confounders
– Automatic variable selection; implicit assumption is that not all
variables selected will be confounders, but all important confounders
will be selected
– Change-in-estimate criteria; implicit assumption is that any variable
associated with a change is worth adjusting for
– Traditional confounder definition: i.) associated w/ exposure in source
population, ii.) associated with outcome among the unexposed, and iii.)
not on the causal pathway
– Intermediate, collider, and instrumental variables can all behave
statistically like confounders – need knowledge to guide selection!
Limitations of DAGs
• How can we know the true underlying causal
structure/network to draw the DAG?
– It can’t be known; if we knew it, we wouldn’t need the study!
– However, the true causal structure exists even if we do not
know it…
• All causal inferences based on statistical models are implicitly based
on some causal structure
• DAGs simply make these assumptions more explicit
• Can compare competing causal models to assess their
compatibility with the observed data
• From the observed statistical associations can also deduce causal
structures compatible with the observations/data
Limitations of DAGs
• Drawing DAGs can be challenging as they need to be
developed in the context of all available evidence
– Often requires multidisciplinary input; and well developed theory/area
• Latent variables can pose problems; their causes and effects are harder
to deduce; since they can’t be measured directly, they are also harder to condition on
• Effect modification “difficultish” to represent since each
variable represented by a single node
– Often represented by single node conceptually representing both
variables; may necessitate multiple DAGs; current area of research
• DAGs provide no information on the effect sizes or functional
forms
– Simply nonparametric graphical representations of causal networks
DAG Terminology
• Variables are nodes
• Arrows are edges and imply direction
• Paths are unbroken sequences of arrows linking nodes regardless of direction
– X > Y > Z
– U > Y > Z
– U > Y < X
[Figure: example DAG with nodes X, U, Y, Z and arrows X > Y, U > Y, Y > Z]
DAG Terminology
• X is said to directly affect Y if there is an arrow from X to Y
• X indirectly affects Z if there is a unidirectional path from X to
Z (X>Y>Z)
• Y is an “intermediate” variable between X and Z
• Y is also said to “intercept” the path from (X > Y > Z) or from
(U > Y > Z)
DAG Terminology
• Children of X are variables directly affected by X; {Y}
• Parents of Y are variables that directly affect Y; {X, U}
• Descendants of X are all variables either directly or indirectly
affected by X; {Y, Z}, but not U
• Ancestors of Z are all variables that either directly or indirectly
affect Z; {Y, X, U}
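The family terms above can be checked mechanically on the example DAG (X > Y, U > Y, Y > Z). The following is a minimal sketch; the dictionary representation and the helper names (`PARENTS`, `parents`, `children`, `ancestors`, `descendants`) are illustrative assumptions, not part of the original slides.

```python
# Example DAG from the terminology slides: X -> Y, U -> Y, Y -> Z.
# Stored as a mapping from each node to the set of its parents
# (a hypothetical representation chosen for this sketch).
PARENTS = {"X": set(), "U": set(), "Y": {"X", "U"}, "Z": {"Y"}}


def parents(node):
    """Variables that directly affect `node`."""
    return set(PARENTS[node])


def children(node):
    """Variables directly affected by `node`."""
    return {child for child, pars in PARENTS.items() if node in pars}


def ancestors(node):
    """Variables that directly or indirectly affect `node`."""
    found, stack = set(), list(PARENTS[node])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS[current])
    return found


def descendants(node):
    """Variables directly or indirectly affected by `node`."""
    return {other for other in PARENTS if node in ancestors(other)}


print(sorted(children("X")))     # ['Y']
print(sorted(parents("Y")))      # ['U', 'X']
print(sorted(descendants("X")))  # ['Y', 'Z']
print(sorted(ancestors("Z")))    # ['U', 'X', 'Y']
```

The printed sets match the ones listed on the slide, including the fact that U is not a descendant of X.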
DAG Terminology
• Directed paths are the special case where all edges in path
flow head-to-tail (unidirectional); these are causal pathways
– Ex. (X > Y), (X > Y > Z), (U > Y), (U > Y > Z)
• All undirected paths are non-causal; (U > Y < X)
• A variable on the path where two arrows meet is called a
“collider”; child of both the variable before and after it
DAG Terminology
• Nodes where multiple arrows meet are called “colliders”;
node has multiple parents (U > Y < X)
• Traditional terminology is that Y is a common effect of U and
X; e.g. Y is child to both
• Association is not transmitted across common effects!
– Two factors sharing a common effect does not mean they are associated
– No need to condition on collider; path is closed
• Colliders cause special problems for causal inference
Collider bias
• Conditioning on a collider results in collider bias because it
opens a noncausal path through the collider
• Basically, what happens is that two variables that were
originally marginally independent become dependent
conditional on the collider
• Thus, not only is adjustment not needed, but it can be harmful
• Let’s look at a hypothetical example…
Example Hernan 2002
• X represents being on a diet
• Y represents recent diagnosis of non-diet-related cancer
• Z represents recent weight loss > 5kg (1=yes, 0=no)
• Assume X does not cause Y; Z is a common effect of X and Y (a collider)
• X and Y are marginally independent in the source population
– Table 1 below
• RRXY = (100/(100+100)) / (200/(200+200)) = 1
• Knowing that someone was dieting does not change prob. of cancer
Table 1
        Y=1   Y=0
X=1     100   100
X=0     200   200
Example Hernan 2002
• Now we condition on recent weight loss (i.e. the collider Z)
• Given that someone lost weight, it becomes more likely that
she had cancer if she was not dieting
– Data here are made-up, but the above conditional association is
intuitive
• Thus, for those that lost weight, being on a diet and being
recently diagnosed with cancer are inversely related
• RRXY|Z=1 = 0.79 RRXY|Z=0 = 0.92
Z=1
        Y=1   Y=0
X=1      55    25
X=0      70    10
Z=0
        Y=1   Y=0
X=1      45    75
X=0     130   190
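The risk ratios quoted in this example can be reproduced from the made-up counts in the tables. This is a small sketch; the function and variable names (`risk_ratio`, `z1`, `z0`, `marginal`) are illustrative assumptions, and the columns are read as Y=1 and Y=0.

```python
def risk(cases, noncases):
    """Proportion with the outcome (Y=1) in one exposure row."""
    return cases / (cases + noncases)


def risk_ratio(exposed, unexposed):
    """Risk in the exposed (X=1) divided by risk in the unexposed (X=0)."""
    return risk(*exposed) / risk(*unexposed)


# Counts as (Y=1, Y=0) for each exposure row, within strata of weight loss Z.
z1 = {"X=1": (55, 25), "X=0": (70, 10)}     # lost weight
z0 = {"X=1": (45, 75), "X=0": (130, 190)}   # did not lose weight

# Collapsing over Z recovers the marginal table (100/100 and 200/200).
marginal = {
    row: (z1[row][0] + z0[row][0], z1[row][1] + z0[row][1]) for row in z1
}

rr_marginal = risk_ratio(marginal["X=1"], marginal["X=0"])
rr_z1 = risk_ratio(z1["X=1"], z1["X=0"])
rr_z0 = risk_ratio(z0["X=1"], z0["X=0"])

print(round(rr_marginal, 2))  # 1.0
print(round(rr_z1, 2))        # 0.79
print(round(rr_z0, 2))        # 0.92
```

The marginal risk ratio is 1, while both stratum-specific risk ratios fall below 1: stratifying on the collider induces an association that is absent in the source population.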
Examples of confounding and collider bias
- Top: No effect of A on Y; no effect of A
on Y observed; path blocked at collider
C4; no bias
- Middle: No effect of A on Y; potential
association between A and Y observed;
conditioning on collider C4; bias
- Bottom: No effect A on Y; A < C4 > Y
represents classical confounding, but
adjustment opens up A < C1 > C4 < C3
< C2 > Y path; A-Y association
observed; bias (need to also adjust for C1)
d-separation criteria
• The “d” stands for “directional”; a.k.a. “directed graph
separation rules”; this is one of the reasons we care about DAGs!
• Two variables are d-connected if there is an open path between
them; they are d-separated if all paths are closed
• If two variables are d-separated then, under the assumed DAG, they will be
unassociated; no open path from X to Y
• If two variables are d-connected there is an open path, implying a
potential association between them
– Open directed pathways between X and Y are what we want to estimate!
– We also want to control for all undirected open pathways between X and Y
d-separation criteria
• Often defined in terms of unconditional and conditional separation
• Unconditional d-separation
– Path is open if there are no colliders on the path
– If there is a collider on the path, it is closed unconditionally; the collider blocks the path
– Thus, all directed paths are open (as they can have no collider)
• Conditional d-separation
– Conditioning on a non-collider Z blocks the path at Z
– Conditioning on a collider, or a descendant of a collider, opens the path
• Combining these criteria allows us to identify the set of covariates
that will block confounding paths!
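The criteria above can be checked by program. The sketch below does not walk paths one by one; it uses the equivalent "moralized ancestral graph" test: keep only the variables of interest and their ancestors, link every pair of parents that share a child, drop arrow directions, delete the conditioning set, and see whether the two variables are still connected. The function names and the dictionary representation are assumptions made for illustration, and the DAG is the X > Y, U > Y, Y > Z example from the terminology slides.

```python
from itertools import combinations


def ancestors_of(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    found, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents[node])
    return found


def d_separated(parents, x, y, given=()):
    """True if x and y are d-separated given the set `given`.

    `parents` maps each node to the set of its parents.
    Assumes x and y are not themselves in `given`.
    """
    given = set(given)
    keep = ancestors_of(parents, {x, y} | given)

    # Moralize: undirected parent-child links plus links between co-parents.
    neighbours = {node: set() for node in keep}
    for child in keep:
        for parent in parents[child]:
            neighbours[child].add(parent)
            neighbours[parent].add(child)
        for p, q in combinations(parents[child], 2):
            neighbours[p].add(q)
            neighbours[q].add(p)

    # Search from x without passing through conditioned variables.
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node] - given)
    return True


DAG = {"X": set(), "U": set(), "Y": {"X", "U"}, "Z": {"Y"}}

print(d_separated(DAG, "X", "U"))         # True: path closed at collider Y
print(d_separated(DAG, "X", "U", {"Y"}))  # False: conditioning on the collider opens it
print(d_separated(DAG, "X", "U", {"Z"}))  # False: Z is a descendant of the collider
print(d_separated(DAG, "X", "Z"))         # False: directed path X > Y > Z is open
print(d_separated(DAG, "X", "Z", {"Y"}))  # True: conditioning on non-collider Y blocks it
```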
Williamson d-separation summary
Let’s look at some examples
• Interested in association b/w
smoking and adult asthma
• Here truth assumed to be
known
• Smoking is not a cause of
asthma
• Do we need to adjust for
childhood asthma?
– Smk < cAsthma > aAsthma is
open
– Estimate will be biased
– Need to condition on cAsthma
– Classic confounding bias
Let’s look at some examples
• What if the true causal
structure were different?
• What do we condition on
here?
• Nothing – path blocked at
collider (parent smk > cAsthma
< atopy); unbiased
• If we adjust for cAsthma here, we
open up a backdoor path; bias
• Would have to also condition
on parental smoking; unbiased
– Sufficient set if cAsthma is adjusted for = {parent
smoking, cAsthma}; the minimal set is empty
Let’s look at some examples
• What if the causal
structure were different?
• What to condition on?
• Path from personal smoking
< cAsthma < atopy >
aAsthma is now open; bias
• Colliders are path specific!
• Minimal sufficient
adjustment sets
– {atopy}
– {cAsthma, parent smoking}
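The two minimal sets listed above can be recovered by brute force. The sketch below rests on a loud assumption: the slide's figure is not available here, so the DAG is a guess reconstructed from the bullet text (parental smoking > smoking, parental smoking > cAsthma, atopy > cAsthma, atopy > aAsthma, cAsthma > smoking). All names are hypothetical. It applies the backdoor criterion by removing arrows out of the exposure, excluding its descendants, and testing d-separation with the moralized ancestral graph test.

```python
from itertools import combinations


def ancestors_of(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    found, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents[node])
    return found


def d_separated(parents, x, y, given):
    """Moralized ancestral graph test for d-separation of x and y."""
    given = set(given)
    keep = ancestors_of(parents, {x, y} | given)
    neighbours = {node: set() for node in keep}
    for child in keep:
        linked = list(parents[child]) + [child]
        for p, q in combinations(linked, 2):  # parent-child and co-parent links
            neighbours[p].add(q)
            neighbours[q].add(p)
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node] - given)
    return True


def minimal_adjustment_sets(parents, exposure, outcome):
    """Enumerate minimal sets satisfying the backdoor criterion."""
    # Backdoor graph: drop every arrow leaving the exposure.
    backdoor = {n: ps - {exposure} for n, ps in parents.items()}
    descendants = {n for n in parents
                   if exposure in ancestors_of(parents, parents[n])}
    candidates = sorted(set(parents) - {exposure, outcome} - descendants)
    minimal = []
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            if any(set(found) <= set(subset) for found in minimal):
                continue  # a smaller sufficient set is already contained
            if d_separated(backdoor, exposure, outcome, subset):
                minimal.append(list(subset))
    return minimal


# ASSUMED structure, reconstructed from the bullet text (figure unavailable).
ASSUMED_DAG = {
    "parent_smoking": set(),
    "atopy": set(),
    "child_asthma": {"parent_smoking", "atopy"},
    "smoking": {"parent_smoking", "child_asthma"},
    "adult_asthma": {"atopy"},
}

print(minimal_adjustment_sets(ASSUMED_DAG, "smoking", "adult_asthma"))
# [['atopy'], ['child_asthma', 'parent_smoking']]
```

Under this assumed structure, adjusting for cAsthma alone is not sufficient, which is the path-specific collider behaviour the slide points out. Exhaustive enumeration is only practical for small graphs.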
Bit more realistic example; still simple
- Might need some rules to figure out what to condition on
here!
DAGs can get highly complex!
-Definitely going to need some rules here!
DAGs can get highly complex!
-And here!
Pearl’s Rules to ID Minimal Adjustment Set
• Rules provide a formulaic approach to identifying a
minimal adjustment set
• Still need to start by guessing at the set
• Still likely to be a bit confusing at first
• Luckily, there are some terrific open-source software
packages that will do this for us!
– DAGitty: web platform – very easy to use
– dagR package in R
DAGitty Example
• I made this example in about 60 sec.
• Red shows biasing path
Example of when other approaches may fail
• Example taken from Hernan et al. (2002) Causal Knowledge as
a Prerequisite for Confounding Evaluation
• Examined relation between folic acid supplementation and
neural tube defects in the Slone Epidemiology Birth Defects
Study (1992-1997 to rule out impact of fortification)
• Cases were mothers whose infants had neural tube defects;
controls delivered infants with non-folic acid related defects
• Exposure was folic acid supplementation (yes/no)
• C is a potential confounder, unrevealed for now, but known to
not be on the causal pathway from exposure to disease
• For simplicity assume all variables measured w/o error
Example of when other approaches may fail
• Automatic variable selection: add if p-value < 0.10; covariate meets
this criterion
• Change-in-estimate criterion: changes OR by ~15%; covariate meets
the criterion
• Traditional confounding definition: meets all criteria
• Unadjusted OR = 0.65 (0.45; 0.94); inversely related
• Adjusted OR = 0.80 (0.53; 1.20) get some attenuation
• What is the covariate? Whether pregnancy ends in
stillbirth/therapeutic abortion
– Do we think it is a confounder? Common cause of supplementation use
and neural tube defects? Not likely, but commonly adjusted for
– Theory suggests pregnancy loss is a common effect of low folic acid and neural tube defects; collider
– Analogous to restricting subjects to just live births!
Good Resources
• Introductory papers:
– Williamson E, Aitken Z, Lawrie J, Dharmage S, Burgess J,
Forbes A. Introduction to causal diagrams for confounder
selection. Respirology. 2014;19(3):303–311.
– Hernán MA, Hernández-Díaz S, Werler MM, Mitchell AA.
Causal knowledge as a prerequisite for confounding
evaluation: an application to birth defects
epidemiology. Am J Epidemiol. 2002;155(2):176–84.
– Shrier I, Platt R. Reducing bias through directed acyclic
graphs. BMC Med Res Methodol. 2008;8(1):70.
doi:10.1186/1471-2288-8-70.
• Chapter 12: Modern Epidemiology 3rd edition

More Related Content

Similar to Introduction to Directed Acyclic Graphs (DAGs) for Confounder Selection

TYPESOFDATAANALYSIS research methodology .pdf
TYPESOFDATAANALYSIS research methodology .pdfTYPESOFDATAANALYSIS research methodology .pdf
TYPESOFDATAANALYSIS research methodology .pdfMounika711622
 
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docxSTAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docxwhitneyleman54422
 
ML4 Regression.pptx
ML4 Regression.pptxML4 Regression.pptx
ML4 Regression.pptxDayal Sati
 
MEASURES OF DISPERSION.ppt
MEASURES OF DISPERSION.pptMEASURES OF DISPERSION.ppt
MEASURES OF DISPERSION.pptVnDr
 
Measures of dispersion
Measures of dispersionMeasures of dispersion
Measures of dispersionSanoj Fernando
 
Institutional Research and Regression
Institutional Research and RegressionInstitutional Research and Regression
Institutional Research and RegressionColby Stoever
 
how to select the appropriate method for our study of interest
how to select the appropriate method for our study of interest how to select the appropriate method for our study of interest
how to select the appropriate method for our study of interest NurFathihaTahiatSeeu
 
Duality in AdS/CFT, Chicago 7 Nov. 2014
Duality in AdS/CFT, Chicago 7 Nov. 2014Duality in AdS/CFT, Chicago 7 Nov. 2014
Duality in AdS/CFT, Chicago 7 Nov. 2014Sebastian De Haro
 
Introduction to Statistics53004300.ppt
Introduction to Statistics53004300.pptIntroduction to Statistics53004300.ppt
Introduction to Statistics53004300.pptTripthiDubey
 
Causal Inference Introduction.pdf
Causal Inference Introduction.pdfCausal Inference Introduction.pdf
Causal Inference Introduction.pdfYuna Koyama
 
Chapter 2 methods and statistics
Chapter 2  methods and statisticsChapter 2  methods and statistics
Chapter 2 methods and statisticsPsych Soon
 
PARAMETRIC TESTS.pptx
PARAMETRIC TESTS.pptxPARAMETRIC TESTS.pptx
PARAMETRIC TESTS.pptxDrLasya
 
Quantitative Research-Measurement & presentation.pdf
Quantitative Research-Measurement & presentation.pdfQuantitative Research-Measurement & presentation.pdf
Quantitative Research-Measurement & presentation.pdfSameena Siddique
 
chapter1.ppt
chapter1.pptchapter1.ppt
chapter1.pptabir014
 

Similar to Introduction to Directed Acyclic Graphs (DAGs) for Confounder Selection (20)

TYPESOFDATAANALYSIS research methodology .pdf
TYPESOFDATAANALYSIS research methodology .pdfTYPESOFDATAANALYSIS research methodology .pdf
TYPESOFDATAANALYSIS research methodology .pdf
 
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docxSTAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
STAT 350 (Spring 2017) Homework 11 (20 points + 1 point BONUS).docx
 
ML4 Regression.pptx
ML4 Regression.pptxML4 Regression.pptx
ML4 Regression.pptx
 
MEASURES OF DISPERSION.ppt
MEASURES OF DISPERSION.pptMEASURES OF DISPERSION.ppt
MEASURES OF DISPERSION.ppt
 
Measures of dispersion
Measures of dispersionMeasures of dispersion
Measures of dispersion
 
Institutional Research and Regression
Institutional Research and RegressionInstitutional Research and Regression
Institutional Research and Regression
 
template.pptx
template.pptxtemplate.pptx
template.pptx
 
how to select the appropriate method for our study of interest
how to select the appropriate method for our study of interest how to select the appropriate method for our study of interest
how to select the appropriate method for our study of interest
 
Duality in AdS/CFT, Chicago 7 Nov. 2014
Duality in AdS/CFT, Chicago 7 Nov. 2014Duality in AdS/CFT, Chicago 7 Nov. 2014
Duality in AdS/CFT, Chicago 7 Nov. 2014
 
Introduction to Statistics53004300.ppt
Introduction to Statistics53004300.pptIntroduction to Statistics53004300.ppt
Introduction to Statistics53004300.ppt
 
RM7.ppt
RM7.pptRM7.ppt
RM7.ppt
 
Causal Inference Introduction.pdf
Causal Inference Introduction.pdfCausal Inference Introduction.pdf
Causal Inference Introduction.pdf
 
Chapter 2 methods and statistics
Chapter 2  methods and statisticsChapter 2  methods and statistics
Chapter 2 methods and statistics
 
PARAMETRIC TESTS.pptx
PARAMETRIC TESTS.pptxPARAMETRIC TESTS.pptx
PARAMETRIC TESTS.pptx
 
07 Applications of Diffusion (2017)
07 Applications of Diffusion (2017)07 Applications of Diffusion (2017)
07 Applications of Diffusion (2017)
 
Quantitative Research-Measurement & presentation.pdf
Quantitative Research-Measurement & presentation.pdfQuantitative Research-Measurement & presentation.pdf
Quantitative Research-Measurement & presentation.pdf
 
Multiple regression .pptx
Multiple regression .pptxMultiple regression .pptx
Multiple regression .pptx
 
chapter1.ppt
chapter1.pptchapter1.ppt
chapter1.ppt
 
chapter1.ppt
chapter1.pptchapter1.ppt
chapter1.ppt
 
chapter1.ppt
chapter1.pptchapter1.ppt
chapter1.ppt
 

More from DivyanshGupta922023

(Public) FedCM BlinkOn 16 fedcm and privacy sandbox apis
(Public) FedCM BlinkOn 16 fedcm and privacy sandbox apis(Public) FedCM BlinkOn 16 fedcm and privacy sandbox apis
(Public) FedCM BlinkOn 16 fedcm and privacy sandbox apisDivyanshGupta922023
 
DevOps The Buzzword - everything about devops
DevOps The Buzzword - everything about devopsDevOps The Buzzword - everything about devops
DevOps The Buzzword - everything about devopsDivyanshGupta922023
 
Git Basics walkthough to all basic concept and commands of git
Git Basics walkthough to all basic concept and commands of gitGit Basics walkthough to all basic concept and commands of git
Git Basics walkthough to all basic concept and commands of gitDivyanshGupta922023
 
jquery summit presentation for large scale javascript applications
jquery summit  presentation for large scale javascript applicationsjquery summit  presentation for large scale javascript applications
jquery summit presentation for large scale javascript applicationsDivyanshGupta922023
 
DHC Microbiome Presentation 4-23-19.pptx
DHC Microbiome Presentation 4-23-19.pptxDHC Microbiome Presentation 4-23-19.pptx
DHC Microbiome Presentation 4-23-19.pptxDivyanshGupta922023
 
10-security-concepts-lightning-talk 1of2.pptx
10-security-concepts-lightning-talk 1of2.pptx10-security-concepts-lightning-talk 1of2.pptx
10-security-concepts-lightning-talk 1of2.pptxDivyanshGupta922023
 

More from DivyanshGupta922023 (17)

(Public) FedCM BlinkOn 16 fedcm and privacy sandbox apis
(Public) FedCM BlinkOn 16 fedcm and privacy sandbox apis(Public) FedCM BlinkOn 16 fedcm and privacy sandbox apis
(Public) FedCM BlinkOn 16 fedcm and privacy sandbox apis
 
DevOps The Buzzword - everything about devops
DevOps The Buzzword - everything about devopsDevOps The Buzzword - everything about devops
DevOps The Buzzword - everything about devops
 
Git Basics walkthough to all basic concept and commands of git
Git Basics walkthough to all basic concept and commands of gitGit Basics walkthough to all basic concept and commands of git
Git Basics walkthough to all basic concept and commands of git
 
jquery summit presentation for large scale javascript applications
jquery summit  presentation for large scale javascript applicationsjquery summit  presentation for large scale javascript applications
jquery summit presentation for large scale javascript applications
 
Next.js - ReactPlayIO.pptx
Next.js - ReactPlayIO.pptxNext.js - ReactPlayIO.pptx
Next.js - ReactPlayIO.pptx
 
Management+team.pptx
Management+team.pptxManagement+team.pptx
Management+team.pptx
 
DHC Microbiome Presentation 4-23-19.pptx
DHC Microbiome Presentation 4-23-19.pptxDHC Microbiome Presentation 4-23-19.pptx
DHC Microbiome Presentation 4-23-19.pptx
 
developer-burnout.pdf
developer-burnout.pdfdeveloper-burnout.pdf
developer-burnout.pdf
 
AzureIntro.pptx
AzureIntro.pptxAzureIntro.pptx
AzureIntro.pptx
 
api-driven-development.pdf
api-driven-development.pdfapi-driven-development.pdf
api-driven-development.pdf
 
Internet of Things.pptx
Internet of Things.pptxInternet of Things.pptx
Internet of Things.pptx
 
Functional JS+ ES6.pptx
Functional JS+ ES6.pptxFunctional JS+ ES6.pptx
Functional JS+ ES6.pptx
 
AAAI19-Open.pptx
AAAI19-Open.pptxAAAI19-Open.pptx
AAAI19-Open.pptx
 
10-security-concepts-lightning-talk 1of2.pptx
10-security-concepts-lightning-talk 1of2.pptx10-security-concepts-lightning-talk 1of2.pptx
10-security-concepts-lightning-talk 1of2.pptx
 
ReactJS presentation.pptx
ReactJS presentation.pptxReactJS presentation.pptx
ReactJS presentation.pptx
 
01-React js Intro.pptx
01-React js Intro.pptx01-React js Intro.pptx
01-React js Intro.pptx
 
Nextjs13.pptx
Nextjs13.pptxNextjs13.pptx
Nextjs13.pptx
 

Recently uploaded

High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 

Recently uploaded (20)

High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS

Introduction to Directed Acyclic Graphs (DAGs) for Confounder Selection

  • 1. Introduction to Directed Acyclic Graphs Causal Inference Shared Interest Group January 19th, 2016
  • 2. Overview/Goals • Provide introduction on the use of DAGs for confounder selection in nonexperimental studies – What DAGs offer; limitations – Overview of DAG terminology and rules – Overview of d-separation criteria for assessing open and closed paths – Share approaches for selecting minimum adjustment set • Share a few examples of DAGs and when/why other approaches to covariate selection may fail • Have a broad discussion on the use of DAGs; hope for this to be a conversation
  • 3. Disclaimers/Focus • Most examples can be found in: – Modern Epidemiology 3rd edition (Chapter 12) – Williamson et al. (2014) Introduction to causal diagrams for confounder selection – Hernán et al. (2002) Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology • Will not be covering mathematical proofs supporting use of DAGs for causal inference or grounding in counterfactual definition of causation • Will focus on use of DAGs for confounder selection; many other uses (e.g. selection bias, instrumental variable selection, measurement error, mediation/moderation, etc. – see Ch. 12 Modern Epi)
  • 4. What is a DAG?
  • 5. What is a DAG? • A causal diagram (i.e. a graph) depicting the investigator's assumptions about causal relations among the exposure, outcome and covariates…that’s it! • Arrows are used to denote directionality • Contain only unidirectional arrows (single headed) • Must be acyclic; contain no feedback loops • Include all endogenous variables even if unknown (U)
  • 6. Why DAGs? • Because we are often interested in estimating causal effects from observational data – Not an easy task as sample associations are directly observable, but causation is not – Four possible causal structures can contribute to an association between X and Y • 1.) X may cause Y; what we are interested in • 2.) Y may cause X; fairly easy to rule out • 3.) X and Y may share a common cause we have failed to condition on; this is classical confounding • 4.) we have conditioned on a collider (variable affected by X and Y) • *Associations may also be due to chance (i.e. sampling variability)
  • 7. Why DAGs? • DAGs provide a powerful and intuitive tool for deducing the statistical associations implied in a hypothesized causal network (without requiring any mathematics) • Thus, they help us to select which variables we need to condition on (i.e. control for) to obtain unbiased estimates in nonexperimental studies • If our DAG is correct, and all important factors can be measured, we can identify the minimum adjustment set necessary to achieve “conditional exchangeability” – i.e. exposure is essentially randomized within levels of covariates – And causal inference can be made from the observed data
  • 8. Why DAGs • Approaches to confounder selection based on statistical associations may fail to identify important confounders and may support adjusting for non-confounders – Automatic variable selection; implicit assumption is that not all variables selected will be confounders, but all important confounders will be selected – Change-in-estimate criteria; implicit assumption is that any variable associated with a change is worth adjusting for – Traditional confounder definition – i.) associated w/ exposure in source population, ii.) associated with outcome among the unexposed, and iii.) not on the causal pathway – Intermediate, collider, and instrumental variables can all behave statistically like confounders – need knowledge to guide selection!
  • 9. Limitations of DAGs • How can we know the true underlying causal structure/network to draw the DAG? – It can’t be known; if we knew it, we wouldn’t need the study! – However, the true causal structure exists even if we do not know it… • All causal inferences based on statistical models are implicitly based on some causal structure • DAGs simply make these assumptions more explicit • Can compare competing causal models to assess their compatibility with the observed data • From the observed statistical associations can also deduce causal structures compatible with the observations/data
  • 10. Limitations of DAGs • Drawing DAGs can be challenging as they need to be developed in the context of all available evidence – Often requires multidisciplinary input; and well developed theory/area • Latent variables can pose problems; their causes and effects are harder to deduce; because they can’t be measured directly they are also harder to condition on • Effect modification “difficultish” to represent since each variable is represented by a single node – Often represented by a single node conceptually representing both variables; may necessitate multiple DAGs; current area of research • DAGs provide no information on the effect sizes or functional forms – Simply nonparametric graphical representations of causal networks
  • 11. DAG Terminology • Variables are nodes • Arrows are edges and imply direction • Paths are unbroken sequences of arrows linking nodes regardless of direction – X > Y > Z – U > Y > Z – U > Y < X • [Figure: example DAG with nodes X, U, Y, Z]
  • 12. DAG Terminology • X is said to directly affect Y if there is an arrow from X to Y • X indirectly affects Z if there is a unidirectional path from X to Z (X>Y>Z) • Y is an “intermediate” variable between X and Z • Y is also said to “intercept” the path from (X > Y > Z) or from (U > Y > Z)
  • 13. DAG Terminology • Children of X are variables directly affected by X; {Y} • Parents of Y are variables that directly affect Y; {X, U} • Descendants of X are all variables either directly or indirectly affected by X; {Y, Z}, but not U • Ancestors of Z are all variables that either directly or indirectly affect Z; {Y, X, U}
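The family terms above can be checked mechanically. The following is a minimal sketch in plain Python, assuming the example DAG has exactly the edges implied by the paths listed on the terminology slides (X > Y, U > Y, Y > Z); the function names are illustrative and not taken from any package mentioned in the deck.

```python
# Assumed edge list for the example DAG: X -> Y, U -> Y, Y -> Z
EDGES = [("X", "Y"), ("U", "Y"), ("Y", "Z")]


def children(node, edges=EDGES):
    """Variables directly affected by `node`."""
    return {child for parent, child in edges if parent == node}


def parents(node, edges=EDGES):
    """Variables that directly affect `node`."""
    return {parent for parent, child in edges if child == node}


def _reach(node, step, edges):
    """Collect everything reachable from `node` by repeatedly applying `step`."""
    found, stack = set(), [node]
    while stack:
        for nxt in step(stack.pop(), edges):
            if nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found


def descendants(node, edges=EDGES):
    """Variables directly or indirectly affected by `node`."""
    return _reach(node, children, edges)


def ancestors(node, edges=EDGES):
    """Variables that directly or indirectly affect `node`."""
    return _reach(node, parents, edges)


print(children("X"))     # {'Y'}
print(parents("Y"))      # X and U
print(descendants("X"))  # Y and Z, but not U
print(ancestors("Z"))    # Y, X and U
```

The results match the sets given on the slide: children of X are {Y}, parents of Y are {X, U}, descendants of X are {Y, Z}, and ancestors of Z are {Y, X, U}.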
  • 14. DAG Terminology • Directed paths are the special case where all edges in path flow head-to-tail (unidirectional); these are causal pathways – Ex. (X > Y), (X > Y > Z), (U > Y), (U > Y > Z) • All undirected paths are non-causal; (U > Y < X) • A variable on the path where two arrows meet is called a “collider”; child of both the variable before and after it
  • 15. DAG Terminology • Nodes where multiple arrows meet are called “colliders”; node has multiple parents (U > Y < X) • Traditional terminology is that Y is a common effect of U and X; i.e. Y is child to both • Association is not transmitted across common effects! – Two factors sharing a common effect does not mean they are associated – No need to condition on collider; path is closed • Colliders cause special problems for causal inference
  • 16. Collider bias • Conditioning on a collider results in collider bias because it opens a noncausal path across the colliding nodes • Basically, what happens is that two variables that were originally marginally independent become dependent conditional on the collider • Thus, not only is adjustment not needed, but it can be harmful • Let’s look at a hypothetical example…
  • 17. Example Hernan 2002 • X represents being on a diet • Y represents recent diagnosis of non-diet-related cancer • Z represents recent weight loss > 5kg (1=yes, 0=no) • Assume X does not cause Y; Z is a common effect (collider) of X and Y • X and Y are marginally independent in the source population – Table 1 below • RR_XY = (100/(100+100)) / (200/(200+200)) = 1 • Knowing that someone was dieting does not change prob. of cancer • Table 1: X=1: Y=1 100, Y=0 100; X=0: Y=1 200, Y=0 200
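The marginal risk ratio on this slide can be reproduced directly from the four cell counts. This is a small sketch; the function name and argument order are illustrative choices, not something defined in the deck.

```python
def risk_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Risk of Y=1 among X=1 divided by risk of Y=1 among X=0."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


# Table 1 counts: X=1 row is (Y=1: 100, Y=0: 100); X=0 row is (Y=1: 200, Y=0: 200)
rr_marginal = risk_ratio(100, 100, 200, 200)
print(rr_marginal)  # 1.0, i.e. dieting carries no information about cancer
```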
  • 18. Example Hernan 2002 • Now we condition on recent weight loss (i.e. the collider) • Given that someone lost weight, it becomes more likely that she had cancer if she was not dieting – Data here are made-up, but the above conditional association is intuitive • Thus, for those that lost weight, being on a diet and being recently diagnosed with cancer are inversely related • RR_XY|Z=1 = 0.79; RR_XY|Z=0 = 0.92 • Z=1 stratum: X=1: Y=1 55, Y=0 25; X=0: Y=1 70, Y=0 10 • Z=0 stratum: X=1: Y=1 45, Y=0 75; X=0: Y=1 130, Y=0 190
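The stratum-specific risk ratios can be recomputed from the two weight-loss tables on this slide, and summing the strata recovers the marginal table from the previous slide. This is a sketch using the made-up counts from the slide; the variable names are illustrative, with the table columns read as the cancer outcome Y and the strata as weight loss Z.

```python
def risk_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Risk of Y=1 among X=1 divided by risk of Y=1 among X=0."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


# Cell order per stratum: (X=1,Y=1), (X=1,Y=0), (X=0,Y=1), (X=0,Y=0)
strata = {
    1: (55, 25, 70, 10),    # Z=1: lost weight
    0: (45, 75, 130, 190),  # Z=0: did not lose weight
}

rr_by_stratum = {z: risk_ratio(*cells) for z, cells in strata.items()}
print(round(rr_by_stratum[1], 2))  # 0.79
print(round(rr_by_stratum[0], 2))  # 0.92

# Collapsing over Z gives back the marginal table, where the risk ratio is 1
pooled = tuple(sum(cells) for cells in zip(*strata.values()))
print(pooled)               # (100, 100, 200, 200)
print(risk_ratio(*pooled))  # 1.0
```

Both conditional risk ratios fall below 1 even though the pooled risk ratio is exactly 1, which is the collider bias the slide describes.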
  • 19. Examples of confounding and collider bias - Top: No effect of A on Y; no effect of A on Y observed; path blocked at collider C4; no bias - Middle: No effect of A on Y; potential association between A and Y observed; conditioning on collider C4; bias - Bottom: No effect of A on Y; A < C4 > Y represents classical confounding, but adjustment opens up A < C1 > C4 < C3 < C2 > Y path; A-Y association observed; bias (also need to adjust for C1)
  • 20. d-separation criteria • Stands for “directional” separation criteria a.k.a. “directed graph separation rules”; this is one of the reasons we care about DAGs! • Two variables are d-connected if there is an open path between them; they are d-separated if all paths are closed • If two variables are d-separated then they will be unassociated; no open path from X to Y • If two variables are d-connected there is an open path implying a marginal association between them – Open directed pathways between X and Y are what we want to estimate! – We also want to control for all undirected open pathways between X and Y
  • 21. d-separation criteria • Often defined in terms of unconditional and conditional separation • Unconditional d-separation – Path is open if there are no colliders on the path – If collider on path; closed unconditionally; collider blocks path – Thus, all directed paths are open (as can have no collider) • Conditional d-separation – Conditioning on a non-collider Z blocks the path at Z – Conditioning on a collider, or descendant of a collider, opens the path • Combining these criteria allows us to identify the set of covariates that will block confounding paths!
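The unconditional and conditional rules above can be combined into a small path-based checker. This is a minimal sketch in plain Python, not the algorithm used by DAGitty or dagR: it enumerates every path in a small DAG and applies the collider and non-collider rules to each one. The edge set is assumed from the earlier terminology slides (X > Y, U > Y, Y > Z), and all function names are illustrative.

```python
def descendants(edges, node):
    """All nodes reachable from `node` by following arrows forward."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found


def simple_paths(edges, start, end):
    """Every path from start to end, ignoring arrow direction, with no repeated node."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    paths, stack = [], [[start]]
    while stack:
        path = stack.pop()
        for nxt in neighbours.get(path[-1], ()):
            if nxt in path:
                continue
            if nxt == end:
                paths.append(path + [nxt])
            else:
                stack.append(path + [nxt])
    return paths


def path_is_open(edges, path, given):
    """Apply the d-separation rules to each interior node of one path."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        is_collider = (prev, node) in edges and (nxt, node) in edges
        if is_collider:
            # A collider blocks unless it, or one of its descendants, is conditioned on
            if node not in given and not (descendants(edges, node) & given):
                return False
        elif node in given:
            # Conditioning on a non-collider blocks the path at that node
            return False
    return True


def d_separated(edges, x, y, given=()):
    """True if every path between x and y is closed given the conditioning set."""
    given = set(given)
    return not any(path_is_open(edges, p, given) for p in simple_paths(edges, x, y))


# Assumed example DAG: X -> Y, U -> Y, Y -> Z
EDGES = {("X", "Y"), ("U", "Y"), ("Y", "Z")}
print(d_separated(EDGES, "X", "U"))         # True: blocked at collider Y
print(d_separated(EDGES, "X", "U", {"Y"}))  # False: conditioning on the collider opens the path
print(d_separated(EDGES, "X", "U", {"Z"}))  # False: Z is a descendant of the collider
print(d_separated(EDGES, "X", "Z"))         # False: directed path X > Y > Z is open
print(d_separated(EDGES, "X", "Z", {"Y"}))  # True: conditioning on intermediate Y blocks it
```

Enumerating all paths is only practical for small graphs like the teaching examples here; it is meant to mirror the slide's rules one-to-one rather than to be efficient.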
  • 23. Let’s look at some examples • Interested in association b/w smoking and adult asthma • Here truth assumed to be known • Smoking is not a cause of asthma • Do we need to adjust for childhood asthma? – Smk < cAsthma > aAsthma is open – Estimate will be biased – Need to condition on cAsthma – Classic confounding bias
  • 24. Let’s look at some examples • What if the true causal structure were different? • What do we condition on here? • Nothing – path blocked at collider: parent smk > cAsthma < atopy; unbiased • If we adjust for cAsthma here we open up a backdoor path; bias • Would have to also condition on parental smoking; unbiased – Minimum set = {parent smoking and cAsthma}
  • 25. Let’s look at some examples • What if the causal structure were different? • What to condition on? • Path from personal smoking < cAsthma < atopy > aAsthma is now open; bias • Colliders are path specific! • Minimal sufficient adjustment sets – {atopy} – {cAsthma, parent smoking}
  • 26. Bit more realistic example; still simple - Might need some rules to figure out what to condition on here!
  • 27. DAGs can get highly complex! -Definitely going to need some rules here!
  • 28. DAGs can get highly complex! -And here!
  • 29. Pearl’s Rules to ID Minimal Adjustment Set
  • 30. Pearl’s Rules to ID Minimal Adjustment Set
  • 31. Pearl’s Rules to ID Minimal Adjustment Set • Rules provide a formulaic approach to identifying a minimal adjustment set • Still need to start by guessing at the set • Still likely to be a bit confusing at first • Luckily, there are some terrific open-source software packages that will do this for us! – DAGitty: web platform – very easy to use – dagR package in R
  • 32. DAGitty Example • I made this example in about 60 sec. • Red shows biasing path
  • 33. Example of when other approaches may fail • Example taken from Hernan et al. (2002) Causal Knowledge as a Prerequisite for Confounding Evaluation • Examined relation between folic acid supplementation and neural tube defects in the Slone Epidemiology Birth Defects Study (1992-1997 to rule out impact of fortification) • Cases were mothers whose infants had neural tube defects; controls delivered infants with non-folic acid related defects • Exposure was folic acid supplementation (yes/no) • C is a potential confounder, unrevealed for now, but known to not be on the causal pathway from exposure to disease • For simplicity assume all variables measured w/o error
  • 34. Example of when other approaches may fail • Automatic variable selection: add if p-value < 0.10; covariate meets this criterion • Change-in-estimate criteria: changes OR by ~15%; covariate meets criteria • Traditional confounding definition: meets all criteria • Unadjusted OR = 0.65 (0.45; 0.94); inversely related • Adjusted OR = 0.80 (0.53; 1.20); get some attenuation • What is the covariate? Whether pregnancy ends in stillbirth/therapeutic abortion – Do we think it is a confounder? Common cause of supplementation use and neural tube defects? Not likely, but commonly adjusted for – Theory suggests pregnancy loss is a common effect of low folic acid and neural tube defects; collider – Analogous to restricting subjects to just live births!
  • 35. Good Resources • Introductory papers: – Williamson E, Aitken Z, Lawrie J, Dharmage S, Burgess J, Forbes A. Introduction to causal diagrams for confounder selection. Respirology. 2014;19(3):303–311. – Hernán MA, Hernández-Díaz S, Werler MM, Mitchell AA. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol. 2002;155(2):176–184. – Shrier I, Platt R. Reducing bias through directed acyclic graphs. BMC Med Res Methodol. 2008;8(1):70. doi:10.1186/1471-2288-8-70. • Chapter 12: Modern Epidemiology 3rd edition