Latent Variable Models,
Causal Inference,
and Sensitivity Analysis
Alex D’Amour
Google Brain
SAMSI Causal Inference Opening Workshop
Dec 8, 2019
Goal
Summarize a recent conversation at the intersection of ML
and Causal Inference communities.
Advocate for sensitivity analysis as a constructive way
forward.
The Deconfounder Debate
Causes A = (A^(1), …, A^(m)) → Outcome Y.
P(Y | do(A)) is not non-parametrically identified.
U factorizes P(A)
Factor Model as “Soft” Observation
Apply standard causal adjustment.
Lots of ML tools for fitting the factor model.
Methods of this type appear across science, and are standard procedure (e.g., Price et al 2006 in GWAS).
This paper makes the causal argument explicit.
The factor model setting buys useful information, but it is not sufficient for identification.
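As a deliberately stylized sketch of this pipeline (PCA standing in for the factor model, a linear outcome model, all parameters illustrative and not the paper's actual method):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5000, 10

# One latent confounder U drives all m causes.
U = rng.normal(size=n)
A = 0.8 * U[:, None] + rng.normal(size=(n, m))

# Outcome depends on the first cause (true effect tau) and on U.
tau = 1.0
Y = tau * A[:, 0] + 2.0 * U + rng.normal(size=n)

# Factor-model step: PCA as a stand-in for any factor model of the causes.
A_c = A - A.mean(axis=0)
_, _, Vt = np.linalg.svd(A_c, full_matrices=False)
U_hat = A_c @ Vt[0]  # substitute confounder: first principal component

# Standard adjustment: regress Y on the focal cause plus the substitute.
X_naive = np.column_stack([np.ones(n), A[:, 0]])
X_adj = np.column_stack([np.ones(n), A[:, 0], U_hat])
b_naive = np.linalg.lstsq(X_naive, Y, rcond=None)[0]
b_adj = np.linalg.lstsq(X_adj, Y, rcond=None)[0]

print(f"naive estimate:    {b_naive[1]:.2f}")  # confounded, far from tau
print(f"adjusted estimate: {b_adj[1]:.2f}")    # close to tau = 1.0
```

In this equal-loadings linear setup the substitute confounder happens to work out; the rest of the talk is about why such success is not guaranteed in general.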
Identification and Sensitivity
Knowing the Audience
The ML community is optimistic.
Presumption: identified until proven otherwise (this is changing).
Proofs of identification.
Counter-examples showing non-identification.
Causal Inference Optimism
In every badly identified model, there lies a great* sensitivity analysis.
*Here, “great” means a sensitivity analysis that maps out a true region of partial identification. See, e.g., https://arxiv.org/abs/1809.00399 and cited work.
Sensitivity Analysis
Constructive proof of non-identification.
Identify indeterminacy in the model for U, A, Y.
Ignorance region: the set of causal parameters compatible
with observed data distribution.
For fancy machine learning, consider 2 × 2 tables.
Binary with Multiple Causes
How much do we know about the confounding? (Increasing information about U:)
1. P(U | A) ambiguous.
2. P(U | A) identified.
3. P(U | A) degenerate.
Rosenbaum and Rubin 1983
Single cause setting. All binary.
Parameterize violations of ignorability by 𝛿 and 𝛼.
Examine the range of conclusions as the parameters vary.
Rosenbaum and Rubin 1983, Assessing Sensitivity to an Unobserved Binary Covariate in an Observational Study with Binary Outcome. JRSS-B.
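Rosenbaum and Rubin parameterize on the logistic scale; as a simplified risk-scale analogue (illustrative numbers, and a constant outcome shift 𝛿 across arms as an assumed simplification), one can map out the range of adjusted effects over a sensitivity grid:

```python
import numpy as np

# Observed 2x2 summary (illustrative numbers): P(Y=1 | A=a).
p_y1_a1, p_y1_a0 = 0.60, 0.40
rd_obs = p_y1_a1 - p_y1_a0  # naive risk difference

# Sensitivity parameters, in the spirit of Rosenbaum & Rubin (1983):
#   p_a   = P(U=1 | A=a)  -- how U is distributed across arms
#   delta = P(Y=1|A,U=1) - P(Y=1|A,U=0), assumed constant in A.
# Under these assumptions the U-adjusted risk difference is
#   rd_adj = rd_obs - delta * (p1 - p0).
grid = np.linspace(0.0, 1.0, 21)
adjusted = []
for delta in np.linspace(-0.5, 0.5, 21):
    for p1 in grid:
        for p0 in grid:
            # Keep only settings where the implied stratum-level
            # outcome probabilities stay inside [0, 1].
            ok = True
            for p_y, p_u in [(p_y1_a1, p1), (p_y1_a0, p0)]:
                base = p_y - delta * p_u  # P(Y=1 | A=a, U=0)
                if not (0 <= base <= 1 and 0 <= base + delta <= 1):
                    ok = False
            if ok:
                adjusted.append(rd_obs - delta * (p1 - p0))

print(f"naive RD: {rd_obs:.2f}")
print(f"adjusted RD range: [{min(adjusted):.2f}, {max(adjusted):.2f}]")
```

The interval brackets the naive estimate and crosses zero for modest parameter values, which is exactly the kind of conclusion-range the slide describes.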
Binary with Multiple Causes
Note: m ≥ 3 and p_A(U) non-trivial ⇒ P(U, A) unique.
P(U, Y | A = a)
What if there is uncertainty about the
unobserved confounder?
Non-Degenerate Case
Constrained by observed data.
Constrained by identified factorization.
Sums to 1.
(0,0,0,0,0,0), (1,0,0,0,0,0), …, (0,0,0,0,0,1)
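One way to see the ignorance region concretely: even when both margins of the 2 × 2 table for (U, Y) given A = a are identified, the joint cell is only interval-bounded (Fréchet–Hoeffding bounds). A small sketch with made-up margin values:

```python
# With both margins identified -- P(U=1|A=a) from the factor model and
# P(Y=1|A=a) from the observed data -- the joint is still only
# interval-identified. These are the Frechet-Hoeffding bounds.
def frechet_bounds(p_u, p_y):
    """Bounds on P(U=1, Y=1 | A=a) given the two margins."""
    return max(0.0, p_u + p_y - 1.0), min(p_u, p_y)

p_u, p_y = 0.7, 0.5  # illustrative identified margins
lo, hi = frechet_bounds(p_u, p_y)

# Induced ignorance region for the outcome model P(Y=1 | A=a, U=1):
lo_out, hi_out = lo / p_u, hi / p_u
print(f"P(U=1, Y=1 | A=a) in [{lo:.2f}, {hi:.2f}]")
print(f"P(Y=1 | A=a, U=1) in [{lo_out:.3f}, {hi_out:.3f}]")
```

Every joint value in that interval is compatible with the same observed data distribution and the same identified factorization, which is precisely the non-degenerate indeterminacy.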
What if we can recover the confounder?
Degenerate Case
U can be reconstructed from A.
Cause Projections
Mean of A across dimensions; orthogonal direction.
(Figure: color = U-value, opacity = probability.)
The confounder can be recovered exactly!
What if we move a red point (U = 0) to the right?
Bridging the Gap with Parametric Assumptions?
Beyond Binary
Non-Degenerate Case
The copula c(U, Y | A) is not identified without further modeling assumptions.
Parametric assumptions about the outcome model are necessary to fix the copula.
Degenerate Case
Left or right?
Parametric assumptions are needed to break the tie.
Implications: Non-overlap Regime
Moving Forward with Sensitivity Analysis
A Template
Where do the assumptions live?
ML lives here.
Parameterize for sensitivity analysis.
Interpretation hinges on this.
Going Forward
In every badly identified model, there lies a great sensitivity analysis.
Many threats to identification are not covered here, e.g., partitioning variation between confounders and mediators, and measurement error models (for some of these concerns, see the Ogburn et al comment).
Thanks!
AISTATS paper: https://arxiv.org/abs/1902.10286.
JASA comment: https://arxiv.org/abs/1910.08042.
Ogburn et al comment: https://arxiv.org/abs/1910.05438.
Contact:
alexdamour@google.com
@alexdamour
More Modest Goals
Apologies, U will now be denoted by Z.
Two Approaches to “Downsizing”
Effects on the treated
Subset of causes
“Consistent” Confounder Recovery (no overlap)
Require
Then the missing outcome distributions are identifiable for values of A that map to the same value of Z. Specifically, for the above estimands
Where’d the factor model go?
The Deconfounder as Detector
Suppose we obtain a function ẑ(A, X) that renders the causes A mutually conditionally independent. Then,
Note: requires overlap in ẑ(A, X) to be useful.
Alternative Settings
Strategy 1: Change the Estimand
Give up on estimating the effect of all causes simultaneously.
Set some causes aside; call these W.
Simpler estimand, harder to interpret.
If P(U | W) is degenerate, identified! (Theorem 7).
Technically, no unobserved confounding.
The factor model provides useful structure.
(This is also the approach in Price et al 2006.)
Strategy 2: Augment the Data
Add a negative control exposure Z.
Miao W, Geng Z, Tchetgen Tchetgen EJ. Identifying causal effects with proxy variables of an unmeasured confounder. Biometrika. 2018.
Under completeness conditions, conditional independence between Z and Y identifies the copula.
Completeness conditions put strong constraints on how Z and W represent U.
An Identification Result
Kong, Yang, and Wang 2019. Multi-cause causal inference with unmeasured confounding and binary outcome. https://arxiv.org/abs/1907.13323.
✔ Identified!
An Identification Result?
✘ Not identified!
Genome <- Geo Ancestry -> Education
Thanks!
AISTATS paper: https://arxiv.org/abs/1902.10286.
Thanks to Yixin and Dave for open conversations.
Contact:
alexdamour@google.com
@alexdamour

Causal Inference Opening Workshop - Latent Variable Models, Causal Inference, and Sensitivity - Alex D'Amour, December 9, 2019