The document discusses understanding autoencoders through interventions on latent variables. It introduces representation learning and the manifold hypothesis. Good representations are described as extensible, compact, able to extrapolate, robust, and self-aware. Disentanglement aims to maximize statistical independence between latent variables, but the true factors may themselves be related. Interventional consistency focuses instead on the causal structure of the learned generative process. Probing a trained VAE with interventions reveals unexpected structure in the learned latent space beyond statistical independence of factors. Response functions and response matrices are used to analyze causal links between latent variables and to disentangle factors in a semantically meaningful way.
2. Representation Learning
• The fundamental claim of representation learning is that our
problem can be better solved using a different (smaller) space
than the input (ambient) space → the Manifold Hypothesis
• So, break down our solution into two pieces:
1. Organize the input into a more useful form → learn a representation
2. Focus on what is left → solve the actual problem(s) (not shown)
• In other words, if the input lives in ℝ^D, we only need ℝ^d, where d ≪ D
3. Understanding the representation
• High-level features of good representations:
❑ Extensible – easily integrate expert knowledge
❑ Compact – efficient time and space complexity
❑ Extrapolate – generalize on a semantic level
❑ Robust – not sensitive to unimportant changes
❑ Self-aware – estimates uncertainties
• Why bother with story-telling when performance is what matters?
• Form connections with past work → educational
• Identify weaknesses and motivate improvements → innovative
Key question: On the quest for good representations, how can we make sense of what we have?
The Mythos of Model Interpretability
by Zachary Lipton (2017)
4. Disentanglement (the obvious)
• With the manifold hypothesis, we assumed there are a small number of underlying factors that give rise to the observation, so why not have the representation simply disentangle those factors?
• Simple inductive bias: maximize statistical independence between latent variables to ensure there’s no overlapping information
• What if the factors are not statistically independent? What about non-trivial variable structure?
Example from Yoshua Bengio: a fork and a knife are not statistically independent, but can nevertheless be manipulated separately.
β-VAE, FactorVAE, DIP-VAE, TC-VAE, β-TC-VAE, etc.
5. Causality: Genuinely predictive models
• Statistical models identify patterns in the dataset, but these correlations may be spurious → non-predictive!
• ICM Principle – although individual factors may not be independent, the true generative process is composed of independent mechanisms (→ interventions in an SCM)
• However, without strong assumptions or supervision, the true causal variables cannot be identified (much less the full mechanisms) → guarantees are unrealistic
Towards Causal Representation Learning by Schölkopf et al. (2021); Locatello et al. (2018; arXiv:1811.12359)
6. Guiding Principle: Interventional Consistency
Identifiability problem: it is impossible to guarantee that the model learns the true causal drivers.
The effect of each individual lever may be different, but if they are equivalent in aggregate, then you can’t distinguish the “true” from the learned generative process.
→ So let’s focus instead on the causal structure of our learned generative process.
7. Setting the Scene
• I give you a trained (beta-)VAE using a deep CNN (500k params). Nothing special.
• Trained on 3D-Shapes – synthetic process with relatively small observations and 6 independent DOFs (no supervision).
• The true factors of variation are independent, and we see disentanglement… but is that the full story?
[Figure: latent traversals – Ours vs. Disentangled]
8. Curiosity #1: Prior doesn’t match the aggregate posterior
• Prior (green) doesn’t match the aggregate posterior (blue)
• Latent variables are not statistically independent
• But maybe that’s the point – you can only trust your regularization objective so much.
Ms. Statistics: “Weird”
9. Curiosity #2: Decoder extends beyond the training manifold
• Decoder can still generate sensible samples beyond the aggregate posterior → good for generative modeling
• Decoder doesn’t just invert the encoder, but is doing more work “for free”.
Manifold Man: “Weird”
[Figure: 2D latent traversal]
VAEs display some interventional consistency out of the box → How can we use that?
10. Hypothesis: Latent Space vs Latent Manifold
Can we separate the semantic information S (→ necessary to reconstruct the sample) from any “exogenous” information U in the latent space, which the decoder ignores anyway?
Encoder: for each observation, find a latent vector that makes the subsequent reconstruction as good as possible (without straying too far from the prior).
Decoder: for each point in the latent space, place it as close to the data manifold as possible, consistent with the encoder (for reconstruction).
11. Latent Responses
• We quantify the semantic change in the sample by measuring the effect of the intervention back in the latent space
• This enables quantifying the relationship between latent variables by observing how interventions “propagate” → how is semantics captured?
12. Probing the Learned Manifold
• Assuming the reconstructions have sufficiently high fidelity, we can treat the encoder and decoder as approximate inverses of one another
• Response function: r(z) = f(g(z)), where g is the decoder and f is the encoder (posterior mean)
• Interventional response: Δ_j(z) = r(do_j(z)) − r(z), where do_j resamples only the latent variable z_j (see the sketch below)
The response function projects the perturbed point back onto the latent manifold*
*similar to memorization in Radhakrishnan et al. (2018; arXiv:1810.10333)
[Figure: mapping between ambient space and latent space – data manifold, generative manifold, latent manifold, and “response” manifold]
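A minimal sketch of these two definitions, assuming hypothetical `encode` (posterior mean) and `decode` callables from the trained VAE; the function names and the choice of resampling from a standard normal prior are illustrative assumptions, not the exact implementation:

```python
import torch

def response(z, encode, decode):
    """Response function r(z) = f(g(z)): decode a latent point to the
    ambient space, then re-encode it (projecting onto the latent manifold)."""
    with torch.no_grad():
        return encode(decode(z))

def interventional_response(z, j, encode, decode, prior=torch.randn):
    """Interventional response Delta_j(z) = r(do_j(z)) - r(z): the effect of
    resampling latent dimension j, measured back in the latent space."""
    z_do = z.clone()
    z_do[:, j] = prior(z.shape[0])   # do-operation: resample only z_j (assumed N(0,1) prior)
    return response(z_do, encode, decode) - response(z, encode, decode)
```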
13. Structure of the Latent Space
Assuming the fidelity of reconstructions is sufficiently high, the response function filters out noise, leaving only the semantic information in the latent code.
14. Latent Response Matrix
• Define do_j as resampling only z_j
• To identify the causal links between the latent variables, we intervene on one latent variable at a time and compute the average resulting effect on all latent variables (sketched below)
• Note that for this model, interventions on many of the latent variables don’t result in any significant effect → non-informative
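A hedged sketch of the response-matrix computation, reusing `interventional_response` from the earlier snippet; averaging over a batch `Z` of posterior samples and a handful of prior resamples is an assumption about the estimator, not a prescription:

```python
import torch

def latent_response_matrix(Z, encode, decode, n_resamples=32):
    """M[j, k] = average |effect| on z_k of intervening on z_j."""
    d = Z.shape[1]
    M = torch.zeros(d, d)
    for j in range(d):
        effects = [interventional_response(Z, j, encode, decode).abs().mean(dim=0)
                   for _ in range(n_resamples)]   # average over resampled interventions
        M[j] = torch.stack(effects).mean(dim=0)   # and over the batch Z
    return M
```

Rows of M with no significant entries correspond to the non-informative latent variables noted above.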
15. Curiosity #3: Unexpected structure emerges
• Despite the true factors being statistically independent, the learned variables are not
• Perhaps the latent variables contain additional structure selected (implicitly) by our inductive biases (e.g. continuity)
Frau Causality: “Cool”
→ What is this unexpected structure in the learned generative process?
16. Causal Disentanglement
• Conventionally, disentanglement is evaluated by quantifying how predictive each latent variable is for each true factor
• But for a generative model, what matters is how well a latent variable controls a desired true factor (see the sketch below)
[Figure: Conditioned Response Matrix (causal) vs. DCI Responsibility Matrix (statistical)]
Eastwood et al. (2018; OpenReview By-7dz-AZ)
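One plausible reading of such a causal evaluation, sketched below: instead of asking how predictive z_k is of a factor (DCI), intervene on z_k and measure how much the factor changes in the generated sample. The `factor_predictor` (a hypothetical classifier from images to true factors) and the exact averaging are illustrative assumptions, not the paper’s exact estimator:

```python
import torch

def conditioned_response_matrix(Z, decode, factor_predictor, prior=torch.randn):
    """C[j, m] = average |change| in predicted factor y_m when intervening
    on latent z_j -- i.e. how well z_j *controls* factor m."""
    d = Z.shape[1]
    with torch.no_grad():
        y_base = factor_predictor(decode(Z))       # factors before intervention
        C = torch.zeros(d, y_base.shape[1])
        for j in range(d):
            z_do = Z.clone()
            z_do[:, j] = prior(Z.shape[0])         # resample only z_j
            y_do = factor_predictor(decode(z_do))  # factors after intervention
            C[j] = (y_do - y_base).abs().mean(dim=0)
    return C
```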
17. Latent Response Maps
• Starting from a 2D projection of the latent space, we can evaluate the latent motion all over the latent space to map out the latent manifold directly
• Think of the response map as a field showing how far the model will move in the latent space to reach the manifold
• We can use the divergence of the response map to get a sense of whether the response is converging or diverging at any point in the latent space (sketched below)
• Lastly, the mean curvature tells us where the response is converging to → the latent manifold
[Figure: example response map]
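A sketch of evaluating the response map on a 2D grid and taking its divergence with finite differences; `response2d` stands in for the 2D-projected response function and is a hypothetical callable (the mean-curvature computation is omitted here):

```python
import numpy as np

def response_map_divergence(response2d, lim=3.0, n=50):
    """Evaluate the latent motion field v(z) = r(z) - z on a grid and
    return its divergence (negative where responses converge)."""
    xs = np.linspace(-lim, lim, n)
    X, Y = np.meshgrid(xs, xs)                    # 'xy' indexing: x varies along axis 1
    Z = np.stack([X.ravel(), Y.ravel()], axis=1)  # (n*n, 2) grid of latent points
    V = response2d(Z) - Z                         # latent motion field
    U = V[:, 0].reshape(n, n)                     # x-component of the motion
    W = V[:, 1].reshape(n, n)                     # y-component of the motion
    h = xs[1] - xs[0]                             # grid spacing for finite differences
    div = np.gradient(U, h, axis=1) + np.gradient(W, h, axis=0)  # dU/dx + dW/dy
    return X, Y, div
```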
18. Double-helix Toy Example
• Given noisy samples from a double helix (3D ambient space), our representation is 2D
19. Traversing the Helix Manifold
• Now that we can explicitly map out the latent manifold, we can directly traverse along the maximum-curvature regions of the latent space to avoid leaving the manifold → semantic interpolations (sketched below)
[Figure: latent space vs. ambient space – interpolating between two (orange) samples; naively we take the Euclidean shortest path (red), but using the response maps we can find a more meaningful path (green)]
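A simplified sketch of such an interpolation, under the assumption that repeatedly applying the response function (from the earlier snippet) approximately projects waypoints onto the latent manifold; the actual procedure traverses maximum-curvature regions, which this proxy only approximates:

```python
import torch

def semantic_interpolation(z_a, z_b, encode, decode, steps=16, n_project=5):
    """Start from the naive Euclidean path between two latent codes, then
    pull every waypoint toward the latent manifold via the response function."""
    ts = torch.linspace(0.0, 1.0, steps).unsqueeze(1)
    path = (1 - ts) * z_a + ts * z_b     # red path: straight line in latent space
    for _ in range(n_project):           # green path: iteratively projected onto the manifold
        path = response(path, encode, decode)
    return path
```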
20. So… what does the manifold look like?
[Figure: divergence, mean curvature, and decoded samples]
Note: the floor color changes when crossing the “decision boundaries” where the latent response has high divergence.
The high-curvature regions (i.e. where the responses converge) resemble 10 categories ordered as a circle → the ground-truth hue!
22. Another Opportunity for some Interpolations
[Figure: shortest path (Euclidean) vs. best path (using response maps)]
23. Conclusions
• Naïve disentanglement fails to capture:
• Non-trivial geometry of the true factors (e.g. periodicity)
• Relationships between true factors (e.g. facial hair vs. sex)
• Latent Responses – in reality, the true factors are out of scope, so let’s use the causal machinery to understand the learned process in its own right.
• Identify causal links between learned variables directly
• Condition on true factors to evaluate causal disentanglement → fairness of the generative model
• Visualize the learned manifold directly (to reveal learned hidden geometry)
Frau Causality: “Yup!”
Manifold Man: “Hmm”
Ms. Statistics: “Hmm”
For links to identifiability: Reizinger et al. (2022; arXiv:2206.02416)
24. Thank you!
[Figure: 2D MNIST]
• Everyone who worked on the project: Stefan Bauer, Michel Besserve, and of course Bernhard Schölkopf
• Thanks to: the EI department and the MPI-IS
• For more details, see our arXiv paper: 2106.16091