Progressive Learning and Disentanglement of
Hierarchical Representations
Hwang Seung Hyun
Yonsei University Severance Hospital CCIDS
Rochester Institute of Technology | ICLR 2020
2020.08.23
Contents
01 Introduction
02 Related Work
03 Methods and Experiments
04 Conclusion
Pro-VLAE
Introduction – Background
• Learning rich representations from data is important for deep generative models such as the VAE.
• The VAE has not been able to fully utilize the depth of neural networks: its encoder can extract high-level representations helpful for discriminative tasks, but generation requires preserving all generative factors at different abstraction levels.
• It is difficult to extract and disentangle all generative factors at once, especially across different abstraction levels.
[Variational Auto Encoder]
[Latent space Disentanglement]
Pro-VLAE
Introduction – Proposal
• Use a progressive learning strategy to improve learning and disentanglement of hierarchical representations.
- Hierarchical representations are latent representations from different depths of the inference (encoder) and generation (decoder) models, as in VLAE [1].
- Progressively grow the capacity of the network.
→ The resulting model is named pro-VLAE.
[1] Shengjia Zhao, Jiaming Song, and Stefano Ermon. Learning hierarchical features from deep generative models. In International Conference on Machine Learning, pp. 4091–4099, 2017.
Pro-VLAE
Introduction – Contribution
• Showed improved disentanglement on several datasets using three disentanglement metrics.
• Presented both qualitative and quantitative evidence that pro-VLAE is able to first learn the most abstract representations and then progressively disentangle existing factors.
• Showed superior disentanglement performance compared to β-VAE, VLAE, and the teacher-student strategy.
Related Work
Variational Autoencoder
[Encoder]
[Decoder]
* Images imported from https://itrepo.tistory.com/48
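As a refresher, the standard VAE is trained by maximizing the evidence lower bound (ELBO) over encoder parameters φ and decoder parameters θ:

$$\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] \;-\; D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)$$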
Related Work
Variants of VAE

[Variational Ladder AutoEncoder]
VLAE [1]
• Learns independent hierarchical representations from different levels of abstraction.

[Beta-VAE]
β-VAE [2]
• When β > 1, a stronger constraint is applied to the latent bottleneck, limiting the representation capacity of z.
• A higher β encourages more efficient latent encoding and disentanglement, in exchange for reconstruction quality (see the objective below).

[1] Shengjia Zhao, Jiaming Song, and Stefano Ermon. Learning hierarchical features from deep generative models. In International Conference on Machine Learning, pp. 4091–4099, 2017.
[2] Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. beta-VAE: Learning basic visual concepts with a constrained variational framework. In International Conference on Learning Representations, 2017.
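The β-VAE objective from [2] simply re-weights the KL term of the standard ELBO (β = 1 recovers the plain VAE):

$$\mathcal{L}_{\beta\text{-VAE}} = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] - \beta\, D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)$$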
Related Work
Progressive Learning
• A small number of works have demonstrated the ability to grow the architecture of GANs from a shallow network with limited capacity, generating low-resolution images, to a deep network capable of generating high-resolution or super-resolved images [3], [4].

[3] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. In International Conference on Learning Representations, 2018.
[4] Yifan Wang, Federico Perazzi, Brian McWilliams, Alexander Sorkine-Hornung, Olga Sorkine-Hornung, and Christopher Schroers. A fully progressive approach to single-image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 864–873, 2018.
Related Work
Teacher-student strategy
• A teacher-student strategy was used to progressively grow the dimension of the latent representation in a VAE [5].
• After a teacher is trained to correctly disentangle attributes, a student model is trained to improve the visual quality of the reconstruction.

[5] Jose Lezama. Overcoming the disentanglement vs. reconstruction trade-off via Jacobian supervision. In International Conference on Learning Representations, 2019.
Methods and Experiments
Model: VAE with hierarchical representations
• Decompose z into {z1, z2, …, zL}, with each zl drawn from a different abstraction level.
• The zl are independent, and each represents generative factors not captured at other levels.
• Define an inference model to approximate the posterior.
• The final goal of the model is to maximize a modified evidence lower bound (ELBO), as sketched below.
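A sketch of the modified ELBO, assuming (as in β-VAE) a weight β on the KL terms and an independent prior p(zl) at each level; the paper's exact form may add per-level details:

$$\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z_1,\dots,z_L \mid x)}\!\left[\log p_\theta(x \mid z_1,\dots,z_L)\right] - \beta \sum_{l=1}^{L} D_{\mathrm{KL}}\!\left(q_\phi(z_l \mid x)\,\|\,p(z_l)\right)$$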
Methods and Experiments
Progressive learning of hierarchical representation
• Learn the latent variables progressively from the highest abstraction level to the lowest (l = L down to l = 1).
• At each progressive step, move down one abstraction level and grow the inference model by introducing a new latent code.
• Simultaneously, grow the decoder.
Methods and Experiments
Implementation Strategies
(1) Fade-in
- A fade-in coefficient α is used when growing new components into the encoder and decoder.
(2) Regularization term on z
- A KL penalty is applied to all latent variables that have not yet been used in generation at progressive step s, as sketched below.
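A minimal PyTorch-style sketch of both strategies; the linear α schedule, the penalty weight γ (`gamma`), and the level ordering are illustrative assumptions rather than the paper's exact implementation:

```python
import torch

def fade_in(old_path: torch.Tensor, new_path: torch.Tensor, alpha: float) -> torch.Tensor:
    """Blend a newly grown branch into the existing network output.

    alpha ramps from 0 to 1 as training proceeds, so the new
    encoder/decoder components are introduced smoothly.
    """
    return (1.0 - alpha) * old_path + alpha * new_path

def kl_gaussian(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * torch.sum(1.0 + logvar - mu.pow(2) - logvar.exp(), dim=1)

def latent_loss(mus, logvars, step_s, beta=1.0, gamma=0.5):
    """KL for levels already grown at progressive step s, plus a penalty
    (weight gamma, a hypothetical choice) on the not-yet-used levels so
    they stay close to the prior until they are introduced.

    Levels are ordered from lowest (index 0) to highest (index L-1);
    at step s, the top s+1 levels are active.
    """
    L = len(mus)
    loss = 0.0
    for l in range(L):
        active = l >= L - (step_s + 1)
        weight = beta if active else gamma
        loss = loss + weight * kl_gaussian(mus[l], logvars[l]).mean()
    return loss
```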
Methods and Experiments
Code
[1] Inference [2] Sampling
[3] Generation
[4] Latent Loss
[5] Reconstruction Loss
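The original code screenshots are not reproduced here; the following hypothetical PyTorch sketch shows the five pieces for a two-level model. Layer sizes are assumptions, and unlike the real pro-VLAE, which injects each zl at its own decoder depth, this sketch simply concatenates the codes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLevelVAE(nn.Module):
    """Hypothetical two-level hierarchical VAE for 1x64x64 inputs."""
    def __init__(self, z_dim=4):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, 32, 4, 2, 1), nn.ReLU())   # 64x64 -> 32x32
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU())  # 32x32 -> 16x16
        self.head1 = nn.Linear(32 * 32 * 32, 2 * z_dim)  # low-level (mu, logvar)
        self.head2 = nn.Linear(64 * 16 * 16, 2 * z_dim)  # high-level (mu, logvar)
        self.dec = nn.Linear(2 * z_dim, 64 * 64)

    def encode(self, x):                        # [1] Inference
        h1 = self.enc1(x)
        h2 = self.enc2(h1)
        mu1, lv1 = self.head1(h1.flatten(1)).chunk(2, dim=1)
        mu2, lv2 = self.head2(h2.flatten(1)).chunk(2, dim=1)
        return (mu1, lv1), (mu2, lv2)

    @staticmethod
    def sample(mu, logvar):                     # [2] Sampling (reparameterization trick)
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def decode(self, z1, z2):                   # [3] Generation
        return torch.sigmoid(self.dec(torch.cat([z1, z2], 1))).view(-1, 1, 64, 64)

def kl_loss(mu, logvar):                        # [4] Latent loss: KL to the N(0, I) prior
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()

def recon_loss(x_hat, x):                       # [5] Reconstruction loss: Bernoulli NLL (BCE)
    return F.binary_cross_entropy(x_hat, x, reduction='sum') / x.size(0)
```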
Methods and Experiments
Code (continued)
[6] Training [7] Optimize
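And a matching toy training loop, reusing the sketch above; the random `loader` is a stand-in for a real dataset:

```python
# Toy data standing in for a real loader of binary 64x64 images.
loader = [torch.rand(16, 1, 64, 64).round() for _ in range(10)]
beta = 1.0  # KL weight, as in beta-VAE

model = TwoLevelVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

for x in loader:                                             # [6] Training
    (mu1, lv1), (mu2, lv2) = model.encode(x)                 # [1] inference
    z1, z2 = model.sample(mu1, lv1), model.sample(mu2, lv2)  # [2] sampling
    x_hat = model.decode(z1, z2)                             # [3] generation
    loss = recon_loss(x_hat, x) + beta * (kl_loss(mu1, lv1) + kl_loss(mu2, lv2))
    opt.zero_grad()
    loss.backward()
    opt.step()                                               # [7] optimize
```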
Methods and Experiments
Disentanglement Metric
(1) MIG (Mutual Information Gap) [6]
• A single generative factor can have high mutual information with multiple latent variables.
• MIG measures, for each factor, the difference between the top two latent variables with the highest mutual information.

(2) MIG-sup (supplement to MIG)
• If different generative factors are entangled into the same latent dimension, the MIG score does not detect it.
• MIG-sup is proposed to recognize the entanglement of multiple generative factors in the same latent dimension.

[6] Tian Qi Chen, Xuechen Li, Roger Grosse, and David Duvenaud. Isolating sources of disentanglement in variational autoencoders. In Advances in Neural Information Processing Systems, 2018.
[Diagram: mutual information between generative factors (A, B) and latent variables (A, B), over generative factors k and latent dimensions j]
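Schematically, with I(·;·) denoting empirical mutual information, v_k the K generative factors, z_j the J latent dimensions, and H(v_k) the factor entropy used for normalization (a sketch; the paper's exact normalization may differ):

$$\mathrm{MIG} = \frac{1}{K}\sum_{k=1}^{K}\frac{1}{H(v_k)}\Big(I(z_{j^{(k)}};\,v_k) - \max_{j \neq j^{(k)}} I(z_j;\,v_k)\Big), \qquad j^{(k)} = \arg\max_j I(z_j;\,v_k)$$

$$\mathrm{MIG\text{-}sup} = \frac{1}{J}\sum_{j=1}^{J}\Big(I(z_j;\,v_{k^{(j)}}) - \max_{k \neq k^{(j)}} I(z_j;\,v_k)\Big), \qquad k^{(j)} = \arg\max_k I(z_j;\,v_k)$$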
Methods and Experiments
Experiments
[Experimental result figures]
Conclusion
• Presented a progressive strategy for learning and disentangling hierarchical representations.
• The model learns independent representations from high to low levels of abstraction by progressively growing the capacity of the VAE from deep to shallow.
• Future work includes adding stronger guidance for allocating information across the hierarchy of abstraction levels.
