Diffusion Models Beat GANs
on Image Synthesis
OpenAI
Presented by: Beeren Sahu
Likelihood-based models
Generative models whose objective function is the log-likelihood.
Examples: autoregressive models, VAEs, diffusion models, etc.
GANs are evaluated with sample-quality metrics such as FID, Inception Score, and Precision,
but some of these metrics do not fully capture diversity.
Negative log-likelihood, by contrast, directly rewards covering the data distribution.
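Concretely, NLL = -E_{x \sim p_{\text{data}}}[\log p_\theta(x)]: a model that drops modes assigns near-zero probability to held-out samples from those modes, so mode dropping is penalized directly, whereas FID and Inception Score can remain strong regardless.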
GAN vs Likelihood-based models
● GANs capture less diversity than state-of-the-art likelihood-based models
○ Ali Razavi, Aaron van den Oord, and Oriol Vinyals. Generating diverse high-fidelity images
with VQ-VAE-2. arXiv:1906.00446, 2019.
○ Alex Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models.
arXiv:2102.09672, 2021.
○ Charlie Nash, Jacob Menick, Sander Dieleman, and Peter W. Battaglia. Generating images
with sparse representations. arXiv:2103.03841, 2021.
● GANs are often difficult to train, collapsing without carefully selected
hyperparameters and regularizers
● This makes them difficult to scale and apply to new domains.
GAN vs Likelihood-based models
● Pros of likelihood-based models:
○ they capture more diversity
○ they are typically easier to scale and train than GANs
● Cons:
○ they still fall short of GANs in visual sample quality
○ except for VAEs, sampling from them is slower than from GANs
Diffusion Model
Diffusion models are a class of likelihood-based models which have recently been
shown to produce high-quality images while offering desirable properties such as
distribution coverage, a stationary training objective, and easy scalability.
This paper:
● achieves an FID of 2.97 on ImageNet 128×128 (May 2021)
● beats the previous record holder, BigGAN-deep (FID 5.7, 2018)
“We hypothesize that the gap between diffusion models and GANs stems from at
least two factors: first, that the model architectures used by recent GAN literature
have been heavily explored and refined; second, that GANs are able to trade off
diversity for quality, producing high quality samples but not covering the whole
distribution.”
Papers with Code Leaderboard
Denoising Diffusion Probabilistic Models
Probabilistic model: learns a probability distribution
Diffusion: gradually adding noise to the input image (forward process)
Denoising: removing the noise to synthesize a realistic image (backward process)
Intuition
Forward: q(x_t | x_{t-1})
Backward: p_θ(x_{t-1} | x_t)
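As a minimal sketch of the forward process in code (PyTorch, assuming the linear β schedule of Ho et al. 2020; all names here are illustrative, not from the talk):

import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # noise schedule beta_t
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)    # alpha_bar_t = prod of alpha_s

def q_sample(x0, t, noise=None):
    # Closed-form forward jump: x_t = sqrt(ab_t) * x0 + sqrt(1 - ab_t) * eps
    if noise is None:
        noise = torch.randn_like(x0)
    ab = alpha_bars[t].view(-1, 1, 1, 1)     # broadcast over (B, C, H, W)
    return ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise

# Usage: noise a batch of images at random timesteps.
x0 = torch.randn(8, 3, 128, 128)             # stand-in for real images
t = torch.randint(0, T, (8,))
xt = q_sample(x0, t)

Note the one-step jump: because Gaussians compose, x_t can be sampled directly from x_0 rather than by t sequential noising steps.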
Evolution of Diffusion Models
2015
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli
Stanford and UC Berkeley

2020
Denoising Diffusion Probabilistic Models
Jonathan Ho, Ajay Jain, Pieter Abbeel
UC Berkeley

Feb 2021
Improved Denoising Diffusion Probabilistic Models
Alex Nichol, Prafulla Dhariwal
OpenAI

May 2021
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal, Alex Nichol
OpenAI
Forward Process
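A sketch of the standard DDPM forward process (Ho et al., 2020), which this slide covers:

q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\big), \qquad q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})

With \alpha_t = 1-\beta_t and \bar\alpha_t = \prod_{s=1}^{t} \alpha_s, any x_t can be sampled directly from x_0:

q(x_t \mid x_0) = \mathcal{N}\big(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t)\,I\big)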
Backward Process
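The learned backward (reverse) process, in the standard DDPM parameterization:

p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\big), \qquad p(x_T) = \mathcal{N}(x_T;\ 0,\ I)

Sampling starts from pure Gaussian noise x_T and applies p_θ for T steps.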
Loss
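The training objective is the variational bound on negative log-likelihood (standard DDPM form):

L_{\text{vlb}} = \mathbb{E}_q\Big[ D_{\text{KL}}\big(q(x_T \mid x_0)\,\|\,p(x_T)\big) + \sum_{t>1} D_{\text{KL}}\big(q(x_{t-1} \mid x_t, x_0)\,\|\,p_\theta(x_{t-1} \mid x_t)\big) - \log p_\theta(x_0 \mid x_1)\Big]

Each KL term compares two Gaussians in closed form, so training never needs to sample from p_θ.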
Loss (simplified)
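Ho et al. reparameterize μ_θ in terms of a noise-prediction network ε_θ, which reduces training to a simple denoising regression:

L_{\text{simple}} = \mathbb{E}_{t,\,x_0,\,\epsilon}\big[\,\|\epsilon - \epsilon_\theta(\sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,\ t)\|^2\,\big]

As a minimal sketch (reusing q_sample from the earlier snippet; `model` stands in for any ε-prediction network such as the guided-diffusion UNet):

def loss_simple(model, x0):
    # Draw a random timestep and noise, then regress the noise from x_t.
    t = torch.randint(0, T, (x0.shape[0],))
    eps = torch.randn_like(x0)
    xt = q_sample(x0, t, noise=eps)
    return ((eps - model(xt, t)) ** 2).mean()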
Summary
Resources
Repo: https://github.com/openai/guided-diffusion
Video: https://www.youtube.com/watch?v=W-O7AZNzbzQ
VAE: https://towardsdatascience.com/understanding-variational-autoencoders-vaes-f70510919f73
