Introduction to Diffusion Models
2022.01.03.
KAIST ALIN-LAB
Sangwoo Mo
1
• Diffusion model is SOTA on image generation
• Beat BigGAN and StyleGAN on high-resolution images
Diffusion Model Boom!
2
Dhariwal & Nichol. Diffusion Models Beat GANs on Image Synthesis. NeurIPS’21
• Diffusion model is SOTA on density estimation
• Beat autoregressive models on log-likelihood
Diffusion Model Boom!
3
Song et al. Maximum Likelihood Training of Score-Based Diffusion Models. NeurIPS’21
Kingma et al. Variational Diffusion Models. NeurIPS’21
• Diffusion model is useful for image editing
• Editing = Rough scribble + diffusion (i.e., naturalization)
• Scribbled images are out-of-distribution for GANs, but diffusion models can still denoise them
Diffusion Model Boom!
4
Meng et al. SDEdit: Image Synthesis and Editing with Stochastic Differential Equations. arXiv’21
• Diffusion model is useful for image editing
• It can also be combined with vision-and-language models
Diffusion Model Boom!
5
Nichol et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models. arXiv’21
• Diffusion model is also effective for non-visual domains
• Continuous domains like speech, and even discrete domains like text
Diffusion Model Boom!
6
Kong et al. DiffWave: A Versatile Diffusion Model for Audio Synthesis. ICLR’21
Austin et al. Structured Denoising Diffusion Models in Discrete State-Spaces. NeurIPS’21
• Trilemma of generative models: Quality vs. Diversity vs. Speed
• Diffusion model produces diverse and high-quality samples, but generation is slow
Diffusion Model is All We Need?
7
Xiao et al. Tackling the Generative Learning Trilemma with Denoising Diffusion GANs. arXiv’21
• Today’s content
• Diffusion Probabilistic Model – ICML’15
• Denoising Diffusion Probabilistic Model (DDPM) – NeurIPS’20
• Improve quality & diversity of diffusion model
• Denoising Diffusion Implicit Model (DDIM) – ICLR’21
• Improve generation speed of diffusion model
• Not covering
• Relation between diffusion models and score matching
• Extension to stochastic differential equations
→ See Score SDE (ICLR’21)
• There are lots of interesting new works (see NeurIPS’21, ICLR’22)
Outline
8
Score SDE: Song et al. Score-Based Generative Modeling through Stochastic Differential Equations. ICLR’21
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
→ The sample x_0 converges to complete noise x_T (e.g., x_T ∼ 𝒩(0, I)) (see the sketch below)
Diffusion Probabilistic Model
9
Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML’15
Forward (diffusion) process
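To make the forward process concrete, here is a minimal NumPy sketch of iterative noising under a linear β schedule; the schedule values (1e-4 to 0.02, T = 1000) follow common DDPM practice, and all variable names are illustrative rather than taken from any paper's code.

```python
# Minimal sketch of the forward (diffusion) process with a linear beta schedule.
import numpy as np

T = 1000                                   # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)         # noise scales beta_1 .. beta_T

def forward_step(x_prev, beta, rng):
    """One step of q(x_t | x_{t-1}) = N(sqrt(1 - beta) * x_{t-1}, beta * I)."""
    return np.sqrt(1.0 - beta) * x_prev + np.sqrt(beta) * rng.standard_normal(x_prev.shape)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 32, 32))       # stand-in for an image x_0
for beta in betas:                         # after T steps, x is (almost) N(0, I)
    x = forward_step(x, beta, rng)
```

After enough steps, x is statistically indistinguishable from 𝒩(0, I), which is exactly why sampling can start from pure noise.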
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
→ The sample x_0 converges to complete noise x_T (e.g., x_T ∼ 𝒩(0, I))
• Reverse step: Recover the original sample from the noise
→ Note that it is the “generation” procedure
Diffusion Probabilistic Model
10
Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML’15
Reverse process
Forward (diffusion) process
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
→ Technically, it is a product of conditional noise distributions q(x_t | x_{t−1}) = 𝒩(x_t; √(1−β_t) x_{t−1}, β_t I)
• Usually, the parameters β_t are fixed (they can be learned jointly, but it is not beneficial)
• Noise annealing (i.e., gradually reducing the noise scale along the generation trajectory) is crucial to performance
Diffusion Probabilistic Model
11
Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML’15
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
→ Technically, it is a product of conditional noise distributions q(x_t | x_{t−1})
• Reverse step: Recover the original sample from the noise
→ It is also a product of conditional denoising distributions p_θ(x_{t−1} | x_t) = 𝒩(x_{t−1}; μ_θ(x_t, t), Σ_θ(x_t, t))
• Use the learned parameters: the denoiser μ_θ (main part) and the randomness Σ_θ
Diffusion Probabilistic Model
12
Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML’15
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
• Reverse step: Recover the original sample from the noise
• Training: Minimize the variational bound on the negative log-likelihood −log p_θ(x_0)
Diffusion Probabilistic Model
13
Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML’15
• Diffusion model aims to learn the reverse of noise generation procedure
• Forward step: (Iteratively) Add noise to the original sample
• Reverse step: Recover the original sample from the noise
• Training: Minimize the variational bound on −log p_θ(x_0)
→ It decomposes into step-wise losses (one for each step t):
L = E_q[ D_KL(q(x_T|x_0) ‖ p(x_T)) + Σ_{t>1} D_KL(q(x_{t−1}|x_t, x_0) ‖ p_θ(x_{t−1}|x_t)) − log p_θ(x_0|x_1) ]
Diffusion Probabilistic Model
14
Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML’15
• Diffusion model aims to learn the reverse of noise generation procedure
• Training: Minimize the variational bound on −log p_θ(x_0)
→ It decomposes into step-wise losses (one for each step t)
• Here, the true reverse step q(x_{t−1}|x_t, x_0) can be computed in closed form from the β_t (see the sketch below)
• Note that we only define the true forward step q(x_t|x_{t−1})
• Since all distributions above are Gaussian, the KL divergences are tractable
Diffusion Probabilistic Model
15
Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML’15
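As a concrete illustration of the closed-form posterior, here is a small NumPy sketch computing the mean and variance of q(x_{t−1}|x_t, x_0) from the β_t (the standard DDPM formulas); indexing conventions and names are illustrative assumptions, not the papers' code.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)     # beta_1 .. beta_T (0-indexed arrays)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)        # bar{alpha}_t = prod_{s<=t} alpha_s

def posterior_params(x_t, x_0, t):
    """Mean and variance of q(x_{t-1} | x_t, x_0), for array index t >= 1."""
    ab_t, ab_prev = alpha_bars[t], alpha_bars[t - 1]
    mean = (np.sqrt(ab_prev) * betas[t] / (1 - ab_t)) * x_0 \
         + (np.sqrt(alphas[t]) * (1 - ab_prev) / (1 - ab_t)) * x_t
    var = (1 - ab_prev) / (1 - ab_t) * betas[t]    # tilde{beta}_t
    return mean, var
```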
• Diffusion model aims to learn the reverse of noise generation procedure
• Network: Use an image-to-image translation architecture (e.g., U-Net)
• Recall that the input is x_t and the output is x_{t−1}; both are images
• It is expensive since both input and output are high-dimensional
• Note that the denoiser μ_θ(x_t, t) shares weights across all steps but is conditioned on the step t (see the embedding sketch below)
Diffusion Probabilistic Model
16
* Image from the pix2pix-HD paper
Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML’15
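One common way to condition the shared denoiser on the step t, used by DDPM's U-Net, is a Transformer-style sinusoidal embedding fed through a small MLP into each residual block. A minimal sketch of the embedding itself (the dimension and names are illustrative):

```python
import numpy as np

def timestep_embedding(t, dim=128):
    """Sinusoidal embedding of the step index t (Transformer-style)."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)  # geometric frequencies
    return np.concatenate([np.sin(t * freqs), np.cos(t * freqs)])

emb = timestep_embedding(10)   # a 128-dim vector encoding "step 10"
```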
• Diffusion model aims to learn the reverse of noise generation procedure
• Sampling: Draw random noise x_T, then iteratively apply the reverse step p_θ(x_{t−1}|x_t) (see the sketch below)
• It often requires hundreds of reverse steps (very slow)
• Early and late steps change high- and low-level attributes, respectively
Diffusion Probabilistic Model
17
* Image from the DDPM paper
Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. ICML’15
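The sampling loop itself is short. A minimal sketch, assuming a trained denoiser mu_theta (a hypothetical stand-in for the network) and per-step noise scales sigmas (often fixed to √β_t):

```python
import numpy as np

def ddpm_sample(mu_theta, sigmas, shape, rng):
    """Ancestral sampling: draw x_T ~ N(0, I), then apply p_theta(x_{t-1} | x_t)."""
    T = len(sigmas)
    x = rng.standard_normal(shape)                 # x_T
    for t in reversed(range(T)):                   # t = T-1, ..., 0 (array index)
        z = rng.standard_normal(shape) if t > 0 else 0.0
        x = mu_theta(x, t) + sigmas[t] * z         # predicted mean + injected noise
    return x                                       # approximate x_0
```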
• DDPM reparametrizes the reverse distributions of diffusion models
• Key idea: The original reverse step predicts the denoised mean μ_θ(x_t, t) from x_t from scratch
• However, x_{t−1} and x_t share most of their information, so this is redundant
→ Instead, predict the residual noise ε_θ(x_t, t) and combine it with the original x_t
Denoising Diffusion Probabilistic Model (DDPM)
18
Ho et al. Denoising Diffusion Probabilistic Models. NeurIPS'20
• DDPM reparametrizes the reverse distributions of diffusion models
• Key idea: The original reverse step predicts the denoised mean μ_θ(x_t, t) from x_t from scratch
• However, x_{t−1} and x_t share most of their information, so this is redundant
→ Instead, predict the residual noise ε_θ(x_t, t) and combine it with the original x_t
• Formally, DDPM reparametrizes the learned reverse distribution as¹
p_θ(x_{t−1}|x_t) = 𝒩(x_{t−1}; μ_θ(x_t, t), σ_t² I), with μ_θ(x_t, t) = (1/√α_t)(x_t − (β_t/√(1−ᾱ_t)) ε_θ(x_t, t))
and the step-wise objective L_{t−1} can be reformulated as² (see the training sketch below)
L_{t−1} ∝ E_{x_0, ε}[ ‖ε − ε_θ(√ᾱ_t x_0 + √(1−ᾱ_t) ε, t)‖² ]
Denoising Diffusion Probabilistic Model (DDPM)
19
1. α_t = 1 − β_t and ᾱ_t = ∏_{s≤t} α_s are constants determined by the β_t
2. Note that we need no “intermediate” samples; we only compare the forward noise 𝝐 and the predicted noise 𝝐_θ, where x_t is computed directly from x_0
Ho et al. Denoising Diffusion Probabilistic Models. NeurIPS'20
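The reformulated objective leads to a very simple training step: sample a step t and noise ε, form x_t in closed form from x_0, and regress the noise. A minimal sketch, where eps_theta is a hypothetical stand-in for the trained noise-prediction network:

```python
import numpy as np

def ddpm_loss(eps_theta, x0, alpha_bars, rng):
    """Simplified DDPM objective: predict the forward noise added to x_0."""
    t = rng.integers(len(alpha_bars))                  # uniform random step
    eps = rng.standard_normal(x0.shape)                # forward noise
    ab = alpha_bars[t]
    x_t = np.sqrt(ab) * x0 + np.sqrt(1 - ab) * eps     # closed-form q(x_t | x_0)
    return np.mean((eps - eps_theta(x_t, t)) ** 2)     # ||eps - eps_theta||^2
```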
• DDPM initiated the diffusion model boom
• Achieved SOTA on CIFAR-10, with high-resolution scalability
• It produces more diverse samples than GAN (no mode collapse)
Denoising Diffusion Probabilistic Model (DDPM)
20
Ho et al. Denoising Diffusion Probabilistic Models. NeurIPS'20
• DDIM roughly sketches the final sample, then refines it with the reverse process
• Motivation:
• Diffusion model is slow due to the iterative procedure
• GAN/VAE creates the sample in a single forward pass
• ⇒ Can we combine their advantages to speed up diffusion sampling?
• Technical spoiler:
• Instead of naïvely stacking a diffusion model on top of a GAN/VAE,
DDIM proposes a principled approach of rough sketch + refinement
Denoising Diffusion Implicit Model (DDIM)
21
Song et al. Denoising Diffusion Implicit Models. ICLR’21
• DDIM roughly sketches the final sample, then refines it with the reverse process
• Key idea:
• Given x_t, generate the rough sketch x_0 and refine with p_θ(x_{t−1}|x_t, x_0)¹
• Unlike the original diffusion model, the structure is non-Markovian
Denoising Diffusion Implicit Model (DDIM)
22
1. Recall that the original diffusion model uses p_θ(x_{t−1}|x_t)
Song et al. Denoising Diffusion Implicit Models. ICLR’21
• DDIM roughly sketches the final sample, then refines it with the reverse process
• Key idea: Given x_t, generate the rough sketch x_0 and refine with q(x_{t−1}|x_t, x_0)
• Formulation: Define the inference distribution q_σ(x_{t−1}|x_t, x_0) as
q_σ(x_{t−1}|x_t, x_0) = 𝒩(√ᾱ_{t−1} x_0 + √(1−ᾱ_{t−1}−σ_t²) · (x_t − √ᾱ_t x_0)/√(1−ᾱ_t), σ_t² I)
then the forward process q_σ(x_t|x_{t−1}, x_0) is derived from Bayes’ rule
Denoising Diffusion Implicit Model (DDIM)
23
Song et al. Denoising Diffusion Implicit Models. ICLR’21
• DDIM roughly sketches the final sample, then refines it with the reverse process
• Key idea: Given x_t, generate the rough sketch x_0 and refine with q(x_{t−1}|x_t, x_0)
• Formulation: The forward process follows from Bayes’ rule,
q_σ(x_t|x_{t−1}, x_0) = q_σ(x_{t−1}|x_t, x_0) q(x_t|x_0) / q(x_{t−1}|x_0)
and the reverse process first predicts the rough sketch x̂_0 = (x_t − √(1−ᾱ_t) ε_θ(x_t, t))/√ᾱ_t, then samples
x_{t−1} = √ᾱ_{t−1} x̂_0 + √(1−ᾱ_{t−1}−σ_t²) ε_θ(x_t, t) + σ_t ε, with ε ∼ 𝒩(0, I) (see the sampling sketch below)
Denoising Diffusion Implicit Model (DDIM)
24
Song et al. Denoising Diffusion Implicit Models. ICLR’21
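Putting the pieces together, here is a minimal NumPy sketch of DDIM sampling over a subsequence of steps; eps_theta, step_ids, and the other names are illustrative assumptions, not the paper's code. With η = 0 the sampler is deterministic, and using, say, 10–50 strided steps instead of all T is what yields the speedup discussed below:

```python
import numpy as np

def ddim_sample(eps_theta, alpha_bars, step_ids, shape, rng, eta=0.0):
    """DDIM sampling over a subsequence of steps; eta=0 is deterministic,
    eta=1 recovers DDPM-like stochastic sampling."""
    x = rng.standard_normal(shape)                             # x_T
    ids = list(step_ids)[::-1]                                 # e.g. [999, 888, ..., 0]
    for t, t_prev in zip(ids[:-1], ids[1:]):
        ab, ab_prev = alpha_bars[t], alpha_bars[t_prev]
        eps = eps_theta(x, t)
        x0_hat = (x - np.sqrt(1 - ab) * eps) / np.sqrt(ab)     # "rough sketch" of x_0
        sigma = eta * np.sqrt((1 - ab_prev) / (1 - ab)) * np.sqrt(1 - ab / ab_prev)
        x = np.sqrt(ab_prev) * x0_hat \
            + np.sqrt(1 - ab_prev - sigma**2) * eps \
            + sigma * rng.standard_normal(shape)
    return x                                                   # approximate x_0

# e.g. 10 strided steps instead of 1000:
# x = ddim_sample(eps_theta, alpha_bars, np.linspace(0, 999, 10, dtype=int), (3, 32, 32), rng)
```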
• DDIM roughly sketches the final sample, then refines it with the reverse process
• Key idea: Given x_t, generate the rough sketch x_0 and refine with q(x_{t−1}|x_t, x_0)
• Formulation: Forward and reverse processes as defined above
• Training: The variational lower bound of DDIM is identical to that of DDPM¹
• It is surprising since the forward/reverse formulation is totally different
Denoising Diffusion Implicit Model (DDIM)
25
1. Precisely, the bounds differ, but the optimal solution is identical under some assumptions (though they are violated in practice)
Song et al. Denoising Diffusion Implicit Models. ICLR’21
• DDIM significantly reduces the number of sampling steps of diffusion models
• It creates the outline of the sample after only 10 steps (DDPM needs hundreds)
Denoising Diffusion Implicit Model (DDIM)
26
Song et al. Denoising Diffusion Implicit Models. ICLR’21
• New golden era of generative models
• Competition among various approaches: GAN, VAE, flow, and diffusion model¹
• Also, lots of hybrid approaches (e.g., score SDE = diffusion + continuous flow)
• Which model to use?
• Diffusion model seems to be a nice option for high-quality generation
• However, GAN is (currently) still a more practical solution when fast sampling is needed (e.g., real-time applications)
Take-home Message
27
1. VAE also shows promising generation performance (see NVAE, very deep VAE)
28
Thank you for listening! 😀
