Denoising Diffusion Probabilistic Models
Background
• Full latent variable model.
• We want to model a probability distribution pθ(x0) over data points x0.
• pθ(x0) := ∫ pθ(x0:T) dx1:T, where x1, ..., xT are latents of the same dimensionality as the data x0 ~ q(x0).
• pθ(x0:T) is the reverse process, defined as a Markov chain with learned Gaussian transitions starting at p(xT) = N(xT; 0, I).
• The forward process gradually turns the data into pure Gaussian noise with mean 0 and identity covariance (equal noise in all directions); a sketch of this forward step follows below.
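As a concrete illustration, here is a minimal NumPy sketch of repeatedly applying the forward transition q(xt | xt−1) = N(xt; sqrt(1 − βt) xt−1, βt I) until the data becomes (approximately) standard Gaussian noise. The linear schedule values are the ones used in the DDPM paper; the toy data vector is an assumption for demonstration only.

```python
import numpy as np

# Linear variance schedule (values from the DDPM paper:
# beta_1 = 1e-4 to beta_T = 0.02 over T = 1000 steps).
T = 1000
betas = np.linspace(1e-4, 0.02, T)

def forward_step(x_prev, beta, rng):
    """One forward-diffusion step:
    q(x_t | x_{t-1}) = N(x_t; sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta) * x_prev + np.sqrt(beta) * noise

rng = np.random.default_rng(0)
x = rng.standard_normal(8)   # toy stand-in for a data vector x_0
for beta in betas:           # after T steps, x is approximately N(0, I)
    x = forward_step(x, beta, rng)
```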
This equation is the training objective for diffusion models, which is the variational bound (ELBO: Evidence Lower Bound) on the negative log-likelihood of the data:
E[−log pθ(x0)] ≤ Eq[−log (pθ(x0:T) / q(x1:T | x0))] =: L
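For intuition, here is a minimal NumPy sketch of one Monte Carlo estimate of the simplified form of this bound used in the DDPM paper, L_simple = E ||eps − eps_theta(x_t, t)||^2, which relies on the closed form q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 − abar_t) I). The eps_model function is a hypothetical placeholder for the trained noise-prediction network, and the schedule values are assumptions taken from the paper.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)   # abar_t = prod_s (1 - beta_s)

def eps_model(x_t, t):
    """Placeholder for the learned noise predictor eps_theta(x_t, t)."""
    return np.zeros_like(x_t)

def simple_loss(x0, rng):
    """One Monte Carlo sample of L_simple = E ||eps - eps_theta(x_t, t)||^2,
    with x_t drawn from q(x_t | x_0) = N(sqrt(abar_t) x0, (1 - abar_t) I)."""
    t = rng.integers(T)
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps
    return np.mean((eps - eps_model(x_t, t)) ** 2)

rng = np.random.default_rng(0)
print(simple_loss(rng.standard_normal(8), rng))
```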
Image Data Scaling
• Original image data: pixels are represented as integer values in the range {0, 1, ..., 255}.
• Scaled data representation: these pixel values are linearly scaled to [-1,1] using x = pixel / 127.5 − 1, so 0 maps to −1 and 255 maps to 1.
• Why scale to [-1,1]? It keeps the input distribution to the neural network consistent across images.
• Helps training stability and convergence.
• Standardizes the input for the reverse process (see the scaling sketch below).
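A minimal sketch of this scaling and its inverse (function names are illustrative, not from the slides):

```python
import numpy as np

def scale_to_unit(pixels):
    """Map integer pixels {0, ..., 255} linearly to [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0

def unscale_to_pixels(x):
    """Invert the scaling: map [-1, 1] back to {0, ..., 255}."""
    return np.clip(np.round((x + 1.0) * 127.5), 0, 255).astype(np.uint8)

img = np.array([[0, 128, 255]], dtype=np.uint8)
assert np.array_equal(unscale_to_pixels(scale_to_unit(img)), img)
```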
Standard Normal Prior in Reverse Process
• The reverse process begins with a standard normal prior p(xT) = N(xT; 0, I): at the final step of the forward diffusion, the data follows this normal distribution.
• The goal of the reverse process (decoder) is to gradually denoise this distribution and reconstruct the original data (a sampling sketch follows below).
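To make the reverse process concrete, here is a hedged NumPy sketch of ancestral sampling: start from xT ~ N(0, I) and repeatedly apply the learned Gaussian transition pθ(xt−1 | xt), with the transition mean parameterized through the noise predictor as in the DDPM paper. As above, eps_model is a placeholder for the trained network and the schedule values are assumptions taken from the paper.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alphas_bar = np.cumprod(alphas)

def eps_model(x_t, t):
    """Placeholder for the learned noise predictor eps_theta(x_t, t)."""
    return np.zeros_like(x_t)

def sample(shape, rng):
    """Ancestral sampling: start from the prior p(x_T) = N(0, I) and
    apply the learned Gaussian transitions p_theta(x_{t-1} | x_t)."""
    x = rng.standard_normal(shape)              # x_T ~ N(0, I)
    for t in reversed(range(T)):
        z = rng.standard_normal(shape) if t > 0 else 0.0
        mean = (x - betas[t] / np.sqrt(1.0 - alphas_bar[t])
                * eps_model(x, t)) / np.sqrt(alphas[t])
        x = mean + np.sqrt(betas[t]) * z        # sigma_t^2 = beta_t
    return x

rng = np.random.default_rng(0)
x0 = sample((8,), rng)
```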