Diffusion Schrödinger bridges
for score-based generative modeling
Jeremy Heng
Joint work with Valentin De Bortoli, James Thornton and Arnaud Doucet
ESSEC Business School
ICSA China - 1 July 2022
Generative Modeling and Score-Based Generative Models
Diffusion Models Beat GANs on Image Synthesis - OpenAI, 2021
Recent years have seen massive advances in generative modeling, driven
by VAEs (Kingma & Welling, 2014), GANs (Goodfellow et al., 2014), and
autoregressive models (van den Oord et al., 2016).
Score-based generative models aka denoising diffusion models
(Ho et al., 2020; Song et al., 2021) provide SOTA results in a large
number of domains.
One basic idea: ancestral sampling
From Song et al., ICLR 2021
Consider a Markov chain with $X_0 \sim p_0$ and $X_{k+1} \sim p_{k+1|k}(\cdot|X_k)$; then
$$p(x_{0:N}) = p_0(x_0) \prod_{k=0}^{N-1} p_{k+1|k}(x_{k+1}|x_k).$$
Denote by $p_k$ the marginal of $X_k$, satisfying
$$p_k(x_k) = \int p_{k|k-1}(x_k|x_{k-1})\, p_{k-1}(x_{k-1})\, dx_{k-1}.$$
Backward decomposition ($p_{k|k+1}$ obtained with Bayes' rule):
$$p(x_{0:N}) = p_N(x_N) \prod_{k=0}^{N-1} p_{k|k+1}(x_k|x_{k+1}).$$
In particular, one can sample from $p(x_{0:N})$ by ancestral sampling:
sample $X_N \sim p_N(\cdot)$, then $X_k \sim p_{k|k+1}(\cdot|X_{k+1})$ for $k \in \{N-1, \ldots, 0\}$.
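A minimal sketch of backward ancestral sampling, assuming a one-dimensional Gaussian chain so that Bayes' rule gives $p_{k|k+1}$ in closed form. The Gaussian initial law, the transition $\mathcal{N}(\alpha x, 1-\alpha^2)$ and all function names are illustrative assumptions, not part of the slides. The recursion runs down to $k = 0$ so that $X_0$ is produced.

```python
import numpy as np

def gaussian_chain_marginals(m0, v0, alpha, N):
    """Means and variances of p_k for X_{k+1} | X_k ~ N(alpha X_k, 1 - alpha^2)."""
    means, variances = [m0], [v0]
    for _ in range(N):
        means.append(alpha * means[-1])
        variances.append(alpha**2 * variances[-1] + 1.0 - alpha**2)
    return np.array(means), np.array(variances)

def ancestral_sample(m0, v0, alpha, N, n_samples, rng):
    """Sample X_N ~ p_N, then X_k ~ p_{k|k+1}(.|X_{k+1}) for k = N-1, ..., 0."""
    m, v = gaussian_chain_marginals(m0, v0, alpha, N)
    x = m[N] + np.sqrt(v[N]) * rng.standard_normal(n_samples)
    for k in range(N - 1, -1, -1):
        # Bayes' rule for Gaussians: p_{k|k+1} is Gaussian with these moments.
        gain = alpha * v[k] / v[k + 1]
        mean = m[k] + gain * (x - m[k + 1])
        var = v[k] * (1.0 - alpha**2) / v[k + 1]
        x = mean + np.sqrt(var) * rng.standard_normal(n_samples)
    return x

rng = np.random.default_rng(0)
x0 = ancestral_sample(m0=2.0, v0=0.25, alpha=0.9, N=20, n_samples=100_000, rng=rng)
print(x0.mean(), x0.var())  # should be close to m0 = 2.0 and v0 = 0.25
```

Because the backward transitions are exact here, the samples recover the initial law up to Monte Carlo error; in generative modeling these transitions are unknown and must be approximated.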
Generative Modeling with ancestral sampling
From Song et al., ICLR 2021
Let $p_0 = p_{\mathrm{data}}$ and set $p_{k+1|k}$ such that $p_N \approx p_{\mathrm{ref}}$ for $N \gg 1$, where $p_{\mathrm{ref}}$
is a "reference" easy-to-sample density.
Usual choice: $p_{k+1|k}(x'|x) = \mathcal{N}(x'; \alpha x, (1-\alpha^2) I_d)$, such that
$p_N(x) \approx p_{\mathrm{ref}}(x)$, where $p_{\mathrm{ref}} = \mathcal{N}(x; 0_d, I_d)$, for $N$ large enough.
Use ancestral sampling, replacing $p_N$ by $p_{\mathrm{ref}}$:
sample $X_N \sim p_{\mathrm{ref}}(\cdot)$, then $X_k \sim p_{k|k+1}(\cdot|X_{k+1})$ for $k \in \{N-1, \ldots, 0\}$.
Key problem: one needs not only forward transitions $p_{k+1|k}$ and $N$
large enough that $p_N(x) \approx p_{\mathrm{ref}}(x)$, but also an
approximation of the backward transitions $p_{k|k+1}$.
Approximating Backward Transitions
We restrict ourselves to discretized Ornstein-Uhlenbeck processes
$$p_{k+1|k}(x_{k+1}|x_k) = \mathcal{N}(x_{k+1}; \alpha x_k, (1-\alpha^2) I_d),$$
where $\alpha > 0$ is close to 1.
Using a Taylor expansion we get
$$p_{k|k+1}(x_k|x_{k+1}) = p_{k+1|k}(x_{k+1}|x_k) \exp[\log p_k(x_k) - \log p_{k+1}(x_{k+1})]$$
$$\approx \mathcal{N}\big(x_k;\ (2-\alpha)x_{k+1} + (1-\alpha^2)\underbrace{\nabla \log p_{k+1}(x_{k+1})}_{\text{Score}},\ (1-\alpha^2) I_d\big).$$
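As a sanity check of this approximation, the sketch below compares the exact backward moments with the Taylor-based ones in a one-dimensional case where $p_k$ is Gaussian, so everything is available in closed form. The Gaussian marginal and the function names are assumptions made for illustration only.

```python
import numpy as np

def exact_backward_moments(x_next, m_k, v_k, alpha):
    """Exact mean/variance of p_{k|k+1}(.|x_next) when p_k = N(m_k, v_k)."""
    v_next = alpha**2 * v_k + 1.0 - alpha**2
    mean = m_k + alpha * v_k / v_next * (x_next - alpha * m_k)
    var = v_k * (1.0 - alpha**2) / v_next
    return mean, var

def approx_backward_moments(x_next, m_k, v_k, alpha):
    """Taylor-based approximation: the mean uses the score of p_{k+1}."""
    m_next = alpha * m_k
    v_next = alpha**2 * v_k + 1.0 - alpha**2
    score = -(x_next - m_next) / v_next  # grad log N(x; m_next, v_next)
    mean = (2.0 - alpha) * x_next + (1.0 - alpha**2) * score
    var = 1.0 - alpha**2
    return mean, var

exact = exact_backward_moments(1.5, 0.0, 4.0, 0.99)
approx = approx_backward_moments(1.5, 0.0, 4.0, 0.99)
print(exact, approx)
```

With $\alpha = 0.99$ the two sets of moments agree closely, and the gap grows as $\alpha$ moves away from 1.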
The score is not available, but using that
$p_{k+1}(x_{k+1}) = \int p_0(x_0)\, p_{k+1|0}(x_{k+1}|x_0)\, dx_0$, we get that
$$\nabla \log p_{k+1}(x_{k+1}) = \mathbb{E}_{X_0 \sim p_{0|k+1}}[\nabla_{x_{k+1}} \log p_{k+1|0}(x_{k+1}|X_0)].$$
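The identity can be checked numerically when $p_0$ is a finite mixture of point masses, since the noised marginal is then an explicit Gaussian mixture. This is a sketch under that assumption; it uses $p_{k|0}(x_k|x_0) = \mathcal{N}(x_k; \alpha^k x_0, 1-\alpha^{2k})$, which follows from composing the Gaussian transitions, and the atoms, weights and function names are hypothetical.

```python
import numpy as np

def noised_marginal_logpdf(x, atoms, weights, a, s2):
    """log p_k(x) when p_0 = sum_i w_i delta_{atoms_i} and p_{k|0} = N(a x0, s2)."""
    comp = -0.5 * (x - a * atoms) ** 2 / s2 - 0.5 * np.log(2 * np.pi * s2)
    return np.log(np.sum(weights * np.exp(comp)))

def score_via_posterior(x, atoms, weights, a, s2):
    """E_{X0 ~ p_{0|k}(.|x)} [ grad_x log p_{k|0}(x | X0) ]."""
    comp = -0.5 * (x - a * atoms) ** 2 / s2
    post = weights * np.exp(comp - comp.max())
    post /= post.sum()                   # posterior weights p_{0|k}(x0 | x)
    cond_scores = -(x - a * atoms) / s2  # grad_x log N(x; a x0, s2)
    return np.sum(post * cond_scores)

atoms = np.array([-2.0, 0.5, 3.0])
weights = np.array([0.2, 0.5, 0.3])
alpha, k = 0.9, 5
a, s2 = alpha**k, 1.0 - alpha ** (2 * k)
x, h = 0.7, 1e-5
# Central finite difference of log p_k as an independent reference value.
fd = (noised_marginal_logpdf(x + h, atoms, weights, a, s2)
      - noised_marginal_logpdf(x - h, atoms, weights, a, s2)) / (2 * h)
score = score_via_posterior(x, atoms, weights, a, s2)
print(fd, score)
```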
Estimating the Scores using Score Matching and Sampling
Conditional expectation → Regression problem
$$s_{k+1} = \arg\min_s \mathbb{E}_{p_{0,k+1}}\big[\|s(X_{k+1}) - \nabla_{x_{k+1}} \log p_{k+1|0}(X_{k+1}|X_0)\|^2\big].$$
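In practice, we restrict ourselves to neural networks and estimate all
scores simultaneously, i.e. $s_{\theta^\star}(k, x_k) \approx \nabla \log p_k(x_k)$ where
$$\theta^\star \approx \arg\min_\theta \sum_{k=1}^{N} \mathbb{E}_{p_{0,k}}\big[\|s_\theta(k, X_k) - \nabla_{x_k} \log p_{k|0}(X_k|X_0)\|^2\big].$$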
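A minimal sketch of this regression at a single noise level, assuming one-dimensional Gaussian data and replacing the neural network with the one-parameter model $s(x) = \theta x$ fitted by least squares. Both simplifications are assumptions chosen so that the true score $-x/v_k$ is known and the fit can be checked.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, k, v0, n = 0.9, 5, 4.0, 200_000

# Draw (X_0, X_k) from p_{0,k}: X_k = alpha^k X_0 + sqrt(1 - alpha^{2k}) Z.
a, s2 = alpha**k, 1.0 - alpha ** (2 * k)
x0 = np.sqrt(v0) * rng.standard_normal(n)
xk = a * x0 + np.sqrt(s2) * rng.standard_normal(n)

# Regression target: grad_{x_k} log p_{k|0}(X_k | X_0).
target = -(xk - a * x0) / s2

# Least-squares fit of the one-parameter model s(x) = theta * x.
theta = np.sum(xk * target) / np.sum(xk * xk)

# For Gaussian data the true score of p_k = N(0, v_k) is -x / v_k.
v_k = a**2 * v0 + s2
print(theta, -1.0 / v_k)
```

The fitted slope approaches $-1/v_k$ even though the regression targets only involve the tractable conditional density $p_{k|0}$.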
Generate samples from the backward process using $X_N \sim p_{\mathrm{ref}}$ and
the recursion
$$X_k = (2-\alpha) X_{k+1} + (1-\alpha^2)\, s_{\theta^\star}(k+1, X_{k+1}) + \sqrt{1-\alpha^2}\, Z_{k+1}.$$
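The recursion can be sketched as follows. To keep the example self-contained, the learned network $s_{\theta^\star}$ is replaced by the exact score of a one-dimensional Gaussian data distribution; this substitution, and the names `generate` and `exact_score`, are assumptions for illustration.

```python
import numpy as np

def generate(score, alpha, N, n_samples, rng):
    """Run X_k = (2 - alpha) X_{k+1} + (1 - alpha^2) s(k+1, X_{k+1}) + sqrt(1 - alpha^2) Z."""
    x = rng.standard_normal(n_samples)  # X_N ~ p_ref = N(0, 1)
    for k in range(N - 1, -1, -1):
        z = rng.standard_normal(n_samples)
        x = (2.0 - alpha) * x + (1.0 - alpha**2) * score(k + 1, x) + np.sqrt(1.0 - alpha**2) * z
    return x

alpha, N, v0 = 0.99, 500, 4.0

def exact_score(k, x):
    # Stand-in for s_theta: exact score of p_k = N(0, v_k) for data N(0, v0).
    v_k = alpha ** (2 * k) * v0 + 1.0 - alpha ** (2 * k)
    return -x / v_k

samples = generate(exact_score, alpha, N, 50_000, np.random.default_rng(0))
print(samples.var())  # roughly v0 = 4, up to discretization and Monte Carlo error
```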
Code available (JAX and PyTorch):
https://github.com/yang-song/score_sde
From Discrete to Continuous-Time
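The Markov chain is an Euler discretization of the
Ornstein-Uhlenbeck process
$$dX_t = -\beta X_t\, dt + \sqrt{2}\, dB_t, \quad X_0 \sim p_{\mathrm{data}},$$
where $\beta > 0$ is a parameter and $p_{\mathrm{ref}} = \mathcal{N}(0, \beta^{-1} I_d)$.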
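A short Euler-Maruyama sketch of the forward Ornstein-Uhlenbeck SDE above, checking empirically that the law at a large time is close to $\mathcal{N}(0, 1/\beta)$ in one dimension. The step size, horizon and function name are illustrative assumptions.

```python
import numpy as np

def euler_maruyama_ou(x0, beta, gamma, n_steps, rng):
    """Euler-Maruyama for dX_t = -beta X_t dt + sqrt(2) dB_t with step size gamma."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - beta * gamma * x + np.sqrt(2.0 * gamma) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)
beta, gamma = 2.0, 0.01
x_T = euler_maruyama_ou(np.full(100_000, 3.0), beta, gamma, n_steps=1000, rng=rng)
print(x_T.mean(), x_T.var())  # close to 0 and 1 / beta = 0.5 for large T
```

The discretized chain has a slightly inflated stationary variance, $1/(\beta(1-\beta\gamma/2))$, which vanishes as the step size $\gamma \to 0$.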
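The reverse-time process $(Y_t)_{t \in [0,T]} = (X_{T-t})_{t \in [0,T]}$ satisfies
$$dY_t = \{\beta Y_t + 2 \nabla \log p_{T-t}(Y_t)\}\, dt + \sqrt{2}\, dB_t, \quad Y_0 \sim p_T,$$
and the generative model is
$$dY_t = \{\beta Y_t + 2 s_{\theta^\star}(T-t, Y_t)\}\, dt + \sqrt{2}\, dB_t, \quad Y_0 \sim p_{\mathrm{ref}}.$$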
From Discrete to Continuous-Time
Convergence of diffusion models (De Bortoli et al., 2021)
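Assume there exists $M \geq 0$ such that for any $t \in [0, T]$ and $x \in \mathbb{R}^d$
$$\|s_{\theta^\star}(t, x) - \nabla \log p_t(x)\| \leq M,$$
with $s_{\theta^\star} \in C([0, T] \times \mathbb{R}^d, \mathbb{R}^d)$ and regularity conditions on $p_{\mathrm{data}}$ and its
gradients.
Then there exist $B_\beta, C_\beta, D_\beta \geq 0$ such that for any $N \in \mathbb{N}$ and $\{\gamma_k\}_{k=1}^{N}$ the
following holds:
$$\|\mathcal{L}(X_0) - p_{\mathrm{data}}\|_{\mathrm{TV}} \leq B_\beta \exp[-\beta^{1/2} T] + C_\beta (M + \bar{\gamma}^{1/2}) \exp[D_\beta T],$$
where $T = \sum_{k=1}^{N} \gamma_k$ and $\bar{\gamma} = \sup_{k \in \{1,\ldots,N\}} \gamma_k$ ($\{\gamma_k\}_{k=1}^{N}$ is the sequence of
step sizes in the Euler-Maruyama discretization).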
Take-home message: the “mixing time” of the reversal is entirely
given by the forward process. The bottleneck is not the mixing of the
chain but the approximation of the drift.
Practical Limitations
Too few steps lead to a poor approximation (the
Ornstein-Uhlenbeck process does not mix fast enough).
Illustration of failure: N is too small so pN is very different from pref.
This harms the quality of the reconstruction for the time-reversal.
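To make the failure mode concrete, the sketch below evaluates $\mathrm{KL}(p_N \| p_{\mathrm{ref}})$ in closed form for a one-dimensional Gaussian data distribution under the discretized Ornstein-Uhlenbeck chain with $p_{\mathrm{ref}} = \mathcal{N}(0, 1)$. The Gaussian data and the parameter values are assumptions chosen so that no simulation is needed.

```python
import numpy as np

def kl_to_standard_normal(m, v):
    """KL( N(m, v) || N(0, 1) ) in one dimension."""
    return 0.5 * (v + m**2 - 1.0 - np.log(v))

def kl_pN_to_pref(m0, v0, alpha, N):
    """KL(p_N || p_ref) for data N(m0, v0) under X_{k+1} | X_k ~ N(alpha X_k, 1 - alpha^2)."""
    m_N = alpha**N * m0
    v_N = alpha ** (2 * N) * v0 + 1.0 - alpha ** (2 * N)
    return kl_to_standard_normal(m_N, v_N)

for N in (10, 50, 200, 1000):
    print(N, kl_pN_to_pref(m0=3.0, v0=0.25, alpha=0.99, N=N))
```

For small $N$ the divergence is far from zero, so initializing the backward process at $p_{\mathrm{ref}}$ instead of $p_N$ introduces a substantial mismatch.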
Our contribution: “iterating” diffusion models to force the correct
marginal distributions.
Revisiting Generative Modeling using Schrödinger Bridges
The Schrödinger Bridge problem: consider a base process $p(x_{0:N})$,
find $\pi^\star(x_{0:N})$ such that
$$\pi^\star = \arg\min\{\mathrm{KL}(\pi \| p) : \pi_0 = p_{\mathrm{data}},\ \pi_N = p_{\mathrm{ref}}\}.$$
If $\pi^\star$ is available: $X_N \sim p_{\mathrm{ref}}$, then $X_k \sim \pi^\star_{k|k+1}(\cdot|X_{k+1})$ for
$k \in \{N-1, \ldots, 0\}$.
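We have $\pi^\star(x_{0:N}) = \pi^{s,\star}(x_0, x_N)\, p(x_{1:N-1}|x_0, x_N)$ where
$$\pi^{s,\star} = \arg\min\{-\mathbb{E}_{\pi^s}[\log p_{N|0}(X_N|X_0)] - H(\pi^s) : \pi^s_0 = p_{\mathrm{data}},\ \pi^s_N = p_{\mathrm{ref}}\}$$
and, if $p_{N|0}(x_N|x_0) = \mathcal{N}(x_N; x_0, \sigma^2)$, then
$$\pi^{s,\star} = \arg\min\{\mathbb{E}_{\pi^s}[\|X_0 - X_N\|^2] - 2\sigma^2 H(\pi^s) : \pi^s_0 = p_{\mathrm{data}},\ \pi^s_N = p_{\mathrm{ref}}\}.$$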
This is a regularized Wasserstein-2 cost, i.e. as $\sigma \to 0$, $\pi^{s,\star}$
converges to the optimal transport plan (Mikami, 2004).
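The step from the general static problem to the Gaussian case can be spelled out as follows (a short worked derivation; the additive constant collects terms that do not depend on $\pi^s$):

```latex
-\log p_{N|0}(x_N|x_0) = \frac{\|x_0 - x_N\|^2}{2\sigma^2} + \text{const}
\quad\Longrightarrow\quad
-\mathbb{E}_{\pi^s}[\log p_{N|0}(X_N|X_0)] - H(\pi^s)
= \frac{1}{2\sigma^2}\Big(\mathbb{E}_{\pi^s}\big[\|X_0 - X_N\|^2\big] - 2\sigma^2 H(\pi^s)\Big) + \text{const}.
```

Since $2\sigma^2 > 0$ and the constant is the same for every feasible $\pi^s$, both objectives have the same minimizer.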
Solving the Schrödinger Bridge Problem
The SB problem can be solved using Iterative Proportional Fitting
(IPF) (Fortet, 1940; Kullback, 1968), i.e. set $\pi^0 = p$ and for $n \geq 0$
$$\pi^{2n+1} = \arg\min\{\mathrm{KL}(\pi \| \pi^{2n}) : \pi_N = p_{\mathrm{ref}}\},$$
$$\pi^{2n+2} = \arg\min\{\mathrm{KL}(\pi \| \pi^{2n+1}) : \pi_0 = p_{\mathrm{data}}\}.$$
$\lim_{n \to +\infty} \pi^n = \pi^\star$ under regularity conditions (Rüschendorf, 1995;
Léger, 2021; De Bortoli et al., 2021).
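For intuition, here is a sketch of IPF in the simplest static, discrete setting, where the joint law of $(x_0, x_N)$ is a table and each KL projection is a row or column rescaling. The grids, the Gaussian-shaped reference table and the marginals are hypothetical choices for illustration.

```python
import numpy as np

def ipf(joint, p_data, p_ref, n_iters=100):
    """Alternately rescale a positive joint table to match its two marginals."""
    pi = joint / joint.sum()
    for _ in range(n_iters):
        pi = pi * (p_ref / pi.sum(axis=0))[None, :]   # odd step: fix the x_N-marginal
        pi = pi * (p_data / pi.sum(axis=1))[:, None]  # even step: fix the x_0-marginal
    return pi

x0 = np.linspace(-2.0, 2.0, 5)
xN = np.linspace(-1.0, 3.0, 6)
sigma = 2.0
# Hypothetical reference joint, proportional to exp(-|x_0 - x_N|^2 / (2 sigma^2)).
joint = np.exp(-0.5 * (x0[:, None] - xN[None, :]) ** 2 / sigma**2)
p_data = np.full(5, 0.2)
p_ref = np.array([0.1, 0.2, 0.2, 0.2, 0.2, 0.1])
pi = ipf(joint, p_data, p_ref)
print(pi.sum(axis=1), pi.sum(axis=0))
```

Each rescaling is the exact KL projection onto one marginal constraint; after enough iterations both marginals are matched simultaneously.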
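Explicit solution of the first IPF step:
$$\mathrm{KL}(\pi \| \pi^0) = \mathrm{KL}(\pi_N \| p_N) + \mathbb{E}_{\pi_N}[\mathrm{KL}(\pi_{|N} \| p_{|N})].$$
Therefore,
$$\pi^1(x_{0:N}) = p_{\mathrm{ref}}(x_N)\, p(x_{0:N-1}|x_N) = p_{\mathrm{ref}}(x_N) \prod_{k=0}^{N-1} p_{k|k+1}(x_k|x_{k+1}).$$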
Take-home message: the approximation to the first iteration of IPF
corresponds to current score-based generative models.
Solving the Schrödinger Bridge Problem
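The second iteration requires solving
$$\pi^2 = \arg\min\{\mathrm{KL}(\pi \| \pi^1) : \pi_0 = p_{\mathrm{data}}\}.$$
Therefore,
$$\pi^2(x_{0:N}) = p_{\mathrm{data}}(x_0)\, \pi^1(x_{1:N}|x_0) = p_{\mathrm{data}}(x_0) \prod_{k=0}^{N-1} \pi^1_{k+1|k}(x_{k+1}|x_k).$$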
On an algorithmic level:
• IPF1: the time-reversal of the forward process $\pi^0 = p$ is
initialized by $p_{\mathrm{ref}}$ at time $N$ to define the backward process $\pi^1$.
• IPF2: the time-reversal of the backward process $\pi^1$ is initialized
by $p_{\mathrm{data}}$ at time 0 to define the forward process $\pi^2$.
• IPF3: the time-reversal of the forward process $\pi^2$ is initialized
by $p_{\mathrm{ref}}$ at time $N$ to define the backward process $\pi^3$.
• ...
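The alternation of time-reversals can be sketched on a finite-state Markov chain, where every time-reversal is an exact application of Bayes' rule. The state space, the lazy reference transition matrix and the function names are assumptions for illustration; in the continuous-state setting the reversals are not available in closed form.

```python
import numpy as np

def reverse_chain(init, trans):
    """Time-reverse a Markov chain given its initial law and one-step transitions.

    trans[k][i, j] is the probability of moving from state i to state j at step k.
    Returns the terminal marginal and the reversed transitions, ordered in the
    direction in which the reversed chain runs.
    """
    marg, rev = init, []
    for P in trans:
        joint = marg[:, None] * P             # law of consecutive states (i, j)
        nxt = joint.sum(axis=0)
        rev.append((joint / nxt[None, :]).T)  # Bayes' rule: prob of i given j
        marg = nxt
    return marg, rev[::-1]

def ipf_on_paths(p_data, p_ref, trans, n_iters):
    """Alternate time-reversals, re-initializing at p_ref and p_data in turn."""
    fwd = trans
    for _ in range(n_iters):
        _, bwd = reverse_chain(p_data, fwd)  # odd step: backward process, started at p_ref
        _, fwd = reverse_chain(p_ref, bwd)   # even step: forward process, started at p_data
    return fwd, bwd

S, N = 4, 5
P = 0.9 * np.eye(S) + 0.1 / S  # lazy reference transition, all entries positive
p_data = np.array([0.7, 0.1, 0.1, 0.1])
p_ref = np.full(S, 0.25)
fwd, bwd = ipf_on_paths(p_data, p_ref, [P] * N, n_iters=50)
end, _ = reverse_chain(p_data, fwd)  # terminal marginal of the final forward chain
print(end)
```

With only $N = 5$ lazy steps the reference chain started at `p_data` does not reach `p_ref`, whereas the forward chain obtained after the IPF iterations does, which is the point of iterating.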
Continuous-Time IPF
IPF can be formulated in continuous time
$$\Pi^\star = \arg\min\{\mathrm{KL}(\Pi \| P) : \Pi \in \mathcal{P}(\mathcal{C}),\ \Pi_0 = p_{\mathrm{data}},\ \Pi_T = p_{\mathrm{ref}}\}.$$
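Similarly, we define the IPF sequence $(\Pi^n)$ recursively, starting from $\Pi^0 = P$, using
$$\Pi^{2n+1} = \arg\min\{\mathrm{KL}(\Pi \| \Pi^{2n}) : \Pi \in \mathcal{P}(\mathcal{C}),\ \Pi_T = p_{\mathrm{ref}}\},$$
$$\Pi^{2n+2} = \arg\min\{\mathrm{KL}(\Pi \| \Pi^{2n+1}) : \Pi \in \mathcal{P}(\mathcal{C}),\ \Pi_0 = p_{\mathrm{data}}\}.$$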
Under regularity conditions,
$$(\Pi^{2n+1})^R \rightarrow dY^{2n+1}_t = b^n_{T-t}(Y^{2n+1}_t)\, dt + \sqrt{2}\, dB_t, \quad Y^{2n+1}_0 \sim p_{\mathrm{ref}},$$
$$\Pi^{2n+2} \rightarrow dX^{2n+2}_t = f^{n+1}_t(X^{2n+2}_t)\, dt + \sqrt{2}\, dB_t, \quad X^{2n+2}_0 \sim p_{\mathrm{data}},$$
where
$$b^n_t(x) = -f^n_t(x) + 2 \nabla \log p^n_t(x),$$
$$f^{n+1}_t(x) = -b^n_t(x) + 2 \nabla \log q^n_t(x),$$
with $f^0_t(x) = f(x)$, and $p^n_t$, $q^n_t$ the densities of $\Pi^{2n}_t$ and $\Pi^{2n+1}_t$.
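As a consistency check, take $n = 0$ and assume the reference process $P$ is the Ornstein-Uhlenbeck diffusion with drift $f(x) = -\beta x$, so that $p^0_t = p_t$ is the forward marginal. The first backward drift is then

```latex
b^0_{T-t}(y) = -f(y) + 2 \nabla \log p^0_{T-t}(y) = \beta y + 2 \nabla \log p_{T-t}(y),
```

which is the drift of the reverse-time SDE used by standard score-based generative models, now initialized at $p_{\mathrm{ref}}$. This matches the statement that the first IPF iteration corresponds to existing score-based models.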
Diffusion Schrödinger Bridge
Sample generation: $X_N \sim p_{\mathrm{ref}}$ and $X_{k-1} = B_{\beta^L}(k, X_k) + \sqrt{2\gamma_k}\, Z_k$.
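The sampling loop can be sketched as below. The callable `backward_mean` is a placeholder for the learned map $B_{\beta^L}$, whose architecture and training are not shown here; the toy linear map used in the usage lines is purely illustrative and is not the method's network.

```python
import numpy as np

def dsb_sample(backward_mean, gammas, x_N, rng):
    """Iterate X_{k-1} = B(k, X_k) + sqrt(2 gamma_k) Z_k for k = N, ..., 1.

    backward_mean(k, x) is a placeholder for the learned map B_{beta^L}(k, x);
    gammas[k - 1] holds the step size gamma_k.
    """
    x = np.asarray(x_N, dtype=float)
    for k in range(len(gammas), 0, -1):
        z = rng.standard_normal(x.shape)
        x = backward_mean(k, x) + np.sqrt(2.0 * gammas[k - 1]) * z
    return x

def toy_mean(k, x):
    # Toy stand-in for the network: a fixed linear pull towards 1.0.
    return x + 0.5 * (1.0 - x)

rng = np.random.default_rng(0)
x_start = rng.standard_normal((1000, 2))  # X_N ~ p_ref = N(0, I)
out = dsb_sample(toy_mean, gammas=np.full(50, 0.01), x_N=x_start, rng=rng)
print(out.shape, out.mean())
```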
Diffusion Schrödinger Bridge: 2D example
Diffusion Schrödinger Bridge (DSB) gives a solution to the “small
time problem”.
Approximation of Optimal Transport.
Applications: 2D distributions
Data distributions pdata vs distribution at t = 0 for T = 0.2 after 1 and
20 DSB steps
Applications: Downscaled CelebA
Generative model for CelebA after 10 DSB steps with N = 50,
T = 0.63 (d = 32 × 32 × 3 = 3072).
Applications: Datasets Interpolation
First row: Swiss-roll to S-curve (2D). Step 9 of DSB with T = 1
(N = 50). From left to right: t = 0, 0.4, 0.6, 1. Second row: EMNIST to
MNIST. Step 10 of DSB with T = 1.5 (N = 30). From left to right:
t = 0, 0.4, 1.25, 1.5.
Discussion
Quick summary
• Theoretical results for denoising diffusion models.
• Generative modeling can be reformulated as a Schrödinger
Bridge problem.
• Diffusion Schrödinger Bridge approximates its solution using
(discretized) forward-backward diffusions and score matching
ideas.
References
V. De Bortoli, J. Thornton, J. Heng & A. Doucet, Diffusion Schrödinger bridge with applications to score-based generative modeling. NeurIPS 2021.
V. De Bortoli, G. Deligiannidis & A. Doucet, Quantitative uniform stability of the iterative proportional fitting procedure. arXiv:2108.08129.
J. Ho, A. Jain & P. Abbeel, Denoising diffusion probabilistic models. NeurIPS 2020.
Y. Song, J. Sohl-Dickstein, D.P. Kingma, A. Kumar, S. Ermon & B. Poole, Score-based generative modeling through stochastic differential equations. ICLR 2021.
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts ServiceCall Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 

Diffusion Schrödinger bridges for score-based generative modeling

  • 1. Diffusion Schrödinger bridges for score-based generative modeling. Jeremy Heng (ESSEC Business School). Joint work with Valentin De Bortoli, James Thornton and Arnaud Doucet. ICSA China, 1 July 2022.
  • 2. Generative Modeling and Score-Based Generative Models. Diffusion Models Beat GANs on Image Synthesis - OpenAI, 2021.
    Recent years have seen massive advances in generative modeling, driven by VAEs (Kingma & Welling, 2014), GANs (Goodfellow et al., 2014) and autoregressive models (van den Oord et al., 2016).
    Score-based generative models, also known as denoising diffusion models (Ho et al., 2020; Song et al., 2021), provide state-of-the-art results in a large number of domains.
  • 3. One basic idea: ancestral sampling (from Song et al., ICLR 2021)
    Consider a Markov chain with $X_0 \sim p_0$ and $X_{k+1} \sim p_{k+1|k}(\cdot \mid X_k)$; then
    $$p(x_{0:N}) = p_0(x_0) \prod_{k=0}^{N-1} p_{k+1|k}(x_{k+1} \mid x_k).$$
    Denote by $p_k$ the marginal of $X_k$, which satisfies
    $$p_k(x_k) = \int p_{k|k-1}(x_k \mid x_{k-1})\, p_{k-1}(x_{k-1})\, dx_{k-1}.$$
    Backward decomposition ($p_{k|k+1}$ obtained with Bayes' rule):
    $$p(x_{0:N}) = p_N(x_N) \prod_{k=0}^{N-1} p_{k|k+1}(x_k \mid x_{k+1}).$$
    In particular, one can sample from $p(x_{0:N})$ by ancestral sampling: sample $X_N \sim p_N(\cdot)$, then $X_k \sim p_{k|k+1}(\cdot \mid X_{k+1})$ for $k \in \{N-1, \ldots, 0\}$.
  • 4. Generative Modeling with ancestral sampling (from Song et al., ICLR 2021)
    Let $p_0 = p_{\mathrm{data}}$ and choose $p_{k+1|k}$ such that $p_N \approx p_{\mathrm{ref}}$ for $N \gg 1$, where $p_{\mathrm{ref}}$ is a "reference" easy-to-sample density.
    Usual choice: $p_{k+1|k}(x' \mid x) = \mathcal{N}(x'; \alpha x, (1-\alpha^2) I_d)$, so that $p_N(x) \approx p_{\mathrm{ref}}(x)$ with $p_{\mathrm{ref}}(x) = \mathcal{N}(x; 0_d, I_d)$ for $N$ large enough.
    Use ancestral sampling with $p_N$ replaced by $p_{\mathrm{ref}}$: sample $X_N \sim p_{\mathrm{ref}}(\cdot)$, then $X_k \sim p_{k|k+1}(\cdot \mid X_{k+1})$ for $k \in \{N-1, \ldots, 0\}$.
    Key problem: one needs not only forward transitions $p_{k+1|k}$ and $N$ large enough that $p_N(x) \approx p_{\mathrm{ref}}(x)$, but also an approximation of the backward transitions $p_{k|k+1}$.
  • 5. Approximating Backward Transitions
    We restrict ourselves to discretized Ornstein-Uhlenbeck processes
    $$p_{k+1|k}(x_{k+1} \mid x_k) = \mathcal{N}(x_{k+1}; \alpha x_k, (1-\alpha^2) I_d), \qquad (\alpha > 0 \text{ is close to } 1).$$
    Using a Taylor expansion we get
    $$p_{k|k+1}(x_k \mid x_{k+1}) = p_{k+1|k}(x_{k+1} \mid x_k) \exp[\log p_k(x_k) - \log p_{k+1}(x_{k+1})] \approx \mathcal{N}\big(x_k; (2-\alpha) x_{k+1} + (1-\alpha^2) \underbrace{\nabla \log p_{k+1}(x_{k+1})}_{\text{Score}}, (1-\alpha^2) I_d\big).$$
    The score is not available, but using $p_{k+1}(x_{k+1}) = \int p_0(x_0)\, p_{k+1|0}(x_{k+1} \mid x_0)\, dx_0$ we get
    $$\nabla \log p_{k+1}(x_{k+1}) = \mathbb{E}_{X_0 \sim p_{0|k+1}}\big[\nabla_{x_{k+1}} \log p_{k+1|0}(x_{k+1} \mid X_0)\big].$$
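When $p_0$ is a one-dimensional Gaussian, every quantity on this slide is available in closed form, so the Taylor approximation of the backward kernel can be checked numerically. The sketch below is only an illustration under that assumption: the values `alpha`, `m`, `s2` and the helper names are hypothetical and do not come from the talk.

```python
import numpy as np

def marginal(k, alpha, m, s2):
    """Mean and variance of p_k when p_0 = N(m, s2) and p_{k+1|k} = N(alpha * x, 1 - alpha^2)."""
    a = alpha ** k
    return a * m, a**2 * s2 + 1.0 - a**2

def exact_backward(k, y, alpha, m, s2):
    """Mean and variance of the exact Gaussian p_{k|k+1}(. | y), from Bayes' rule."""
    mu_k, v_k = marginal(k, alpha, m, s2)
    _, v_next = marginal(k + 1, alpha, m, s2)
    gain = alpha * v_k / v_next
    return mu_k + gain * (y - alpha * mu_k), v_k * (1.0 - alpha**2) / v_next

def approx_backward(k, y, alpha, m, s2):
    """Mean and variance of the Taylor approximation, using the exact score of p_{k+1}."""
    mu_next, v_next = marginal(k + 1, alpha, m, s2)
    score = -(y - mu_next) / v_next
    return (2.0 - alpha) * y + (1.0 - alpha**2) * score, 1.0 - alpha**2

# Hypothetical settings: alpha close to 1, data distribution N(2, 0.25).
alpha, m, s2 = 0.99, 2.0, 0.25
for k, y in [(0, 1.5), (100, 1.0)]:
    print(k, exact_backward(k, y, alpha, m, s2), approx_backward(k, y, alpha, m, s2))
```

Both the mean and the variance of the approximation differ from the exact kernel by terms that shrink as `alpha` approaches 1, which is the regime assumed on the slide.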
  • 6. Estimating the Scores using Score Matching and Sampling
    Conditional expectation → regression problem:
    $$s_{k+1} = \arg\min_s \mathbb{E}_{p_{0,k+1}}\big[\|s(X_{k+1}) - \nabla_{x_{k+1}} \log p_{k+1|0}(X_{k+1} \mid X_0)\|^2\big].$$
    In practice, we restrict ourselves to neural networks and estimate all scores simultaneously, i.e. $s_{\theta^\star}(k, x_k) \approx \nabla \log p_k(x_k)$ where
    $$\theta^\star \approx \arg\min_\theta \sum_{k=1}^{N} \mathbb{E}_{p_{0,k}}\big[\|s_\theta(k, X_k) - \nabla_{x_k} \log p_{k|0}(X_k \mid X_0)\|^2\big].$$
    Generate samples from the backward process using $X_N \sim p_{\mathrm{ref}}$ and the recursion
    $$X_k = (2-\alpha) X_{k+1} + (1-\alpha^2)\, s_{\theta^\star}(k+1, X_{k+1}) + \sqrt{1-\alpha^2}\, Z_{k+1}.$$
    Code available (JAX and PyTorch): https://github.com/yang-song/score_sde
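The regression view of score matching can be illustrated without a neural network. In the sketch below, which is an assumption-laden toy rather than the method of the linked repository, the data distribution is a hypothetical one-dimensional Gaussian, so the true score of $p_k$ is linear in $x$ and an ordinary least-squares fit of $s(x) = w x + b$ against the conditional score should recover it at a single time index `k`.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, k, n = 0.99, 50, 100_000   # hypothetical noise level, time index, sample size
m, s = 2.0, 0.5                   # hypothetical data distribution p_0 = N(m, s^2)

# Draw (X_0, X_k) from the joint p_{0,k}: X_k | X_0 ~ N(alpha^k X_0, 1 - alpha^(2k)).
a = alpha ** k
x0 = m + s * rng.standard_normal(n)
xk = a * x0 + np.sqrt(1.0 - a**2) * rng.standard_normal(n)

# Regression target: gradient with respect to x_k of log p_{k|0}(x_k | x_0).
target = -(xk - a * x0) / (1.0 - a**2)

# Least-squares fit of s(x) = w * x + b, i.e. the score-matching minimiser
# within the class of linear functions (np.polyfit returns slope first).
w, b = np.polyfit(xk, target, deg=1)

# Closed-form score of p_k = N(mu_k, v_k), for comparison: -(x - mu_k) / v_k.
mu_k, v_k = a * m, a**2 * s**2 + 1.0 - a**2
w_true, b_true = -1.0 / v_k, mu_k / v_k
print(w, w_true, b, b_true)
```

The fitted slope and intercept agree with the closed-form score up to Monte Carlo error, even though the regression targets only involve the tractable conditional density $p_{k|0}$.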
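  • 7. From Discrete to Continuous-Time
    The Markov chain is an Euler discretization of the Ornstein-Uhlenbeck process
    $$dX_t = -\beta X_t\, dt + \sqrt{2}\, dB_t, \qquad X_0 \sim p_{\mathrm{data}},$$
    where $\beta > 0$ is a parameter and $p_{\mathrm{ref}} = \mathcal{N}(0, \tfrac{1}{\beta} I_d)$.
    The reverse-time process $(Y_t)_{t \in [0,T]} = (X_{T-t})_{t \in [0,T]}$ satisfies
    $$dY_t = \{\beta Y_t + 2 \nabla \log p_{T-t}(Y_t)\}\, dt + \sqrt{2}\, dB_t, \qquad Y_0 \sim p_T,$$
    and the generative model is
    $$dY_t = \{\beta Y_t + 2 s_{\theta^\star}(T-t, Y_t)\}\, dt + \sqrt{2}\, dB_t, \qquad Y_0 \sim p_{\mathrm{ref}}.$$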
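A minimal sketch of simulating the generative SDE with an Euler-Maruyama scheme follows. It assumes a hypothetical one-dimensional Gaussian $p_{\mathrm{data}} = \mathcal{N}(m, s^2)$, for which the score of $p_t$ is known exactly and stands in for the learned network $s_{\theta^\star}$; the function names, step count and parameter values are illustrative choices, not taken from the slides.

```python
import numpy as np

def ou_marginal(t, beta, m, s2):
    """Mean and variance of X_t for dX = -beta X dt + sqrt(2) dB with X_0 ~ N(m, s2)."""
    decay = np.exp(-beta * t)
    return decay * m, decay**2 * s2 + (1.0 - decay**2) / beta

def sample_reverse(n, beta=1.0, T=5.0, n_steps=1000, m=2.0, s2=0.25, seed=0):
    """Euler-Maruyama discretization of dY = {beta Y + 2 score(T - t, Y)} dt + sqrt(2) dB."""
    rng = np.random.default_rng(seed)
    gamma = T / n_steps                               # constant step size
    y = rng.standard_normal(n) / np.sqrt(beta)        # Y_0 ~ p_ref = N(0, 1/beta)
    for j in range(n_steps):
        mu, v = ou_marginal(T - j * gamma, beta, m, s2)
        score = -(y - mu) / v                         # exact score of p_{T-t}, replacing s_theta
        y = y + gamma * (beta * y + 2.0 * score) + np.sqrt(2.0 * gamma) * rng.standard_normal(n)
    return y

samples = sample_reverse(20_000)
print(samples.mean(), samples.std())   # should be close to m = 2.0 and s = 0.5
```

With $T$ large enough that $p_T \approx p_{\mathrm{ref}}$, the empirical mean and standard deviation of the output approach those of the data distribution, up to discretization and Monte Carlo error.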
  • 8. From Discrete to Continuous-Time
    Convergence of diffusion models (De Bortoli et al., 2021). Assume there exists $M \geq 0$ such that for any $t \in [0,T]$ and $x \in \mathbb{R}^d$,
    $$\|s_{\theta^\star}(t, x) - \nabla \log p_t(x)\| \leq M,$$
    with $s_{\theta^\star} \in C([0,T] \times \mathbb{R}^d, \mathbb{R}^d)$, and regularity conditions on $p_{\mathrm{data}}$ and its gradients. Then there exist $B_\beta, C_\beta, D_\beta \geq 0$ such that for any $N \in \mathbb{N}$ and $\{\gamma_k\}_{k=1}^{N}$ the following holds:
    $$\|\mathcal{L}(X_0) - p_{\mathrm{data}}\|_{\mathrm{TV}} \leq B_\beta \exp[-\beta^{1/2} T] + C_\beta (M + \bar{\gamma}^{1/2}) \exp[D_\beta T],$$
    where $T = \sum_{k=1}^{N} \gamma_k$ and $\bar{\gamma} = \sup_{k \in \{1, \ldots, N\}} \gamma_k$ ($\{\gamma_k\}_{k=1}^{N}$ is the sequence of stepsizes in the Euler-Maruyama discretization).
    Take-home message: the "mixing time" of the reversal is entirely given by the forward process. The bottleneck is not the mixing of the chain but the approximation of the drift.
  • 9. Practical Limitations
    Too few steps lead to a poor approximation (the Ornstein-Uhlenbeck process does not mix fast enough).
    Illustration of failure: $N$ is too small, so $p_N$ is very different from $p_{\mathrm{ref}}$. This harms the quality of the reconstruction for the time-reversal.
    Our contribution: "iterating" diffusion models to force the correct marginal distributions.
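  • 10. Revisiting Generative Modeling using Schrödinger Bridges
    The Schrödinger Bridge problem: consider a base process $p(x_{0:N})$ and find $\pi^\star(x_{0:N})$ such that
    $$\pi^\star = \arg\min\{\mathrm{KL}(\pi \| p) : \pi_0 = p_{\mathrm{data}},\ \pi_N = p_{\mathrm{ref}}\}.$$
    If $\pi^\star$ is available: sample $X_N \sim p_{\mathrm{ref}}$, then $X_k \sim \pi^\star_{k|k+1}(\cdot \mid X_{k+1})$ for $k \in \{N-1, \ldots, 0\}$.
    We have $\pi^\star(x_{0:N}) = \pi^{s,\star}(x_0, x_N)\, p(x_{1:N-1} \mid x_0, x_N)$ where
    $$\pi^{s,\star} = \arg\min\{-\mathbb{E}_{\pi^s}[\log p_{N|0}(X_N \mid X_0)] - H(\pi^s) : \pi^s_0 = p_{\mathrm{data}},\ \pi^s_N = p_{\mathrm{ref}}\}$$
    and, if $p_{N|0}(x_N \mid x_0) = \mathcal{N}(x_N; x_0, \sigma^2)$, then
    $$\pi^{s,\star} = \arg\min\{\mathbb{E}_{\pi^s}[\|X_0 - X_N\|^2] - 2\sigma^2 H(\pi^s) : \pi^s_0 = p_{\mathrm{data}},\ \pi^s_N = p_{\mathrm{ref}}\}.$$
    This is an entropy-regularized Wasserstein-2 problem: as $\sigma \to 0$, $\pi^{s,\star}$ converges to the optimal transport plan (Mikami, 2004).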
  • 11. Solving the Schrödinger Bridge Problem
    The SB problem can be solved using Iterative Proportional Fitting (IPF) (Fortet, 1940; Kullback, 1968), i.e. set $\pi^0 = p$ and for $n \geq 0$
    $$\pi^{2n+1} = \arg\min\{\mathrm{KL}(\pi \| \pi^{2n}) : \pi_N = p_{\mathrm{ref}}\},$$
    $$\pi^{2n+2} = \arg\min\{\mathrm{KL}(\pi \| \pi^{2n+1}) : \pi_0 = p_{\mathrm{data}}\}.$$
    $\lim_{n \to +\infty} \pi^n = \pi^\star$ under regularity conditions (Rüschendorf, 1995; Léger, 2021; De Bortoli et al., 2021).
    Explicit solution of the first IPF step:
    $$\mathrm{KL}(\pi \| \pi^0) = \mathrm{KL}(\pi_N \| p_N) + \mathbb{E}_{\pi_N}[\mathrm{KL}(\pi_{|N} \| p_{|N})].$$
    Therefore,
    $$\pi^1(x_{0:N}) = p_{\mathrm{ref}}(x_N)\, p(x_{0:N-1} \mid x_N) = p_{\mathrm{ref}}(x_N) \prod_{k=0}^{N-1} p_{k|k+1}(x_k \mid x_{k+1}).$$
    Take-home message: an approximation to the first iteration of IPF corresponds to current score-based generative models.
  • 12. Solving the Schrödinger Bridge Problem
    The second iteration requires solving
    $$\pi^2 = \arg\min\{\mathrm{KL}(\pi \| \pi^1) : \pi_0 = p_{\mathrm{data}}\}.$$
    Therefore,
    $$\pi^2(x_{0:N}) = p_{\mathrm{data}}(x_0)\, \pi^1(x_{1:N} \mid x_0) = p_{\mathrm{data}}(x_0) \prod_{k=0}^{N-1} \pi^1_{k+1|k}(x_{k+1} \mid x_k).$$
    On an algorithmic level:
      - IPF1: the time-reversal of the forward process $\pi^0 = p$ is initialized by $p_{\mathrm{ref}}$ at time $N$ to define the backward process $\pi^1$.
      - IPF2: the time-reversal of the backward process $\pi^1$ is initialized by $p_{\mathrm{data}}$ at time $0$ to define the forward process $\pi^2$.
      - IPF3: the time-reversal of the forward process $\pi^2$ is initialized by $p_{\mathrm{ref}}$ at time $N$ to define the backward process $\pi^3$.
      - ...
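The alternating projections of IPF are easiest to see on a finite state space, where the joint law of $(x_0, x_N)$ is a matrix and each KL projection reduces to rescaling columns or rows. The sketch below is a static, discrete analogue chosen for illustration (it is not the Diffusion Schrödinger Bridge algorithm itself); the grid, the bimodal `p_data`, the Gaussian `p_ref` and the kernel width `sigma` are all hypothetical.

```python
import numpy as np

def ipf(joint, p_data, p_ref, n_iter=500):
    """Alternate KL projections of a joint pmf on (x_0, x_N) onto the two marginal constraints."""
    pi = joint.copy()
    for _ in range(n_iter):
        # Odd step: keep pi(x_0 | x_N), replace the x_N marginal by p_ref.
        pi = pi * (p_ref / pi.sum(axis=0))[None, :]
        # Even step: keep pi(x_N | x_0), replace the x_0 marginal by p_data.
        pi = pi * (p_data / pi.sum(axis=1))[:, None]
    return pi

# Hypothetical one-dimensional grid and marginals.
grid = np.linspace(-3.0, 3.0, 31)
p_data = np.exp(-(grid - 1.5) ** 2 / 0.5) + np.exp(-(grid + 1.5) ** 2 / 0.5)
p_data /= p_data.sum()
p_ref = np.exp(-grid ** 2 / 2.0)
p_ref /= p_ref.sum()

# Reference joint p(x_0, x_N) = p_data(x_0) * Gaussian transition kernel (rows: x_0, columns: x_N).
sigma = 2.0
kernel = np.exp(-(grid[:, None] - grid[None, :]) ** 2 / (2.0 * sigma ** 2))
kernel /= kernel.sum(axis=1, keepdims=True)
joint = p_data[:, None] * kernel

pi_star = ipf(joint, p_data, p_ref)
print(np.abs(pi_star.sum(axis=0) - p_ref).max(), np.abs(pi_star.sum(axis=1) - p_data).max())
```

After enough iterations both marginals of `pi_star` match their targets, and the first column rescaling alone reproduces the structure $\pi^1 = p_{\mathrm{ref}}(x_N)\, p(x_0 \mid x_N)$ described above.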
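  • 13. Continuous-Time IPF
    IPF can be formulated in continuous time:
    $$\Pi^\star = \arg\min\{\mathrm{KL}(\Pi \| P) : \Pi \in \mathcal{P}(\mathcal{C}),\ \Pi_0 = p_{\mathrm{data}},\ \Pi_T = p_{\mathrm{ref}}\}.$$
    Similarly, we define the IPF sequence $(\Pi^n)$ recursively, with $\Pi^0 = P$, using
    $$\Pi^{2n+1} = \arg\min\{\mathrm{KL}(\Pi \| \Pi^{2n}) : \Pi \in \mathcal{P}(\mathcal{C}),\ \Pi_T = p_{\mathrm{ref}}\},$$
    $$\Pi^{2n+2} = \arg\min\{\mathrm{KL}(\Pi \| \Pi^{2n+1}) : \Pi \in \mathcal{P}(\mathcal{C}),\ \Pi_0 = p_{\mathrm{data}}\}.$$
    Under regularity conditions, the iterates correspond to the diffusions
    $$(\Pi^{2n+1})^R \to dY^{2n+1}_t = b^n_{T-t}(Y^{2n+1}_t)\, dt + \sqrt{2}\, dB_t, \qquad Y^{2n+1}_0 \sim p_{\mathrm{ref}},$$
    $$\Pi^{2n+2} \to dX^{2n+2}_t = f^{n+1}_t(X^{2n+2}_t)\, dt + \sqrt{2}\, dB_t, \qquad X^{2n+2}_0 \sim p_{\mathrm{data}},$$
    where
    $$b^n_t(x) = -f^n_t(x) + 2 \nabla \log p^n_t(x), \qquad f^{n+1}_t(x) = -b^n_t(x) + 2 \nabla \log q^n_t(x),$$
    with $f^0_t(x) = f(x)$, and $p^n_t$, $q^n_t$ the densities of $\Pi^{2n}_t$ and $\Pi^{2n+1}_t$.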
  • 14. Diffusion Schrödinger Bridge
    Sample generation: $X_N \sim p_{\mathrm{ref}}$ and $X_{k-1} = B_{\beta^L}(k, X_k) + \sqrt{2\gamma_k}\, Z_k$.
  • 15. Diffusion Schrödinger Bridge: 2D example. Diffusion Schrödinger Bridge (DSB) gives a solution to the "small time problem". Approximation of Optimal Transport.
  • 16. Applications: 2D distributions. Data distributions $p_{\mathrm{data}}$ vs distribution at $t = 0$ for $T = 0.2$ after 1 and 20 DSB steps.
  • 17. Applications: Downscaled CelebA. Generative model for CelebA after 10 DSB steps with $N = 50$, $T = 0.63$ ($d = 32 \times 32 \times 3 = 3072$).
  • 18. Applications: Datasets Interpolation. First row: Swiss-roll to S-curve (2D). Step 9 of DSB with $T = 1$ ($N = 50$). From left to right: $t = 0, 0.4, 0.6, 1$. Second row: EMNIST to MNIST. Step 10 of DSB with $T = 1.5$ ($N = 30$). From left to right: $t = 0, 0.4, 1.25, 1.5$.
  • 19. Discussion
    Quick summary:
      - Theoretical results for denoising diffusion models.
      - Generative modeling can be reformulated as a Schrödinger Bridge problem.
      - Diffusion Schrödinger Bridge approximates its solution using (discretized) forward-backward diffusions and score matching ideas.
  • 20. References
      - V. De Bortoli, J. Thornton, J. Heng & A. Doucet, Diffusion Schrödinger bridge with applications to score-based generative modeling. NeurIPS 2021.
      - V. De Bortoli, G. Deligiannidis & A. Doucet, Quantitative uniform stability of the iterative proportional fitting procedure. arXiv:2108.08129.
      - J. Ho, A. Jain & P. Abbeel, Denoising diffusion probabilistic models. NeurIPS 2020.
      - Y. Song, J. Sohl-Dickstein, D.P. Kingma, A. Kumar, S. Ermon & B. Poole, Score-based generative modeling through stochastic differential equations. ICLR 2021.