Equivariant deep learning enables unsupervised learning of inverse problems from measurements alone by exploiting signal symmetries. For the underlying signal set to be uniquely identified, the measurement operator must not be equivariant to the symmetry group. If the signal set is low-dimensional and the symmetry group is large, the number of measurements needed matches that of supervised signal recovery. This approach generalizes supervised training by allowing learning from unlabeled data.
4. Examples
Magnetic resonance imaging
• 𝐴 = subset of Fourier modes (𝑘-space) of 2D/3D images
Image inpainting
• 𝐴 = diagonal matrix with 1's and 0's
Computed tomography
• 𝐴 = 1D projections (sinograms) of 2D image
[Images: signal 𝑥 and measurements 𝑦 for each modality]
5. Why Is It Hard to Invert?
Even in the absence of noise, infinitely many 𝑥 are consistent with 𝑦:
𝑥 = 𝐴†𝑦 + 𝑣
where 𝐴† is the pseudo-inverse of 𝐴 and 𝑣 is any vector in the nullspace of 𝐴.
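As a concrete illustration, here is a minimal numpy/scipy sketch (the toy operator and sizes are hypothetical, not from the slides): adding any null-space vector 𝑣 to the pseudo-inverse solution leaves the measurements unchanged.

import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
m, n = 4, 8                        # fewer measurements than unknowns
A = rng.standard_normal((m, n))    # generic underdetermined operator
x = rng.standard_normal(n)         # ground-truth signal
y = A @ x                          # noiseless measurements

x_min = np.linalg.pinv(A) @ y      # minimum-norm solution A† y
V = null_space(A)                  # orthonormal basis of null(A), shape (n, n - m)
v = V @ rng.standard_normal(n - m) # arbitrary null-space component

print(np.allclose(A @ x_min, y))        # True
print(np.allclose(A @ (x_min + v), y))  # True: different signal, same data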
6. Low-Dimensionality Prior
Idea: most natural signal sets 𝒳 are low-dimensional.
• Mathematically, we use the box-counting dimension: boxdim(𝒳) = 𝑘 ≪ 𝑛
Theorem: A signal 𝑥 belonging to a set 𝒳 ⊂ ℝ𝑛 with boxdim(𝒳) = 𝑘 can be uniquely recovered from the measurement 𝑦 = 𝐴𝑥 with almost every 𝐴 ∈ ℝ𝑚×𝑛 if 𝑚 > 2𝑘.
7. Symmetry Prior
Idea: most natural signal sets 𝒳 are invariant to groups of transformations.
Example: natural images are translation invariant.
• Mathematically, a set 𝒳 is invariant to a group of transformations {𝑇𝑔 ∈ ℝ𝑛×𝑛}𝑔∈𝐺 if ∀𝑥 ∈ 𝒳, ∀𝑔 ∈ 𝐺: 𝑇𝑔𝑥 ∈ 𝒳
Other symmetries: rotations, permutations, amplitude scaling.
8. Group Actions
• A group action of 𝐺 on ℝ𝑛 is a mapping 𝑇: 𝐺 × ℝ𝑛 ↦ ℝ𝑛 that inherits the group axioms:
(multiplication) 𝑇𝑔₂ ∘ 𝑇𝑔₁(𝑥) = 𝑇𝑔₂∘𝑔₁(𝑥)
(inverse) 𝑇𝑔⁻¹(𝑥) = (𝑇𝑔)⁻¹(𝑥)
(identity) 𝑇𝑒(𝑥) = 𝑥
• Let 𝑇 and 𝑇′ be group actions on ℝ𝑛 and ℝ𝑚, respectively. A mapping 𝑓: ℝ𝑛 ↦ ℝ𝑚 is 𝑮-equivariant if 𝑓 ∘ 𝑇𝑔(𝑥) = 𝑇′𝑔 ∘ 𝑓(𝑥) for all 𝑔 ∈ 𝐺, 𝑥 ∈ ℝ𝑛.
• If 𝑓 and 𝑔 are equivariant, then ℎ = 𝑔 ∘ 𝑓 is equivariant.
9. Representations
• A linear representation of a compact group 𝐺 on ℝ𝑛 is a group action 𝑇: 𝐺 ↦ ℝ𝑛×𝑛 represented by invertible 𝑛 × 𝑛 matrices.
• A linear representation can be decomposed into irreps {𝜌𝑔ᵏ ∈ ℝ𝑠ₖ×𝑠ₖ}, 𝑘 = 1, …, 𝐾, of dimension 𝒔ₖ with multiplicity 𝒄ₖ and arbitrary basis 𝐵 ∈ ℝ𝑛×𝑛:
𝑇𝑔 = 𝐵⁻¹ blockdiag(𝜌𝑔¹, …, 𝜌𝑔¹, …, 𝜌𝑔ᴷ, …, 𝜌𝑔ᴷ) 𝐵
where each irrep 𝜌𝑔ᵏ is repeated 𝑐ₖ times along the diagonal.
Example: shift matrices, where 𝐵 is the Fourier transform and 𝜌𝑔ᵏ = 𝑒^(−i2𝜋𝑘𝑔/𝑛) (𝑠ₖ = 𝑐ₖ = 1) for 𝑘 = 1, …, 𝑛.
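The shift example can be verified numerically. A small numpy check (assuming the unitary DFT as the basis 𝐵 and a cyclic shift matrix as 𝑇𝑔):

import numpy as np

n, g = 8, 3                                  # signal length, shift amount
T = np.roll(np.eye(n), g, axis=0)            # cyclic shift matrix T_g

F = np.fft.fft(np.eye(n)) / np.sqrt(n)       # unitary DFT matrix (basis B)
D = F @ T @ np.conj(F).T                     # B T_g B^{-1}

# Off-diagonal entries vanish: shifts are diagonalized by the Fourier basis,
# with 1x1 irreps rho_g^k = exp(-i 2 pi k g / n) on the diagonal.
print(np.allclose(D, np.diag(np.diag(D)), atol=1e-10))           # True
k = np.arange(n)
print(np.allclose(np.diag(D), np.exp(-2j * np.pi * k * g / n)))  # True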
10. Regularised Reconstruction
Standard regularisation approach:
argmin𝑥 ‖𝐴𝑥 − 𝑦‖² + 𝐽(𝑥)
• 𝐽(𝑥) enforces low-dimensionality and is 𝐺-invariant: 𝐽(𝑇𝑔𝑥) = 𝐽(𝑥)
Examples: total variation (shift invariant), sparsity (permutation invariant)
Disadvantages: hard to define a good 𝐽(𝑥) in real-world problems; loose with respect to the true 𝒳
11. Learning Approach
Idea: use training pairs of signals and measurements {(𝑥𝑖, 𝑦𝑖)}𝑖 to directly learn the inversion function:
argmin𝜃 Σ𝑖 ‖𝑥𝑖 − 𝑓𝜃(𝑦𝑖)‖²
where 𝑓𝜃: ℝ𝑚 ↦ ℝ𝑛 is a deep neural network with parameters 𝜃.
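A minimal PyTorch sketch of this supervised objective; the fully connected network, the random operator and the random training pairs are placeholders standing in for a real model and dataset.

import torch

m, n = 32, 64
A = torch.randn(m, n)                            # fixed forward operator (placeholder)
f = torch.nn.Sequential(                         # toy reconstruction network f_theta
    torch.nn.Linear(m, 128), torch.nn.ReLU(), torch.nn.Linear(128, n))
opt = torch.optim.Adam(f.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(256, n)                      # stand-in for training signals x_i
    y = x @ A.T                                  # paired measurements y_i = A x_i
    loss = (f(y) - x).pow(2).sum(dim=1).mean()   # sum_i ||x_i - f_theta(y_i)||^2
    opt.zero_grad(); loss.backward(); opt.step()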
12. Learning Approach
Advantages:
• State-of-the-art reconstructions
• Once trained, 𝑓𝜃 is easy to evaluate
[Images: ×8 accelerated MRI [Zbontar et al., 2019]: ground truth, total variation (28.2 dB), deep network (34.5 dB)]
13. Tutorial Goals
1. Incorporate symmetry into the design of 𝑓𝜃
2. Use symmetry to generalize to unseen poses and noise levels
3. Use symmetry to enable fully unsupervised learning
14. Equivariant Nets
Convolutional neural networks (CNNs) are translation equivariant.
• Going beyond translations: make every block equivariant!
[Diagram: 𝑦 → 𝑊₁ → 𝜙 → … → 𝑊𝐿 → 𝜙 → 𝑓𝜃(𝑦)]
15. Equivariant Nets
• Steerable linear layers [Cohen et al., 2016], [Serre, 1971]:
𝑆 = {𝑊 ∈ ℝ𝑝×𝑛 : 𝑇𝑔𝑊 = 𝑊𝑇𝑔, ∀𝑔 ∈ 𝐺} = span(𝜓1, …, 𝜓𝑟)
𝑊𝜃 = 𝜃1𝜓1 + ⋯ + 𝜃𝑟𝜓𝑟
• Parameter efficiency: dim(𝑆)/dim(ℝ𝑝×𝑛) = 𝑟/(𝑝𝑛)
• Non-linearities: elementwise 𝜙 works when 𝑇𝑔 is represented by permutation matrices.
• In practice, choose 𝑇𝑔 (multiplicities = channels) at each layer and use existing libraries.
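For the cyclic shift group, 𝑆 is the set of circulant matrices, so a basis {𝜓𝑖} is given by powers of the one-step shift. A minimal PyTorch sketch of such a steerable layer (the group choice and sizes are illustrative):

import torch

n = 16
P = torch.roll(torch.eye(n), 1, dims=0)        # one-step cyclic shift (generator)
psi = [torch.linalg.matrix_power(P, i) for i in range(n)]  # basis psi_1, ..., psi_r of S

theta = torch.randn(n, requires_grad=True)     # r = n free parameters
W = sum(t * p for t, p in zip(theta, psi))     # W_theta = sum_i theta_i psi_i (circulant)

# Parameter efficiency: r/(p n) = n/n^2 = 1/n compared to a dense layer.
# W commutes with every shift T_g, so the layer is shift-equivariant:
g = 5
T_g = torch.linalg.matrix_power(P, g)
x = torch.randn(n)
print(torch.allclose(W @ (T_g @ x), T_g @ (W @ x), atol=1e-5))  # True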
16. Equivariance in Inverse Problems
What is the role of the invariance of 𝒳 when solving inverse problems?
System equivariance: let 𝑓𝜃: 𝑦 ↦ 𝑥 be the reconstruction function; then
𝑓𝜃 ∘ 𝐴 ∘ 𝑇𝑔(𝑥) = 𝑇𝑔 ∘ 𝑓𝜃 ∘ 𝐴(𝑥) ∀𝑔 ∈ 𝐺, ∀𝑥 ∈ 𝒳
[Diagram: 𝐴 followed by 𝑓𝜃]
26. Limits to Supervised Learning
Main disadvantage: obtaining training signals 𝑥𝑖 can be expensive or impossible.
• Medical and scientific imaging
• Only solves inverse problems for which we already know what to expect
• Risk of training with signals from a different distribution (train vs. test)
27. Learning Approach
Can we learn from only the measurements 𝒚?
argmin𝜃 Σ𝑖 ‖𝑦𝑖 − 𝐴𝑓𝜃(𝑦𝑖)‖²
Proposition: Any 𝑓𝜃(𝑦) = 𝐴†𝑦 + 𝑔𝜃(𝑦), where 𝑔𝜃: ℝ𝑚 ↦ 𝒩(𝐴) is a function whose image belongs to the nullspace of 𝐴, obtains zero training error.
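The proposition can be seen in a toy numpy experiment (the operator and the parametric 𝑔𝜃 are hypothetical): very different reconstruction functions all achieve zero measurement-consistency error.

import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)
m, n = 4, 8
A = rng.standard_normal((m, n))
A_pinv = np.linalg.pinv(A)
V = null_space(A)                       # orthonormal basis of null(A)

def f(y, w):                            # toy f_theta(y) = A† y + g_theta(y)
    return A_pinv @ y + (w * y[0]) * V[:, 0]   # g_theta maps into null(A)

y = A @ rng.standard_normal(n)
for w in (0.0, 1.0, -3.7):              # very different functions f ...
    print(np.linalg.norm(y - A @ f(y, w)))  # ... all fit y exactly (~1e-15)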
28. Exploiting Invariance
How can we learn from only 𝒚? We need some prior information.
For all 𝑔 ∈ 𝐺 we have
𝑦 = 𝐴𝑥 = 𝐴𝑇𝑔𝑇𝑔⁻¹𝑥 = 𝐴𝑔𝑥′
• Implicit access to multiple operators 𝐴𝑔 = 𝐴𝑇𝑔
• Each operator has a different nullspace
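A tiny numpy illustration with a hypothetical inpainting operator: each virtual operator 𝐴𝑔 = 𝐴𝑇𝑔 observes a different set of coordinates, hence has a different nullspace.

import numpy as np

n = 6
A = np.eye(n)[:3]                      # inpainting: observe the first 3 pixels only
T = np.roll(np.eye(n), 2, axis=0)      # cyclic shift by g = 2
A_g = A @ T                            # virtual operator A T_g

# The two operators see different coordinates, hence different nullspaces:
print(A.nonzero()[1])                  # observed pixels of A:   [0 1 2]
print(A_g.nonzero()[1])                # observed pixels of A_g: [4 5 0]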
29. Model Identification
Can we uniquely identify the set of signals 𝒳 ⊂ ℝ𝑛 from the observed measurement sets {𝒴𝑔 = 𝐴𝑇𝑔𝒳}𝑔∈𝐺?
30. Necessary Conditions
Theorem [T., Chen and Davies '22]: Identifying 𝒳 requires that 𝐴 is not equivariant: 𝐴𝑇𝑔 ≠ 𝑇𝑔𝐴.
Proposition [T., Chen and Davies '22]: Identifying 𝒳 from {𝒴𝑔 = 𝐴𝑔𝒳} is possible only if
rank([𝐴𝑇1; … ; 𝐴𝑇|𝐺|]) = 𝑛,
and thus only if 𝑚 ≥ max𝑗 𝑐𝑗/𝑠𝑗 ≥ 𝑛/|𝐺|, where 𝑠𝑗 and 𝑐𝑗 are the dimension and multiplicity of the irreps.
31. Can We Learn Any Set?
• The necessary conditions are independent of 𝒳…
• For finite |𝐺|, some signal distributions cannot be identified if 𝐴 is rank-deficient:
𝔼𝑦{𝑒^(i𝑤⊤𝐴𝑔†𝑦)} = 𝔼𝑥{𝑒^(i𝑥⊤𝐴𝑔†𝐴𝑔𝑤)} = 𝜑(𝐴𝑔†𝐴𝑔𝑤)
• We only observe projections of the characteristic function 𝜑 onto certain subspaces!
32. Sufficient Conditions
Additional assumption: the signal set is low-dimensional.
Theorem [T., Chen and Davies '22]: Let 𝐺 be a compact cyclic group. Identifying a set 𝒳 with box-counting dimension 𝑘 from {𝒴𝑔 = 𝐴𝑇𝑔𝒳}𝑔 is possible with almost every 𝐴 ∈ ℝ𝑚×𝑛 if
𝑚 > 2𝑘 + max𝑗 𝑐𝑗 + 1,
where the 𝑐𝑗 are the multiplicities of the representation and max𝑗 𝑐𝑗 ≥ 𝑛/|𝐺|.
If the group is big, then 𝑚 > 2𝑘 + 2 is sufficient: essentially the same condition as for signal recovery!
33. Consequences
Magnetic resonance imaging
• 𝐴 = subset of Fourier modes
• Equivariant to translations
• Not equivariant to rotations, which have max 𝑐𝑗 ≈ √𝑛
• 𝑚 > 2𝑘 + √𝑛 + 1
Image inpainting
• 𝐴 = diagonal matrix with 1's and 0's
• Not equivariant to translations, which have max 𝑐𝑗 ≈ 1
• 𝑚 > 2𝑘 + 2
Computed tomography
• 𝐴 = 1D projections (sinograms)
• Equivariant to translations
• Not equivariant to rotations, which have max 𝑐𝑗 ≈ √𝑛
• 𝑚 > 2𝑘 + √𝑛 + 1
[Images: signal 𝑥 and measurements 𝑦 for each modality]
34. Equivariant Imaging
How do we enforce equivariance in practice?
• Unrolled equivariant networks might not achieve equivariance of 𝑓𝜃 ∘ 𝐴
• An equivariant prox is not sufficient
Example: the learned prox𝐽(𝑥) = 𝑥 is equivariant, yet the resulting 𝑓𝜃(𝑦) = 𝐴†𝑦 is measurement consistent but not system equivariant
Idea: enforce equivariance during training!
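A minimal PyTorch sketch of training with measurement consistency plus an equivariance penalty, in the spirit of equivariant imaging [Chen, Tachella and Davies, 2021]; the operator, network and shift group are placeholders, not the paper's exact setup.

import torch

m, n = 32, 64
A = torch.randn(m, n)                      # toy forward operator (placeholder)
f = torch.nn.Sequential(torch.nn.Linear(m, 128), torch.nn.ReLU(),
                        torch.nn.Linear(128, n))
opt = torch.optim.Adam(f.parameters(), lr=1e-3)

def T(x):                                  # random cyclic shift as the group action
    g = int(torch.randint(0, x.shape[-1], (1,)))
    return torch.roll(x, shifts=g, dims=-1)

for step in range(100):
    y = torch.randn(64, m)                 # stand-in for real measurements y_i
    x1 = f(y)                              # reconstruction x1 = f_theta(y)
    mc = (x1 @ A.T - y).pow(2).mean()      # measurement consistency: A f(y) ≈ y
    x2 = T(x1)                             # virtual signal T_g x1
    x3 = f(x2 @ A.T)                       # reconstruct its virtual measurement
    ei = (x3 - x2).pow(2).mean()           # equivariance of f ∘ A: f(A T_g x1) ≈ T_g x1
    loss = mc + ei
    opt.zero_grad(); loss.backward(); opt.step()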
36. Robust EI: SURE + EI
Robust equivariant imaging:
argmin𝜃 ℒ𝑆𝑈𝑅𝐸(𝜃) + ℒ𝐸𝐼(𝜃)
where ℒ𝑆𝑈𝑅𝐸(𝜃) = Σ𝑖 ‖𝑦𝑖 − 𝐴𝑓𝜃(𝑦𝑖)‖² − 𝜎²𝑚 + 2𝜎² div(𝐴 ∘ 𝑓𝜃)(𝑦𝑖)
Theorem [Stein, 1981]: Under mild differentiability conditions on the function 𝐴 ∘ 𝑓𝜃, it holds that
𝔼𝑦{ℒ𝑀𝐶(𝜃)} = 𝔼𝑦{ℒ𝑆𝑈𝑅𝐸(𝜃)}
• Similar expressions of ℒ𝑆𝑈𝑅𝐸(𝜃) exist for Poisson, Poisson-Gaussian, etc.
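In practice the divergence term is usually approximated with a single-probe Monte Carlo estimator (as in Ramani et al.'s MC-SURE); a sketch under that assumption, with hypothetical 𝑓 and 𝐴:

import torch

def sure_loss(f, A, y, sigma, eps=1e-3):
    # Unbiased estimate of the clean measurement-consistency loss under
    # Gaussian noise, with a Monte Carlo (Hutchinson) divergence estimate.
    h = lambda u: f(u) @ A.T               # h = A ∘ f_theta, maps R^m -> R^m
    hy = h(y)
    res = (hy - y).pow(2).sum()            # ||y - A f_theta(y)||^2
    b = torch.randn_like(y)                # random probe vector
    div = (b * (h(y + eps * b) - hy)).sum() / eps   # ≈ div(A ∘ f_theta)(y)
    return res - sigma**2 * y.numel() + 2 * sigma**2 * div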
37. Experiments
Task:
• Magnetic resonance imaging
Network:
• 𝑓𝜃 = 𝑔𝜃 ∘ 𝐴†, where 𝑔𝜃 is a U-Net
Comparisons:
• Pseudo-inverse 𝐴†𝑦𝑖 (no training)
• Measurement consistency: 𝐴𝑓𝜃(𝑦𝑖) = 𝑦𝑖
• Fully supervised loss: 𝑓𝜃(𝑦𝑖) = 𝑥𝑖
• Equivariant imaging (unsupervised): 𝐴𝑓𝜃(𝑦𝑖) = 𝑦𝑖 and equivariant 𝐴 ∘ 𝑓𝜃
38. Magnetic Resonance Imaging
• Operator 𝐴 is a subset of Fourier measurements (×2 downsampling)
• Dataset is approximately rotation invariant
[Images: signal 𝑥, measurements 𝑦, and reconstructions: 𝐴†𝑦, measurement consistency, equivariant imaging, fully supervised]
39. Inpainting
• Operator 𝐴 is an inpainting mask (30% of pixels dropped)
• Poisson noise (rate = 10)
• Dataset is approximately translation invariant
[Images: signal 𝑥, measurements 𝑦, and reconstructions: supervised, measurement consistency, robust EI]
40. Computed Tomography
• Operator 𝐴 is a (non-linear variant of the) sparse Radon transform (50 views)
• Mixed Poisson-Gaussian noise
• Dataset is approximately rotation invariant
[Images: clean signal 𝑥, noisy measurements 𝑦, and reconstructions: measurement consistency, robust EI, supervised]
41. Compressed Sensing
• 𝐴 is a random i.i.d. Gaussian matrix
• Shift-invariant MNIST dataset (𝑘 ≈ 12); shifts have max𝑗 𝑐𝑗 = 1
43. Conclusions
• Learn the signal set from data
• Equivariance by design
• Generalize to unseen poses and noise levels
• No ground-truth references needed
• Necessary and sufficient conditions
45. Extras
• Theoretical bounds and algorithms can be extended to the case where we observe data through multiple operators 𝐴1𝑥, 𝐴2𝑥, 𝐴3𝑥
"Unsupervised Learning From Incomplete Measurements for Inverse Problems", Tachella, Chen and Davies, NeurIPS 2022.
46. Papers
[1] "Equivariant Imaging: Learning Beyond the Range Space", Chen, Tachella and Davies, ICCV 2021 (Oral)
[2] "Robust Equivariant Imaging: a fully unsupervised framework for learning to image from noisy and partial measurements", Chen, Tachella and Davies, CVPR 2022
[3] "Unsupervised Learning From Incomplete Measurements for Inverse Problems", Tachella, Chen and Davies, NeurIPS 2022
[4] "Sensing Theorems for Unsupervised Learning in Inverse Problems", Tachella, Chen and Davies, JMLR 2023
[5] "Imaging with Equivariant Deep Learning", Chen, Davies, Ehrhardt, Schönlieb, Sherry and Tachella, IEEE SPM 2023
47. Thanks for your attention!
Tachella.github.io (codes, presentations, and more)