Not Enough Measurements, Too Many Measurements
Michael McCann
Department of Computational Mathematics, Science and Engineering
Michigan State University
UM CSP Seminar, Oct. 22, 2020
My collaborators
● Michael Unser and BIG (EPFL)
– Kyong Jin
– Laurène Donati
– Harshit Gupta
● Sai Ravishankar and SLIM (MSU)
– Avrajit Ghosh
Too Many Measurements!
● e.g., The Cancer Imaging Archive
– 2,223 subjects × hundreds of images
– includes the Patient CT Projection Data Library (AKA the Mayo Clinic data)
Goal: reconstruct meaningful images from measurements PLUS a training set
[Figure: what we have — a training set and measurements — connected by the physics to what we want: the image]
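In symbols, the forward model the deck builds on (a standard linear formulation, written here with the operator notation H used later in the slides, e.g., the Radon transform for CT):

    y = Hx + n,

where x is the unknown image, y the measurements, and n the noise; supervised reconstruction additionally assumes training pairs (x_i, y_i).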
Supervised image reconstruction
● What do we pick for D?
● Where does the training data come from?
● How do we solve the fitting problem?
● What is the structure of F?
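These four questions correspond to the pieces of the supervised fitting problem, which can be written as (a standard formulation, assuming F_θ is a parametric reconstruction map, D a discrepancy, and (x_i, y_i) the training pairs):

    θ* = arg min_θ Σ_i D( F_θ(y_i), x_i ),

i.e., the choice of loss D, of training pairs, of optimizer, and of the architecture of F.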
Themes in designing F
● augment a direct method
– e.g., mix together different FBPs (Pelt et al. 2013), denoise an FBP (Jin et al. 2017)
● take inspiration from variational methods
– unrolling, plug-and-play, ...
– e.g., learn the regularization (Aggarwal et al. 2019; Gupta et al. 2018), learn the gradient (Adler et al. 2017), learn filters and nonlinearities (Hammernik et al. 2017)
● throw away variational methods
– e.g., learn the entire measurement-to-image mapping (Zhu et al. 2018)
● work in the data domain
– e.g., learn to inpaint missing measurements (Ghani et al. 2019)
● many more...
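As a concrete instance of the first theme, a minimal PyTorch sketch of "denoise an FBP" in the spirit of Jin et al. 2017 (their FBPConvNet uses a U-Net; the small residual CNN below is an assumed stand-in, not their architecture):

    import torch
    import torch.nn as nn

    class FBPDenoiser(nn.Module):
        """Post-process an FBP reconstruction with a residual CNN:
        the network only has to learn the artifact/noise component."""
        def __init__(self, channels=32):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
                nn.Conv2d(channels, 1, 3, padding=1),
            )

        def forward(self, fbp):
            return fbp + self.net(fbp)  # residual correction of the FBP

    # training sketch on pairs (fbp_i, x_i) of low-quality FBPs and ground truth:
    model = FBPDenoiser()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    fbp, x = torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64)  # dummy batch
    loss = ((model(fbp) - x) ** 2).mean()
    loss.backward(); opt.step()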
Perspectives
● It’s not about H, it’s about 𝓗
● If we care about MSE, let’s do challenges
– fastMRI, the Low Dose CT Grand Challenge, others?
● Let’s not forget data fidelity and robustness
[Figure: ground truth vs. FBPConvNet reconstruction]
Cryo-EM reconstruction
● projection matching (Penczek et al. 1994)
– the number of unknowns grows with the data :(
● marginalized maximum likelihood (Sigworth 1998)
– marginalization is computationally heavy
– what most software currently does
● method of moments (Kam 1980; Sharon et al. 2020)
– avoids the above problems
– makes efficient use of data
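The middle option can be stated compactly: rather than estimating each particle’s pose, marginalized maximum likelihood (Sigworth 1998) integrates the pose out. Schematically (notation assumed here):

    x* = arg max_x  Π_i ∫ p(y_i | θ, x) p(θ) dθ,

where x is the 3D structure, y_i the particle images, and θ the unknown pose; evaluating the integral over θ for every image at every iteration is what makes this computationally heavy.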
Generative adversarial networks
● given samples of a distribution, generate new samples from the same distribution
● parameterization of the generator is flexible
(image: Goodfellow et al. 2014)
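For reference, the objective from the cited paper (Goodfellow et al. 2014): the generator G and discriminator D play the minimax game

    min_G max_D  E_{x∼p_data}[log D(x)] + E_{z∼p_z}[log(1 − D(G(z)))],

and at equilibrium the generator’s samples match the data distribution.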
Why learn a regularizer?
● can’t be worse than sparsity-based reconstruction
● results in a convex problem
– robust, includes data fidelity, decades of theory
● joins a proven architecture with data adaptivity
● gives a hope of interpreting the learned part
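Concretely, the reconstruction problem with the learned regularizer is (in the deck’s notation, with measurement operator A, data b, and analysis operator W):

    x*(W) = arg min_x  (1/2)||Ax − b||_2^2 + λ||Wx||_1,

which is convex in x for any choice of W; learning only chooses W, so the bullets above (convexity, data fidelity, interpretability of W) all apply.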
Related work
● main approach: relax the ℓ1 term
– Peyré et al. 2011; Mairal et al. 2012; Sprechmann et al. 2013; Chen et al. 2014
Our approach (outline)
1. solve the lower problem at the current W
2. find a (local) closed-form solution, x*(W)
3. substitute x*(W) into the upper-level problem, compute a gradient w.r.t. W, and descend (written out below)
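Written out, the outline is a bilevel optimization (deck notation; training pairs (b_i, x_i) of measurements and ground-truth images assumed):

    min_W  Σ_i ||x*(W; b_i) − x_i||_2^2
    s.t.   x*(W; b) = arg min_x (1/2)||Ax − b||_2^2 + λ||Wx||_1,

with a supervised upper level and the convex reconstruction as the lower level.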
1. Solve the lower-level problem
● a standard, convex problem
– ADMM, Chambolle-Pock (sketch below)
● need fast, accurate solutions
– hyperparameter selection is tough
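A minimal sketch of the lower-level solve via ADMM with the splitting z = Wx (small dense matrices only; an illustration, not the solver used in the talk’s experiments):

    import torch

    def soft(v, t):
        """Soft-thresholding: the proximal operator of t*||.||_1."""
        return torch.sign(v) * torch.clamp(v.abs() - t, min=0.0)

    def admm_lower(A, W, b, lam, rho=1.0, iters=500):
        """Solve min_x 0.5*||Ax - b||^2 + lam*||Wx||_1 by ADMM, z = Wx."""
        x = torch.zeros(A.shape[1])
        z = torch.zeros(W.shape[0])
        u = torch.zeros_like(z)                  # scaled dual variable
        M = A.T @ A + rho * W.T @ W              # x-update system (fixed)
        for _ in range(iters):
            x = torch.linalg.solve(M, A.T @ b + rho * W.T @ (z - u))
            z = soft(W @ x + u, lam / rho)       # prox step on z
            u = u + W @ x - z                    # dual ascent
        return x

Accuracy matters here because the next step reads the sign pattern of Wx* off this solution.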
2. Find a closed-form solution
● uniqueness? no, but
● intuition: in a region (of W-space) where the sign pattern of Wx*(W) does not change, ||Wx||_1 is linear (in symbols below)
– see McCann and Ravishankar 2020 for A = I
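The intuition in symbols (a sketch, assuming Wx*(W) has no zero entries in the region, so the sign vector is locally constant): let s = sign(Wx*(W)). Where s is fixed, ||Wx||_1 = s^T Wx near the solution, so the lower-level problem is locally the smooth problem

    min_x (1/2)||Ax − b||_2^2 + λ s^T Wx,

whose stationarity condition A^T(Ax − b) + λW^T s = 0 gives

    x*(W) = (A^T A)^{-1} (A^T b − λW^T s)

when A^T A is invertible; for A = I this reduces to x*(W) = b − λW^T s. Entries of Wx* that are exactly zero put the solution on a boundary between sign regions and require the more careful analysis of the cited papers.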
2. Find a closed-form solution
● from McCann and Ravishankar 2020
● even better, Ali and Tibshirani 2019
– unique minimum-norm solution when b = 0
3. Find the gradient
● with 1-D signals, PyTorch can autograd these expressions
– Agrawal et al. 2019 makes it even easier
● with images, things get tough
– W is potentially a million × million matrix
– no more SVDs or explicit inverses
– our solution: by hand + PyTorch (sketch below)
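A minimal 1-D denoising sketch of steps 1–3 (A = I, per McCann and Ravishankar 2020; an illustration of the strategy, not the talk’s implementation, and it assumes Wx* has no zero entries so the local closed form x*(W) = b − λW^T s applies):

    import torch

    def supervised_step(W, b, x_true, lam=0.1, lr=1e-2):
        """One gradient step on W for min_x 0.5*||x - b||^2 + lam*||Wx||_1."""
        W = W.clone().requires_grad_(True)
        # step 1: solve the lower-level problem (any convex solver works;
        # crude subgradient descent is used here as a stand-in)
        x = b.clone()
        for _ in range(500):
            g = (x - b) + lam * W.detach().T @ torch.sign(W.detach() @ x)
            x = x - 1e-2 * g
        # step 2: freeze the sign pattern and form the local closed form
        s = torch.sign(W.detach() @ x)     # held constant within the region
        x_star = b - lam * W.T @ s         # depends on W through W^T s
        # step 3: upper-level loss; autograd differentiates through x_star
        loss = ((x_star - x_true) ** 2).sum()
        loss.backward()
        return (W - lr * W.grad).detach(), loss.item()

    # usage: W0 = torch.randn(8, 16); b and x_true are length-16 signals

With images, the same three steps apply, but W is realized as convolutions and the gradient expressions are assembled by hand rather than via explicit matrices.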
Early experiments
● image denoising
● W is a set of eight 3×3 convolutions (sketch below)
● compare to
– BM3D, TV, DCT
– an unsupervised learned regularizer
● training
– SGD with increasing batch size
– takes ~hours
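One way to realize W as "a set of eight 3×3 convolutions" in PyTorch (a sketch; the padding and boundary handling here are assumptions):

    import torch
    import torch.nn as nn

    # W maps a 1-channel image to 8 channels of filter responses,
    # so ||Wx||_1 is the absolute sum of all responses.
    W = nn.Conv2d(1, 8, kernel_size=3, padding=1, bias=False)

    x = torch.randn(1, 1, 64, 64)   # a test image
    reg = W(x).abs().sum()          # the regularizer value ||Wx||_1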
Absolute summed filter responses and filters
● learned filters
– are not orthonormal (neither orthogonal nor unit-norm)
– penalize edges less than the DCT does
Taking a step back
● CNN performance on image reconstruction is due to
– the CNN (architecture)
– training
● remove the training: deep image prior
– Reinhard Heckel’s 1W-MINDS seminar: https://www.youtube.com/watch?v=AvJgmbeupGY
● remove the CNN: learned regularizer
References
● M. Wu, G. C. Lander, and M. A. Herzik, “Sub-2 Angstrom resolution structure determination using single-particle cryo-EM at 200 keV,” Journal of Structural Biology: X, vol. 4, p. 100020, 2020, doi: 10.1016/j.yjsbx.2020.100020.
● P. A. Penczek, R. A. Grassucci, and J. Frank, “The ribosome at improved resolution: New techniques for merging and orientation refinement in 3D cryo-electron microscopy of biological particles,” Ultramicroscopy, vol. 53, no. 3, pp. 251–270, 1994.
● F. J. Sigworth, “A maximum-likelihood approach to single-particle image refinement,” Journal of Structural Biology, vol. 122, no. 3, pp. 328–339, 1998.
● Z. Kam, “The reconstruction of structure from electron micrographs of randomly oriented particles,” Journal of Theoretical Biology, vol. 82, no. 1, Art. no. 1, Jan. 1980, doi: 10.1016/0022-5193(80)90088-0.
● N. Sharon, J. Kileel, Y. Khoo, B. Landa, and A. Singer, “Method of moments for 3D single particle ab initio modeling with non-uniform distribution of viewing angles,” Inverse Problems, vol. 36, no. 4, Art. no. 4, Feb. 2020, doi: 10.1088/1361-6420/ab6139.
● I. Goodfellow et al., “Generative Adversarial Nets,” in Advances in Neural Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, Eds. Curran Associates, Inc., 2014, pp. 2672–2680.
● J. Adler and O. Öktem, “Solving ill-posed inverse problems using iterative deep neural networks,” Inverse Problems, vol. 33, no. 12, p. 124007, Nov. 2017.
● H. Gupta, M. T. McCann, L. Donati, and M. Unser, “CryoGAN: A New Reconstruction Paradigm for Single-Particle Cryo-EM via Deep Adversarial Learning,” bioRxiv, Mar. 2020, doi: 10.1101/2020.03.20.001016.
● G. Peyré and J. M. Fadili, “Learning analysis sparsity priors,” in Sampling Theory and Applications, Singapore, Singapore, May 2011, p. 4.
● J. Mairal, F. Bach, and J. Ponce, “Task-driven dictionary learning,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 791–804, Apr. 2012.
● P. Sprechmann, R. Litman, T. Ben Yakar, A. M. Bronstein, and G. Sapiro, “Supervised sparse analysis and synthesis operators,” in Advances in Neural Information Processing Systems 26, 2013, pp. 908–916.
● M. T. McCann and S. Ravishankar, “Supervised Learning of Sparsity-Promoting Regularizers for Denoising,” arXiv:2006.05521 [eess.IV], Jun. 2020.
● A. Ali and R. J. Tibshirani, “The Generalized Lasso Problem and Uniqueness,” Electronic Journal of Statistics, vol. 13, no. 2, Art. no. 2, 2019, doi: 10.1214/19-ejs1569.
● A. Agrawal, B. Amos, S. Barratt, S. Boyd, S. Diamond, and J. Z. Kolter, “Differentiable Convex Optimization Layers,” in Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, and R. Garnett, Eds. Curran Associates, Inc., 2019, pp. 9562–9574.