Equivariant deep learning enables unsupervised learning of inverse problems from measurements alone by exploiting signal symmetries. For the underlying signal set to be uniquely identified, the measurement operator must not be equivariant to the symmetry group. If the signal set is low-dimensional and the symmetry group is large, the number of measurements needed matches that of supervised signal recovery. This approach generalizes supervised training by allowing learning from unlabeled data.
4. Examples
Magnetic resonance imaging
• 𝐴 = subset of Fourier modes (𝑘-space) of 2D/3D images
Image inpainting
• 𝐴 = diagonal matrix with 1's and 0's
Computed tomography
• 𝐴 = 1D projections (sinograms) of 2D image
[Images: signal 𝑥 and measurements 𝑦 for each modality]
5. Why Is It Hard to Invert?
Even in the absence of noise, infinitely many 𝑥 are consistent with 𝑦:
𝑥 = 𝐴†𝑦 + 𝑣
where 𝐴† is the pseudo-inverse of 𝐴 and 𝑣 is any vector in the nullspace of 𝐴.
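As a concrete illustration, here is a minimal numpy/scipy sketch (the toy operator and sizes are hypothetical, not from the slides): adding any null-space vector 𝑣 to the pseudo-inverse solution leaves the measurements unchanged.

import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
m, n = 4, 8                        # fewer measurements than unknowns
A = rng.standard_normal((m, n))    # generic underdetermined operator
x = rng.standard_normal(n)         # ground-truth signal
y = A @ x                          # noiseless measurements

x_min = np.linalg.pinv(A) @ y      # minimum-norm solution A† y
V = null_space(A)                  # orthonormal basis of null(A), shape (n, n - m)
v = V @ rng.standard_normal(n - m) # arbitrary null-space component

print(np.allclose(A @ x_min, y))        # True
print(np.allclose(A @ (x_min + v), y))  # True: different signal, same data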
6. Low-Dimensionality Prior
Idea: most natural signal sets 𝒳 are low-dimensional.
• Mathematically, we use the box-counting dimension: boxdim(𝒳) = 𝑘 ≪ 𝑛
Theorem: A signal 𝑥 belonging to a set 𝒳 ⊂ ℝ𝑛 with boxdim(𝒳) = 𝑘 can be uniquely recovered from the measurement 𝑦 = 𝐴𝑥 with almost every 𝐴 ∈ ℝ𝑚×𝑛 if 𝑚 > 2𝑘.
7. Symmetry Prior
Idea: most natural signal sets 𝒳 are invariant to groups of transformations.
Example: natural images are translation invariant.
• Mathematically, a set 𝒳 is invariant to a group of transformations {𝑇𝑔 ∈ ℝ𝑛×𝑛}𝑔∈𝐺 if ∀𝑥 ∈ 𝒳, ∀𝑔 ∈ 𝐺: 𝑇𝑔𝑥 ∈ 𝒳
Other symmetries: rotations, permutations, amplitude scaling.
8. Group Actions
• A group action of 𝐺 on ℝ𝑛 is a mapping 𝑇: 𝐺 × ℝ𝑛 ↦ ℝ𝑛 that inherits the group axioms:
(multiplication) 𝑇𝑔₂ ∘ 𝑇𝑔₁(𝑥) = 𝑇𝑔₂∘𝑔₁(𝑥)
(inverse) 𝑇𝑔⁻¹(𝑥) = (𝑇𝑔)⁻¹(𝑥)
(identity) 𝑇𝑒(𝑥) = 𝑥
• Let 𝑇 and 𝑇′ be group actions on ℝ𝑛 and ℝ𝑚, respectively. A mapping 𝑓: ℝ𝑛 ↦ ℝ𝑚 is 𝑮-equivariant if 𝑓 ∘ 𝑇𝑔(𝑥) = 𝑇′𝑔 ∘ 𝑓(𝑥) for all 𝑔 ∈ 𝐺, 𝑥 ∈ ℝ𝑛.
• If 𝑓 and 𝑔 are equivariant, then ℎ = 𝑔 ∘ 𝑓 is equivariant.
9. Representations
• A linear representation of a compact group 𝐺 on ℝ𝑛 is a group action 𝑇: 𝐺 ↦ ℝ𝑛×𝑛 represented by invertible 𝑛 × 𝑛 matrices.
• A linear representation can be decomposed into irreps {𝜌𝑔ᵏ ∈ ℝ𝑠ₖ×𝑠ₖ}, 𝑘 = 1, …, 𝐾, of dimension 𝒔ₖ with multiplicity 𝒄ₖ and arbitrary basis 𝐵 ∈ ℝ𝑛×𝑛:
𝑇𝑔 = 𝐵⁻¹ blockdiag(𝜌𝑔¹, …, 𝜌𝑔¹, …, 𝜌𝑔ᴷ, …, 𝜌𝑔ᴷ) 𝐵
where each irrep 𝜌𝑔ᵏ is repeated 𝑐ₖ times along the diagonal.
Example: shift matrices, where 𝐵 is the Fourier transform and 𝜌𝑔ᵏ = 𝑒^(−i2𝜋𝑘𝑔/𝑛) (𝑠ₖ = 𝑐ₖ = 1) for 𝑘 = 1, …, 𝑛.
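The shift example can be verified numerically. A small numpy check (assuming the unitary DFT as the basis 𝐵 and a cyclic shift matrix as 𝑇𝑔):

import numpy as np

n, g = 8, 3                                  # signal length, shift amount
T = np.roll(np.eye(n), g, axis=0)            # cyclic shift matrix T_g

F = np.fft.fft(np.eye(n)) / np.sqrt(n)       # unitary DFT matrix (basis B)
D = F @ T @ np.conj(F).T                     # B T_g B^{-1}

# Off-diagonal entries vanish: shifts are diagonalized by the Fourier basis,
# with 1x1 irreps rho_g^k = exp(-i 2 pi k g / n) on the diagonal.
print(np.allclose(D, np.diag(np.diag(D)), atol=1e-10))           # True
k = np.arange(n)
print(np.allclose(np.diag(D), np.exp(-2j * np.pi * k * g / n)))  # True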
10. Regularised Reconstruction
Standard regularisation approach:
argmin𝑥 ‖𝐴𝑥 − 𝑦‖² + 𝐽(𝑥)
• 𝐽(𝑥) enforces low-dimensionality and is 𝐺-invariant: 𝐽(𝑇𝑔𝑥) = 𝐽(𝑥)
Examples: total variation (shift invariant), sparsity (permutation invariant)
Disadvantages: hard to define a good 𝐽(𝑥) in real-world problems; loose with respect to the true 𝒳
11. Learning Approach
Idea: use training pairs of signals and measurements {(𝑥𝑖, 𝑦𝑖)}𝑖 to directly learn the inversion function:
argmin𝜃 Σ𝑖 ‖𝑥𝑖 − 𝑓𝜃(𝑦𝑖)‖²
where 𝑓𝜃: ℝ𝑚 ↦ ℝ𝑛 is a deep neural network with parameters 𝜃.
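A minimal PyTorch sketch of this supervised objective; the fully connected network, the random operator and the random training pairs are placeholders standing in for a real model and dataset.

import torch

m, n = 32, 64
A = torch.randn(m, n)                            # fixed forward operator (placeholder)
f = torch.nn.Sequential(                         # toy reconstruction network f_theta
    torch.nn.Linear(m, 128), torch.nn.ReLU(), torch.nn.Linear(128, n))
opt = torch.optim.Adam(f.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(256, n)                      # stand-in for training signals x_i
    y = x @ A.T                                  # paired measurements y_i = A x_i
    loss = (f(y) - x).pow(2).sum(dim=1).mean()   # sum_i ||x_i - f_theta(y_i)||^2
    opt.zero_grad(); loss.backward(); opt.step()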
12. Learning Approach
Advantages:
• State-of-the-art reconstructions
• Once trained, 𝑓𝜃 is easy to evaluate
[Images: ×8 accelerated MRI [Zbontar et al., 2019]: ground truth, total variation (28.2 dB), deep network (34.5 dB)]
13. Tutorial Goals
1. Incorporate symmetry into the design of 𝑓𝜃
2. Use symmetry to generalize to unseen poses and noise levels
3. Use symmetry to enable fully unsupervised learning
14. Equivariant Nets
Convolutional neural networks (CNNs) are translation equivariant.
• Going beyond translations: make every block equivariant!
[Diagram: 𝑦 → 𝑊₁ → 𝜙 → … → 𝑊𝐿 → 𝜙 → 𝑓𝜃(𝑦)]
15. Equivariant Nets
• Steerable linear layers [Cohen et al., 2016], [Serre, 1971]:
𝑆 = {𝑊 ∈ ℝ𝑝×𝑛 : 𝑇𝑔𝑊 = 𝑊𝑇𝑔, ∀𝑔 ∈ 𝐺} = span(𝜓1, …, 𝜓𝑟)
𝑊𝜃 = 𝜃1𝜓1 + ⋯ + 𝜃𝑟𝜓𝑟
• Parameter efficiency: dim(𝑆)/dim(ℝ𝑝×𝑛) = 𝑟/(𝑝𝑛)
• Non-linearities: elementwise 𝜙 works when 𝑇𝑔 is represented by permutation matrices.
• In practice, choose 𝑇𝑔 (multiplicities = channels) at each layer and use existing libraries.
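For the cyclic shift group, 𝑆 is the set of circulant matrices, so a basis {𝜓𝑖} is given by powers of the one-step shift. A minimal PyTorch sketch of such a steerable layer (the group choice and sizes are illustrative):

import torch

n = 16
P = torch.roll(torch.eye(n), 1, dims=0)        # one-step cyclic shift (generator)
psi = [torch.linalg.matrix_power(P, i) for i in range(n)]  # basis psi_1, ..., psi_r of S

theta = torch.randn(n, requires_grad=True)     # r = n free parameters
W = sum(t * p for t, p in zip(theta, psi))     # W_theta = sum_i theta_i psi_i (circulant)

# Parameter efficiency: r/(p n) = n/n^2 = 1/n compared to a dense layer.
# W commutes with every shift T_g, so the layer is shift-equivariant:
g = 5
T_g = torch.linalg.matrix_power(P, g)
x = torch.randn(n)
print(torch.allclose(W @ (T_g @ x), T_g @ (W @ x), atol=1e-5))  # True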
16. Equivariance in Inverse Problems
What is the role of the invariance of 𝒳 when solving inverse problems?
System equivariance: let 𝑓𝜃: 𝑦 ↦ 𝑥 be the reconstruction function; then
𝑓𝜃 ∘ 𝐴 ∘ 𝑇𝑔(𝑥) = 𝑇𝑔 ∘ 𝑓𝜃 ∘ 𝐴(𝑥) ∀𝑔 ∈ 𝐺, ∀𝑥 ∈ 𝒳
[Diagram: 𝐴 followed by 𝑓𝜃]
26. Limits to Supervised Learning
Main disadvantage: obtaining training signals 𝑥𝑖 can be expensive or impossible.
• Medical and scientific imaging
• Only solves inverse problems for which we already know what to expect
• Risk of training with signals from a different distribution (train vs. test)
27. Learning Approach
Can we learn from only the measurements 𝒚?
argmin𝜃 Σ𝑖 ‖𝑦𝑖 − 𝐴𝑓𝜃(𝑦𝑖)‖²
Proposition: Any 𝑓𝜃(𝑦) = 𝐴†𝑦 + 𝑔𝜃(𝑦), where 𝑔𝜃: ℝ𝑚 ↦ 𝒩(𝐴) is a function whose image belongs to the nullspace of 𝐴, obtains zero training error.
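The proposition can be seen in a toy numpy experiment (the operator and the parametric 𝑔𝜃 are hypothetical): very different reconstruction functions all achieve zero measurement-consistency error.

import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)
m, n = 4, 8
A = rng.standard_normal((m, n))
A_pinv = np.linalg.pinv(A)
V = null_space(A)                       # orthonormal basis of null(A)

def f(y, w):                            # toy f_theta(y) = A† y + g_theta(y)
    return A_pinv @ y + (w * y[0]) * V[:, 0]   # g_theta maps into null(A)

y = A @ rng.standard_normal(n)
for w in (0.0, 1.0, -3.7):              # very different functions f ...
    print(np.linalg.norm(y - A @ f(y, w)))  # ... all fit y exactly (~1e-15)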
28. Exploiting Invariance
How can we learn from only 𝒚? We need some prior information.
For all 𝑔 ∈ 𝐺 we have
𝑦 = 𝐴𝑥 = 𝐴𝑇𝑔𝑇𝑔⁻¹𝑥 = 𝐴𝑔𝑥′
• Implicit access to multiple operators 𝐴𝑔 = 𝐴𝑇𝑔
• Each operator has a different nullspace
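A tiny numpy illustration with a hypothetical inpainting operator: each virtual operator 𝐴𝑔 = 𝐴𝑇𝑔 observes a different set of coordinates, hence has a different nullspace.

import numpy as np

n = 6
A = np.eye(n)[:3]                      # inpainting: observe the first 3 pixels only
T = np.roll(np.eye(n), 2, axis=0)      # cyclic shift by g = 2
A_g = A @ T                            # virtual operator A T_g

# The two operators see different coordinates, hence different nullspaces:
print(A.nonzero()[1])                  # observed pixels of A:   [0 1 2]
print(A_g.nonzero()[1])                # observed pixels of A_g: [4 5 0]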
29. Model Identification
Can we uniquely identify the set of signals 𝒳 ⊂ ℝ𝑛 from the observed measurement sets {𝒴𝑔 = 𝐴𝑇𝑔𝒳}𝑔∈𝐺?
30. Necessary Conditions
Theorem [T., Chen and Davies '22]: Identifying 𝒳 requires that 𝐴 is not equivariant: 𝐴𝑇𝑔 ≠ 𝑇𝑔𝐴.
Proposition [T., Chen and Davies '22]: Identifying 𝒳 from {𝒴𝑔 = 𝐴𝑔𝒳} is possible only if
rank([𝐴𝑇1; … ; 𝐴𝑇|𝐺|]) = 𝑛,
and thus only if 𝑚 ≥ max𝑗 𝑐𝑗/𝑠𝑗 ≥ 𝑛/|𝐺|, where 𝑠𝑗 and 𝑐𝑗 are the dimension and multiplicity of the irreps.
31. Can We Learn Any Set?
• The necessary conditions are independent of 𝒳…
• For finite |𝐺|, some signal distributions cannot be identified if 𝐴 is rank-deficient:
𝔼𝑦{𝑒^(i𝑤⊤𝐴𝑔†𝑦)} = 𝔼𝑥{𝑒^(i𝑥⊤𝐴𝑔†𝐴𝑔𝑤)} = 𝜑(𝐴𝑔†𝐴𝑔𝑤)
• We only observe projections of the characteristic function 𝜑 onto certain subspaces!
32. Sufficient Conditions
Additional assumption: the signal set is low-dimensional.
Theorem [T., Chen and Davies '22]: Let 𝐺 be a compact cyclic group. Identifying a set 𝒳 with box-counting dimension 𝑘 from {𝒴𝑔 = 𝐴𝑇𝑔𝒳}𝑔 is possible with almost every 𝐴 ∈ ℝ𝑚×𝑛 if
𝑚 > 2𝑘 + max𝑗 𝑐𝑗 + 1,
where the 𝑐𝑗 are the multiplicities of the representation and max𝑗 𝑐𝑗 ≥ 𝑛/|𝐺|.
If the group is big, then 𝑚 > 2𝑘 + 2 is sufficient: essentially the same condition as for signal recovery!
33. Consequences
Magnetic resonance imaging
• 𝐴 = subset of Fourier modes
• Equivariant to translations
• Not equivariant to rotations, which have max 𝑐𝑗 ≈ √𝑛
• 𝑚 > 2𝑘 + √𝑛 + 1
Image inpainting
• 𝐴 = diagonal matrix with 1's and 0's
• Not equivariant to translations, which have max 𝑐𝑗 ≈ 1
• 𝑚 > 2𝑘 + 2
Computed tomography
• 𝐴 = 1D projections (sinograms)
• Equivariant to translations
• Not equivariant to rotations, which have max 𝑐𝑗 ≈ √𝑛
• 𝑚 > 2𝑘 + √𝑛 + 1
[Images: signal 𝑥 and measurements 𝑦 for each modality]
34. Equivariant Imaging
How do we enforce equivariance in practice?
• Unrolled equivariant networks might not achieve equivariance of 𝑓𝜃 ∘ 𝐴
• An equivariant prox is not sufficient
Example: the learned prox𝐽(𝑥) = 𝑥 is equivariant, yet the resulting 𝑓𝜃(𝑦) = 𝐴†𝑦 is measurement consistent but not system equivariant
Idea: enforce equivariance during training!
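A minimal PyTorch sketch of training with measurement consistency plus an equivariance penalty, in the spirit of equivariant imaging [Chen, Tachella and Davies, 2021]; the operator, network and shift group are placeholders, not the paper's exact setup.

import torch

m, n = 32, 64
A = torch.randn(m, n)                      # toy forward operator (placeholder)
f = torch.nn.Sequential(torch.nn.Linear(m, 128), torch.nn.ReLU(),
                        torch.nn.Linear(128, n))
opt = torch.optim.Adam(f.parameters(), lr=1e-3)

def T(x):                                  # random cyclic shift as the group action
    g = int(torch.randint(0, x.shape[-1], (1,)))
    return torch.roll(x, shifts=g, dims=-1)

for step in range(100):
    y = torch.randn(64, m)                 # stand-in for real measurements y_i
    x1 = f(y)                              # reconstruction x1 = f_theta(y)
    mc = (x1 @ A.T - y).pow(2).mean()      # measurement consistency: A f(y) ≈ y
    x2 = T(x1)                             # virtual signal T_g x1
    x3 = f(x2 @ A.T)                       # reconstruct its virtual measurement
    ei = (x3 - x2).pow(2).mean()           # equivariance of f ∘ A: f(A T_g x1) ≈ T_g x1
    loss = mc + ei
    opt.zero_grad(); loss.backward(); opt.step()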
36. Robust EI: SURE + EI
Robust equivariant imaging:
argmin𝜃 ℒ𝑆𝑈𝑅𝐸(𝜃) + ℒ𝐸𝐼(𝜃)
where ℒ𝑆𝑈𝑅𝐸(𝜃) = Σ𝑖 ‖𝑦𝑖 − 𝐴𝑓𝜃(𝑦𝑖)‖² − 𝜎²𝑚 + 2𝜎² div(𝐴 ∘ 𝑓𝜃)(𝑦𝑖)
Theorem [Stein, 1981]: Under mild differentiability conditions on the function 𝐴 ∘ 𝑓𝜃, it holds that
𝔼𝑦{ℒ𝑀𝐶(𝜃)} = 𝔼𝑦{ℒ𝑆𝑈𝑅𝐸(𝜃)}
• Similar expressions of ℒ𝑆𝑈𝑅𝐸(𝜃) exist for Poisson, Poisson-Gaussian, etc.
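In practice the divergence term is usually approximated with a single-probe Monte Carlo estimator (as in Ramani et al.'s MC-SURE); a sketch under that assumption, with hypothetical 𝑓 and 𝐴:

import torch

def sure_loss(f, A, y, sigma, eps=1e-3):
    # Unbiased estimate of the clean measurement-consistency loss under
    # Gaussian noise, with a Monte Carlo (Hutchinson) divergence estimate.
    h = lambda u: f(u) @ A.T               # h = A ∘ f_theta, maps R^m -> R^m
    hy = h(y)
    res = (hy - y).pow(2).sum()            # ||y - A f_theta(y)||^2
    b = torch.randn_like(y)                # random probe vector
    div = (b * (h(y + eps * b) - hy)).sum() / eps   # ≈ div(A ∘ f_theta)(y)
    return res - sigma**2 * y.numel() + 2 * sigma**2 * div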
37. Experiments
Task:
• Magnetic resonance imaging
Network:
• 𝑓𝜃 = 𝑔𝜃 ∘ 𝐴†, where 𝑔𝜃 is a U-Net
Comparisons:
• Pseudo-inverse 𝐴†𝑦𝑖 (no training)
• Measurement consistency: 𝐴𝑓𝜃(𝑦𝑖) = 𝑦𝑖
• Fully supervised loss: 𝑓𝜃(𝑦𝑖) = 𝑥𝑖
• Equivariant imaging (unsupervised): 𝐴𝑓𝜃(𝑦𝑖) = 𝑦𝑖 and equivariant 𝐴 ∘ 𝑓𝜃
38. Magnetic Resonance Imaging
• Operator 𝐴 is a subset of Fourier measurements (×2 downsampling)
• Dataset is approximately rotation invariant
[Images: signal 𝑥, measurements 𝑦, and reconstructions: 𝐴†𝑦, measurement consistency, equivariant imaging, fully supervised]
39. Inpainting
• Operator 𝐴 is an inpainting mask (30% of pixels dropped)
• Poisson noise (rate = 10)
• Dataset is approximately translation invariant
[Images: signal 𝑥, measurements 𝑦, and reconstructions: supervised, measurement consistency, robust EI]
40. Computed Tomography
• Operator 𝐴 is a (non-linear variant of the) sparse Radon transform (50 views)
• Mixed Poisson-Gaussian noise
• Dataset is approximately rotation invariant
[Images: clean signal 𝑥, noisy measurements 𝑦, and reconstructions: measurement consistency, robust EI, supervised]
41. Compressed Sensing
• 𝐴 is a random i.i.d. Gaussian matrix
• Shift-invariant MNIST dataset (𝑘 ≈ 12); shifts have max𝑗 𝑐𝑗 = 1
43. Conclusions
• Learn the signal set from data
• Equivariance by design
• Generalize to unseen poses and noise levels
• No ground-truth references needed
• Necessary and sufficient conditions
45. Extras
• Theoretical bounds and algorithms can be extended to the case where we observe data through multiple operators 𝐴1𝑥, 𝐴2𝑥, 𝐴3𝑥
"Unsupervised Learning From Incomplete Measurements for Inverse Problems", Tachella, Chen and Davies, NeurIPS 2022.
46. Papers
[1] "Equivariant Imaging: Learning Beyond the Range Space", Chen, Tachella and Davies, ICCV 2021 (Oral)
[2] "Robust Equivariant Imaging: a fully unsupervised framework for learning to image from noisy and partial measurements", Chen, Tachella and Davies, CVPR 2022
[3] "Unsupervised Learning From Incomplete Measurements for Inverse Problems", Tachella, Chen and Davies, NeurIPS 2022
[4] "Sensing Theorems for Unsupervised Learning in Inverse Problems", Tachella, Chen and Davies, JMLR 2023
[5] "Imaging with Equivariant Deep Learning", Chen, Davies, Ehrhardt, Schönlieb, Sherry and Tachella, IEEE SPM 2023
47. Thanks for your attention!
Tachella.github.io (codes, presentations, and more)