Supervised Planetary Unmixing with Optimal Transport
August 23, 2016
Sina Nakhostin, Nicolas Courty, Remi Flamary and Thomas Corpetti
Contact: sina.nakhostin@irisa.fr
IRISA
Université de Bretagne-SUD
France
2. 18
Whispers 2016
Problem Definition
Optimal Transport
(OT)
Unmixing with OT
Experiments and
results
Dept. IRISA
Université de Bretagne-SUD
France
Agenda
Problem Definition
Optimal Transport (OT)
Unmixing with OT
Experiments and results
Supervised Unmixing
Supervised unmixing amounts to projecting each pixel spectrum onto a set of known reference signatures.
Given:
A multi- or hyperspectral dataset.
A dictionary of reference signatures.
Goal:
Produce a set of abundance maps representing the distribution of the different materials within the scene.
Predicament
Endmember Variability
The same material is usually characterized by more than one spectral signature, due to:
Sensing device accuracy
Reflectance angle
Shading effect
etc.
Exploiting Overcomplete Dictionaries is a way to account
for endmember variability.
Predicament
Choice of Distance
What is the best distance measure for comparing dictionary atoms?
Conventional Distance Measures
Euclidean Distance
Spectral Angle Mapper
Spectral Information Divergence
Proposed Measure
A distance measure based on Optimal Transport (OT).
Wasserstein distance (a.k.a. Earth Mover's Distance).
Defined between probability distributions.
Can be designed to be mostly sensitive to shifts in the frequency domain.
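As a minimal illustration of this shift sensitivity (a sketch, not the authors' implementation), the toy example below compares the Euclidean distance with a 1D Wasserstein distance for a unit-mass peak moved by one or three bands. It assumes histograms on a shared band grid with ground cost |i - j|, in which case the Wasserstein distance reduces to the sum of absolute differences between the cumulative sums. The function names and the 8-band signals are hypothetical.

```python
import math
from itertools import accumulate

def euclidean(p, q):
    """Plain L2 distance between two histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def wasserstein_1d(p, q):
    """1D Wasserstein distance for ground cost |i - j| on a shared grid.

    For unit-mass histograms this equals the sum of absolute differences
    between their cumulative distributions.
    """
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

def peak(position, n_bands=8):
    """Hypothetical spectrum: all the mass in a single band."""
    return [1.0 if i == position else 0.0 for i in range(n_bands)]

reference = peak(2)
small_shift = peak(3)   # spectral feature moved by one band
large_shift = peak(5)   # spectral feature moved by three bands

# The Euclidean distance cannot tell the two shifts apart...
d_euc_small = euclidean(reference, small_shift)
d_euc_large = euclidean(reference, large_shift)
# ...whereas the Wasserstein distance grows with the size of the shift.
d_ot_small = wasserstein_1d(reference, small_shift)
d_ot_large = wasserstein_1d(reference, large_shift)
```

In this toy case both Euclidean distances are identical, while the Wasserstein distance is three times larger for the three-band shift.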
Why Optimal Transport after all?
OT lets us treat spectra as probability distributions.
Each spectrum has to be normalized so that its values sum to one across the spectral bands.
Normalization makes the analysis less sensitive to the total power of the spectrum in each pixel.
This improves robustness against shadows or other large radiance changes, and thus can prevent degenerate solutions.
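The effect of normalization can be checked on a toy case. The sketch below uses hypothetical 5-band reflectance values and assumes, for illustration only, that shading acts as a constant multiplicative factor on the spectrum. Under that assumption the sunlit and shaded pixels map to the same probability distribution.

```python
def normalize_spectrum(spectrum):
    """Scale a non-negative spectrum so that its values sum to 1."""
    total = sum(spectrum)
    if total <= 0:
        raise ValueError("spectrum must have positive total mass")
    return [value / total for value in spectrum]

# Hypothetical 5-band reflectance values and a shaded copy of the same pixel
# (assumption: shading scales every band by the same factor).
sunlit = [0.2, 0.4, 0.8, 0.4, 0.2]
shaded = [0.5 * value for value in sunlit]

p_sunlit = normalize_spectrum(sunlit)
p_shaded = normalize_spectrum(shaded)
```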
Contributions
Figure: Transporting 2D probability distributions (courtesy of Cuturi).
In this work we:
Introduce an original unmixing algorithm based on Optimal Transport theory.
Use an efficient optimization scheme based on iterative Bregman projections to solve the underlying problem.
Propose a formulation that accepts an optional prior on the abundances.
Give preliminary results on the challenging asteroid 4-Vesta dataset.
What is Optimal Transport?
Let $\mu_s$ and $\mu_t$ be two discrete probability distributions in $\mathbb{R}^+$.
Let a transport plan be an association (a coupling) between the bins of $\mu_s$ and $\mu_t$.
The Kantorovich formulation of OT looks for an optimal coupling between the two probability distributions with respect to a given metric (see Figure).
Discrete Optimal Transport
Since the distributions are available through a finite number of bins (i.e. spectral bands) in $\mathbb{R}^+$, we can write them as:
$$\mu_s = \sum_{i=1}^{n_s} p_i^s \, \delta_{x_i^s}, \qquad \mu_t = \sum_{i=1}^{n_t} p_i^t \, \delta_{x_i^t}$$
where $\delta_{x_i}$ is the Dirac at location $x_i \in \mathbb{R}^+$, and $p_i^s$ and $p_i^t$ are the probability masses associated with the $i$-th bins.
The set of probability couplings (joint probability distributions) between $\mu_s$ and $\mu_t$ is defined as:
$$\Pi = \{\gamma \in (\mathbb{R}^+)^{n_s \times n_t} \mid \gamma \mathbf{1}_{n_t} = \mu_s;\ \gamma^T \mathbf{1}_{n_s} = \mu_t\}$$
where $n_s$ and $n_t$ are the numbers of bins in $\mu_s$ and $\mu_t$.
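To make the constraint set concrete, the sketch below builds the independent (product) coupling, whose entries are the products of the source and target masses, and checks both marginal constraints. This coupling is always feasible but is generally not the optimal one. The histograms and helper names are hypothetical.

```python
def product_coupling(p_s, p_t):
    """Independent coupling: gamma[i][j] = p_s[i] * p_t[j]."""
    return [[a * b for b in p_t] for a in p_s]

def row_sums(gamma):
    """Marginal over the target bins (gamma times a vector of ones)."""
    return [sum(row) for row in gamma]

def column_sums(gamma):
    """Marginal over the source bins (gamma transposed times ones)."""
    return [sum(col) for col in zip(*gamma)]

# Hypothetical source and target histograms with n_s = 3 and n_t = 2 bins.
p_s = [0.5, 0.3, 0.2]
p_t = [0.6, 0.4]

gamma = product_coupling(p_s, p_t)
source_marginal = row_sums(gamma)     # should recover p_s
target_marginal = column_sums(gamma)  # should recover p_t
```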
Wasserstein Distance
OT seeks the coupling $\gamma$ minimizing the quantity:
$$W_C(\mu_s, \mu_t) = \min_{\gamma \in \Pi(\mu_s, \mu_t)} \langle \gamma, C \rangle_F, \tag{1}$$
where $\langle \cdot, \cdot \rangle_F$ is the Frobenius inner product and $C_{(d \times d)} \geq 0$ is the cost matrix (pairwise distances with respect to a given metric).
Here, $W_C(\mu_s, \mu_t)$ is called the Wasserstein distance.
What about scalability?
Problem (1) is a linear program with equality constraints.
Solving it can be very time consuming.
Entropic Regularization
In order to control the smoothness of the coupling, [Cuturi, 2013] proposes an entropy-based regularization term over $\gamma$, which reads:
$$W_{C,\epsilon}(\mu_s, \mu_t) = \min_{\gamma \in \Pi(\mu_s, \mu_t)} \langle \gamma, C \rangle_F - \underbrace{\epsilon \, h(\gamma)}_{\text{entropy regularizer}}, \tag{2}$$
This allows one to draw a parallel between OT and a Bregman projection:
$$\gamma = \arg\min_{\gamma \in \Pi(\mu_s, \mu_t)} KL(\gamma, \zeta), \tag{3}$$
where $\zeta = \exp(-C/\epsilon)$.
This version of OT admits a simpler resolution method, based on successive projections onto the two marginal constraints.
We use this closed-form projection to solve the unmixing problem.
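The successive projections onto the two marginal constraints can be sketched as follows. This is a generic Sinkhorn-style iteration for plain entropic OT between two fixed histograms, written in plain Python, and not the authors' unmixing algorithm. The 3-bin histograms, the squared-distance ground cost, the regularization strength of 1.0 and the fixed iteration count are all assumptions chosen for illustration.

```python
import math

def sinkhorn(p_s, p_t, cost, reg, n_iter=500):
    """Entropy-regularized OT via alternating marginal projections.

    Starts from the kernel exp(-cost / reg) and rescales its rows and
    columns in turn so that the coupling matches p_s and p_t.
    """
    n_s, n_t = len(p_s), len(p_t)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(n_t)] for i in range(n_s)]
    u = [1.0] * n_s
    v = [1.0] * n_t
    for _ in range(n_iter):
        # Projection onto the row-marginal constraint (gamma 1 = p_s).
        u = [p_s[i] / sum(kernel[i][j] * v[j] for j in range(n_t)) for i in range(n_s)]
        # Projection onto the column-marginal constraint (gamma^T 1 = p_t).
        v = [p_t[j] / sum(kernel[i][j] * u[i] for i in range(n_s)) for j in range(n_t)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(n_t)] for i in range(n_s)]

# Hypothetical histograms on three bins with squared-distance ground cost.
p_s = [0.7, 0.2, 0.1]
p_t = [0.1, 0.2, 0.7]
cost = [[float((i - j) ** 2) for j in range(3)] for i in range(3)]

gamma = sinkhorn(p_s, p_t, cost, reg=1.0)
transport_cost = sum(gamma[i][j] * cost[i][j] for i in range(3) for j in range(3))
```

Each projection is a closed-form rescaling, which is why this scheme avoids a general-purpose linear programming solver. Smaller values of `reg` bring the coupling closer to the unregularized optimum but typically need more iterations.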
Unmixing of the spectrum µ
Let us assume a linear mixture: $\mu = E\alpha$,
where $E_{(d \times q)}$ is the overcomplete dictionary and $\alpha > 0$ is a $q$-vector of abundance values with $\|\alpha\|_1 = 1$.
We seek $p$ abundance values for each pixel, with $p \leq q$ to account for endmember variability.
We also assume prior knowledge $\alpha_{0\,(p \times 1)}$ about the abundances.
The unmixing of $\mu$ is then the solution of the following optimization problem:
$$\alpha = \arg\min_{\alpha} \underbrace{W_{C_0,\epsilon_0}(\mu, E\alpha)}_{\text{data fitting}} + \tau \underbrace{W_{C_1,\epsilon_1}(\alpha, \alpha_0)}_{\text{prior}}. \tag{4}$$
The data-fitting term searches for the best decomposition of the observations. The regularization term enforces compliance of the solution with the prior, balanced by the parameter $\tau \in \mathbb{R}^+$.
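The linear mixing model itself is easy to check numerically. In the sketch below, a hypothetical dictionary E with d = 4 bands and q = 3 atoms has columns that each sum to one, and the abundance vector lies on the simplex. The mixed spectrum is then itself a probability distribution, which is what allows it to be compared with a normalized observation through an OT distance. All values are made up for illustration.

```python
def mix(dictionary, abundances):
    """Linear mixture mu = E @ alpha for a d x q dictionary stored row-wise."""
    return [sum(e * a for e, a in zip(row, abundances)) for row in dictionary]

# Hypothetical dictionary: 4 bands (rows) x 3 atoms (columns).
# Each column is a normalized spectrum, so it sums to 1.
E = [
    [0.1, 0.4, 0.25],
    [0.2, 0.3, 0.25],
    [0.3, 0.2, 0.25],
    [0.4, 0.1, 0.25],
]
alpha = [0.5, 0.3, 0.2]  # non-negative abundances summing to 1

mu = mix(E, alpha)
total_mass = sum(mu)  # stays at 1 because E's columns and alpha are normalized
```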
$C_{0\,(d \times d)}$ and $C_{1\,(q \times p)}$ are, respectively, the cost matrix in the spectral domain and the cost matrix that encodes information about the endmember groups.
The optimization is solved with an algorithm that is also based on iterative Bregman projections. See the paper for details.
4-Vesta dataset
We perform unmixing on a portion of the northern hemisphere of 4-Vesta.
The VIR image has 383 bands covering the ranges:
0.55–1.05 µm with a spectral sampling of 1.8 nm.
1.0–2.5 µm with a spectral sampling of 9.8 nm.
We look for three main lithologies: Eucrite, Orthopyroxene and Olivine.
A dictionary of 10 atoms was used, formed by the signatures of different lithologies found in meteorites.
Cost metric for the sensor (C0)
In order to tailor the cost matrix C0 to the characteristics of the dataset, we build C0(383×383) as the squared Euclidean distance over the spectral values.
This reflects the characteristics of the spectra and the level of (dis)similarity among them.
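A minimal sketch of such a cost matrix follows. It assumes that "spectral values" refers to the band-center wavelengths, so that entry (i, j) of C0 is the squared difference between the wavelengths of bands i and j. The five wavelengths are hypothetical stand-ins for the 383 VIR bands.

```python
def squared_distance_cost(wavelengths):
    """Pairwise squared Euclidean distance between band-center wavelengths."""
    return [[(wi - wj) ** 2 for wj in wavelengths] for wi in wavelengths]

# Hypothetical band centers in micrometres (the real image has 383 bands).
wavelengths_um = [0.55, 0.75, 1.05, 1.50, 2.50]

C0 = squared_distance_cost(wavelengths_um)
n_bands = len(C0)
is_symmetric = all(C0[i][j] == C0[j][i] for i in range(n_bands) for j in range(n_bands))
```

With this choice, moving mass between neighbouring bands is cheap while moving it across the spectrum is expensive, which is what makes the resulting distance sensitive to spectral shifts.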
Cost metric for the materials (C1)
We manually construct C1(10×3) to reflect which endmembers belong to the same material group.
Two endmembers belonging to the same material share a very low cost with the corresponding material in α0, i.e. C1(i,j) = 0.
Priors over material groups α0
We can also encode prior knowledge about the dominance of one material or another through the vector α0(3×1). If no such prior knowledge is available, all priors can be set to the same value, e.g. here 1/3.
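The sketch below shows one way such a group cost matrix and a uniform prior could be assembled. The assignment of the 10 atoms to the three lithologies and the unit cost for non-matching pairs are assumptions made for illustration. The slides only state that matching pairs have zero cost.

```python
MATERIALS = ["Eucrite", "Orthopyroxene", "Olivine"]

# Hypothetical assignment of the 10 dictionary atoms to material groups.
atom_groups = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]

def group_cost(atom_groups, n_materials, mismatch_cost=1.0):
    """C1[i][j] = 0 if atom i belongs to material j, else mismatch_cost."""
    return [
        [0.0 if group == j else mismatch_cost for j in range(n_materials)]
        for group in atom_groups
    ]

C1 = group_cost(atom_groups, len(MATERIALS))      # 10 x 3 matrix
alpha0 = [1.0 / len(MATERIALS)] * len(MATERIALS)  # uninformative prior
```

A non-uniform `alpha0` would express the belief that one lithology dominates the scene.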
Comparison with another method
Figure: abundance maps obtained by OT and by constrained least squares (LS).
Unmixing based on OT reveals interesting patterns in the distribution of each material.
Abundance maps with varying priors
More extensive tests are needed to find the best parametrization.
Conclusion/Perspectives
Conclusion
An unmixing algorithm based on Optimal Transport.
The metric, which is devoted to distributions, is mostly sensitive to shifts in the frequency domain.
Endmember variability is addressed through the use of an overcomplete dictionary.
The cost function is optimized through iterative Bregman projections.
Perspectives
Introducing a new regularization term that accounts for sparsity in the groupings.
A possible candidate is the sparse Group Lasso (or Fused Lasso).