2. Augmented Lagrangian based full-waveform inversion with
Anderson acceleration
K. Aghazade*, A. Gholami*, H. S. Aghamiry**, and S. Operto**
*University of Tehran, Institute of Geophysics, Tehran, Iran
**University of Côte d’Azur, Geoazur, CNRS - IRD - OCA, Valbonne, France
3. Motivation
Is it possible to obtain a more accurate model in fewer iterations?
[Figure panels: IR-WRI; Accelerated IR-WRI]
4. Extended FWI method
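• FWI as a PDE-constrained problem (Haber et al., 2000)
$$\min_{\mathbf{u},\mathbf{m}} \; \frac{1}{2}\|\mathbf{P}\mathbf{u}-\mathbf{d}\|_2^2 \quad \text{subject to} \quad \mathbf{A}(\mathbf{m})\mathbf{u}=\mathbf{b}$$
• m: Model parameters, u: Wavefield, d: Observed data, P: Sampling operator, b: Source term, A: Helmholtz operator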
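Augmented Lagrangian (IR-WRI) (Aghamiry et al., 2019)
$$\min_{\mathbf{u},\mathbf{m}} \max_{\mathbf{v}} \; \frac{1}{2}\|\mathbf{P}\mathbf{u}-\mathbf{d}\|_2^2 + \mathbf{v}^T[\mathbf{A}(\mathbf{m})\mathbf{u}-\mathbf{b}] + \frac{\lambda}{2}\|\mathbf{A}(\mathbf{m})\mathbf{u}-\mathbf{b}\|_2^2$$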
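Penalty method (WRI) (Van Leeuwen and Herrmann, 2013)
$$\min_{\mathbf{u},\mathbf{m}} \; \frac{1}{2}\|\mathbf{P}\mathbf{u}-\mathbf{d}\|_2^2 + \frac{\lambda}{2}\|\mathbf{A}(\mathbf{m})\mathbf{u}-\mathbf{b}\|_2^2$$
λ: Penalty parameter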
Mitigates the nonlinearity
Sensitive to the choice of the penalty parameter
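For a fixed model m, the penalty objective above is quadratic in u, so the reconstructed wavefield solves the normal equations (PᴴP + λAᴴA)u = Pᴴd + λAᴴb. The sketch below illustrates only this step on a toy problem. The 1D finite-difference operator, the small complex damping, the grid size, the receiver positions, and the names `helmholtz_1d` and `reconstruct_wavefield` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def helmholtz_1d(m, omega, h=1.0, damping=0.05):
    """Toy 1D operator A(m) = Laplacian + omega^2 (1 + i*damping) diag(m).

    Assumptions for illustration: Dirichlet boundaries, unit grid spacing,
    m acts as squared slowness, and a small complex damping keeps A invertible.
    """
    n = m.size
    lap = (np.diag(-2.0 * np.ones(n))
           + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / h**2
    return lap + omega**2 * (1.0 + 1j * damping) * np.diag(m)

def reconstruct_wavefield(A, P, d, b, lam):
    """Minimize 0.5*||P u - d||^2 + 0.5*lam*||A u - b||^2 over u."""
    lhs = P.conj().T @ P + lam * (A.conj().T @ A)
    rhs = P.conj().T @ d + lam * (A.conj().T @ b)
    return np.linalg.solve(lhs, rhs)

# Synthetic data from a "true" toy model (hypothetical values).
n = 50
m_true = np.full(n, 0.5)
m_true[20:30] = 0.3
A_true = helmholtz_1d(m_true, omega=1.0)
b = np.zeros(n, dtype=complex)
b[5] = 1.0                           # point source
P = np.eye(n)[[10, 25, 40]]          # three receivers
u_true = np.linalg.solve(A_true, b)
d = P @ u_true

# With the true model, the reconstruction recovers the true wavefield.
u_same = reconstruct_wavefield(A_true, P, d, b, lam=1.0)

# With a wrong (homogeneous) model, the relaxed wavefield fits the data
# at least as well as the wavefield that satisfies the PDE exactly.
A_init = helmholtz_1d(np.full(n, 0.5), omega=1.0)
u_ext = reconstruct_wavefield(A_init, P, d, b, lam=1e-2)
u_red = np.linalg.solve(A_init, b)
data_misfit_ext = np.linalg.norm(P @ u_ext - d)
data_misfit_red = np.linalg.norm(P @ u_red - d)
```

The comparison of the two misfits is one way to read "mitigates the nonlinearity": relaxing the wave-equation constraint lets the wavefield fit the data even when the model is inaccurate, with λ controlling how far the constraint is relaxed.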
6. Fixed-Point Iteration (FPI)
Given a function $g$ and a point $x_0$ in the domain of $g$, the fixed-point iteration (Picard iteration) is:
$$x_{n+1} = g(x_n), \quad n = 0, 1, 2, \dots$$
which, when it converges, tends to a point $x^*$ such that:
$$x^* = g(x^*)$$
FPI (simple definition)
Consider $f(x) = x^2 - x - 1 = 0$. One can set:
$$x = 1 + \frac{1}{x} \quad \text{or} \quad x = x^2 - 1, \quad \text{etc.}$$
$$x = g(x)$$
• $g$: Fixed-point mapping function
• $x$: Fixed point of $g$
• $x = g(x)$: A fixed-point problem
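The example above can be tried numerically. The sketch below applies the Picard iteration to the mapping x = 1 + 1/x, whose positive fixed point is the positive root of x² − x − 1 = 0. The function name `fixed_point_iteration`, the starting point, and the tolerance are illustrative choices.

```python
def fixed_point_iteration(g, x0, tol=1e-12, max_iter=200):
    """Iterate x_{n+1} = g(x_n) until successive iterates differ by < tol.

    Returns the last iterate and the number of g evaluations used.
    """
    x = x0
    for n in range(1, max_iter + 1):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter

# Mapping x = 1 + 1/x from the example; start at x0 = 1.
root, n_iter = fixed_point_iteration(lambda x: 1.0 + 1.0 / x, x0=1.0)

# Residual of the original equation f(x) = x^2 - x - 1 at the result.
residual = root**2 - root - 1.0
```

The choice of mapping matters. The alternative x = x² − 1 has the same fixed point, but there |g′(x*)| = 2x* > 1, so the iteration is repelled from it, whereas x = 1 + 1/x has |g′(x*)| < 1 and converges.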
7. Acceleration: a tool for speeding up convergence
The dimensionality of the problem is large.
f(x) is continuously differentiable, but the analytic form of its
derivative is not readily available, or it is very expensive to
compute.
The cost of evaluating f(x) is computationally high.
Why acceleration?
Newton acceleration
Aitken acceleration
Epsilon algorithms
Anderson acceleration
etc.
Acceleration methods
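8. Anderson acceleration for FPI (Walker and Ni, 2011)
The main idea of AA: formulation for $\mathbf{m} = g(\mathbf{m})$
Residual: $f(\mathbf{m}) = \mathbf{m} - g(\mathbf{m})$
• The linear combination of the $h + 1$ previous iterates, i.e. $\mathbf{m}_k, \mathbf{m}_{k-1}, \dots, \mathbf{m}_{k-h}$, under the fixed-point mapping $g$ reads:
$$\mathbf{m}_{k+1} = \sum_{j=0}^{h} \theta_j \, g(\mathbf{m}_{k-h+j})$$
• $\theta_0, \dots, \theta_h$ are the solution of the following optimization problem:
$$\min_{\boldsymbol{\theta}} \; \Big\| \sum_{j=0}^{h} \theta_j \, f(\mathbf{m}_{k-h+j}) \Big\|_2^2 \quad \text{subject to} \quad \sum_{j=0}^{h} \theta_j = 1$$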
AA keeps the memory of the h previous iterations and
describes the current iteration update as a weighted linear
combination of the memory.
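The formulation above can be sketched in a few lines. In this sketch the constraint Σθⱼ = 1 is eliminated by writing the newest weight as one minus the sum of the others, which turns the weight problem into an ordinary least-squares problem. The function names, the toy linear mapping g(m) = Bm + c, and the values of `B`, `c`, and `h` are illustrative assumptions. The regularization and safeguarding options are not included.

```python
import numpy as np
from collections import deque

def anderson_acceleration(g, m0, h=4, tol=1e-10, max_iter=500):
    """Anderson-accelerated fixed-point iteration for m = g(m).

    Keeps at most h + 1 past pairs (g(m_j), f(m_j)) with f(m) = m - g(m).
    Returns the final iterate and the number of g evaluations.
    """
    m = np.asarray(m0, dtype=float)
    G = deque(maxlen=h + 1)  # g(m_j), oldest first
    F = deque(maxlen=h + 1)  # residuals f(m_j), oldest first
    for n_eval in range(1, max_iter + 1):
        gm = g(m)
        f = m - gm
        if np.linalg.norm(f) < tol:
            return m, n_eval
        G.append(gm)
        F.append(f)
        if len(F) == 1:
            m = gm  # no history yet: plain fixed-point step
            continue
        Fm = np.column_stack(list(F))
        Gm = np.column_stack(list(G))
        # Substitute theta_last = 1 - sum(theta_rest) to remove the constraint:
        # minimize || f_last + D @ theta_rest ||_2 with D = [f_j - f_last].
        D = Fm[:, :-1] - Fm[:, [-1]]
        theta_rest, *_ = np.linalg.lstsq(D, -Fm[:, -1], rcond=None)
        theta = np.append(theta_rest, 1.0 - theta_rest.sum())
        m = Gm @ theta  # weighted combination of the stored g(m_j)
    return m, max_iter

def picard(g, m0, tol=1e-10, max_iter=500):
    """Plain fixed-point iteration with the same stopping rule."""
    m = np.asarray(m0, dtype=float)
    for n_eval in range(1, max_iter + 1):
        gm = g(m)
        if np.linalg.norm(m - gm) < tol:
            return m, n_eval
        m = gm
    return m, max_iter

# Toy contraction (hypothetical): g(m) = B m + c with a slow mode of 0.9.
B = np.diag([0.9, 0.5, 0.3, 0.1])
c = np.ones(4)
g = lambda m: B @ m + c
m_star = np.linalg.solve(np.eye(4) - B, c)  # exact fixed point

m_aa, n_aa = anderson_acceleration(g, np.zeros(4))
m_fpi, n_fpi = picard(g, np.zeros(4))
```

On this toy problem the plain iteration is limited by the slowest mode of `B`, while the accelerated iteration combines the stored history and needs far fewer evaluations of `g`. In the inversion setting each evaluation of `g` is one IR-WRI iteration, so the saving is in the number of those iterations.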
11. Example 1: Checkerboard model
Model dimension: 1400 × 1400
Ns: 4 (located at the corners of the model)
Nr: 276 (spaced along the four edges of the
model)
Source: The Ricker wavelet with $f_d = 10$ Hz
Inverted frequencies: 𝐹 = 2.5, 5 Hz
AA history: 10
(a) True velocity model, (b-e) inverted velocity
models by (b) WRI, (c) IR-WRI, (d) WRI with AA,
and (e) IR-WRI with AA
Model error curves versus iteration number
12. Example 2: 2004 BP salt model: Central Target
Model dimension: 3 km × 11 km
Source: The Ricker wavelet with $f_d = 10$ Hz
Inverted frequencies: Path 1: 3–8 Hz → Path 2: 4–11 Hz
AA history: 6
Maximum iterations per frequency: 20
(a) true velocity model, (b) initial velocity model,
(c-d) IR-WRI (Without AA) result after first and second frequency paths.
(e-f) IR-WRI (With AA) result after first and second frequency paths
13. Example 2: 2004 BP salt model: Central Target
(left) Evolution of the source and data residuals
and model error for frequencies 3-3.5 Hz
(right) model error versus iteration
Detailed comparison between vertical velocity logs at
different locations
14. Example 3: The Marmousi II model
Model dimension: 3.5 km × 17 km
Source: The Ricker wavelet with $f_d = 10$ Hz
Inverted frequencies: Path 1: 3–3.5 Hz → Path 2: 3–6 Hz → Path 3: 3–13 Hz
AA history: 8
Maximum iterations per frequency: 10
(a) True velocity model
(b) Initial model
15. Example 3: The Marmousi II model
[Figure panels, for noise-free and noisy data: IR-WRI; AIR-WRI; IR-WRI + TV; AIR-WRI + TV]
16. Example 3: The Marmousi II model
Evolution of the model error versus iteration for different models
[Figure panels: noise-free data; noisy data]
17. Conclusions
We recast the IR-WRI iteration as a general fixed-point iteration in order to speed up its convergence with
Anderson acceleration (AA).
The proposed methodology can easily include useful prior information and regularization.
Dual variables of BTV regularization, which are created for handling non-differentiable functions based on
splitting schemes, can be processed as extra fixed-point parameters.
The AA algorithm has two options: the first is the regularization of the quadratic problem in the AA
algorithm; the second is the safeguarding step. The experiments for noise-free and noisy data show that
applying this step improves the AA results and their robustness against noise.
With this new implementation, we have improved both the accuracy and the convergence rate of the original
IR-WRI.
18. We thank the WIND consortium and its sponsors for
their continuous support.
This study was partially funded by the WIND consortium
(https://www.geoazur.fr/WIND), sponsored by Chevron, Shell, and
Total. This study was granted access to the HPC resources of the SIGAMM
infrastructure (http://crimson.oca.eu), hosted by Observatoire de la Côte
d’Azur and supported by the Provence-Alpes Côte d’Azur
region, and the HPC resources of CINES/IDRIS/TGCC under the
allocation A0050410596 made by GENCI.
19. References
• Aghamiry, H. S., A. Gholami, and S. Operto, 2019, Improving
full-waveform inversion by wavefield reconstruction with the
alternating direction method of multipliers: Geophysics, 84,
R139–R162.
• Haber, E., U. M. Ascher, and D. Oldenburg, 2000, On
optimization techniques for solving nonlinear inverse problems:
Inverse Problems, 16, 1263.
• Van Leeuwen, T., and F. J. Herrmann, 2013, Mitigating local
minima in full-waveform inversion by expanding the search
space: Geophysical Journal International, 195, 661–667.
• Walker, H. F., and P. Ni, 2011, Anderson acceleration for
fixed-point iterations: SIAM Journal on Numerical Analysis, 49,
1715–1735.