QMC: Operator Splitting Workshop, Stochastic Block-Coordinate Fixed Point Algorithms - Jean-Christophe Pesquet, Mar 23, 2018
1. 1/12
STOCHASTIC BLOCK-COORDINATE
FIXED POINT ALGORITHMS
Jean-Christophe Pesquet
Center for Visual Computing, CentraleSupélec, Université Paris-Saclay
Joint work with Patrick Louis Combettes
SAMSI Workshop - March 2018
2. 2/12
Motivation

FIXED POINT ALGORITHM
for n = 0, 1, . . .
    x_{n+1} = x_n + λ_n (T_n x_n − x_n),
where
• x_0 ∈ H, a separable real Hilbert space
• (∀n ∈ N) T_n : H → H
• (λ_n)_{n∈N} relaxation parameters in ]0, +∞[.

• widely used in optimization, game theory, inverse problems, machine learning, ...
• convergence of (x_n)_{n∈N} to x ∈ F = ∩_{n∈N} Fix T_n, under suitable assumptions.

[Portrait: É. Picard (1856-1941)]

In the context of high-dimensional problems, how can the computational issues raised by memory requirements be limited?
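For illustration, a minimal Python sketch of the relaxed fixed point iteration above (not from the talk; the toy affine contraction T and all numerical values are hypothetical):

```python
import numpy as np

def fixed_point_iteration(T, x0, lam=0.5, n_iter=100):
    """Relaxed (Krasnosel'skii-Mann) iteration x_{n+1} = x_n + lam*(T(x_n) - x_n)."""
    x = x0
    for _ in range(n_iter):
        x = x + lam * (T(x) - x)
    return x

# Toy operator: T(x) = A x + b with ||A|| < 1, so Fix T = {(I - A)^{-1} b}.
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
b = np.array([1.0, -1.0])
T = lambda x: A @ x + b

x_star = fixed_point_iteration(T, np.zeros(2))
print(np.allclose(x_star, np.linalg.solve(np.eye(2) - A, b)))  # True
```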
6. 4/12
Block-coordinate algorithm

BLOCK-COORDINATE ALGORITHM
for n = 0, 1, . . .
    for i = 1, . . . , m
        x_{i,n+1} = x_{i,n} + ε_{i,n} λ_n (T_{i,n}(x_{1,n}, . . . , x_{m,n}) + a_{i,n} − x_{i,n}),
where
• (∀x ∈ H) T_n x = (T_{i,n} x)_{1≤i≤m}, where, for every i ∈ {1, . . . , m}, T_{i,n} : H → H_i is measurable
• (ε_n)_{n∈N} = ((ε_{i,n})_{1≤i≤m})_{n∈N} identically distributed D-valued random variables with D = {0, 1}^m \ {0}
• λ_n ∈ ]0, 1]
• a_n = (a_{i,n})_{1≤i≤m} H-valued random variable: possible error term.

a_n ≡ 0 and ε_n ≡ (1, . . . , 1) P-a.s. ⇔ deterministic algorithm with no error
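A minimal Python sketch of this update rule (hypothetical, not from the talk): independent Bernoulli activations, resampled so that ε_n stays in D, are just one admissible choice of distribution, and the error terms a_{i,n} are set to zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_block_coordinate(T_blocks, x0, p, lam=1.0, n_iter=200):
    """Sketch of the stochastic block-coordinate fixed point iteration.

    T_blocks[i] maps the full tuple (x_1, ..., x_m) to the i-th block update
    T_{i,n}(x_1, ..., x_m); block i is updated only when eps_{i,n} = 1.
    """
    x = [xi.copy() for xi in x0]
    m = len(x)
    for _ in range(n_iter):
        # independent Bernoulli(p_i) activations, resampled so eps != 0
        eps = rng.random(m) < p
        while not eps.any():
            eps = rng.random(m) < p
        # evaluate T_{i,n} at x_n only for the activated blocks (a_{i,n} = 0)
        x = [x[i] + lam * (T_blocks[i](x) - x[i]) if eps[i] else x[i]
             for i in range(m)]
    return x
```

With every p_i = 1 this reduces to the deterministic algorithm with no error.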
11. 5/12
Illustration of block activation strategy

Variable selection (∀n ∈ N):
x_{1,n} activated when ε_{1,n} = 1
x_{2,n} activated when ε_{2,n} = 1
x_{3,n} activated when ε_{3,n} = 1
x_{4,n} activated when ε_{4,n} = 1
x_{5,n} activated when ε_{5,n} = 1
x_{6,n} activated when ε_{6,n} = 1

How to choose the variable ε_n = (ε_{1,n}, . . . , ε_{6,n})?
P[ε_n = (1, 1, 0, 0, 0, 0)] = 0.1
P[ε_n = (1, 0, 1, 0, 0, 0)] = 0.2
P[ε_n = (1, 0, 0, 1, 1, 0)] = 0.2
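For instance, ε_n can be drawn from an explicit distribution over activation patterns. A hypothetical sketch: the slide lists only three patterns (their probabilities sum to 0.5, the remaining mass belonging to patterns not shown), so we renormalize here.

```python
import numpy as np

rng = np.random.default_rng(1)

# The three activation patterns listed on the slide, renormalized.
patterns = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
])
probs = np.array([0.1, 0.2, 0.2])
probs = probs / probs.sum()

def sample_eps():
    """Draw eps_n from the categorical distribution over patterns."""
    return patterns[rng.choice(len(patterns), p=probs)]

# marginal activation probabilities p_i = P[eps_{i,0} = 1]
print(probs @ patterns)   # e.g. p_1 = 1 here, since x_1 is always activated
print(sample_eps())
```

Note that with only these three patterns the marginal p_6 would be 0, which would violate assumption (v) below; the full distribution must also assign mass to patterns activating x_6.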
16. 6/12
Convergence analysis

NOTATION
(F_n)_{n∈N} sequence of sigma-algebras such that
(∀n ∈ N) F_n ⊂ F and σ(x_0, . . . , x_n) ⊂ F_n ⊂ F_{n+1},
where σ(x_0, . . . , x_n) is the σ-algebra generated by (x_0, . . . , x_n).

ASSUMPTIONS
(i) F ≠ ∅.
(ii) inf_{n∈N} λ_n > 0.
(iii) There exists a sequence (α_n)_{n∈N} in [0, +∞[ such that
∑_{n∈N} √α_n < +∞ and (∀n ∈ N) E(‖a_n‖² | F_n) ≤ α_n.
(iv) For every n ∈ N, E_n = σ(ε_n) and F_n are independent.
(v) For every i ∈ {1, . . . , m}, p_i = P[ε_{i,0} = 1] > 0.
18. 7/12
Convergence results
[Combettes, Pesquet, 2015]
Suppose that sup_{n∈N} λ_n < 1 and that, for every n ∈ N, T_n is quasinonexpansive, i.e.,
(∀z ∈ Fix T_n)(∀x ∈ H) ‖T_n x − z‖ ≤ ‖x − z‖.
Then
(i) (T_n x_n − x_n)_{n∈N} converges strongly P-a.s. to 0.
(ii) Suppose that, almost surely, every sequential cluster point of (x_n)_{n∈N} belongs to F. Then (x_n)_{n∈N} converges weakly P-a.s. to an F-valued random variable.

REMARK
These conditions are met by many algorithms for solving monotone inclusion problems, e.g., the forward-backward or the Douglas-Rachford algorithm.
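To make the remark concrete, here is a sketch (my example, not the talk's) of a forward-backward operator for the lasso problem; such an operator is averaged, hence quasinonexpansive, and its fixed points are the minimizers.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximity operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def forward_backward_operator(A, y, gamma, reg):
    """T = prox_{gamma * reg * ||.||_1} o (Id - gamma * grad f), with
    f(x) = 0.5 * ||A x - y||^2.  For gamma in ]0, 2/||A^T A||[, T is
    averaged (hence quasinonexpansive) and Fix T = argmin f + reg * ||.||_1."""
    def T(x):
        return soft_threshold(x - gamma * A.T @ (A @ x - y), gamma * reg)
    return T
```

Plugged into the fixed point iteration sketched earlier, this operator recovers the (relaxed) iterative soft-thresholding algorithm.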
19. 8/12
Convergence results
[Combettes, Pesquet, 2017]
Assume that
F = {x} = {(x_i)_{1≤i≤m}}
(∀n ∈ N)(∀z = (z_i)_{1≤i≤m} ∈ H) ‖T_n z − x‖² ≤ ∑_{i=1}^m τ_{i,n} ‖z_i − x_i‖²,
where {τ_{i,n} | 1 ≤ i ≤ m, n ∈ N} ⊂ ]0, +∞[. Then

(∀n ∈ N) E(‖x_{n+1} − x‖² | F_0) ≤ (max_{1≤i≤m} p_i / min_{1≤i≤m} p_i) (∏_{k=0}^{n} χ_k) ‖x_0 − x‖² + η_n,

with, for every n ∈ N,
ξ_n = √(α_n / min_{1≤i≤m} p_i),    µ_n = 1 − min_{1≤i≤m} p_i (1 − τ_{i,n}),
χ_n = 1 − λ_n (1 − µ_n) + ξ_n λ_n (1 + λ_n √µ_n),
η_n = ∑_{k=0}^{n} (∏_{ℓ=k+1}^{n} χ_ℓ) λ_k (1 + λ_k √µ_k + λ_k ξ_k) ξ_k.
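A small numerical sketch of how this bound can be evaluated, for uniform p_i = p and constant τ_{i,n} = τ, λ_n = λ (the function name and these simplifications are mine, and the placement of the square root in ξ_n follows the reconstruction above):

```python
import math

def error_bound(p, tau, lam, alphas, dist0=1.0):
    """Evaluate (max_i p_i / min_i p_i) * prod(chi_k) * ||x0 - x||^2 + eta_n
    for uniform p_i = p (so the ratio is 1), constant tau and lam, and a
    given sequence (alpha_n) bounding the conditional error moments."""
    mu = 1.0 - p * (1.0 - tau)
    prod_chi, eta, bounds = 1.0, 0.0, []
    for a in alphas:
        xi = math.sqrt(a / p)
        chi = 1.0 - lam * (1.0 - mu) + xi * lam * (1.0 + lam * math.sqrt(mu))
        prod_chi *= chi
        # eta_n = chi_n * eta_{n-1} + lam * (1 + lam*sqrt(mu) + lam*xi) * xi
        eta = chi * eta + lam * (1.0 + lam * math.sqrt(mu) + lam * xi) * xi
        bounds.append(prod_chi * dist0 + eta)
    return bounds

# error-free case: the bound decays linearly with factor chi = mu
print(error_bound(p=0.5, tau=0.9, lam=1.0, alphas=[0.0] * 5))
```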
20. 8/12
Convergence results
[Combettes, Pesquet, 2017]
Assume that
F = {x} = {(x_i)_{1≤i≤m}}
(∀n ∈ N)(∀z = (z_i)_{1≤i≤m} ∈ H) ‖T_n z − x‖² ≤ ∑_{i=1}^m τ_{i,n} ‖z_i − x_i‖²,
where {τ_{i,n} | 1 ≤ i ≤ m, n ∈ N} ⊂ ]0, +∞[ and
(∀i ∈ {1, . . . , m}) sup_{n∈N} τ_{i,n} < 1.
Suppose that x_0 ∈ L²(Ω, F, P; H).
Then (x_n)_{n∈N} converges to x both in the mean square sense and strongly P-a.s.
21. 9/12
Behavior in the absence of errors
• Under the same assumptions, linear convergence rate.
• Comparison with deterministic case

[Figure: ρ(p)/ρ(1) as a function of p ∈ ]0, 1], for χ ∈ {0.95, 0.8, 0.6, 0.4, 0.2, 0.1}]

ρ(p) = −ln(1 − (1 − χ)p)/p: convergence rate normalized by the computational cost when (∀i ∈ {1, . . . , m}) p_i = p
χ: convergence factor in the deterministic case.
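A quick way to reproduce the normalized-rate curves (a sketch; the sampled values of p are mine):

```python
import math

def rho(p, chi):
    """Convergence rate normalized by computational cost: -ln(1-(1-chi)p)/p."""
    return -math.log(1.0 - (1.0 - chi) * p) / p

for chi in (0.95, 0.8, 0.6, 0.4, 0.2, 0.1):
    ratios = [rho(p, chi) / rho(1.0, chi) for p in (0.1, 0.5, 1.0)]
    print(chi, [round(r, 3) for r in ratios])
```

The ratio stays below 1 and decreases with χ, consistent with the figure: sparser activation is cheaper per iteration but somewhat less efficient per unit of computation.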
22. 9/12
Behavior in the absence of errors
• Under the same assumptions, linear convergence rate.
• Accuracy of upper bounds for a variational problem in multicomponent image recovery

[Figure: E‖x_n − x‖² / E‖x_0 − x‖² (in dB) versus iteration number n, for p = 1, p = 0.8, and p = 0.46; theoretical upper bounds in dashed lines]
23. 10/12
Influence of stochastic errors
Assume that
α_n = O(n^{−θ}) with θ ∈ ]2, +∞[.
Then
E‖x_n − x‖² = O(n^{−θ/2}).
⇒ loss of the linear convergence rate
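A self-contained numerical check of this polynomial rate, using the constants reconstructed on the earlier slide (the uniform p and constant τ, λ values are my choices):

```python
import math

p, tau, lam, theta = 0.5, 0.9, 1.0, 3.0   # hypothetical constants
mu = 1.0 - p * (1.0 - tau)
eta, vals = 0.0, {}
for n in range(1, 2001):
    xi = math.sqrt(n ** (-theta) / p)      # alpha_n = n^{-theta}
    chi = 1.0 - lam * (1.0 - mu) + xi * lam * (1.0 + lam * math.sqrt(mu))
    eta = chi * eta + lam * (1.0 + lam * math.sqrt(mu) + lam * xi) * xi
    if n in (500, 1000, 2000):
        vals[n] = eta * n ** (theta / 2)   # roughly constant: eta_n = O(n^{-theta/2})
print(vals)
```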
24. 11/12
Open issue: deterministic block activation
Let
(∀x ∈ H) |||x|||² = ∑_{i=1}^m ω_i ‖x_i‖²,
where max_{1≤i≤m} ω_i p_i = 1.
Assume that λ_n ≡ 1 and a_n ≡ 0. Then
(∀n ∈ N) E(|||x_{n+1} − x|||² | F_n)
= ∑_{i=1}^m ω_i p_i ‖T_{i,n} x_n − x_i‖² + ∑_{i=1}^m ω_i (1 − p_i) ‖x_{i,n} − x_i‖²
≤ ‖T_n x_n − x‖² + |||x_n − x|||² − ∑_{i=1}^m ω_i p_i ‖x_{i,n} − x_i‖²
≤ |||x_n − x|||² + ∑_{i=1}^m (τ_{i,n} − ω_i p_i) ‖x_{i,n} − x_i‖² ≤ |||x_n − x|||²,
since τ_{i,n} − ω_i p_i ≤ 0
⇒ stochastic Fejér monotonicity [Combettes, Pesquet, 2015]
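A quick Monte Carlo sanity check of the first equality (a toy sketch with scalar blocks, a map T_{i,n} x = x_i/2 with fixed point x = 0, and hypothetical values of p and ω):

```python
import numpy as np

rng = np.random.default_rng(3)

m = 3
p = np.array([0.4, 0.7, 0.9])        # hypothetical activation probabilities
omega = 1.0 / p                      # so that max_i omega_i * p_i = 1
x = rng.standard_normal(m)           # current iterate x_n (scalar blocks)
Tx = 0.5 * x                         # toy T_{i,n} x = x_i / 2, fixed point 0

# empirical E(|||x_{n+1} - 0|||^2 | x_n) over random activations
# (lambda_n = 1, a_n = 0; the identity only depends on the marginals p_i,
#  so the rare all-zero draw is harmlessly allowed in this sketch)
samples = []
for _ in range(100_000):
    eps = rng.random(m) < p
    x_next = np.where(eps, Tx, x)
    samples.append(np.sum(omega * x_next ** 2))

lhs = np.mean(samples)
rhs = np.sum(omega * p * Tx ** 2) + np.sum(omega * (1 - p) * x ** 2)
print(lhs, rhs)                      # the two values should nearly coincide
```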
25. 12/12
Open issue: more directional convergence conditions
Example:
minimize over x ∈ H: f(x) = g(∑_{i=1}^m L_i x_i) + (θ/2) ‖x‖²,
where g: G → R is convex and differentiable with a 1-Lipschitzian gradient, G is a separable real Hilbert space, (∀i ∈ {1, . . . , m}) L_i is a bounded linear operator from H_i to G, and θ ∈ ]0, +∞[.
• stochastic approach:
T_n = Id − γ_n ∇f ⇒ (∀i ∈ {1, . . . , m}) τ_{i,n} = 1 − γ_n θ,
with γ_n < 2 / (‖∑_{i=1}^m L_i^* L_i‖ + 2θ)
• deterministic approach (quasi-cyclic activation):
γ_n < 2 / (‖L_{i_n}‖² + 2θ)