SlideShare a Scribd company logo
1 of 36
Download to read offline
Subdifferentials and Proximal Operators
Pontus Giselsson
1
Learning goals
• Be able to derive subdifferential and proximal operator formulas
• Understand that subdifferentials define affine minorizors
• Existence of subgradient for convex functions
• Understand maximal monotonicity and Minty’s theorem
• Know strong monotonicity and relation to strong convexity
• Know different characterizations of smoothness
• Understand and be able to use Fermat’s rule
• Know subdifferential calculus rules
• Understand that prox evaluates subdifferential
2
Subdifferentials
3
Gradients of convex functions
• Recall: A differentiable function f : Rn
→ R is convex iff
f(y) ≥ f(x) + ∇f(x)T
(y − x)
for all x, y ∈ Rn
f(y)
f(x) + ∇f(x)T (y − x)
(∇f(x), −1)
(x, f(x))
• Function f has for all x ∈ Rn
an affine minorizer that:
• has slope s defined by ∇f
• coincides with function f at x
• defines normal (∇f(x), −1) to epigraph of f
• What if function is nondifferentiable?
4
Subdifferentials and subgradients
• Subgradients s define affine minorizers to the function that:
(s, −1)
(s, −1)
(s, −1)
• coincide with f at x
• define normal vector (s, −1) to epigraph of f
• can be one of many affine minorizers at nondifferentiable points x
• Subdifferential of f : Rn
→ R at x is set of vectors s satisfying
f(y) ≥ f(x) + sT
(y − x) for all y ∈ Rn
, (1)
• Notation:
• subdifferential: ∂f : Rn
→ 2Rn
(power-set notation 2Rn
)
• subdifferential at x: ∂f(x) = {s : (1) holds}
• elements s ∈ ∂f(x) are called subgradients of f at x
5
Relation to gradient
• If f differentiable at x and ∂f(x) 6= ∅ then ∂f(x) = {∇f(x)}:
(s, −1)
x1
(∇f(x2), −1)
x2 (∇f(x3), −1)
x3
• i.e., subdifferential (if nonempty) at x consists of only gradient
6
Subgradient existence – Nonconvex example
• Function can be differentiable at x but ∂f(x) = ∅
x1
x2
x3
• x1: ∂f(x1) = {0}, ∇f(x1) = 0
• x2: ∂f(x2) = ∅, ∇f(x2) = 0
• x3: ∂f(x3) = ∅, ∇f(x3) = 0
• Gradient is a local concept, subdifferential is a global property
7
Subgradient existence – Convex example
• Consider the convex function:
x1
x2
x3
f(x) = |x|
• What are the subdifferentials at points x1, x2, x3?
• Subdifferential at x1 is -1 (affine minorizer with slope -1)
• Subdifferential at x2 is [-1,1] (affine minorizers with slope [-1,1])
• Subdifferential at x3 is 1 (affine minorizer with slope 1)
Fact:
• For finite-valued convex functions, a subgradient exists for every x
8
Existence for extended-valued convex functions
• Let f : Rn
→ R ∪ {∞} be convex, then:
1. Subgradients exist for all x in relative interior of domf
2. Subgradients sometimes exist for x on boundary of domf
3. No subgradient exists for x outside domf
• Examples for second case, boundary points of domf:
−
√
1 − x2 + ι[−1,1](x) x2 + ι[−2,2](x)
• No subgradient (affine minorizer) exists for left function at x = 1
9
Monotonicity
• Subdifferential operator is monotone:
(sx − sy)T
(x − y) ≥ 0
for all sx ∈ ∂f(x) and sy ∈ ∂f(y)
• Proof: Add two copies of subdifferential definition
f(y) ≥ f(x) + sT
x (y − x)
with x and y swapped
• ∂f : R → 2R
: Minimum slope 0 and maximum slope ∞
∂f
x
10
Monotonicity beyond subdifferentials
• Let A : Rn
→ 2Rn
be monotone, i.e.:
(u − v)T
(x − y) ≥ 0
for all u ∈ Ax and v ∈ Ay
• If n = 1, then A = ∂f for some function f : R → R ∪ {∞}
• If n ≥ 2 there exist monotone A that are not subdifferentials
11
Maximal monotonicity
• Let the set gph ∂f := {(x, u) : u ∈ ∂f(x)} be the graph of ∂f
• ∂f is maximally monotone if no other function g exists with
gph ∂f ⊂ gph ∂g,
with strict inclusion
• A result (due to Rockafellar):
f is closed convex if and only if ∂f is maximally monotone
12
Minty’s theorem
• Let ∂f : Rn
→ 2Rn
and α > 0
• ∂f is maximally monotone if and only if range(αI + ∂f) = Rn
∂f1
x
maximally monotone
∂f2
x
not maximally monotone
∂f1 + αI
x
full range
∂f2 + αI
x
not full range
• Interpretation: No “holes” in gph ∂f
13
Strong convexity
• Recall that f is σ-strongly convex if f − σ
2 k · k2
2 is convex
• If f is σ-strongly convex then
f(y) ≥ f(x) + sT
(y − x) + σ
2 kx − yk2
2
holds for all x ∈ dom∂f, s ∈ ∂f(x), and y ∈ Rn
• The function has convex quadratic minorizers instead of affine
f(y)
f(x1) + sT
1 (y − x1) + σ
2
kx1 − yk2
2
x1
(s1, −1)
f(x2) + sT
2,1(y − x2) + σ
2
kx2 − yk2
2
(s2,1, −1)
f(x2) + sT
2,2(y − x2) + σ
2
kx2 − yk2
2
x2
(s2,2, −1)
• Multiple lower bounds at x2 with subgradients s2,1 and s2,2
14
Strong monotonicity
• If f σ-strongly convex function, then ∂f is σ-strongly monotone:
(sx − sy)T
(x − y) ≥ σkx − yk2
2
for all sx ∈ ∂f(x) and sy ∈ ∂f(y)
• Proof: Add two copies of strong convexity inequality
f(y) ≥ f(x) + sT
x (y − x) + σ
2 kx − yk2
2
with x and y swapped
• ∂f is σ-strongly monotone if and only if ∂f − σI is monotone
• ∂f : R → 2R
: Minimum slope σ and maximum slope ∞
∂f
x
15
Strongly convex functions – An equivalence
The following are equivalent
(i) f is closed and σ-strongly convex
(ii) ∂f is maximally monotone and σ-strongly monotone
Proof:
(i)⇒(ii): we know this from before
(ii)⇒(i): (ii) ⇒ ∂f − σI = ∂(f − σ
2 k · k2
2) maximally monotone
⇒ f − σ
2 k · k2
2 closed convex
⇒ f closed and σ-strongly convex
16
Smoothness and convexity
• A differentiable function f : Rn
→ R is convex and smooth if
f(y) ≤ f(x) + ∇f(x)T
(y − x) + β
2 kx − yk2
2
f(y) ≥ f(x) + ∇f(x)T
(y − x)
holds for all x, y ∈ Rn
• f has convex quadratic majorizers and affine minorizers
f(x1) + ∇f(x1)T (y − x1) + β
2
kx1 − yk2
2
x1
(∇f(x2), −1)
f(x2) + ∇f(x2)T (y − x2) + β
2
kx2 − yk2
2
x2
(∇f(x2), −1)
f(y)
• Quadratic upper bound is called descent lemma 17
Gradient of smooth convex function
• Gradient of smooth convex function is monotone and Lipschitz
(∇f(x) − ∇f(y))T
(x − y) ≥ 0
k∇f(y) − ∇f(x)k2 ≤ βkx − yk2
• ∇f : R → R: Minimum slope 0 and maximum slope β
∇f
x
• Actually satisfies the stronger 1
β -cocoercivity property:
(∇f(x) − ∇f(y))T
(x − y) ≥ 1
β k∇f(y) − ∇f(x)k2
2
18
Smooth convex functions – Equivalences
Let f : Rn
→ R be differentiable. The following are equivalent:
(i) ∇f is 1
β -cocoercive
(ii) ∇f is maximally monotone and β-Lipschitz continuous
(iii) f is closed convex and satisfies descent lemma (is β-smooth)
• Implication (ii)⇒(i) is called the Baillon-Haddad theorem
• Will connect smoothness and strong convexity via conjugates in next lecture
19
Fermat’s rule
Let f : Rn
→ R ∪ {∞}, then x minimizes f if and only if
0 ∈ ∂f(x)
• Proof: x minimizes f if and only if
f(y) ≥ f(x) + 0T
(y − x) for all y ∈ Rn
which by definition of subdifferential is equivalent to 0 ∈ ∂f(x)
• Example: several subgradients at solution, including 0
(0, −1)
20
Fermat’s rule – Nonconvex example
• Fermat’s rule holds also for nonconvex functions
• Example:
x1
x2
(0, −1)
• ∂f(x1) = 0 and ∇f(x1) = 0 (global minimum)
• ∂f(x2) = ∅ and ∇f(x2) = 0 (local minimum)
• For nonconvex f, we can typically only hope to find local minima
21
Subdifferential calculus rules
• Subdifferential of sum ∂(f1 + f2)
• Subdifferential of composition with matrix ∂(g ◦ L)
22
Subdifferential of sum
If f1, f2 closed convex and relint domf1 ∩ relint domf2 6= ∅:
∂(f1 + f2) = ∂f1 + ∂f2
• One direction always holds: if x ∈ dom∂f1 ∩ dom∂f2:
∂(f1 + f2)(x) ⊇ ∂f1(x) + ∂f2(x)
Proof: let si ∈ ∂fi(x), add subdifferential definitions:
f1(y) + f2(y) ≥ f1(x) + f2(x) + (s1 + s2)T
(y − x)
i.e. s1 + s2 ∈ ∂(f1 + f2)(x)
• If f1 and f2 differentiable, we have (without convexity of f)
∇(f1 + f2) = ∇f1 + ∇f2
23
Subdifferential of composition
If f closed convex and relint dom(f ◦ L) 6= ∅:
∂(f ◦ L)(x) = LT
∂f(Lx)
• One direction always holds: If Lx ∈ domf, then
∂(f ◦ L)(x) ⊇ LT
∂f(Lx)
Proof: let s ∈ ∂f(Lx), then by definition of subgradient of f:
(f ◦ L)(y) ≥ (f ◦ L)(x) + sT
(Ly − Lx) = (f ◦ L)(x) + (LT
s)T
(y − x)
i.e., LT
s ∈ ∂(f ◦ L)(x)
• If f differentiable, we have chain rule (without convexity of f)
∇(f ◦ L)(x) = LT
∇f(Lx)
24
A sufficient optimality condition
Let f : Rm
→ R, g : Rn
→ R, and L ∈ Rm×n
then:
minimize f(Lx) + g(x) (1)
is solved by every x ∈ Rn
that satisfies
0 ∈ LT
∂f(Lx) + ∂g(x) (2)
• Subdifferential calculus inclusions say:
0 ∈ LT
∂f(Lx) + ∂g(x) ⊆ ∂((f ◦ L)(x) + g(x))
which by Fermat’s rule is equivalent to x solution to (1)
• Note: (1) can have solution but no x exists that satisfies (2)
25
A necessary and sufficient optimality condition
Let f : Rm
→ R, g : Rn
→ R, L ∈ Rm×n
with f, g closed convex
and assume relint dom(f ◦ L) ∩ relint domg 6= ∅ then:
minimize f(Lx) + g(x) (1)
is solved by x ∈ Rn
if and only if x satisfies
0 ∈ LT
∂f(Lx) + ∂g(x) (2)
• Subdifferential calculus equality rules say:
0 ∈ LT
∂f(Lx) + ∂g(x) = ∂((f ◦ L)(x) + g(x))
which by Fermat’s rule is equivalent to x solution to (1)
• Algorithms search for x that satisfy 0 ∈ LT
∂f(Lx) + ∂g(x)
26
A comment to constraint qualification
• The condition
relint dom(f ◦ L) ∩ relint domg 6= ∅
is called constraint qualification and referred to as CQ
• It is a mild condition that rarely is not satisfied
dom(f ◦ L)
domg
no solution
dom(f ◦ L)
domg
solution
no CQ
dom(f ◦ L)
domg
solution
CQ
27
Evaluating subgradients of convex functions
• Obviously need to evaluate subdifferentials to solve
0 ∈ LT
∂f(Lx) + ∂g(x)
• Explicit evaluation:
• If function is differentiable: ∇f (unique)
• If function is nondifferentiable: compute element in ∂f
• Implicit evaluation:
• Proximal operator (specific element of subdifferential)
28
Proximal operators
29
Proximal operator
• Proximal operator of (convex) g defined as:
proxγg(z) = argmin
x
(g(x) + 1
2γ kx − zk2
2)
where γ > 0 is a parameter
• Evaluating prox requires solving optimization problem
• Objective is strongly convex ⇒ solution exists and is unique
30
Prox evaluates the subdifferential
• Fermat’s rule on prox definition: x = proxγg(z) if and only if
0 ∈ ∂g(x) + γ−1
(x − z) ⇔ γ−1
(z − x) ∈ ∂g(x)
Hence, γ−1
(z − x) is element in ∂g(x)
• A subgradient ∂g(x) where x = proxγg(z) is computed
• Often used in algorithms when g nonsmooth (no gradient exists)
31
Prox is generalization of projection
• Recall the indicator function of a set C
ιC(x) :=
(
0 if x ∈ C
∞ otherwise
• Then
ΠC(z) = argmin
x
(kx − zk2 : x ∈ C)
= argmin
x
(1
2 kx − zk2
2 : x ∈ C)
= argmin
x
(1
2 kx − zk2
2 + ιC(x))
= proxιC
(z)
• Projection onto C equals prox of indicator function of C
32
Proximal operator – Example 1
Let g(x) = 1
2 xT
Hx + hT
x with H positive semidefinite
• Gradient satisfies ∇g(x) = Hx + h
• Fermat’s rule for x = proxγgz:
0 = ∇g(x) + γ−1
(x − z) ⇔ 0 = Hx + h + γ−1
(x − z)
⇔ (I + γH)x = z − γh
⇔ x = (I + γH)−1
(z − γh)
• So proxγgz = (I + γH)−1
(z − γh)
33
Proximal operator – Example 2
• Consider the function g with subdifferential ∂g:
g(x) =
(
−x if x ≤ 0
0 if x ≥ 0
∂g(x) =





−1 if x < 0
[−1, 0] if x = 0
0 if x > 0
• Graphical representations
(−1, −1)
(−1, −1)
(−0.5, −1) (0, −1) (0, −1)
g(x)
x
∂g(x)
x
• Fermat’s rule for x = proxγgz:
0 ∈ ∂g(x) + γ−1
(x − z)
34
Proximal operator – Example 2 cont’d
• Let x < 0, then Fermat’s rule reads
0 = −1 + γ−1
(x − z) ⇔ x = z + γ
which is valid (x < 0) if z < −γ
• Let x = 0, then Fermat’s rule reads
0 = [−1, 0] + γ−1
(0 − z)
which is valid (x = 0) if z ∈ [−γ, 0]
• Let x > 0, then Fermat’s rule reads
0 = 0 + γ−1
(x − z) ⇔ x = z
which is valid (x > 0) if z > 0
• The prox satisfies
proxγg(z) =





z + γ if z < −γ
0 if z ∈ [−γ, 0]
z if z > 0
35
Computational cost
• Evaluating prox requires solving optimization problem
proxγg(z) = argmin
x
(g(x) + 1
2γ kx − zk2
2)
• Prox typically more expensive to evaluate than gradient
• Example: Quadratic g(x) = 1
2 xT
Hx + hT
x:
proxγg(z) = (I + γH)−1
(z − γh), ∇g(z) = Hz − h
36

More Related Content

Similar to subdiff_prox.pdf

0.5.derivatives
0.5.derivatives0.5.derivatives
0.5.derivativesm2699
 
Lesson 8: Basic Differentation Rules (slides)
Lesson 8: Basic Differentation Rules (slides)Lesson 8: Basic Differentation Rules (slides)
Lesson 8: Basic Differentation Rules (slides)Matthew Leingang
 
Lesson 8: Basic Differentation Rules (slides)
Lesson 8: Basic Differentation Rules (slides)Lesson 8: Basic Differentation Rules (slides)
Lesson 8: Basic Differentation Rules (slides)Mel Anthony Pepito
 
limits and continuity
limits and continuitylimits and continuity
limits and continuityElias Dinsa
 
Convexity in the Theory of the Gamma Function.pdf
Convexity in the Theory of the Gamma Function.pdfConvexity in the Theory of the Gamma Function.pdf
Convexity in the Theory of the Gamma Function.pdfPeterEsnayderBarranz
 
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...Katsuya Ito
 
Lesson05 Continuity Slides+Notes
Lesson05    Continuity Slides+NotesLesson05    Continuity Slides+Notes
Lesson05 Continuity Slides+NotesMatthew Leingang
 
Lesson05 Continuity Slides+Notes
Lesson05    Continuity Slides+NotesLesson05    Continuity Slides+Notes
Lesson05 Continuity Slides+NotesMatthew Leingang
 
Chapter 5(partial differentiation)
Chapter 5(partial differentiation)Chapter 5(partial differentiation)
Chapter 5(partial differentiation)Eko Wijayanto
 
Fourier series Introduction
Fourier series IntroductionFourier series Introduction
Fourier series IntroductionRizwan Kazi
 
functions limits and continuity
functions limits and continuityfunctions limits and continuity
functions limits and continuityPume Ananda
 

Similar to subdiff_prox.pdf (20)

0.5.derivatives
0.5.derivatives0.5.derivatives
0.5.derivatives
 
Limits, Continuity & Differentiation (Theory)
Limits, Continuity & Differentiation (Theory)Limits, Continuity & Differentiation (Theory)
Limits, Continuity & Differentiation (Theory)
 
Lesson 8: Basic Differentation Rules (slides)
Lesson 8: Basic Differentation Rules (slides)Lesson 8: Basic Differentation Rules (slides)
Lesson 8: Basic Differentation Rules (slides)
 
Lesson 8: Basic Differentation Rules (slides)
Lesson 8: Basic Differentation Rules (slides)Lesson 8: Basic Differentation Rules (slides)
Lesson 8: Basic Differentation Rules (slides)
 
Matrix calculus
Matrix calculusMatrix calculus
Matrix calculus
 
limits and continuity
limits and continuitylimits and continuity
limits and continuity
 
Duality.pdf
Duality.pdfDuality.pdf
Duality.pdf
 
Convexity in the Theory of the Gamma Function.pdf
Convexity in the Theory of the Gamma Function.pdfConvexity in the Theory of the Gamma Function.pdf
Convexity in the Theory of the Gamma Function.pdf
 
Functions limits and continuity
Functions limits and continuityFunctions limits and continuity
Functions limits and continuity
 
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...
Convex Analysis and Duality (based on "Functional Analysis and Optimization" ...
 
Paper 2
Paper 2Paper 2
Paper 2
 
Lesson05 Continuity Slides+Notes
Lesson05    Continuity Slides+NotesLesson05    Continuity Slides+Notes
Lesson05 Continuity Slides+Notes
 
Lesson 5: Continuity
Lesson 5: ContinuityLesson 5: Continuity
Lesson 5: Continuity
 
Lesson05 Continuity Slides+Notes
Lesson05    Continuity Slides+NotesLesson05    Continuity Slides+Notes
Lesson05 Continuity Slides+Notes
 
R lecture co4_math 21-1
R lecture co4_math 21-1R lecture co4_math 21-1
R lecture co4_math 21-1
 
Chapter 5(partial differentiation)
Chapter 5(partial differentiation)Chapter 5(partial differentiation)
Chapter 5(partial differentiation)
 
Jokyokai20111124
Jokyokai20111124Jokyokai20111124
Jokyokai20111124
 
Fourier series Introduction
Fourier series IntroductionFourier series Introduction
Fourier series Introduction
 
functions limits and continuity
functions limits and continuityfunctions limits and continuity
functions limits and continuity
 
Limits and derivatives
Limits and derivativesLimits and derivatives
Limits and derivatives
 

Recently uploaded

obat aborsi rembang wa 081336238223 jual obat aborsi cytotec asli di rembang9...
obat aborsi rembang wa 081336238223 jual obat aborsi cytotec asli di rembang9...obat aborsi rembang wa 081336238223 jual obat aborsi cytotec asli di rembang9...
obat aborsi rembang wa 081336238223 jual obat aborsi cytotec asli di rembang9...yulianti213969
 
prodtion diary updated.pptxyyghktyuitykiyu
prodtion diary updated.pptxyyghktyuitykiyuprodtion diary updated.pptxyyghktyuitykiyu
prodtion diary updated.pptxyyghktyuitykiyuLeonBraley
 
Santa Monica College - M4MH - 5.13.24 - Presentation.pdf
Santa Monica College - M4MH - 5.13.24 - Presentation.pdfSanta Monica College - M4MH - 5.13.24 - Presentation.pdf
Santa Monica College - M4MH - 5.13.24 - Presentation.pdfRebeccaPontieri
 
一比一原版西悉尼大学毕业证(UWS毕业证)成绩单如可办理
一比一原版西悉尼大学毕业证(UWS毕业证)成绩单如可办理一比一原版西悉尼大学毕业证(UWS毕业证)成绩单如可办理
一比一原版西悉尼大学毕业证(UWS毕业证)成绩单如可办理awuboo
 
Hosewife Bangalore Just VIP Btm Layout 100% Genuine at your Door Step
Hosewife Bangalore Just VIP Btm Layout 100% Genuine at your Door StepHosewife Bangalore Just VIP Btm Layout 100% Genuine at your Door Step
Hosewife Bangalore Just VIP Btm Layout 100% Genuine at your Door Stepdarmandersingh4580
 
一比一原版(MQU毕业证书)麦考瑞大学毕业证成绩单原件一模一样
一比一原版(MQU毕业证书)麦考瑞大学毕业证成绩单原件一模一样一比一原版(MQU毕业证书)麦考瑞大学毕业证成绩单原件一模一样
一比一原版(MQU毕业证书)麦考瑞大学毕业证成绩单原件一模一样aqwaz
 
obat aborsi klaten wa 081336238223 jual obat aborsi cytotec asli di klaten54-...
obat aborsi klaten wa 081336238223 jual obat aborsi cytotec asli di klaten54-...obat aborsi klaten wa 081336238223 jual obat aborsi cytotec asli di klaten54-...
obat aborsi klaten wa 081336238223 jual obat aborsi cytotec asli di klaten54-...yulianti213969
 
prodtion diary final ultima.pptxoiu8edrfgrh
prodtion diary final ultima.pptxoiu8edrfgrhprodtion diary final ultima.pptxoiu8edrfgrh
prodtion diary final ultima.pptxoiu8edrfgrhLeonBraley
 
obat aborsi wonogiri wa 081336238223 jual obat aborsi cytotec asli di wonogir...
obat aborsi wonogiri wa 081336238223 jual obat aborsi cytotec asli di wonogir...obat aborsi wonogiri wa 081336238223 jual obat aborsi cytotec asli di wonogir...
obat aborsi wonogiri wa 081336238223 jual obat aborsi cytotec asli di wonogir...yulianti213969
 
Our great adventures in Warsaw - On arrival training
Our great adventures in Warsaw - On arrival trainingOur great adventures in Warsaw - On arrival training
Our great adventures in Warsaw - On arrival trainingAngelaHakobjanyan
 
Abstract Arch Design Wall Cladding - by Stone Art by SKL
Abstract Arch Design Wall Cladding -  by Stone Art by SKLAbstract Arch Design Wall Cladding -  by Stone Art by SKL
Abstract Arch Design Wall Cladding - by Stone Art by SKLstoneartbyskl
 
Museum Quality | PrintAction.pdf
Museum Quality | PrintAction.pdfMuseum Quality | PrintAction.pdf
Museum Quality | PrintAction.pdfVictoria Gaitskell
 
obat aborsi Cibitung wa 082223595321 jual obat aborsi cytotec asli di Cibitung
obat aborsi Cibitung wa 082223595321 jual obat aborsi cytotec asli di Cibitungobat aborsi Cibitung wa 082223595321 jual obat aborsi cytotec asli di Cibitung
obat aborsi Cibitung wa 082223595321 jual obat aborsi cytotec asli di Cibitungsiskavia916
 
The Adventurer's Guide Book by Amoré van der Linde
The Adventurer's Guide Book by Amoré van der LindeThe Adventurer's Guide Book by Amoré van der Linde
The Adventurer's Guide Book by Amoré van der Lindethatsamorevdl
 
DeFeliceKitley_Resume_BFAVCDGraduated2024
DeFeliceKitley_Resume_BFAVCDGraduated2024DeFeliceKitley_Resume_BFAVCDGraduated2024
DeFeliceKitley_Resume_BFAVCDGraduated2024KitleyDeFelice
 
Russian ℂall Girls Vijay Nagar Hire Me Neha 96XXXXXXX Top Class ℂall Girl Ser...
Russian ℂall Girls Vijay Nagar Hire Me Neha 96XXXXXXX Top Class ℂall Girl Ser...Russian ℂall Girls Vijay Nagar Hire Me Neha 96XXXXXXX Top Class ℂall Girl Ser...
Russian ℂall Girls Vijay Nagar Hire Me Neha 96XXXXXXX Top Class ℂall Girl Ser...Mumbai Escorts
 
一比一定制英国赫特福德大学毕业证学位证书
一比一定制英国赫特福德大学毕业证学位证书一比一定制英国赫特福德大学毕业证学位证书
一比一定制英国赫特福德大学毕业证学位证书AS
 
Gabriel Van Konynenburg _ Portfolio Visual
Gabriel Van Konynenburg _ Portfolio VisualGabriel Van Konynenburg _ Portfolio Visual
Gabriel Van Konynenburg _ Portfolio Visualgabbyvan28
 

Recently uploaded (20)

obat aborsi rembang wa 081336238223 jual obat aborsi cytotec asli di rembang9...
obat aborsi rembang wa 081336238223 jual obat aborsi cytotec asli di rembang9...obat aborsi rembang wa 081336238223 jual obat aborsi cytotec asli di rembang9...
obat aborsi rembang wa 081336238223 jual obat aborsi cytotec asli di rembang9...
 
prodtion diary updated.pptxyyghktyuitykiyu
prodtion diary updated.pptxyyghktyuitykiyuprodtion diary updated.pptxyyghktyuitykiyu
prodtion diary updated.pptxyyghktyuitykiyu
 
Santa Monica College - M4MH - 5.13.24 - Presentation.pdf
Santa Monica College - M4MH - 5.13.24 - Presentation.pdfSanta Monica College - M4MH - 5.13.24 - Presentation.pdf
Santa Monica College - M4MH - 5.13.24 - Presentation.pdf
 
一比一原版西悉尼大学毕业证(UWS毕业证)成绩单如可办理
一比一原版西悉尼大学毕业证(UWS毕业证)成绩单如可办理一比一原版西悉尼大学毕业证(UWS毕业证)成绩单如可办理
一比一原版西悉尼大学毕业证(UWS毕业证)成绩单如可办理
 
Hosewife Bangalore Just VIP Btm Layout 100% Genuine at your Door Step
Hosewife Bangalore Just VIP Btm Layout 100% Genuine at your Door StepHosewife Bangalore Just VIP Btm Layout 100% Genuine at your Door Step
Hosewife Bangalore Just VIP Btm Layout 100% Genuine at your Door Step
 
一比一原版(MQU毕业证书)麦考瑞大学毕业证成绩单原件一模一样
一比一原版(MQU毕业证书)麦考瑞大学毕业证成绩单原件一模一样一比一原版(MQU毕业证书)麦考瑞大学毕业证成绩单原件一模一样
一比一原版(MQU毕业证书)麦考瑞大学毕业证成绩单原件一模一样
 
obat aborsi klaten wa 081336238223 jual obat aborsi cytotec asli di klaten54-...
obat aborsi klaten wa 081336238223 jual obat aborsi cytotec asli di klaten54-...obat aborsi klaten wa 081336238223 jual obat aborsi cytotec asli di klaten54-...
obat aborsi klaten wa 081336238223 jual obat aborsi cytotec asli di klaten54-...
 
prodtion diary final ultima.pptxoiu8edrfgrh
prodtion diary final ultima.pptxoiu8edrfgrhprodtion diary final ultima.pptxoiu8edrfgrh
prodtion diary final ultima.pptxoiu8edrfgrh
 
obat aborsi wonogiri wa 081336238223 jual obat aborsi cytotec asli di wonogir...
obat aborsi wonogiri wa 081336238223 jual obat aborsi cytotec asli di wonogir...obat aborsi wonogiri wa 081336238223 jual obat aborsi cytotec asli di wonogir...
obat aborsi wonogiri wa 081336238223 jual obat aborsi cytotec asli di wonogir...
 
Our great adventures in Warsaw - On arrival training
Our great adventures in Warsaw - On arrival trainingOur great adventures in Warsaw - On arrival training
Our great adventures in Warsaw - On arrival training
 
Abstract Arch Design Wall Cladding - by Stone Art by SKL
Abstract Arch Design Wall Cladding -  by Stone Art by SKLAbstract Arch Design Wall Cladding -  by Stone Art by SKL
Abstract Arch Design Wall Cladding - by Stone Art by SKL
 
Museum Quality | PrintAction.pdf
Museum Quality | PrintAction.pdfMuseum Quality | PrintAction.pdf
Museum Quality | PrintAction.pdf
 
obat aborsi Cibitung wa 082223595321 jual obat aborsi cytotec asli di Cibitung
obat aborsi Cibitung wa 082223595321 jual obat aborsi cytotec asli di Cibitungobat aborsi Cibitung wa 082223595321 jual obat aborsi cytotec asli di Cibitung
obat aborsi Cibitung wa 082223595321 jual obat aborsi cytotec asli di Cibitung
 
The Adventurer's Guide Book by Amoré van der Linde
The Adventurer's Guide Book by Amoré van der LindeThe Adventurer's Guide Book by Amoré van der Linde
The Adventurer's Guide Book by Amoré van der Linde
 
DeFeliceKitley_Resume_BFAVCDGraduated2024
DeFeliceKitley_Resume_BFAVCDGraduated2024DeFeliceKitley_Resume_BFAVCDGraduated2024
DeFeliceKitley_Resume_BFAVCDGraduated2024
 
Russian ℂall Girls Vijay Nagar Hire Me Neha 96XXXXXXX Top Class ℂall Girl Ser...
Russian ℂall Girls Vijay Nagar Hire Me Neha 96XXXXXXX Top Class ℂall Girl Ser...Russian ℂall Girls Vijay Nagar Hire Me Neha 96XXXXXXX Top Class ℂall Girl Ser...
Russian ℂall Girls Vijay Nagar Hire Me Neha 96XXXXXXX Top Class ℂall Girl Ser...
 
一比一定制英国赫特福德大学毕业证学位证书
一比一定制英国赫特福德大学毕业证学位证书一比一定制英国赫特福德大学毕业证学位证书
一比一定制英国赫特福德大学毕业证学位证书
 
Gabriel Van Konynenburg _ Portfolio Visual
Gabriel Van Konynenburg _ Portfolio VisualGabriel Van Konynenburg _ Portfolio Visual
Gabriel Van Konynenburg _ Portfolio Visual
 
Apotik Jual Obat Aborsi asli Malang, Wa : 085180626899 - Penjual obat Cytotec...
Apotik Jual Obat Aborsi asli Malang, Wa : 085180626899 - Penjual obat Cytotec...Apotik Jual Obat Aborsi asli Malang, Wa : 085180626899 - Penjual obat Cytotec...
Apotik Jual Obat Aborsi asli Malang, Wa : 085180626899 - Penjual obat Cytotec...
 
batwhls
batwhlsbatwhls
batwhls
 

subdiff_prox.pdf

  • 1. Subdifferentials and Proximal Operators Pontus Giselsson 1
  • 2. Learning goals • Be able to derive subdifferential and proximal operator formulas • Understand that subdifferentials define affine minorizors • Existence of subgradient for convex functions • Understand maximal monotonicity and Minty’s theorem • Know strong monotonicity and relation to strong convexity • Know different characterizations of smoothness • Understand and be able to use Fermat’s rule • Know subdifferential calculus rules • Understand that prox evaluates subdifferential 2
  • 4. Gradients of convex functions • Recall: A differentiable function f : Rn → R is convex iff f(y) ≥ f(x) + ∇f(x)T (y − x) for all x, y ∈ Rn f(y) f(x) + ∇f(x)T (y − x) (∇f(x), −1) (x, f(x)) • Function f has for all x ∈ Rn an affine minorizer that: • has slope s defined by ∇f • coincides with function f at x • defines normal (∇f(x), −1) to epigraph of f • What if function is nondifferentiable? 4
  • 5. Subdifferentials and subgradients • Subgradients s define affine minorizers to the function that: (s, −1) (s, −1) (s, −1) • coincide with f at x • define normal vector (s, −1) to epigraph of f • can be one of many affine minorizers at nondifferentiable points x • Subdifferential of f : Rn → R at x is set of vectors s satisfying f(y) ≥ f(x) + sT (y − x) for all y ∈ Rn , (1) • Notation: • subdifferential: ∂f : Rn → 2Rn (power-set notation 2Rn ) • subdifferential at x: ∂f(x) = {s : (1) holds} • elements s ∈ ∂f(x) are called subgradients of f at x 5
  • 6. Relation to gradient • If f differentiable at x and ∂f(x) 6= ∅ then ∂f(x) = {∇f(x)}: (s, −1) x1 (∇f(x2), −1) x2 (∇f(x3), −1) x3 • i.e., subdifferential (if nonempty) at x consists of only gradient 6
  • 7. Subgradient existence – Nonconvex example • Function can be differentiable at x but ∂f(x) = ∅ x1 x2 x3 • x1: ∂f(x1) = {0}, ∇f(x1) = 0 • x2: ∂f(x2) = ∅, ∇f(x2) = 0 • x3: ∂f(x3) = ∅, ∇f(x3) = 0 • Gradient is a local concept, subdifferential is a global property 7
  • 8. Subgradient existence – Convex example • Consider the convex function: x1 x2 x3 f(x) = |x| • What are the subdifferentials at points x1, x2, x3? • Subdifferential at x1 is -1 (affine minorizer with slope -1) • Subdifferential at x2 is [-1,1] (affine minorizers with slope [-1,1]) • Subdifferential at x3 is 1 (affine minorizer with slope 1) Fact: • For finite-valued convex functions, a subgradient exists for every x 8
  • 9. Existence for extended-valued convex functions • Let f : Rn → R ∪ {∞} be convex, then: 1. Subgradients exist for all x in relative interior of domf 2. Subgradients sometimes exist for x on boundary of domf 3. No subgradient exists for x outside domf • Examples for second case, boundary points of domf: − √ 1 − x2 + ι[−1,1](x) x2 + ι[−2,2](x) • No subgradient (affine minorizer) exists for left function at x = 1 9
  • 10. Monotonicity • Subdifferential operator is monotone: (sx − sy)T (x − y) ≥ 0 for all sx ∈ ∂f(x) and sy ∈ ∂f(y) • Proof: Add two copies of subdifferential definition f(y) ≥ f(x) + sT x (y − x) with x and y swapped • ∂f : R → 2R : Minimum slope 0 and maximum slope ∞ ∂f x 10
  • 11. Monotonicity beyond subdifferentials • Let A : Rn → 2Rn be monotone, i.e.: (u − v)T (x − y) ≥ 0 for all u ∈ Ax and v ∈ Ay • If n = 1, then A = ∂f for some function f : R → R ∪ {∞} • If n ≥ 2 there exist monotone A that are not subdifferentials 11
  • 12. Maximal monotonicity • Let the set gph ∂f := {(x, u) : u ∈ ∂f(x)} be the graph of ∂f • ∂f is maximally monotone if no other function g exists with gph ∂f ⊂ gph ∂g, with strict inclusion • A result (due to Rockafellar): f is closed convex if and only if ∂f is maximally monotone 12
  • 13. Minty’s theorem • Let ∂f : Rn → 2Rn and α > 0 • ∂f is maximally monotone if and only if range(αI + ∂f) = Rn ∂f1 x maximally monotone ∂f2 x not maximally monotone ∂f1 + αI x full range ∂f2 + αI x not full range • Interpretation: No “holes” in gph ∂f 13
  • 14. Strong convexity • Recall that f is σ-strongly convex if f − σ 2 k · k2 2 is convex • If f is σ-strongly convex then f(y) ≥ f(x) + sT (y − x) + σ 2 kx − yk2 2 holds for all x ∈ dom∂f, s ∈ ∂f(x), and y ∈ Rn • The function has convex quadratic minorizers instead of affine f(y) f(x1) + sT 1 (y − x1) + σ 2 kx1 − yk2 2 x1 (s1, −1) f(x2) + sT 2,1(y − x2) + σ 2 kx2 − yk2 2 (s2,1, −1) f(x2) + sT 2,2(y − x2) + σ 2 kx2 − yk2 2 x2 (s2,2, −1) • Multiple lower bounds at x2 with subgradients s2,1 and s2,2 14
  • 15. Strong monotonicity • If f σ-strongly convex function, then ∂f is σ-strongly monotone: (sx − sy)T (x − y) ≥ σkx − yk2 2 for all sx ∈ ∂f(x) and sy ∈ ∂f(y) • Proof: Add two copies of strong convexity inequality f(y) ≥ f(x) + sT x (y − x) + σ 2 kx − yk2 2 with x and y swapped • ∂f is σ-strongly monotone if and only if ∂f − σI is monotone • ∂f : R → 2R : Minimum slope σ and maximum slope ∞ ∂f x 15
  • 16. Strongly convex functions – An equivalence The following are equivalent (i) f is closed and σ-strongly convex (ii) ∂f is maximally monotone and σ-strongly monotone Proof: (i)⇒(ii): we know this from before (ii)⇒(i): (ii) ⇒ ∂f − σI = ∂(f − σ 2 k · k2 2) maximally monotone ⇒ f − σ 2 k · k2 2 closed convex ⇒ f closed and σ-strongly convex 16
  • 17. Smoothness and convexity • A differentiable function f : Rn → R is convex and smooth if f(y) ≤ f(x) + ∇f(x)T (y − x) + β 2 kx − yk2 2 f(y) ≥ f(x) + ∇f(x)T (y − x) holds for all x, y ∈ Rn • f has convex quadratic majorizers and affine minorizers f(x1) + ∇f(x1)T (y − x1) + β 2 kx1 − yk2 2 x1 (∇f(x2), −1) f(x2) + ∇f(x2)T (y − x2) + β 2 kx2 − yk2 2 x2 (∇f(x2), −1) f(y) • Quadratic upper bound is called descent lemma 17
  • 18. Gradient of smooth convex function • Gradient of smooth convex function is monotone and Lipschitz (∇f(x) − ∇f(y))T (x − y) ≥ 0 k∇f(y) − ∇f(x)k2 ≤ βkx − yk2 • ∇f : R → R: Minimum slope 0 and maximum slope β ∇f x • Actually satisfies the stronger 1 β -cocoercivity property: (∇f(x) − ∇f(y))T (x − y) ≥ 1 β k∇f(y) − ∇f(x)k2 2 18
  • 19. Smooth convex functions – Equivalences Let f : Rn → R be differentiable. The following are equivalent: (i) ∇f is 1 β -cocoercive (ii) ∇f is maximally monotone and β-Lipschitz continuous (iii) f is closed convex and satisfies descent lemma (is β-smooth) • Implication (ii)⇒(i) is called the Baillon-Haddad theorem • Will connect smoothness and strong convexity via conjugates in next lecture 19
  • 20. Fermat’s rule Let f : Rn → R ∪ {∞}, then x minimizes f if and only if 0 ∈ ∂f(x) • Proof: x minimizes f if and only if f(y) ≥ f(x) + 0T (y − x) for all y ∈ Rn which by definition of subdifferential is equivalent to 0 ∈ ∂f(x) • Example: several subgradients at solution, including 0 (0, −1) 20
  • 21. Fermat’s rule – Nonconvex example • Fermat’s rule holds also for nonconvex functions • Example: x1 x2 (0, −1) • ∂f(x1) = 0 and ∇f(x1) = 0 (global minimum) • ∂f(x2) = ∅ and ∇f(x2) = 0 (local minimum) • For nonconvex f, we can typically only hope to find local minima 21
  • 22. Subdifferential calculus rules • Subdifferential of sum ∂(f1 + f2) • Subdifferential of composition with matrix ∂(g ◦ L) 22
  • 23. Subdifferential of sum If f1, f2 closed convex and relint domf1 ∩ relint domf2 6= ∅: ∂(f1 + f2) = ∂f1 + ∂f2 • One direction always holds: if x ∈ dom∂f1 ∩ dom∂f2: ∂(f1 + f2)(x) ⊇ ∂f1(x) + ∂f2(x) Proof: let si ∈ ∂fi(x), add subdifferential definitions: f1(y) + f2(y) ≥ f1(x) + f2(x) + (s1 + s2)T (y − x) i.e. s1 + s2 ∈ ∂(f1 + f2)(x) • If f1 and f2 differentiable, we have (without convexity of f) ∇(f1 + f2) = ∇f1 + ∇f2 23
  • 24. Subdifferential of composition If f closed convex and relint dom(f ◦ L) 6= ∅: ∂(f ◦ L)(x) = LT ∂f(Lx) • One direction always holds: If Lx ∈ domf, then ∂(f ◦ L)(x) ⊇ LT ∂f(Lx) Proof: let s ∈ ∂f(Lx), then by definition of subgradient of f: (f ◦ L)(y) ≥ (f ◦ L)(x) + sT (Ly − Lx) = (f ◦ L)(x) + (LT s)T (y − x) i.e., LT s ∈ ∂(f ◦ L)(x) • If f differentiable, we have chain rule (without convexity of f) ∇(f ◦ L)(x) = LT ∇f(Lx) 24
  • 25. A sufficient optimality condition Let f : Rm → R, g : Rn → R, and L ∈ Rm×n then: minimize f(Lx) + g(x) (1) is solved by every x ∈ Rn that satisfies 0 ∈ LT ∂f(Lx) + ∂g(x) (2) • Subdifferential calculus inclusions say: 0 ∈ LT ∂f(Lx) + ∂g(x) ⊆ ∂((f ◦ L)(x) + g(x)) which by Fermat’s rule is equivalent to x solution to (1) • Note: (1) can have solution but no x exists that satisfies (2) 25
  • 26. A necessary and sufficient optimality condition Let f : Rm → R, g : Rn → R, L ∈ Rm×n with f, g closed convex and assume relint dom(f ◦ L) ∩ relint domg 6= ∅ then: minimize f(Lx) + g(x) (1) is solved by x ∈ Rn if and only if x satisfies 0 ∈ LT ∂f(Lx) + ∂g(x) (2) • Subdifferential calculus equality rules say: 0 ∈ LT ∂f(Lx) + ∂g(x) = ∂((f ◦ L)(x) + g(x)) which by Fermat’s rule is equivalent to x solution to (1) • Algorithms search for x that satisfy 0 ∈ LT ∂f(Lx) + ∂g(x) 26
  • 27. A comment to constraint qualification • The condition relint dom(f ◦ L) ∩ relint domg 6= ∅ is called constraint qualification and referred to as CQ • It is a mild condition that rarely is not satisfied dom(f ◦ L) domg no solution dom(f ◦ L) domg solution no CQ dom(f ◦ L) domg solution CQ 27
  • 28. Evaluating subgradients of convex functions • Obviously need to evaluate subdifferentials to solve 0 ∈ LT ∂f(Lx) + ∂g(x) • Explicit evaluation: • If function is differentiable: ∇f (unique) • If function is nondifferentiable: compute element in ∂f • Implicit evaluation: • Proximal operator (specific element of subdifferential) 28
  • 30. Proximal operator • Proximal operator of (convex) g defined as: proxγg(z) = argmin x (g(x) + 1 2γ kx − zk2 2) where γ > 0 is a parameter • Evaluating prox requires solving optimization problem • Objective is strongly convex ⇒ solution exists and is unique 30
  • 31. Prox evaluates the subdifferential • Fermat’s rule on prox definition: x = proxγg(z) if and only if 0 ∈ ∂g(x) + γ−1 (x − z) ⇔ γ−1 (z − x) ∈ ∂g(x) Hence, γ−1 (z − x) is element in ∂g(x) • A subgradient ∂g(x) where x = proxγg(z) is computed • Often used in algorithms when g nonsmooth (no gradient exists) 31
  • 32. Prox is generalization of projection • Recall the indicator function of a set C ιC(x) := ( 0 if x ∈ C ∞ otherwise • Then ΠC(z) = argmin x (kx − zk2 : x ∈ C) = argmin x (1 2 kx − zk2 2 : x ∈ C) = argmin x (1 2 kx − zk2 2 + ιC(x)) = proxιC (z) • Projection onto C equals prox of indicator function of C 32
  • 33. Proximal operator – Example 1 Let g(x) = 1 2 xT Hx + hT x with H positive semidefinite • Gradient satisfies ∇g(x) = Hx + h • Fermat’s rule for x = proxγgz: 0 = ∇g(x) + γ−1 (x − z) ⇔ 0 = Hx + h + γ−1 (x − z) ⇔ (I + γH)x = z − γh ⇔ x = (I + γH)−1 (z − γh) • So proxγgz = (I + γH)−1 (z − γh) 33
  • 34. Proximal operator – Example 2 • Consider the function g with subdifferential ∂g: g(x) = ( −x if x ≤ 0 0 if x ≥ 0 ∂g(x) =      −1 if x < 0 [−1, 0] if x = 0 0 if x > 0 • Graphical representations (−1, −1) (−1, −1) (−0.5, −1) (0, −1) (0, −1) g(x) x ∂g(x) x • Fermat’s rule for x = proxγgz: 0 ∈ ∂g(x) + γ−1 (x − z) 34
  • 35. Proximal operator – Example 2 cont’d • Let x < 0, then Fermat’s rule reads 0 = −1 + γ−1 (x − z) ⇔ x = z + γ which is valid (x < 0) if z < −γ • Let x = 0, then Fermat’s rule reads 0 = [−1, 0] + γ−1 (0 − z) which is valid (x = 0) if z ∈ [−γ, 0] • Let x > 0, then Fermat’s rule reads 0 = 0 + γ−1 (x − z) ⇔ x = z which is valid (x > 0) if z > 0 • The prox satisfies proxγg(z) =      z + γ if z < −γ 0 if z ∈ [−γ, 0] z if z > 0 35
  • 36. Computational cost • Evaluating prox requires solving optimization problem proxγg(z) = argmin x (g(x) + 1 2γ kx − zk2 2) • Prox typically more expensive to evaluate than gradient • Example: Quadratic g(x) = 1 2 xT Hx + hT x: proxγg(z) = (I + γH)−1 (z − γh), ∇g(z) = Hz − h 36