This presentation is the final thesis presentation. The first part address the challenges in with uncertain demands in drinking water networks. In the second part we show, how demand forecasts are integrated with them. In the final part we formulate the stochastic model predictive problem and develop parallel proximal gradient methods to solve them in real-time.
Digital Communication Essentials: DPCM, DM, and ADM .pptx
Parallel methods to solve stochastic optimal control: control of drinking water networks
1. Parallel methods for solving stochastic optimal
control: Control of drinking water networks
Ajay Kumar Sampathirao
Final thesis presentation
IMT SCHOOL FOR ADVANCED STUDIES LUCCA
December 20, 2016
2. Outline
1 Introduction
2 Stochastic optimal control on DWN
3 Proximal algorithms to solve SOC on DWN
4 SOC with LBFGS-FBE
5 Conclusions
Parallel solvers for stochastic optimal control Ajay Kumar 2/36
4. Structure
water sources
DWN
consumers
bore wells, rivers,
treatment plants
industrial regions,
households
pumps, values,
storage tanks
Drinking Water Network transports water from the sources to
consumers.
Parallel solvers for stochastic optimal control Ajay Kumar 3/36
6. Problem statement
water sources
DWN
consumers
Real-time controller
Objective: Uncertainty-aware real-time controller to decrease the
operating cost and provide high quality of service.
Parallel solvers for stochastic optimal control Ajay Kumar 3/36
7. Problem statement
water sources
DWN
consumers
Real-time controller
Solution: Exploit problem structure to harness the computational
power of graphics processing units (GPUs)
Parallel solvers for stochastic optimal control Ajay Kumar 3/36
8. Novelty
Stochastic control of DWN for the first time
Efficient parallelisation of the algorithm
Real time solution of scenario-based stochastic MPC
[Goryashko and Nemirovsky ’14, J.M. Grosso et al. ’14]
[J.M. Grosso, et al. Cape town ’14]
[Hans, et al. ’15]
Parallel solvers for stochastic optimal control Ajay Kumar 4/36
9. Novelty
Stochastic control of DWN for the first time
Efficient parallelisation of the algorithm
Real time solution of scenario-based stochastic MPC
[Goryashko and Nemirovsky ’14, J.M. Grosso et al. ’14]
[J.M. Grosso, et al. Cape town ’14]
[Hans, et al. ’15]
Parallel solvers for stochastic optimal control Ajay Kumar 4/36
10. Novelty
Stochastic control of DWN for the first time
Efficient parallelisation of the algorithm
Real time solution of scenario-based stochastic MPC
[Goryashko and Nemirovsky ’14, J.M. Grosso et al. ’14]
[J.M. Grosso, et al. Cape town ’14]
[Hans, et al. ’15]
Parallel solvers for stochastic optimal control Ajay Kumar 4/36
11. Impact
Low risk operation
Reduction of operating cost
Smooth control actions
High quality of service
Preventing shortages or insufficient pressure.
Parallel solvers for stochastic optimal control Ajay Kumar 5/36
12. Impact
Low risk operation
Reduction of operating cost
Smooth control actions
High quality of service
Preventing shortages or insufficient pressure.
Parallel solvers for stochastic optimal control Ajay Kumar 5/36
13. Impact
Low risk operation
Reduction of operating cost
Smooth control actions
High quality of service
Preventing shortages or insufficient pressure.
Parallel solvers for stochastic optimal control Ajay Kumar 5/36
14. Impact
Low risk operation
Reduction of operating cost
Smooth control actions
High quality of service
Preventing shortages or insufficient pressure.
Parallel solvers for stochastic optimal control Ajay Kumar 5/36
21. Our case study
DWN of Barcelona: 63 tanks, 114 pumping stations and valves, 88 demand nodes & 17 pipe intersection nodes.
Parallel solvers for stochastic optimal control Ajay Kumar 7/36
22. Control-oriented models
Simple mass balance equation (in discrete time)
xk+1 = Axk + Buk + Gd dk,
0 = Euk + Ed dk,
xk: tank volumes, uk: flows (controlled by pumping), dk: demands
— along with the constraints
xmin ≤ xk ≤ xmax,
umin ≤ uk ≤ umax.
Sampathirao, Sopasakis et al., 2016, 2014; Wang et al., 2014; Ocampo et al., 2010; Ocampo et al., 2009.
Parallel solvers for stochastic optimal control Ajay Kumar 8/36
23. Demand forecasting
Demand prediction concept:
dk+j ( j ) = ˆdk+j|k + j
where
dk+j : actual demand at time k + j
ˆdk+j|k: prediction of dk+j using info up to time k
Time series models like SARIMA, BATS, SVM are used to
predict the ˆdk+j|k
j : j-step-ahead prediction error
and ˆdk+j|k is a function of observable quantities up to time k.
Sampathirao, Sopasakis et al., 2014.
Parallel solvers for stochastic optimal control Ajay Kumar 9/36
24. Demand forecasting
Demand prediction concept:
dk+j ( j ) = ˆdk+j|k + j
Neglect the error:( 0, 1, . . . , N) (0, 0, . . . , 0) → Fails in practice
Parallel solvers for stochastic optimal control Ajay Kumar 9/36
25. Demand forecasting
Demand prediction concept:
dk+j ( j ) = ˆdk+j|k + j
Error bounds: ( 0, 1, . . . , N) ∈ E, e.g., j ∈ [ min
j , max
j ] → Too
conservative
Parallel solvers for stochastic optimal control Ajay Kumar 9/36
26. Demand forecasting
Demand prediction concept:
dk+j ( j ) = ˆdk+j|k + j
Independent normal distributions: j ∼ N(mj , σ2
j ) → Neglects
important inter-stage correlations (not realistic)
Parallel solvers for stochastic optimal control Ajay Kumar 9/36
27. Demand forecasting
Demand prediction concept:
dk+j ( j ) = ˆdk+j|k + j
( 0, 1, . . . , N) is random and admits finitely many values
Time [hr]
4370 4380 4390 4400 4410 4420 4430
Waterdemand[m3
/s]
0.01
0.015
0.02
0.025
0.03
0.035
0.04
0.045
0.05
Independent of the
distribution
Trade-off between
performance and
reliability
Parallel solvers for stochastic optimal control Ajay Kumar 9/36
28. Control objectives
Stage costs:
Economic cost: w (uk, k) = Wα(α1 + α2,k) uk
Smooth operation cost: ∆(∆uk) = ∆ukWu∆uk
Safe operation cost: S (xk) = Wx [xs − xk]+
Total cost: = w + ∆ + S .
We define ∆uk = uk − uk−1
Sampathirao et al., 2014; Cong Cong et al., 2014
Parallel solvers for stochastic optimal control Ajay Kumar 10/36
29. Control objectives
Stage costs:
Economic cost: w (uk, k) = Wα(α1 + α2,k) uk
Smooth operation cost: ∆(∆uk) = ∆ukWu∆uk
Safe operation cost: S (xk) = Wx [xs − xk]+
Total cost: = w + ∆ + S .
We define ∆uk = uk − uk−1
Sampathirao et al., 2014; Cong Cong et al., 2014
Parallel solvers for stochastic optimal control Ajay Kumar 10/36
30. Control objectives
Stage costs:
Economic cost: w (uk, k) = Wα(α1 + α2,k) uk
Smooth operation cost: ∆(∆uk) = ∆ukWu∆uk
Safe operation cost: S (xk) = Wx [xs − xk]+
Total cost: = w + ∆ + S .
We define ∆uk = uk − uk−1
Sampathirao et al., 2014; Cong Cong et al., 2014
Parallel solvers for stochastic optimal control Ajay Kumar 10/36
31. Control objectives
Stage costs:
Economic cost: w (uk, k) = Wα(α1 + α2,k) uk
Smooth operation cost: ∆(∆uk) = ∆ukWu∆uk
Safe operation cost: S (xk) = Wx [xs − xk]+
Total cost: = w + ∆ + S .
We define ∆uk = uk − uk−1
Sampathirao et al., 2014; Cong Cong et al., 2014
Parallel solvers for stochastic optimal control Ajay Kumar 10/36
32. Stochastic MPC
SMPC numerically solves a
finite horizon stochastic
optimal control problem
We are looking for control
policies
uk+i|k = φ(xi , ui−1, i )
Expectation operator use to
quantify the random stage
cost
past future
past future
cost
xi = {xk , xk+1|k . . . xk+i|k ; ui = {uk , uk+1|k . . . uk+i|k }; i = { 0, 1 . . . k }
Parallel solvers for stochastic optimal control Ajay Kumar 11/36
33. Stochastic MPC
Problem formulation
minimise
π=({uk+j|k }j ,{xk+j|k }j )
EV (π),
subject to
xk+j+1|k = Axk+j|k + Buk+j|k + Gd
ˆdk+j|k( j ) dynamics
Euk+j|k + Ed dk+j ( j ) = 0 alg. cond.
xmin ≤ xk+j|k ≤ xmax vol. constr.
umin ≤ uk+j|k ≤ umax flow constr.
xk|k = xk, uk−1|k = uk−1 initial cond.
we’re looking for control laws uk+j|k ( j ).
Parallel solvers for stochastic optimal control Ajay Kumar 12/36
34. Scenario trees
Structure to describe the evolution of
uncertainty in finite dimensions
Parallel solvers for stochastic optimal control Ajay Kumar 13/36
38. Scenario-based stochastic MPC
Problem formulation:
minimise
π=({uk+ji |k
}i,j ,{xk+ji |k
}i,j )
EV (π)
N−1
j=0
µ(j)
i=1
pi
j
i
j +
soft constraints
N
j=0
µ(j)
i=1
d
(xi
k+j|k),
subject to
xi
k+j+1|k=f (x
anc(j+1,i)
k+j|k , ui
k+j|k, di
k+j|k) dynamics
Eui
k+j|k + Ed
ˆdi
k+j|k = 0 algebraic
umin ≤ ui
k+j|k ≤ umax flow constr.
x1
k|k = xk, u1
k−1|k = uk−1 initial cond.
d
(x) = γd dist(x | X) where X = {x : xmin ≤ x ≤ xmax }
Parallel solvers for stochastic optimal control Ajay Kumar 14/36
39. Proximal algorithms to solve SOC on DWN
Parallel solvers for stochastic optimal control Ajay Kumar 14/36
40. Preliminaries
Proximal operator: Let f : IRn
→ IR ∪ {+∞} be a proper, closed
function and γ > 0. Define
proxγf (v) = arg min
z
f (z) + 1
2γ z − v 2
Convex conjugate: Let f : IR → IR ∪ {+∞} be convex, proper
and closed. We define its convex conjugate as
f ∗
(y) = sup
x
x y − f (x) .
When f ∗ is differentiable, then
f ∗
(y) = arg min
x
{x y − f (x)}.
Parikh and Boyd, 2014; Combettes and Pesquet, 2010.
conjugate-subgradient theorem (Rockafellar, 1972.)
Parallel solvers for stochastic optimal control Ajay Kumar 15/36
41. Preliminaries
Proximal operator: Let f : IRn
→ IR ∪ {+∞} be a proper, closed
function and γ > 0. Define
proxγf (v) = arg min
z
f (z) + 1
2γ z − v 2
Convex conjugate: Let f : IR → IR ∪ {+∞} be convex, proper
and closed. We define its convex conjugate as
f ∗
(y) = sup
x
x y − f (x) .
When f ∗ is differentiable, then
f ∗
(y) = arg min
x
{x y − f (x)}.
Parikh and Boyd, 2014; Combettes and Pesquet, 2010.
conjugate-subgradient theorem (Rockafellar, 1972.)
Parallel solvers for stochastic optimal control Ajay Kumar 15/36
42. Preliminaries
Proximal operator: Let f : IRn
→ IR ∪ {+∞} be a proper, closed
function and γ > 0. Define
proxγf (v) = arg min
z
f (z) + 1
2γ z − v 2
Convex conjugate: Let f : IR → IR ∪ {+∞} be convex, proper
and closed. We define its convex conjugate as
f ∗
(y) = sup
x
x y − f (x) .
When f ∗ is differentiable, then
f ∗
(y) = arg min
x
{x y − f (x)}.
Parikh and Boyd, 2014; Combettes and Pesquet, 2010.
conjugate-subgradient theorem (Rockafellar, 1972.)
Parallel solvers for stochastic optimal control Ajay Kumar 15/36
43. Preliminaries
Properties of proximal operator:
Moreau decomposition property
v =proxγf (v) + γproxγ−1f ∗ (γ−1
v)
Separable sum property: If
f (x) =
κ
i=1
fi (xi ),
then
(proxλf (v))i = proxλfi
(vi ).
Parikh and Boyd, 2014; Combettes and Pesquet, 2010.
Parallel solvers for stochastic optimal control Ajay Kumar 16/36
44. Proximal gradient method
Consider the optimisation problem :
P = min
x
f (x) + g(x),
f : IRn
→ ¯IR is proper and 1
L -smooth convex
g : IRm
→ ¯IR is closed, proper and convex
Then proximal gradient iterate:
xν+1
= proxγg (xν
− γ f (xν
)),
with γ ∈ (0, 1
L ).
Parallel solvers for stochastic optimal control Ajay Kumar 17/36
45. Proximal gradient method
Consider the optimisation problem :
P = min
x
f (x) + g(x),
f : IRn
→ ¯IR is proper and 1
L -smooth convex
g : IRm
→ ¯IR is closed, proper and convex
Then proximal gradient iterate:
xν+1
= proxγg (xν
− γ f (xν
)),
with γ ∈ (0, 1
L ).
Parallel solvers for stochastic optimal control Ajay Kumar 17/36
46. Dual proximal gradient method
Consider the optimisation problem :
P = min
z=Hx
f (x) + g(z),
f : IRn
→ ¯IR is proper and σ-strongly convex
g : IRm
→ ¯IR is closed, proper and convex
Parallel solvers for stochastic optimal control Ajay Kumar 18/36
47. Dual proximal gradient method
Consider the optimisation problem :
P = min
z=Hx
f (x) + g(z),
f : IRn
→ ¯IR is proper and σ-strongly convex
g : IRm
→ ¯IR is closed, proper and convex
If g(x) is prox-friendly but g(Hx) is not prox-friendly
Parallel solvers for stochastic optimal control Ajay Kumar 18/36
48. Dual proximal gradient method
Consider the optimisation problem :
P = min
z=Hx
f (x) + g(z),
f : IRn
→ ¯IR is proper and σ-strongly convex
g : IRm
→ ¯IR is closed, proper and convex
If g(x) is prox-friendly but g(Hx) is not prox-friendly
Then we formulate the dual problem – Fenchel dual:
D = min
y
f ∗
(−H y) + g∗
(y),
where f ∗ has Lipschitz-continuous gradient with constant 1/σ.
Fenchel duality generalises Lagrangian duality
Parallel solvers for stochastic optimal control Ajay Kumar 18/36
49. Dual APG algorithm
Nesterov’s accelerated proximal gradient algorithm (APG) is
defined by the recursion:
vν
= yν
+ θν(θ−1
ν−1 − 1)(yν
− yν−1
),
xν
= f ∗
(−H vν
),
zν
= proxλ−1g (λ−1
vν
+ Hxν
),
yν+1
= vν
+ λ(Hzv
− tv
),
θν+1 =
1
2
( θ4
ν + 4θ2
ν − θ2
ν).
with θ0 = θ−1 = 1 and y0 = y−1 = 0 (Nesterov, 1983).
Parallel solvers for stochastic optimal control Ajay Kumar 19/36
50. Characteristics of the algorithm
Dual iterates converge at a rate of O(1/ν2)
An ergodic (averaged) primal iterate converges at a rate of
O(1/ν2)
Preconditioning is of crucial importance
Terminate the algorithm when the iterate (xν, zν) satisfies
f (x) + g(z) − P ≤ V
x − Hz ∞ ≤ g .
P. Patrinos and A. Bemporad ’2014
Parallel solvers for stochastic optimal control Ajay Kumar 20/36
51. Dual APG for DWN control problem
Objective: Decompose the scenario SMPC suitable for parallelise
Parallel solvers for stochastic optimal control Ajay Kumar 21/36
52. Dual APG for DWN control problem
Objective: Decompose the scenario SMPC suitable for parallelise
Our choice is:
f (z) = smooth cost + dynamics + alg equations
g(z) = safety cost + soft constraints + control constraints
where z = {xi
j , ui
j }
Parallel solvers for stochastic optimal control Ajay Kumar 21/36
53. Dual APG for DWN control problem
Objective: Decompose the scenario SMPC suitable for parallelise
Our choice is:
f (z) =
N−1
j=0
µ(j)
i=1
pi
j ( w
(ui
j ) + ∆
(∆ui
j )) + δ(ui
j |Φ1(di
j ))
+ δ(xi
j+1, ui
j , x
anc(j+1,i)
j |Φ2(di
j )),
g(z) =
N−1
j=0
µ(j)
i=1
S
(xi
j+1)+ d
(xi
j+1) + δ(ui
j | U),
where
Φ1(d) = {u : Eu + Ed d = 0},
Φ2(d) = {(x+
, x, u) : x+
= Ax + Bu + Gd d}
Parallel solvers for stochastic optimal control Ajay Kumar 21/36
54. Dual APG for DWN control problem
Objective: Decompose the scenario SMPC suitable for parallelise
min
z,t
f (z) + g(t),
s.t. Hz = t,
where H =
Inx
0
Inx 0
0 Inu
, t = ({xi
j }j,i , {χi
j }j,i , {ui
j }j,i ).
Parallel solvers for stochastic optimal control Ajay Kumar 21/36
55. Dual APG for DWN control problem
Objective: Decompose the scenario SMPC suitable for parallelise
min
z,t
f (z) + g(t),
s.t. Hz = t,
where H =
Inx
0
Inx 0
0 Inu
, t = ({xi
j }j,i , {χi
j }j,i , {ui
j }j,i ).
Solve using Dual APG algorithm
This requires computation of:
Dual gradient - f ∗
(−H y)
proxg (y)
Parallel solvers for stochastic optimal control Ajay Kumar 21/36
56. Dual APG for DWN control problem
Objective: Decompose the scenario SMPC suitable for parallelise
min
z,t
f (z) + g(t),
s.t. Hz = t,
where H =
Inx
0
Inx 0
0 Inu
, t = ({xi
j }j,i , {χi
j }j,i , {ui
j }j,i ).
Solve using Dual APG algorithm
This requires computation of:
Dual gradient - f ∗
(−H y)
proxg (y)
Parallel solvers for stochastic optimal control Ajay Kumar 21/36
57. Dual gradient computation
To compute the dual gradient we use
f ∗
(−H y) = arg min
z
{z Hy + f (z)}
= arg min
z:dynamics
Eui
j +Ed di
j =0
{z Hy +
j,i
quadratic(zi
j )}
This is an equality-constrained quadratic problem which can be
solved very efficiently using dynamic programming.
Parallel solvers for stochastic optimal control Ajay Kumar 22/36
58. Dual gradient computation
We can break it two parts:
Backward substitution
Forward substitution
Algorithm 1 Solve step
Require: wν
= (˜ςi
j , ˜ζi
j , ˜ψi
j ).
qi
N ← 0, and ri
N ← 0, ∀i ∈ N[1,ns ],
for j = N − 1, . . . , 0 do
for i = 1, . . . , µ(k) do {in parallel}
σl
j ← rl
j+1 + βl
j , ∀l ∈ child(j, i)
vl
j ← 1
2pl
j
(Φl
j ( ˜ξl
j + ql
j+1) + Ψl
j
˜ψl
j + Λl
j σl
j ),
∀l ∈ child(j, i)
ri
j ← l∈child(j,i) σl
j + ¯B ( ˜ξl
j +ql
j+1)+L ˜ψl
j
qi
j ← A l∈child(j,i)
˜ξl
j + ql
j+1
end for
end for
x1
0 ← p, u−1 ← q,
for j = 0, . . . , N − 1 do
for i = 1, . . . , µ(k) do {in parallel}
vi
j ← v
anc(j,i)
j−1
+ vi
j
ui
j ← Lvi
j + ˆui
j
xi
j+1 ← Ax
anc(j,i)
j
+ ¯Bvi
j + ei
j
end for
end for
return {xi
j }N
j=1, {ui
j }N−1
j=0
Parallel solvers for stochastic optimal control Ajay Kumar 23/36
59. Dual gradient computation
Backward substitution
sync
sync
stage k
stagek+1
Matrix-vector
product at the
nodes {parallel}
Algorithm 1 Solve step
Require: wν
= (˜ςi
j , ˜ζi
j , ˜ψi
j ).
qi
N ← 0, and ri
N ← 0, ∀i ∈ N[1,ns ],
for j = N − 1, . . . , 0 do
for i = 1, . . . , µ(k) do {in parallel}
σl
j ← rl
j+1 + βl
j , ∀l ∈ child(j, i)
vl
j ← 1
2pl
j
(Φl
j ( ˜ξl
j + ql
j+1) + Ψl
j
˜ψl
j + Λl
j σl
j ),
∀l ∈ child(j, i)
ri
j ← l∈child(j,i) σl
j + ¯B ( ˜ξl
j +ql
j+1)+L ˜ψl
j
qi
j ← A l∈child(j,i)
˜ξl
j + ql
j+1
end for
end for
x1
0 ← p, u−1 ← q,
for j = 0, . . . , N − 1 do
for i = 1, . . . , µ(k) do {in parallel}
vi
j ← v
anc(j,i)
j−1
+ vi
j
ui
j ← Lvi
j + ˆui
j
xi
j+1 ← Ax
anc(j,i)
j
+ ¯Bvi
j + ei
j
end for
end for
return {xi
j }N
j=1, {ui
j }N−1
j=0
Parallel solvers for stochastic optimal control Ajay Kumar 23/36
60. Dual gradient computation
Backward substitution
sync sync
sync
stage k
stagek+1
stagek-1
Matrix-vector
product at the
nodes {parallel}
Sum at the
nodes
{parallel}
Algorithm 1 Solve step
Require: wν
= (˜ςi
j , ˜ζi
j , ˜ψi
j ).
qi
N ← 0, and ri
N ← 0, ∀i ∈ N[1,ns ],
for j = N − 1, . . . , 0 do
for i = 1, . . . , µ(k) do {in parallel}
σl
j ← rl
j+1 + βl
j , ∀l ∈ child(j, i)
vl
j ← 1
2pl
j
(Φl
j ( ˜ξl
j + ql
j+1) + Ψl
j
˜ψl
j + Λl
j σl
j ),
∀l ∈ child(j, i)
ri
j ← l∈child(j,i) σl
j + ¯B ( ˜ξl
j +ql
j+1)+L ˜ψl
j
qi
j ← A l∈child(j,i)
˜ξl
j + ql
j+1
end for
end for
x1
0 ← p, u−1 ← q,
for j = 0, . . . , N − 1 do
for i = 1, . . . , µ(k) do {in parallel}
vi
j ← v
anc(j,i)
j−1
+ vi
j
ui
j ← Lvi
j + ˆui
j
xi
j+1 ← Ax
anc(j,i)
j
+ ¯Bvi
j + ei
j
end for
end for
return {xi
j }N
j=1, {ui
j }N−1
j=0
Parallel solvers for stochastic optimal control Ajay Kumar 23/36
61. Dual gradient computation
Forward substitution
sync sync
stage k
stagek-1
Matrix-vector
product control
{parallel}
Algorithm 1 Solve step
Require: wν
= (˜ςi
j , ˜ζi
j , ˜ψi
j ).
qi
N ← 0, and ri
N ← 0, ∀i ∈ N[1,ns ],
for j = N − 1, . . . , 0 do
for i = 1, . . . , µ(k) do {in parallel}
σl
j ← rl
j+1 + βl
j , ∀l ∈ child(j, i)
vl
j ← 1
2pl
j
(Φl
j ( ˜ξl
j + ql
j+1) + Ψl
j
˜ψl
j + Λl
j σl
j ),
∀l ∈ child(j, i)
ri
j ← l∈child(j,i) σl
j + ¯B ( ˜ξl
j +ql
j+1)+L ˜ψl
j
qi
j ← A l∈child(j,i)
˜ξl
j + ql
j+1
end for
end for
x1
0 ← p, u−1 ← q,
for j = 0, . . . , N − 1 do
for i = 1, . . . , µ(k) do {in parallel}
vi
j ← v
anc(j,i)
j−1
+ vi
j
ui
j ← Lvi
j + ˆui
j
xi
j+1 ← Ax
anc(j,i)
j
+ ¯Bvi
j + ei
j
end for
end for
return {xi
j }N
j=1, {ui
j }N−1
j=0
Parallel solvers for stochastic optimal control Ajay Kumar 23/36
62. Dual gradient computation
Forward substitution
sync sync
sync
stage k
stagek+1
stagek-1
Matrix-vector
product state
{parallel}
Matrix-vector
product control
{parallel}
Algorithm 1 Solve step
Require: wν
= (˜ςi
j , ˜ζi
j , ˜ψi
j ).
qi
N ← 0, and ri
N ← 0, ∀i ∈ N[1,ns ],
for j = N − 1, . . . , 0 do
for i = 1, . . . , µ(k) do {in parallel}
σl
j ← rl
j+1 + βl
j , ∀l ∈ child(j, i)
vl
j ← 1
2pl
j
(Φl
j ( ˜ξl
j + ql
j+1) + Ψl
j
˜ψl
j + Λl
j σl
j ),
∀l ∈ child(j, i)
ri
j ← l∈child(j,i) σl
j + ¯B ( ˜ξl
j +ql
j+1)+L ˜ψl
j
qi
j ← A l∈child(j,i)
˜ξl
j + ql
j+1
end for
end for
x1
0 ← p, u−1 ← q,
for j = 0, . . . , N − 1 do
for i = 1, . . . , µ(k) do {in parallel}
vi
j ← v
anc(j,i)
j−1
+ vi
j
ui
j ← Lvi
j + ˆui
j
xi
j+1 ← Ax
anc(j,i)
j
+ ¯Bvi
j + ei
j
end for
end for
return {xi
j }N
j=1, {ui
j }N−1
j=0
Parallel solvers for stochastic optimal control Ajay Kumar 23/36
63. Implementation
Implementation on NVIDIA Tesla
2075
Suitable for simple operations
which can be parallelisable
Programmed in CUDA
environment with cuBLAS library
CUDA 6.0 developed by Nvidia corp.
Parallel solvers for stochastic optimal control Ajay Kumar 24/36
65. Simulations & Assesment
Computational time
Economic utility
Quality of service
Network utility
Smoothness in operation
Measured in KPIs
Parallel solvers for stochastic optimal control Ajay Kumar 25/36
66. Computational time
0 100 200 300 400 500
# scenarios
0
200
400
600
800
runtime(s)
Gurobi APG
0 500
0
100
200
300
slope
GPU-APG is 14.6 times faster than Gurobi
Parallel solvers for stochastic optimal control Ajay Kumar 25/36
67. Key Performance Indicator
0 100 200 300 400 500
Number of scenarios
1.5
1.55
1.6
1.65
1.7
1.75
1.8
1.85
KPIE
/1000
0
1
2
3
4
5
6
7
KPI
S
/1000
KPIE
KPI
S
Trade-off between risk and performance in terms of number
of scenarios
Parallel solvers for stochastic optimal control Ajay Kumar 25/36
69. Motivation
Limitations of APG aka forward-backward splitting:
Does not include curvature information
Prone to ill-conditioning
Not suitable for high precision solution
Parallel solvers for stochastic optimal control Ajay Kumar 26/36
70. Problem statement
Limitations of APG aka forward-backward splitting:
Does not include curvature information
Prone to ill-conditioning
Not suitable for high precision solution
Objective: Develop algorithms which can be parallelisable as APG
and can provide high precision solution.
Parallel solvers for stochastic optimal control Ajay Kumar 26/36
71. Problem statement
Limitations of APG aka forward-backward splitting:
Does not include curvature information
Prone to ill-conditioning
Not suitable for high precision solution
Objective: Develop algorithms which can be parallelisable as APG
and can provide high precision solution.
Solution: Extend quasi-newton techniques to solve the stochastic
optimal problem.
Parallel solvers for stochastic optimal control Ajay Kumar 26/36
72. Forward Backward Envelope
Consider
ϕ(x) = f (x) + g(x)
where
f is smooth, closed, proper and convex
g is closed, proper and convex
Stella et al., 2016 arXiv:1604.08096; Patrinos and Bemporad, 2013.
Parallel solvers for stochastic optimal control Ajay Kumar 27/36
73. Forward Backward Envelope
Consider
ϕ(x) = f (x) + g(x)
where
f is smooth, closed, proper and convex
g is closed, proper and convex
Forward Backward Envelope is:
ϕγ(x) = min
z
f (x) + f (x), z − x + 1
2γ z − x 2
+ g(z)
Stella et al., 2016 arXiv:1604.08096; Patrinos and Bemporad, 2013.
Parallel solvers for stochastic optimal control Ajay Kumar 27/36
74. Forward Backward Envelope
Key properties
The FBE ϕγ is always
real-valued and
inf ϕ = inf ϕγ
arg min ϕ = arg min ϕγ
If f ∈ C2
then ϕγ ∈ C1
and
ϕγ(x) = (I−γ 2
f (x))Rγ(x),
where Rγ(x) = x − proxλg x
x
ϕ(x)
ϕγ(x)
ϕ
ϕγ
we can minimise ϕγ instead of minimising ϕ
Parallel solvers for stochastic optimal control Ajay Kumar 28/36
75. Forward Backward Envelope
Key properties
The FBE ϕγ is always
real-valued and
inf ϕ = inf ϕγ
arg min ϕ = arg min ϕγ
If f ∈ C2
then ϕγ ∈ C1
and
ϕγ(x) = (I−γ 2
f (x))Rγ(x),
where Rγ(x) = x − proxλg x
x
ϕ(x)
ϕγ(x)
ϕ
ϕγ
we can minimise ϕγ instead of minimising ϕ
Parallel solvers for stochastic optimal control Ajay Kumar 28/36
76. LBFGS on FBE
Algorithm 2 Forward-Backward L-BFGS
choose γ ∈ (0, 1/L), x0
, m (memory), (tolerance)
Initialize an LBFGS buffer with memory m
while Rγ (xk
) > do
dk
← −Bk
ϕγ (xν
) (Using the LBFGS buffer)
xk+1
← xk
+ τk dk
, τk : satisfies Wolfe conditions
sk
← xk+1
− xk
, qk
← ϕγ (xk+1
) − ϕγ (xk
), ρk ← sk
, qk
if ρk > 0 then
Push (sk
, qk
, ρk
) into the LBFGS buffer
end if
end while
Any direction dk can be used (LBFGS, nonlinear CG, etc)
In practice it is very fast
ϕ(xk) converges to ϕ as O(1/k) when ϕ has bounded level
set; linear convergence when ϕ is strongly convex
Parallel solvers for stochastic optimal control Ajay Kumar 29/36
77. Global FBE
Algorithm 3 Global Forward-Backward L-BFGS
choose γ ∈ (0, 1/L), x0
, m (memory), (tolerance)
Initialize an LBFGS buffer with memory m
while Rγ (xk
) > do
dk
← −Bk
ϕγ (xν
) (Using the LBFGS buffer)
wk
← xk
+ τk dk
, so that ϕγ (wk
) ≤ ϕγ (xk
)
xk+1
← proxγg (wk
− γ f (wk
))
sk
← xk+1
− xk
, qk
← ϕγ (xk+1
) − ϕγ (xk
), ρk ← sk
, qk
if ρk > 0 then
Push (sk
, qk
, ρk
) into the LBFGS buffer
end if
end while
Any direction dk can be used (LBFGS, nonlinear CG, etc)
In practice it is very fast
ϕ(xk) converges to ϕ as O(1/k) when ϕ has bounded level
set; linear convergence when ϕ is strongly convex
Parallel solvers for stochastic optimal control Ajay Kumar 29/36
78. LBFGS-FBE to solve SMPC
Goal: To develop parallel implementation of LBFGS-FBE algorithm
for Scenario-based SMPC
Parallel solvers for stochastic optimal control Ajay Kumar 30/36
79. LBFGS-FBE to solve SMPC
Goal: To develop parallel implementation of LBFGS-FBE algorithm
for Scenario-based SMPC
Consider the stochastic MPC formulation:
V (p) = min
π={uk }k=N−1
k=0
E Vf (xN, ξN) +
N−1
k=0
k(xk, uk, ξk) ,
s.t x0 = p,
xk+1 = Aξk
xk + Bξk
uk + wξk
,
Parallel solvers for stochastic optimal control Ajay Kumar 30/36
80. LBFGS-FBE to solve SMPC
Goal: To develop parallel implementation of LBFGS-FBE algorithm
for Scenario-based SMPC
Our choice of decomposition as
f (x) = smooth cost + dynamics
g(Hx) = non-smooth cost + constraints
Parallel solvers for stochastic optimal control Ajay Kumar 30/36
81. LBFGS-FBE to solve SMPC
Goal: To develop parallel implementation of LBFGS-FBE algorithm
for Scenario-based SMPC
Our choice of decomposition as
f (x)=
N−1
k=0
µk
i=1
p
i
k φ(x
i
k ,u
i
k ,i) +
µN
i=1
p
i
N φN (x
i
N , i)+δ(x|X(p)),
g(Hx)=
N−1
k=0
µk
i=1
p
i
k
¯φ(F
i
k x
i
k +G
i
k u
i
k ,i)+
µN
i=1
p
i
N
¯φN (F
i
N x
i
N , i),
where
X(p) = {x : x
j
k+1
= A
j
k
x
i
k + B
j
k
u
i
k + w
j
k
, j ∈ child(k, i)}
Parallel solvers for stochastic optimal control Ajay Kumar 30/36
82. LBFGS-FBE to solve SMPC
To apply LBFGS-FBE to scenario SMPC, we need
gradient of FBE - ϕγ
− dual gradient - f ∗
(−H y)
− Hessian-vector oracle - 2
f ∗
(−H y)d
proxg
LBFGS direction - dk
Parallel solvers for stochastic optimal control Ajay Kumar 31/36
83. LBFGS-FBE to solve SMPC
To apply LBFGS-FBE to scenario SMPC, we need
gradient of FBE - ϕγ
− dual gradient - f ∗
(−H y)
− Hessian-vector oracle - 2
f ∗
(−H y)d
proxg
LBFGS direction - dk
Computed with dynamic programming and is parellisable
Parallel solvers for stochastic optimal control Ajay Kumar 31/36
84. LBFGS-FBE to solve SMPC
To apply LBFGS-FBE to scenario SMPC, we need
gradient of FBE - ϕγ
− dual gradient - f ∗
(−H y)
− Hessian-vector oracle - 2
f ∗
(−H y)d
proxg
LBFGS direction - dk
Computed with an algorithm similar to f ∗(−H y) and have the
same computational complexity
Parallel solvers for stochastic optimal control Ajay Kumar 31/36
85. LBFGS-FBE to solve SMPC
To apply LBFGS-FBE to scenario SMPC, we need
gradient of FBE - ϕγ
− dual gradient - f ∗
(−H y)
− Hessian-vector oracle - 2
f ∗
(−H y)d
proxg
LBFGS direction - dk
Extra computations than APG: Hessian-vector oracle + LBFGS
algorithm
Parallel solvers for stochastic optimal control Ajay Kumar 31/36
86. Convergence
50 100 150 200
iterations
10 -3
10 -2
10 -1
10 0
Rλ
LBFGS (FBE)
PR+ (FBE)
APG
Parallel solvers for stochastic optimal control Ajay Kumar 32/36
87. Flop counts
64 128 256 512 1024
0
200
400
600
800
MFLOP
average
LBFGS (FBE)
PR+ (FBE)
APG
64 128 256 512 1024
# scenarios
0
2
4
6
GFLOP
maximum
Parallel solvers for stochastic optimal control Ajay Kumar 33/36
88. Runtimes in CUDA
6 8 10 12 14
log 2
(scenarios)
10 -2
10 -1
10 0
10 1
10 2
runtime(s)
LBFGS (Global)
APG
Gurobi
Parallel solvers for stochastic optimal control Ajay Kumar 34/36
90. Contribution
Proposes real-time uncertainty aware controller to manage the
drinking water networks
Provides parallelisable algorithms to the optimisation problem
in scenario-based SMPC
Scenario-based SMPC offers a trade-off between reliability and
performance
This discuss a new class of algorithms that provides fast and
high precision solution for the scenario-based SMPC
Parallel solvers for stochastic optimal control Ajay Kumar 35/36
91. Thank you very much
Parallel solvers for stochastic optimal control Ajay Kumar 36/36
92. References (i)
• A. Sampathirao, P. Sopasakis, A. Bemporad, and P. Patrinos, “Fast parallelizable scenario-based stochastic
optimization,” EUCCO 2016, Leuven, Belgium, 2016.
• A. Sampathirao, P. Sopasakis, A. Bemporad, and P. Patrinos, “GPU-accelerated stochastic predictive control of
drinking water networks,” IEEE CST (submitted, provisionally accepted), 2016 (on arXiv).
• A. Sampathirao, J. Grosso, P. Sopasakis, C. Ocampo-Martinez, A. Bemporad, and V. Puig, “Water demand
forecasting for the optimal operation of large-scale drinking water networks: The Barcelona case study,” 19th
IFAC World Congress, pp. 10457–10462, 2014.
• A. Sampathirao, P. Sopasakis, A. Bemporad, and P. Patrinos, “Distributed solution of stochastic optimal
control problems on GPUs,” 54th IEEE Conf. Decision and Control, (Osaka, Japan), Dec 2015.
• A. Sampathirao, P. Sopasakis, A. Bemporad, and P. Patrinos, “Proximal Quasi-Newton Methods for
Scenario-based Stochastic Optimal Control,” IFAC 2017 (submitted).
• A. Goryashko and A. Nemirovski, “Robust energy cost optimization of water distribution system with uncertain
demand,” Automation and Remote Control 75(10), pp. 1754–1769, 2014.
• S. Cong Cong, S. Puig, and G. Cembrano, “Combining CSP and MPC for the operational control of water
networks: Application to the Richmond case study,” 19th IFAC World Congress, (Cape Town), pp. 6246–6251,
2014.
• J. Grosso, C. Ocampo-Mart´inez, V. Puig, and B. Joseph, “Chance-constrained model predictive control for
drinking water networks,” Journal of Process Control 24(5), pp. 504–516, 2014
Parallel solvers for stochastic optimal control Ajay Kumar 36/36
93. References (ii)
• P.L. Combettes and J.-C. Pesquet. “Proximal splitting methods in signal processing,” Technical report, 2010.
URL http://arxiv.org/abs/0912.3522v4.
• N. Parikh and S. Boyd, “Proximal algorithms,” Found. Trends Optim 1, pp. 127–239, 2014.
• R. Rockafellar and J. Wets, “Variational analysis,” Berlin: Springer-Verlag, 3rd ed., 2009.
• R. Rockafellar, “Convex analysis,” Princeton university press, 1972.
• Yu. Nesterov, “A method of solving a convex programming problem with convergence rate O(1/k2
),” Soviet
Mathematics Doklady 72(2), pp. 372–376, 1983.
• L. Stella, A. Themelis and P. Patrinos “Forward-Backward quasi-Newton methods for nonsmooth optimisation
problems”, URL arxiv.org/pdf/1604.08096v2.pdf
• P. Patrinos and A. Bemporad “Proximal Newton methods for convex composite optimisation”, 52nd IEEE
Conference on Decision and Control, pp. 2358-2363, 2013
• P. Patrinos and A. Bemporad, ”An accelerated dual gradient-projection algorithm for embedded linear model
predictive control,” IEEE Trans. Aut. Contr, vol. 59(1), pp. 18-33, 2014,
Parallel solvers for stochastic optimal control Ajay Kumar 36/36