High-Scale Power Systems:
Simulation & Optimization
Olivier Teytaud + Inria-Tao + Artelys
TAO project-team
INRIA Saclay Île-de-France
O. Teytaud, Research Fellow,
olivier.teytaud@inria.fr
http://www.lri.fr/~teytaud/
This audience ?
[Diagram: power systems, (reinforcement) learning, optimization]
Me
[Diagram: power systems, (reinforcement) learning, optimization]
Power systems
● Many jobs in AI for power systems
● Important for the economy and for the world
● Join the force, do Machine Learning for Power
Systems!
All in one slide
● Noisy optimization
● Direct Policy Search
● Model Predictive Control + receding horizon
● Reinforcement learning / Markov Decision Processes
● Stochastic (dual) dynamic programming
● Bootstrap, bias correction, sample average
approximation
● Non stochastic uncertainties (Wald, Savage)
● + power systems (stability of networks, capacity
markets, domino effect, HVDC, unit commitment,
dispatch, UC by sort)
Most important slide of this talk
Something is unclear ?
Something is wrong ?
==> INTERRUPT ME !!!
If there is a problem for you, there is
probably a problem for 50 others :-)
Be a hero! Interrupt this
presentation at least once :-)
This talk
Power systems (enough for doing RL on it)
Machine learning for Power Systems
First, the power systems part
● Maybe you don't care about power systems
● But I promise it's cool and fun :-)
Energy matters!
The “30 glorieuses” (45-73) in
some western countries:
● No unemployment
● Growth
● Baby boom
==> stop at 1973 oil crisis
==> correlation economy/energy.
Pollution is complicated: numbers
from nextbigfuture (not super recent)
The death toll is not the only criterion, and you can disagree
with these numbers (so hard to evaluate).
Coal more radioactive than nuclear power ?
Will improve ?
Specifying costs: what scientists and
engineers do not decide
● Economical costs
● Ecological cost
● Air
● CO2 (+greenhouse gas)
● Water
● Waste storage
● ...
● Externalities
● Maintenance death
● Faults, quality of service
Energy pollutes
Climate change... yes it matters, but
business as usual (will change ?)
Air pollution: kills more than AIDS + malaria ?
Coal = cheap + huge reserves.
Nuclear power: Chernobyl + Fukushima
Electricity is tricky
Too much production ? Frequency increases.
Not enough production (fault) ? Freq. decreases.
==> both can be harmful
==> electricity needs instantaneous equilibrium
==> but some energies are intermittent, volatile
(wind) or slow (coal, nuclear) and there are faults.
Comparing power plants only from the point of
view of euros per MWh = very approximate
==> study an energy mix, not a single energy source
Alternating current
Frequency must be stable.
Some power plants contribute
to real-time stabilization
(frequency++ ==> power--)
(frequency-- ==> power++)
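The rule above (frequency++ ==> power--) is often implemented as a proportional response, usually called droop control. A minimal sketch, assuming a 50 Hz nominal frequency; the function name, the gain and the plant limits are hypothetical values chosen for illustration, not figures from this talk.

```python
def droop_setpoint(base_mw, freq_hz, nominal_hz=50.0, gain_mw_per_hz=200.0,
                   min_mw=0.0, max_mw=500.0):
    """Proportional frequency response of one plant (illustrative numbers).

    Frequency above nominal lowers the output, frequency below raises it.
    The result is clipped to the plant's assumed operating range.
    """
    target = base_mw - gain_mw_per_hz * (freq_hz - nominal_hz)
    return max(min_mw, min(max_mw, target))


if __name__ == "__main__":
    for f in (49.8, 50.0, 50.2):
        print(f, droop_setpoint(300.0, f))
```

The plant only helps if it keeps some headroom below max_mw and above min_mw, which is one reason reserves have a cost.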
Plants paid for … doing “nothing” ?
Intermittent wind or variable demand
+ “prod = demand” constraint
==> need fast/reactive power plants (reserves).
These PPs need more than the market price
==> capacity market (paid for being here “in case”)
==> complex economical model in deregulated
markets
Energy is (almost) a collaborative game
Sharing (peak hours, reserves...) is great.
Excellent for renewable energies
Collaboration not that bad in Europe.
==> Towards a European energy mix (solar in the
South, wind in the North).
A possible paradigm:
●Maximize social surplus assuming collaboration
(decision variables = investments).
●Assume that the legislator will take care of
incentives.
Optimization & energy
I. The most important question in the universe
II. Examples
III. A key problem: uncertainties
IV. Algorithms
Denmark
Wind power
33% local consumption
Implies need for
● connections (++)
● storage
● and/or gas plants
==> sometimes negative prices
==> 8.4 t CO2 per person (France 6.1, USA 17.2)
Storage: electric vehicles ?
China
● Coal, massively
==> air pollution
● PV units production
cheap thanks to no environmental constraints + labor law
● Wind power + long distance DC connections
Imports from countries w/o ecological norms ?
Intoxicate babies in China rather than in
Europe ?
Chile
● Big hydropower planning
● Long term correlations
==> beautiful RL problem
France
● Nuclear (plenty)
● Exports
● Electric heating
(==> imports
during peaks)
France
Plenty of nuclear.
Little CO2 per inhabitant
(also thanks to compact towns).
Fukushima-style event in Paris ?
Terrorism risk ?
1980
2007
Germany
==> progressively
stopping nuclear PP
==> energy trading with France
==> 9.6 t CO2 per inhabitant x year ( > France)
Unit Commitment in Germany
Here !
Kerguelen: let's be crazy ?
Big surface
Wind 35 km/h frequent,
150 km/h usual,
peaks 200 km/h.
Perfect for wind power.
No consumption around.
==> H2 synthesis ? or move industries there ?
Important place for wild life.
Greenland: yet a bit more crazy
Wind power on all shores ?
Connect to America and Europe ?
(different peak hours)
Scandinavia
Still good locations
for hydropower.
Big connections to the
rest of Europe ?
Hydro storage convenient for smoothing
intermittent energy sources.
All Europe using storage in Sweden ?
Or H2 storage (not yet technically ok) ?
Beautiful problems!
● Definitely important
● All time scales
●Building PP (dozens of years)
●Building connections (dozens of years)
●Hydro planning (years)
●Nuclear planning (months, years)
●Thermal plants (hours, days)
●Faults, reserves (< second to months)
● Nonlinear effects
● Plenty of constraints (non separable!)
● High dimensional:
●action spaces (~10000)
●state spaces (~100)
Reserves (==> dispatch; frequency
control) (importance++ with renewable energies ?)
Unit commitment
[Figure: plants stacked by cost (very cheap power, cheap power, expensive power, super expensive power), with the marginal cost marked.]
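Unit commitment by sorting on marginal costs ("UC by sort" in this talk) can be sketched as a merit-order dispatch. This is a simplified illustration: the plant names, capacities and costs are invented, and start-up costs, ramping limits and network constraints are ignored.

```python
def merit_order_dispatch(plants, demand_mw):
    """Cover demand with the cheapest plants first.

    plants: list of (name, capacity_mw, marginal_cost_eur_per_mwh).
    Returns (dispatch, marginal_cost), where marginal_cost is the cost
    of the last (most expensive) plant that had to be called.
    """
    dispatch = {}
    remaining = demand_mw
    marginal_cost = None
    for name, capacity, cost in sorted(plants, key=lambda p: p[2]):
        if remaining <= 0:
            break
        used = min(capacity, remaining)
        dispatch[name] = used
        remaining -= used
        marginal_cost = cost
    if remaining > 0:
        raise ValueError("demand exceeds total capacity")
    return dispatch, marginal_cost


# Hypothetical fleet, for illustration only.
PLANTS = [("hydro", 300, 5.0), ("nuclear", 500, 10.0),
          ("coal", 400, 40.0), ("gas_peaker", 200, 120.0)]

if __name__ == "__main__":
    print(merit_order_dispatch(PLANTS, 1000))
```

In this toy fleet the marginal cost jumps as soon as demand exceeds the capacity of the cheap plants, which is what makes peak hours expensive.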
Peak shaving by pumped storage
[Figure, two slides: pumping when energy is cheap, hydro power when energy is expensive.]
Also: take care of networks!
● Domino effect
● Overloaded line
● ==> failure
● ==> other lines overloaded
● ==> other lines fail
● ==> Baouuuum!!!!!
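The domino effect above can be illustrated with a toy model. This sketch assumes that the flow of a failed line is shared equally among the surviving lines, which is a strong simplification (real flows follow the physical laws of the network); the function name and all numbers are hypothetical.

```python
def cascade(flows, capacities, first_failure):
    """Toy cascading failure: returns the lines that fail, in order.

    flows[i] and capacities[i] describe line i. When lines fail, their
    flow is split equally among the surviving lines (assumption made for
    illustration); any survivor pushed above its capacity fails next.
    """
    flows = list(flows)
    alive = set(range(len(flows)))
    failed_order = []
    to_fail = [first_failure]
    while to_fail:
        lost = 0.0
        for i in to_fail:
            alive.discard(i)
            failed_order.append(i)
            lost += flows[i]
            flows[i] = 0.0
        if not alive:
            break  # nothing left to carry the load
        share = lost / len(alive)
        for i in alive:
            flows[i] += share
        to_fail = [i for i in sorted(alive) if flows[i] > capacities[i]]
    return failed_order


if __name__ == "__main__":
    print(cascade([80, 70, 60, 50], [100, 90, 100, 100], first_failure=0))
    print(cascade([80, 70, 60, 50], [100, 150, 150, 150], first_failure=0))
```

With tight capacities the first failure takes the whole toy network down; with more margin on the other lines it stays isolated.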
United States: the 2003 blackout
● Overloaded line + bug (race condition, parallel programming)
● Domino effect! 45 million people with no
electricity (2 days), plus various damages
Example: interconnection studies
(demand levelling, stabilized supply)
The POST project – supergrids
simulation and optimization
Mature technology: HVDC links
(high-voltage direct current)
Related ideas in Asia
(more political issues)
HVDC might change the world
● Transmission networks are high voltage
alternating current
● But some connections are high voltage direct
current:
– Reduces losses (for long distance)
– Removes the need for frequency stabilization
Outline
1. Overview
2. Sequential decision making
3. Strategic decisions
4. Conclusions
Power systems: decision variables
Decisions =
● Strategic decisions (a few time steps), based on simulations of the tactical level:
●building a nuclear power plant
●building a Spain-Morocco connection
●building a wind farm
● Tactical decisions (many time steps), depending on the strategic level:
●switching on hydro PP #7 at 6:00
●switching on thermal PP #4 at 7:15
●....
Sequential decision making
● Issues
– Demand varying in time, limited predictability
– Transmission introduces constraints (no “copper plate”)
– Renewable ==> variability ++ (no “deterministic approach”)
● Methods
– Markovian assumptions ==> sometimes wrong!!!!
– Simplified models ==> Model error >> optimization error
● Approaches
● Machine Learning / Mathematical Programming
Stochastic Control
Reinforcement learning: black box stochastic control
Implicit assumption “state = observation” ?
Sometimes “state” is a huge unknown thing.
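[Diagram: a controller with memory sends commands to the system (known structure ? or black-box ?); a random process feeds random values into the system; the system returns its state and a cost; the controller receives an observation ( = state ?).]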
Hybridization reinforcement learning /
mathematical programming
● Math programming (mathematicians doing discrete-time
control)
– Nearly exact solutions for a simplified problem
– High-dimensional constrained action space
– But small state space, linearization, Markov & not anytime
==> 99% of what I've seen in industry
● Reinforcement learning (geeks doing DTC)
– Unstable :-( (except DPS)
– Small model bias
– Small / simple action space <== often the main issue
– But high dimensional state space & anytime
3 examples of algorithms
==> Model Predictive Control,
Stochastic Dynamic
Programming,
Direct Policy Search
● Anticipative solutions:
● Replace all random parts by deterministic parts
● Optimize deterministically
● Pros/Cons
● Much simpler (deterministic optimization)
● But in real life you can not guess November
rains in January
● Rather optimistic decisions
MODEL PREDICTIVE CONTROL
● Looks like pure bullshit: 100% deterministic
● Still so convenient:
● So many constraints
● Huge state spaces
● Just having a bug-free simulator is so hard
● So many uncertainties
e.g. paper “Newave vs Odin”: other methods
have worse assumptions (convexity, Markovian
random processes, etc ==> later)
MODEL PREDICTIVE CONTROL
Shrinking horizon / receding horizon
1 Assume you know the next 48 hours
2 Optimize the reward over these 48 hours
3 In fact, just apply two hours of decisions
4 Go back to 1 and “t ← t + 2hours”.
==> operational horizon = 2 hours
==> tactical horizon = 48 hours
All effects lasting more than 48 hours are
neglected !!!!
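The four steps above can be sketched on a toy reservoir that sells water at forecast prices. Everything here is a hypothetical illustration: the function names, the greedy deterministic planner, the price profile, and the assumption that realized prices equal the forecast.

```python
def plan(stock, prices, max_release=1.0):
    """Deterministic optimization over the tactical horizon:
    release water in the hours with the highest forecast prices."""
    releases = [0.0] * len(prices)
    for t in sorted(range(len(prices)), key=lambda t: -prices[t]):
        if stock <= 0:
            break
        releases[t] = min(max_release, stock)
        stock -= releases[t]
    return releases


def receding_horizon(stock, forecast, total_steps, tactical=48, operational=2):
    """forecast(t, n) returns n forecast prices starting at hour t."""
    t, revenue, applied = 0, 0.0, []
    while t < total_steps:
        prices = forecast(t, tactical)      # 1: assume the next hours are known
        releases = plan(stock, prices)      # 2: optimize over the tactical horizon
        for k in range(min(operational, total_steps - t)):
            stock -= releases[k]            # 3: apply only the first decisions
            revenue += releases[k] * prices[k]
            applied.append(releases[k])
        t += operational                    # 4: t <- t + 2 hours
    return revenue, applied, stock


def toy_forecast(t, n):
    """Hypothetical prices: low until hour 60, high afterwards."""
    return [100.0 if t + k >= 60 else 10.0 for k in range(n)]


if __name__ == "__main__":
    print(receding_horizon(4.0, toy_forecast, total_steps=72)[0])
    print(receding_horizon(4.0, toy_forecast, total_steps=72, tactical=72)[0])
```

With the 48-hour horizon the toy controller empties the reservoir at the low price, because the high prices after hour 60 lie beyond what it can see; a 72-hour horizon waits for them. This is the neglected long-term effect that a valorization of the stock tries to repair.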
Receding horizon + valorization
1 Assume you know the next 48 hours
2 Optimize the reward over these 48 hours +
bonus e.g. 5 euros per MWh in each stock
3 In fact, just apply two hours of decisions
4 Go back to 1 and “t ← t + 2hours”.
==> Much better but sometimes
still plain wrong
Receding horizon + constraint
1 Assume you know the next 48 hours
2 Optimize the reward over these 48 hours +
constraint > lower bound given by humans
(history)
3 In fact, just apply two hours of decisions
4 Go back to 1 and “t ← t + 2hours”.
==> not principled (imitation), but convenient
+ polynomial time with a linear model.
Receding horizon
+ learnt valorization
1 Assume you know the next 48 hours
2 Optimize the reward over these 48 hours +
learntFunction(currentState,stocks)
3 In fact, just apply two hours of decisions
4 Go back to 1 and “t ← t + 2hours”.
==> still polynomial if learntFunction can be encoded in
linear programming as a function of actions
==> I love that (but I might be biased :-) )
(Direct Value Search)
How to learn that ? By Direct Policy Search!
3 examples of algorithms
Model Predictive Control,
==> Stochastic Dynamic
Programming,
Direct Policy Search
How to solve, simple case, three
states, 3 days, no random
process
[Figure, built over four slides: a graph with three states per day and a cost on each edge. Node values are added one column per slide: 2, 2, 2, then 3, 4, 6, then 4, 5, 7.]
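The deterministic case can be solved by backward induction: start from the last day and propagate the cost-to-go towards the first day. A minimal sketch with three states and three days; the cost matrix and terminal costs below are invented for illustration and are not the numbers of the figure.

```python
def backward_induction(costs, terminal):
    """Deterministic dynamic programming.

    costs[t][s][s2]: cost of moving from state s on day t to state s2 on day t+1.
    terminal[s]: cost of ending in state s.
    Returns (values, policy): values[t][s] is the optimal cost-to-go and
    policy[t][s] the best next state.
    """
    horizon = len(costs)
    values = [None] * horizon + [list(terminal)]
    policy = [None] * horizon
    for t in reversed(range(horizon)):
        n_states = len(costs[t])
        values[t] = [0.0] * n_states
        policy[t] = [0] * n_states
        for s in range(n_states):
            best = min(range(len(costs[t][s])),
                       key=lambda s2: costs[t][s][s2] + values[t + 1][s2])
            policy[t][s] = best
            values[t][s] = costs[t][s][best] + values[t + 1][best]
    return values, policy


# Hypothetical example: same transition costs on each of the 3 days.
STEP = [[1, 2, 3], [2, 1, 2], [3, 2, 1]]
VALUES, POLICY = backward_induction([STEP, STEP, STEP], terminal=[0, 5, 1])

if __name__ == "__main__":
    for t, row in enumerate(VALUES):
        print("day", t, "cost-to-go", row)
```

The work is proportional to days x states x states, which is why this approach stops scaling once the state has to include the history of a random process.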
This was deterministic
● How to add a random process ?
● Don't believe that the world is limited to
compact MDPs :-)
● Remember Astrom'65 ? Sometimes you need
the history of observations (or latent variables)
in the state ==> can't make optimal decisions
just with current observations.
● Build a huge tree of possible futures and
multiply nodes ?
Adding a random transition,
without Markov assumption:
growing tree
[Figure: the three-state graph of the previous slides, copied at each node of a tree of random outcomes (branches with probability 1/3 and probability 2/3).]
The huge MDP necessary for solving a non-
Markovian problem
Representation as a Markov process (a tree):
This is the representation
of the random process.
In each node, there are the
state-nodes with decision-edges.
Huge representation.
Value-based approaches intractable.
Overfitting
● Representation as a Markov process (a tree):
How do you actually make decisions when the random values
are not exactly those observed ? (heuristics...)
● Check on random realizations which have not been used for
building the tree.
● Does it work correctly ? ===> cross validation
● Overfitting = when it works only on scenarios used in the
optimization process.
(see B. Defourny, D. Ernst and L. Wehenkel, INFORMS Journal on Computing, Vol. 25(3), 2013)
SDP / SDDP
Stochastic (Dual) Dynamic Programming
● Representation of the controller with Linear Programming
(value function as piecewise linear) (often)
Maximum of linear functions = can be encoded in linear programming.
==> Each argmax is polynomial.
noise is multiplied + strict convexity required
● → ok for 100 000 decision variables per time step !!!
(tens of time steps, hundreds of plants, several
decisions each)
● but solving by expensive SDP/SDDP: slow in terms of state variables
(curse of dimensionality, exponential in state variables)
● Constraints
● Needs LP approximation: ok for you ?
● SDDP requires convex Bellman values: ok for you ?
● Needs Markov random processes: ok for you ?
(possibly after some random process extension...)
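One way to write the linear-programming encoding mentioned above, as a sketch under assumed notation: the symbols $x$ (current state), $u$ (decision), $c$, $A$, $B$, $\alpha_k$, $\beta_k$ and the auxiliary variable $\theta$ are illustrative and do not come from the slides. The cost-to-go is taken as a maximum of $K$ linear functions, and costs and transitions are assumed linear in the decision.

```latex
\begin{aligned}
V_{t+1}(x') &= \max_{k=1,\dots,K} \left( \alpha_k^\top x' + \beta_k \right) \\
u_t^*(x) &\in \arg\min_{u,\,\theta} \; c^\top u + \theta \\
\text{s.t.}\quad & \theta \ge \alpha_k^\top (A x + B u) + \beta_k, \qquad k = 1,\dots,K, \\
& u \in U \quad \text{(linear constraints)}
\end{aligned}
```

Because $\theta$ is minimized, it equals the maximum of the $K$ linear functions at the optimum, so each decision is a single linear program. The encoding relies on the convexity of the cost-to-go, which is the assumption questioned in the list above.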
Summary
● Most classical solution = SDP and variants
● Or MPC (model-predictive control), replacing
the stochastic parts by deterministic pessimistic
forecasts
3 examples of algorithms
Model Predictive Control,
Stochastic Dynamic
Programming,
==> Direct Policy Search
Direct Policy Search
● Requires a parametric controller
● Principle: optimize the parameters on
simulations (= simulation-based optim)
● Unusual in large scale Power Systems
(we will see why)
● Usual in other areas (evolutionary robotics)
Stochastic Control by DPS
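[Diagram: a controller with memory, defined by its parameters, sends commands to the system; a random process feeds random values into the system; the system returns its state and a cost.]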
Optimize the controller thanks to a simulator:
● Command = Controller(w,state,forecasts)
● Simulate( w ) = stochastic loss with parameter w
● w* = argmin [Simulate(w)] <== noisy optimization
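The three bullets above can be sketched on a toy reservoir. This is an illustration under stated assumptions: the two-parameter controller, the uniform random prices and the choice of a (1+1) evolution strategy with resampling as the noisy optimizer are all hypothetical, not the setup used in the talk.

```python
import random


def controller(w, stock, price_forecast):
    """Parametric policy: release more water when the forecast price is high."""
    raw = w[0] + w[1] * price_forecast
    return min(stock, max(0.0, min(1.0, raw)))


def simulate(w, rng, steps=50, stock=10.0):
    """One noisy evaluation of w: negative revenue on a random price path."""
    revenue = 0.0
    for _ in range(steps):
        price = rng.uniform(0.0, 1.0)
        release = controller(w, stock, price)
        stock -= release
        revenue += release * price
    return -revenue


def mean_loss(w, rng, resamplings):
    """Average several simulations to reduce the noise."""
    return sum(simulate(w, rng) for _ in range(resamplings)) / resamplings


def direct_policy_search(iterations=200, resamplings=20, sigma=0.3, seed=0):
    """(1+1) evolution strategy with resampling on the simulated loss."""
    rng = random.Random(seed)
    best_w = [0.0, 0.0]
    best_loss = mean_loss(best_w, rng, resamplings)
    for _ in range(iterations):
        candidate = [wi + sigma * rng.gauss(0.0, 1.0) for wi in best_w]
        loss = mean_loss(candidate, rng, resamplings)
        if loss <= best_loss:
            best_w, best_loss = candidate, loss
    return best_w, best_loss


if __name__ == "__main__":
    print(direct_policy_search())
```

The loss reported for the selected parameters is optimistic, since it was chosen because it looked good on its own simulations; a fair estimate needs fresh simulations, as in the overfitting discussion earlier.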
So simple.
Does not work under this simple form, when you have
large scale action spaces and/or many constraints.
Still, nice representations can make it relevant.
Direct Policy Search (DPS)
● Requires a parametric controller
e.g. neural network
Controller(w,x) =
W3+W2.tanh(W1.x+W0)
● Noisy Black-Box Optimization
● Advantages: non-linear ok, forecasts included
● Issue: too slow
hundreds of parameters for even 20 decision variables
(depends on structure)
● Idea: a special structure for DPS (inspired from SDP)
Strategy optimized given the real
forecasting module you have, given
arbitrarily precise simulations
(forecasts are inputs)
Great for fine-tuning:
1. Optimize by other approach (MPC ?)
2. Fine tune by DPS
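The neural-network controller W3+W2.tanh(W1.x+W0) can be written in a few lines. A minimal sketch in plain Python; the layer sizes in the example are assumptions chosen to show how quickly the number of parameters grows.

```python
import math


def controller(weights, x):
    """One hidden layer: W3 + W2 . tanh(W1 . x + W0).

    weights = (W0, W1, W2, W3) with
      W0: hidden biases, W1: hidden rows over inputs,
      W2: output rows over hidden units, W3: output biases.
    """
    W0, W1, W2, W3 = weights
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, W0)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, W3)]


def count_parameters(n_inputs, n_hidden, n_outputs):
    """Number of scalars the noisy optimizer has to tune."""
    return n_hidden * n_inputs + n_hidden + n_outputs * n_hidden + n_outputs


if __name__ == "__main__":
    # Assumed sizes: 10 inputs, 10 hidden units, 20 decision variables.
    print(count_parameters(10, 10, 20))
```

With these assumed sizes there are already 330 parameters for 20 decision variables, which illustrates why black-box optimization of an unstructured controller is slow.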
Noisy optimization
Two very different frameworks:
● We have a generative model (or a huge
sample)
● The problem is computational
● Gradient-based optimization, or black box
● We have a finite sample (e.g. 8 samples)
● The problem is statistical
● Let us compute the optimum on average on the
sample
● Is it really a good solution ?
Interesting papers
in recent Nips / Icml
Also results
from the 50s and 60s
Noise-free optimization
● Hessian + Gradient ==> apply Newton
H ( x(n+1) - x(n) ) = - g(n)
i.e. minimum of second order Taylor approximation
==> distance(n+1) = O(distance(n)^2)
● Only gradient: quasi-Newton
==> guess the Hessian, thanks to e.g. BFGS
==> distance(n+1)/distance(n) = o(1)
● No gradient
Evolutionary algorithms / pattern search methods
==> log || x(n) - x* || ~ -Cn
Finite differences
==> very assumption dependent
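The Newton update H ( x(n+1) - x(n) ) = - g(n) can be checked on a one-dimensional example. A minimal sketch, assuming the test function f(x) = exp(x) - 2x, chosen here only because its minimum is known to be x* = ln 2.

```python
import math


def newton_minimize(grad, hess, x0, iterations=6):
    """1-D Newton method: solve H (x_next - x) = -g at each step."""
    xs = [x0]
    for _ in range(iterations):
        x = xs[-1]
        xs.append(x - grad(x) / hess(x))
    return xs


# f(x) = exp(x) - 2x, so g(x) = exp(x) - 2 and H(x) = exp(x).
XS = newton_minimize(lambda x: math.exp(x) - 2.0,
                     lambda x: math.exp(x), x0=0.0)
ERRORS = [abs(x - math.log(2.0)) for x in XS]

if __name__ == "__main__":
    for n, e in enumerate(ERRORS):
        print(n, e)
```

Once the iterate is close to the optimum, the printed distance is roughly squared at each step, which is the quadratic rate of the noise-free case; with noisy evaluations the rates are much slower.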
Noisy black-box optimization
= request f(x) and get e.g. f(x,random)
Finite differences with noise (3rd deriv. ≠0):
● Dupac 57: log distance ~ -2/3 log(n)
● Fabian 67: log distance ~ - log(n) with sophisticated
finite differences and assuming “many” derivatives exist
● Spall 00: log distance ~ -2/3 log(n) with better
dependency in the dimension and simpler algorithm
● Recent works: evolutionary algorithms with
resamplings ==> -1/2 log(n)
● Shamir 2012: non-asymptotically, log distance ~
-1/2 log(n) (or -log(n) with quadratic functions)
Sample average approximation
Two very different frameworks:
● We have a generative model (or a huge sample)
● The problem is computational
● Gradient-based optimization
● We have a finite sample (e.g. 8 samples...)
● The problem is statistical
● Let us compute the optimum on average on the
sample
● Is it really a good solution ?
SAA: sample average
approximation
● I want x* = argmin E [ f(x) + noise(x) ]
● But I compute x = argmin g(x), with
g(x) = f(x)+noise1(x) + f(x)+noise2(x) + f(x)+noise3(x) + ... + f(x)+noiseN(x) (SAA)
● E ( noise(x) ) = 0 and noise_i i.i.d
● Then E x ≠ x*, because N is finite: bias
● Bias corr.: evaluate b = E x - x*, propose x - b ?
b = E x - x* depends on the problem:
how to evaluate it ?
● We want to know the difference between
● the optimum on average over ∞ i.i.d cases
● the optimum on average over N i.i.d cases
● Efron: let's compute the same difference for
another probability distribution, uniform over the
sample:
● the opt. on average over ∞ (=N distinct) i.i.d cases
● the opt. on average over N i.i.d cases (among N!)
Looks strange, doesn't it ?
● Efron and others designed such tools
● “Bootstrap”: find a solution with what you have
● Expensive:
● Compute the optimum x on your sample
● Compute the expected optimum x' on average on
multiple “resamplings”
● Compute b=x-Ex'
● Return x+b = 2x-Ex'
● Many other resampling methods (jackknife,
variants of bootstrap...)
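The bootstrap recipe above (compute x, estimate E x' on resamplings, return 2x - E x') can be sketched on a problem whose sample optimum has a closed form. The objective below, the average of 0.5*a*x^2 - b*x over random pairs (a, b), is a hypothetical example chosen because its optimum mean(b)/mean(a) is a ratio and is therefore biased for finite N.

```python
import random


def saa_optimum(sample):
    """argmin over x of the sample average of 0.5*a*x**2 - b*x,
    which is mean(b) / mean(a) (assumes mean(a) > 0)."""
    mean_a = sum(a for a, _ in sample) / len(sample)
    mean_b = sum(b for _, b in sample) / len(sample)
    return mean_b / mean_a


def bootstrap_corrected(sample, resamplings=1000, seed=0):
    """Bootstrap bias correction: returns x + b = 2x - E x'."""
    rng = random.Random(seed)
    n = len(sample)
    x = saa_optimum(sample)                      # optimum on the sample
    x_prime = sum(
        saa_optimum([rng.choice(sample) for _ in range(n)])
        for _ in range(resamplings)
    ) / resamplings                              # expected optimum on resamplings
    b = x - x_prime                              # b = x - E x'
    return x + b


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [(rng.uniform(0.5, 1.5), rng.uniform(0.0, 2.0)) for _ in range(8)]
    print(saa_optimum(sample), bootstrap_corrected(sample))
```

The cost is one optimization per resampling, which is why the slide calls the method expensive when each optimization is itself heavy.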
A beautiful example of a case in which
sophisticated mathematics helps
State of the art in discrete-time control, a few tools:
● Model Predictive Control:
For making a decision in a given state:
(i) do forecasts
(ii) replace random procs -> pessimistic forecasts
(iii) optimize as if deterministic problem
● Stochastic Dynamic Programming:
● ~Markov model
● Compute “cost to go” backwards
● Direct Policy Search:
● Parametric controller
● Optimized on simulations
● Problems for high-dimensional constrained action spaces
State of the art in discrete-time control, a few tools:
                            MPC             SDP          DPS
Random process                              Markovian    Ok
Long term effects           Heuristic       Ok           Ok
Constrained action spaces   Ok              Ok
Optimal if                  Deterministic   Markovian    Good structure
MPC: so convenient
DPS: good for tuning ?
DPS as a fine tuning upper layer
● Bengio '97:
● Handcraft a policy
● Smooth it, replace constants by parameters
● Optimize params by DPS
● Decock et al '13:
● construct a policy as in SDP
● but learn a parametric V by DPS (instead of back.
induction)
Basically, it never fails...
Find suboptimality, counterbalance
it by DPS...
Power systems optimization (1 slide)
Consider an electric system.
Decisions =
● Strategic decisions (a few time steps), based on simulations of the tactical level:
●building a nuclear power plant
●building a Spain-Morocco connection
●building a wind farm
● Tactical decisions (many time steps), depending on the strategic level:
●switching on hydro PP #7 at 6:00
●switching on thermal PP #4 at 7:15
●....
Outline
1. Overview
2. Sequential decision making
3. Strategic decisions
4. Conclusions
I have nightmares with
this problem.
No idea how to
tackle that.
Strategic decisions
● Typical situation
● A system, to be optimized by RL / SDP / etc
● Strategic decisions on top of it
● Non stochastic uncertainties
● Tools
● Bandits (including adversarial)
● Wald / Savage / Nash criteria
● Bilevel optimization
● Still quite an open problem
Strategic decisions
● Decisions (very simplified):
● 100% renewable + demand-side management
● Plenty of nuclear power
● 100% coal+gas
● Microgrids
● Concentrated solar power (storage with molten salt...)...
● Scenarios
● A Fukushima in Europe ?
● Little improvements of renewable ?
● No more international cooperation ?
● ...new gravity storage ? Flywheels ? compressed air ?
H2 ? fusion ? Breakthrough in PV ? Putin ?...
Strategic decisions
R(d,s) = reward for decision d in scenario s
● On average: d* = argmax_d E_s R(d,s) ==> what if we have no probabilities ?
● Worst case: d* = argmax_d min_s R(d,s)
A bit conservative ? As if Nature decides “against” us and
“after” we have chosen d.
● Regret: R'(d,s) = R(d,s) - max_d' R(d',s)
& d* = argmax_d min_s R'(d,s)
● Nash: d* = argmax_d min_s R(d,s) (d random; no
“after”) ==> stochastic strategies ? (==> fired)
Strategic decisions
● Criteria
● Best (deter.) choice for worst scenario (Wald)
● Best (deter.) choice in terms of regret (Savage)
● Best simultaneous choice (Nash)
● Combination: best simultaneous regret (Nash+Savage)
● Formally:
● Wald: argmax_c min_s Reward(c,s)
● Sav.: argmax_c min_s min_g ( Reward(c,s)-Reward(g,s) )
● Nash-Savage & Nash-Wald: allow stochastic decisions
==> strange fact: optimal policies are then stochastic...
A stochastic nuclear decision ?
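The deterministic criteria (Wald and Savage) can be computed directly from a reward table. A minimal sketch; the decision names, scenario names and rewards are invented for illustration and carry no claim about real energy choices.

```python
def wald(reward):
    """Best deterministic choice for the worst scenario:
    argmax_d min_s R(d, s)."""
    return max(reward, key=lambda d: min(reward[d].values()))


def savage(reward):
    """Best deterministic choice in terms of regret:
    argmax_d min_s ( R(d, s) - max_d' R(d', s) )."""
    scenarios = list(next(iter(reward.values())).keys())
    best = {s: max(reward[d][s] for d in reward) for s in scenarios}
    return max(reward,
               key=lambda d: min(reward[d][s] - best[s] for s in scenarios))


# Hypothetical rewards R(d, s): rows are decisions, columns are scenarios.
REWARD = {
    "decision_A": {"scenario_1": 10.0, "scenario_2": 2.0},
    "decision_B": {"scenario_1": 6.0, "scenario_2": 5.0},
    "decision_C": {"scenario_1": 4.0, "scenario_2": 4.5},
}

if __name__ == "__main__":
    print("Wald:", wald(REWARD))
    print("Savage:", savage(REWARD))
```

On this table the two criteria disagree: Wald picks decision_B (best worst case, 5), while Savage picks decision_A (largest regret 3, against 4 and 6 for the others). The Nash variants would additionally allow a random choice among decisions, which is not covered by this sketch.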
Strategic decisions
● Assume you have 100 000 000 000 euros for
investing in power plants / networks. What do you
do ?
● Different points of view:
● Too many uncertainties! Minimum investments.
● So many uncertainties! Investments everywhere (will
solve unemployment :-) ).
● No idea :-)
● What do you think ? What would you do if your life
was depending on it ?
Conclusions
● dynamic optimization problems in
power systems: beautiful + crucial
● 2 levels “strategic + tactical”
● Strategic: >100 decision variables in a
noisy non-linear optimization problem
with non-stochastic uncertainties
● Tactical (dynamic):
– 10 000 constrained action variables
– 100 state variables
– hundreds of time steps
– non-linear effects
– non-Markovian random processes
Conclusions
● RL did not invade power systems because of
high dim. constrained action spaces
● Dynamic optimization does not boil down to
MDP solving ==> Markov assumption!
● DPS on top of other algorithms ? Tuning in front
of “realistic” simulations with realistic RP.
● Long term investments:
● Difficult to make decisions (nobody trusts criteria for
non-stochastic uncertainties)
● But negative conclusions matter: “removing X
without adding Y does not work”:
Conclusions
Main strengths of the RL (machine learning)
communities:
● Really cares about nonlinear effects (model error)
● Really cares about overfitting (clean cross-validation)
==> should invade power systems
==> at least if high-dim action spaces are handled
( SDP / SDDP great for that )
Our proposal: DPS as a fine-tuning upper layer,
over MPC (which is sooo convenient!).
Bibliography
● Dynamic Programming and Suboptimal Control: A Survey from
ADP to MPC. Bertsekas, 2005. (MPC = deterministic forecasts)
● “Newave vs Odin”: why MPC survives in spite of theoretical
shortcomings
● Dallagi and Simovic (EDF R&D) : "Optimisation des actifs
hydrauliques d'EDF : besoins métiers, méthodes actuelles et
perspectives", PGMO (importance of precise simulations)
● Ernst: The Global Grid, 2013 & all his slides/studies on WWW
● Renewable energy forecasts ought to be probabilistic! Pinson,
2013 (wipfor talk)
● Training a neural network with a financial criterion rather than a
prediction criterion. Bengio, 1997
● Direct Model Predictive Control, Decock et al, 2014 (combining
DPS and MPC)
Summary :-)
● Noisy optimization = black box stochastic optimization
● Dynamic optimization (DO) = multistage optimization
● Reinforcement learning = black box DO
● Direct Policy Search = RL by parametric optimization
● Model Predictive Control = DPS with simplified model
● Receding horizon = neglect long-term + frequent reoptimize
● (Stoc.) (dual) dynamic prog. = DO backwards in time
● Bootstrap, bias correction <== tricky statistics for sample
average approximation
● Non stochastic uncertainties (Wald, Savage)
● Unit commitment/dispatch = DP for power systems
● UC by sort = unit commitment with marginal costs only
All in one slide
● Noisy optimization
● Direct Policy Search
● Model Predictive Control + receding horizon
● Reinforcement learning / Markov Decision Processes
● Stochastic (dual) dynamic programming
● Bootstrap, bias correction, sample average
approximation
● Non stochastic uncertainties (Wald, Savage)
● + power systems (stability of networks, capacity markets,
domino effect, HVDC, unit commitment, dispatch, UC by
sort, think of time (storage) & space (transmission) )
Questions ?
Appendix
SDP / SDDP
Stochastic (Dual) Dynamic Programming
● Representation of the controller
● decision(current state) =
argmin Cost(decision) + Bellman(next state)
● Linear programming (LP) if:
– For a given current state, next state = LP(decision)
– Cost(decision) = LP(decision)
● → 100 000 decision variables per time step
Our activities
● Planning/control (tactical level)
● Multi-year planning: evaluate marginal costs of hydroelectricity
● Taking into account stochasticity and uncertainties
● Moderate scale (Cities, Factories) (tactical level simpler)
● Master plan optimization
● Stochastic uncertainties
● High scale investment studies (e.g. Europe+North Africa)
● Long term (2030 - 2050)
● Huge (non-stochastic) uncertainties
● Investments: interconnections, storage, smart grids, power plants...
Energy is expensive (or not ?)
Desertec = hundreds of billions of euros for
renewables in Africa.
Medgrid = transmission network for these
renewables.
==> worth taking time for making decisions
Stochastic Control
[Diagram: a controller with memory sends commands to the system; a random process feeds random values into the system; the system has a state and returns a cost and an observation.]
For an optimal representation, you need access
● to the whole archive
● or to forecasts (generative model / probabilistic forecasts)
(Astrom 1965)


High-Scale Power Systems Simulation & Optimization

  • 1. High-Scale Power Systems: Simulation & Optimization Olivier Teytaud + Inria-Tao + Artelys TAO project-team INRIA Saclay Île-de-France O. Teytaud, Research Fellow, olivier.teytaud@inria.fr http://www.lri.fr/~teytaud/
  • 4. Power systems ● Many jobs in AI for power systems ● Important for the economy and for the world ● Join the force, do Machine Learning for Power Systems!
  • 5. All in one slide ● Noisy optimization ● Direct Policy Search ● Model Predictive Control + receding horizon ● Reinforcement learning / Markov Decision Processes ● Stochastic (dual) dynamic programming ● Bootstrap, bias correction, sample average approximation ● Non-stochastic uncertainties (Wald, Savage) ● + power systems (stability of networks, capacity markets, domino effect, HVDC, unit commitment, dispatch, UC by sort)
  • 6. Most important slide of this talk Something is unclear ? Something is wrong ? ==> INTERRUPT ME !!! If there is a problem for you, there is probably a problem for 50 others :-) Be a hero! Interrupt this presentation at least once :-)
  • 7. This talk Power systems (enough for doing RL on it) Machine learning for Power Systems
  • 8. First, the power systems part ● Maybe you don't care about power systems ● But I promise it's cool and fun :-)
  • 9. Energy matters! The “30 glorieuses” (45-73) in some western countries: ● No unemployment ● Growth ● Baby boom ==> stop at 1973 oil crisis ==> correlation economy/energy.
  • 10. Pollution is complicated: numbers from nextbigfuture (not super recent) The death toll is not the only criterion, and you can disagree with these numbers (so hard to evaluate). Coal more radioactive than nuclear power ? Will improve ?
  • 11. Specifying costs: what scientists and engineers do not decide ● Economical costs ● Ecological cost ● Air ● CO2 (+greenhouse gas) ● Water ● Waste storage ● ... ● Externalities ● Maintenance death ● Faults, quality of service
  • 12. Energy pollutes Climate change... yes it matters, but business as usual (will change ?) Air pollution: kills more than AIDS + malaria ? Coal = cheap + huge reserves. Nuclear power: Chernobyl + Fukushima
  • 13. Electricity is tricky Too much production ? Frequency increases. Not enough production (fault) ? Freq. decreases. ==> both can be harmful ==> electricity needs instantaneous equilibrium ==> but some energies are intermittent, volatile (wind) or slow (coal, nuclear) and there are faults. Comparing power plants only from the point of view of euros per MWh = very approximate ==> study an energy mix, not a single energy source
  • 14. Alternating current Frequency must be stable. Some power plants contribute to real-time stabilization (frequency++ ==> power--) (frequency-- ==> power++)
  • 15. Plants paid for … doing “nothing” ? Intermittent wind or variable demand + “prod = demand” constraint ==> need fast/reactive power plants (reserves). These PPs need more than the market price ==> capacity market (paid for being here “in case”) ==> complex economical model in deregulated markets
  • 16. Energy is (almost) a collaborative game Sharing (peak hours, reserves...) is great. Excellent for renewable energies Collaboration not that bad in Europe. ==> Towards a European energy mix (solar in the South, wind in the North). A possible paradigm: ●Maximize social surplus assuming collaboration (decision variables = investments). ●Assume that the legislator will take care of incentives.
  • 17. Optimization & energy I. The most important question in the universe II. Examples III. A key problem: uncertainties IV. Algorithms
  • 18. Denmark Wind power 33% local consumption Implies need for ● connections (++) ● storage ● and/or gas plants ==> sometimes negative prices ==> 8.4 t CO2 per person (France 6.1, USA 17.2) Storage: electric vehicles ?
  • 19. China ● Coal, massively ==> air pollution ● PV units production cheap thanks to no environmental constraints + labor law ● Wind power + long distance DC connections Imports from countries w/o ecological norms ? Intoxicate babies in China rather than in Europe ?
  • 20. Chile ● Big hydropower planning ● Long term correlations ==> beautiful RL problem
  • 21. France ● Nuclear (plenty) ● Exports ● Electric heating (==> imports during peaks)
  • 22. France Plenty of nuclear. Little CO2 per inhabitant (also thanks to compact towns). Fukushima-style event in Paris ? Terrorism risk ? 1980 2007
  • 23. Germany ==> progressively stopping nuclear PP ==> energy trading with France ==> 9.6 t CO2 per inhabitant x year ( > France)
  • 26. Kerguelen: let's be crazy ? Big surface Wind 35 km/h frequent, 150 km/h usual, peaks 200 km/h. Perfect for wind power. No consumption around. ==> H2 synthesis ? or move industries there ? Important place for wild life.
  • 27. Greenland: yet a bit more crazy Wind power on all shores ? Connect to America and Europe ? (different peak hours)
  • 28. Scandinavia Still good locations for hydropower. Big connections to the rest of Europe ? Hydro storage convenient for smoothing intermittent energy sources. All Europe using storage in Sweden ? Or H2 storage (not yet technically ok) ?
  • 29. Beautiful problems! ● Definitely important ● All time scales ●Building PP (dozens of years) ●Building connections (dozens of years) ●Hydro planning (years) ●Nuclear planning (months, years) ●Thermal plants (hours, days) ●Faults, reserves (< second to months) ● Nonlinear effects ● Plenty of constraints (non separable!) ● High dimensional: ●action spaces (~10000) ●state spaces (~100)
  • 30. Reserves (==> dispatch; frequency control) (importance++ with renewable energies ?)
  • 32. Peak shaving by pumped storage Expensive energy Expensive energy Cheap energy
  • 33. Peak shaving by pumped storage Pumping Hydro power Hydro power
  • 34. Also: take care of networks! ● Domino effect ● Overloaded line ● ==> failure ● ==> other lines overloaded ● ==> other lines fail ● ==> Baouuuum!!!!!
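The domino effect above can be illustrated with a deliberately crude sketch. The assumption that the load of a failed line is shared equally among the surviving lines is made only for this example; real networks redistribute flows according to power-flow physics. The function name `cascade` and the numbers are hypothetical.

```python
def cascade(loads, capacities):
    """Return the order in which lines fail, under an equal-sharing toy model."""
    loads = [float(x) for x in loads]
    alive = set(range(len(loads)))
    failed_order = []
    while alive:
        # Lines whose load exceeds their capacity fail in this round.
        overloaded = [i for i in sorted(alive) if loads[i] > capacities[i]]
        if not overloaded:
            break
        shed = sum(loads[i] for i in overloaded)
        for i in overloaded:
            alive.discard(i)
            failed_order.append(i)
            loads[i] = 0.0
        # The load of the failed lines is spread equally over the survivors.
        for i in alive:
            loads[i] += shed / len(alive)
    return failed_order

failures = cascade(loads=[11, 8, 8, 5], capacities=[10, 10, 10, 10])
```

In this toy case a single overloaded line is enough to take down all four lines, while `cascade([9, 5, 5], [10, 10, 10])` fails nothing.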
  • 35. United States: the 2003 blackout ● Overloaded line + bug (race condition, parallel programming) ● Domino effect! 45 million people with no electricity (2 days), plus various damages
  • 36. Example: interconnection studies (demand levelling, stabilized supply)
  • 37. The POST project – supergrids simulation and optimization Mature technology: HVDC links (high-voltage direct current) Related ideas in Asia (more political issues)
  • 38. HVDC might change the world ● Transmission networks are high voltage alternating current ● But some connections are high voltage direct current: – Reduces losses (for long distance) – Removes the need for frequency stabilization
  • 39. Outline 1. Overview 2. Sequential decision making 3. Strategic decisions 4. Conclusions
  • 40. Power systems: decision variables Decisions = ● Strategic decisions (a few time steps): ●building a nuclear power plant ●build a Spain-Morocco connection ●build a wind farm ● tactical decisions (many time steps): ●switching on hydroPP #7 at 6:00 ●switching on thermal PP #4 at 7:15 ●.... Based on simulations of the tactical level Depends on the strategic level
  • 41. Sequential decision making ● Issues – Demand varying in time, limited predictability – Transmission introduces constraints (no “copper plate”) – Renewable ==> variability ++ (no “deterministic approach”) ● Methods – Markovian assumptions ==> sometimes wrong!!!! – Simplified models ==> Model error >> optimization error ● Approaches ● Machine Learning / Mathematical Programming
  • 42. Stochastic Control Reinforcement learning: black box stochastic control Implicit assumption “state = observation” ? Sometimes “state” is a huge unknown thing. System Controller with memory commands State Cost State (known structure ? or black-box ?) Random values Random process Observation ( = state ?)
  • 43. Hybridization reinforcement learning / mathematical programming ● Math programming (mathematicians doing discrete-time control) – Nearly exact solutions for a simplified problem – High-dimensional constrained action space – But small state space, linearization, Markov & not anytime ==> 99% of what I've seen in industry ● Reinforcement learning (geeks doing DTC) – Unstable :-( (except DPS) – Small model bias – Small / simple action space <== often the main issue – But high dimensional state space & anytime
  • 44. 3 examples of algorithms ==> Model Predictive Control, Stochastic Dynamic Programming, Direct Policy Search
  • 45. ● Anticipative solutions: ● Replace all random parts by deterministic parts ● Optimize deterministically ● Pros/Cons ● Much simpler (deterministic optimization) ● But in real life you can not guess November rains in January ● Rather optimistic decisions MODEL PREDICTIVE CONTROL
  • 46. ● Looks like pure bullshit: 100% deterministic ● Still so convenient: ● So many constraints ● Huge state spaces ● Just having a bug-free simulator is so hard ● So many uncertainties e.g. paper “Newave vs Odin”: other methods have worse assumptions (convexity, Markovian random processes, etc ==> later) MODEL PREDICTIVE CONTROL
  • 47. Shrinking horizon / receding horizon 1. Assume you know the next 48 hours 2. Optimize the reward over these 48 hours 3. In fact, just apply two hours of decisions 4. Go back to 1 with “t ← t + 2 hours”. ==> operational horizon = 2 hours ==> tactical horizon = 48 hours All effects lasting more than 48 hours are neglected !!!!
  • 48. Receding horizon + valorization 1. Assume you know the next 48 hours 2. Optimize the reward over these 48 hours + a bonus for what is left in the stocks, e.g. 5 euros per MWh in each stock 3. In fact, just apply two hours of decisions 4. Go back to 1 with “t ← t + 2 hours”. ==> Much better but sometimes still plain wrong
  • 49. Receding horizon + constraint 1. Assume you know the next 48 hours 2. Optimize the reward over these 48 hours + constraint: stocks > lower bound given by humans (history) 3. In fact, just apply two hours of decisions 4. Go back to 1 with “t ← t + 2 hours”. ==> not principled (imitation), but convenient + polynomial time with a linear model.
  • 50. Receding horizon + learnt valorization 1. Assume you know the next 48 hours 2. Optimize the reward over these 48 hours + learntFunction(currentState,stocks) 3. In fact, just apply two hours of decisions 4. Go back to 1 with “t ← t + 2 hours”. ==> still polynomial if learntFunction can be encoded in linear programming as a function of the actions ==> I love that (but I might be biased :-) ) (Direct Value Search) How to learn that ? By Direct Policy Search!
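The plain and “valorization” variants of the receding horizon loop can be sketched on a toy problem. Everything below is an illustrative assumption, not the author's model: a single stock with no inflow, one unit released per hour at most, perfectly known prices, and the names `plan`, `receding_horizon` and `bonus` are hypothetical. With this structure the 48-hour deterministic problem has a closed-form solution (release in the best-priced hours whose price exceeds the bonus), which keeps the sketch short.

```python
def plan(prices, stock, bonus):
    """Solve the deterministic problem on the tactical window.

    Toy structure (assumption): releasing one unit in hour h earns prices[h],
    keeping it earns `bonus`. The optimum is to release in the hours with the
    largest prices, as long as price > bonus and stock remains.
    """
    # sorted() is stable, so hours with equal prices keep their time order
    hours = sorted(range(len(prices)), key=lambda h: -prices[h])
    chosen = [h for h in hours if prices[h] > bonus][:stock]
    return [1 if h in chosen else 0 for h in range(len(prices))]


def receding_horizon(prices, stock, tactical=48, operational=2, bonus=0.0):
    """Return (realized reward, final stock) of the receding horizon loop."""
    t, reward = 0, 0.0
    while t < len(prices):
        window = prices[t:t + tactical]          # 1. "know" the next 48 hours
        decisions = plan(window, stock, bonus)   # 2. optimize over the window
        for k in range(min(operational, len(window))):
            reward += decisions[k] * prices[t + k]   # 3. apply only 2 hours
            stock -= decisions[k]
        t += operational                         # 4. t <- t + 2 hours
    return reward, stock


if __name__ == "__main__":
    # Low prices for two days, high prices afterwards (hypothetical data).
    prices = [1] * 48 + [10] * 48
    print(receding_horizon(prices, stock=10))             # myopic: sells early
    print(receding_horizon(prices, stock=10, bonus=5.0))  # keeps the stock
```

On this toy data the plain version sells part of the stock at low prices because nothing beyond 48 hours is visible, while a bonus of 5 per stored unit makes it wait for the high prices. The same bonus would of course be wrong for other price profiles, which is the point of learning it.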
  • 51. 3 examples of algorithms Model Predictive Control, ==> Stochastic Dynamic Programming, Direct Policy Search
  • 52. How to solve, simple case, three states, 3 days, no random process [Figure: decision graph with three states per day and a cost on each transition]
  • 53. How to solve, simple case, three states, 3 days, no random process [Figure: same graph, with the values 2, 2, 2 added]
  • 54. How to solve, simple case, three states, 3 days, no random process [Figure: same graph, with the values 3, 4, 6 and 2, 2, 2]
  • 55. How to solve, simple case, three states, 3 days, no random process [Figure: same graph, with the values 4, 5, 7, then 3, 4, 6, then 2, 2, 2]
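The deterministic case above is solved by backward induction: start from the last day and propagate the best cost-to-go towards the first day. A minimal sketch follows; the transition costs are hypothetical (the numbers of the original figure cannot be recovered reliably from the transcript), and the convention "minimize a sum of transition costs" is an assumption.

```python
def backward_induction(costs, terminal):
    """Deterministic dynamic programming.

    costs[t][i][j] = cost of going from state i on day t to state j on day t+1.
    terminal[j]    = value of ending in state j.
    Returns (values, policy) with values[t][i] = best cost-to-go from state i
    on day t, and policy[t][i] = index of the best next state.
    """
    values = [list(terminal)]
    policy = []
    for stage in reversed(costs):          # last day first
        nxt = values[0]
        v, a = [], []
        for row in stage:                  # one row per current state
            totals = [c + nxt[j] for j, c in enumerate(row)]
            best = min(range(len(totals)), key=totals.__getitem__)
            v.append(totals[best])
            a.append(best)
        values.insert(0, v)
        policy.insert(0, a)
    return values, policy


# Hypothetical costs: three states, three days, same costs every day.
STAGE = [[1, 2, 3],
         [2, 2, 2],
         [3, 1, 4]]
values, policy = backward_induction([STAGE, STAGE, STAGE], terminal=[0, 0, 0])
for day, v in enumerate(values[:-1], start=1):
    print("day", day, "cost-to-go per state:", v)
```

The cost of this procedure is (number of days) x (number of states) x (number of decisions), which is why it only stays cheap when the state space is small.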
  • 56. This was deterministic ● How to add a random process ? ● Don't believe that the world is limited to compact MDPs :-) ● Remember Astrom'65 ? Sometimes you need the history of observations (or latent variables) in the state ==> can't make optimal decisions just with current observations. ● Build a huge tree of possible futures and multiply nodes ?
  • 57. Adding a random transition, without Markov assumption: growing tree [Figure: the previous decision graph (values 4, 5, 7, then 3, 4, 6, then 2, 2, 2) duplicated along the branches of a tree of random outcomes, with branch probabilities 1/3 and 2/3]
  • 58. The huge MDP necessary for solving a non-Markovian problem Representation as a Markov process (a tree): This is the representation of the random process. In each node, there are the state-nodes with decision-edges.
  • 59. The huge MDP necessary for solving a non-Markovian problem Representation as a Markov process (a tree): This is the representation of the random process. In each node, there are the state-nodes with decision-edges. Huge representation. Value-based approaches intractable.
  • 60. Overfitting ● Representation as a Markov process (a tree): How do you actually make decisions when the random values are not exactly those observed ? (heuristics...) ● Check on random realizations which have not been used for building the tree. ● Does it work correctly ? ===> cross validation ● Overfitting = when it works only on scenarios used in the optimization process. (see B. Defourny, D. Ernst and L. Wehenkel, INFORMS Journal on Computing, Vol. 25(3), 2013)
  • 61. SDP / SDDP Stochastic (Dual) Dynamic Programming ● Representation of the controller with Linear Programming (value function as piecewise linear) (often)
  • 62. SDP / SDDP Stochastic (Dual) Dynamic Programming ● Representation of the controller with Linear Programming (value function as piecewise linear) Maximum of linear functions = can be encoded in linear programming. ==> Each argmax is polynomial.
  • 64. SDP / SDDP Stochastic (Dual) Dynamic Programming ● Representation of the controller with Linear Programming (value function as piecewise linear) Maximum of linear functions = can be encoded in linear programming. ==> Each argmax is polynomial. noise is multiplied + strict convexity required
  • 65. SDP / SDDP Stochastic (Dual) Dynamic Programming ● Representation of the controller with Linear Programming (value function as piecewise linear) ● → ok for 100 000 decision variables per time step (tens of time steps, hundreds of plants, several decisions each)
  • 66. SDP / SDDP Stochastic (Dual) Dynamic Programming ● Representation of the controller with Linear Programming (value function as piecewise linear) ● → ok for 100 000 decision variables per time step ● but solving by expensive SDP/SDDP (curse of dimensionality, exp. in state variables)
  • 67. SDP / SDDP Stochastic (Dual) Dynamic Programming ● Representation of the controller with Linear Programming (value function as piecewise linear) ● → ok for 100 000 decision variables per time step ● but solving by expensive SDP/SDDP ● Constraints ● Needs LP approximation: ok for you ?
  • 68. SDP / SDDP Stochastic (Dual) Dynamic Programming ● Representation of the controller with Linear Programming (value function as piecewise linear) ● → ok for 100 000 decision variables per time step ● but solving by expensive SDP/SDDP ● Constraints ● Needs LP approximation: ok for you ? ● SDDP requires convex Bellman values: ok for you ?
  • 69. SDP / SDDP Stochastic (Dual) Dynamic Programming ● Representation of the controller with Linear Programming (value function as piecewise linear) ● → ok for 100 000 decision variables per time step !!! ● but slow in terms of state variables (exponential) ● Constraints ● Needs LP approximation: ok for you ? ● SDDP requires convex Bellman values: ok for you ? ● Needs Markov random processes: ok for you ? (possibly after some random process extension...)
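Why a maximum of linear functions fits in a linear program can be written out explicitly. The block below is a sketch in cost-minimization form, which is an assumption (the slides mix argmax and argmin conventions); all symbols (x, u, x', c, A, B, G, h, the cuts α_k, β_k and the auxiliary variable θ) are notation introduced here for illustration, not taken from the slides.

```latex
% Assumed notation: x = current state (given), u = decision, x' = next state,
% c = linear cost, (alpha_k, beta_k) = linear pieces of the value function V.
\[
V(x') \;=\; \max_{k=1,\dots,K} \left( \alpha_k^\top x' + \beta_k \right)
\]
% One-step decision: minimize immediate cost plus value of the next state.
% The max is replaced by an auxiliary variable theta and K linear inequalities.
\[
\begin{aligned}
\min_{u,\;x',\;\theta} \quad & c^\top u + \theta \\
\text{s.t.} \quad & x' = A x + B u, \\
& \theta \;\ge\; \alpha_k^\top x' + \beta_k, \qquad k = 1,\dots,K, \\
& G u \le h .
\end{aligned}
\]
```

At the optimum θ equals V(x'), since θ is minimized and only bounded below by the K linear pieces. This only works because V is convex (a maximum of linear functions) and is being minimized, which is where the convexity requirement comes from.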
  • 70. Summary ● Most classical solution = SDP and variants ● Or MPC (model-predictive control), replacing the stochastic parts by deterministic pessimistic forecasts
  • 71. 3 examples of algorithms Model Predictive Control, Stochastic Dynamic Programming, ==> Direct Policy Search
  • 72. Direct Policy Search ● Requires a parametric controller ● Principle: optimize the parameters on simulations (= simulation-based optim) ● Unusual in large scale Power Systems (we will see why) ● Usual in other areas (evolutionary robotics)
  • 73. Stochastic Control by DPS [Diagram: control loop between a System (with its State, driven by random values from a random process) and a Controller with memory and parameters w; the controller sends commands and receives the State and the Cost] Optimize the controller thanks to a simulator: ● Command = Controller(w,state,forecasts) ● Simulate( w ) = stochastic loss with parameter w ● w* = argmin [Simulate(w)] <== noisy optimization
  • 74. Stochastic Control by DPS [Diagram: same control loop, controller with parameters w] So simple. Does not work under this simple form, when you have large scale action spaces and/or many constraints. Still, nice representations can make it relevant.
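The principle (a parametric controller whose parameters are tuned on noisy simulations) can be sketched on a toy reservoir. All modelling choices below are assumptions made for illustration: one stock, constant demand, uniform random inflows, a quadratic mismatch cost, an affine controller, and a (1+1) evolution strategy with resampling as the noisy optimizer. None of this is claimed to be the setup used by the author.

```python
import random

HORIZON = 20   # time steps per simulated episode (hypothetical)
DEMAND = 1.0   # constant demand per time step (hypothetical)


def controller(w, stock):
    """Parametric policy: affine in the stock, clipped to what is feasible."""
    return min(max(w[0] + w[1] * stock, 0.0), stock)


def simulate(w, rng):
    """One noisy episode: total cost of running the controller with params w."""
    stock, cost = 5.0, 0.0
    for _ in range(HORIZON):
        release = controller(w, stock)
        cost += (DEMAND - release) ** 2            # cost of the mismatch
        stock += rng.uniform(0.0, 2.0) - release   # random inflow
    return cost


def mean_cost(w, rng, n):
    """Average of n independent simulations (resampling reduces the noise)."""
    return sum(simulate(w, rng) for _ in range(n)) / n


def direct_policy_search(w0, rng, iterations=200, sigma=0.3, resamplings=10):
    """(1+1) evolution strategy on the noisy objective w -> E[simulate(w)]."""
    best = list(w0)
    best_cost = mean_cost(best, rng, resamplings)
    for _ in range(iterations):
        cand = [wi + rng.gauss(0.0, sigma) for wi in best]
        cand_cost = mean_cost(cand, rng, resamplings)
        if cand_cost < best_cost:                  # keep the better estimate
            best, best_cost = cand, cand_cost
    return best


if __name__ == "__main__":
    w_star = direct_policy_search([0.0, 0.0], random.Random(0))
    print("tuned parameters:", w_star)
    print("estimated cost:", mean_cost(w_star, random.Random(1), 200))
```

The optimizer only ever calls `simulate`, so any simulator, any forecast input and any non-linearity can be plugged in. The weak point is visible too: with two parameters the search is easy, with hundreds of parameters and hard constraints on the commands it is not.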
  • 75. Direct Policy Search (DPS) ● Requires a parametric controller
  • 76. Direct Policy Search (DPS) ● Requires a parametric controller e.g. neural network Controller(w,x) = W3+W2.tanh(W1.x+W0)
  • 77. Direct Policy Search (DPS) ● Requires a parametric controller e.g. neural network Controller(w,x) = W3+W2.tanh(W1.x+W0) ● Noisy Black-Box Optimization
  • 78. Direct Policy Search (DPS) ● Requires a parametric controller e.g. neural network Controller(w,x) = W3+W2.tanh(W1.x+W0) ● Noisy Black-Box Optimization ● Advantages: non-linear ok, forecasts included
  • 79. Direct Policy Search (DPS) ● Requires a parametric controller e.g. neural network Controller(w,x) = W3+W2.tanh(W1.x+W0) ● Noisy Black-Box Optimization ● Advantages: non-linear ok, forecasts included ● Issue: too slow hundreds of parameters for even 20 decision variables (depends on structure)
  • 80. Direct Policy Search (DPS) ● Requires a parametric controller e.g. neural network Controller(w,x) = W3+W2.tanh(W1.x+W0) ● Noisy Black-Box Optimization ● Advantages: non-linear ok, forecasts included ● Issue: too slow hundreds of parameters for even 20 decision variables (depends on structure) ● Idea: a special structure for DPS (inspired from SDP) Strategy optimized given the real forecasting module you have, given arbitrarily precise simulations (forecasts are inputs)
  • 81. Direct Policy Search (DPS) ● Requires a parametric controller e.g. neural network Controller(w,x) = W3+W2.tanh(W1.x+W0) ● Noisy Black-Box Optimization ● Advantages: non-linear ok, forecasts included ● Issue: too slow hundreds of parameters for even 20 decision variables (depends on structure) ● Idea: a special structure for DPS (inspired from SDP) Strategy optimized given the real forecasting module you have, given arbitrarily precise simulations (forecasts are inputs) Great for fine-tuning: 1. Optimize by other approach (MPC ?) 2. Fine tune by DPS
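The neural controller written above, Controller(w,x) = W3 + W2.tanh(W1.x+W0), is short enough to spell out, and counting its parameters shows where the "hundreds of parameters" come from. This is a plain-Python sketch; the layer sizes in the example are hypothetical.

```python
import math


def controller(w, x):
    """One-hidden-layer network: W3 + W2 . tanh(W1 . x + W0).

    w = (W0, W1, W2, W3) with
      W0: hidden biases (list), W1: hidden weights (one row per hidden unit),
      W2: output weights (one row per output), W3: output biases (list).
    """
    W0, W1, W2, W3 = w
    hidden = [math.tanh(sum(wij * xj for wij, xj in zip(row, x)) + b)
              for row, b in zip(W1, W0)]
    return [sum(wij * hj for wij, hj in zip(row, hidden)) + b
            for row, b in zip(W2, W3)]


def count_parameters(n_inputs, n_hidden, n_outputs):
    """Number of scalars in (W0, W1, W2, W3)."""
    return n_hidden * (n_inputs + 1) + n_outputs * (n_hidden + 1)


if __name__ == "__main__":
    # Hypothetical sizes: 10 state/forecast inputs, 10 hidden units,
    # 20 decision variables.
    print(count_parameters(10, 10, 20))
```

With these sizes the noisy optimizer already has a few hundred parameters to tune for only 20 decision variables, which motivates looking for a more structured controller.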
  • 82. Noisy optimization Two very different frameworks: ● We have a generative model (or a huge sample) ● The problem is computational ● Gradient-based optimization, or black box ● We have a finite sample (e.g. 8 samples) ● The problem is statistical ● Let us compute the optimum on average on the sample ● Is it really a good solution ? Interesting papers in recent NIPS / ICML. Also results from the 50s and 60s
  • 83. Noise-free optimization ● Hessian + Gradient ==> apply Newton H ( x(n+1) – x(n) ) = - g(n) i.e. minimum of second order Taylor approximation ● Only gradient: quasi-Newton guess the Hessian, thanks to e.g. BFGS ● No gradient Evolutionary algorithms / pattern search methods Finite differences
  • 84. Noise-free optimization ● Hessian + Gradient ==> apply Newton H ( x(n+1) – x(n) ) = - g(n) i.e. minimum of second order Taylor approximation; distance(n+1) = O( distance(n)^2 ) ● Only gradient: quasi-Newton ==> guess the Hessian, thanks to e.g. BFGS; distance(n+1)/distance(n) = o(1) ● No gradient: Evolutionary algorithms / pattern search methods, Finite differences; log || x(n) - x* || ~ -Cn ● Very assumption dependent
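The Newton step H ( x(n+1) – x(n) ) = - g(n) and its fast convergence are easy to check numerically in one dimension. The test function below, f(x) = exp(x) - 2x with minimizer ln 2, is an arbitrary choice made for this sketch.

```python
import math


def newton_minimize(grad, hess, x, iterations=8):
    """1-D Newton: at each step solve hess(x) * (x_next - x) = -grad(x)."""
    trace = [x]
    for _ in range(iterations):
        x = x - grad(x) / hess(x)
        trace.append(x)
    return x, trace


# f(x) = exp(x) - 2x, so f'(x) = exp(x) - 2 and f''(x) = exp(x).
grad = lambda x: math.exp(x) - 2.0
hess = lambda x: math.exp(x)
x_star = math.log(2.0)

x, trace = newton_minimize(grad, hess, 0.0)
errors = [abs(xi - x_star) for xi in trace]
for n, e in enumerate(errors):
    print(n, e)   # the number of correct digits roughly doubles per step
```

Printing the errors shows the quadratic regime: once close to the optimum, each error is of the order of the square of the previous one, until floating-point precision is reached.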
  • 85. Noisy black-box optimization = request f(x) and get e.g. f(x,random) Finite differences with noise (3rd deriv. ≠0): ● Dupac 57: log distance ~ -2/3 log(n)
  • 86. Noisy black-box optimization = request f(x) and get e.g. f(x,random) Finite differences with noise (3rd deriv. ≠0): ● Dupac 57: log distance ~ -2/3 log(n) ● Fabian 67: log distance ~ - log(n) with sophisticated finite differences and assuming “many” derivatives
  • 87. Noisy black-box optimization = request f(x) and get e.g. f(x,random) Finite differences with noise (3rd deriv. ≠0): ● Dupac 57: log distance ~ -2/3 log(n) ● Fabian 67: log distance ~ - log(n) with sophisticated finite differences and assuming “many” derivatives exist ● Spall 00: log distance ~ -2/3 log(n) with better dependency in the dimension and simpler algorithm
  • 88. Noisy black-box optimization = request f(x) and get e.g. f(x,random) Finite differences with noise (3rd deriv. ≠0): ● Dupac 57: log distance ~ -2/3 log(n) ● Fabian 67: log distance ~ - log(n) with sophisticated finite differences and assuming “many” derivatives ● Spall 00: log distance ~ -2/3 log(n) with better dependency in the dimension and simpler algorithm ● Recent works: evolutionary algorithms with resamplings ==> -1/2 log(n)
  • 89. Noisy black-box optimization = request f(x) and get e.g. f(x,random) Finite differences with noise (3rd deriv. ≠0): ● Dupac 57: log distance ~ -2/3 log(n) ● Fabian 67: log distance ~ - log(n) with sophisticated finite differences and assuming “many” derivatives ● Spall 00: log distance ~ -2/3 log(n) with better dependency in the dimension and simpler algorithm ● Recent works: evolutionary algorithms with resamplings ==> -1/2 log(n) ● Shamir 2012: non-asymptotically, log distance ~ -1/2 log(n) (or -log(n) with quadratic functions)
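Finite differences with noise can be illustrated with a Kiefer-Wolfowitz style iteration: estimate the derivative from two noisy evaluations and take a decreasing step. The sketch below is one-dimensional; the objective, the noise level and the gain sequences a/n and c/n^(1/6) are illustrative assumptions, and the code does not claim to reproduce the exact algorithms analysed in the papers cited above.

```python
import random


def noisy_f(x, rng, noise_std=0.1):
    """Black box: the caller requests f(x) and gets f(x) + noise."""
    return (x - 1.0) ** 2 + rng.gauss(0.0, noise_std)


def kiefer_wolfowitz(x, rng, iterations=2000, a=0.5, c=0.5):
    """Stochastic approximation with noisy central finite differences."""
    for n in range(1, iterations + 1):
        a_n = a / n                    # step size, decreasing
        c_n = c / n ** (1.0 / 6.0)     # finite-difference width, decreasing
        g = (noisy_f(x + c_n, rng) - noisy_f(x - c_n, rng)) / (2.0 * c_n)
        x -= a_n * g
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    print(kiefer_wolfowitz(5.0, rng))   # should end up near the optimum x = 1
```

The trade-off behind the rates is visible in `c_n`: a small width reduces the bias of the finite difference but amplifies the noise (divided by 2 c_n), so the width cannot shrink too fast.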
  • 90. Sample average approximation Two very different frameworks: ● We have a generative model (or a huge sample) ● The problem is computational ● Gradient-based optimization ● We have a finite sample (e.g. 8 samples...) ● The problem is statistical ● Let us compute the optimum on average on the sample ● Is it really a good solution ?
  • 91. SAA: sample average approximation ● I want x* = argmin_x E[ f(x) + noise(x) ] ● But I compute x = argmin_x g(x), with g(x) = [f(x) + noise_1(x)] + [f(x) + noise_2(x)] + [f(x) + noise_3(x)] + ... + [f(x) + noise_N(x)] (SAA) ● E( noise(x) ) = 0 and the noise_i are i.i.d. ● Then E x ≠ x*, because N is finite: bias ● Bias correction: evaluate b = E x - x*, propose x - b ?
  • 92. b=Ex-x* depends on the problem: how to evaluate it ? ● We want to know the difference between ● the optimum on average over ∞ i.i.d cases ● the optimum on average over N i.i.d cases ● Efron: let's compute the same difference for another probability distribution, uniform over the sample: ● the opt. on average over ∞ (=N distinct) i.i.d cases ● the opt. on average over N i.i.d cases (among N!)
  • 93. Looks strange, doesn't it ? ● Efron and others designed such tools ● “Bootstrap”: find a solution with what you have ● Expensive: ● Compute the optimum x on your sample ● Compute the expected optimum x' on average on multiple “resamplings” ● Compute b=x-Ex' ● Return x+b = 2x-Ex' ● Many other resampling methods (jackknife, variants of bootstrap...)
  • 94. Looks strange, doesn't it ? ● Efron and others designed such tools ● “Bootstrap”: find a solution with what you have ● Expensive: ● Compute the optimum x on your sample ● Compute the expected optimum x' on average on multiple “resamplings” ● Compute b=x-Ex' ● Return x+b ● Many other resampling methods (jackknife, variants of bootstrap...) A beautiful example of a case in which sophisticated mathematics helps
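The bootstrap recipe (optimum x on the sample, expected optimum x' over resamplings, return x + b with b = x - Ex') can be sketched on a toy problem whose sample optimum is biased. The problem chosen here is an assumption made for illustration: minimize the average of (x·ξ - 1)², whose sample optimum is mean(ξ) / mean(ξ²), a ratio estimator that is biased for finite samples.

```python
import random


def saa_optimum(sample):
    """argmin over x of the sample average of (x * xi - 1)^2 (closed form)."""
    n = len(sample)
    return (sum(sample) / n) / (sum(v * v for v in sample) / n)


def bootstrap_corrected(sample, rng, resamplings=500):
    """Bootstrap bias correction of the sample optimum: returns 2x - E[x']."""
    n = len(sample)
    x = saa_optimum(sample)                       # optimum on the sample
    x_primes = [saa_optimum([rng.choice(sample) for _ in range(n)])
                for _ in range(resamplings)]      # optima on resamplings
    mean_x_prime = sum(x_primes) / resamplings    # estimate of E[x']
    b = x - mean_x_prime
    return x + b


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.uniform(0.5, 1.5) for _ in range(8)]   # e.g. 8 samples
    print("plain SAA:      ", saa_optimum(sample))
    print("bias-corrected: ", bootstrap_corrected(sample, rng))
```

The cost is explicit in the code: one optimization per resampling, so the correction is several hundred times more expensive than the plain sample optimum.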
  • 95. State of the art in discrete-time control, a few tools: ● Model Predictive Control: For making a decision in a given state: (i) do forecasts (ii) replace random procs -> pessimistic forecasts (iii) optimize as if deterministic problem ● Stochastic Dynamic Programming: ● ~Markov model ● Compute “cost to go” backwards ● Direct Policy Search: ● Parametric controller ● Optimized on simulations ● Problems for high-dimensional constrained action spaces
  • 96. State of the art in discrete-time control, a few tools:
                               MPC             SDP          DPS
    Random process             -               Markovian    Ok
    Long term effects          Heuristic       Ok           Ok
    Constrained action spaces  Ok              Ok           -
    Optimal if                 Deterministic   Markovian    Good structure
    MPC: so convenient. DPS: good for tuning ?
  • 97. DPS as a fine tuning upper layer ● Bengio '97: ● Handcraft a policy ● Smooth it, replace constants by parameters ● Optimize params by DPS ● Decock et al '13: ● construct a policy as in SDP ● but learn a parametric V by DPS (instead of back. induction) Basically, it never fails... Find suboptimality, counterbalance it by DPS...
  • 98. Power systems optimization (1 slide) Consider an electric system. Decisions = ● Strategic decisions (a few time steps), based on simulations of the tactical level: ● building a nuclear power plant ● building a Spain-Morocco connection ● building a wind farm ● Tactical decisions (many time steps), which depend on the strategic level: ● switching on hydroPP #7 at 6:00 ● switching on thermal PP #4 at 7:15 ● ....
  • 99. Outline 1. Overview 2. Sequential decision making 3. Strategic decisions 4. Conclusions I have nightmares with this problem. No idea how to tackle that.
  • 100. Strategic decisions ● Typical situation ● A system, to be optimized by RL / SDP / etc ● Strategic decisions on top of it ● Non stochastic uncertainties ● Tools ● Bandits (including adversarial) ● Wald / Savage / Nash criteria ● Bilevel optimization ● Still quite an open problem
  • 101. Strategic decisions ● Decisions (very simplified): ● 100% renewable + demand-side management ● Plenty of nuclear power ● 100% coal+gas ● Microgrids ● Concentrated solar power (storage with molten salt...)... ● Scenarios ● A Fukushima in Europe ? ● Little improvement of renewables ? ● No more international cooperation ? ● ...new gravity storage ? Flywheels ? compressed air ? H2 ? fusion ? Breakthrough in PV ? Putin ?...
  • 102. Strategic decisions R(d,s) = reward for decision d in scenario s ● On average: d* = argmax_d E_s R(d,s) ==> what if we have no probabilities ? ● Worst case: d* = argmax_d min_s R(d,s) A bit conservative ? As if Nature decides “against” us and “after” we have chosen d. ● Regret: R'(d,s) = R(d,s) - max_d' R(d',s) & d* = argmax_d min_s R'(d,s) ● Nash: d* = argmax_d min_s R(d,s) (d random; no “after”) ==> stochastic strategies ? (==> fired)
  • 103. Strategic decisions ● Criteria ● Best (deter.) choice for worst scenario (Wald) ● Best (deter.) choice in terms of regret (Savage) ● Best simultaneous choice (Nash) ● Combination: best simultaneous regret (Nash+Savage) ● Formally: ● Wald: argmax_c min_s Reward(c,s) ● Sav.: argmax_c min_s min_g ( Reward(c,s)-Reward(g,s) ) ● Nash-Savage & Nash-Wald: allow stochastic decisions ==> strange fact: optimal policies are then stochastic... A stochastic nuclear decision ?
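On a small reward table the deterministic criteria are one-liners, and they can disagree. The table below is entirely hypothetical (three decisions d1..d3, two scenarios) and was chosen so that the average, Wald and Savage criteria each pick a different decision. Regret is written here as "best achievable reward in the scenario minus reward obtained", to be minimized in the worst case, which is equivalent to maximizing the worst non-positive regret.

```python
# Hypothetical rewards: REWARD[d][s] = reward of decision d in scenario s.
REWARD = {
    "d1": [8, 2],
    "d2": [5, 4],
    "d3": [10, 1],
}


def expected(reward, probs):
    """Best decision on average, when scenario probabilities are available."""
    return max(reward,
               key=lambda d: sum(p * r for p, r in zip(probs, reward[d])))


def wald(reward):
    """Best decision for the worst scenario (maximin)."""
    return max(reward, key=lambda d: min(reward[d]))


def savage(reward):
    """Decision with the smallest worst-case regret (minimax regret)."""
    n = len(next(iter(reward.values())))
    best = [max(r[s] for r in reward.values()) for s in range(n)]

    def worst_regret(d):
        return max(best[s] - reward[d][s] for s in range(n))

    return min(reward, key=worst_regret)


if __name__ == "__main__":
    print("average:", expected(REWARD, [0.5, 0.5]))
    print("Wald:   ", wald(REWARD))
    print("Savage: ", savage(REWARD))
```

The Nash variants are not covered by this sketch: they optimize over probability distributions on decisions, which requires solving a small linear program rather than scanning a table.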
  • 104. Strategic decisions ● Assume you have 100 000 000 000 euros for investing in power plants / networks. What do you do ? ● Different points of view: ● Too many uncertainties! Minimum investments. ● So many uncertainties! Investments everywhere (will solve unemployment :-) ). ● No idea :-) ● What do you think ? What would you do if your life depended on it ?
  • 105. Conclusions ● dynamic optimization problems in power systems: beautiful + crucial ● 2 levels “strategic + tactical” ● Strategic: >100 decision variables in a noisy non-linear optimization problem with non-stochastic uncertainties ● Tactical (dynamic): – 10 000 constrained action variables – 100 state variables – hundreds of time steps – non-linear effects – non-Markovian random processes
  • 106. Conclusions ● RL did not invade power systems because of high dim. constrained action spaces
  • 107. Conclusions ● RL did not invade power systems because of high dim. constrained action spaces ● Dynamic optimization does not boil down to MDP solving ==> Markov assumption!
  • 108. Conclusions ● RL did not invade power systems because of high dim. constrained action spaces ● Dynamic optimization does not boil down to MDP solving ==> Markov assumption! ● DPS on top of other algorithms ? Tuning in front of “realistic” simulations.
  • 109. Conclusions ● RL did not invade power systems because of high dim. constrained action spaces ● Dynamic optimization does not boil down to MDP solving ==> Markov assumption! ● DPS on top of other algorithms ? Tuning in front of “realistic” simulations with realistic RP. ● Long term investments: ● Difficult to make decisions (nobody trusts criteria for non-stochastic uncertainties) ● But negative conclusions matter: “removing X without adding Y does not work”:
  • 110. Conclusions Main strengths of the RL (machine learning) communities: ● Really cares about nonlinear effects (model error) ● Really cares about overfitting (clean cross-validation) ==> should invade power systems ==> at least if high-dim action spaces are handled ( SDP / SDDP great for that ) Our proposal: DPS as a fine-tuning upper layer, over MPC (which is sooo convenient!).
  • 111. Bibliography ● Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC. Bertsekas, 2005. (MPC = deterministic forecasts) ● “Newave vs Odin”: why MPC survives in spite of theoretical shortcomings ● Dallagi and Simovic (EDF R&D): "Optimisation des actifs hydrauliques d'EDF : besoins métiers, méthodes actuelles et perspectives", PGMO (importance of precise simulations) ● Ernst: The Global Grid, 2013 & all his slides/studies on WWW ● Renewable energy forecasts ought to be probabilistic! Pinson, 2013 (wipfor talk) ● Training a neural network with a financial criterion rather than a prediction criterion. Bengio, 1997 ● Direct Model Predictive Control, Decock et al, 2014 (combining DPS and MPC)
  • 112. Summary :-) ● Noisy optimization = black box stochastic optimization ● Dynamic optimization (DO) = multistage optimization ● Reinforcement learning = black box DO ● Direct Policy Search = RL by parametric optimization ● Model Predictive Control = DPS with simplified model ● Receding horizon = neglect long-term + frequent reoptimize ● (Stoc.) (dual) dynamic prog. = DO backwards in time ● Bootstrap, bias correction <== tricky statistics for sample average approximation ● Non stochastic uncertainties (Wald, Savage) ● Unit commitment/dispatch = DP for power systems ● UC by sort = unit commitment with marginal costs only
  • 113. All in one slide ● Noisy optimization ● Direct Policy Search ● Model Predictive Control + receding horizon ● Reinforcement learning / Markov Decision Processes ● Stochastic (dual) dynamic programming ● Bootstrap, bias correction, sample average approximation ● Non stochastic uncertainties (Wald, Savage) ● + power systems (stability of networks, capacity markets, domino effect, HVDC, unit commitment, dispatch, UC by sort, think of time (storage) & space (transmission) )
  • 116. SDP / SDDP Stochastic (Dual) Dynamic Programming ● Representation of the controller ● decision(current state) = argmin Cost(decision) + Bellman(next state) ● Linear programming (LP) if: – For a given current state, next state = LP(decision) – Cost(decision) = LP(decision) ● → 100 000 decision variables per time step
  • 117. Our activities ● Planning/control (tactical level) ● Multi-year planning: evaluate marginal costs of hydroelectricity ● Taking into account stochasticity and uncertainties ● Moderate scale (Cities, Factories) (tactical level simpler) ● Master plan optimization ● Stochastic uncertainties ● High scale investment studies (e.g. Europe+North Africa) ● Long term (2030 - 2050) ● Huge (non-stochastic) uncertainties ● Investments: interconnections, storage, smart grids, power plants...
  • 118. Energy is expensive (or not ?) Desertec = hundreds of billions of euros for renewables in Africa. Medgrid = transmission network for these renewables. ==> worth taking time for making decisions
  • 119. Stochastic Control [Diagram: control loop between a System (with its State, driven by random values from a random process) and a Controller with memory; the controller sends commands and receives the Cost and an Observation] For an optimal representation, you need access ● to the whole archive ● or to forecasts (generative model / probabilistic forecasts) (Astrom 1965)