3slides

  1. TAO - Inria Saclay-IDF
     ● Machine Learning & Optimization
     ● Tao-uctsig:
        ● Sequential decision making
        ● One permanent, full time + others part time
        ● Applications to energy management
        ● Strong collaboration with
           ● Taiwan
           ● Artelys (Ilab Metis; joint software)
           ● Others
     O. Teytaud, Research Fellow, olivier.teytaud@inria.fr
     http://www.lri.fr/~teytaud/
  2. Power systems, high scale
     [Diagram: Production, Network, Demand; feedback mechanism (smart grids)]
  3. For choosing investments we want to simulate systems
     ● Difficulties:
        ● Demand varies over time, with limited predictability
        ● Transportation introduces constraints
        ● Renewables ==> more variability
     ● Problems:
        ● Limited predictability has an impact ==> anticipative high-level techniques underestimate the need for storage / smoothing
        ● Markovian assumptions ==> wrong
        ● A system which neglects “base ≠ peak” cannot be used
     ==> Model error >> optimization error
     ==> Machine Learning on top of Math. Programming
  4. Math programming and machine learning
     ● Math programming:
        ● Nearly exact solutions
        ● High-dimensional constrained action space
        ● But small state space & not anytime
     ● Reinforcement learning:
        ● Unstable
        ● Small / simple action space
        ● But high-dimensional state space & anytime
  5. Stochastic dynamic programming
     ● Step 1: compute Bellman's function
     ● Step 2: make decisions
     ● Huge computation time
     ● Assumes Markovian models
     ● Neglects non-linearities
     ● Can work with huge constrained action space
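The two steps above can be sketched on a toy storage problem. This is only an illustration under stated assumptions, not the model used on the slides: the stock levels, the two-valued i.i.d. demand (a Markovian assumption), the unit inflow, and the quadratic `thermal_cost` are all hypothetical choices, and the names `bellman_values` and `decide` are invented for this sketch.

```python
# Toy stochastic dynamic programming on a storage problem.
# All constants and function names below are hypothetical, for illustration only.

S_MAX = 3                        # storage capacity (discrete stock levels 0..S_MAX)
HORIZON = 4                      # number of time steps
INFLOW = 1                       # deterministic inflow added to the stock each step
DEMANDS = [(1, 0.5), (3, 0.5)]   # (demand, probability), i.i.d. across time steps


def thermal_cost(q):
    """Cost of covering the unserved demand q with thermal production."""
    return q * q


def next_stock(s, u):
    """Stock after releasing u units and receiving the inflow."""
    return min(S_MAX, s - u + INFLOW)


def q_value(t, s, d, u, V):
    """Immediate cost plus Bellman value of the resulting stock."""
    return thermal_cost(d - u) + V[t + 1][next_stock(s, u)]


def bellman_values():
    """Step 1: backward recursion, V[t][s] = E_d[min_u cost + V[t+1][s']]."""
    V = [[0.0] * (S_MAX + 1) for _ in range(HORIZON + 1)]
    for t in range(HORIZON - 1, -1, -1):
        for s in range(S_MAX + 1):
            V[t][s] = sum(
                p * min(q_value(t, s, d, u, V) for u in range(min(s, d) + 1))
                for d, p in DEMANDS
            )
    return V


def decide(t, s, d, V):
    """Step 2: pick the release u minimizing immediate cost + future value."""
    return min(range(min(s, d) + 1), key=lambda u: q_value(t, s, d, u, V))
```

The table `V` has one entry per time step and stock level, which is why the approach needs a small state space; the inner minimization over `u` is where a mathematical programming solver would handle a large constrained action space.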
  6. Direct Policy Search
     ● Define a parametric function
        ● Neural network
        ● Handcrafted function
     ● Non-linear optimization
        ● The best θ is the one which performs best on simulations ==> obtained by non-linear stochastic optimization
        ● Non-linearities OK, arbitrary stochastic process, large state space ==> little model bias
        ● No solution for huge constrained action spaces
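A minimal sketch of this idea, assuming a handcrafted three-parameter rule and a simple (1+1)-style random search as the optimizer. The storage model, the time-correlated demand process, and every name here (`policy`, `simulate`, `direct_policy_search`) are hypothetical; the slides do not specify them. For reproducibility the search evaluates each θ on a fixed set of scenarios, which is a simplification of noisy optimization.

```python
import random

S_MAX = 10.0    # storage capacity (hypothetical)
HORIZON = 24    # time steps per simulated scenario


def make_scenarios(n, seed=0):
    """Draw n demand paths from a process that is correlated over time."""
    rng = random.Random(seed)
    scenarios = []
    for _ in range(n):
        d, path = 2.0, []
        for _ in range(HORIZON):
            d = min(4.0, max(0.0, 0.7 * d + 0.6 + rng.gauss(0.0, 0.8)))
            path.append(d)
        scenarios.append(path)
    return scenarios


def policy(theta, stock, demand):
    """Handcrafted parametric rule: how much demand to serve from storage."""
    wanted = theta[0] + theta[1] * stock + theta[2] * demand
    return min(stock, demand, max(0.0, wanted))


def simulate(theta, scenario):
    """Total cost of running the policy on one demand path."""
    stock, cost = S_MAX / 2, 0.0
    for demand in scenario:
        u = policy(theta, stock, demand)
        cost += (demand - u) ** 2            # non-linear cost of unserved demand
        stock = min(S_MAX, stock - u + 1.0)  # unit inflow each step
    return cost


def average_cost(theta, scenarios):
    return sum(simulate(theta, s) for s in scenarios) / len(scenarios)


def direct_policy_search(scenarios, iterations=300, sigma=0.3, seed=1):
    """Keep the best theta found by Gaussian perturbations of the incumbent."""
    rng = random.Random(seed)
    best = [0.0, 0.0, 0.0]
    best_cost = average_cost(best, scenarios)
    for _ in range(iterations):
        cand = [x + rng.gauss(0.0, sigma) for x in best]
        cost = average_cost(cand, scenarios)
        if cost <= best_cost:
            best, best_cost = cand, cost
    return best, best_cost
```

A typical call would be `theta, cost = direct_policy_search(make_scenarios(20))`. Note that the policy only ever sees simulated trajectories, so the cost and the demand process can be arbitrary, whereas the action here is a single scalar, which reflects the limitation on action spaces listed above.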
  7. Math prog & reinforcement learning
     ● Here, we consider “math prog = heuristic”, because it's fast but has a strong model bias
     ● Proposals:
        ● MCTS (Monte-Carlo Tree Search) + heuristic
        ● DPS (Direct Policy Search) + heuristic
        ● Example: non-linear in (θ, x_t), linear in x_(t+1)
     ● DPS-style: little model bias, arbitrary random process, large state space
     ● Bellman-style: OK for large constrained action spaces
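The example formula from this slide is not present in the transcript. One form that would be consistent with the two annotations is given below purely as an assumption, not as the authors' formula: a one-step decision rule whose valuation of the next state is linear in x_(t+1), with coefficients that depend non-linearly on (θ, x_t). The symbols c (immediate cost), U (feasible actions), f (transition) and α (parametric valuation, e.g. a neural network) are hypothetical names introduced for this sketch.

```latex
u_t \in \arg\min_{u \in U(x_t)} \; c(x_t, u) + \alpha(\theta, x_t)^{\top} x_{t+1},
\qquad x_{t+1} = f(x_t, u)
```

Under these assumptions, if c and f are linear in u, each decision remains a linear program (Bellman-style, suited to large constrained action spaces), while θ can be tuned on simulations (DPS-style).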
  8. Works in Tao
     ● Noisy non-linear optimization
        ● Fabian's algorithm
        ● Anytime properties (for bilevel problems)
        ● Evolutionary algorithms
     ● Reinforcement learning
        ● MCTS (Monte Carlo Tree Search) on top of heuristics
        ● DPS (combined with MCTS or heuristics)
     ● Links with Artelys:
        ● Joint software
        ● Experiments
           ● Non-anticipativity
           ● Non-linearities
