
© All Rights Reserved

- 1. TAO - Inria Saclay-IDF
  - Machine Learning & Optimization
  - Tao-uctsig:
    - Sequential decision making
    - One permanent, full time + others part time
    - Applications to energy management
    - Strong collaboration with
      - Taiwan
      - Artelys (Ilab Metis; joint software)
      - Others
  - O. Teytaud, Research Fellow, olivier.teytaud@inria.fr, http://www.lri.fr/~teytaud/
- 2. Power systems, large scale
  - Diagram: Production, Network, Demand, with a feedback mechanism (smart grids)
- 3. For choosing investments, we want to simulate systems
  - Difficulties:
    - Demand varies over time and can only be predicted within bounds
    - Transportation introduces constraints
    - Renewables ==> more variability
  - Problems:
    - Limited predictability has an impact ==> anticipative high-level techniques underestimate the need for storage / smoothing
    - Markovian assumptions ==> wrong
    - A system which neglects “base ≠ peak” cannot be used
  - ==> Model error >> optimization error ==> Machine Learning on top of Math. Programming
- 4. Math programming and machine learning
  - Math programming:
    - Nearly exact solutions
    - High-dimensional constrained action space
    - But small state space & not anytime
  - Reinforcement learning:
    - Unstable
    - Small / simple action space
    - But high-dimensional state space & anytime
- 5. Stochastic dynamic programming
  - Huge computation time. Assumes Markovian models. Neglects non-linearities.
  - Can work with huge constrained action spaces
  - Step 1: compute Bellman's function
  - Step 2: make decisions
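
The two steps of stochastic dynamic programming can be illustrated on a toy storage problem. This is a minimal sketch, not the model from the slides: the stock levels, demand distribution, costs and the function names (`transition`, `expected_cost`, `bellman_values`, `decide`) are all assumptions chosen so that the example fits on a page. Demand is drawn independently at each step, which is exactly the kind of Markovian assumption the slide warns about.

```python
# Toy storage problem; every constant below is an illustrative assumption.
CAPACITY = 3                    # stock levels 0..CAPACITY
HORIZON = 4                     # number of decision steps
ACTIONS = range(3)              # units produced per step: 0, 1 or 2
DEMAND = [(1, 0.5), (2, 0.5)]   # (demand, probability), independent across steps
PRODUCTION_COST = 1.0           # cost per unit produced
SHORTAGE_COST = 5.0             # cost per unit of unmet demand


def transition(stock, produce, demand):
    """Return (cost, next_stock) for one time step."""
    available = stock + produce
    served = min(available, demand)
    cost = PRODUCTION_COST * produce + SHORTAGE_COST * (demand - served)
    next_stock = min(CAPACITY, available - served)  # surplus above capacity is lost
    return cost, next_stock


def expected_cost(stock, produce, next_values):
    """Expected immediate cost plus expected cost-to-go from the next state."""
    total = 0.0
    for demand, prob in DEMAND:
        cost, next_stock = transition(stock, produce, demand)
        total += prob * (cost + next_values[next_stock])
    return total


def bellman_values():
    """Step 1: backward induction. values[t][s] is the optimal cost-to-go."""
    values = [[0.0] * (CAPACITY + 1) for _ in range(HORIZON + 1)]
    for t in range(HORIZON - 1, -1, -1):
        for stock in range(CAPACITY + 1):
            values[t][stock] = min(
                expected_cost(stock, a, values[t + 1]) for a in ACTIONS
            )
    return values


def decide(stock, t, values):
    """Step 2: pick the action minimising cost now plus Bellman value later."""
    return min(ACTIONS, key=lambda a: expected_cost(stock, a, values[t + 1]))
```

Calling `values = bellman_values()` once and then `decide(stock, t, values)` at each step gives the policy. The table has one entry per (time, state) pair, which is why this approach only scales to small state spaces.
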
- 6. Direct Policy Search
  - Define a parametric function
    - Neural network
    - Handcrafted function
  - Non-linear optimization
    - The best θ is the one which performs best on simulations ==> obtained by non-linear stochastic optimization
    - Non-linearities OK, arbitrary stochastic process, large state space ==> little model bias
    - No solution for huge constrained action spaces
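
Direct Policy Search can be sketched in a few lines: pick a parametric policy, score each θ by simulation, and keep the θ that scores best. The example below is hypothetical throughout. The storage simulator, the linear handcrafted policy, the persistent demand process (not Markovian in the stock alone) and the plain random local search over a fixed scenario set are all assumptions standing in for whatever simulator and stochastic optimizer would actually be used.

```python
import random

# Illustrative constants for a toy storage simulator (assumptions).
CAPACITY = 3
HORIZON = 20
MAX_PRODUCTION = 2
PRODUCTION_COST = 1.0
SHORTAGE_COST = 5.0


def policy(theta, stock):
    """Handcrafted parametric policy: production is linear in the stock, then clipped."""
    produce = round(theta[0] + theta[1] * stock)
    return max(0, min(MAX_PRODUCTION, produce))


def simulate(theta, rng):
    """Run one scenario and return its total cost."""
    stock, demand, total = 0, 1, 0.0
    for _ in range(HORIZON):
        # Persistent demand: it keeps its previous value with probability 0.7,
        # so the stock alone is not a Markovian state.
        if rng.random() > 0.7:
            demand = rng.choice([1, 2])
        produce = policy(theta, stock)
        available = stock + produce
        served = min(available, demand)
        total += PRODUCTION_COST * produce + SHORTAGE_COST * (demand - served)
        stock = min(CAPACITY, available - served)
    return total


def average_cost(theta, seeds):
    """Average simulated cost over a fixed set of scenarios (one seed each)."""
    return sum(simulate(theta, random.Random(s)) for s in seeds) / len(seeds)


def direct_policy_search(iterations=200, n_scenarios=30, seed=0):
    """Random local search on theta; returns (best_theta, best_average_cost)."""
    rng = random.Random(seed)
    seeds = list(range(n_scenarios))
    best = [0.0, 0.0]
    best_cost = average_cost(best, seeds)
    for _ in range(iterations):
        candidate = [b + rng.gauss(0.0, 0.5) for b in best]
        cost = average_cost(candidate, seeds)
        if cost <= best_cost:  # keep the candidate only if it is no worse
            best, best_cost = candidate, cost
    return best, best_cost
```

Nothing in `simulate` needs to be linear or Markovian, which is where the low model bias comes from. The policy here outputs a single scalar, though, so the sketch also shows the limitation noted on the slide: it offers nothing for huge constrained action spaces.
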
- 7. Math prog & reinforcement learning
  - Here, we consider “math prog = heuristic”, because it's fast but with strong model bias
  - Proposals:
    - MCTS (Monte-Carlo Tree Search) + heuristic
    - DPS (Direct Policy Search) + heuristic
  - DPS-style: little model bias, arbitrary random process, large state space
  - Bellman-style: OK for large constrained action spaces
  - Example: non-linear in (θ, x_t), linear in x_{t+1}
- 8. Work in TAO
  - Noisy non-linear optimization
    - Fabian's algorithm
    - Anytime properties (for bilevel problems)
    - Evolutionary algorithms
  - Reinforcement learning
    - MCTS (Monte Carlo Tree Search) on top of heuristics
    - DPS (combined with MCTS or heuristics)
  - Links with Artelys:
    - Joint software
    - Experiments
      - Non-anticipativity
      - Non-linearities
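
As an illustration of evolutionary algorithms applied to noisy non-linear optimization, here is a minimal sketch of a (1+1) evolution strategy with resampling. It is a generic textbook-style variant, not the team's code: the noisy test objective `noisy_sphere`, the averaging over `n_evals` evaluations and the step-size factors are all assumptions made for the example.

```python
import random


def noisy_sphere(x, rng, noise_std=0.1):
    """Hypothetical noisy objective: squared norm plus Gaussian noise."""
    return sum(v * v for v in x) + rng.gauss(0.0, noise_std)


def averaged(f, x, rng, n_evals):
    """Average several noisy evaluations to reduce the variance of the estimate."""
    return sum(f(x, rng) for _ in range(n_evals)) / n_evals


def one_plus_one_es(f, x0, sigma=1.0, iterations=300, n_evals=5, seed=0):
    """(1+1)-ES: one parent, one mutated child per iteration, adaptive step size."""
    rng = random.Random(seed)
    parent = list(x0)
    for _ in range(iterations):
        child = [v + sigma * rng.gauss(0.0, 1.0) for v in parent]
        # The parent is re-evaluated every iteration so that one lucky noisy
        # score cannot protect it forever.
        if averaged(f, child, rng, n_evals) <= averaged(f, parent, rng, n_evals):
            parent = child
            sigma *= 2.0            # success: enlarge the step
        else:
            sigma *= 2.0 ** -0.25   # failure: shrink it (balances at 1 success in 5)
    return parent
```

For example, `one_plus_one_es(noisy_sphere, [3.0, -2.0])` returns a point near the origin. The algorithm is anytime in the sense that `parent` is a usable answer after any number of iterations.
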
