3slides
3slides Presentation Transcript

  • 1. TAO - Inria Saclay-IDF
    ● Machine Learning & Optimization
    ● Tao-uctsig:
      ● Sequential decision making
      ● One permanent, full time + others part time
      ● Applications to energy management
      ● Strong collaboration with
        ● Taiwan
        ● Artelys (Ilab Metis; joint software)
        ● Others
    O. Teytaud, Research Fellow, olivier.teytaud@inria.fr http://www.lri.fr/~teytaud/
  • 2. Power systems, high scale
    [Diagram: Production, Network, Demand; feedback mechanism (smart grids)]
  • 3. For choosing investments we want to simulate systems
    ● Difficulties:
      ● Demand varying in time, bounded prediction
      ● Transportation introduces constraints
      ● Renewable ==> variability ++
    ● Problems:
      ● Limited predictability has an impact ==> anticipative high-level techniques underestimate the need for storage / smoothing
      ● Markovian assumptions ==> wrong
      ● A system which neglects “base ≠ peak” cannot be used.
    ==> Model error >> optimization error
    ==> Machine Learning on top of Math. Programming
  • 4. Math programming and machine learning
    ● Math programming:
      ● Nearly exact solutions
      ● High-dimensional constrained action space
      ● But small state space & not anytime
    ● Reinforcement learning:
      ● Unstable
      ● Small / simple action space
      ● But high-dimensional state space & anytime
  • 5. Stochastic dynamic programming
    ● Step 1: compute Bellman's function
    ● Step 2: make decisions
    ● Huge computation time. Assumes Markovian models. Neglects non-linearities.
    ● Can work with huge constrained action space
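The two steps on this slide can be illustrated with a minimal sketch on a toy storage problem. Everything in it is an assumption made for illustration and is not taken from the slides: the discrete stock levels, the time-varying price list, the demand distribution (independent across time steps, which is exactly the Markovian assumption the slide criticizes), and the function name `solve_sdp`.

```python
def solve_sdp(prices, demand_dist, max_stock):
    """Backward induction on a toy storage problem (illustrative only).

    prices:      price of buying one unit of energy at each time step
    demand_dist: list of (demand, probability), independent across time
    max_stock:   largest integer stock level
    Returns (value, policy) where value[t][s] is the expected cost-to-go
    (Bellman function) and policy[t][s] is the amount released from storage.
    """
    horizon = len(prices)
    value = [[0.0] * (max_stock + 1) for _ in range(horizon + 1)]
    policy = [[0] * (max_stock + 1) for _ in range(horizon)]
    # Step 1: compute the Bellman function, from the last step backwards.
    for t in reversed(range(horizon)):
        for s in range(max_stock + 1):
            best_cost, best_a = float("inf"), 0
            for a in range(s + 1):  # release a units, 0 <= a <= s
                # Expected cost of buying the unmet demand at the current price.
                expected = sum(p * prices[t] * max(d - a, 0)
                               for d, p in demand_dist)
                total = expected + value[t + 1][s - a]
                if total < best_cost:
                    best_cost, best_a = total, a
            value[t][s] = best_cost
            # Step 2: the decision is the minimizing action, stored per state.
            policy[t][s] = best_a
    return value, policy
```

With prices `[1.0, 3.0]`, a deterministic demand of 1 and one unit in stock, the policy keeps the unit during the cheap step and releases it during the expensive one. The triple loop over time, states and actions also shows why the computation time grows quickly with the size of the state space.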
  • 6. Direct Policy Search
    ● Define a parametric function
      ● Neural network
      ● Handcrafted function
    ● Non-linear optimization
      ● The best θ is the one which performs best on simulations ==> obtained by non-linear stochastic optimization
      ● Non-linearities ok, arbitrary stochastic process, large state space ==> little model bias
      ● No solution for huge constrained action spaces
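A minimal sketch of the idea on this slide: a handcrafted parametric policy whose parameters θ are tuned by simulation. The simulator (a storage facing noisy demand with base and peak prices), the linear-then-clipped policy, and the simple (1+1) evolution strategy used as the optimizer are all assumptions chosen for illustration, not the method used in the talk.

```python
import random

def simulate(theta, rng, horizon=24):
    """One simulated day; returns the total cost of energy bought."""
    stock, cost = 10.0, 0.0
    for t in range(horizon):
        price = 3.0 if 8 <= t < 20 else 1.0     # hypothetical peak / base prices
        demand = max(0.0, rng.gauss(1.0, 0.3))  # hypothetical noisy demand
        # Handcrafted parametric policy: linear in (stock, price), then clipped
        # so that the release is feasible.
        release = theta[0] + theta[1] * stock + theta[2] * price
        release = min(max(release, 0.0), stock, demand)
        stock -= release
        cost += price * (demand - release)
    return cost

def average_cost(theta, seed, n_sims=30):
    """Average cost over n_sims simulations drawn from a fixed seed."""
    rng = random.Random(seed)
    return sum(simulate(theta, rng) for _ in range(n_sims)) / n_sims

def direct_policy_search(n_iters=200, sigma=0.5, seed=0):
    """(1+1) evolution strategy: keep a mutated theta if it is not worse."""
    rng = random.Random(seed)
    best = [0.0, 0.0, 0.0]
    best_cost = average_cost(best, seed)
    for _ in range(n_iters):
        candidate = [x + sigma * rng.gauss(0.0, 1.0) for x in best]
        # Same seed for every evaluation, so candidates face the same demands.
        cost = average_cost(candidate, seed)
        if cost <= best_cost:
            best, best_cost = candidate, cost
    return best, best_cost
```

The optimizer only needs simulated costs, so the simulator could contain any non-linearity or any stochastic process. The clipping step hints at the limitation named on the slide: with a large constrained action space, mapping the policy output to a feasible action is no longer a one-line operation.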
  • 7. Math prog & reinforcement learning
    ● Here, we consider “math prog = heuristic”, because it's fast but with strong model bias
    ● Proposals:
      ● MCTS (Monte-Carlo Tree Search) + heuristic
      ● DPS (Direct Policy Search) + heuristic
    ● DPS-style: little model bias, arbitrary random process, large state space
    ● Bellman-style: ok for large constrained action spaces
    ● Example: non-linear ~ (θ, xt); linear ~ x(t+1)
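One possible reading of the example on this slide, sketched under stated assumptions: the decision minimizes the immediate cost minus a valuation of the next state that is linear in x(t+1), while the slope of that valuation is a non-linear function of (θ, xt). The one-neuron form of `marginal_value`, the storage setting, and the grid search standing in for a mathematical programming solver are all hypothetical.

```python
import math

def marginal_value(theta, stock, t):
    """Value of one stored unit: non-linear in (theta, x_t). Hypothetical form."""
    return theta[0] + theta[1] * math.tanh(theta[2] * stock + theta[3] * t)

def decide(theta, stock, t, price, demand, n_grid=10):
    """Choose the release minimizing immediate cost minus value of next stock.

    The objective is linear in the next stock, so a real system could hand it
    to a linear programming solver; a small grid search stands in for that here.
    """
    w = marginal_value(theta, stock, t)
    max_release = min(stock, demand)
    best_a, best_obj = 0.0, float("inf")
    for k in range(n_grid + 1):
        a = max_release * k / n_grid
        next_stock = stock - a
        objective = price * (demand - a) - w * next_stock
        if objective < best_obj:
            best_a, best_obj = a, objective
    return best_a
```

With `theta = [2.0, 0.0, 0.0, 0.0]` the stored unit is worth 2, so the sketch releases the full demand when the price is 3 and nothing when the price is 1. In this reading θ would be tuned on simulations, as in direct policy search, while the inner minimization keeps the Bellman-style ability to handle constrained actions.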
  • 8. Works in Tao
    ● Noisy non-linear optimization
      ● Fabian's algorithm
      ● Anytime properties (for bilevel problems)
      ● Evolutionary algorithms
    ● Reinforcement learning
      ● MCTS (Monte Carlo Tree Search) on top of heuristics
      ● DPS (combined with MCTS or heuristics)
    ● Links with Artelys:
      ● Joint software
      ● Experiments
        – Non-anticipativity
        – Non-linearities