Hierarchical POMDP Planning and Execution


    Notes on slide 1

    Talk to you about my recent work on ...


    Hierarchical POMDP Planning and Execution - Presentation Transcript

    1. Hierarchical POMDP Planning and Execution Joelle Pineau Machine Learning Lunch November 20, 2000
    2. Partially Observable MDP
      • POMDPs are characterized by:
        • States: s ∈ S
        • Actions: a ∈ A
        • Observations: o ∈ O
        • Transition probabilities: T(s,a,s’)=Pr(s’|s,a)
        • Observation probabilities: O(o,a,s’)=Pr(o|s’,a)
        • Rewards: R(s,a)
        • Beliefs: b(s_t)=Pr(s_t|o_t,a_t,…,o_0,a_0)
      [Figure: states S_1, S_2, S_3]
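The belief on slide 2 is usually maintained recursively with a Bayes filter rather than recomputed from the whole history. The sketch below is a minimal illustration of that update, not code from the talk: the function name `belief_update`, the nested-dictionary layout of the model, and the two-state toy numbers are all assumptions made for the example.

```python
def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step: b'(s') is proportional to
    O[a][s'][o] * sum_s T[s][a][s'] * b(s).

    belief: dict state -> probability
    T:      dict T[s][a][s_next] = Pr(s_next | s, a)
    O:      dict O[a][s_next][o] = Pr(o | s_next, a)
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Prediction step: push the old belief through the transition model.
        predicted = sum(T[s][action][s_next] * belief[s] for s in states)
        # Correction step: weight by the likelihood of the observation.
        unnormalized[s_next] = O[action][s_next][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical two-state toy model (numbers invented for illustration).
STATES = ["Kitchen", "Bedroom"]
T = {s: {"Stay": {s2: 1.0 if s2 == s else 0.0 for s2 in STATES}} for s in STATES}
O = {"Stay": {s: {o: 0.8 if o == s else 0.2 for o in STATES} for s in STATES}}

prior = {"Kitchen": 0.5, "Bedroom": 0.5}
posterior = belief_update(prior, "Stay", "Kitchen", T, O)
print(posterior)
```

With a uniform prior and an observation that is correct 80% of the time, the posterior puts roughly 0.8 on the observed state, which is the behavior one would expect from the definitions above.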
    3. The problem
      • How can we find good policies for complex POMDPs?
      • Is there a principled way to provide near-optimal policies?
    4. Proposed Approach
      • Exploit structure in the problem domain.
      • What type of structure?
        • Action set partitioning
      [Figure: action hierarchy tree. Node labels: Act, InvestigateHealth, Move, Navigate, CheckPulse, AskWhere, Left, Right, Up, Down, CheckMeds]
    5. Hierarchical POMDP Planning
      • What do we start with?
        • A full POMDP model: {S_o, A_o, O_o, M_o}.
        • An action set partitioning graph.
      • Key idea:
        • Break the problem into many “related” POMDPs.
        • Each smaller POMDP has only a subset of A_o.
          • This imposes a constraint on the policy.
      • But why?
        • POMDP: exponential run-time per value iteration, O(|A| |Γ_{n-1}|^|O|), where Γ_{n-1} is the set of value-function vectors from the previous iteration.
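To see why a smaller action set helps, recall the standard worst-case bound for exact value iteration: one backup can produce up to |A| times |Γ_{n-1}| to the power |O| vectors, where Γ_{n-1} is the previous set of value-function vectors. The sketch below simply evaluates that bound; the function names are made up for this example, and it ignores pruning, which real solvers rely on.

```python
def max_vectors_after_backup(num_actions, num_observations, num_prev_vectors):
    """Worst-case number of value-function vectors after one exact backup."""
    return num_actions * num_prev_vectors ** num_observations


def growth(num_actions, num_observations, iterations, initial_vectors=1):
    """Worst-case vector counts for iterations 0..iterations (no pruning)."""
    sizes = [initial_vectors]
    for _ in range(iterations):
        sizes.append(
            max_vectors_after_backup(num_actions, num_observations, sizes[-1])
        )
    return sizes


# Four actions and four observations, versus a subtask with three actions.
full = growth(num_actions=4, num_observations=4, iterations=2)
subtask = growth(num_actions=3, num_observations=4, iterations=2)
print(full)     # [1, 4, 1024]
print(subtask)  # [1, 3, 243]
```

Because the action count enters every iteration and is then raised to the power |O| at the next one, even removing a single action from a controller shrinks the worst case quickly.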
    6. Example
      • POMDP:
        • S_o = {Meds, Kitchen, Bedroom}
        • A_o = {ClarifyTask, CheckMeds, GoToKitchen, GoToBedroom}
        • O_o = {Noise, Meds, Kitchen, Bedroom}
      [Figure: diagram with nodes M, B, K, E and edge labels 0.1 and 0.8]
      [Figure: value function over MedsState, KitchenState, BedroomState; regions labeled GoToKitchen, ClarifyTask, GoToBedroom, CheckMeds]
    7. Hierarchical POMDP
      • Action partitioning:
        • Act → {Move, CheckMeds, ClarifyTask}
        • Move → {ClarifyTask, GoToKitchen, GoToBedroom}
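An action partitioning like the one on slide 7 can be held in a small dictionary that maps each controller to its children. The sketch below is only an illustration: the tree is read off the flattened slide text (Act over Move, CheckMeds, ClarifyTask; Move over ClarifyTask, GoToKitchen, GoToBedroom), which is an assumption about the original figure, and the helper names are hypothetical.

```python
# Controller -> child actions. Reconstructed from the slide text; the exact
# tree shape is an assumption.
HIERARCHY = {
    "Act": ["Move", "CheckMeds", "ClarifyTask"],
    "Move": ["ClarifyTask", "GoToKitchen", "GoToBedroom"],
}


def is_abstract(action):
    """An action is abstract if it has its own controller in the hierarchy."""
    return action in HIERARCHY


def primitive_actions(controller):
    """Primitive actions reachable from a controller, without duplicates."""
    found = []
    for child in HIERARCHY[controller]:
        leaves = primitive_actions(child) if is_abstract(child) else [child]
        for leaf in leaves:
            if leaf not in found:
                found.append(leaf)
    return found


def bottom_up_order(root):
    """Controllers ordered so each appears after the controllers below it."""
    order = []

    def visit(controller):
        for child in HIERARCHY[controller]:
            if is_abstract(child):
                visit(child)
        if controller not in order:
            order.append(controller)

    visit(root)
    return order


print(primitive_actions("Act"))
print(bottom_up_order("Act"))  # ['Move', 'Act']
```

The bottom-up order matters because a parent controller needs a model of each abstract child, so the Move controller would be solved before Act.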
    8. Local Value Function and Policy - Move Controller
      [Figure: value function over MedsState, KitchenState, BedroomState; regions labeled ClarifyTask, GoToKitchen, GoToBedroom]
    9. Modeling Abstract Actions
      • Problem: Need parameters for abstract action Move.
      • Solution: Use the local policy of the corresponding low-level controller.
      • General form: Pr(s_j | s_i, a_k^abstract) = Pr(s_j | s_i, Policy(a_k^abstract, s_i))
      • Example: Pr(s_j | MedsState, Move) = Pr(s_j | MedsState, ClarifyTask)
      [Figure: Policy(Move, s_i) over MedsState, KitchenState, BedroomState; regions labeled ClarifyTask, GoToKitchen, GoToBedroom]
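The general form on slide 9 amounts to copying, for each state, the transition row of whichever primitive action the lower-level policy picks there. A minimal sketch of that construction follows. The function name, the deterministic toy transition model, and the policy entries for KitchenState and BedroomState are invented for illustration; only the MedsState → ClarifyTask entry comes from the slide's example.

```python
STATES = ["MedsState", "KitchenState", "BedroomState"]


def point_mass(target):
    """Distribution that puts probability 1 on a single state."""
    return {s: 1.0 if s == target else 0.0 for s in STATES}


# Hypothetical transition model T[s][a][s_next]; the numbers are made up.
T = {
    s: {
        "ClarifyTask": point_mass(s),  # assumed to leave the state unchanged
        "GoToKitchen": point_mass("KitchenState"),
        "GoToBedroom": point_mass("BedroomState"),
    }
    for s in STATES
}

# Local policy of the Move controller, indexed by state as on the slide.
move_policy = {
    "MedsState": "ClarifyTask",     # from the slide's example
    "KitchenState": "GoToBedroom",  # hypothetical
    "BedroomState": "GoToKitchen",  # hypothetical
}


def abstract_transition(T, local_policy, states):
    """Pr(s_j | s_i, abstract) = Pr(s_j | s_i, local_policy[s_i])."""
    return {
        s_i: {s_j: T[s_i][local_policy[s_i]][s_j] for s_j in states}
        for s_i in states
    }


T_move = abstract_transition(T, move_policy, STATES)
print(T_move["MedsState"])
```

The same row-copying idea would apply to the observation and reward parameters of the abstract action, although the slide only spells out the transition case.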
    10. Local Value Function and Policy - Act Controller
      [Figure: value function over MedsState, KitchenState, BedroomState; regions labeled Move, CheckMeds]
    11. Comparing Policies
      [Figure: hierarchical policy and optimal policy; legend: ClarifyTask, CheckMeds, GoToKitchen, GoToBedroom]
    12. Bounding the value of the approximation
      • The value function of the top-level controller is an upper bound on the value of the approximation.
        • Why? We were optimistic when modeling the abstract action.
      • Similarly, we can find a lower bound.
        • How? We can assume a “worst-case” view when modeling the abstract action.
      • If we partition the action set differently, we will get different bounds.
    13. A real dialogue management example
      • Actions: AskGoWhere, GoToRoom, GoToKitchen, GoToFollow, VerifyRoom, VerifyKitchen, VerifyFollow, GreetGeneral, GreetMorning, GreetNight, RespondThanks, AskWeatherTime, SayCurrent, SayToday, SayTomorrow, StartMeds, NextMeds, ForceMeds, QuitMeds, AskCallWho, CallHelp, CallNurse, CallRelative, VerifyHelp, VerifyNurse, VerifyRelative, AskHealth, OfferHelp, SayTime
      [Figure: action hierarchy. Node labels: Act, CheckHealth, Phone, DoMeds, CheckWeather, Move, Greet]
    14. Results:
    15. Final words
      • We presented:
        • a general framework to exploit structure in POMDPs;
      • Future work:
        • automatic generation of good action partitioning;
        • conditions for additional observation abstraction;
        • bigger problems!

