Transcript of "December 4, Project"
1. Combining Reactive and Deliberative Algorithms
   CSCI7000: Final Presentation
   Maciej Stachura, Dec. 4, 2009
2. Outline
   • Project Overview
   • Positioning System
   • Hardware Demo
3. Project Goals
   • Combine deliberative and reactive algorithms
   • Show stability and completeness
   • Demonstrate multi-robot coverage on iCreate robots
4. Coverage Problem
   • Cover entire area
   • Deliberative algorithm plans next point to visit
   • Reactive algorithm pushes robot to that point
   • Reactive algorithm adds 2 constraints:
     • Maintain communication distance
     • Collision avoidance
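The deliberative/reactive split described on this slide can be sketched as a potential-field update toward a planned waypoint. The function below, its name, and all gains and ranges are illustrative assumptions, not the presenter's implementation:

```python
import math

def reactive_step(pos, goal, neighbors, step=0.1, comm_range=5.0, avoid_range=0.5):
    """One reactive update: attract to the deliberatively chosen goal point,
    repel from robots closer than avoid_range (collision avoidance), and pull
    back toward neighbors drifting past comm_range (communication constraint).
    All gains and range values here are illustrative."""
    fx = goal[0] - pos[0]
    fy = goal[1] - pos[1]
    for nx, ny in neighbors:
        dx, dy = pos[0] - nx, pos[1] - ny
        d = math.hypot(dx, dy) or 1e-9
        if d < avoid_range:            # too close: push away
            fx += (dx / d) * (avoid_range - d) * 10.0
            fy += (dy / d) * (avoid_range - d) * 10.0
        elif d > comm_range:           # too far: pull back toward neighbor
            fx -= (dx / d) * (d - comm_range)
            fy -= (dy / d) * (d - comm_range)
    norm = math.hypot(fx, fy) or 1e-9
    return (pos[0] + step * fx / norm, pos[1] + step * fy / norm)
```

With no neighbors the robot simply takes a fixed-length step toward the goal; the deliberative layer would swap in a new goal once the current point is reached.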
5. Proof of Stability
   [Derivation shown as equations on the slide.]
   Therefore the system is stable; also, the error decays.
6. Demo for Single Vehicle
   • Implemented on iCreate
   • 5 points to visit
   • Deliberative algorithm selects point
   • Reactive algorithm uses potential field to reach point
   • Point reached when within some minimum distance
   VIDEO
7. Multi-Robot Case
   • 2-robot coverage
   • Blue is free to move
   • Green must stay in communication range
   • MATLAB simulation
   VIDEO
8. Outline
   • Project Overview
   • Positioning System
   • Hardware Demo
9. Positioning System
   • Problems with Stargazer:
     • Periods of no measurement
     • Occasional bad measurements
   • State Estimation (SPF):
     • Combine Stargazer with odometry
     • Reject bad measurements
10. SPF Explanation
    • Sigma Point Filter uses Stargazer and odometry measurements to predict robot position
    • Non-Gaussian noise
    • Implemented and tested on robot platform
    • Performs very well in the presence of missing or bad measurements
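The "reject bad measurements" step can be illustrated with a standard innovation gate. The function name, the scalar form, and the 3-sigma threshold are assumptions for illustration, not details from the talk:

```python
def gate_measurement(z, z_pred, innov_var, nsigma=3.0):
    """Accept a Stargazer measurement z only if its innovation (z - z_pred)
    falls within nsigma standard deviations of the filter's prediction.
    Rejected fixes are dropped and the filter coasts on odometry instead."""
    return abs(z - z_pred) <= nsigma * innov_var ** 0.5
```

Gating like this is what lets the filter ride out both dropout periods (no measurement to gate) and the occasional wildly wrong fix.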
11. Outline
    • Project Overview
    • Positioning System
    • Hardware Demo
12. Roomba Pac-Man
    • Implemented 5-robot demo along with Jack Elston
    • Re-creation of the Pac-Man game
    • Demonstrate NetUAS system
    • Showcase most of the concepts from class
13. Video
14. Roomba Pac-Man
    • Reactive Algorithms:
      ◦ Walls of maze
      ◦ Potential field
    • Deliberative Algorithms:
      ◦ Ghost planning (enumerate states)
      ◦ Collision avoidance
      ◦ Game modes
    • Decentralized:
      ◦ Each ghost ran planning algorithm
      ◦ Collaborated on positions
    • Communication:
      ◦ 802.11b ad-hoc network
      ◦ AODV, no centralized node
18. Roomba Pac-Man
    • Simulation:
      ◦ Multi-threaded simulation of robots
      ◦ Combine software with hardware
    • Probabilistic Modelling:
      ◦ Sigma Point Filter
    • Human/Robot Interaction:
      ◦ Limited human control of Pac-Man
      ◦ Autonomous ghosts
    • Hardware Implementation:
      ◦ SBCs running Gentoo
    • Experimental Verification
22. Left to Do
    • Implement inter-robot potential field
    • Conduct experiments
    • Generalize theory?
23. End
    Questions?
    http://pacman.elstonj.com
24. A Gradient-Based Approach
    Greg Brown
25. • Introduction
    • Robot State Machine
    • Gradients for "Grasping" the Object
    • Gradient for Moving the Object
    • Convergence Simulation Results
    • Continuing Work
26. Place a single beacon on an object and another at the object's destination. Multiple robots cooperate to move the object.
    Goals:
    • Minimal/no robot communication
    • Object has an unknown geometry
    • Use gradients for reactive navigation
27. Each Robot Knows:
    ◦ Distance/direction to object
    ◦ Distance/direction to destination
    ◦ Distance/direction to all other robots
    ◦ Bumper sensor to detect collision
    Robots Do Not Know:
    ◦ Object geometry
    ◦ Actions other robots are taking
28. Related "Grasping" Work:
    ◦ Grasping with hand, maximizing torque [Liu et al]
    ◦ Cage objects for pushing [Fink et al]
    ◦ Tug boats manipulating barge [Esposito]
    ◦ ALL require known geometry
    My Hybrid Approach:
    ◦ Even distribution around object
    ◦ Alternate between convergence and repulsion gradients
    ◦ Similar to the cow-herding example from class
29. Pull towards object:
    \gamma = \|r_i - r_{obj}\|
    Avoid nearby robots:
    \beta = \prod_{j=1}^{N} \left[ 1 - \frac{4(1+d_c^4)}{d_c^4} \cdot \frac{(\|r_i - r_j\|^2 - d_c^2)^2}{(\|r_i - r_j\|^2 - d_c^2)^2 + 1} \right]^{\frac{\operatorname{sign}(d_c - \|r_i - r_j\|) + 1}{2}}
30. Combined Cost Function:
    \mathrm{Cost} = \frac{\gamma^2}{(\gamma^{\kappa_c} + \beta)^{1/\kappa_c}}
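Numerically, the combined cost function reads as below. The value of κ_c, the plain Euclidean distance for γ, and the function name are illustrative assumptions, not values from the slides:

```python
import math

def combined_cost(r_i, r_obj, beta, kappa_c=2.0):
    """Cost = gamma^2 / (gamma^kappa_c + beta)^(1/kappa_c), where gamma is the
    robot's distance to the object and beta is the robot-avoidance product term
    from the previous slide. kappa_c = 2.0 is an illustrative choice."""
    gamma = math.dist(r_i, r_obj)   # pull-towards-object term
    return gamma ** 2 / (gamma ** kappa_c + beta) ** (1.0 / kappa_c)
```

When β is zero (no nearby robots) the cost reduces to γ, so the gradient simply pulls the robot straight toward the object; β inflates the denominator near other robots and flattens that pull.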
31. Repel from all robots:
    \beta = \prod_{j=1}^{N} \left( \|r_i - r_j\|^2 - d_r^2 \right)
    \mathrm{Cost} = \frac{1}{(1 + \beta)^{1/\kappa_r}}
32. Related Work:
    ◦ Formations [Tanner and Kumar]
    ◦ Flocking [Lindhé et al]
    ◦ Pushing objects [Fink et al, Esposito]
    ◦ No catastrophic failure if out of position
    My Approach:
    ◦ Head towards destination in steps
    ◦ Keep close to object
    ◦ Communicate "through" object
    ◦ Maintain orientation
    Assuming forklift on robot can rotate 360º
33. Next Step Vector:
    r_{\gamma_i} = r_{ideal_i} + d_m \frac{r_{ObjCenter} - r_{ObjDest}}{\|r_{ObjCenter} - r_{ObjDest}\|}
    Pull to destination:
    \gamma_1 = \|r_i - r_{\gamma_i}\|
34. Valley Perpendicular to Travel Vector:
    m = -\frac{r_{ObjCenter_x} - r_{ObjDest_x}}{r_{ObjCenter_y} - r_{ObjDest_y} + 0.0001}
    \gamma_2 = \frac{(m\, r_{i_x} - r_{i_y} - m\, r_{\gamma_x} + r_{\gamma_y})^2}{m^2 + 1}
35. \mathrm{Cost} = \gamma_1^{\kappa_1} \, \gamma_2^{\kappa_2}
36. [Histogram: number of occurrences vs. time steps to completion (0–5000), for 3, 4, 5, and 6 robots.]
37. • Resolve convergence problems
    • Noise in sensing
    • Noise in actuation
38. [Histogram: number of occurrences vs. time steps to completion (0–5000), for 3, 4, 5, and 6 robots.]
39. A Young Modular Robot's Guide to Locomotion
    Ben Pearre
    Computer Science, University of Colorado at Boulder, USA
    December 6, 2009
40. Outline
    • Modular Robots
    • Learning
      ◦ The Problem
      ◦ The Policy Gradient
      ◦ Domain Knowledge
    • Contributions
      ◦ Going forward
      ◦ Steering
      ◦ Curriculum Development
    • Conclusion
41. Modular Robots
    How to get these to move?
42. The Learning Problem
    Given unknown sensations and actions, learn a task:
    ◮ Sensations s \in \mathbb{R}^n
    ◮ State x \in \mathbb{R}^d
    ◮ Action u \in \mathbb{R}^p
    ◮ Reward r \in \mathbb{R}
    ◮ Policy \pi(x, \theta) = \Pr(u \mid x, \theta)
    Example policy: u(x, \theta) = \theta_0 + \sum_i \theta_i (x - b_i)^T D_i (x - b_i) + \mathcal{N}(0, \sigma)
    What does that mean for locomotion?
43. Policy Gradient Reinforcement Learning: Finite Difference
    Vary \theta:
    ◮ Measure performance J_0 of \pi(\theta)
    ◮ Measure performance J_{1 \ldots n} of \pi(\theta + \Delta_{1 \ldots n}\theta)
    ◮ Solve regression, move \theta along gradient:
    \mathrm{gradient} = (\Delta\Theta^T \Delta\Theta)^{-1} \Delta\Theta^T \hat{J}
    where \Delta\Theta = [\Delta\theta_1, \ldots, \Delta\theta_n]^T and \hat{J} = [J_1 - J_0, \ldots, J_n - J_0]^T
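The finite-difference procedure on this slide can be sketched as follows. The perturbation count and scale are illustrative choices, and `np.linalg.lstsq` stands in for the explicit normal-equations solve on the slide:

```python
import numpy as np

def finite_difference_gradient(J, theta, n_perturb=20, scale=0.05, rng=None):
    """Estimate the gradient of scalar performance J at theta:
    perturb theta randomly, measure performance changes, and regress
    the changes on the perturbations (gradient = (dT d)^-1 dT J_hat)."""
    rng = np.random.default_rng(rng)
    J0 = J(theta)                                        # baseline rollout
    d_theta = scale * rng.standard_normal((n_perturb, theta.size))
    dJ = np.array([J(theta + d) - J0 for d in d_theta])  # perturbed rollouts
    # Least-squares solve is numerically equivalent to the slide's regression.
    g, *_ = np.linalg.lstsq(d_theta, dJ, rcond=None)
    return g
```

On a smooth performance function the regression recovers the local gradient up to sampling noise, which is why the slide frames it as "solve regression, move θ along gradient."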
44. Policy Gradient Reinforcement Learning: Likelihood Ratio
    Vary u:
    ◮ Measure performance J(\pi(\theta)) of \pi(\theta) with noise. . .
    ◮ Compute log-probability of generated trajectory \Pr(\tau \mid \theta)
    \mathrm{Gradient} = \sum_{k=0}^{H} \nabla_\theta \log \pi_\theta(u_k \mid x_k) \sum_{l=0}^{H} r_l
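The likelihood-ratio estimator on this slide averages, over trajectories, the product of the summed score terms and the summed rewards. A minimal sketch, assuming trajectories arrive as lists of (grad_log_pi, reward) pairs (a data layout of my choosing, not the presenter's):

```python
import numpy as np

def likelihood_ratio_gradient(trajectories):
    """REINFORCE-style gradient estimate: for each trajectory, multiply the
    sum of per-step score vectors grad_log_pi(u_k | x_k) by the trajectory's
    total reward, then average over trajectories."""
    grads = []
    for traj in trajectories:
        score_sum = np.sum([g for g, _ in traj], axis=0)  # sum_k grad log pi
        total_reward = sum(r for _, r in traj)            # sum_l r_l
        grads.append(score_sum * total_reward)
    return np.mean(grads, axis=0)
```

Unlike the finite-difference estimator, this one varies the actions u rather than the parameters θ, so a single batch of noisy rollouts suffices.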
45. Why is RL slow? "Curse of Dimensionality"
    ◮ Exploration
    ◮ Learning rate
    ◮ Domain representation
    ◮ Policy representation
    ◮ Over- and under-actuation
    ◮ Domain knowledge
46. Domain Knowledge
    Infinite space of policies to explore.
    ◮ RL is model-free. So what?
    ◮ Representation is bias.
    ◮ Bias search towards "good" solutions
    ◮ Learn all of physics. . . and apply it?
    ◮ Previous experience in this domain?
    ◮ Policy implemented by <programmer, agent>; "autonomous"?
    How would knowledge of this domain help?
47. Dimensionality Reduction
    Task learning as domain-knowledge acquisition:
    ◮ Experience with a domain
    ◮ Skill at completing some task
    ◮ Skill at completing some set of tasks?
    ◮ Taskspace manifold
48. Goals
    1. Apply PGRL to a new domain.
    2. Learn mapping from task manifold to policy manifold.
    3. Robot school?
49. 1: Learning to Locomote
    ◮ Sensors: force feedback on servos? Or not.
    ◮ Policy: u \in \mathbb{R}^8 controls servos, u_i = \mathcal{N}(\theta_i, \sigma)
    ◮ Reward: forward speed
    ◮ Domain knowledge: none
    Demo?
50. 1: Learning to Locomote
    [Plots: servo parameters θ (steer bow, steer stern, bow, port fwd, stbd fwd, port aft, stbd aft, stern) and 10-step forward speed v over 2500 s of learning.]
51. 2: Learning to Get to a Target
    ◮ Sensors: bearing \varphi to goal.
    ◮ Policy: u \in \mathbb{R}^8 controls servos
    ◮ Policy parameters: \theta \in \mathbb{R}^{16}
    \mu_i(x, \theta) = \theta_i \cdot s \quad (1)
    \theta_i \cdot s = [\, \theta_{i,0} \;\; \theta_{i,1} \,] \, [\, \varphi \;\; 1 \,]^T \quad (2)
    u_i = \mathcal{N}(\mu_i, \sigma) \quad (3)
    \nabla_{\theta_i} \log \pi(x, \theta) = \frac{1}{\sigma^2} (u_i - \theta_i \cdot s) \, s \quad (4)
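Equation (4), the score of the linear-Gaussian policy, is easy to check in code. The function name, σ value, and vector shapes below are assumptions for illustration:

```python
import numpy as np

def grad_log_pi(u_i, theta_i, s, sigma):
    """Score of a linear-Gaussian policy u_i ~ N(theta_i . s, sigma):
    (u_i - theta_i . s) * s / sigma^2, matching equation (4) on the slide."""
    return (u_i - theta_i @ s) * s / sigma ** 2
```

This is the per-step term that the likelihood-ratio gradient sums over a trajectory.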
52. 2: Task Space → Policy Space
    ◮ 16-DOF learning: FAIL!
    ◮ Try simpler task: learn to locomote with \theta \in \mathbb{R}^{16}
    ◮ Try bootstrapping:
      1. Learn to locomote with 8 DOF
      2. Add new sensing and control DOF
    ◮ CHEATING! Why?
    [Plot: time to complete task (seconds) vs. task number.]
53. Curriculum Development for Manifold Discovery?
    ◮ Étude in Locomotion
      ◦ Task-space manifold for locomotion: \theta \in \xi \cdot [\, 0\;0\;1\;{-1}\;1\;{-1}\;1\;1 \,]^T
      ◦ Stop exploring in task nullspace
      ◦ FAST!
    ◮ Étude in Steering
      ◦ Can task be completed on locomotion manifold?
      ◦ One possible approximate solution uses the bases [\, 0\;0\;1\;{-1}\;1\;{-1}\;1\;1 \,]^T and [\, 1\;{-1}\;0\;0\;0\;0\;0\;0 \,]^T
      ◦ Can second basis be learned?
54. 3: How to Teach a Robot? How to Teach an Animal?
    1. Reward basic skills
    2. Develop control along useful DOFs
    3. Make skill more complex
    4. A good solution NOW!
55. Conclusion: Exorcising the Curse of Dimensionality
    ◮ PGRL works for low-DOF problems.
    ◮ Task-space dimension < state-space dimension.
    ◮ Learn f: task-space manifold → policy-space manifold.