An Application of Reinforcement Learning to Autonomous Helicopter Flight
Pieter Abbeel, Adam Coates, Morgan Quigley and Andrew Y. Ng
Stanford

Apprenticeship Learning for the Dynamics Model

[Figure: Expert human pilot flight (a_1, s_1, a_2, s_2, a_3, s_3, ...) -> Learn P_sa. Autonomous flight (a_1, s_1, a_2, s_2, a_3, s_3, ...) -> Learn P_sa. Dynamics model P_sa and reward function R -> Reinforcement learning -> Control policy.]

Take-away message: In the apprenticeship learning setting, i.e., when we have an expert demonstration, we do not need explicit exploration to perform as well as the expert.

Theorem. Given a polynomial number of teacher demonstrations, the apprenticeship learning algorithm will return a policy that performs as well as the teacher within a polynomial number of iterations. [See Abbeel & Ng, 2005 for more details.]

Reinforcement Learning and Apprenticeship Learning

[Figure: Dynamics model P_sa and reward function R -> Reinforcement learning -> Control policy.]

[Figure: Data collection: aggressive exploration is dangerous. Have good model of dynamics? NO: "Explore". YES: "Exploit".]

Experimental Results

Maneuvers flown: Stationary Rolls, Stationary Flips, Tail-In Funnels, Nose-In Funnels. Video available.

ADDITIONAL REFERENCES (SPECIFIC TO AUTONOMOUS HELICOPTER FLIGHT)
[1] J. Bagnell and J. Schneider. Autonomous helicopter control using reinforcement learning policy search methods. In International Conference on Robotics and Automation. IEEE, 2001.
[2] V. Gavrilets, I. Martinos, B. Mettler, and E. Feron. Control logic for automated aerobatic flight of miniature helicopter. In AIAA Guidance, Navigation and Control Conference, 2002.
[3] M. La Civita. Integrated Modeling and Robust Control for Full-Envelope Flight of Robotic Helicopters. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, 2003.
[4] M. La Civita, G. Papageorgiou, W. C. Messner, and T. Kanade. Design and flight testing of a high-bandwidth H-infinity loop shaping controller for a robotic helicopter. Journal of Guidance, Control, and Dynamics, 29(2):485-494, March-April 2006.
[5] B. Mettler, M. Tischler, and T. Kanade. System identification of small-size unmanned helicopter dynamics. In American Helicopter Society, 55th Forum, 1999.
[6] J. M. Roberts, P. I. Corke, and G. Buskey. Low-cost flight control system for a small autonomous helicopter. In IEEE Int'l Conf. on Robotics and Automation, 2003.
[7] S. Saripalli, J. Montgomery, and G. Sukhatme. Visually-guided landing of an unmanned aerial vehicle, 2003.
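The apprenticeship learning loop outlined on the poster (learn P_sa from expert flight, run reinforcement learning in the learned model, fly the resulting policy without explicit exploration, and refit P_sa on the new flight data) can be illustrated on a toy problem. The following is a minimal Python sketch, not the authors' helicopter implementation. The scalar linear system, the quadratic cost, the jittery teacher gain, the grid search standing in for the reinforcement learning step, and all function names are assumptions made for illustration.

```python
import random

# Hypothetical "real system", hidden from the learner: s' = A*s + B*u.
A_TRUE, B_TRUE = 1.0, 0.5
HORIZON = 30


def fly(policy, s0=1.0):
    """Run a policy on the real system and log (s, u, s_next) transitions."""
    s, log = s0, []
    for _ in range(HORIZON):
        u = policy(s)
        s_next = A_TRUE * s + B_TRUE * u
        log.append((s, u, s_next))
        s = s_next
    return log


def cost(log):
    """Negative reward: penalize distance from the target state 0 and control effort."""
    return sum(s * s + 0.1 * u * u for s, u, _ in log)


def fit_model(data):
    """Least-squares estimate of (A, B) in s' = A*s + B*u (2x2 normal equations)."""
    sss = sum(s * s for s, u, n in data)
    ssu = sum(s * u for s, u, n in data)
    suu = sum(u * u for s, u, n in data)
    ssn = sum(s * n for s, u, n in data)
    sun = sum(u * n for s, u, n in data)
    det = sss * suu - ssu * ssu
    return (ssn * suu - sun * ssu) / det, (sun * sss - ssn * ssu) / det


def plan(a_hat, b_hat):
    """Stand-in for the RL step: pick the gain k (u = -k*s) with lowest cost in the model."""
    def model_cost(k):
        s, total = 1.0, 0.0
        for _ in range(HORIZON):
            u = -k * s
            total += s * s + 0.1 * u * u
            s = a_hat * s + b_hat * u
        return total
    return min((i * 0.05 for i in range(41)), key=model_cost)


def apprenticeship_learning(max_iters=10, tol=1e-6):
    rng = random.Random(0)
    # Hypothetical teacher: a reasonable but suboptimal pilot with small action jitter.
    teacher = lambda s: -0.8 * s + rng.uniform(-0.1, 0.1)
    data = fly(teacher)                     # expert demonstration
    teacher_cost = cost(data)
    for _ in range(max_iters):
        a_hat, b_hat = fit_model(data)      # learn the dynamics model
        k = plan(a_hat, b_hat)              # optimize the policy in the model
        flight = fly(lambda s: -k * s)      # exploit only, no explicit exploration
        if cost(flight) <= teacher_cost + tol:
            break                           # performs at least as well as the teacher
        data += flight                      # otherwise refit on the new flight data
    return a_hat, b_hat, k, cost(flight), teacher_cost
```

Calling apprenticeship_learning() returns the fitted model parameters, the chosen gain, and the costs of the learned policy and of the teacher. Because this toy system is noiseless and the model class matches it exactly, the fit from the demonstration alone is already accurate, so the loop is expected to stop after its first iteration. On a real system with model mismatch, the refit step on autonomous flight data is what drives the iterations.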
