
- 1. Learning the skill of archery by a humanoid robot iCub
  Petar Kormushev, Sylvain Calinon, Ryo Saegusa, Giorgio Metta
  Italian Institute of Technology (IIT), Advanced Robotics dept., RBCS dept., http://www.iit.it
  Humanoids 2010, Nashville, TN, USA, December 6-8, 2010
- 2. Motivation
  How can a robot learn complex motor skills? Why the archery task?
  - bi-manual coordination
  - integration of image processing, motor control, and learning in one coherent task
  - using tools (bow and arrow) to affect an external object (the target)
  - an appropriate task for testing different learning algorithms, because the reward is inherently defined by the goal of the task
- 3. The archery task
  Different societies and different embodiments lead to differences in the learned skill:
  - Zashiki karakuri, 18th-19th century (mechanical automatons)
  - Kyudo (Japanese archery)
- 4. iCub archery skill
  - iCub is an open-source humanoid robot with dimensions comparable to a 3.5-year-old child: 104 cm tall, with 53 DOF
  - Static grasp of the bow
  - Aiming skill
- 5. Problem definition
  How to learn to shoot the arrow so that it hits the center of the target:
  - aim at the target
  - recognize the arrow's position w.r.t. the target
  Assumptions:
  - prior knowledge about how to hold the bow and release the arrow
  - prior knowledge about the colors of the target and the arrow
- 6. Proposed approach
  - For learning bi-manual aiming: PoWER (EM-based reinforcement learning) and ARCHER (chained vector regression)
  - For hand position/orientation control: IK motion controller for the two arms
  - For image recognition of the target and arrow: color-based detection based on GMM
- 7. Learning algorithm #1: PoWER
  Policy learning by Weighting Exploration with the Returns (Jens Kober and Jan Peters, NIPS 2009)
  Reasons to select PoWER:
  - state-of-the-art EM-based RL algorithm
  - no need for a learning rate (unlike policy-gradient methods)
  - efficient use of past experience via importance sampling
  - a single rollout is enough to update the policy
- 8. PoWER: implementation
  - Policy parameters: the relative position of the two hands (a 3D vector from the right hand to the left hand)
  - Policy update rule: importance sampling over the best σ rollouts so far, with relative exploration
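A minimal sketch of a PoWER-style importance-sampling update for the 3D hand-offset parameters; the function and parameter names (`power_update`, `sigma_best`, `explore_std`) and the Gaussian exploration model are illustrative assumptions, not the authors' exact implementation:

```python
import numpy as np

def power_update(theta, rollouts, sigma_best=3, explore_std=0.05, rng=None):
    """One PoWER-style update of the 3D hand-offset parameters.

    rollouts: list of (params, return) pairs collected so far.
    Reweights the exploration of the sigma_best rollouts with the
    highest returns (importance sampling), then adds fresh Gaussian
    exploration for the next trial.
    """
    rng = np.random.default_rng() if rng is None else rng
    best = sorted(rollouts, key=lambda r: r[1], reverse=True)[:sigma_best]
    returns = np.array([R for _, R in best])
    params = np.array([p for p, _ in best])
    # Return-weighted average of the exploration offsets (params_i - theta).
    weights = returns / returns.sum()
    theta_new = theta + weights @ (params - theta)
    # Fresh exploration around the updated mean for the next rollout.
    return theta_new + rng.normal(0.0, explore_std, size=theta.shape)
```

With `explore_std=0` the update reduces to the return-weighted mean of the best rollouts, which makes the deterministic part easy to check in isolation.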
- 9. PoWER: reward function
  The return of an arrow-shooting rollout is computed from the estimated target center position and the estimated arrow tip position.
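The slide does not give the closed form of the return; one plausible shape, sketched under the assumption that it simply decays with the distance between the two estimated positions (the exponential shaping is an assumption):

```python
import numpy as np

def rollout_return(target_center, arrow_tip):
    """Scalar return that peaks at 1 when the arrow hits the center
    and decays with the distance between the two estimated positions."""
    d = np.linalg.norm(np.asarray(target_center, float) - np.asarray(arrow_tip, float))
    return float(np.exp(-d))
```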
- 10. Learning algorithm #2: ARCHER
  Augmented Reward CHainEd Regression:
  - multi-dimensional reward vector
  - iteratively converging process
  - uses regression to estimate new parameters
  ARCHER can be viewed as a linear vector regression with a shrinking support region.
- 11. Learning algorithm #2: ARCHER
  Given the rollouts so far, with their input parameters and observed result (reward) vectors, the target reward is expressed in matrix form as a combination of the observed rewards, and a least-norm approximation of the weights is computed.
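The regression step above can be sketched as follows; this reads the slide's fragments as: express the target reward vector as a weighted combination of the observed reward vectors, take the least-norm weights, and chain the same weights onto the rollout parameters. The function name and the exact chaining are assumptions:

```python
import numpy as np

def archer_update(params, rewards, target_reward):
    """One ARCHER-style regression step.

    params:        (N, d) matrix of rollout input parameters.
    rewards:       (N, m) matrix of observed multi-dimensional rewards.
    target_reward: (m,) desired reward (e.g. zero aiming error).

    Solves rewards.T @ w = target_reward for the least-norm weights w,
    then chains them onto the parameters to propose the next rollout.
    """
    params = np.asarray(params, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    # lstsq returns the minimum-norm solution for underdetermined systems.
    w, *_ = np.linalg.lstsq(rewards.T, np.asarray(target_reward, float), rcond=None)
    return w @ params
```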
- 12. Learning algorithm #2: ARCHER
  ARCHER is suitable for problems in which:
  - a-priori knowledge about the desired goal reward is available
  - the reward can be decomposed into separate components
  - the task has a smooth solution space
  It makes use of a multi-dimensional reward, unlike standard RL, which uses only a scalar reward.
- 13. Simulation experiment
  Convergence criterion: distance to the center < 5 cm
  - PoWER: 19 rollouts to converge
  - ARCHER: 5 rollouts to converge
- 14. Speed of convergence
  Averaged over 40 runs with 60 rollouts in each run (the first 3 rollouts use high random exploration):
  - ARCHER converges faster than PoWER because it uses the 2D reward to estimate parameters and exploits prior knowledge about the goal's reward
  - PoWER achieves reasonable performance despite using only 1D feedback information
- 15. Image recognition
  - automatic detection of the target and the arrow
  - YUV color space (Y: luminance; UV: chrominance)
  - GMM for color-based detection
  The estimated reward vector is computed from the detected positions.
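A minimal numpy sketch of color-based detection in this spirit, using a single Gaussian over the UV chrominance channels as a one-component stand-in for the GMM (the function names and the Mahalanobis threshold are assumptions):

```python
import numpy as np

def fit_color_model(uv_samples):
    """Fit a single Gaussian to UV chrominance samples of the target
    color (a one-component stand-in for the GMM)."""
    uv = np.asarray(uv_samples, dtype=float)
    mean = uv.mean(axis=0)
    cov = np.cov(uv, rowvar=False) + 1e-6 * np.eye(2)  # regularize
    return mean, np.linalg.inv(cov)

def color_mask(uv_pixels, mean, cov_inv, max_sq_dist=9.0):
    """Flag pixels whose squared Mahalanobis distance to the color
    model is below a threshold (about 3 standard deviations)."""
    diff = np.asarray(uv_pixels, dtype=float) - mean
    sq_dist = np.einsum('...i,ij,...j->...', diff, cov_inv, diff)
    return sq_dist < max_sq_dist
```

The resulting mask can then be reduced to blob centroids to estimate the target center and arrow tip in the image.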
- 16. Robot motion controller (Pattacini et al., IROS 2010)
  - minimum-jerk IK Cartesian controller
  - hand orientation control
  - posture and grasping configuration
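The minimum-jerk profile underlying such Cartesian controllers is the standard fifth-order polynomial with zero boundary velocity and acceleration; a sketch of the profile only, not of the controller of Pattacini et al.:

```python
import numpy as np

def min_jerk(x0, xf, t, T):
    """Minimum-jerk point-to-point profile over duration T:
    x(t) = x0 + (xf - x0) * (10*tau^3 - 15*tau^4 + 6*tau^5),
    with zero velocity and acceleration at both endpoints."""
    tau = np.clip(t / T, 0.0, 1.0)
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return x0 + (np.asarray(xf) - np.asarray(x0)) * s
```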
- 17. Real-world experiment
- 18. Real-world performance
  Distance between robot and target: 220 cm; height of the robot: 104 cm; diameter of target: 50 cm.
  ARCHER converges in fewer than 10 rollouts. Convergence is slightly slower than in simulation because of:
  - noise in the image position estimation
  - position/orientation error of the controller
  - nonlinearities of the task
- 19. Conclusion
  - iCub learned to aim and hit the center of the target
  - Two learning algorithms were used to coordinate the posture of the hands: PoWER (EM-based reinforcement learning) and ARCHER (local vector regression with a shrinking support region)
  - The reward was extracted autonomously from visual feedback via color-based image processing using GMM
  - ARCHER converges faster than PoWER due to its multi-dimensional reward, known target reward, and regression-based parameter estimation
  - Future work: use imitation learning to teach the robot the whole movement of grasping and pulling the arrow
- 20. Thank you for your kind attention!
  Petar Kormushev, Italian Institute of Technology (IIT)
  More information: http://kormushev.com/
