
Machine Learning and Robotics


  1. MACHINE LEARNING AND ROBOTICS
     Lisa Lyons, 10/22/08
  2. OUTLINE
     - Machine Learning Basics and Terminology
     - An Example: DARPA Grand/Urban Challenge
     - Multi-Agent Systems
     - Netflix Challenge (if time permits)
  3. INTRODUCTION
     - Machine learning is commonly associated with robotics
     - When some think of robots, they think of machines like WALL-E: human-looking, with feelings, capable of complex tasks
     - Goals for machine learning in robotics aren't usually this advanced, but some think we're getting there
     - The next three slides outline goals that motivate researchers to continue work in this area
  4. HOUSEHOLD ROBOT TO ASSIST THE HANDICAPPED
     - Could come preprogrammed with general procedures and behaviors
     - Needs to learn to recognize objects, obstacles, and maybe even its owner (face recognition?)
     - Also needs to manipulate objects without breaking them
     - May not always have complete information about its environment (poor lighting, obscured objects)
  5. FLEXIBLE MANUFACTURING ROBOT
     - A configurable robot that could manufacture multiple items
     - Must learn to manipulate new types of parts without damaging them
  6. LEARNING SPOKEN DIALOG SYSTEM FOR REPAIRS
     - Given some initial information about a system, a robot could converse with a human and help to repair it
     - Speech understanding is a very hard problem in itself
  7. MACHINE LEARNING BASICS AND TERMINOLOGY
     With applications and examples in robotics
  8. LEARNING ASSOCIATIONS
     - Association rule: the probability that an event will happen given that another event already has, P(Y|X)
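Such a rule can be estimated directly from co-occurrence counts. A minimal sketch, with a hypothetical sensor-event log (the event names and records are made up for illustration):

```python
# Estimate an association rule P(Y|X) from co-occurrence counts.
# Hypothetical log: each record is the set of events seen in one window.
records = [
    {"bump_detected", "wheel_slip"},
    {"bump_detected", "wheel_slip"},
    {"bump_detected"},
    {"wheel_slip"},
]

def conditional_probability(records, x, y):
    """P(Y|X) = count(X and Y together) / count(X)."""
    with_x = [r for r in records if x in r]
    if not with_x:
        return 0.0
    return sum(1 for r in with_x if y in r) / len(with_x)

p_slip_given_bump = conditional_probability(records, "bump_detected", "wheel_slip")
```

Here two of the three records containing "bump_detected" also contain "wheel_slip", so the estimated rule is P(Y|X) = 2/3.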
  9. CLASSIFICATION
     - Classification: a model in which an input is assigned to a class based on some data
     - Prediction: assuming a future scenario is similar to a past one, using past data to decide what the new scenario will look like
     - Pattern recognition: a method used to make predictions
       - Face recognition
       - Speech recognition
     - Knowledge extraction: learning a rule from data
     - Outlier detection: finding exceptions to the rules
  10. REGRESSION
     - Linear regression is an example
     - Both classification and regression are "supervised learning" strategies, where the goal is to find a mapping from input to output
     - Example: navigation of an autonomous car
       - Training data: actions of human drivers in various situations
       - Input: data from sensors (such as GPS or video)
       - Output: angle to turn the steering wheel
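The supervised mapping above can be illustrated with a one-variable linear regression; the training pairs here are hypothetical stand-ins for logged human driving (a single sensor feature mapped to a steering angle):

```python
# Fit steering angle as a linear function of one sensor feature using
# ordinary least squares. The data points are made up for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]   # e.g. lateral offset from lane center
ys = [0.1, 0.9, 2.1, 2.9, 4.1]   # steering angle chosen by the human

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

def predict(x):
    """Map a new sensor reading to a steering angle."""
    return slope * x + intercept
```

The fitted line (slope 1.0, intercept 0.02) then predicts an angle for any new sensor reading.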
  11. UNSUPERVISED LEARNING
     - Only the input is available
     - Want to find regularities in the input
     - Density estimation: finding patterns in the input space
       - Clustering: finding groupings in the input
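Clustering can be sketched with one-dimensional k-means: alternate between assigning points to their nearest center and re-estimating the centers. The data and starting centers are made up for illustration:

```python
# Toy clustering sketch: one-dimensional k-means with k = 2.
data = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centers = [0.0, 10.0]            # arbitrary starting centers

for _ in range(10):              # alternate assignment and re-estimation
    clusters = [[], []]
    for x in data:
        nearest = min(range(2), key=lambda i: abs(x - centers[i]))
        clusters[nearest].append(x)
    # move each center to the mean of its cluster (keep it if empty)
    centers = [sum(c) / len(c) if c else centers[i]
               for i, c in enumerate(clusters)]
```

The centers settle on the two groupings in the input (around 1.0 and 8.0) with no labels involved.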
  12. REINFORCEMENT LEARNING
     - Policy: generating correct actions to reach the goal
     - Learn from past good policies
     - Example: a robot navigating an unknown environment in search of a goal
       - Some data may be missing
       - There may be multiple agents in the system
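A minimal sketch of learning a policy from rewards, using tabular Q-learning on a made-up one-dimensional corridor (states 0 to 4, reward only at the goal; all constants are illustrative):

```python
import random

# Tabular Q-learning: states 0..4, actions step left (-1) or right (+1),
# reward 1.0 only on reaching the goal state 4.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.2
rng = random.Random(0)

for _ in range(500):                      # training episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: q[(s, b)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        # update toward reward plus discounted best next value
        q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in ACTIONS) - q[(s, a)])
        s = s2

# learned greedy policy for the non-goal states
policy = [max(ACTIONS, key=lambda b: q[(s, b)]) for s in range(N_STATES - 1)]
```

After training, the greedy policy steps right in every state, i.e. the agent has learned the actions that reach the goal.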
  13. POSSIBLE APPLICATIONS
     - Exploring a world
     - Learning object properties
     - Learning to interact with the world and with objects
     - Optimizing actions
     - Recognizing states in a world model
     - Monitoring actions to ensure correctness
     - Recognizing and repairing errors
     - Planning
     - Learning action rules
     - Deciding actions based on tasks
  14. WHAT WE EXPECT ROBOTS TO DO
     - React promptly and correctly to changes in the environment or internal state
     - Work in situations where information about the environment is imperfect or incomplete
     - Learn through experience and human guidance
     - Respond quickly to human interaction
     - Unfortunately, these are very high expectations that don't always correlate well with machine learning techniques
  15. DIFFERENCES BETWEEN OTHER TYPES OF MACHINE LEARNING AND ROBOTICS
     Other ML applications:
       - Planning can frequently be done offline
       - Actions are usually deterministic
       - No major time constraints
     Robotics:
       - Often requires simultaneous planning and execution (online)
       - Actions can be nondeterministic depending on data (or lack thereof)
       - Real-time response often required
  17. THE CHALLENGE
     - Run by the Defense Advanced Research Projects Agency (DARPA)
     - Goal: build a vehicle capable of traversing unrehearsed off-road terrain
     - Started in 2003
     - 142-mile course through the Mojave Desert
     - No one made it through more than 5% of the course in the 2004 race
     - In 2005, 195 teams registered, 23 teams raced, and 5 teams finished
  18. THE RULES
     - Must traverse a desert course up to 175 miles long in under 10 hours
     - Course kept secret until 2 hours before the race
     - Must follow speed limits for specific areas of the course to protect infrastructure and ecology
     - If a faster vehicle needs to overtake a slower one, the slower one is paused so that vehicles don't have to handle dynamic passing
     - Teams were given data on the course 2 hours before the race, so no global path planning was required
  20. A DARPA GRAND CHALLENGE VEHICLE THAT DID NOT CRASH
     - ...namely Stanley, the winner of the 2005 challenge
  21. TERRAIN MAPPING AND OBSTACLE DETECTION
     - Data from 5 laser scanners mounted on top of the car is used to generate a point cloud of what's in front of the car
     - A classification problem with three classes:
       - Drivable
       - Occupied
       - Unknown
     - The area in front of the vehicle is represented as a grid
     - Stanley's system finds the probability that Δh > δ, where Δh is the observed height of the terrain in a certain cell
     - If this probability exceeds a threshold α, the system labels the cell occupied
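The per-cell test can be sketched by treating the measured height difference as Gaussian with known sensor noise and thresholding the tail probability. The values of δ, α, and the noise level below are illustrative, not Stanley's actual parameters:

```python
import math

def p_height_exceeds(mean_dh, sigma, delta):
    """P(Δh > δ) for Δh ~ N(mean_dh, sigma^2), via the normal CDF."""
    z = (delta - mean_dh) / sigma
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

def classify_cell(mean_dh, sigma, delta=0.15, alpha=0.05):
    """Label one grid cell occupied when P(Δh > δ) exceeds α."""
    if p_height_exceeds(mean_dh, sigma, delta) > alpha:
        return "occupied"
    return "drivable"
```

A cell with a small measured height step (e.g. 2 cm with 3 cm noise) comes out drivable, while a 30 cm step comes out occupied.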
  22. (CONT.)
     - A discriminative learning algorithm is used to tune the parameters
     - Data is collected as a human driver drives through mapped terrain avoiding obstacles (supervised learning)
     - The algorithm uses coordinate ascent to determine δ and α
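The coordinate-ascent idea can be sketched by alternately optimizing δ and α on a grid, scoring each setting by agreement with hand-labelled cells. The labelled data, noise model, and grids below are made up for illustration:

```python
import math

SIGMA = 0.05  # assumed sensor noise for this sketch

def p_exceeds(mean_dh, delta):
    """P(Δh > δ) for Δh ~ N(mean_dh, SIGMA^2)."""
    z = (delta - mean_dh) / SIGMA
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

# (mean height difference in a cell, human label: True = occupied)
labelled = [(0.02, False), (0.05, False), (0.20, True), (0.35, True), (0.10, False)]

def accuracy(delta, alpha):
    """Fraction of cells classified the same way the human labelled them."""
    return sum((p_exceeds(dh, delta) > alpha) == occ
               for dh, occ in labelled) / len(labelled)

delta, alpha = 0.05, 0.5
for _ in range(5):  # alternate coordinate updates until they stabilize
    delta = max((d / 100 for d in range(1, 41)), key=lambda d: accuracy(d, alpha))
    alpha = max((a / 100 for a in range(1, 100)), key=lambda a: accuracy(delta, a))
```

Each pass holds one parameter fixed and picks the best value of the other, which is exactly the coordinate-ascent pattern; on this toy data it reaches perfect agreement with the labels.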
  23. COMPUTER VISION ASPECT
     - The lasers only make it safe for the car to drive at under 25 mph
     - The car needs to go faster to satisfy the time constraint
     - A color camera is used for long-range obstacle detection
     - Still the same classification problem
     - Now there are more factors to consider: lighting, material, dust on the lens
     - Stanley takes an adaptive approach
  24. VISION ALGORITHM
     - Remove the sky from the image
     - Map a quadrilateral onto the camera video corresponding to the laser sensor boundaries
     - As long as this region is deemed drivable, use the pixels in the quadrilateral as a training set for the concept of drivable surface
     - Maintain Gaussians that model the color of drivable terrain
     - Adapt by adjusting previous Gaussians and/or discarding them and adding new ones
       - Adjustment allows slow adaptation to lighting conditions
       - Replacement allows rapid response to a change in the color of the road
     - Label regions as drivable if their pixel values are near one or more of the Gaussians and they are connected to the laser quadrilateral
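The color model can be sketched with a single diagonal Gaussian where Stanley maintains a set of them; the pixel values and the distance threshold below are made up for illustration:

```python
# Fit one diagonal Gaussian to RGB pixels from the laser-verified
# "drivable" quadrilateral, then label new pixels by distance to it.
training_pixels = [(120, 100, 80), (125, 105, 82), (118, 98, 79)]

n = len(training_pixels)
mean = [sum(p[c] for p in training_pixels) / n for c in range(3)]
# per-channel variance, floored to avoid a degenerate zero variance
var = [max(sum((p[c] - mean[c]) ** 2 for p in training_pixels) / n, 1.0)
       for c in range(3)]

def is_drivable(pixel, max_sq_dist=9.0):
    """Drivable if the squared Mahalanobis distance to the Gaussian is small."""
    d2 = sum((pixel[c] - mean[c]) ** 2 / var[c] for c in range(3))
    return d2 < max_sq_dist
```

Pixels near the learned road color pass the test; a very different color (say, sky blue) does not. Refitting `mean` and `var` on new quadrilateral pixels corresponds to the slow-adjustment path, and replacing the Gaussian outright to the rapid-change path.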
  26. ROAD BOUNDARIES
     - The best way to avoid obstacles on a desert road is to find the road boundaries and drive down the middle
     - Uses low-pass one-dimensional Kalman filters to determine the road boundary on both sides of the vehicle
     - Small obstacles barely affect the boundary found
     - Large obstacles have a stronger effect over time
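The low-pass behavior can be sketched with a scalar Kalman filter on one boundary offset: with measurement variance large relative to process noise, a single outlier (a small obstacle) barely moves the estimate, while a persistent shift would pull it over time. The noise values are illustrative:

```python
def kalman_step(x, p, z, r=4.0, q=0.01):
    """One predict-and-update step for a scalar state.
    x: estimate, p: estimate variance, z: measurement,
    r: measurement variance, q: process noise."""
    p = p + q                  # predict: boundary assumed locally constant
    k = p / (p + r)            # Kalman gain
    x = x + k * (z - x)        # correct toward the measurement
    p = (1.0 - k) * p
    return x, p

x, p = 3.0, 1.0                # initial boundary offset estimate and variance
for z in [3.0, 3.1, 2.9, 3.0, 6.0, 3.0, 3.1]:  # 6.0 is a one-off outlier
    x, p = kalman_step(x, p, z)
```

After the whole sequence the estimate stays near 3, having shrugged off the single 6.0 measurement.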
  27. SLOPE AND RUGGEDNESS
     - If the terrain becomes too rugged or steep, the vehicle must slow down to maintain control
     - Slope is found from the vehicle's pitch estimate
     - Ruggedness is determined from the vehicle's z-axis accelerometer data, with gravity and vehicle vibration filtered out
  28. PATH PLANNING
     - No global planning necessary
     - The coordinate system used is the base trajectory plus a lateral offset
     - The base trajectory is a smoothed version of the driving corridor on the map given to contestants before the race
  29. PATH SMOOTHING
     - The base trajectory is computed in 4 steps:
       1. Points are added to the map in proportion to local curvature
       2. Least-squares optimization is used to adjust trajectories for smoothing
       3. Cubic spline interpolation is used to find a path that can be resampled efficiently
       4. The speed limit is calculated
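The least-squares smoothing idea in step 2 can be sketched by iteratively pulling each interior waypoint toward its original position and toward its neighbours, with the endpoints held fixed. The waypoints and weights below are made up for illustration:

```python
# A kinked corridor of (x, y) waypoints and a copy to be smoothed.
path = [[0.0, 0.0], [1.0, 0.0], [2.0, 2.0], [3.0, 2.0], [4.0, 4.0]]
smooth = [p[:] for p in path]
w_data, w_smooth = 0.1, 0.3        # data-fit weight, smoothness weight

def curvature_cost(pts):
    """Sum of squared discrete second differences (a kink measure)."""
    return sum((pts[i - 1][d] + pts[i + 1][d] - 2 * pts[i][d]) ** 2
               for i in range(1, len(pts) - 1) for d in range(2))

for _ in range(200):               # simple gradient-style relaxation
    for i in range(1, len(path) - 1):
        for d in range(2):
            smooth[i][d] += w_data * (path[i][d] - smooth[i][d]) \
                + w_smooth * (smooth[i - 1][d] + smooth[i + 1][d] - 2 * smooth[i][d])
```

The relaxed path has strictly lower curvature cost than the original corridor while its endpoints stay fixed; the real pipeline then fits a cubic spline through such a smoothed trajectory.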
  30. ONLINE PATH PLANNING
     - Determines the actual trajectory of the vehicle during the race
     - A search algorithm minimizes a linear combination of continuous cost functions
     - Subject to dynamic and kinematic constraints:
       - Maximum lateral acceleration
       - Maximum steering angle
       - Maximum steering rate
       - Maximum acceleration
     - Penalizes hitting obstacles, leaving the corridor, and leaving the center of the road
  33. RECURSIVE MODELING METHOD (RMM)
     - Agents model the belief states of other agents
     - Bayesian methods are used
     - Useful in homogeneous non-communicating multi-agent systems (MAS)
     - The recursion has to be cut off at some point (we don't want a situation where agent A thinks that agent B thinks that agent A thinks that...)
     - Agents can affect other agents by affecting the environment to produce a desired reaction
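The recursion cut-off can be sketched with a toy two-action game: an agent evaluates actions by modelling the other agent as running the same reasoning one level shallower, and at depth 0 it stops and falls back to assuming a uniformly random opponent. The payoff table is made up for illustration (real RMM maintains Bayesian beliefs rather than this symmetric simplification):

```python
# payoff[my_action][their_action], from the deciding agent's viewpoint
PAYOFF = {
    "stay": {"stay": 1.0, "move": 0.0},
    "move": {"stay": 0.0, "move": 2.0},
}

def best_action(depth):
    """Best action under nested modelling cut off at the given depth."""
    if depth == 0:
        return None                    # cut-off: no model of the other agent
    other = best_action(depth - 1)     # what we think the other agent does

    def value(action):
        if other is None:              # fall back to a uniform assumption
            return sum(PAYOFF[action].values()) / len(PAYOFF[action])
        return PAYOFF[action][other]

    return max(PAYOFF, key=value)
```

The fixed depth bound is what prevents the "A thinks that B thinks that A thinks that..." regress from recursing forever.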
  34. HETEROGENEOUS NON-COMMUNICATING MAS
     - Competitive and cooperative learning are possible
     - Competitive learning is more difficult because agents may end up in an "arms race"
     - Credit-assignment problem:
       - Can't tell whether an agent benefited because its actions were good or because its opponent's actions were bad
     - Experts and observers have proven useful
     - Different agents may be given different roles to reach the goal
       - Supervised learning can "teach" each agent how to do its part
  35. COMMUNICATION
     - Allowing agents to communicate can lead to deeper levels of planning, since agents know (or think they know) the beliefs of others
     - Could allow one agent to "train" another to follow its actions using reinforcement learning
     - Negotiations
     - Commitment
     - Autonomous robots could determine their position in an environment by querying other robots for their believed positions and making a guess based on that (Markov localization, SLAM)
  36. NETFLIX CHALLENGE
     (if time permits)
  37. REFERENCES
     - Alpaydin, E. Introduction to Machine Learning. Cambridge, Mass.: MIT Press, 2004.
     - Kreuziger, J. "Application of Machine Learning to Robotics: An Analysis." In Proceedings of the Second International Conference on Automation, Robotics, and Computer Vision (ICARCV '92), 1992.
     - Mitchell et al. "Machine Learning." Annual Review of Computer Science 4:417-433, 1990.
     - Stone, P. and Veloso, M. "Multiagent Systems: A Survey from a Machine Learning Perspective." Autonomous Robots 8:345-383, 2000.
     - Thrun et al. "Stanley: The Robot that Won the DARPA Grand Challenge." Journal of Field Robotics 23(9):661-692, 2006.