
Autonomous Vehicles: the Intersection of Robotics and Artificial Intelligence

Autonomous Vehicle Webinar. Crash course in AVs: high-level overview, technology deep-dives, and trends. Follow me on Twitter at

Link to YouTube Video:
Google Slides:


  1. 1. Autonomous Vehicles Webinar: The intersection of robotics and artificial intelligence. Streaming live via Hangouts, 8pm CT, August 28th, 2016. Undergraduate student at University of Illinois at Urbana-Champaign, Class of 2017. B.S. Mechanical Engineering, Minor in Electrical Engineering. Previous: PwC, Cummins, UIUC RA
  2. 2. Overview: I. What is an AV? II. Technology: A. AI + Robotics = AVs; B. “Self-Driving Stack” (1. Sensing, 2. Processing, 3. Actuation). III. Up Next
  3. 3. What is an autonomous vehicle (AV)? Autonomous vehicles (AVs) are vehicles that are capable of movement with limited or no outside instruction or intervention. Within the context of this discussion, we are focusing on roadway motor vehicles. An AV at its simplest would be a car with cruise-control capability; at its most complex, it is an entirely driverless vehicle. Much like everything else in tech, there is a lot of contention over how the classification should be structured: what counts as ‘full autonomy’, and so on. Thankfully, the U.S. Dept. of Transportation developed an official tiering with very clear distinctions.
  4. 4. Autonomy, per the U.S. Dept. of Transportation. Tier 1: Automation at this level involves one or more specific control functions. Examples include electronic stability control or pre-charged brakes, where the vehicle automatically assists with braking to enable the driver to regain control of the vehicle or stop faster than possible by acting alone. Tier 2: This level involves automation of at least two primary control functions designed to work in unison to relieve the driver of control of those functions. An example of combined functions enabling a Level 2 system is adaptive cruise control in combination with lane centering. Tier 3: Vehicles at this level of automation enable the driver to cede full control of all safety-critical functions under certain traffic or environmental conditions and in those conditions to rely heavily on the vehicle to monitor for changes in those conditions requiring transition back to driver control. The driver is expected to be available for occasional control, but with sufficiently comfortable transition time. Tier 4: The vehicle is designed to perform all safety-critical driving functions and monitor roadway conditions entirely. The driver could provide destination input and is not expected to be available for control at any time during the trip. This includes unoccupied vehicles. SOURCE: t+of+Transportation+Releases+Policy+on+Automated+Vehicle+Development
  5. 5. AI + robotics = AVs
  6. 6. The intersection of artificial intelligence and robotics. AI: An intelligent system that is capable of taking information/data and acting upon that data, and capable of learning how to draw further insight ● Modern machine learning and AI techniques are capable of this for specific tasks (AlphaGo, image classification) ● Similar techniques, especially deep learning, could be applied to vehicles to teach them to drive, given high volumes of data. Robotics: Study of the design and control of mechanical systems. In a closed loop, these systems are capable of controlling themselves using sensory information ● Robotics is a well-understood field of study with decades of research and progress ● Has been applied to planes, cars, etc., but in an extremely limited fashion ● Autonomy cannot be “hard-coded”; it must be “learned”
  7. 7. The intersection of artificial intelligence and robotics: where the magic happens. Autonomous vehicles have always been a scientific dream. Planes have been capable of auto-pilot, “self-flying” features for decades. Why is it taking so long to happen for cars? Existing infrastructure and roads cannot support rule-based robotic systems. There are too many possible scenarios that could occur when driving, so rules for robotic vehicles cannot be “hard-coded”. True autonomy requires artificial intelligence: intelligence that resembles the human capability to decipher 3D space changing in time. With decades of advances in machine learning and artificial intelligence, we are nearing a time when machines are better at understanding roads than we are.
  8. 8. Technology Deep-Dive
  9. 9. There is a lot going on under the hood; let’s try to simplify it. (Figure labels: Pose Graph, LIDAR, Graph SLAM)
  10. 10. The “Self-Driving Stack”: the architecture of autonomy. 1. Sensing 2. Processing 3. Actuation
  11. 11. Autonomous Vehicle Architecture. 1. Sensing: video camera (still-image processing, pixels); LIDAR (light-radar, point clouds); specific sensors (e.g. red light detection, pedestrian detection). 2. Processing: sensor data is passed on to algorithms and is processed locally (GPUs) or over a distributed network (the Cloud). 3. Actuation: commands are sent to the control unit, which tells the engine/motor to speed up or slow down. An analogous process occurs for vehicle steering.
  12. 12. Autonomous Vehicle Architecture (diagram labels: 1 Sensing, 2 Processing/Computation, 3 Electromechanical Actuation)
  13. 13. Technology Deep-Dive: Sensing
  14. 14. LIDAR, video cameras, and radar/sonic sensors are most commonly used for gathering vehicle environment data. LIDAR (light-radar, point clouds): ● “Light radar” - LIDAR ● Generates point clouds that are 3D representations of the driving environment ● Seen as the high-resolution input data that is integral to SLAM + RRT techniques. Video camera (still-image processing, pixels): ● Simple video cameras input feeds of still images that can be processed for lanes, obstacles, pedestrians, etc. ● Cheap and effective, now being heavily adopted as the data source of choice for deep learning. Specific sensors (e.g. red light detection, stop signs): ● Case-specific sensors are heavily leveraged to provide insight in areas that LIDAR and cameras cannot handle in a general way ● Ex) a specific camera pointed at where stoplights are, feeding directly into a specific algorithm for sensing red, yellow, and green colors
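The stoplight example above (a dedicated camera feeding an algorithm that senses red, yellow, and green) can be sketched roughly as follows. This is a minimal illustration, not the method used by any particular vehicle: the function name `classify_light`, the hue ranges, and the brightness/saturation thresholds are all assumptions chosen for the example, and the input is assumed to be a list of RGB pixels already cropped to the stoplight region.

```python
import colorsys

def classify_light(pixels, min_saturation=0.5, min_value=0.5):
    """Guess the lit color from (r, g, b) pixels in the 0-255 range.

    Hue ranges and thresholds are illustrative assumptions, not calibrated values.
    Returns "red", "yellow", "green", or None if no bright colored pixel is found.
    """
    votes = {"red": 0, "yellow": 0, "green": 0}
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if s < min_saturation or v < min_value:
            continue  # ignore dark housing and washed-out background pixels
        hue_deg = h * 360.0
        if hue_deg < 20.0 or hue_deg >= 340.0:
            votes["red"] += 1
        elif 40.0 <= hue_deg < 75.0:
            votes["yellow"] += 1
        elif 90.0 <= hue_deg < 160.0:
            votes["green"] += 1
    best = max(votes, key=votes.get)
    return best if votes[best] > 0 else None

if __name__ == "__main__":
    housing = [(20, 20, 20)] * 10          # dark pixels from the light housing
    red_lamp = [(255, 0, 0)] * 5           # bright red pixels
    print(classify_light(housing + red_lamp))
```

A real system would also have to find the stoplight in the frame and cope with glare and occlusion; the sketch covers only the final color vote.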
  15. 15. A deep-dive on LIDAR ● LIDAR has quickly become a go-to sensor for autonomous applications. Velodyne is an industry leader with relatively cheap, easy-to-calibrate units ● LIDAR units send out pulses of light and measure the time to return, which can be used to compute the distance of an object ● A rotating LIDAR sensor gathering distances of objects at different angles can gather enough points of data to construct a “point cloud” ● Point clouds are useful because, much like the human eye, they give a 3D representation of space in real time
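The time-of-flight idea above can be written down in a few lines: the pulse travels to the object and back, so the distance is half the round-trip time multiplied by the speed of light, and each (angle, distance) pair becomes one 3D point. The sketch below is illustrative only; the function names and the (azimuth, elevation, round-trip time) input format are assumptions, not the output format of any real LIDAR unit.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def time_of_flight_to_distance(round_trip_s):
    """Distance in meters; the pulse covers the range twice (out and back)."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0

def scan_to_points(returns):
    """Convert (azimuth_deg, elevation_deg, round_trip_s) returns to (x, y, z) points."""
    points = []
    for azimuth_deg, elevation_deg, round_trip_s in returns:
        r = time_of_flight_to_distance(round_trip_s)
        az = math.radians(azimuth_deg)
        el = math.radians(elevation_deg)
        # Standard spherical-to-Cartesian conversion with z pointing up.
        x = r * math.cos(el) * math.cos(az)
        y = r * math.cos(el) * math.sin(az)
        z = r * math.sin(el)
        points.append((x, y, z))
    return points

if __name__ == "__main__":
    # A 200 ns round trip corresponds to an object roughly 30 m away.
    print(time_of_flight_to_distance(200e-9))
    print(scan_to_points([(0.0, 0.0, 200e-9), (90.0, 0.0, 200e-9)]))
```

Sweeping the azimuth through a full rotation at several elevations yields the point cloud described above.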
  16. 16. A new, cheaper, solid-state LIDAR emerging. Researchers at MIT in collaboration with DARPA have been able to fabricate and implement a solid-state LIDAR chip: “Our lidar chips promise to be orders of magnitude smaller, lighter, and cheaper than lidar systems available on the market today. They also have the potential to be much more robust because of the lack of moving parts, with a non-mechanical beam steering 1,000 times faster than what is currently achieved in mechanical lidar systems.” “At the moment, our on-chip lidar system can detect objects at ranges of up to 2 meters, though we hope to achieve a 10-meter range within a year. The minimum range is around 5 centimeters. We have demonstrated centimeter longitudinal resolution and expect 3-cm lateral resolution at 2 meters. There is a clear development path towards lidar on a chip technology that can reach 100 meters, with the possibility of going even farther.” A massive size and price reduction of LIDAR sensors could fundamentally change the approach to autonomous vehicles, drones, prosthetics, etc. SOURCE: “MIT and DARPA pack LIDAR sensor onto single chip”, IEEE Spectrum, Aug 4 2016, talk/semiconductors/optoelectronics/mit-lidar-on-a-chip
  17. 17. The sensing stage needs to gather lots of data from different sources in order to fully understand the environment: video camera (still-image processing, pixels), LIDAR (light-radar, point clouds), and specific sensors (e.g. red light detection, stop signs)
  18. 18. Technology Deep-Dive: Processing
  19. 19. The Processing Stack. Input Data: LIDAR point cloud data; video camera feed. Computational Methods: Motion Planning / Mapping (RRT*, SLAM, Kinematics); Machine Learning / Deep Learning (End-to-End, DNN, CNN); Rule-based systems (Intersections, Left-turn). Computational Muscle: Local ● CPUs, GPUs, SoCs on board ● Large amounts of flash memory; Distributed ● “Cloud” compute ● Powerful endpoints, limited only by speed of data communication. Output Commands
  20. 20. Computational Methods: Motion Planning; Artificial Intelligence (ML/Deep Learning)
  21. 21. Motion Planning - Algorithm 1: SLAM. Simultaneous localization and mapping (SLAM). What is the world around me? (mapping) ● Sense from various positions ● Integrate measurements to produce a map. Where am I in the world? (localization) ● Sense ● Relate sensor reading to a world model (a priori maps) ● Compute (probabilistic) location relative to model. **Above points taken from the CMU paper cited below. Depicted to the right is a Kalman Filter being applied to position measurements and sensory information, which in turn generates a Gaussian distribution of the possible positions. SOURCE: Kalman-Mapping_howie.pdf
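To make the Kalman Filter remark above concrete, here is a minimal one-dimensional sketch of how a Gaussian position estimate is updated. It covers only the localization half of SLAM (no map building), and it is not taken from the cited CMU material: the function names and all numeric values are assumptions for illustration.

```python
def kalman_predict(mean, var, motion, motion_var):
    """Motion step: shift the estimate and grow its uncertainty."""
    return mean + motion, var + motion_var

def kalman_update(mean, var, measurement, measurement_var):
    """Measurement step: blend prediction and sensor reading by their variances."""
    gain = var / (var + measurement_var)        # Kalman gain, between 0 and 1
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var                # uncertainty always shrinks here
    return new_mean, new_var

if __name__ == "__main__":
    # Hypothetical numbers: start at 0 m with variance 1, then alternate
    # sensor readings and 1 m forward moves.
    mean, var = 0.0, 1.0
    for measurement in (2.0, 3.1, 3.9):
        mean, var = kalman_update(mean, var, measurement, measurement_var=1.0)
        mean, var = kalman_predict(mean, var, motion=1.0, motion_var=0.25)
        print(f"position ~ N(mean={mean:.3f}, var={var:.3f})")
```

Each loop iteration produces a Gaussian over the possible positions, which is the kind of distribution the slide describes.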
  22. 22. Motion Planning - Algorithm 1: SLAM. SLAM Walkthrough (slides 22 to 29: figure sequence with panels labeled 1 to 7 showing the robot and landmarks, ending with the location likelihood distribution). SOURCE: astronautics/16-412j-cognitive-robotics-spring- 2005/projects/1aslam_blas_repo.pdf
  30. 30. Motion Planning - Algorithm 2: RRTs. Once a probabilistic localization is realized, a probabilistic path can be generated using RRTs ● Rapidly-exploring Random Trees (RRTs) are a set of exploratory algorithms that are useful for trajectory planning ● With a set of polygonal obstacles, an RRT can generate a possible path from the starting configuration to the ending (goal) configuration ● Sample paths are then input to a controller/model representation of the vehicle dynamics and the predicted trajectory of the vehicle is computed (x) ● The runtime of these algorithms can vary since accuracy is based on samples taken. SOURCE:
  31. 31. Motion Planning - SLAM + RRTs = advanced guesswork. A probabilistic path generated from probabilistic input poses issues for vehicles moving at high speeds ● In order to obtain a higher-resolution probabilistic model of the ideal trajectory, more samples need to be taken and more computations performed, hence the need for massive compute power! ● It is understandable that a car driving at 60 mph would have issues performing this depth of computation in a rapidly changing environment. For a more in-depth understanding of how algorithmic robotics motion planning works, check out SLAM for Dummies. **White spots represent sampled points used to generate the RRT. SOURCE: 4_submission_9.pdf
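The RRT idea from the two slides above can be sketched in a small 2D example. This is a simplified illustration under stated assumptions, not the planner from the cited sources: obstacles are circles rather than polygons, the vehicle is a point with no dynamics model, and the names `rrt` and `collides`, the 10% goal bias, and all default parameters are hypothetical choices.

```python
import math
import random

def collides(p, q, obstacles, steps=10):
    """Check points sampled along segment p->q against circular obstacles (cx, cy, radius)."""
    for i in range(steps + 1):
        t = i / steps
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        if any(math.hypot(x - cx, y - cy) <= radius for cx, cy, radius in obstacles):
            return True
    return False

def rrt(start, goal, obstacles, bounds=(0.0, 10.0), step=0.5,
        goal_tolerance=0.5, max_samples=5000, seed=0):
    """Grow a tree from start by random sampling; return a path to goal or None."""
    rng = random.Random(seed)
    parents = {start: None}                      # tree stored as child -> parent
    for _ in range(max_samples):
        # Occasionally sample the goal itself to pull the tree toward it.
        if rng.random() < 0.1:
            sample = goal
        else:
            sample = (rng.uniform(*bounds), rng.uniform(*bounds))
        nearest = min(parents, key=lambda node: math.dist(node, sample))
        d = math.dist(nearest, sample)
        if d == 0.0:
            continue
        scale = min(step, d) / d                 # move at most `step` toward the sample
        new = (nearest[0] + scale * (sample[0] - nearest[0]),
               nearest[1] + scale * (sample[1] - nearest[1]))
        if new in parents or collides(nearest, new, obstacles):
            continue
        parents[new] = nearest
        if math.dist(new, goal) <= goal_tolerance:
            path = [new]                         # walk back up the tree to the start
            while parents[path[-1]] is not None:
                path.append(parents[path[-1]])
            return path[::-1]
    return None

if __name__ == "__main__":
    route = rrt((1.0, 1.0), (9.0, 9.0), obstacles=[(5.0, 5.0, 1.5)])
    print(len(route) if route else "no path found")
```

More samples generally give a better-resolved tree, which is the compute cost the slide points to; the linear nearest-neighbor search here is the simplest option, not the fastest.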
  32. 32. Artificial Intelligence (ML/Deep Learning): Artificial Intelligence Methods. These deep learning methodologies are injecting intelligence into vehicles, feeding them massive amounts of data, and letting them learn ● Newly emerging methodologies all revolve around deep learning via neural nets ○ RNNs, CNNs, GANs, autoencoding, etc. ● Two main forces are driving adoption of these methods: ○ Cheaper and more powerful local and cloud computing (GPUs) ○ Open-source deep learning platforms (TensorFlow). Please check out this Deep Learning Playground for a better visualization of the concept. Figure captions: Feature extraction performed by a CNN on video from a forward-facing camera; the model was able to determine what were road edges with relative accuracy (via NVIDIA). Lane centering generator that predicts path of vehicles based on video input from front-facing camera (via
  33. 33. Artificial Intelligence (ML/Deep Learning) Processing
  34. 34. Important Academic Papers Regarding Deep Learning. These papers indicate deep learning is a highly promising solution for AVs ● NVIDIA - “End to End Learning for Self-Driving Cars”: video input from a forward-facing camera is trained against steering wheel position, and deep learning networks are capable of detecting important road features with limited additional nudging in the right direction ● - “Learning A Driving Simulator”: using video input with no additional training metadata (IMU, wheel angle), auto-encoded video was generated, predicting many frames into the future while maintaining road features ● Radford et al. (indico Research & Facebook AI) - “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”: seminal work on deep generative models that enabled breakthroughs and similar work, e.g. “Autoencoding Blade Runner” ● NYU & Facebook AI - “Deep Multi-Scale Video Prediction Beyond Mean Square Error”
  35. 35. Computational Muscle: CPUs, GPUs, SoCs (Onboard); Distributed Computing (Cloud)
  36. 36. Computational muscle limited to local compute, for now. Two paradigms currently: local compute (CPUs, SoCs, GPUs) and distributed computation over a network (Cloud) ● Current self-driving solutions are all implemented with local compute due to the need for simplicity, focusing on software first ● Utilizing GPUs and special SoCs to perform simple operations (e.g. with pixels and point clouds) at massive scale in parallel ● New TPUs (tensor processing units) are being designed specifically for the purpose of machine learning and AI, and new platforms are emerging specifically for AVs ● A distributed network offering massive computational muscle would be ideal, but does not offer immediate simplicity due to latency, security, reliability, ... ● Movement toward an “AWS for AVs” is a huge opportunity that many companies are actively working on. Figure caption: Google’s new TPU that powered AlphaGo
  37. 37. Technology Deep-Dive: Actuation
  38. 38. The actuation stage is primarily based on the field of controls and electromechanical systems. The processing stage sends commands via a bus like CAN or similar architectures to engine control units/modules ● The control unit is circuit hardware that manages electromechanical systems within a car ● A large amount of low-level controls have been standardized into protocols like CAN ● Most well-studied and understood portion of the self-driving technology stack, high feasibility relative to other parts of the “stack” ● Companies like Delphi and Bosch are large players in this space and have invested decades of time and research into vehicle controls ● Innovation in this space is much more iterative, positioning incumbents to dominate the controls hardware/software for AVs
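As a rough illustration of what "sending a command over a bus like CAN" involves, the sketch below packs a throttle and steering command into a short payload next to an 11-bit identifier, which is the general shape of a classic CAN frame (identifier plus up to 8 data bytes). Everything specific here is hypothetical: the identifier `0x120`, the 0.1-unit scaling, the byte layout, and the function names are assumptions for the example and do not correspond to any real vehicle's message set.

```python
import struct

COMMAND_FRAME_ID = 0x120  # hypothetical 11-bit identifier, not from any real vehicle

def encode_drive_command(throttle_pct, steering_deg):
    """Pack a command as (frame_id, payload): uint16 throttle, int16 steering, big-endian."""
    if not 0.0 <= throttle_pct <= 100.0:
        raise ValueError("throttle_pct must be between 0 and 100")
    throttle_raw = round(throttle_pct * 10)   # units of 0.1 percent
    steering_raw = round(steering_deg * 10)   # units of 0.1 degree, signed
    payload = struct.pack(">Hh", throttle_raw, steering_raw)
    return COMMAND_FRAME_ID, payload

def decode_drive_command(payload):
    """Inverse of encode_drive_command; returns (throttle_pct, steering_deg)."""
    throttle_raw, steering_raw = struct.unpack(">Hh", payload)
    return throttle_raw / 10.0, steering_raw / 10.0

if __name__ == "__main__":
    frame_id, payload = encode_drive_command(25.0, -3.5)
    print(hex(frame_id), payload.hex())
    print(decode_drive_command(payload))
```

A receiving control unit would decode the payload the same way and drive the actuators; transmitting the frame on a physical bus needs a CAN interface and driver, which this sketch leaves out.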
  39. 39. Up Next
  40. 40. High-level trends, “Self-Driving Stack” trends, general comments. Sensing: ● Cost of sensors is falling through the floor ● No “best sensor” yet, converging toward LIDAR and video camera, dependent on processing approaches ● Accuracy limits, distance limits, and latency of data feed (LIDAR especially) are improving exponentially with cost. Processing: ● Models vs. Neural vs. Mixed, no “best practice” ● Only local-compute implementations so far; will transition toward “Cloud” the same way software did ● Mapping is important, but the AI vector bank is the new data network effect ● V2V, V2I communication cannot be relied upon. Actuation / Controls: ● Actuation / controls is out in front of the rest of the tech, not a limiting factor ● Mission-critical safety and reliability need to be investigated more heavily, beyond “Six Sigma” ● Incumbents well positioned ● Security has not been investigated thoroughly, will emerge as a large space later on
  41. 41. My Thoughts. 1. Data network effects for AI systems are the single most important factor for long-term success. Advantage Uber and Tesla. 2. LIDAR and GPU companies will become important OEMs and provide it as a service to Big Auto; the only non-commodity hardware that matters to enable “AV”. 3. The inherently difficult problems are software related; Big Auto is not positioned to “win” at software. Defer to startups with ex-researchers.
  42. 42. Companies to pay attention to - Otto (recently acquired by Uber for ~$600M) - Zoox - $200M fundraise without even a landing page, talk about stealthy! Team consists of “fathers of AVs” - - Attempting to offer autonomy enablement to vehicle mfgs. - - Software for AVs, not much info, rockstar team with very deep background - Peloton Tech - More immediate use case for semi-autonomy with platooning. Strategic investors; UPS venture arm is a positive signal. - NuTonomy - Released functioning product in Singapore, great team
  43. 43. Thank You