
Autonomous Vehicle Safety: Technical and Social Issues (WAISE 2018 Keynote)


Keynote:
Autonomous Vehicle Safety: Technical and Social Issues

More than three decades of ground vehicle autonomy development have led to the current high-profile deployment of autonomous vehicle technology on public roads. Ensuring the safety of these vehicles requires solving a number of challenging technical, social, and political problems. Significant progress can be made on control and planning safety via the use of a doer/checker architecture. Perception validation is more challenging, and thus far most developers have relied primarily upon road testing. Even if closed-course testing and simulation are increased, the problem of edge cases not seen in on-road data collection will remain, due to the likelihood of a heavy-tail distribution of surprises. Part of this heavy tail is subtle environmental degradation, which our work has shown can cause failures that reveal potential weak spots in perception systems. The talk summarizes my experiences in these areas and lays out the basis for the broader hard questions: how safe is safe enough, whether deployment delays cost lives, and how to regulate.



  1. Autonomous Vehicle Safety: Technical and Social Issues. WAISE, Sept. 18, 2018. Prof. Philip Koopman. © 2018 Philip Koopman
  2. Overview [General Motors]
     • Control & planning safety
     • Perception safety
     • Edge cases
     • Some hard questions about AV safety
  3. NREC: 30+ Years of Cool Robots. Carnegie Mellon University faculty, staff, and students at an off-campus Robotics Institute facility. [Timeline, 1985–2010: DARPA SC-ALV, ARPA Demo II, NASA Dante II, NASA Lunar Rover, Auto Excavator, Auto Harvesting, Laser Paint Removal, Mars Rovers, DARPA PerceptOR, DARPA LAGR, DARPA Grand Challenge, Urban Challenge, DARPA UPI, Army FCS, Auto Forklift, Auto Haulage, Auto Spraying, Software Safety]
  4. [ICSE 2018]
  5. Before Autonomy Software Safety
     • The Big Red Button era
  6. APD (Autonomous Platform Demonstrator)
     • Target GVW: 8,500 kg
     • Target speed: 80 km/hr
     • Safety-critical speed limit enforcement
     (Approved for Public Release. TACOM Case #20247, 07 OCT 2009)
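To make the slide's safety-critical speed limit enforcement concrete, here is a minimal sketch of an independent speed monitor. The names, tolerance, and failsafe hook are illustrative assumptions, not the actual APD implementation.

```python
# Illustrative sketch, not the APD implementation. The monitor is kept
# deliberately simple so it can be engineered to a high integrity level.

SPEED_LIMIT_KPH = 80.0   # hard cap, taken from the slide's target speed
TOLERANCE_KPH = 2.0      # assumed allowance for sensor noise

def speed_monitor(measured_speed_kph: float, trigger_failsafe) -> None:
    """Independently enforce the platform speed limit.

    Runs outside the autonomy stack: if the vehicle exceeds the cap,
    command the failsafe regardless of what the planner intended.
    """
    if measured_speed_kph > SPEED_LIMIT_KPH + TOLERANCE_KPH:
        trigger_failsafe()  # e.g., cut propulsion and apply brakes
```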
  7. Traditional Validation Meets Machine Learning
     • Use traditional software safety where you can ... BUT ...
     • Machine learning (inductive training):
       – No requirements: training data is difficult to validate
       – No design insight: generally inscrutable; prone to gaming and brittleness
  8. Safety Envelope Approach to ML Deployment
     • Specify unsafe regions
     • Specify safe regions
     • Under-approximate to simplify
     • Trigger the system safety response upon transition to an unsafe region
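A hedged sketch of what an under-approximated safety envelope can look like, using a conservative stopping-distance check. The braking, latency, and margin figures are illustrative assumptions; a real envelope would be derived from validated vehicle dynamics.

```python
# Under-approximated safety envelope (illustrative numbers). The safe
# region is shrunk conservatively: anything not provably safe is treated
# as unsafe, so the check errs toward triggering the failsafe.

BRAKE_DECEL_MPS2 = 4.0   # assumed worst-case braking capability
REACTION_TIME_S = 0.5    # assumed detection-to-actuation latency
MARGIN_M = 2.0           # extra buffer from under-approximation

def in_safe_region(speed_mps: float, gap_to_obstacle_m: float) -> bool:
    """Conservative check: can the vehicle stop before the obstacle?"""
    stopping_distance_m = (speed_mps * REACTION_TIME_S
                           + speed_mps ** 2 / (2 * BRAKE_DECEL_MPS2))
    return gap_to_obstacle_m > stopping_distance_m + MARGIN_M

def envelope_step(speed_mps, gap_m, trigger_failsafe):
    # Transitioning out of the safe region triggers the safety response.
    if not in_safe_region(speed_mps, gap_m):
        trigger_failsafe()
```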
  9. Architecting a Safety Envelope System: Doer/Checker Pair
     • "Doer" subsystem (ML): implements the normal, untrusted functionality
     • "Checker" subsystem (traditional SW, simple safety envelope): implements failsafes (safety functions)
     • The checker is entirely responsible for safety
     • The doer can be at a low Safety Integrity Level; the checker must be at a higher SIL
     (Also known as a "safety bag" approach)
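A minimal doer/checker skeleton, under assumed interfaces: the untrusted doer proposes a command each cycle, and only the simple, independently assured checker decides whether it is actuated.

```python
# Doer/checker skeleton (hypothetical interfaces). Only the checker is
# relied upon for safety, so only the checker needs a high SIL.

def control_step(doer_propose, checker_is_safe, failsafe_command, state):
    """One control cycle of a doer/checker pair."""
    command = doer_propose(state)          # low-SIL, possibly ML "doer"
    if checker_is_safe(state, command):    # high-SIL, simple "checker"
        return command
    # On violation the checker substitutes a pre-verified safety action.
    return failsafe_command
```

The checker's test can be exactly the stopping-distance envelope sketched above, which keeps the trusted code small enough for traditional software safety techniques.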
  10. Validating an Autonomous Vehicle Pipeline
     • Machine-learning-based approaches (perception): ???
     • Randomized & heuristic algorithms (planning): run-time safety envelopes; doer/checker architecture
     • Control systems: control software validation; doer/checker architecture
     • Autonomy interface to vehicle: traditional software validation
     • Perception presents a uniquely difficult assurance challenge
  11. Brute Force Road Testing
     • If 100M miles/critical mishap ...
     • Test 3x–10x longer than the mishap rate
     • Need 1 billion miles of testing
     • That's ~25 round trips on every road in the world
     • ... with fewer than 10 critical mishaps
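The slide's arithmetic, spelled out as a small worked example (the 10x multiplier is the top of the slide's 3x–10x range):

```python
# Brute-force road-test mileage implied by the slide's numbers.
MILES_PER_CRITICAL_MISHAP = 100e6   # if 100M miles per critical mishap...
TEST_MULTIPLIER = 10                # test ~3x-10x longer than the mishap rate

required_test_miles = MILES_PER_CRITICAL_MISHAP * TEST_MULTIPLIER
print(f"Required testing: {required_test_miles:.0e} miles")  # 1e+09 = 1 billion
```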
  12. Brute Force AV Validation: Public Road Testing
     • Good for identifying "easy" cases
     • Expensive and potentially dangerous
     http://bit.ly/2toadfa
  13. Did We Learn the Right Lesson from Tempe?
     • NOT: blame the victim (a pedestrian in the road is expected)
     • NOT: blame the technology (immature technology under test; failures are expected!)
     • NOT: blame the driver (a solo driver dropping out is expected)
     • The real AV testing lesson: ensure the safety driver is engaged
     • Safety argument: driver alert; time to respond; disengagement works
     https://goo.gl/aF1Hdi https://goo.gl/MbUvXZ
  14. Can the Safety Driver React in Time?
     • Safety driver tasks: maintain a mental model of "normal" AV behavior; detect abnormal AV behavior; react & recover if needed
     • Example: obstructed lane. Does the driver know when to take over? Can the driver brake in time, or is a sudden lane change necessary?
     • Example: two-way traffic. What if the AV commands a sudden left turn into traffic?
     https://goo.gl/vQxLh7 (Jan 20, 2016; Handan, China)
  15. Closed Course Testing [Volvo / Motor Trend]
     • Safer, but expensive
     • Not scalable
     • Only tests things you have thought of!
  16. Simulation [Udacity http://bit.ly/2K5pQCN; Apollo http://bit.ly/2toFdeT]
     • Highly scalable; less expensive
     • Scalable, but need to manage fidelity vs. cost
     • Only tests things you have thought of!
  17. What About Edge Cases?
     • You should expect the extreme, weird, and unusual: unusual road obstacles, extreme weather, strange behaviors
     • Edge cases are surprises: you won't see these in testing
     • Edge cases are the stuff you didn't think of!
     https://www.clarifai.com/demo http://bit.ly/2In4rzj
  18. Just a Few Edge Cases
     • Unusual roads & obstacles
     • Extreme weather
     • Strange behaviors
     http://bit.ly/2top1KD http://bit.ly/2tvCCPK https://dailym.ai/2K7kNS8 https://en.wikipedia.org/wiki/Magic_Roundabout_(Swindon) https://goo.gl/J3SSyu
  19. Why Edge Cases Matter
     • Where will you be after 1 billion miles of validation testing? Assume 1 million miles between unsafe "surprises".
     • Example #1: 100 "surprises" @ 100M miles/surprise each
       – All surprises seen about 10 times during testing
       – With luck, all bugs are fixed
     • Example #2: 100,000 "surprises" @ 100B miles/surprise each
       – Only 1% of surprise types seen during 1B miles of testing
       – Bug fixes give no real improvement (1.01M miles/surprise)
     https://goo.gl/3dzguf
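The slide's two scenarios, spelled out as a worked example. In both, testing covers 1 billion miles and unsafe "surprises" initially occur every 1 million miles on average:

```python
# Edge-case arithmetic from the slide.
TEST_MILES = 1e9

# Example 1: 100 distinct surprise types, each at 100M miles/surprise.
sightings_each = TEST_MILES / 100e6     # = 10 sightings per type,
                                        #   so (with luck) all get fixed

# Example 2: 100,000 distinct surprise types, each at 100B miles/surprise.
fraction_seen = TEST_MILES / 100e9      # = 1% of surprise types ever seen

# Fixing only the seen 1% barely moves the aggregate rate:
base_rate = 1 / 1e6                     # surprises per mile before fixes
improved_rate = base_rate * (1 - fraction_seen)
print(f"{1 / improved_rate:,.0f} miles/surprise")  # ~1,010,101: only ~1.01M
```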
  20. The Real World: Heavy Tail Distribution(?) [Chart: common things are seen in testing; edge cases in the heavy tail are not seen in testing]
  21. The Heavy Tail Testing Ceiling [Chart]
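One hedged way to see this "testing ceiling": simulate edge-case types whose frequencies follow a heavy-tailed distribution, and measure how much of the aggregate surprise rate is covered by the types a finite test actually observes. All parameters below are illustrative assumptions, not measurements.

```python
import random

# Illustrative Monte Carlo: with heavy-tailed edge-case frequencies,
# most of the aggregate risk hides in types never seen during testing.
random.seed(0)
NUM_TYPES = 100_000
# Zipf-like frequencies: type k occurs proportionally to 1/k (heavy tail).
weights = [1 / (k + 1) for k in range(NUM_TYPES)]
total = sum(weights)
probs = [w / total for w in weights]

ENCOUNTERS_IN_TESTING = 1_000   # assumed surprise encounters during testing
seen = set(random.choices(range(NUM_TYPES), weights=probs,
                          k=ENCOUNTERS_IN_TESTING))

covered_rate = sum(probs[k] for k in seen)
print(f"Types seen: {len(seen)}; "
      f"share of surprise rate covered: {covered_rate:.1%}")
# Fixing only what testing revealed leaves most of the tail untouched.
```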
  22. Malicious Image Attacks Reveal Brittleness
     • AlexNet: "car" misclassified as "not a car" after an imperceptible perturbation (difference shown magnified)
     • QuocNet: "bus" misclassified as "not a bus" after an imperceptible perturbation (difference shown magnified)
     Szegedy, C., et al., "Intriguing properties of neural networks," arXiv preprint arXiv:1312.6199 (2013). https://goo.gl/5sKnZV
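The slide's examples come from Szegedy et al.'s optimization-based attack; a simpler later attack in the same family, the fast gradient sign method (FGSM, Goodfellow et al. 2014), shows in a few lines how a tiny, targeted perturbation can flip a classifier. This sketch assumes a PyTorch image classifier with inputs normalized to [0, 1]:

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.01):
    """FGSM: nudge every pixel a tiny step in the direction that most
    increases the classifier's loss. Visually negligible epsilon values
    are often enough to change the predicted class."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```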
  23. ML Is Brittle to Environment Changes
     • Sensor data corruption experiments: synthetic equipment faults, Gaussian blur
     • Defocus & haze are similarly a significant issue
     Exploring the response of a DNN to environmental perturbations, from "Robustness Testing for Perception Systems," RIOT Project, NREC, DIST-A.
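In the same spirit as those experiments (though not the RIOT tooling itself), a robustness sweep can be sketched as applying increasing degradation to an image and recording where the perception output falls apart. The detector callable here is a hypothetical stand-in:

```python
from PIL import Image, ImageFilter

def blur_sweep(image_path, detector, radii=(0, 1, 2, 4, 8)):
    """Apply increasing Gaussian blur and record detector output at each
    severity, to find the degradation level where detections drop out."""
    image = Image.open(image_path).convert("RGB")
    results = {}
    for radius in radii:
        degraded = image.filter(ImageFilter.GaussianBlur(radius=radius))
        results[radius] = detector(degraded)  # hypothetical detector callable
    return results
```

The same loop generalizes to haze, defocus, noise, or synthetic fault injection by swapping the degradation function.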
  24. What We're Learning With Hologram: a scalable way to test & train on edge cases
     • Your fleet and your data lake feed the system
     • The Hologram cluster tests your CNN
     • The Hologram cluster trains your CNN
     • Your CNN becomes more robust
  25. Context-Dependent Perception Failures
     • Perception failures are often context-dependent: a false positive on a lane marking; a false negative on a real bicyclist; a false negative when in front of a dark vehicle; a false negative when a person is next to a light pole
     • False positives and false negatives are both a problem
     • This is an active research area; the technology is still in development
     • Will this pass a "vision test" for bicyclists?
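One hedged way to surface such context dependence is to tag each evaluation frame with its context and tally false positives and false negatives per tag. The record format here is an assumption for illustration:

```python
from collections import Counter

def failure_breakdown(records):
    """Tally perception failures by context tag so that context-dependent
    weak spots (e.g., 'person_next_to_pole') stand out in evaluation data.

    Each record is (context_tag, object_truly_present, object_detected).
    """
    false_pos, false_neg = Counter(), Counter()
    for context, truly_present, detected in records:
        if detected and not truly_present:
            false_pos[context] += 1
        elif truly_present and not detected:
            false_neg[context] += 1
    return false_pos, false_neg
```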
  26. Will Self-Driving Cars Be 94% Safer?
     • Where does that number come from? "The critical reason was assigned to drivers in an estimated 2,046,000 crashes that comprise 94 percent of the NMVCCS crashes at the national level. However, in none of these cases was the assignment intended to blame the driver for causing the crash." [DOT HS 812 115]
     • Low-hanging fruit is 49% for AVs (https://goo.gl/2WReTj)
     https://www.nhtsa.gov/technology-innovation/automated-vehicles-safety
  27. Hard Question: How Safe Is Enough?
     • ~100M miles/fatal mishap for human-driven road vehicles
     • 28% alcohol-impaired; 27% speeding-related; 28% not wearing seat belts; 9% distracted driving; 2% drowsy ... (2016 data)
     • 19% non-occupant fatalities (at-risk road users)
     • (Total exceeds 100% because some mishaps involve multiple factors)
     • Unimpaired drivers are better than 100M miles; arguably, humans can reach 200M miles (i.e., half the fatality rate)
     • What if AV fatality demographics change?
     https://goo.gl/tEuoaS
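The slide's benchmark arithmetic, as a worked example. The impairment share below is an assumption chosen to reproduce the slide's figure, not an official statistic:

```python
# If roughly half of fatal mishaps involve impairment, the benchmark
# for unimpaired human drivers is about twice as good as the average.
MILES_PER_FATAL_MISHAP_ALL = 100e6   # ~100M miles/fatal mishap, all drivers
ASSUMED_IMPAIRMENT_SHARE = 0.5       # illustrative assumption

rate_all = 1 / MILES_PER_FATAL_MISHAP_ALL
rate_unimpaired = rate_all * (1 - ASSUMED_IMPAIRMENT_SHARE)
print(f"Unimpaired benchmark: {1 / rate_unimpaired:,.0f} miles/fatal mishap")
# -> 200,000,000 miles: the slide's "half the accident rate" benchmark
```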
  28. Hard Question: How Do We Regulate?
     1. Trust the car companies to self-regulate? Toyota criminal sanctions, VW Dieselgate, the GM ignition switch, Takata air bags ... see your news feed. Numerous safety-critical software recalls; $XX++ billion invested, with an incentive to deploy ASAP.
     2. Use existing automotive safety regulations? FMVSS requires standardized vehicle-level road tests; need to be careful with "waivers". [FMVSS 138 telltale]
     3. Wait for a new safety standard? A "driver test" must include both skills and "maturity"; software safety standards are in progress, but will take a while.
  29. Ways to Improve AV Safety
     • More safety transparency
     • Independent safety assessments
     • Industry collaboration on safety
     • Minimum performance standards
     • Share data on scenarios and obstacles
     • Safety for on-road testing (driver & vehicle)
     • Autonomy software safety standards: traditional software safety ... PLUS ... dealing with uncertainty and brittleness, and data collection and feedback on field failures
     Thanks! http://bit.ly/2MTbT8F (sign modified) [Mars]
