Self-Driving Cars
Lecture 1 – Introduction
Robotics, Computer Vision, System Software
BE, MS, PhD (MMMTU, IISc, IIIT-Hyderabad)
Kumar Bipin
Self-Driving - A Human Dream
2
Agenda
1.1 Organization
1.2 Introduction
1.3 History of Self-Driving
5
1.1
Organization
Contents
Goal: Develop an understanding of the capabilities and limitations of autonomous
driving solutions and gain a basic understanding of the entire system comprising
perception, planning and vehicle control. Training agents in simple environments.
I History of self-driving cars
I End-to-end learning for self-driving (imitation/reinforcement learning)
I Modular approaches to self-driving
I Perception (camera, lidar, radar)
I Localization (with visual and road maps)
I Navigation and path planning
I Vehicle models and control algorithms
8
Prerequisites
Linear Algebra:
I Vectors: x, y ∈ Rn
I Matrices: A, B ∈ Rm×n
I Operations: A^T, A^{-1}, Tr(A), det(A), A + B, AB, Ax, x^T y
I Norms: ||x||_1, ||x||_2, ||x||_inf, ||A||_F
I SVD: A = U D V^T
44
Prerequisites
Probability and Information Theory:
I Probability distributions: P(X = x)
I Marginal/conditional: p(x) = ∫ p(x, y) dy,  p(x, y) = p(x|y) p(y)
I Bayes rule: p(x|y) = p(y|x) p(x) / p(y)
I Conditional independence: x ⊥ y | z ⇔ p(x, y|z) = p(x|z) p(y|z)
I Expectation: E_{x∼p}[f(x)] = ∫ p(x) f(x) dx
I Variance: Var(f(x)) = E[(f(x) − E[f(x)])²]
I Distributions: Bernoulli, Categorical, Gaussian, Laplace
I Entropy: H(x), KL divergence: D_KL(p || q)
45
Thank You!
Looking forward to our discussions
1.2
Introduction
Why Self-Driving Cars?
Road Fatalities in 2017
I USA: 32,700 Germany: 3,300 World: 1,300,000
I Main factors: speeding, intoxication, distraction, etc. 49
Benefits of Autonomous Driving
I Lower risk of accidents
I Provide mobility for elderly and people with disabilities
I In the US 45% of people with disabilities still work
I Decrease pollution for a more healthy environment
I New ways of public transportation
I Car pooling
I Car sharing
I Reduce number of cars (95% of the time a car is parked)
50
Uber Commercial (2018)
51
Uber Fatal Accident (2018)
52
Self-driving is Hard
Human performance: 1 fatality per 100 million miles
Error rate to improve on: 0.000001 %
Challenges:
I Snow, heavy rain, night
I Unstructured roads, parking lots
I Pedestrians, erratic behavior
I Reflections, dynamics
I Rare and unseen events
I Merging, negotiating, reasoning
I Ethics: what is good behavior?
I Legal questions
http://theoatmeal.com/blog/google_self_driving_car
53
Unstructured Traffic
54
The Trolley Problem (1905)
Thought experiment:
I You observe a train that will kill 5 people on the rail tracks if it continues
I You have the option to pull a lever to redirect the train to another track
I However, the train will kill one (other) person on that alternate track
I What is your decision? What is the correct/ethical decision?
55
The MIT Moral Machine
http://moralmachine.mit.edu/ 56
1.3
History of Self-Driving
The Automobile
1886: Benz Patent-Motorwagen Nummer 1
59
1886: Benz Patent-Motorwagen Nummer 1
I Benz 954 cc single-cylinder four-stroke engine (500 watts)
I Weight: 100 kg (engine), 265 kg (total)
I Maximal speed: 16 km/h
I Consumption: 10 liter / 100 km (!)
I Construction based on the tricycle, many bicycle components
I 29.1.1886: patent filed
I 3.7.1886: first public test drive in Mannheim
I 2.11.1886: patent granted, but investors stayed skeptical
I First long distance trip (106 km) by Bertha Benz in 1888 with Motorwagen
Nummer 3 (without knowledge of her husband) fostered commercial interest
I First gas station: pharmacy in Wiesloch near Heidelberg
59
1886: Benz Patent-Motorwagen Nummer 1
59
Self-Driving Cars
1925: Phantom Auto – “American Wonder” (Houdina Radio Control)
In the summer of 1925, Houdina’s driverless car, called the American Wonder, traveled along Broadway in New York
City—trailed by an operator in another vehicle—and down Fifth Avenue through heavy traffic. It turned corners, sped up,
slowed down and honked its horn. Unfortunately, the demonstration ended when the American Wonder crashed into
another vehicle filled with photographers documenting the event. (Discovery Magazine)
https://www.discovermagazine.com/technology/the-driverless-car-era-began-more-than-90-years-ago 61
1939: Futurama – New York World’s Fair
I Exhibit at the New York World’s Fair in 1939 sponsored by General Motors
I Designed by Norman Bel Geddes: his vision of the world 20 years later (1960)
I Radio-controlled electric cars, electromagnetic field via circuits in roadway
I #1 exhibition, very well received (during the Great Depression), prototypes by RCA & GM
https://www.youtube.com/watch?v=sClZqfnWqmc 62
1956: General Motors Firebird II
https://www.youtube.com/watch?v=cPOmuvFostY 63
1960: RCA Labs' Wire-Controlled Car & Aeromobile
https://spectrum.ieee.org/selfdriving-cars-were-just-around-the-cornerin-1960 64
1970: Citroen DS19
I Steered by sensing magnetic cables in the road, up to 130 km/h
https://www.youtube.com/watch?v=MwdjM2Yx3gU 65
1986: Navlab 1
I Vision-based navigation
Jochem, Pomerleau, Kumar and Armstrong: PANS: A Portable Navigation Platform. IV, 1995. 66
Navlab Overview
I Project at Carnegie Mellon University, USA
I 1986: Navlab 1: 5 computer racks (Warp supercomputer)
I 1988: First semi-autonomous drive at 20 mph
I 1990: Navlab 2: 6 mph offroad, 70 mph highway driving
I 1995: Navlab 5: “No Hands Across America” (2850 miles, 98 % autonomy)
I PANS: Portable Advanced Navigation Support
I Compute: 50 MHz Sparc workstation (only 90 watts)
I Main focus: lane keeping (lateral but no longitudinal control, i.e., steering only)
I Position estimation: Differential GPS + Fibre Optic Gyroscope (IMU)
I Low-level control: HC11 microcontroller
Jochem, Pomerleau, Kumar and Armstrong: PANS: A Portable Navigation Platform. IV, 1995. 67
1988: ALVINN
ALVINN: An Autonomous Land Vehicle in a Neural Network
I Forward-looking, vision based driving
I Fully connected neural network maps
road images to vehicle turn radius
I Directions discretized (45 bins)
I Trained on simulated road images
I Tested on unlined paths, lined city
streets and interstate highways
I 90 consecutive miles at up to 70 mph
Pomerleau: ALVINN: An Autonomous Land Vehicle in a Neural Network. NIPS, 1988. 68
1988: ALVINN
Pomerleau: ALVINN: An Autonomous Land Vehicle in a Neural Network. NIPS, 1988. 69
1995: AURORA
AURORA: Automotive Run-Off-Road Avoidance System
I Downward-looking (mounted at side)
I Adjustable template correlation
I Tracks solid or dashed lane marking
I Shown to perform robustly even
when the markings are worn or their
appearance in the image is degraded
I Mainly tested as a lane departure
warning system (“time to crossing”)
Chen, Jochem and Pomerleau: AURORA: A Vision-Based Roadway Departure Warning System. IROS, 1995. 70
1986: VaMoRs – Bundeswehr Universität Munich
I Developed by Ernst Dickmanns in the context of EUREKA-Prometheus (€800 million)
(PROgraMme for a European Traffic of Highest Efficiency and Unprecedented Safety, 1987-1995)
I Demonstration to Daimler-Benz Research 1986 in Stuttgart
I Longitudinal & lateral guidance with lateral acceleration feedback
I Speed: 0 to 36 km/h 71
1994: VAMP – Bundeswehr Universität Munich
I 2nd Generation Transputer (60 processors), bifocal saccade vision, no GPS
I 1678 km autonomous ride Munich to Odense, 95% autonomy (up to 158 km)
I Autonomous driving speed record: 180 km/h (lane keeping)
I Convoi driving, automatic lane change (triggered by human)
72
1992: Summary Paper by Dickmanns
Dickmanns and Mysliwetz: Recursive 3-D Road and Relative Ego-State Recognition. PAMI, 1992. 73
1995: Invention of Adaptive Cruise Control (ACC)
I 1992: Lidar-based distance control by Mitsubishi (throttle control & downshift)
I 1997: Laser adaptive cruise control by Toyota (throttle control & downshift)
I 1999: Distronic radar-assisted ACC by Mercedes-Benz (S-Class), level 1 autonomy
74
2000: First Technological Revolution: GPS, IMUs & Maps
I NAVSTAR GPS available with 1 meter accuracy, IMUs improve up to 5 cm
I Navigation systems and road maps available
I Accurate self-localization and ego-motion estimation algorithms
75
2004: Darpa Grand Challenge 1 (Limited to US Participants)
I 1st competition in the Mojave Desert along a 240 km route, $1 million prize money
I No traffic, dirt roads, driven by GPS waypoints (2935 points, up to 4 per curve)
I None of the robot vehicles finished the route. CMU traveled the farthest distance,
completing 11.78 km of the course before hitting a rock.
76
2005: Darpa Grand Challenge 2 (Limited to US Participants)
I 2nd competition in the Mojave Desert along a 212 km route, $2 million prize money
I Five teams finished (Stanford team 1st in 6:54 h, CMU team 2nd in 7:05 h)
77
2006: Park Shuttle Rotterdam
I 1800 meters route from metro station Kralingse Zoom to business park Rivium
I One of the first truly driverless car, but dedicated lane, localization via magnets
78
2006: Second Technological Revolution: Lidars & High-res Sensors
I High-resolution Lidar
I Camera systems with increasing resolution
I Accurate 3D reconstruction, 3D detection & 3D localization
79
2007: Darpa Urban Challenge (International Participants)
I 3rd competition at George Air Force Base, 96 km route, urban driving, $2 million prize
I Rules: obey traffic laws, negotiate, avoid obstacles, merge into traffic
I 11 US teams received $1 million funding for their research
I Winners: CMU's Boss 1st (4:10), Stanford's Junior 2nd (4:29). No non-US participant.
80
2009: Google starts working on Self-Driving Car
I Led by Sebastian Thrun, former director of Stanford AI lab and Stanley team
I Others: Chris Urmson, Dmitri Dolgov, Mike Montemerlo, Anthony Levandowski
I Renamed "Waymo" in 2016 (Google spent $1 billion through 2015)
https://waymo.com/ 81
2010: VisLab Intercontinental Autonomous Challenge (VIAC)
I July 20 to October 28: 16,000 kilometres trip from Parma, Italy to Shanghai, China
I The second vehicle automatically followed the route defined by the leader vehicle,
either by tracking it visually or via GPS waypoints transmitted by the lead vehicle
Broggi, Medici, Zani, Coati and Panciroli: Autonomous vehicles control in the VisLab Intercontinental Autonomous Challenge. Annu. Rev. Control, 2012. 82
2010: Pikes Peak Self-Driving Audi TTS
I Pikes Peak International Hill Climb (since 1916): 20 km, 1,440 m elevation gain, summit: 4,300 m
I Audi TTS completes track in 27 min (record in 2010: 10 min, now: 8 min)
https://www.youtube.com/watch?v=Arx8qWx9CFk 83
2010: Stadtpilot (Technical University Braunschweig)
I Goal: geofenced inner-city driving based on laser scanners, cameras and HD maps
I Challenges: traffic lights, roundabouts, etc. Similar efforts by FU Berlin and others
Saust, Wille, Lichte and Maurer: Autonomous Vehicle Guidance on Braunschweig’s inner ring road within the Stadtpilot Project. IV, 2011. 84
2012: Third Technological Revolution: Deep Learning
I Representation learning boosts accuracy across tasks and benchmarks
Güler et al.: DensePose: Dense Human Pose Estimation In The Wild. CVPR, 2018. 85
2012: Third Technological Revolution: New Benchmarks
Geiger, Lenz and Urtasun: Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. CVPR, 2012. 86
2013: Mercedes Benz S500 Intelligent Drive
I Autonomous ride on the historic Bertha Benz route by Daimler R&D and KIT/FZI
I Novelty: close-to-production stereo cameras / radar (but requires HD maps)
Ziegler et al.: Making Bertha Drive - An Autonomous Journey on a Historic Route. IEEE Intell. Transp. Syst. Mag., 2014. 87
2014: Mercedes S Class
Advanced ADAS (Level 2 Autonomy):
I Autonomous steering, lane keeping, acceleration/braking, collision avoidance,
driver fatigue monitoring in city traffic and highway speeds up to 200 km/h 88
2014: Society of Automotive Engineers: SAE Levels of Autonomy
I Lateral control = steering, Longitudinal control = gas/brake
89
Disengagements per 1,000 miles (California Department of Motor Vehicles, 2017)
90
2015: Uber starts Self-Driving Research
I Uber hires 50 robotics researchers and academics from CMU; the unit was shut down in 2020
91
2016: OTTO
I Self-driving truck company, bought by Uber for $625 million, later shut down
92
2015: Tesla Model S Autopilot
Tesla Autopilot 2015 (Level 2 Autonomy):
I Lane keeping for limited-access highways (hands-off time: 30-120 seconds)
I Does not read traffic lights or traffic signs, and does not detect pedestrians/cyclists
93
2016: Tesla Model S Autopilot: Fatal Accident 1
94
2018: Tesla Model X Autopilot: Fatal Accident 2
95
2018: Tesla Model X Autopilot: Fatal Accident 2
The National Transportation Safety Board (NTSB) said that four seconds before the 23
March crash on a highway in Silicon Valley, which killed Walter Huang, 38, the car
stopped following the path of a vehicle in front of it. Three seconds before the impact,
it sped up from 62mph to 70.8mph, and the car did not brake or steer away, the NTSB
said. After the fatal crash in the city of Mountain View, Tesla noted that the driver had
received multiple warnings to put his hands on the wheel and said he did not intervene
during the five seconds before the car hit the divider. But the NTSB report revealed that
these alerts were made more than 15 minutes before the crash. In the 60 seconds
prior to the collision, the driver also had his hands on the wheel on three separate
occasions, though not in the final six seconds, according to the agency. As the car
headed toward the barrier, there was no precrash braking or evasive steering
movement, the report added. The Guardian (June, 2018)
96
2018: Waymo (former Google) announced Public Service
I In 2018 driving without safety driver in a geofenced district of Phoenix
I By 2021 also in suburbs of Arizona, San Francisco and Mountain View
97
2018: Nuro Last-mile Delivery
I Founded by two of the Google self-driving car engineers
98
Self-Driving Industry
I NVIDIA: Supplier of self-driving hardware and software
I Waabi: Startup by Raquel Urtasun (formerly Uber)
I Aurora: Startup by Chris Urmson (formerly CMU, Google, Waymo)
I Argo AI: Startup by Bryan Salesky (now Ford/Volkswagen)
I Zoox: Startup by Jesse Levinson (now Amazon)
I Cruise: Startup by Kyle Vogt (now General Motors)
I NuTonomy: Startup by Doug Parker (now Delphi/Aptiv)
I Efforts in China: Baidu Apollo, AutoX, Pony.AI
I Comma.ai: Custom open-source dashcam to retrofit any vehicle
I Wayve: Startup focusing on end-to-end self-driving
99
Self-Driving Industry
100
Business Models
Autonomous or nothing (Google, Apple, Uber)
I Very risky, only few companies can do this
I Long term goals
Introduce technology little by little (all car companies)
I Car industry is very conservative
I ADAS as intermediate goal
I Sharp transition: how to keep the driver engaged?
101
Wild Predictions about the Future of Self-Driving
102
Summary
I Self-driving has a long history
I Highway lane-keeping of today was developed over 30 years ago
I Increased robustness ⇒ introduction of level 3 for highways in 2019
I Increased interest after DARPA challenge and new benchmarks (e.g., KITTI)
I Many claims about full self-driving (e.g., Elon Musk), but level 4/5 stays hard
I Waymo introduced first public service end of 2018 (with safety driver)
I Waymo/Tesla seem ahead of competition in full self-driving, but no winner yet
I But several setbacks (Uber, Tesla accidents)
I Most existing systems require laser scanners and HD maps (exception: Tesla)
I Driving as an engineering problem, quite different from human cognition
103
Self-Driving Cars
Lecture 2 – Imitation Learning
Robotics, Computer Vision, System Software
BE, MS, PhD (MMMTU, IISc, IIIT-Hyderabad)
Kumar Bipin
Agenda
2.1 Approaches to Self-Driving
2.2 Deep Learning Recap
2.3 Imitation Learning
2.4 Conditional Imitation Learning
2
2.1
Approaches to Self-Driving
Autonomous Driving
[Diagram: sensory input → mapping function → steer, gas/brake]
Dominating Paradigms:
I Modular Pipelines
I End-to-End Learning (Imitation Learning, Reinforcement Learning)
I Direct Perception
4
Autonomous Driving: Modular Pipeline
[Diagram: modular pipeline: sensory input → low-level perception → scene parsing → path planning → vehicle control → steer, gas/brake]
Examples:
I [Montemerlo et al., JFR 2008]
I [Urmson et al., JFR 2008]
I Waymo, Uber, Tesla, Zoox, ...
5
Autonomous Driving: Modular Pipeline
Pros:
I Small components, easy
to develop in parallel
I Interpretability
Cons:
I Piece-wise training (not jointly)
I Localization and planning
heavily relies on HD maps
HD maps: Centimeter precision lanes, markings, traffic lights/signs, human annotated
7
Autonomous Driving: Modular Pipeline
I Piece-wise training difficult: not all objects are equally important!
Ohn-Bar and Trivedi: Are All Objects Equal? Deep Spatio-Temporal Importance Prediction in Driving Videos. PR, 2017. 8
Autonomous Driving: Modular Pipeline
I HD Maps are expensive to create (data collection & annotation effort)
https://www.geospatialworld.net/article/hd-maps-autonomous-vehicles/ 9
Autonomous Driving: End-to-End Learning
[Diagram: end-to-end learning: sensory input → neural network (imitation / reinforcement learning) → steer, gas/brake]
Examples:
I [Pomerleau, NIPS 1989]
I [Bojarski, Arxiv 2016]
I [Codevilla et al., ICRA 2018]
10
Autonomous Driving: End-to-End Learning
Pros:
I End-to-end training
I Cheap annotations
Cons:
I Training / Generalization
I Interpretability
10
Autonomous Driving: Direct Perception
[Diagram: direct perception: sensory input → neural network → intermediate representations → vehicle control → steer, gas/brake]
Examples:
I [Chen et al., ICCV 2015]
I [Sauer et al., CoRL 2018]
I [Behl et al., IROS 2020]
11
Autonomous Driving: Direct Perception
Pros:
I Compact Representation
I Interpretability
Cons:
I Control typically not learned jointly
I How to choose representations?
11
2.2
Deep Learning Recap
Supervised Learning
[Diagram: input → model → output]
I Learning: estimate parameters w from training data {(x_i, y_i)}, i = 1, ..., N
I Inference: make novel predictions y = f_w(x)
13
Linear Classification
Logistic Regression
ŷ = σ(w^T x + w_0),   σ(x) = 1 / (1 + e^{−x})
I Let x ∈ R²
I Decision boundary: w^T x + w_0 = 0
I Decide for class 1 ⇔ w^T x > −w_0
I Decide for class 0 ⇔ w^T x < −w_0
I Which problems can we solve?
[Figure: sigmoid σ(x) with the 0.5 decision boundary separating class 0 and class 1]
14
Linear Classification
Linear Classifier: Class 1 ⇔ w^T x > −w_0

x1 x2 | OR(x1, x2)
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 1

OR: w = (1, 1)^T, −w_0 = 0.5, i.e., decide class 1 ⇔ x1 + x2 > 0.5
[Figure: OR is linearly separable in the (x1, x2) plane]
15
Linear Classification
Linear Classifier: Class 1 ⇔ w^T x > −w_0

x1 x2 | AND(x1, x2)
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

AND: w = (1, 1)^T, −w_0 = 1.5, i.e., decide class 1 ⇔ x1 + x2 > 1.5
[Figure: AND is linearly separable in the (x1, x2) plane]
16
Linear Classification
Linear Classifier: Class 1 ⇔ w^T x > −w_0

x1 x2 | NAND(x1, x2)
 0  0 | 1
 0  1 | 1
 1  0 | 1
 1  1 | 0

NAND: w = (−1, −1)^T, −w_0 = −1.5, i.e., decide class 1 ⇔ −x1 − x2 > −1.5
[Figure: NAND is linearly separable in the (x1, x2) plane]
17
Linear Classification
Linear Classifier: Class 1 ⇔ w^T x > −w_0

x1 x2 | XOR(x1, x2)
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

XOR: no weights w, w_0 exist; XOR is not linearly separable
[Figure: no single line separates the XOR classes in the (x1, x2) plane]
18
Linear Classification
Linear classifier with non-linear features ψ(x) = (x1, x2, x1·x2)^T:
Class 1 ⇔ w^T ψ(x) > −w_0

x1 x2 | ψ1(x) ψ2(x) ψ3(x) | XOR
 0  0 |   0     0     0   |  0
 0  1 |   0     1     0   |  1
 1  0 |   1     0     0   |  1
 1  1 |   1     1     1   |  0

I Non-linear features allow a linear classifier to solve non-linear classification
problems (see the sketch below)!
19
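A minimal NumPy sketch of this trick (not from the slides): with the extra feature ψ3(x) = x1·x2 and hand-picked weights, a single linear threshold reproduces XOR.

```python
import numpy as np

# XOR truth table
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# Non-linear feature map psi(x) = (x1, x2, x1*x2)
Psi = np.column_stack([X, X[:, 0] * X[:, 1]])

# Hand-picked weights: x1 + x2 - 2*x1*x2 > 0.5  <=>  XOR(x1, x2) = 1
w = np.array([1.0, 1.0, -2.0])
w0 = -0.5

pred = (Psi @ w + w0 > 0).astype(int)
print(pred)                      # [0 1 1 0]
assert np.array_equal(pred, y)   # the linear classifier now solves XOR
```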
Representation Matters
[Figure: the same data shown in Cartesian coordinates (x, y) and in polar coordinates (r, θ); it is linearly separable only in the polar representation]
I But how to choose the transformation? Can be very hard in practice.
I Yet, this was the dominant approach until the 2000s (vision, speech, ..)
I In deep/representation learning we want to learn these transformations
20
Non-Linear Classification
Linear Classifier: Class 1 ⇔ w^T x > −w_0

x1 x2 | XOR(x1, x2)
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

XOR(x1, x2) = AND(OR(x1, x2), NAND(x1, x2))
[Figure: XOR realized by combining the OR and NAND decision boundaries]
21
Non-Linear Classification
XOR(x1, x2) = AND(OR(x1, x2),NAND(x1, x2))
The above expression can be rewritten
as a program of logistic regressors:
h1 = σ(w_OR^T x + w_OR,0)
h2 = σ(w_NAND^T x + w_NAND,0)
ŷ = σ(w_AND^T h + w_AND,0)
Note that h(x) is a non-linear feature of x.
We call h(x) a hidden layer.
22
Multi-Layer Perceptrons
I MLPs are feedforward neural networks (no feedback connections)
I They compose several non-linear functions f(x) = ŷ(h3(h2(h1(x))))
where hi(·) are called hidden layers and ŷ(·) is the output layer
I The data specifies only the behavior of the output layer (thus the name “hidden”)
I Each layer i comprises multiple neurons j which are implemented as affine
transformations followed by non-linear activation functions g:
h_ij = g(a_ij^T h_{i−1} + b_ij)
I Each neuron in each layer is fully connected to all neurons of the previous layer
I The overall length of the chain is the depth of the model ⇒ “Deep Learning”
23
MLP Network Architecture
Input Layer Hidden Layer 1 Hidden Layer 2 Hidden Layer 3 Output Layer
Network Depth = #Computation Layers = 4
Layer Width = #Neurons in Layer
I Neurons are grouped into layers, each neuron fully connected to all prev. ones
I Hidden layer h_i = g(A_i h_{i−1} + b_i) with activation function g(·) and weights A_i, b_i
24
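A minimal PyTorch version of such an MLP, h_i = g(A_i h_{i−1} + b_i), with three hidden layers; the widths and the ReLU activation are illustrative choices, not taken from the slides.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, in_dim=2, hidden=(16, 16, 16), out_dim=1):
        super().__init__()
        layers, prev = [], in_dim
        for width in hidden:                     # three hidden layers -> depth 4
            layers += [nn.Linear(prev, width), nn.ReLU()]
            prev = width
        layers.append(nn.Linear(prev, out_dim))  # output layer
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = MLP()
print(model(torch.randn(8, 2)).shape)  # torch.Size([8, 1])
```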
Deeper Models allow for more Complex Decisions
2 Hidden Neurons 5 Hidden Neurons 15 Hidden Neurons
https://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
25
Output and Loss Functions
Input Layer Hidden Layer 1 Hidden Layer 2 Output Layer Loss Function Target
I The output layer is the last layer in a neural network which computes the output
I The loss function compares the result of the output layer to the target value(s)
I Choice of output layer and loss function depends on task (discrete, continuous, ..)
26
Output Layer
Input Layer Hidden Layer 1 Hidden Layer 2 Output Layer Loss Function Target
I For classification problems, we use a sigmoid or softmax non-linearity
I For regression problems, we can directly return the value after the last layer
27
Loss Function
Input Layer Hidden Layer 1 Hidden Layer 2 Output Layer Loss Function Target
I For classification problems, we use the (binary) cross-entropy loss
I For regression problems, we can use the ℓ1 or ℓ2 loss
28
Activation Functions
Input Layer Hidden Layer 1 Hidden Layer 2 Output Layer Loss Function Target
I Hidden layer h_i = g(A_i h_{i−1} + b_i) with activation function g(·) and weights A_i, b_i
I The activation function is frequently applied element-wise to its input
I Activation functions must be non-linear to learn non-linear mappings
29
Activation Functions
30
Convolutional Neural Networks
Convolutional Neural Networks
I Multi-layer perceptrons don’t scale to high-dimensional inputs
I ConvNets represent data in 3 dimensions: width, height, depth (= feature maps)
I ConvNets interleave discrete convolutions, non-linearities and pooling
I Key ideas: sparse interactions, parameter sharing, equivariant representation
32
Fully Connected vs. Convolutional Layers
Filter Kernel
I Fully connected layer: #Weights = W × H × C_out × (W × H × C_in + 1)
I Convolutional layer: #Weights = C_out × (K × K × C_in + 1) ("weight sharing")
I With C_in input and C_out output channels, layer size W × H and kernel size K × K
I Convolutions are followed by non-linear activation functions (e.g., ReLU)
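A quick numerical check of the two weight-count formulas above against PyTorch layers; the sizes W, H, C_in, C_out, K are arbitrary.

```python
import torch.nn as nn

W, H, C_in, C_out, K = 32, 32, 16, 32, 3

fc_weights = W * H * C_out * (W * H * C_in + 1)   # fully connected layer
conv_weights = C_out * (K * K * C_in + 1)         # convolutional layer

fc = nn.Linear(W * H * C_in, W * H * C_out)
conv = nn.Conv2d(C_in, C_out, kernel_size=K, padding=1)

assert fc_weights == sum(p.numel() for p in fc.parameters())
assert conv_weights == sum(p.numel() for p in conv.parameters())
print(fc_weights, conv_weights)   # 536903680 4640
```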
Padding
Idea of Padding:
I Add boundary of appropriate size with zeros (blue) around input tensor
34
Downsampling
I Downsampling reduces the spatial resolution (e.g., for image level predictions)
I Downsampling increases the receptive field (which pixels influence a neuron)
35
Pooling
I Typically, stride s = 2 and kernel size 2 × 2 ⇒ reduces spatial dimensions by 2
I Pooling has no parameters (typical pooling operations: max, min, mean)
36
Fully Connected Layers
I Often, convolutional networks comprise fully connected layers at the end
37
Optimization
Optimization
Optimization Problem (dataset X):
w* = argmin_w L(X, w)
Gradient Descent:
w_0 = w_init
w_{t+1} = w_t − η ∇_w L(X, w_t)
I Neural network loss L(X, w) is not convex, we have to use gradient descent
I There exist multiple local minima, but we will find only one through optimization
I Good news: it is known that many local minima in deep networks are good ones
39
Backpropagation
I Values are efficiently computed forward, gradients backward
I Modularity: each node must only "know" how to compute gradients wrt. its own arguments
I One forward/backward pass per data point:
∇_w L(X, w) = Σ_{i=1}^{N} ∇_w L(y_i, x_i, w)
[Figure: the forward pass computes the loss, the backward pass computes the derivatives]
40
Gradient Descent
Algorithm:
1. Initialize weights w_0 and pick learning rate η
2. For all data points i ∈ {1, ..., N} do:
2.1 Forward propagate x_i through the network to calculate prediction ŷ_i
2.2 Backpropagate to obtain the gradient ∇_w L_i(w_t) ≡ ∇_w L(ŷ_i, y_i, w_t)
3. Update weights: w_{t+1} = w_t − η (1/N) Σ_i ∇_w L_i(w_t)
4. If validation error decreases, go to step 2, otherwise stop
Challenges:
I Typically, millions of parameters ⇒ dim(w) = 1 million or more
I Typically, millions of training points ⇒ N = 1 million or more
I Becomes extremely expensive to compute and does not fit into memory
41
Stochastic Gradient Descent
Solution:
I The total loss over the entire training set can be expressed as the expectation:
(1/N) Σ_i L_i(w_t) = E_{i∼U{1,N}} [L_i(w_t)]
I This expectation can be approximated by a smaller subset B ≪ N of the data:
E_{i∼U{1,N}} [L_i(w_t)] ≈ (1/B) Σ_b L_b(w_t)
I Thus, the gradient can also be approximated by this subset (= minibatch):
(1/N) Σ_i ∇_w L_i(w_t) ≈ (1/B) Σ_b ∇_w L_b(w_t)
42
Stochastic Gradient Descent
Algorithm:
1. Initialize weights w_0, pick learning rate η and minibatch size B
2. Draw a random minibatch {(x_1, y_1), ..., (x_B, y_B)} ⊆ X (with B ≪ N)
3. For all minibatch elements b ∈ {1, ..., B} do:
3.1 Forward propagate x_b through the network to calculate prediction ŷ_b
3.2 Backpropagate to obtain the batch element gradient ∇_w L_b(w_t) ≡ ∇_w L(ŷ_b, y_b, w_t)
4. Update weights: w_{t+1} = w_t − η (1/B) Σ_b ∇_w L_b(w_t)
5. If validation error decreases, go to step 2, otherwise stop
43
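A minimal NumPy sketch of the minibatch update above, applied to a linear regression toy problem; the problem, sizes and learning rate are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, B, eta = 10_000, 5, 64, 0.1
X = rng.normal(size=(N, D))
w_true = rng.normal(size=D)
y = X @ w_true + 0.01 * rng.normal(size=N)

w = np.zeros(D)
for t in range(500):
    idx = rng.choice(N, size=B, replace=False)        # draw a random minibatch
    grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / B   # gradient of the mean squared error
    w -= eta * grad                                   # w_{t+1} = w_t - eta * grad
print(np.linalg.norm(w - w_true))                     # close to 0
```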
First-order Methods
There exist many variants:
I SGD
I SGD with Momentum
I SGD with Nesterov Momentum
I RMSprop
I Adagrad
I Adadelta
I Adam
I AdaMax
I NAdam
I AMSGrad
Adam is often the method of choice due to its robustness.
44
Learning Rate Schedules
[Figure: training error (%) over iterations (×10⁴) for plain-18/plain-34 and ResNet-18/ResNet-34]
I A fixed learning rate is too slow in the beginning and too fast in the end
I Exponential decay: η_t = η α^t
I Step decay: η ← 0.5 η (every K iterations)
He, Zhang, Ren and Sun: Deep Residual Learning for Image Recognition. CVPR, 2016. 45
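The two schedules written as small Python helpers; the numeric values are illustrative.

```python
def exponential_decay(eta0, alpha, t):
    """eta_t = eta0 * alpha^t"""
    return eta0 * alpha ** t

def step_decay(eta0, t, K=10_000, factor=0.5):
    """Multiply the learning rate by `factor` every K iterations."""
    return eta0 * factor ** (t // K)

print(exponential_decay(0.1, 0.9999, 10_000))  # ~0.037
print(step_decay(0.1, 25_000))                 # 0.025
```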
Regularization
Capacity, Overfitting and Underfitting
[Figure: polynomial fits of degree M = 1, 3, 9 to noisy observations of a ground-truth function, evaluated on a held-out test set]
Capacity too low Capacity about right Capacity too high
I Underfitting: Model too simple, does not achieve low error on training set
I Overfitting: Training error small, but test error (= generalization error) large
I Regularization: Take model from third regime (right) to second regime (middle)
47
Early Stopping and Parameter Penalties
Unregularized Objective
L2 Regularizer
Early stopping:
I Dashed: Trajectory taken by SGD
I Trajectory stops at w̃ before
reaching minimum training error w∗
L2 Regularization:
I Regularize objective with L2 penalty
I Penalty forces minimum of
regularized loss w̃ closer to origin
48
Dropout
Idea:
I During training, set neurons to zero with probability µ (typically µ = 0.5)
I Each binary mask is one model, changes randomly with every training iteration
I Creates ensemble “on the fly” from a single network with shared parameters
Srivastava, Hinton, Krizhevsky, Sutskever and Salakhutdinov: Dropout: a simple way to prevent neural networks from overfitting. JMLR, 2014. 49
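A minimal NumPy sketch of (inverted) dropout as described above; rescaling by 1/(1−µ) at training time, so that the expected activation stays unchanged, is a common implementation choice and not stated on the slide.

```python
import numpy as np

def dropout(h, mu=0.5, training=True):
    if not training:
        return h                              # no dropout at test time
    mask = np.random.rand(*h.shape) >= mu     # drop each neuron with probability mu
    return h * mask / (1.0 - mu)              # rescale surviving activations

h = np.ones((2, 8))
print(dropout(h))   # random binary pattern, surviving units scaled by 2
```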
Data Augmentation
I Best way towards better generalization
is to train on more data
I However, data in practice often limited
I Goal of data augmentation: create
“fake” data from the existing data (on
the fly) and add it to the training set
I New data must preserve semantics
I Even simple operations like translation
or adding per-pixel noise often already
greatly improve generalization
I https://github.com/aleju/imgaug
50
2.3
Imitation Learning
Imitation Learning: Manipulation
Towards Imitation Learning of Dynamic Manipulation Tasks: A Framework to Learn from Failures 52
Imitation Learning: Car Racing
Trainer
(Human Driver)
Trainee
(Neural Network)
53
Imitation Learning in a Nutshell
Hard coding policies is often difficult ⇒ Rather use a data-driven approach!
I Given: demonstrations or demonstrator
I Goal: train a policy to mimic the expert's decisions
I Variants: behavior cloning (this lecture), inverse optimal control, ...
54
Formal Definition of Imitation Learning
I State: s ∈ S may be partially observed (e.g., game screen)
I Action: a ∈ A may be discrete or continuous (e.g., turn angle, speed)
I Policy: πθ : S → A we want to learn the policy parameters θ
I Optimal action: a∗ ∈ A provided by expert demonstrator
I Optimal policy: π∗ : S → A provided by expert demonstrator
I State dynamics: P(si+1|si, ai) simulator, typically not known to policy
Often deterministic: si+1 = T(si, ai) deterministic mapping
I Rollout: given s0, sequentially execute ai = πθ(si) and sample si+1 ∼ P(si+1|si, ai),
which yields a trajectory τ = (s0, a0, s1, a1, ...)
I Loss function: L(a∗, a) loss of action a given optimal action a∗
55
Formal Definition of Imitation Learning
General Imitation Learning:
argmin_θ E_{s∼P(s|πθ)} [L(π*(s), πθ(s))]
I State distribution P(s|πθ) depends on the rollout determined by the current policy πθ
Behavior Cloning:
argmin_θ E_{(s*,a*)∼P*} [L(a*, πθ(s*))] = argmin_θ Σ_{i=1}^{N} L(a*_i, πθ(s*_i))
I State distribution P* provided by the expert
I Reduces to a supervised learning problem
56
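Under the behavior-cloning reduction above, training is plain supervised regression on expert state-action pairs. A minimal PyTorch sketch with random stand-in data; the dimensions, network and ℓ1 loss are illustrative.

```python
import torch
import torch.nn as nn

states = torch.randn(1024, 64)   # stand-in for expert states  s*_i
actions = torch.randn(1024, 2)   # stand-in for expert actions a*_i (steer, acceleration)

policy = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for epoch in range(10):
    pred = policy(states)                         # a = pi_theta(s*)
    loss = nn.functional.l1_loss(pred, actions)   # L(a*, pi_theta(s*))
    opt.zero_grad()
    loss.backward()
    opt.step()
```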
Challenges of Behavior Cloning
I Behavior cloning makes IID assumption
I Next state is sampled from states observed during expert demonstration
I Thus, next state is sampled independently from action predicted by current policy
I What if πθ makes a mistake?
I Enters new states that haven’t been observed before
I New states not sampled from same (expert) distribution anymore
I Cannot recover, catastrophic failure in the worst case
I What can we do to overcome this train/test distribution mismatch?
57
DAgger
Data Aggregation (DAgger):
I Iteratively build a set of inputs that the final policy is likely to encounter based on
previous experience; query the expert on these states and train on the aggregated dataset
(see the toy sketch below)
I But can easily overfit to the main mode of the demonstrations
I High training variance (random initialization, order of data)
Ross, Gordon and Bagnell: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. AISTATS, 2011. 58
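A toy, fully synthetic sketch of the DAgger loop on a one-dimensional "stay at the origin" task, just to show the control flow; the expert, the dynamics and the least-squares policy fit are all made up and much simpler than anything used for driving.

```python
import numpy as np

rng = np.random.default_rng(0)
expert = lambda s: -s                            # expert pushes the state toward 0

def rollout(k, steps=20):
    """Run the linear policy a = k * s and record the visited states."""
    s, states = rng.normal(), []
    for _ in range(steps):
        states.append(s)
        s = s + k * s + 0.05 * rng.normal()      # dynamics: s' = s + a + noise
    return states

def fit(states, actions):
    """Least-squares fit of a single coefficient k in a = k * s."""
    s, a = np.array(states), np.array(actions)
    return float(s @ a / (s @ s))

S = rollout(-1.0); A = [expert(s) for s in S]    # D_0: expert demonstrations
k = fit(S, A)
for _ in range(5):                               # DAgger iterations
    S_new = rollout(k)                           # states visited by the current policy
    S += S_new; A += [expert(s) for s in S_new]  # query the expert on those states
    k = fit(S, A)                                # retrain on the aggregated dataset
print(k)                                         # -1.0, i.e., matches the expert
```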
DAgger with Critical States and Replay Buffer
Key Ideas:
1. Sample critical states from the collected on-policy data based on the
utility they provide to the learned policy in terms of driving behavior
2. Incorporate a replay buffer which progressively focuses on the high
uncertainty regions of the policy’s state distribution
Prakash, Behl, Ohn-bar, Chitta and Geiger: Exploring Data Aggregation in Policy Learning for Vision-based Urban Autonomous Driving. CVPR, 2020. 59
ALVINN: An Autonomous Land Vehicle
in a Neural Network
ALVINN: An Autonomous Land Vehicle in a Neural Network
I Fully connected 3 layer neural net
I 36k parameters
I Maps road images to turn radius
I Directions discretized (45 bins)
I Trained on simulated road images!
I Tested on unlined paths, lined city
streets and interstate highways
I 90 consecutive miles at up to 70 mph
Pomerleau: ALVINN: An Autonomous Land Vehicle in a Neural Network. NIPS, 1988. 61
ALVINN: An Autonomous Land Vehicle in a Neural Network
Pomerleau: ALVINN: An Autonomous Land Vehicle in a Neural Network. NIPS, 1988. 62
PilotNet: End-to-End Learning for Self-Driving Cars
PilotNet: System Overview
I Data augmentation by 3 cameras and virtually shifted / rotated images assuming
the world is flat (homography), adjusting the steering angle appropriately
Bojarski et al.: End-to-End Learning for Self-Driving Cars. Arxiv, 2016. 64
PilotNet: Architecture
I Convolutional network (250k param)
I Input: YUV image representation
I 1 Normalization layer
I Not learned
I 5 Convolutional Layers
I 3 strided 5x5
I 2 non-strided 3x3
I 3 Fully connected Layers
I Output: turning radius
I Trained on 72h of driving
Bojarski et al.: End-to-End Learning for Self-Driving Cars. Arxiv, 2016. 65
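A rough PyTorch sketch of an architecture matching the description above (5 convolutional layers, 3 strided 5x5 and 2 non-strided 3x3, followed by 3 fully connected layers and a single turning-radius output); the channel counts, input resolution and layer widths are plausible guesses for illustration, not the exact Bojarski et al. configuration.

```python
import torch
import torch.nn as nn

class PilotNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.ReLU(),   # 3 strided 5x5 convolutions
            nn.Conv2d(24, 36, 5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, 5, stride=2), nn.ReLU(),
            nn.Conv2d(48, 64, 3), nn.ReLU(),            # 2 non-strided 3x3 convolutions
            nn.Conv2d(64, 64, 3), nn.ReLU(),
        )
        self.head = nn.Sequential(                      # 3 fully connected layers
            nn.Flatten(),
            nn.Linear(64 * 1 * 18, 100), nn.ReLU(),
            nn.Linear(100, 50), nn.ReLU(),
            nn.Linear(50, 1),                           # output: turning radius
        )

    def forward(self, x):                               # x: 66x200 (e.g., YUV) image
        return self.head(self.features(x))

print(PilotNetSketch()(torch.randn(1, 3, 66, 200)).shape)  # torch.Size([1, 1])
```

With these sizes the model has roughly 250k parameters, in line with the figure quoted on the slide.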
PilotNet: Video
Bojarski et al.: End-to-End Learning for Self-Driving Cars. Arxiv, 2016. 66
VisualBackProp
I Central idea: find salient image regions that lead to high activations
I Forward pass, then iteratively scale-up activations
Bojarski et al.: VisualBackProp: Efficient Visualization of CNNs for Autonomous Driving. ICRA, 2018. 67
VisualBackProp
Bojarski et al.: VisualBackProp: Efficient Visualization of CNNs for Autonomous Driving. ICRA, 2018. 68
VisualBackProp
I Test if shift in salient objects affects predicted turn radius more strongly
Bojarski et al.: VisualBackProp: Efficient Visualization of CNNs for Autonomous Driving. ICRA, 2018. 69
2.4
Conditional Imitation Learning
Conditional Imitation Learning
Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 71
Conditional Imitation Learning
Idea:
I Condition controller on navigation command c ∈ {left,right,straight}
I High-level navigation command can be provided by consumer GPS, i.e.,
telling the vehicle to turn left/right or go straight at the next intersection
I This removes the task ambiguity induced by the environment
I State st: current image;  Action at: steering angle & acceleration
Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 72
Comparison to Behavior Cloning
Behavior Cloning:
I Training set: D = {(a*_i, s*_i)}, i = 1, ..., N
I Objective: argmin_θ Σ_{i=1}^{N} L(a*_i, πθ(s*_i))
I Assumption: ∃ f(·) such that ai = f(si) (often violated in practice!)

Conditional Imitation Learning:
I Training set: D = {(a*_i, s*_i, c*_i)}, i = 1, ..., N
I Objective: argmin_θ Σ_{i=1}^{N} L(a*_i, πθ(s*_i, c*_i))
I Assumption: ∃ f(·,·) such that ai = f(si, ci) (a better assumption!)
Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 73
Conditional Imitation Learning: Network Architecture
I This paper proposes two network architectures:
I (a) Extract features C(c) and concatenate with image features I(i)
I (b) Command c acts as switch between specialized submodules
I Measurements m capture additional information (here: speed of vehicle)
Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 74
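A minimal PyTorch sketch of the branched variant (b): the navigation command acts as a switch that selects one specialized head. The image CNN is replaced by a stand-in linear layer and all feature sizes are illustrative.

```python
import torch
import torch.nn as nn

class BranchedPolicy(nn.Module):
    COMMANDS = ["left", "right", "straight"]

    def __init__(self, feat_dim=512, act_dim=2):
        super().__init__()
        self.perception = nn.Sequential(nn.Linear(128, feat_dim), nn.ReLU())  # stand-in for the image CNN
        self.speed = nn.Sequential(nn.Linear(1, 64), nn.ReLU())               # measurement branch (velocity)
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Linear(feat_dim + 64, 256), nn.ReLU(), nn.Linear(256, act_dim))
            for _ in self.COMMANDS
        ])

    def forward(self, image_feat, velocity, command):
        h = torch.cat([self.perception(image_feat), self.speed(velocity)], dim=-1)
        return self.branches[self.COMMANDS.index(command)](h)   # the command selects the head

policy = BranchedPolicy()
a = policy(torch.randn(4, 128), torch.randn(4, 1), "left")
print(a.shape)   # torch.Size([4, 2]) -> steering angle and acceleration
```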
Conditional Imitation Learning: Noise Injection
I Temporally correlated noise injected into trajectories ⇒ drift (only 12 minutes)
I Record driver’s (=expert’s) corrective response ⇒ recover from drift
Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 75
CARLA Simulator
http://www.carla.org
Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 76
Conditional Imitation Learning
Codevilla, Santana, Lopez and Gaidon: Exploring the Limitations of Behavior Cloning for Autonomous Driving. ICCV, 2019. 77
Neural Attention Fields
I An MLP iteratively compresses the high-dimensional input into a compact
representation ci (not to be confused with the navigation command c) based on a BEV query location as input
I The model predicts waypoints and auxiliary semantics which aid learning
Chitta, Prakash and Geiger: Neural Attention Fields for End-to-End Autonomous Driving. ICCV, 2021. 78
Summary
Advantages of Imitation Learning:
I Easy to implement
I Cheap annotations (just driving while recording images and actions)
I Entire model trained end-to-end
I Conditioning removes ambiguity at intersections
Challenges for Imitation Learning?
I Behavior cloning uses IID assumption which is violated in practice
I Direct mapping from images to control ⇒ No long term planning
I No memory (can’t remember speed signs, etc.)
I Mapping is difficult to interpret (“black box”), despite visualization techniques
79
Self-Driving Cars
Lecture 3 – Direct Perception
Robotics, Computer Vision, System Software
BE, MS, PhD (MMMTU, IISc, IIIT-Hyderabad)
Kumar Bipin
Agenda
3.1 Direct Perception
3.2 Conditional Affordance Learning
3.3 Visual Abstractions
3.4 Driving Policy Transfer
3.5 Online vs. Offline Evaluation
2
3.1
Direct Perception
Approaches to Self-Driving
[Diagram: modular pipeline: sensory input → low-level perception → scene parsing → path planning → vehicle control → steer, gas/brake]
+ Modular + Interpretable - Expert decisions - Piece-wise training
[Diagram: end-to-end learning: sensory input → neural network (imitation / reinforcement learning) → steer, gas/brake]
+ End-to-end + Simple - Generalization - Interpretable - Data 4
Direct Perception
[Diagram: direct perception: sensory input → neural network → intermediate representations → vehicle control → steer, gas/brake]
Idea of Direct Perception:
I Hybrid model between imitation learning and modular pipelines
I Learn to predict interpretable low-dimensional intermediate representation
I Decouple perception from planning and control
I Allows to exploit classical controllers or learned controllers (or hybrids)
5
Direct Perception for Autonomous Driving
Affordances:
I Attributes of the environment which limit space of actions [Gibson, 1966]
I In this case: 13 affordances
Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 6
Overview
I TORCS Simulator: Open source car racing game simulator
I Network: AlexNet (5 conv layers, 4 fully conn. layers), 13 output neurons
I Training: Affordance indicators trained with `2 loss
Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 7
Affordance Indicators and State Machine
Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 8
Controller
Steering controller:
s = θ1(α − dc/w)
I s: steering command θ1: parameter
I α: relative orientation dc: distance to centerline w: road width
Speed controller: (“optimal velocity car following model”)
v = vmax (1 − exp (−θ2 dp − θ3))
I v: target velocity vmax maximal velocity
I dp: distance to preceding car θ2,3: parameters
Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 9
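The two controller equations above transcribed directly into Python; the parameter values θ1, θ2, θ3 and v_max are made up for illustration and are not taken from the paper.

```python
import math

def steering(alpha, d_c, w, theta1=2.0):
    """s = theta1 * (alpha - d_c / w)"""
    return theta1 * (alpha - d_c / w)

def target_speed(d_p, v_max=20.0, theta2=0.1, theta3=0.5):
    """v = v_max * (1 - exp(-theta2 * d_p - theta3))"""
    return v_max * (1.0 - math.exp(-theta2 * d_p - theta3))

print(steering(alpha=0.05, d_c=0.3, w=4.0))   # steer back toward the centerline
print(target_speed(d_p=15.0))                 # reduced speed behind a close lead vehicle
```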
TORCS Simulator
I TORCS: Open source car racing game http://torcs.sourceforge.net/
Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 10
Results
Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 11
Network Visualization
I Left: Averaged top 100 images activating a neuron in first fully connected layer
I Right: Maximal response of 4th conv. layer (note: focus on cars and markings)
Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 12
3.2
Conditional Affordance Learning
How can we transfer this idea to cities?
Conditional Affordance Learning
[Diagram: video input + directional input → neural network → affordances (e.g., relative angle = 0.01 rad, centerline distance = 0.15 m, red light = false) → controller → control commands (e.g., brake = 0.0)]
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 15
CARLA Simulator
I Goal: drive from A to B as fast, safely and comfortably as possible
I Infractions:
I Driving on wrong lane
I Driving on sidewalk
I Running a red light
I Violating speed limit
I Colliding with vehicles
I Hitting other objects
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 16
Affordances
Affordances:
I Distance to centerline
I Relative angle to road
I Distance to lead vehicle
I Speed signs
I Traffic lights
I Hazard stop
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 17
Affordances
Affordances:
I Distance to centerline
I Relative angle to road
I Distance to lead vehicle
I Speed signs
I Traffic lights
I Hazard stop
[Figure: bird's-eye view of the agent with the affordances annotated: relative angle ψ, centerline distance d in local coordinates, observation areas A1-A3 (l = 15 m), a hazard stop for a pedestrian, a speed sign (30 km/h) and a traffic light]
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 18
Overview
[Diagram: CARLA provides camera images and the agent's position; a high-level planner issues the directional command; a feature extractor with a memory of the last N frames feeds conditional and unconditional task blocks that predict the affordances; the controller converts them into longitudinal and lateral control commands for the agent]
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 19
Controller
Longitudinal Control (finite-state machine over the affordances):
if hazard_stop == True: hazard_stop
elif red_light == True: red_light
elif speed > limit - 15: over_limit
elif veh_distance < 35: following
else: cruising
I PID controller for cruising
I Car following model
Lateral Control:
I Stanley controller: δ(t) = ψ(t) + arctan(k x(t) / u(t))
I Damping term
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 20
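A small sketch of the two controller pieces above: the state machine follows the thresholds listed on the slide, while the Stanley gain k, the speed floor and the omitted damping term are simplifications.

```python
import math

def longitudinal_state(hazard_stop, red_light, speed, limit, veh_distance):
    if hazard_stop:
        return "hazard_stop"
    if red_light:
        return "red_light"
    if speed > limit - 15:
        return "over_limit"
    if veh_distance < 35:
        return "following"
    return "cruising"

def stanley_steering(psi, x, u, k=1.0, eps=1e-3):
    """delta(t) = psi(t) + arctan(k * x(t) / u(t)); psi: heading error, x: cross-track error, u: speed."""
    return psi + math.atan2(k * x, max(u, eps))

print(longitudinal_state(False, False, speed=40, limit=60, veh_distance=25))  # following
print(stanley_steering(psi=0.02, x=0.3, u=8.0))                               # small corrective steering angle
```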
Parameter Learning
Perception Stack:
I Multi-task learning: single forward pass ⇒ fast learning and inference
I Dataset: random driving using controller operating on GT affordances
⇒ 240k images with GT affordances
I Loss functions:
I Discrete affordances: Class-weighted cross-entropy (CWCE)
I Continuous affordances: Mean absolute error (MAE)
I Optimized with ADAM (batch size 32)
Controller:
I Ziegler-Nichols
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 21
Data Collection
Data Collection:
I Navigation based on true affordances & random inputs
Data Augmentation:
I No image flipping
I Color, contrast, brightness
I Gaussian blur & noise
I Provoke rear-end collisions
I Camera pose randomization
[Figure: camera pose randomization: cameras shifted laterally by d = 50 cm and rotated by angles φ1, φ2, φ3 (= 0)]
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 22
Results
Success rate (%)  | Training conditions | New weather       | New town          | New town & new weather
Task              | MP  CIL RL  CAL     | MP  CIL RL  CAL   | MP  CIL RL  CAL   | MP  CIL RL  CAL
Straight          | 98  95  89  100     | 100 98  86  100   | 92  97  74  93    | 50  80  68  94
One turn          | 82  89  34  97      | 95  90  16  96    | 61  59  12  82    | 50  48  20  72
Navigation        | 80  86  14  92      | 94  84  2   90    | 24  40  3   70    | 47  44  6   68
Nav. dynamic      | 77  83  7   83      | 89  82  2   82    | 24  38  2   64    | 44  42  4   64
Baselines:
I MP = Modular Pipeline [Dosovitskiy et al., CoRL 2017]
I CIL = Conditional Imitation Learning [Codevilla et al., ICRA 2018]
I RL = Reinforcement Learning A3C [Mnih et al., ICML 2016]
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 23
Results
Conditional Navigation
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 24
Results
Speed Signs
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 24
Results
Car Following
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 24
Results
Hazard Stop
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 24
Attention
Attention to Hazard Stop
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 25
Attention
Attention to Red Light
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 25
Path Planning
Optimal Path (green) vs. Traveled Path (red)
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 26
Failure Cases
Hazard Stop: False Positive
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 27
Failure Cases
Hazard Stop: False Negative
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 27
Failure Cases
Red Light: False Positive
Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 27
3.3
Visual Abstractions
Does Computer Vision Matter for Action?
Does Computer Vision Matter for Action?
I Analyze various intermediate representations:
segmentation, depth, normals, flow, albedo
I Intermediate representations improve results
I Consistent gains across simulations / tasks
I Depth and semantic provide largest gains
I Better generalization performance
Zhou, Krähenbühl and Koltun: Does computer vision matter for action? Science Robotics, 2019. 29
Visual Abstractions
What is a good visual abstraction?
I Invariant (hide irrelevant variations from policy)
I Universal (applicable to wide range of scenarios)
I Data efficient (in terms of memory/computation)
I Label efficient (require little manual effort)
[Figure: generalization from train to test is easier in a suitable representation space than in raw pixel space (figure credit: Alexander Sax)]
Semantic segmentation:
I Encodes task-relevant knowledge (e.g. road is drivable) and priors (e.g., grouping)
I Can be processed with standard 2D convolutional policy networks
Disadvantage:
I Labelling time: ∼90 min for 1 Cityscapes image
Zhou, Krähenbühl and Koltun: Does computer vision matter for action? Science Robotics, 2019. 30
Label Efficient Visual Abstractions
Questions:
I What is the trade-off between annotation time and driving performance?
I Can selecting specific semantic classes ease policy learning?
I Are visual abstractions trained with few images competitive?
I Is fine-grained annotation important?
I Are visual abstractions able to reduce training variance?
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 31
Label Efficient Visual Abstractions
Model:
I Visual abstraction network aψ: s ↦ r (state ↦ abstraction)
I Control policy πθ: r, c, v ↦ a (abstraction, command, velocity ↦ action)
I Composing both yields a = πθ(aψ(s)) (state ↦ action)
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 32
Label Efficient Visual Abstractions
Datasets:
I nr images annotated with semantic labels: R = {(si, ri)}, i = 1, ..., nr
I na images annotated with expert actions: A = {(si, ai)}, i = 1, ..., na
I We assume nr ≪ na
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 32
Label Efficient Visual Abstractions
Training:
I Train the visual abstraction network aψ(·) using the semantic dataset R
I Apply this network to obtain the control dataset Cψ = {(aψ(si), ai)}, i = 1, ..., na
I Train the control policy πθ(·) using the control dataset Cψ (see the sketch below)
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 32
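A tiny PyTorch sketch of this two-stage recipe with stand-in networks and random tensors; the shapes, the losses (cross-entropy for the segmentation stage, ℓ1 for the action stage) and the optimizer settings are illustrative.

```python
import torch
import torch.nn as nn

a_psi = nn.Sequential(nn.Conv2d(3, 6, 1))                        # state -> abstraction (segmentation logits)
pi_theta = nn.Sequential(nn.Flatten(), nn.Linear(6 * 8 * 8, 2))  # abstraction -> action

R = [(torch.randn(1, 3, 8, 8), torch.randint(0, 6, (1, 8, 8))) for _ in range(16)]  # {(s_i, r_i)}
A = [(torch.randn(1, 3, 8, 8), torch.randn(1, 2)) for _ in range(64)]               # {(s_i, a_i)}

opt1 = torch.optim.Adam(a_psi.parameters())
for s, r in R:                                                   # stage 1: semantic supervision
    loss = nn.functional.cross_entropy(a_psi(s), r)
    opt1.zero_grad(); loss.backward(); opt1.step()

opt2 = torch.optim.Adam(pi_theta.parameters())
for s, a in A:                                                   # stage 2: behavior cloning on abstractions
    with torch.no_grad():
        r_hat = a_psi(s)                                         # C_psi = {(a_psi(s_i), a_i)}
    loss = nn.functional.l1_loss(pi_theta(r_hat), a)
    opt2.zero_grad(); loss.backward(); opt2.step()
```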
Control Policy
Model:
I CILRS [Codevilla et al., ICCV 2019]
Input:
I Visual abstraction r
I Navigational command c
I Vehicle velocity v
Output:
I Action/control â and velocity v̂
Loss:
I L = ||a − â||1 + λ ||v − v̂||1
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 33
Visual Abstractions
Privileged Segmentation (14 classes):
I Ground-truth semantic labels for 14 classes
I Upper bound for analysis
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 34
Visual Abstractions
Privileged Segmentation (6 classes):
I Ground-truth semantic labels for 2 stuff and 4 object classes
I Upper bound for analysis
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 34
Visual Abstractions
Inferred Segmentation (14 classes):
I Segmentation model trained on 14 classes
I ResNet and Feature Pyramid Network (FPN) with segmentation head
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 34
Visual Abstractions
Inferred Segmentation (6 classes):
I Segmentation model trained on 2 stuff and 4 object classes
I ResNet and Feature Pyramid Network (FPN) with segmentation head
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 34
Visual Abstractions
Hybrid Detection and Segmentation (6 classes):
I Segmentation model trained on 2 stuff classes: road, lane marking
I Object detection trained on 4 object classes: vehicle, pedestrian, traffic light (r/g)
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 34
Evaluation
Training Town Test Town
I CARLA 0.8.4 NoCrash benchmark
I Random start and end location
I Metric: Percentage of successfully completed episodes (success rate)
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 35
Traffic Density
Empty Regular Dense
I Difficulty varies with number of dynamic agents in the scene
I Empty: 0 Agents Regular: 65 Agents Dense: 220 Agents
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 36
Identifying Most Relevant Classes (Privileged)
I 14 classes: road, lane marking, vehicle, pedestrian, green light, red light, sidewalk,
building, fence, pole, vegetation, wall, traffic sign, other
I 7 classes: road, lane marking, vehicle, pedestrian, green light, red light, sidewalk
I 6 classes: road, lane marking, vehicle, pedestrian, green light, red light
I 5 classes: road, vehicle, pedestrian, green light, red light
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 37
Identifying Most Relevant Classes (Privileged)
[Figure: driving outcome (success / collision / timeout) as a function of the number of classes (5, 6, 7, 14) for the Empty, Regular, Dense and Overall settings]
I Moving from 14 to 6 classes does not hurt driving performance (on contrary)
I Drastic performance drop when lane markings are removed
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 38
Identifying Most Relevant Classes (Privileged)
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 39
Identifying Most Relevant Classes (Inferred)
[Figure: success rates of 6-class vs. 14-class abstractions on Empty, Regular, Dense and Overall, for inferred (standard) and privileged segmentation; e.g., overall 59 vs. 50 (standard) and 67 vs. 61 (privileged)]
I Small performance drop when using inferred segmentations
I 6-class representation consistently improves upon 14-class representation
I We use the 6-class representation for all following experiments
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 40
Hybrid Representation
[Figure: success rates of the hybrid detection + segmentation representation vs. standard segmentation on Empty, Regular, Dense and Overall (roughly 82/70/25/58 vs. 89/67/22/59)]
I Performance of hybrid representation matches standard segmentation
I Annotation time (segmentation): ∼ 300 seconds per image and per class
I Annotation time (hybrid): ∼ 20 seconds per image and per class
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 41
Summary
Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 42
3.4
Driving Policy Transfer
Driving Policy Transfer
Problem:
I Driving policies learned in simulation often do not transfer well to the real world
Idea:
I Encapsulate driving policy such that it is not directly exposed to raw perceptual
input or low-level control (input: semantic segmentation, output: waypoints)
I Allows for transferring driving policy without retraining or finetuning
Müller, Dosovitskiy, Ghanem and Koltun: Driving Policy Transfer via Modularity and Abstraction. CoRL, 2018. 44
Waypoint Representation
Representation:
I Input: Semantic segmentation (per pixel “road” vs. “non-road”)
I Output: 2 waypoints (distance to vehicle, relative angle wrt. vehicle heading)
I One sufficient for steering, second one for braking before turns
Müller, Dosovitskiy, Ghanem and Koltun: Driving Policy Transfer via Modularity and Abstraction. CoRL, 2018. 45
Results
Success Rate over 25 Navigation Trials
I Driving Policy: Conditional Imitation Learning (branched)
I Control: PID controller for lateral and longitudinal control
I Results: Full method generalizes best (“+” = with data augmentation)
Müller, Dosovitskiy, Ghanem and Koltun: Driving Policy Transfer via Modularity and Abstraction. CoRL, 2018. 46
Results
Müller, Dosovitskiy, Ghanem and Koltun: Driving Policy Transfer via Modularity and Abstraction. CoRL, 2018. 47
3.5
Online vs. Offline Evaluation
Online vs. Offline Evaluation
I Online evaluation (i.e., using a real vehicle) is expensive and can be dangerous
I Offline evaluation on a pre-recorded validation dataset is cheap and easy
I Question: How predictive is offline evaluation (a) for the online task (b)?
I Empirical study using CIL on CARLA trained with MSE loss on steering angle
Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 49
Online Metrics
I Success Rate: Percentage of routes successfully completed
I Average Completion: Average fraction of distance to goal covered
I Km per Infraction: Average driven distance between 2 infractions
Remark: The current CARLA metrics Infraction Score and Driving Score are not
considered in this work from 2018, but would likely lead to similar conclusions.
Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 50
Offline Metrics
I a / â: true / predicted steering angle;  |V|: number of samples in the validation set
I v: speed;  δ(·): Kronecker delta function;  θ(·): Heaviside step function
I Q(x) ∈ {−1, 0, 1}: quantization of x into three bins (x < −σ,  −σ ≤ x < σ,  x ≥ σ)
Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 51
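A rough sketch of such offline metrics computed on steering predictions; the exact definitions and parameters (σ, relative threshold) below are illustrative assumptions and may differ from the paper:

```python
import numpy as np

def offline_metrics(a_true, a_pred, sigma=0.1, rel_thresh=0.1):
    """Illustrative offline metrics for steering predictions."""
    a_true, a_pred = np.asarray(a_true), np.asarray(a_pred)
    mse = np.mean((a_true - a_pred) ** 2)          # mean squared error
    mae = np.mean(np.abs(a_true - a_pred))         # mean absolute error

    # Quantized classification error: map angles to {-1, 0, 1} bins
    # (x < -sigma, -sigma <= x < sigma, x >= sigma) and compare the bins.
    def quantize(x):
        return np.digitize(x, [-sigma, sigma]) - 1
    q_err = np.mean(quantize(a_true) != quantize(a_pred))

    # Thresholded relative error: fraction of samples whose absolute error
    # exceeds a fraction of the ground-truth magnitude.
    rel_err = np.mean(np.abs(a_true - a_pred) >
                      rel_thresh * np.maximum(np.abs(a_true), 1e-3))
    return dict(mse=mse, mae=mae, quantized=q_err, thresholded_rel=rel_err)
```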
Results: Online vs. Online
I Generalization performance (town 2, new weather), radius = training iteration
I 45 different models varying dataset size, augmentation, architecture, etc.
I Success rate correlates well with average completion and km per infraction
Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 52
Results: Online vs. Offline
I None of the offline metrics correlates well with online performance; mean squared error (MSE) performs worst
I Absolute steering error improves the correlation; weighting errors by speed is not important
Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 53
Results: Online vs. Offline
I Accumulating the error over time does not improve the correlation
I Quantized classification and thresholded relative error perform best
Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 54
Case Study
I Model 1: Trained with a single camera and ℓ2 loss (= bad model)
I Model 2: Trained with three cameras and ℓ1 loss (= good model)
I Predictions of both models are noisy, but Model 1 occasionally predicts very large
errors that lead to crashes, even though the average prediction error of both models is similar
Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 55
Case Study
I Model 1 crashes in every trial, but Model 2 drives successfully
I Illustrates the difficulty of using offline metrics for predicting online behavior
Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 56
Summary
I Direct perception predicts intermediate representations
I Low-dimensional affordances or classic computer vision representations (e.g.,
semantic segmentation, depth) can be used as intermediate representations
I Decouples perception from planning and control
I Hybrid model between imitation learning and modular pipelines
I Direct methods are more interpretable as the representation can be inspected
I Effective visual abstractions can be learned using limited supervision
I Planning can also be decoupled from control for better transfer
I Offline metrics are not necessarily indicative of online driving performance
57
Self-Driving Cars
Lecture 4 – Reinforcement Learning
Robotics, Computer Vision, System Software
BE, MS, PhD (MMMTU, IISc, IIIT-Hyderabad)
Kumar Bipin
Agenda
4.1 Markov Decision Processes
4.2 Bellman Optimality and Q-Learning
4.3 Deep Q-Learning
2
4.1
Markov Decision Processes
Reinforcement Learning
So far:
I Supervised learning, lots of expert demonstrations required
I Use of auxiliary, short-term loss functions
I Imitation learning: per-frame loss on action
I Direct perception: per-frame loss on affordance indicators
Now:
I Learning of models based on the loss that we actually care about, e.g.:
I Minimize time to target location
I Minimize number of collisions
I Minimize risk
I Maximize comfort
I etc.
Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 4
Types of Learning
Supervised Learning:
I Dataset: {(xi, yi)} (xi = data, yi = label) Goal: Learn mapping x ↦ y
I Examples: Classification, regression, imitation learning, affordance learning, etc.
Unsupervised Learning:
I Dataset: {(xi)} (xi = data) Goal: Discover structure underlying data
I Examples: Clustering, dimensionality reduction, feature learning, etc.
Reinforcement Learning:
I Agent interacting with environment which provides numeric reward signals
I Goal: Learn how to take actions in order to maximize reward
I Examples: Learning of manipulation or control tasks (everything that interacts)
Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 5
Introduction to Reinforcement Learning
[Figure: agent-environment interaction loop with state st, action at, reward rt and next state st+1]
I Agent observes environment state st at time t
I Agent sends action at at time t to the environment
I Environment returns the reward rt and its new state st+1 to the agent
Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 6
Introduction to Reinforcement Learning
I Goal: Select actions to maximize total future reward
I Actions may have long term consequences
I Reward may be delayed, not instantaneous
I It may be better to sacrifice immediate reward to gain more long-term reward
I Examples:
I Financial investment (may take months to mature)
I Refuelling a helicopter (might prevent crash in several hours)
I Sacrificing a chess piece (might help winning chances in the future)
Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 7
Example: Cart Pole Balancing
I Objective: Balance pole on moving cart
I State: Angle, angular vel., position, vel.
I Action: Horizontal force applied to cart
I Reward: 1 if pole is upright at time t
https://gym.openai.com/envs/#classic_control 8
Example: Robot Locomotion
http://blog.openai.com/roboschool/
I Objective: Make robot move forward
I State: Position and angle of joints
I Action: Torques applied on joints
I Reward: 1 if upright and moving forward
https://gym.openai.com/envs/#mujoco 9
Example: Atari Games
http://blog.openai.com/gym-retro/
I Objective: Maximize game score
I State: Raw pixels of screen (210x160)
I Action: Left, right, up, down
I Reward: Score increase/decrease at t
https://gym.openai.com/envs/#atari 10
Example: Go
www.deepmind.com/research/alphago/
I Objective: Winning the game
I State: Position of all pieces
I Action: Location of next piece
I Reward: 1 if game won, 0 otherwise
www.deepmind.com/research/alphago/ 11
Example: Self-Driving
I Objective: Lane Following
I State: Image (96x96)
I Action: Acceleration, Steering
I Reward: small negative reward per frame, positive reward for each new track tile visited
https://gym.openai.com/envs/CarRacing-v0/ 12
Reinforcement Learning: Overview
[Figure: agent-environment interaction loop with state st, action at, reward rt and next state st+1]
I How can we mathematically formalize the RL problem?
13
Markov Decision Process
Markov Decision Process (MDP) models the environment and is defined by the tuple
(S, A, R, P, γ)
with
I S : set of possible states
I A: set of possible actions
I R(rt|st, at): distribution of current reward given (state,action) pair
I P(st+1|st, at): distribution over next state given (state,action) pair
I γ: discount factor (determines value of future rewards)
Almost all reinforcement learning problems can be formalized as MDPs
14
Markov Decision Process
Markov property: Current state completely characterizes state of the world
I A state st is Markov if and only if
P(st+1|st) = P(st+1|s1, ..., st)
I ”The future is independent of the past given the present”
I The state captures all relevant information from the history
I Once the state is known, the history may be thrown away
I The state is a sufficient statistic of the history for predicting the future
15
Markov Decision Process
Reinforcement learning loop:
I At time t = 0:
I Environment samples initial state s0 ∼ P(s0)
I Then, for t = 0 until done:
I Agent selects action at
I Environment samples reward rt ∼ R(rt|st, at)
I Environment samples next state st+1 ∼ P(st+1|st, at)
I Agent receives reward rt and next state st+1
[Figure: agent-environment interaction loop with state st, action at, reward rt and next state st+1]
How do we select an action?
16
Policy
A policy π is a function from S to A that specifies what action to take in each state:
I A policy fully defines the behavior of an agent
I Deterministic policy: a = π(s)
I Stochastic policy: π(a|s) = P(at = a|st = s)
Remark:
I MDP policies depend only on the current state and not the entire history
I However, the current state may include past observations
17
Policy
How do we learn a policy?
Imitation Learning: Learn a policy from expert demonstrations
I Expert demonstrations are provided
I Supervised learning problem
Reinforcement Learning: Learn a policy through trial-and-error
I No expert demonstrations given
I Agent discovers itself which actions maximize the expected future reward
I The agent interacts with the environment and obtains reward
I The agent discovers good actions and improves its policy π
18
Exploration vs. Exploitation
How do we discover good actions?
Answer: We need to explore the state/action space. Thus RL combines two tasks:
I Exploration: Try a novel action a in state s , observe reward rt
I Discovers more information about the environment, but sacrifices total reward
I Game-playing example: Play a novel experimental move
I Exploitation: Use a previously discovered good action a
I Exploits known information to maximize reward, but sacrifices unexplored areas
I Game-playing example: Play the move you believe is best
Trade-off: It is important to explore and exploit simultaneously
19
Exploration vs. Exploitation
How to balance exploration and exploitation?
ε-greedy exploration algorithm:
I Try all possible actions with non-zero probability
I With probability ε choose an action at random (exploration)
I With probability 1 − ε choose the best action (exploitation)
I Greedy action is defined as best action which was discovered so far
I ε is large initially and gradually annealed (= reduced) over time
20
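A minimal sketch of ε-greedy action selection with linear annealing, assuming a tabular action-value array indexed as Q[state][action]; the schedule constants are illustrative:

```python
import numpy as np

def epsilon_greedy(Q, state, n_actions, eps):
    """Pick a random action with probability eps, otherwise the greedy one."""
    if np.random.rand() < eps:
        return np.random.randint(n_actions)      # exploration
    return int(np.argmax(Q[state]))              # exploitation

def annealed_epsilon(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linearly anneal eps from eps_start to eps_end over decay_steps."""
    frac = min(1.0, step / decay_steps)
    return eps_start + frac * (eps_end - eps_start)
```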
4.2
Bellman Optimality and Q-Learning
Value Functions
How good is a state?
The state-value function V π at state st is the expected cumulative discounted reward
(rt ∼ R(rt|st, at)) when following policy π from state st:
V π(st) = E[ rt + γ rt+1 + γ² rt+2 + ... | st, π ] = E[ Σ_{k≥0} γ^k rt+k | st, π ]
I The discount factor γ < 1 is the value of future rewards at current time t
I Weights immediate reward higher than future reward
  (e.g., γ = 1/2 ⇒ γ^k = 1, 1/2, 1/4, 1/8, 1/16, ...)
I Determines agent’s far/short-sightedness
I Avoids infinite returns in cyclic Markov processes
22
Value Functions
How good is a state-action pair?
The action-value function Qπ at state st and action at is the expected cumulative
discounted reward when taking action at in state st and then following the policy π:
Qπ(st, at) = E[ Σ_{k≥0} γ^k rt+k | st, at, π ]
I The discount factor γ ∈ [0, 1] is the value of future rewards at current time t
I Weights immediate reward higher than future reward
  (e.g., γ = 1/2 ⇒ γ^k = 1, 1/2, 1/4, 1/8, 1/16, ...)
I Determines agent’s far/short-sightedness
I Avoids infinite returns in cyclic Markov processes
23
Optimal Value Functions
The optimal state-value function V ∗(st) is the best V π(st) over all policies π:
V ∗(st) = max_π V π(st),   where V π(st) = E[ Σ_{k≥0} γ^k rt+k | st, π ]

The optimal action-value function Q∗(st, at) is the best Qπ(st, at) over all policies π:

Q∗(st, at) = max_π Qπ(st, at),   where Qπ(st, at) = E[ Σ_{k≥0} γ^k rt+k | st, at, π ]
I The optimal value functions specify the best possible performance in the MDP
I However, searching over all possible policies π is computationally intractable
24
Optimal Policy
If Q∗(st, at) would be known, what would be the optimal policy?
π∗(st) = argmax_{a′∈A} Q∗(st, a′)
I Unfortunately, searching over all possible policies π is intractable in most cases
I Thus, determining Q∗(st, at) is hard in general (for most interesting problems)
I Let’s have a look at a simple example where the optimal policy is easy to compute
25
A Simple Grid World Example
actions = { 1. right, 2. left, 3. up, 4. down }
reward: r = −1 for each transition
[Figure: grid world of states; the two terminal states are marked with '?']
Objective: Reach one of terminal states (marked with ’?’) in least number of actions
I Penalty (negative reward) given for every transition made
26
A Simple Grid World Example
[Figure: arrows visualizing the random policy and the optimal policy for each state of the grid world]
I The arrows indicate equal probability of moving into each of the directions
27
Solving for the Optimal Policy
Bellman Optimality Equation
I The Bellman Optimality Equation is
named after Richard Ernest Bellman who
introduced dynamic programming in 1953
I Almost any problem which can be solved
using optimal control theory can be solved
via the appropriate Bellman equation
Richard Ernest Bellman
Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 29
Bellman Optimality Equation
The Bellman Optimality Equation (BOE) decomposes Q∗ as follows:
Q∗(st, at) = E[ rt + γ rt+1 + γ² rt+2 + ... | st, at ]
     (BOE) = E[ rt + γ max_{a′∈A} Q∗(st+1, a′) | st, at ]

This recursive formulation comprises two parts:
I Current reward: rt
I Discounted optimal action-value of the successor state: γ max_{a′∈A} Q∗(st+1, a′)
We want to determine Q∗(st, at). How can we solve the BOE?
I The BOE is non-linear (because of max-operator) ⇒ no closed form solution
I Several iterative methods have been proposed, most popular: Q-Learning
Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 30
Proof of the Bellman Optimality Equation
Proof of the Bellman Optimality Equation for the optimal action-value function Q∗:
Q∗(st, at) = E[ rt + γ rt+1 + γ² rt+2 + ... | st, at ]
           = E[ Σ_{k≥0} γ^k rt+k | st, at ]
           = E[ rt + γ Σ_{k≥0} γ^k rt+k+1 | st, at ]
           = E[ rt + γ V ∗(st+1) | st, at ]
           = E[ rt + γ max_{a′} Q∗(st+1, a′) | st, at ]
Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 31
Bellman Optimality Equation
Why is it useful to solve the BOE?
I A greedy policy which chooses the action that maximizes the optimal
action-value function Q∗ or the optimal state-value function V ∗ takes
into account the reward consequences of all possible future behavior
I Via Q∗ and V ∗ the optimal expected long-term return is turned into a quantity
that is locally and immediately available for each state / state-action pair
I For V ∗, a one-step-ahead search yields the optimal actions
I Q∗ effectively caches the results of all one-step-ahead searches
Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 32
Q-Learning
Q-Learning: Iteratively solve for Q∗
Q∗(st, at) = E[ rt + γ max_{a′∈A} Q∗(st+1, a′) | st, at ]

by constructing an update sequence Q1, Q2, ... using learning rate α:

Qi+1(st, at) ← (1 − α) Qi(st, at) + α (rt + γ max_{a′∈A} Qi(st+1, a′))
             = Qi(st, at) + α (target − prediction)

with target = rt + γ max_{a′∈A} Qi(st+1, a′) and prediction = Qi(st, at);
the difference (target − prediction) is the temporal difference (TD) error.
I Qi will converge to Q∗ as i → ∞. Note: the policy π is learned implicitly via the Q table!
Watkins and Dayan: Technical Note Q-Learning. Machine Learning, 1992. 33
Q-Learning
Implementation:
I Initialize Q table and initial state s0 randomly
I Repeat:
I Observe state st, choose action at according to ε-greedy strategy
(Q-Learning is “off-policy” as the updated policy is different from the behavior policy)
I Observe reward rt and next state st+1
I Compute TD error: rt + γ max_{a′∈A} Qi(st+1, a′) − Qi(st, at)
I Update Q table
What’s the problem with using Q tables?
I Scalability: Tables don’t scale to high dimensional state/action spaces (e.g., GO)
I Solution: Use a function approximator (neural network) to represent Q(s, a)
Watkins and Dayan: Technical Note Q-Learning. Machine Learning, 1992. 34
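A compact sketch of tabular Q-learning with an ε-greedy behavior policy; it assumes a hypothetical gym-like environment whose step() returns (next_state, reward, done), and all hyperparameters are illustrative:

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning; env.reset() -> state, env.step(a) -> (s', r, done)."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # eps-greedy action selection (off-policy behavior)
            a = np.random.randint(n_actions) if np.random.rand() < eps \
                else int(np.argmax(Q[s]))
            s_next, r, done = env.step(a)
            # TD target: r + gamma * max_a' Q[s', a'] (zero if terminal)
            target = r + (0.0 if done else gamma * np.max(Q[s_next]))
            Q[s, a] += alpha * (target - Q[s, a])   # TD-error update
            s = s_next
    return Q
```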
4.3
Deep Q-Learning
Deep Q-Learning
Use a deep neural network with weights θ as function approximator to estimate Q:
Q(s, a; θ) ≈ Q∗(s, a)

[Figure: two network variants. Left: inputs s and a, output Q(s, a; θ). Right: input s only, outputs Q(s, a1; θ), ..., Q(s, am; θ) for all m actions]
Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 36
Training the Q Network
Forward Pass:
Loss function is the mean-squared error in Q-values:
L(θ) = E[ ( rt + γ max_{a′} Q(st+1, a′; θ) − Q(st, at; θ) )² ]
(the first term inside the square is the target, the second term is the prediction)
Backward Pass:
Gradient update with respect to Q-function parameters θ:
∇θ L(θ) = ∇θ E[ ( rt + γ max_{a′} Q(st+1, a′; θ) − Q(st, at; θ) )² ]
Optimize objective end-to-end with stochastic gradient descent (SGD) using ∇θL(θ).
Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 37
Experience Replay
To speed-up training we like to train on mini-batches:
I Problem: Learning from consecutive samples is inefficient
I Reason: Strong correlations between consecutive samples
Experience replay stores agent’s experiences at each time-step
I Continually update a replay memory D with new experiences et = (st, at, rt, st+1)
I Train on samples (st, at, rt, st+1) ∼ U(D) drawn uniformly at random from D
I Breaks correlations between samples
I Improves data efficiency as each sample can be used multiple times
In practice, a circular replay memory of finite memory size is used.
Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 38
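A minimal sketch of a circular replay memory with uniform sampling; the capacity and interface are illustrative choices:

```python
import random
from collections import deque

class ReplayMemory:
    """Circular replay memory of finite size, sampled uniformly at random."""
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)   # oldest experiences are dropped

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)   # breaks correlations
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```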
Fixed Q Targets
Problem: Non-stationary targets
I As the policy changes, so do our targets: rt + γ max_{a′} Q(st+1, a′; θ)
I This may lead to oscillation or divergence
Solution: Use fixed Q targets to stabilize training
I A target network Q with weights θ− is used to generate the targets:
L(θ) = E_{(st,at,rt,st+1)∼U(D)}[ ( rt + γ max_{a′} Q(st+1, a′; θ−) − Q(st, at; θ) )² ]
I Target network Q is only updated every C steps by cloning the Q-network
I Effect: Reduces oscillation of the policy by adding a delay
Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 39
Putting it together
Deep Q-Learning using experience replay and fixed Q targets:
I Take action at according to -greedy policy
I Store transition (st, at, rt, st+1) in replay memory D
I Sample random mini-batch of transitions (st, at, rt, st+1) from D
I Compute Q targets using old parameters θ−
I Optimize MSE between Q targets and Q network predictions
L(θ) = E_{(st,at,rt,st+1)∼D}[ ( rt + γ max_{a′} Q(st+1, a′; θ−) − Q(st, at; θ) )² ]
using stochastic gradient descent.
Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 40
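A sketch of one gradient step with fixed Q targets, using PyTorch; the batch format, helper names (dqn_update, sync_target) and hyperparameters are our own assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    """One gradient step on the DQN loss with fixed Q targets.

    `batch` holds tensors (states, actions, rewards, next_states, dones);
    `target_net` is a periodically synced copy of `q_net`.
    """
    states, actions, rewards, next_states, dones = batch

    # Q(s_t, a_t; theta): gather the value of the action actually taken.
    q_pred = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Target r_t + gamma * max_a' Q(s_{t+1}, a'; theta^-), no gradient.
    with torch.no_grad():
        q_next = target_net(next_states).max(dim=1).values
        q_target = rewards + gamma * (1.0 - dones) * q_next

    loss = F.mse_loss(q_pred, q_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def sync_target(q_net, target_net):
    """Clone the online network into the target network every C steps."""
    target_net.load_state_dict(q_net.state_dict())
```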
Case Study: Playing Atari Games
[Figure: agent-environment loop for Atari; the agent receives game frames and score changes and sends joystick actions]
Objective: Complete the game with the highest score
Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 41
Case Study: Playing Atari Games
Q(s, a; θ): Neural network with weights θ
FC-Out (Q values)
FC-256
32 4x4 conv, stride 2
16 8x8 conv, stride 4
[Figure: stack of 4 input frames]
Input: 84 × 84 × 4 stack of last 4 frames
(after grayscale conversion, downsampling, cropping)
Output: Q values for all (4 to 18) Atari actions
(efficient: single forward pass computes Q for all actions)
Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 42
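A sketch of the Q-network described on this slide, written in PyTorch; the layer sizes follow the slide, while preprocessing and initialization details are omitted:

```python
import torch.nn as nn

class AtariQNetwork(nn.Module):
    """Q-network: two conv layers + FC-256, input is 4 stacked 84x84 frames."""
    def __init__(self, n_actions):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 16, kernel_size=8, stride=4), nn.ReLU(),   # -> 16 x 20 x 20
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),  # -> 32 x 9 x 9
            nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
            nn.Linear(256, n_actions),   # one Q value per action in a single pass
        )

    def forward(self, x):                # x: (batch, 4, 84, 84)
        return self.head(self.features(x))
```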
Case Study: Playing Atari Games
Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 43
Deep Q-Learning Shortcomings
Deep Q-Learning suffers from several shortcomings:
I Long training times
I Uniform sampling from replay buffer ⇒ all transitions equally important
I Simplistic exploration strategy
I Action space is limited to a discrete set of actions
(otherwise, expensive test-time optimization required)
Various improvements over the original algorithm have been explored.
44
Deep Deterministic Policy Gradients
DDPG addresses the problem of continuous action spaces.
Problem: Finding a continuous action requires optimization at every timestep.
Solution: Use two networks, an actor (deterministic policy) and a critic.
[Figure: actor network µ(s; θµ) maps state s to an action; critic network Q(s, a; θQ) evaluates state s together with the action a = µ(s; θµ)]
Lillicrap et al.: Continuous Control with Deep Reinforcement Learning. ICLR, 2016. 45
Deep Deterministic Policy Gradients
I Actor network with weights θµ estimates agent’s deterministic policy µ(s; θµ)
I Update deterministic policy µ(·) in direction that most improves Q
I Apply chain rule to the expected return (this is the policy gradient):
∇θµ E_{(st,at,rt,st+1)∼D}[ Q(st, µ(st; θµ); θQ) ] = E[ ∇at Q(st, at; θQ) · ∇θµ µ(st; θµ) ]
I Critic estimates value of current policy Q(s, a; θQ)
I Learned using the Bellman Optimality Equation as in Q Learning:
∇θQ E_{(st,at,rt,st+1)∼D}[ ( rt + γ Q(st+1, µ(st+1; θµ−); θQ−) − Q(st, at; θQ) )² ]
I Remark: No maximization over actions required as this step is now learned via µ(·)
Lillicrap et al.: Continuous Control with Deep Reinforcement Learning. ICLR, 2016. 46
Deep Deterministic Policy Gradients
Experience replay and target networks are again used to stabilize training:
I Replay memory D stores transition tuples (st, at, rt, st+1)
I Target networks are updated using “soft” target updates
I Weights are not directly copied but slowly adapted:
θQ− ← τ θQ + (1 − τ) θQ−
θµ− ← τ θµ + (1 − τ) θµ−
where 0 < τ < 1 controls the trade-off between speed and stability of learning
Exploration is performed by adding noise N to the deterministic policy output:
µ(s; θµ) + N
Lillicrap et al.: Continuous Control with Deep Reinforcement Learning. ICLR, 2016. 47
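A sketch of the soft target update and noisy exploration, assuming PyTorch modules for actor and critic; Gaussian noise is used here for simplicity, whereas the original DDPG paper uses Ornstein-Uhlenbeck noise:

```python
import torch

def soft_update(net, target_net, tau=0.005):
    """theta_target <- tau * theta + (1 - tau) * theta_target."""
    for p, p_targ in zip(net.parameters(), target_net.parameters()):
        p_targ.data.mul_(1.0 - tau)
        p_targ.data.add_(tau * p.data)

def noisy_action(actor, state, noise_std=0.1, low=-1.0, high=1.0):
    """Exploration: add noise N to the deterministic policy output."""
    with torch.no_grad():
        a = actor(state)
    a = a + noise_std * torch.randn_like(a)
    return a.clamp(low, high)
```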
Prioritized Experience Replay
Prioritize experience to replay important transitions more frequently
I Priority δ is measured by magnitude of temporal difference (TD) error:
δ = rt + γ max_{a′} Q(st+1, a′; θQ−) − Q(st, at; θQ)
I TD error measures how “surprising” or unexpected the transition is
I Stochastic prioritization avoids overfitting due to lack of diversity
I Enables learning speed-up by a factor of 2 on Atari benchmarks
Schaul et al.: Prioritized Experience Replay. ICLR, 2016. 48
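A sketch of the proportional prioritization variant (sampling probabilities derived from TD-error magnitudes); the exponent α is an illustrative value and the importance-sampling correction weights from the paper are omitted:

```python
import numpy as np

def sampling_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: p_i proportional to (|delta_i| + eps)^alpha."""
    priorities = (np.abs(td_errors) + eps) ** alpha
    return priorities / priorities.sum()

# Transitions with large TD error are replayed more often.
probs = sampling_probabilities(np.array([0.01, 0.5, 2.0, 0.1]))
idx = np.random.choice(len(probs), size=2, p=probs, replace=False)
```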
Learning to Drive in a Day
Real-world RL demo by Wayve:
I Deep Deterministic Policy Gradients
with Prioritized Experience Replay
I Input: Single monocular image
I Action: Steering and speed
I Reward: Distance traveled without
the safety driver taking control
(requires no maps / localization)
I 4 Conv layers, 2 FC layers
I Only 35 training episodes
Kendall, Hawke, Janz, Mazur, Reda, Allen, Lam, Bewley and Shah: Learning to Drive in a Day. ICRA, 2019. 49
Learning to Drive in a Day
Kendall, Hawke, Janz, Mazur, Reda, Allen, Lam, Bewley and Shah: Learning to Drive in a Day. ICRA, 2019. 50
Other flavors of Deep RL
Asynchronous Deep Reinforcement Learning
Execute multiple agents in separate environment instances:
I Each agent interacts with its own environment copy and collects experience
I Agents may use different exploration policies to maximize experience diversity
I Experience is not stored but directly used to update a shared global model
I Stabilizes training in similar way to experience replay by decorrelating samples
I Leads to reduction in training time roughly linear in the number of parallel agents
Mnih et al.: Asynchronous Methods for Deep Reinforcement Learning. ICML, 2016. 52
Bootstrapped DQN
Bootstrapping for efficient exploration:
I Approximate a distribution over Q values via K bootstrapped ”heads”
I At the start of each epoch, a single head Qk is selected uniformly at random
I After training, all heads can be combined into a single ensemble policy
[Figure: shared network trunk θshared on state s, with K bootstrapped heads Q1, ..., QK parameterized by θQ1, ..., θQK]
Osband et al.: Deep Exploration via Bootstrapped DQN. NIPS, 2016. 53
Double Q-Learning
Double Q-Learning
I Decouple Q function for selection and evaluation of actions
to avoid Q overestimation and stabilize training. Target:
DQN:        rt + γ max_{a′} Q(st+1, a′; θ−)
Double DQN: rt + γ Q(st+1, argmax_{a′} Q(st+1, a′; θ); θ−)

I Online network with weights θ is used to determine the greedy policy
I Target network with weights θ− is used to determine the corresponding action value
I Improves performance on Atari benchmarks
van Hasselt et al.: Deep Reinforcement Learning with Double Q-learning. AAAI, 2016. 54
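A sketch of the Double DQN target computation in PyTorch (the batch format and discount factor are assumptions):

```python
import torch

def double_dqn_target(q_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Double DQN: the online network selects the action,
    the target network evaluates it."""
    with torch.no_grad():
        best_a = q_net(next_states).argmax(dim=1, keepdim=True)        # selection
        q_eval = target_net(next_states).gather(1, best_a).squeeze(1)  # evaluation
        return rewards + gamma * (1.0 - dones) * q_eval
```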
Deep Recurrent Q-Learning
Add recurrency to a deep Q-network to handle partial observability of states:
FC-Out (Q-values)
LSTM
32 4x4 conv, stride 2
16 8x8 conv, stride 4
[Figure: sequence of input frames]
Replace fully-connected layer with recurrent LSTM layer
Hausknecht and Stone: Deep Recurrent Q-Learning for Partially Observable MDPs. AAAI, 2015 55
Faulty Reward Functions
https://blog.openai.com/faulty-reward-functions/ 56
Summary
I Reinforcement learning learns through interaction with the environment
I The environment is typically modeled as a Markov Decision Process
I The goal of RL is to maximize the expected future reward
I Reinforcement learning requires trading off exploration and exploitation
I Q-Learning iteratively solves for the optimal action-value function
I The policy is learned implicitly via the Q table
I Deep Q-Learning scales to continuous/high-dimensional state spaces
I Deep Deterministic Policy Gradients scales to continuous action spaces
I Experience replay and target networks are necessary to stabilize training
57
Self-Driving Cars
Lecture 5 – Vehicle Dynamics
Robotics, Computer Vision, System Software
BE, MS, PhD (MMMTU, IISc, IIIT-Hyderabad)
Kumar Bipin
Agenda
5.1 Introduction
5.2 Kinematic Bicycle Model
5.3 Tire Models
5.4 Dynamic Bicycle Model
2
5.1
Introduction
Electronic Stability Program
Knowledge of vehicle dynamics enables accurate vehicle control 5
Kinematics vs. Kinetics
Kinematics:
I Greek origin: “motion”, “moving”
I Describes motion of points and bodies
I Considers position, velocity, acceleration, ..
I Examples: Celestial bodies, particle
systems, robotic arm, human skeleton
Kinetics:
I Describes causes of motion
I Effects of forces/moments
I Newton’s laws, e.g., F = ma
6
Holonomic Constraints
Holonomic constraints are constraints on the configuration:
I Assume a particle in three dimensions (x, y, z) ∈ R3
I We can constrain the particle to the x/y plane via:
z = 0
⇔ f(x, y, z) = 0 with f(x, y, z) = z
x/y plane
I Constraints of the form f(x, y, z) = 0 are called holonomic constraints
I They constrain the configuration space
I But the system can move freely in that space
I Controllable degrees of freedom equal total degrees of freedom (2)
7
Non-Holonomic Constraints
Non-Holonomic constraints are constraints on the velocity:
I Assume a vehicle that is parameterized by (x, y, ψ) ∈ R2 × [0, 2π]
I The 2D vehicle velocity is given by:
ẋ = v cos(ψ)
ẏ = v sin(ψ)
⇒ ẋ sin(ψ) − ẏ cos(ψ) = 0
I This non-holonomic constraint cannot be expressed in the form f(x, y, ψ) = 0
I The car cannot freely move in any direction (e.g., sideways)
I It constrains the velocity space, but not the configuration space
I Controllable degrees of freedom less than total degrees of freedom (2 vs. 3)
8
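A quick numerical check of the constraint above: for the kinematic car, the velocity (ẋ, ẏ) = (v cos ψ, v sin ψ) satisfies ẋ sin ψ − ẏ cos ψ = 0 for any heading and speed:

```python
import numpy as np

# The non-holonomic constraint x_dot * sin(psi) - y_dot * cos(psi) = 0
# (no sideways motion) holds for every heading psi.
for psi in np.linspace(0.0, 2 * np.pi, 8):
    v = 5.0
    x_dot, y_dot = v * np.cos(psi), v * np.sin(psi)
    assert abs(x_dot * np.sin(psi) - y_dot * np.cos(psi)) < 1e-12
```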
Holonomic vs. Non-Holonomic Systems
Holonomic Systems
I Constrain configuration space
I Can freely move in any direction
I Controllable degrees of freedom
equal to total degrees of freedom
I Constraints can be described by
f(x1, . . . , xN ) = 0
Example:
3D Particle
z = 0
x/y plane
Nonholonomic Systems
I Constrain velocity space
I Cannot freely move in any direction
I Controllable degrees of freedom less
than total degrees of freedom
I Constraints cannot be described by
f(x1, . . . , xN ) = 0
Example:
Car
ẋ sin(ψ) − ẏ cos(ψ) = 0
9
Holonomic vs. Non-Holonomic Systems
I A robot can be subject to both holonomic and non-holonomic constraints
I A car (rigid body in 3D) is kept on the ground by 3 holonomic constraints
I One additional non-holonomic constraint prevents sideways sliding
10
Coordinate Systems
[Figure: inertial frame, vehicle frame attached at the vehicle reference point, and horizontal frame with its horizontal plane]
I Inertial Frame: Fixed to earth with vertical Z-axis and X/Y horizontal plane
I Vehicle Frame: Attached to vehicle at fixed reference point; xv points towards
the front, yv to the side and zv to the top of the vehicle (ISO 8855)
I Horizontal Frame: Origin at vehicle reference point (like vehicle frame) but x-
and y-axes are projections of xv- and yv-axes onto the X/Y horizontal plane
11
Kinematics of a Point
The position rP (t) ∈ R3 of point P at time t ∈ R is given by 3 coordinates.
Velocity and acceleration are the first and second derivatives of the position rP (t).
rP(t) = [x(t), y(t), z(t)]ᵀ      vP(t) = ṙP(t) = [ẋ(t), ẏ(t), ż(t)]ᵀ      aP(t) = r̈P(t) = [ẍ(t), ÿ(t), z̈(t)]ᵀ

[Figure: trajectory of point P]
12
Kinematics of a Rigid Body
A rigid body refers to a collection of infinitely many infinitesimally small mass points
which are rigidly connected, i.e., their relative position remains unchanged over time.
Its motion can be compactly described by the motion of an (arbitrary) reference point
C of the body plus the relative motion of all other points P with respect to C.
I C: Reference point fixed to rigid body
I P: Arbitrary point on rigid body
I ω: Angular velocity of rigid body
I Position: rP = rC + rCP
I Velocity: vP = vC + ω × rCP
I Due to rigidity, points P can only rotate wrt. C
I Thus a rigid body has 6 DoF (3 pos., 3 rot.)
13
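A small numerical example of vP = vC + ω × rCP; the values are chosen for illustration, and for these particular numbers P happens to have zero velocity, i.e., it is the instantaneous center of rotation discussed on the next slide:

```python
import numpy as np

# Velocity of a point P on a rigid body: v_P = v_C + omega x r_CP.
v_C   = np.array([1.0, 0.0, 0.0])        # velocity of reference point C (m/s)
omega = np.array([0.0, 0.0, 0.5])        # angular velocity (rad/s, about z)
r_CP  = np.array([0.0, 2.0, 0.0])        # position of P relative to C (m)

v_P = v_C + np.cross(omega, r_CP)        # cross product gives [-1, 0, 0]
print(v_P)                               # [0. 0. 0.]: P is the rotation center
```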
Instantaneous Center of Rotation
At each time instance t ∈ R, there exists a particular reference point O (called the
instantaneous center of rotation) for which vO(t) = 0. Each point P of the rigid body
performs a pure rotation about O:
vP = vO + ω × rOP = ω × rOP
Example 1: Turning Wheel
I Wheel is completely lifted off the ground
I Wheel does not move in x or y direction
I Ang. vel. vector ω points into x/y plane
I Velocity of point P: vP = ωR with radius R
14
Instantaneous Center of Rotation
At each time instance t ∈ R, there exists a particular reference point O (called the
instantaneous center of rotation) for which vO(t) = 0. Each point P of the rigid body
performs a pure rotation about O:
vP = vO + ω × rOP = ω × rOP
Example 2: Rolling Wheel
I Wheel is rolling on the ground without slip
I Ground is fixed in x/y plane
I Ang. vel. vector ω points into x/y plane
I Velocity of point P: vP = 2ωR with radius R
14
5.2
Kinematic Bicycle Model
Rigid Body Motion
Rotation Center
I Different points on the rigid body move along different circular trajectories
16
Kinematic Bicycle Model
I The kinematic bicycle model approximates the 4 wheels with 2 imaginary wheels
17
Kinematic Bicycle Model
[Figure: bicycle model with labels: rotation center, wheelbase, vehicle/front-wheel/back-wheel velocities, slip angle, front and back steering angles, center of gravity, heading angle, course angle, turning radius]
Assumptions:
- Planar motion (no roll, no pitch)
- Low speed = No wheel slip
(wheel orientation = wheel velocity)
I The kinematic bicycle model approximates the 4 wheels with 2 imaginary wheels
17
Kinematic Bicycle Model
[Figure: bicycle model and rotation center]
Motion Equations:
Ẋ = v cos(ψ + β)
Ẏ = v sin(ψ + β)
ψ̇ = (v cos(β) / (ℓf + ℓr)) · (tan(δf) − tan(δr))
β = tan⁻¹( (ℓf tan(δr) + ℓr tan(δf)) / (ℓf + ℓr) )
(proof as exercise)
18
Kinematic Bicycle Model
[Figure: bicycle model and rotation center]
Motion Equations (only front steering):
Ẋ = v cos(ψ + β)
Ẏ = v sin(ψ + β)
ψ̇ = (v cos(β) / (ℓf + ℓr)) · tan(δ)
β = tan⁻¹( ℓr tan(δ) / (ℓf + ℓr) )

Derivation:
tan(δ) = (ℓf + ℓr) / R0  ⇒  1/R0 = tan(δ) / (ℓf + ℓr)  ⇒  tan(β) = ℓr / R0 = ℓr tan(δ) / (ℓf + ℓr)
cos(β) = R0 / R  ⇒  1/R = cos(β) / R0  ⇒  ψ̇ = ω = v/R = v cos(β) / R0 = (v cos(β) / (ℓf + ℓr)) · tan(δ)
18
Kinematic Bicycle Model
Motion Equations:
Ẋ = v cos(ψ)
Ẏ = v sin(ψ)
ψ̇ = v δ / (ℓf + ℓr)
(assuming β and δ are very small)
19
Kinematic Bicycle Model
Motion Equations:
Xt+1 = Xt + v cos(ψt) ∆t
Yt+1 = Yt + v sin(ψt) ∆t
ψt+1 = ψt + (v δ / (ℓf + ℓr)) ∆t
(time-discretized model)
19
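A minimal sketch that rolls out the time-discretized kinematic bicycle model; the vehicle parameters, speed and steering angle are illustrative values:

```python
import numpy as np

def simulate_bicycle(v=5.0, delta=0.1, lf=1.2, lr=1.6, dt=0.05, steps=200):
    """Roll out the time-discretized kinematic bicycle model
    (small-angle version: psi_dot = v * delta / (lf + lr))."""
    X, Y, psi = 0.0, 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        X   += v * np.cos(psi) * dt
        Y   += v * np.sin(psi) * dt
        psi += v * delta / (lf + lr) * dt
        trajectory.append((X, Y, psi))
    return np.array(trajectory)

# Constant speed and steering angle produce an (approximately) circular path.
traj = simulate_bicycle()
```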
Ackermann Steering Geometry
[Figure: Ackermann geometry with front steering angles δl, δr, turning radius R, rotation center, wheelbase L and track B]
I In practice, the left and right wheel steering angles must differ if wheel slip is to be avoided
I The set of admissible steering angle combinations is called the Ackermann steering geometry
I If the angles are small, the left/right steering angles can be approximated as:
δl ≈ tan( L / (R + 0.5 B) ) ≈ L / (R + 0.5 B)
δr ≈ tan( L / (R − 0.5 B) ) ≈ L / (R − 0.5 B)
20
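A small sketch evaluating the two front-wheel steering angles for illustrative values of wheelbase L, track B and turning radius R; which wheel is the inner or outer one depends on the turning direction:

```python
import numpy as np

def ackermann_angles(L=2.8, B=1.6, R=10.0):
    """Steering angles of the two front wheels for a turn of radius R,
    using tan(delta) = L / (distance from rotation center to that wheel)."""
    delta_outer = np.arctan(L / (R + 0.5 * B))   # ~ L / (R + B/2) for small angles
    delta_inner = np.arctan(L / (R - 0.5 * B))   # ~ L / (R - B/2) for small angles
    return delta_outer, delta_inner

print(ackermann_angles())   # the inner wheel is steered slightly more
```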
Ackermann Steering Geometry
[Figure: trapezoidal tie rod geometry, shown for a left and a right turn]
I In practice, this setup can be realized using a trapezoidal tie rod arrangement
21
5.3
Tire Models
Kinematics is not enough ..
Which assumption of our model is violated in this case?
23
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology
A Comprehensive Theoretical Overview of Self-Driving Car Technology


A Comprehensive Theoretical Overview of Self-Driving Car Technology

  • 22. 1886: Benz Patent-Motorwagen Nummer 1 I Benz 954 cc single-cylinder four-stroke engine (500 watts) I Weight: 100 kg (engine), 265 kg (total) I Maximal speed: 16 km/h I Consumption: 10 liter / 100 km (!) I Construction based on the tricycle, many bicycle components I 29.1.1886: patent filed I 3.7.1886: first public test drive in Mannheim I 2.11.1886: patent granted, but investors stayed skeptical I First long distance trip (106 km) by Bertha Benz in 1888 with Motorwagen Nummer 3 (without knowledge of her husband) fostered commercial interest I First gas station: pharmacy in Wiesloch near Heidelberg 59
  • 25. 1925: Phantom Auto – “American Wonder” (Houdina Radio Control) In the summer of 1925, Houdina’s driverless car, called the American Wonder, traveled along Broadway in New York City—trailed by an operator in another vehicle—and down Fifth Avenue through heavy traffic. It turned corners, sped up, slowed down and honked its horn. Unfortunately, the demonstration ended when the American Wonder crashed into another vehicle filled with photographers documenting the event. (Discover Magazine) https://www.discovermagazine.com/technology/the-driverless-car-era-began-more-than-90-years-ago 61
  • 26. 1939: Futurama – New York World’s Fair I Exhibit at the New York World’s Fair in 1939 sponsored by General Motors I Designed by Norman Bel Geddes – his vision of the world 20 years later (1960) I Radio-controlled electric cars, electromagnetic field via circuits in roadway I #1 exhibition, very well received (Great Depression), prototypes by RCA & GM https://www.youtube.com/watch?v=sClZqfnWqmc 62
  • 27. 1956: General Motors Firebird II https://www.youtube.com/watch?v=cPOmuvFostY 63
  • 30. 1960: RCA Labs’ Wire Controlled Car Aeromobile https://spectrum.ieee.org/selfdriving-cars-were-just-around-the-cornerin-1960 64
  • 31. 1970: Citroen DS19 I Steered by sensing magnetic cables in the road, up to 130 km/h https://www.youtube.com/watch?v=MwdjM2Yx3gU 65
  • 32. 1986: Navlab 1 I Vision-based navigation Jochem, Pomerleau, Kumar and Armstrong: PANS: A Portable Navigation Platform. IV, 1995. 66
  • 33. Navlab Overview I Project at Carnegie Mellon University, USA I 1986: Navlab 1: 5 computer racks (Warp supercomputer) I 1988: First semi-autonomous drive at 20 mph I 1990: Navlab 2: 6 mph offroad, 70 mph highway driving I 1995: Navlab 5: “No Hands Across America” (2850 miles, 98 % autonomy) I PANS: Portable Advanced Navigation Support I Compute: 50 MHz SPARC workstation (only 90 watts) I Main focus: lane keeping (lateral but no longitudinal control, i.e., automated steering but no gas/braking) I Position estimation: Differential GPS + Fibre Optic Gyroscope (IMU) I Low-level control: HC11 microcontroller Jochem, Pomerleau, Kumar and Armstrong: PANS: A Portable Navigation Platform. IV, 1995. 67
  • 34. 1988: ALVINN ALVINN: An Autonomous Land Vehicle in a Neural Network I Forward-looking, vision based driving I Fully connected neural network maps road images to vehicle turn radius I Directions discretized (45 bins) I Trained on simulated road images I Tested on unlined paths, lined city streets and interstate highways I 90 consecutive miles at up to 70 mph Pomerleau: ALVINN: An Autonomous Land Vehicle in a Neural Network. NIPS, 1988. 68
  • 35. 1988: ALVINN Pomerleau: ALVINN: An Autonomous Land Vehicle in a Neural Network. NIPS, 1988. 69
  • 36. 1995: AURORA AURORA: Automotive Run-Off-Road Avoidance System I Downward-looking (mounted at the side) I Adjustable template correlation I Tracks solid or dashed lane markings I Shown to perform robustly even when the markings are worn or their appearance in the image is degraded I Mainly tested as a lane departure warning system (“time to crossing”) Chen, Jochem and Pomerleau: AURORA: A Vision-Based Roadway Departure Warning System. IROS, 1995. 70
  • 37. 1986: VaMoRs – Bundeswehr Universität Munich I Developed by Ernst Dickmanns in the context of EUREKA-Prometheus (€800 mio.) (PROgraMme for a European Traffic of Highest Efficiency and Unprecedented Safety, 1987–1995) I Demonstration to Daimler-Benz Research 1986 in Stuttgart I Longitudinal and lateral guidance with lateral acceleration feedback I Speed: 0 to 36 km/h 71
  • 39. 1994: VAMP – Bundeswehr Universität Munich I 2nd generation transputer system (60 processors), bifocal saccadic vision, no GPS I 1678 km autonomous ride from Munich to Odense, 95% autonomy (up to 158 km) I Autonomous driving speed record: 180 km/h (lane keeping) I Convoy driving, automatic lane change (triggered by human) 72
  • 40. 1992: Summary Paper by Dickmanns Dickmanns and Mysliwetz: Recursive 3-D Road and Relative Ego-State Recognition. PAMI, 1992. 73
  • 41. 1995: Invention of Adaptive Cruise Control (ACC) I 1992: Lidar-based distance control by Mitsubishi (throttle control and downshifting) I 1997: Laser adaptive cruise control by Toyota (throttle control and downshifting) I 1999: Distronic radar-assisted ACC by Mercedes-Benz (S-Class), level 1 autonomy 74
  • 42. 2000: First Technological Revolution: GPS, IMUs & Maps I NAVSTAR GPS available with 1 meter accuracy, IMUs improve up to 5 cm I Navigation systems and road maps available I Accurate self-localization and ego-motion estimation algorithms 75
  • 43. 2004: Darpa Grand Challenge 1 (Limited to US Participants) I 1st competition in the Mojave Desert along a 240 km route, $1 mio prize money I No traffic, dirt roads, driven by GPS (2935 points, up to 4 per curve). I None of the robot vehicles finished the route. CMU traveled the farthest distance, completing 11.78 km of the course before hitting a rock. 76
  • 44. 2005: Darpa Grand Challenge 2 (Limited to US Participants) I 2nd competition in the Mojave Desert along a 212 km route, $2 mio prize money I Five teams finished (Stanford team 1st in 6:54 h, CMU team 2nd in 7:05 h) 77
  • 45. 2006: Park Shuttle Rotterdam I 1800 meter route from metro station Kralingse Zoom to business park Rivium I One of the first truly driverless cars, but on a dedicated lane, with localization via magnets 78
  • 46. 2006: Second Technological Revolution: Lidars & High-res Sensors I High-resolution Lidar I Camera systems with increasing resolution I Accurate 3D reconstruction, 3D detection & 3D localization 79
  • 47. 2007: Darpa Urban Challenge (International Participants) I 3rd competition at George Air Force Base, 96 km route, urban driving, $2 mio I Rules: obey traffic law, negotiate, avoid obstacles, merge into traffic I 11 US teams received $1 mio funding for their research I Winners: CMU 1st (4:10), Stanford’s Junior 2nd (4:29). No non-US participant. 80
  • 48. 2009: Google starts working on Self-Driving Car I Led by Sebastian Thrun, former director of the Stanford AI lab and the Stanley team I Others: Chris Urmson, Dmitri Dolgov, Mike Montemerlo, Anthony Levandowski I Renamed “Waymo” in 2016 (Google spent $1 billion by 2015) https://waymo.com/ 81
  • 49. 2010: VisLab Intercontinental Autonomous Challenge (VIAC) I July 20 to October 28: 16,000 km trip from Parma, Italy to Shanghai, China I The second vehicle automatically followed the route defined by the leader vehicle, either by tracking it visually or via GPS waypoints sent by the lead vehicle Broggi, Medici, Zani, Coati and Panciroli: Autonomous vehicles control in the VisLab Intercontinental Autonomous Challenge. Annu. Rev. Control, 2012. 82
  • 51. 2010: Pikes Peak Self-Driving Audi TTS I Pikes Peak International Hill Climb (since 1916): 20 km, 1,440 m of elevation gain, summit at 4,300 m I Audi TTS completes the track in 27 min (record in 2010: 10 min, now: 8 min) https://www.youtube.com/watch?v=Arx8qWx9CFk 83
  • 52. 2010: Stadtpilot (Technical University Braunschweig) I Goal: geofenced inner-city driving based on laser scanners, cameras and HD maps I Challenges: traffic lights, roundabouts, etc. Similar efforts by FU Berlin and others Saust, Wille, Lichte and Maurer: Autonomous Vehicle Guidance on Braunschweig’s inner ring road within the Stadtpilot Project. IV, 2011. 84
  • 53. 2012: Third Technological Revolution: Deep Learning I Representation learning boosts accuracy across tasks and benchmarks Güler et al.: DensePose: Dense Human Pose Estimation In The Wild. CVPR, 2018. 85
  • 54. 2012: Third Technological Revolution: New Benchmarks Geiger, Lenz and Urtasun: Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. CVPR, 2012. 86
  • 55. 2013: Mercedes Benz S500 Intelligent Drive I Autonomous ride on the historic Bertha Benz route by Daimler R&D and KIT/FZI I Novelty: close-to-production stereo cameras / radar (but requires HD maps) Ziegler et al.: Making Bertha Drive - An Autonomous Journey on a Historic Route. IEEE Intell. Transp. Syst. Mag., 2014. 87
  • 56. 2014: Mercedes S Class Advanced ADAS (Level 2 Autonomy): I Autonomous steering, lane keeping, acceleration/braking, collision avoidance, driver fatigue monitoring in city traffic and highway speeds up to 200 km/h 88
  • 57. 2014: Society of Automotive Engineers: SAE Levels of Autonomy I Lateral control = steering, Longitudinal control = gas/brake 89
  • 58. Disengagements per 1000 miles (California Dept. of Motor Vehicles, 2017) 90
  • 59. 2015: Uber starts Self-Driving Research I Uber hires 50 robotics researchers and academics from CMU (the self-driving unit was shut down in 2020) 91
  • 60. 2016: OTTO I Self-driving truck company, bought by Uber for $625 mio., later shut down 92
  • 61. 2015: Tesla Model S Autopilot Tesla Autopilot 2015 (Level 2 Autonomy): I Lane keeping for limited-access highways (hands-off time: 30-120 seconds) I Does not read traffic lights or traffic signs and does not detect pedestrians/cyclists 93
  • 62. 2016: Tesla Model S Autopilot: Fatal Accident 1 94
  • 63. 2018: Tesla Model X Autopilot: Fatal Accident 2 95
  • 64. 2018: Tesla Model X Autopilot: Fatal Accident 2 The National Transportation Safety Board (NTSB) said that four seconds before the 23 March crash on a highway in Silicon Valley, which killed Walter Huang, 38, the car stopped following the path of a vehicle in front of it. Three seconds before the impact, it sped up from 62mph to 70.8mph, and the car did not brake or steer away, the NTSB said. After the fatal crash in the city of Mountain View, Tesla noted that the driver had received multiple warnings to put his hands on the wheel and said he did not intervene during the five seconds before the car hit the divider. But the NTSB report revealed that these alerts were made more than 15 minutes before the crash. In the 60 seconds prior to the collision, the driver also had his hands on the wheel on three separate occasions, though not in the final six seconds, according to the agency. As the car headed toward the barrier, there was no precrash braking or evasive steering movement, the report added. The Guardian (June, 2018) 96
  • 65. 2018: Waymo (former Google) announced Public Service I In 2018 driving without safety driver in a geofenced district of Phoenix I By 2021 also in suburbs of Arizona, San Francisco and Mountain View 97
  • 66. 2018: Nuro Last-mile Delivery I Founded by two of the Google self-driving car engineers 98
  • 67. Self-Driving Industry I NVIDIA: Supplier of self-driving hardware and software I Waabi: Startup by Raquel Urtasun (formerly Uber) I Aurora: Startup by Chris Urmson (formerly CMU, Google, Waymo) I Argo AI: Startup by Bryan Salesky (now Ford/Volkswagen) I Zoox: Startup by Jesse Levinson (now Amazon) I Cruise: Startup by Kyle Vogt (now General Motors) I NuTonomy: Startup by Doug Parker (now Delphi/Aptiv) I Efforts in China: Baidu Apollo, AutoX, Pony.AI I Comma.ai: Custom open-source dashcam to retrofit any vehicle I Wayve: Startup focusing on end-to-end self-driving 99
  • 69. Business Models Autonomous or nothing (Google, Apple, Uber) I Very risky, only a few companies can do this I Long-term goals Introduce technology little by little (all car companies) I Car industry is very conservative I ADAS as intermediate goal I Sharp transition: how to keep the driver engaged? 101
  • 70. Wild Predictions about the Future of Self-Driving 102
  • 71. Summary I Self-driving has a long history I Highway lane-keeping of today was developed over 30 years ago I Increased robustness ⇒ introduction of level 3 for highways in 2019 I Increased interest after DARPA challenge and new benchmarks (e.g., KITTI) I Many claims about full self-driving (e.g., Elon Musk), but level 4/5 stays hard I Waymo introduced first public service end of 2018 (with safety driver) I Waymo/Tesla seem ahead of competition in full self-driving, but no winner yet I But several setbacks (Uber, Tesla accidents) I Most existing systems require laser scanners and HD maps (exception: Tesla) I Driving as an engineering problem, quite different from human cognition 103
  • 72. Self-Driving Cars Lecture 2 – Imitation Learning Robotics, Computer Vision, System Software BE, MS, PhD (MMMTU, IISc, IIIT-Hyderabad) Kumar Bipin
  • 73. Agenda 2.1 Approaches to Self-Driving 2.2 Deep Learning Recap 2.3 Imitation Learning 2.4 Conditional Imitation Learning 2
  • 75. Autonomous Driving Steer Gas Brake Sensory Input Mapping Function Dominating Paradigms: I Modular Pipelines I End-to-End Learning (Imitation Learning, Reinforcement Learning) I Direct Perception 4
  • 76. Autonomous Driving: Modular Pipeline Steer Gas Brake Sensory Input Modular Pipeline Path Planning Vehicle Control Scene Parsing Low-level Perception Examples: I [Montemerlo et al., JFR 2008] I [Urmson et al., JFR 2008] I Waymo, Uber, Tesla, Zoox, ... 5
  • 82. Autonomous Driving: Modular Pipeline Steer Gas Brake Sensory Input Modular Pipeline Path Planning Vehicle Control Scene Parsing Low-level Perception Pros: I Small components, easy to develop in parallel I Interpretability Cons: I Piece-wise training (not jointly) I Localization and planning heavily rely on HD maps HD maps: centimeter-precision lanes, markings, traffic lights/signs, human annotated 7
  • 83. Autonomous Driving: Modular Pipeline I Piece-wise training difficult: not all objects are equally important! Ohn-Bar and Trivedi: Are All Objects Equal? Deep Spatio-Temporal Importance Prediction in Driving Videos. PR, 2017. 8
  • 84. Autonomous Driving: Modular Pipeline I HD Maps are expensive to create (data collection & annotation effort) https://www.geospatialworld.net/article/hd-maps-autonomous-vehicles/ 9
  • 85. Autonomous Driving: End-to-End Learning Steer Gas Brake Sensory Input Imitation Learning / Reinforcement Learning Neural Network Examples: I [Pomerleau, NIPS 1989] I [Bojarski, Arxiv 2016] I [Codevilla et al., ICRA 2018] 10
  • 86. Autonomous Driving: End-to-End Learning Steer Gas Brake Sensory Input Imitation Learning / Reinforcement Learning Neural Network Pros: I End-to-end training I Cheap annotations Cons: I Training / Generalization I Interpretability 10
  • 87. Autonomous Driving: Direct Perception Steer Gas Brake Sensory Input Direct Perception Intermediate Representations Vehicle Control Neural Network Examples: I [Chen et al., ICCV 2015] I [Sauer et al., CoRL 2018] I [Behl et al., IROS 2020] 11
  • 88. Autonomous Driving: Direct Perception Steer Gas Brake Sensory Input Direct Perception Intermediate Representations Vehicle Control Neural Network Pros: I Compact Representation I Interpretability Cons: I Control typically not learned jointly I How to choose representations? 11
  • 90. Supervised Learning Input Output Model I Learning: Estimate parameters w from training data {(x_i, y_i)}_{i=1}^N I Inference: Make novel predictions: y = f_w(x) 13
  • 91. Linear Classification Logistic Regression: ŷ = σ(w^T x + w0) with σ(z) = 1 / (1 + e^(−z)) I Let x ∈ R^2 I Decision boundary: w^T x + w0 = 0 I Decide for class 1 ⇔ w^T x ≥ −w0 I Decide for class 0 ⇔ w^T x < −w0 I Which problems can we solve? [Figure: sigmoid σ with 0.5 decision threshold; linearly separable 2D example with class 0 and class 1] 14
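As a minimal sketch of the decision rule above (the weights and query points here are illustrative assumptions, not values from the lecture), the sigmoid and the class-1 test w^T x ≥ −w0 fit in a few lines of NumPy:

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, w, w0):
    # Class 1 iff w^T x + w0 >= 0, which is the same as sigma(w^T x + w0) >= 0.5.
    return int(sigmoid(np.dot(w, x) + w0) >= 0.5)

# Illustrative 2D example: the decision boundary is the line w^T x + w0 = 0.
w, w0 = np.array([1.0, 1.0]), -0.5
print(predict(np.array([0.0, 0.0]), w, w0))  # 0: below the boundary
print(predict(np.array([1.0, 1.0]), w, w0))  # 1: above the boundary
```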
  • 92. Linear Classification Linear Classifier: Class 1 ⇔ w^T x ≥ −w0 OR(x1,x2): (0,0)→0, (0,1)→1, (1,0)→1, (1,1)→1; realized by w = (1, 1), −w0 = 0.5, i.e. class 1 ⇔ x1 + x2 ≥ 0.5 15
  • 93. Linear Classification Linear Classifier: Class 1 ⇔ w^T x ≥ −w0 AND(x1,x2): (0,0)→0, (0,1)→0, (1,0)→0, (1,1)→1; realized by w = (1, 1), −w0 = 1.5, i.e. class 1 ⇔ x1 + x2 ≥ 1.5 16
  • 94. Linear Classification Linear Classifier: Class 1 ⇔ w^T x ≥ −w0 NAND(x1,x2): (0,0)→1, (0,1)→1, (1,0)→1, (1,1)→0; realized by w = (−1, −1), −w0 = −1.5, i.e. class 1 ⇔ −x1 − x2 ≥ −1.5 17
  • 95. Linear Classification Linear Classifier: Class 1 ⇔ w^T x ≥ −w0 XOR(x1,x2): (0,0)→0, (0,1)→1, (1,0)→1, (1,1)→0; no choice of w, w0 works: XOR is not linearly separable 18
  • 96. Linear Classification Linear classifier with non-linear features ψ: Class 1 ⇔ w^T ψ(x) ≥ −w0 with ψ(x) = (x1, x2, x1·x2); in feature space XOR maps to (0,0,0)→0, (0,1,0)→1, (1,0,0)→1, (1,1,1)→0 I Non-linear features allow a linear classifier to solve non-linear classification problems! 19
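The truth tables above can be checked with a short NumPy sketch. This is not lecture code; in particular, the weight vector (1, 1, −2) and threshold 0.5 used for XOR in feature space are just one choice that happens to work:

```python
import numpy as np

def linear_classifier(x, w, threshold):
    # Class 1 iff w^T x >= threshold (threshold = -w0 on the slides).
    return int(np.dot(w, x) >= threshold)

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Weights/thresholds from the slides: OR, AND and NAND are linearly separable.
print([linear_classifier(np.array(x), np.array([1, 1]), 0.5) for x in inputs])    # OR
print([linear_classifier(np.array(x), np.array([1, 1]), 1.5) for x in inputs])    # AND
print([linear_classifier(np.array(x), np.array([-1, -1]), -1.5) for x in inputs]) # NAND

# XOR becomes separable with the extra product feature psi(x) = (x1, x2, x1*x2):
# x1 + x2 - 2*x1*x2 >= 0.5 reproduces the XOR truth table.
def xor_with_features(x1, x2):
    psi = np.array([x1, x2, x1 * x2])
    return linear_classifier(psi, np.array([1, 1, -2]), 0.5)

print([xor_with_features(*x) for x in inputs])  # [0, 1, 1, 0]
```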
  • 97. Representation Matters [Figure: the same data shown in Cartesian coordinates (x, y) vs. polar coordinates (r, θ)] I But how to choose the transformation? Can be very hard in practice. I Yet, this was the dominant approach until the 2000s (vision, speech, ...) I In deep/representation learning we want to learn these transformations 20
  • 98. Non-Linear Classification Linear Classifier: Class 1 ⇔ w^T x ≥ −w0 XOR(x1,x2): (0,0)→0, (0,1)→1, (1,0)→1, (1,1)→0 XOR(x1, x2) = AND(OR(x1, x2), NAND(x1, x2)) [Figure: the OR and NAND decision boundaries together carve out the XOR region] 21
  • 99. Non-Linear Classification XOR(x1, x2) = AND(OR(x1, x2), NAND(x1, x2)) The above expression can be rewritten as a program of logistic regressors: h1 = σ(w_OR^T x + w0_OR), h2 = σ(w_NAND^T x + w0_NAND), ŷ = σ(w_AND^T h + w0_AND). Note that h(x) is a non-linear feature of x. We call h(x) a hidden layer. 22
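A small sketch of this composition, assuming hand-picked weights and a sharpening scale factor chosen so that each sigmoid unit approximates its Boolean gate (these values are illustrative, not from the lecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_unit(x, w, w0, scale=10.0):
    # Scaling the hand-picked weights pushes the sigmoid towards 0/1,
    # so each unit approximates the corresponding Boolean gate.
    return sigmoid(scale * (np.dot(w, x) + w0))

def xor_network(x1, x2):
    x = np.array([x1, x2])
    h1 = logistic_unit(x, np.array([1.0, 1.0]), -0.5)    # ~ OR(x1, x2)
    h2 = logistic_unit(x, np.array([-1.0, -1.0]), 1.5)   # ~ NAND(x1, x2)
    h = np.array([h1, h2])                               # the hidden layer h(x)
    return logistic_unit(h, np.array([1.0, 1.0]), -1.5)  # ~ AND(h1, h2)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, round(xor_network(*x)))  # 0, 1, 1, 0
```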
  • 100. Multi-Layer Perceptrons I MLPs are feedforward neural networks (no feedback connections) I They compose several non-linear functions f(x) = ŷ(h3(h2(h1(x)))) where the hi(·) are called hidden layers and ŷ(·) is the output layer I The data specifies only the behavior of the output layer (thus the name “hidden”) I Each layer i comprises multiple neurons j, implemented as affine transformations followed by non-linear activation functions g: h_ij = g(a_ij^T h_{i−1} + b_ij) I Each neuron in each layer is fully connected to all neurons of the previous layer I The overall length of the chain is the depth of the model ⇒ “Deep Learning” 23
  • 101. MLP Network Architecture Input Layer Hidden Layer 1 Hidden Layer 2 Hidden Layer 3 Output Layer Network Depth = #Computation Layers = 4 Layer Width = #Neurons in Layer I Neurons are grouped into layers, each neuron fully connected to all previous ones I Hidden layer h_i = g(A_i h_{i−1} + b_i) with activation function g(·) and weights A_i, b_i 24
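A minimal NumPy sketch of the forward pass h_i = g(A_i h_{i−1} + b_i); the layer sizes, random weights and the ReLU choice are illustrative assumptions:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def mlp_forward(x, layers, g=relu):
    # layers is a list of (A_i, b_i); each hidden layer computes h_i = g(A_i h_{i-1} + b_i).
    h = x
    for A, b in layers[:-1]:
        h = g(A @ h + b)
    A_out, b_out = layers[-1]
    return A_out @ h + b_out  # output layer, here without a non-linearity (regression)

rng = np.random.default_rng(0)
dims = [4, 8, 8, 2]  # input -> two hidden layers -> output (illustrative widths)
layers = [(rng.standard_normal((m, n)) * 0.1, np.zeros(m)) for n, m in zip(dims[:-1], dims[1:])]
print(mlp_forward(rng.standard_normal(4), layers))
```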
  • 102. Deeper Models allow for more Complex Decisions 2 Hidden Neurons 5 Hidden Neurons 15 Hidden Neurons https://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html 25
  • 103. Output and Loss Functions Input Layer Hidden Layer 1 Hidden Layer 2 Output Layer Loss Function Target I The output layer is the last layer in a neural network which computes the output I The loss function compares the result of the output layer to the target value(s) I Choice of output layer and loss function depends on task (discrete, continuous, ..) 26
  • 104. Output Layer Input Layer Hidden Layer 1 Hidden Layer 2 Output Layer Loss Function Target I For classification problems, we use a sigmoid or softmax non-linearity I For regression problems, we can directly return the value after the last layer 27
  • 105. Loss Function Input Layer Hidden Layer 1 Hidden Layer 2 Output Layer Loss Function Target I For classification problems, we use the (binary) cross-entropy loss I For regression problems, we can use the ℓ1 or ℓ2 loss 28
  • 106. Activation Functions Input Layer Hidden Layer 1 Hidden Layer 2 Output Layer Loss Function Target I Hidden layer hi = g(Aihi−1 + bi) with activation function g(·) and weights Ai, bi I The activation function is frequently applied element-wise to its input I Activation functions must be non-linear to learn non-linear mappings 29
  • 109. Convolutional Neural Networks I Multi-layer perceptrons don’t scale to high-dimensional inputs I ConvNets represent data in 3 dimensions: width, height, depth (= feature maps) I ConvNets interleave discrete convolutions, non-linearities and pooling I Key ideas: sparse interactions, parameter sharing, equivariant representation 32
  • 110. Fully Connected vs. Convolutional Layers I Fully connected layer: #Weights = W × H × Cout × (W × H × Cin + 1) I Convolutional layer: #Weights = Cout × (K × K × Cin + 1) (“weight sharing”) I With Cin input and Cout output channels, layer size W × H and kernel size K × K I Convolutions are followed by non-linear activation functions (e.g., ReLU) 33
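The two weight-count formulas translate directly into code. The concrete feature-map size and channel counts below are assumed purely for illustration:

```python
# Parameter counts for one layer, using the formulas from the slide.
def fc_params(W, H, C_in, C_out):
    # Every output unit (W*H*C_out of them) connects to all W*H*C_in inputs plus a bias.
    return W * H * C_out * (W * H * C_in + 1)

def conv_params(K, C_in, C_out):
    # Each of the C_out filters has K*K*C_in weights plus a bias, shared over all positions.
    return C_out * (K * K * C_in + 1)

# Illustrative numbers (assumed, not from the slides): a 32x32 feature map,
# 64 input and 64 output channels, 3x3 kernels.
print(fc_params(32, 32, 64, 64))   # ~4.3 billion weights
print(conv_params(3, 64, 64))      # 36,928 weights
```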
  • 116. Padding Idea of Padding: I Add boundary of appropriate size with zeros (blue) around input tensor 34
  • 117. Downsampling I Downsampling reduces the spatial resolution (e.g., for image level predictions) I Downsampling increases the receptive field (which pixels influence a neuron) 35
  • 118. Pooling I Typically, stride s = 2 and kernel size 2 × 2 ⇒ reduces spatial dimensions by 2 I Pooling has no parameters (typical pooling operations: max, min, mean) 36
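A sketch of parameter-free 2 × 2 max pooling with stride 2 on a single-channel feature map (the input values are just an example):

```python
import numpy as np

def max_pool2d(x, k=2, s=2):
    # Non-overlapping max pooling over a single-channel feature map (no parameters).
    H, W = x.shape
    out = np.empty((H // s, W // s))
    for i in range(0, H - k + 1, s):
        for j in range(0, W - k + 1, s):
            out[i // s, j // s] = x[i:i + k, j:j + k].max()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool2d(x))  # 2x2 output; spatial dimensions reduced by a factor of 2
```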
  • 120. Fully Connected Layers I Often, convolutional networks comprise fully connected layers at the end 37
  • 122. Optimization Optimization Problem (dataset X): w* = argmin_w L(X, w) Gradient Descent: w^0 = w_init, w^(t+1) = w^t − η ∇_w L(X, w^t) I Neural network loss L(X, w) is not convex, we have to use gradient descent I There exist multiple local minima, but we will find only one through optimization I Good news: it is known that many local minima in deep networks are good ones 39
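A sketch of the update w^(t+1) = w^t − η ∇_w L(w^t) on a toy least-squares problem; the loss, data and learning rate are assumptions (a deep network would obtain the gradient via backpropagation rather than a closed form):

```python
import numpy as np

def gradient_descent(grad, w_init, eta=0.1, steps=100):
    # w^{t+1} = w^t - eta * grad(w^t)
    w = w_init
    for _ in range(steps):
        w = w - eta * grad(w)
    return w

# Toy example (assumed): least-squares loss L(w) = ||X w - y||^2 / N.
rng = np.random.default_rng(0)
X, w_true = rng.standard_normal((100, 3)), np.array([1.0, -2.0, 0.5])
y = X @ w_true
grad = lambda w: 2.0 / len(X) * X.T @ (X @ w - y)
print(gradient_descent(grad, np.zeros(3)))  # converges close to w_true
```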
  • 123. Backpropagation I Values are efficiently computed forward, gradients backward I Modularity: Each node must only “know” how to compute gradients wrt. its own arguments I One fw/bw pass per data point: ∇_w L(X, w) = Σ_{i=1..N} ∇_w L(y_i, x_i, w), with each term computed by backpropagation (forward pass: compute loss, backward pass: compute derivatives) 40
  • 124. Gradient Descent Algorithm: 1. Initialize weights w^0 and pick learning rate η 2. For all data points i ∈ {1, . . . , N} do: 2.1 Forward propagate x_i through the network to calculate prediction ŷ_i 2.2 Backpropagate to obtain gradient ∇_w L_i(w^t) ≡ ∇_w L(ŷ_i, y_i, w^t) 3. Update weights: w^(t+1) = w^t − η (1/N) Σ_i ∇_w L_i(w^t) 4. If validation error decreases, go to step 2, otherwise stop Challenges: I Typically, millions of parameters ⇒ dim(w) = 1 million or more I Typically, millions of training points ⇒ N = 1 million or more I Becomes extremely expensive to compute and does not fit into memory 41
  • 125. Stochastic Gradient Descent Solution: I The total loss over the entire training set can be expressed as an expectation: (1/N) Σ_i L_i(w^t) = E_{i∼U{1,N}} [L_i(w^t)] I This expectation can be approximated from a smaller subset of B ≪ N data points: E_{i∼U{1,N}} [L_i(w^t)] ≈ (1/B) Σ_b L_b(w^t) I Thus, the gradient can also be approximated by this subset (= minibatch): (1/N) Σ_i ∇_w L_i(w^t) ≈ (1/B) Σ_b ∇_w L_b(w^t) 42
  • 126. Stochastic Gradient Descent Algorithm: 1. Initialize weights w^0, pick learning rate η and minibatch size B 2. Draw random minibatch {(x_1, y_1), . . . , (x_B, y_B)} ⊆ X (with B ≪ N) 3. For all minibatch elements b ∈ {1, . . . , B} do: 3.1 Forward propagate x_b through the network to calculate prediction ŷ_b 3.2 Backpropagate to obtain batch element gradient ∇_w L_b(w^t) ≡ ∇_w L(ŷ_b, y_b, w^t) 4. Update weights: w^(t+1) = w^t − η (1/B) Σ_b ∇_w L_b(w^t) 5. If validation error decreases, go to step 2, otherwise stop 43
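The same toy least-squares problem solved with minibatch SGD; batch size, learning rate, epoch count and data are illustrative assumptions:

```python
import numpy as np

def sgd(X, y, eta=0.05, batch_size=16, epochs=50, seed=0):
    # Minibatch SGD for least squares: each step uses B << N samples
    # to approximate the full gradient.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        idx = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            b = idx[start:start + batch_size]
            grad = 2.0 / len(b) * X[b].T @ (X[b] @ w - y[b])
            w = w - eta * grad
    return w

rng = np.random.default_rng(1)
X, w_true = rng.standard_normal((512, 3)), np.array([1.0, -2.0, 0.5])
y = X @ w_true
print(sgd(X, y))  # close to w_true
```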
  • 127. First-order Methods There exist many variants: I SGD I SGD with Momentum I SGD with Nesterov Momentum I RMSprop I Adagrad I Adadelta I Adam I AdaMax I NAdam I AMSGrad Adam is often the method of choice due to its robustness. 44
  • 128. Learning Rate Schedules [Figure: error (%) vs. iterations for plain-18/34 and ResNet-18/34] I A fixed learning rate is too slow in the beginning and too fast in the end I Exponential decay: ηt = η αt I Step decay: η ← 0.5η (every K iterations) He, Zhang, Ren and Sun: Deep Residual Learning for Image Recognition. CVPR, 2016. 45
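A minimal sketch of the two decay schedules mentioned above (the values of η0, α, and K are illustrative):

```python
def exponential_decay(eta0, alpha, t):
    # eta_t = eta0 * alpha^t, with 0 < alpha < 1
    return eta0 * (alpha ** t)

def step_decay(eta0, t, K, factor=0.5):
    # Multiply the learning rate by `factor` every K iterations (e.g. halve it)
    return eta0 * (factor ** (t // K))
```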
  • 130. Capacity, Overfitting and Underfitting [Figure: polynomial fits of degree M = 1, 3, 9 to noisy observations of a ground-truth function and a test set] Capacity too low Capacity about right Capacity too high I Underfitting: Model too simple, does not achieve low error on training set I Overfitting: Training error small, but test error (= generalization error) large I Regularization: Take model from third regime (right) to second regime (middle) 47
  • 131. Early Stopping and Parameter Penalties Unregularized Objective L2 Regularizer Early stopping: I Dashed: Trajectory taken by SGD I Trajectory stops at w̃ before reaching minimum training error w∗ L2 Regularization: I Regularize objective with L2 penalty I Penalty forces minimum of regularized loss w̃ closer to origin 48
  • 132. Dropout Idea: I During training, set neurons to zero with probability µ (typically µ = 0.5) I Each binary mask is one model, changes randomly with every training iteration I Creates ensemble “on the fly” from a single network with shared parameters Srivastava, Hinton, Krizhevsky, Sutskever and Salakhutdinov: Dropout: a simple way to prevent neural networks from overfitting. JMLR, 2014. 49
  • 133. Data Augmentation I Best way towards better generalization is to train on more data I However, data in practice often limited I Goal of data augmentation: create “fake” data from the existing data (on the fly) and add it to the training set I New data must preserve semantics I Even simple operations like translation or adding per-pixel noise often already greatly improve generalization I https://github.com/aleju/imgaug 50
  • 135. Imitation Learning: Manipulation Towards Imitation Learning of Dynamic Manipulation Tasks: A Framework to Learn from Failures 52
  • 136. Imitation Learning: Car Racing Trainer (Human Driver) Trainee (Neural Network) 53
  • 137. Imitation Learning in a Nutshell Hard coding policies is often difficult ⇒ Rather use a data-driven approach! I Given: demonstrations or demonstrator I Goal: train a policy to mimic the demonstrated decisions I Variants: behavior cloning (this lecture), inverse optimal control, ... 54
  • 138. Formal Definition of Imitation Learning I State: s ∈ S, may be partially observed (e.g., game screen) I Action: a ∈ A, may be discrete or continuous (e.g., turn angle, speed) I Policy: πθ : S → A, we want to learn the policy parameters θ I Optimal action: a∗ ∈ A, provided by expert demonstrator I Optimal policy: π∗ : S → A, provided by expert demonstrator I State dynamics: P(si+1|si, ai), simulator, typically not known to policy; often deterministic: si+1 = T(si, ai) (deterministic mapping) I Rollout: Given s0, sequentially execute ai = πθ(si) and sample si+1 ∼ P(si+1|si, ai); this yields trajectory τ = (s0, a0, s1, a1, . . . ) I Loss function: L(a∗, a), loss of action a given optimal action a∗ 55
  • 139. Formal Definition of Imitation Learning General Imitation Learning: argmin_θ E_{s∼P(s|πθ)} [L(π∗(s), πθ(s))] I State distribution P(s|πθ) depends on rollout determined by current policy πθ Behavior Cloning: argmin_θ E_{(s∗,a∗)∼P∗} [L(a∗, πθ(s∗))] (estimated on the training set as Σ_{i=1}^{N} L(a∗_i, πθ(s∗_i))) I State distribution P∗ provided by expert I Reduces to supervised learning problem 56
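A minimal behavior-cloning training loop, sketched with PyTorch (the `policy` module and `expert_loader` are assumed placeholders, not the paper's implementation):

```python
import torch

def behavior_cloning(policy, expert_loader, epochs=10, lr=1e-4):
    """Behavior-cloning sketch: 'policy' is any torch.nn.Module mapping states
    to actions; 'expert_loader' yields (state, expert_action) tensor pairs
    recorded from the expert (assumed interface)."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        for s, a_star in expert_loader:
            a_pred = policy(s)
            # Supervised regression of the expert action (here an L1 loss on controls)
            loss = torch.nn.functional.l1_loss(a_pred, a_star)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return policy
```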
  • 140. Challenges of Behavior Cloning I Behavior cloning makes IID assumption I Next state is sampled from states observed during expert demonstration I Thus, next state is sampled independently from action predicted by current policy I What if πθ makes a mistake? I Enters new states that haven’t been observed before I New states not sampled from same (expert) distribution anymore I Cannot recover, catastrophic failure in the worst case I What can we do to overcome this train/test distribution mismatch? 57
  • 141. DAgger Data Aggregation (DAgger): I Iteratively build a set of inputs that the final policy is likely to encounter based on previous experience. Query expert for aggregate dataset I But can easily overfit to main mode of demonstrations I High training variance (random initialization, order of data) Ross, Gordon and Bagnell: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. AISTATS, 2011. 58
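A schematic sketch of the DAgger loop; `train_policy`, `expert_policy`, and `rollout` are hypothetical helpers used only for illustration:

```python
def dagger(train_policy, expert_policy, env, rollout, iterations=5):
    """Schematic DAgger loop (assumed helpers):
    - train_policy(D): fits a policy on the aggregated dataset D
    - expert_policy(s): returns the expert action for state s
    - rollout(policy, env): executes the policy and returns the visited states."""
    D = []                                    # aggregated dataset of (state, expert action)
    policy = None
    for _ in range(iterations):
        # Roll out the current policy (the expert in the first iteration)
        states = rollout(policy if policy is not None else expert_policy, env)
        # Label the states visited by the *current* policy with expert actions
        D += [(s, expert_policy(s)) for s in states]
        policy = train_policy(D)              # retrain on the aggregated data
    return policy
```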
  • 142. DAgger with Critical States and Replay Buffer Key Ideas: 1. Sample critical states from the collected on-policy data based on the utility they provide to the learned policy in terms of driving behavior 2. Incorporate a replay buffer which progressively focuses on the high uncertainty regions of the policy’s state distribution Prakash, Behl, Ohn-bar, Chitta and Geiger: Exploring Data Aggregation in Policy Learning for Vision-based Urban Autonomous Driving. CVPR, 2020. 59
  • 143. ALVINN: An Autonomous Land Vehicle in a Neural Network
  • 144. ALVINN: An Autonomous Land Vehicle in a Neural Network I Fully connected 3 layer neural net I 36k parameters I Maps road images to turn radius I Directions discretized (45 bins) I Trained on simulated road images! I Tested on unlined paths, lined city streets and interstate highways I 90 consecutive miles at up to 70 mph Pomerleau: ALVINN: An Autonomous Land Vehicle in a Neural Network. NIPS, 1988. 61
  • 145. ALVINN: An Autonomous Land Vehicle in a Neural Network Pomerleau: ALVINN: An Autonomous Land Vehicle in a Neural Network. NIPS, 1988. 62
  • 146. PilotNet: End-to-End Learning for Self-Driving Cars
  • 147. PilotNet: System Overview I Data augmentation by 3 cameras and virtually shifted / rotated images assuming the world is flat (homography), adjusting the steering angle appropriately Bojarski et al.: End-to-End Learning for Self-Driving Cars. Arxiv, 2016. 64
  • 148. PilotNet: Architecture I Convolutional network (250k param) I Input: YUV image representation I 1 Normalization layer I Not learned I 5 Convolutional Layers I 3 strided 5x5 I 2 non-strided 3x3 I 3 Fully connected Layers I Output: turning radius I Trained on 72h of driving Bojarski et al.: End-to-End Learning for Self-Driving Cars. Arxiv, 2016. 65
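A PyTorch sketch of a PilotNet-style network; the channel widths and the 66×200 YUV input size follow the original paper, but treat the code as illustrative rather than the authors' implementation:

```python
import torch.nn as nn

class PilotNetLike(nn.Module):
    """PilotNet-style architecture sketch (input: normalized 3 x 66 x 200 YUV image)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.ReLU(),   # 3 strided 5x5 convolutions
            nn.Conv2d(24, 36, 5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, 5, stride=2), nn.ReLU(),
            nn.Conv2d(48, 64, 3), nn.ReLU(),            # 2 non-strided 3x3 convolutions
            nn.Conv2d(64, 64, 3), nn.ReLU(),
        )
        self.head = nn.Sequential(                       # 3 fully connected layers
            nn.Flatten(),
            nn.Linear(64 * 1 * 18, 100), nn.ReLU(),
            nn.Linear(100, 50), nn.ReLU(),
            nn.Linear(50, 1),                            # output: (inverse) turning radius
        )

    def forward(self, x):
        return self.head(self.features(x))
```

With this layout the total parameter count comes out around 250k, in line with the slide.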
  • 149. PilotNet: Video Bojarski et al.: End-to-End Learning for Self-Driving Cars. Arxiv, 2016. 66
  • 150. VisualBackProp I Central idea: find salient image regions that lead to high activations I Forward pass, then iteratively scale-up activations Bojarski et al.: VisualBackProp: Efficient Visualization of CNNs for Autonomous Driving. ICRA, 2018. 67
  • 151. VisualBackProp Bojarski et al.: VisualBackProp: Efficient Visualization of CNNs for Autonomous Driving. ICRA, 2018. 68
  • 152. VisualBackProp I Test if shift in salient objects affects predicted turn radius more strongly Bojarski et al.: VisualBackProp: Efficient Visualization of CNNs for Autonomous Driving. ICRA, 2018. 69
  • 154. Conditional Imitation Learning Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 71
  • 155. Conditional Imitation Learning Idea: I Condition controller on navigation command c ∈ {left, right, straight} I High-level navigation command can be provided by consumer GPS, i.e., telling the vehicle to turn left/right or go straight at the next intersection I This removes the task ambiguity induced by the environment I State st: current image I Action at: steering angle, acceleration Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 72
  • 156. Comparison to Behavior Cloning Behavior Cloning: I Training Set: D = {(a∗_i, s∗_i)}_{i=1}^{N} I Objective: argmin_θ Σ_{i=1}^{N} L(a∗_i, πθ(s∗_i)) I Assumption: ∃f(·): ai = f(si) (often violated in practice!) Conditional Imitation Learning: I Training Set: D = {(a∗_i, s∗_i, c∗_i)}_{i=1}^{N} I Objective: argmin_θ Σ_{i=1}^{N} L(a∗_i, πθ(s∗_i, c∗_i)) I Assumption: ∃f(·, ·): ai = f(si, ci) (better assumption!) Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 73
  • 157. Conditional Imitation Learning: Network Architecture I This paper proposes two network architectures: I (a) Extract features C(c) and concatenate with image features I(i) I (b) Command c acts as switch between specialized submodules I Measurements m capture additional information (here: speed of vehicle) Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 74
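A minimal PyTorch sketch of variant (b), where the command acts as a switch between specialized branches (feature and layer sizes are illustrative, not the paper's values):

```python
import torch
import torch.nn as nn

class BranchedPolicy(nn.Module):
    """'Command as switch' sketch: one specialized head per navigation command."""
    def __init__(self, feat_dim=512, n_commands=3, n_actions=2):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                           nn.Linear(256, n_actions)) for _ in range(n_commands)]
        )

    def forward(self, image_features, command):
        # command: LongTensor of branch indices (e.g. 0 = left, 1 = right, 2 = straight).
        # The command selects which specialized submodule produces the action
        # (steering, acceleration); gradients only flow into the selected branch.
        outs = torch.stack([b(image_features) for b in self.branches], dim=1)
        idx = command.view(-1, 1, 1).expand(-1, 1, outs.size(-1))
        return outs.gather(1, idx).squeeze(1)
```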
  • 158. Conditional Imitation Learning: Noise Injection I Temporally correlated noise injected into trajectories ⇒ drift (only 12 minutes) I Record driver’s (=expert’s) corrective response ⇒ recover from drift Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 75
  • 159. CARLA Simulator http://www.carla.org Codevilla, Müller, López, Koltun and Dosovitskiy: End-to-End Driving Via Conditional Imitation Learning. ICRA, 2018. 76
  • 160. Conditional Imitation Learning Codevilla, Santana, Lopez and Gaidon: Exploring the Limitations of Behavior Cloning for Autonomous Driving. ICCV, 2019. 77
  • 161. Neural Attention Fields I An MLP iteratively compresses the high-dimensional input into a compact representation ci (c ≠ nav. command) based on a BEV query location as input I The model predicts waypoints and auxiliary semantics which aids learning Chitta, Prakash and Geiger: Neural Attention Fields for End-to-End Autonomous Driving. ICCV, 2021. 78
  • 162. Summary Advantages of Imitation Learning: I Easy to implement I Cheap annotations (just driving while recording images and actions) I Entire model trained end-to-end I Conditioning removes ambiguity at intersections Challenges for Imitation Learning? I Behavior cloning uses IID assumption which is violated in practice I Direct mapping from images to control ⇒ No long term planning I No memory (can’t remember speed signs, etc.) I Mapping is difficult to interpret (“black box”), despite visualization techniques 79
  • 163. Self-Driving Cars Lecture 3 – Direct Perception Robotics, Computer Vision, System Software BE, MS, PhD (MMMTU, IISc, IIIT-Hyderabad) Kumar Bipin
  • 164. Agenda 3.1 Direct Perception 3.2 Conditional Affordance Learning 3.3 Visual Abstractions 3.4 Driving Policy Transfer 3.5 Online vs. Offline Evaluation 2
  • 166. Approaches to Self-Driving Modular Pipeline: sensory input → low-level perception → scene parsing → path planning → vehicle control → steer/gas/brake (+ modular, + interpretable, - expert decisions, - piece-wise training) Imitation Learning / Reinforcement Learning: sensory input → neural network → steer/gas/brake (+ end-to-end, + simple, - generalization, - interpretability, - data) 4
  • 167. Direct Perception Direct Perception: sensory input → neural network → intermediate representations → vehicle control → steer/gas/brake Idea of Direct Perception: I Hybrid model between imitation learning and modular pipelines I Learn to predict an interpretable low-dimensional intermediate representation I Decouple perception from planning and control I Allows exploiting classical controllers or learned controllers (or hybrids) 5
  • 168. Direct Perception for Autonomous Driving Affordances: I Attributes of the environment which limit space of actions [Gibson, 1966] I In this case: 13 affordances Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 6
  • 169. Overview I TORCS Simulator: Open source car racing game simulator I Network: AlexNet (5 conv layers, 4 fully conn. layers), 13 output neurons I Training: Affordance indicators trained with ℓ2 loss Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 7
  • 170. Affordance Indicators and State Machine Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 8
  • 171. Controller Steering controller: s = θ1(α − dc/w) I s: steering command θ1: parameter I α: relative orientation dc: distance to centerline w: road width Speed controller: (“optimal velocity car following model”) v = vmax (1 − exp (−θ2 dp − θ3)) I v: target velocity vmax maximal velocity I dp: distance to preceding car θ2,3: parameters Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 9
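The two controllers above as a minimal Python sketch (the gain values θ1, θ2, θ3 are illustrative placeholders; in the paper they are tuned):

```python
import math

def steering_command(alpha, d_c, w, theta1=1.0):
    # s = theta1 * (alpha - d_c / w): steer against heading error alpha and
    # the normalized offset from the lane centerline (theta1 is a tuned gain).
    return theta1 * (alpha - d_c / w)

def target_velocity(d_p, v_max, theta2=1.0, theta3=0.0):
    # Optimal-velocity car-following model: approach v_max when the preceding
    # car is far away (large d_p), slow down as d_p shrinks.
    return v_max * (1.0 - math.exp(-theta2 * d_p - theta3))
```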
  • 172. TORCS Simulator I TORCS: Open source car racing game http://torcs.sourceforge.net/ Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 10
  • 173. Results Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 11
  • 174. Network Visualization I Left: Averaged top 100 images activating a neuron in first fully connected layer I Right: Maximal response of 4th conv. layer (note: focus on cars and markings) Chen, Seff, Kornhauser and Xiao: Learning Affordance for Direct Perception in Autonomous Driving. ICCV, 2015. 12
  • 176. How can we transfer this idea to cities?
  • 177. Conditional Affordance Learning [System overview: video input and directional input are fed to a neural network that predicts affordances (e.g., relative angle = 0.01 rad, centerline distance = 0.15 m, red light = false, ...); a controller maps these to control commands (e.g., brake = 0.0)] Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 15
  • 178. CARLA Simulator I Goal: drive from A to B as fast, safely and comfortably as possible I Infractions: I Driving on wrong lane I Driving on sidewalk I Running a red light I Violating speed limit I Colliding with vehicles I Hitting other objects Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 16
  • 179. Affordances Affordances: I Distance to centerline I Relative angle to road I Distance to lead vehicle I Speed signs I Traffic lights I Hazard stop Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 17
  • 180. Affordances Affordances: I Distance to centerline I Relative angle to road I Distance to lead vehicle I Speed signs I Traffic lights I Hazard stop [Figure: bird's-eye view with agent, centerline, relative angle ψ, centerline distance d, lead vehicle within range ℓ = 15 m, 30 km/h speed sign, traffic light, hazard stop for a pedestrian, and observation areas A1, A2, A3] Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 18
  • 181. Overview [System overview: a memory of the last N camera images is processed by a feature extractor; unconditional and conditional task blocks predict the affordances, where the conditional blocks receive the directional command from a high-level planner; the controller (longitudinal and lateral control) converts affordances and directional command into control commands for the CAL agent acting in the CARLA environment] Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 19
  • 182. Controller Longitudinal Control I Finite-state machine with states: cruising, following, over_limit, red_light, hazard_stop I The state is selected from the predicted affordances (hazard stop and red light trigger braking; the speed limit and the distance to the preceding vehicle determine over_limit and following; otherwise cruising) I PID controller for cruising I Car following model Lateral Control I Stanley controller: δ(t) = ψ(t) + arctan(k x(t) / u(t)) I Damping term Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 20
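A minimal sketch of the Stanley lateral controller from the slide above (the gain k and the low-speed guard are illustrative):

```python
import math

def stanley_steering(psi, x_ct, u, k=1.0, eps=1e-3):
    """Stanley lateral controller sketch:
    psi  = heading error relative to the path,
    x_ct = cross-track error,
    u    = current speed, k = tuned gain.
    delta(t) = psi(t) + arctan(k * x_ct(t) / u(t))."""
    return psi + math.atan2(k * x_ct, max(u, eps))  # eps avoids division by zero at standstill
```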
  • 183. Parameter Learning Perception Stack: I Multi-task learning: single forward pass ⇒ fast learning and inference I Dataset: random driving using a controller operating on GT affordances ⇒ 240k images with GT affordances I Loss functions: I Discrete affordances: Class-weighted cross-entropy (CWCE) I Continuous affordances: Mean absolute error (MAE) I Optimized with ADAM (batch size 32) Controller: I Ziegler-Nichols Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 21
  • 184. Data Collection Data Collection: I Navigation based on true affordances and random inputs Data Augmentation: I No image flipping I Color, contrast, brightness I Gaussian blur, noise I Provoke rear-end collisions I Camera pose randomization (rotation angles φ1, φ2, φ3 (= 0), lateral offsets d = 50 cm) Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 22
  • 185. Results (values per task and condition, reported for MP / CIL / RL / CAL)
Task | Training conditions | New weather | New town | New town and new weather
Straight | 98 / 95 / 89 / 100 | 100 / 98 / 86 / 100 | 92 / 97 / 74 / 93 | 50 / 80 / 68 / 94
One turn | 82 / 89 / 34 / 97 | 95 / 90 / 16 / 96 | 61 / 59 / 12 / 82 | 50 / 48 / 20 / 72
Navigation | 80 / 86 / 14 / 92 | 94 / 84 / 2 / 90 | 24 / 40 / 3 / 70 | 47 / 44 / 6 / 68
Nav. dynamic | 77 / 83 / 7 / 83 | 89 / 82 / 2 / 82 | 24 / 38 / 2 / 64 | 44 / 42 / 4 / 64
Baselines: I MP = Modular Pipeline [Dosovitskiy et al., CoRL 2017] I CIL = Conditional Imitation Learning [Codevilla et al., ICRA 2018] I RL = Reinforcement Learning A3C [Mnih et al., ICML 2016] Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 23
  • 186. Results Conditional Navigation Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 24
  • 187. Results Speed Signs Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 24
  • 188. Results Car Following Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 24
  • 189. Results Hazard Stop Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 24
  • 190. Attention Attention to Hazard Stop Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 25
  • 191. Attention Attention to Red Light Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 25
  • 192. Path Planning Optimal Path (green) vs. Traveled Path (red) Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 26
  • 193. Failure Cases Hazard Stop: False Positive Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 27
  • 194. Failure Cases Hazard Stop: False Negative Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 27
  • 195. Failure Cases Red Light: False Positive Sauer, Savinov and Geiger: Conditional Affordance Learning for Driving in Urban Environments. CoRL, 2018. 27
  • 197. Does Computer Vision Matter for Action? I Analyze various intermediate representations: segmentation, depth, normals, flow, albedo I Intermediate representations improve results I Consistent gains across simulations / tasks I Depth and semantic segmentation provide the largest gains I Better generalization performance Zhou, Krähenbühl and Koltun: Does computer vision matter for action? Science Robotics, 2019. 29
  • 198. Visual Abstractions What is a good visual abstraction? I Invariant (hide irrelevant variations from policy) I Universal (applicable to wide range of scenarios) I Data efficient (in terms of memory/computation) I Label efficient (require little manual effort) Train Test Pixel Space Representation Space Figure Credit: Alexander Sax Semantic segmentation: I Encodes task-relevant knowledge (e.g. road is drivable) and priors (e.g., grouping) I Can be processed with standard 2D convolutional policy networks Disadvantage: I Labelling time: ∼90 min for 1 Cityscapes image Zhou, Krähenbühl and Koltun: Does computer vision matter for action? Science Robotics, 2019. 30
  • 199. Label Efficient Visual Abstractions Questions: I What is the trade-off between annotation time and driving performance? I Can selecting specific semantic classes ease policy learning? I Are visual abstractions trained with few images competitive? I Is fine-grained annotation important? I Are visual abstractions able to reduce training variance? Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 31
  • 200. Label Efficient Visual Abstractions Model: I Visual abstraction network aψ: s ↦ r (state ↦ abstraction) I Control policy πθ: (r, c, v) ↦ a (abstraction, command, velocity ↦ action) I Composing both yields a = πθ(aψ(s)) (state ↦ action) Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 32
  • 201. Label Efficient Visual Abstractions Datasets: I nr images annotated with semantic labels: R = {(si, ri)}_{i=1}^{nr} I na images annotated with expert actions: A = {(si, ai)}_{i=1}^{na} I We assume nr ≪ na Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 32
  • 202. Label Efficient Visual Abstractions Training: I Train visual abstraction network aψ(·) using semantic dataset R I Apply this network to obtain control dataset Cψ = {(aψ(si), ai)}_{i=1}^{na} I Train control policy πθ(·) using control dataset Cψ Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 32
  • 203. Control Policy Model: I CILRS [Codevilla et al., ICCV 2019] Input: I Visual abstraction r I Navigational command c I Vehicle velocity v Output: I Action/control â and velocity v̂ Loss: I L = ||a − â||1 + λ ||v − v̂||1 Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 33
  • 204. Visual Abstractions Privileged Segmentation (14 classes): I Ground-truth semantic labels for 14 classes I Upper bound for analysis Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 34
  • 205. Visual Abstractions Privileged Segmentation (6 classes): I Ground-truth semantic labels for 2 stuff and 4 object classes I Upper bound for analysis Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 34
  • 206. Visual Abstractions Inferred Segmentation (14 classes): I Segmentation model trained on 14 classes I ResNet and Feature Pyramid Network (FPN) with segmentation head Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 34
  • 207. Visual Abstractions Inferred Segmentation (6 classes): I Segmentation model trained on 2 stuff and 4 object classes I ResNet and Feature Pyramid Network (FPN) with segmentation head Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 34
  • 208. Visual Abstractions Hybrid Detection and Segmentation (6 classes): I Segmentation model trained on 2 stuff classes: road, lane marking I Object detection trained on 4 object classes: vehicle, pedestrian, traffic light (r/g) Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 34
  • 209. Evaluation Training Town Test Town I CARLA 0.8.4 NoCrash benchmark I Random start and end location I Metric: Percentage of successfully completed episodes (success rate) Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 35
  • 210. Traffic Density Empty Regular Dense I Difficulty varies with number of dynamic agents in the scene I Empty: 0 Agents Regular: 65 Agents Dense: 220 Agents Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 36
  • 211. Identifying Most Relevant Classes (Privileged) I 14 classes: road, lane marking, vehicle, pedestrian, green light, red light, sidewalk, building, fence, pole, vegetation, wall, traffic sign, other I 7 classes: road, lane marking, vehicle, pedestrian, green light, red light, sidewalk I 6 classes: road, lane marking, vehicle, pedestrian, green light, red light I 5 classes: road, vehicle, pedestrian, green light, red light (lane marking removed) Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 37
  • 212. Identifying Most Relevant Classes (Privileged) [Bar charts: fraction of episodes ending in timeout, collision, or success for 5, 6, 7, and 14 classes under Empty, Regular, and Dense traffic, plus Overall] I Moving from 14 to 6 classes does not hurt driving performance (on the contrary) I Drastic performance drop when lane markings are removed Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 38
  • 213. Identifying Most Relevant Classes (Privileged) Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 39
  • 214. Identifying Most Relevant Classes (Inferred) [Bar charts: success rate for 6 vs. 14 classes, with standard and privileged segmentation, under Empty, Regular, and Dense traffic, plus Overall] I Small performance drop when using inferred segmentations I 6-class representation consistently improves upon 14-class representation I We use the 6-class representation for all following experiments Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 40
  • 215. Hybrid Representation [Bar charts: success rate of hybrid vs. standard representation under Empty, Regular, and Dense traffic, plus Overall] I Performance of hybrid representation matches standard segmentation I Annotation time (segmentation): ∼300 seconds per image and per class I Annotation time (hybrid): ∼20 seconds per image and per class Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 41
  • 216. Summary Behl, Chitta, Prakash, Ohn-Bar and Geiger: Label Efficient Visual Abstractions for Autonomous Driving. IROS, 2020. 42
  • 218. Driving Policy Transfer Problem: I Driving policies learned in simulation often do not transfer well to the real world Idea: I Encapsulate driving policy such that it is not directly exposed to raw perceptual input or low-level control (input: semantic segmentation, output: waypoints) I Allows for transferring driving policy without retraining or finetuning Müller, Dosovitskiy, Ghanem and Koltun: Driving Policy Transfer via Modularity and Abstraction. CoRL, 2018. 44
  • 219. Waypoint Representation Representation: I Input: Semantic segmentation (per pixel “road” vs. “non-road”) I Output: 2 waypoints (distance to vehicle, relative angle wrt. vehicle heading) I One waypoint is sufficient for steering; the second one is used for braking before turns Müller, Dosovitskiy, Ghanem and Koltun: Driving Policy Transfer via Modularity and Abstraction. CoRL, 2018. 45
  • 220. Results Success Rate over 25 Navigation Trials I Driving Policy: Conditional Imitation Learning (branched) I Control: PID controller for lateral and longitudinal control I Results: Full method generalizes best (“+” = with data augmentation) Müller, Dosovitskiy, Ghanem and Koltun: Driving Policy Transfer via Modularity and Abstraction. CoRL, 2018. 46
  • 221. Results Müller, Dosovitskiy, Ghanem and Koltun: Driving Policy Transfer via Modularity and Abstraction. CoRL, 2018. 47
  • 222. 3.5 Online vs. Offline Evaluation
  • 223. Online vs. Offline Evaluation I Online evaluation (i.e., using a real vehicle) is expensive and can be dangerous I Offline evaluation on a pre-recorded validation dataset is cheap and easy I Question: How predictive is offline evaluation (a) for the online task (b)? I Empirical study using CIL on CARLA trained with MSE loss on steering angle Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 49
  • 224. Online Metrics I Success Rate: Percentage of routes successfully completed I Average Completion: Average fraction of distance to goal covered I Km per Infraction: Average driven distance between 2 infractions Remark: The current CARLA metrics Infraction Score and Driving Score are not considered in this work from 2018, but would likely lead to similar conclusions. Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 50
  • 225. Offline Metrics I a/â: true/predicted steering angle, |V|: #samples in validation set I v: speed, δ(·): Kronecker delta function, θ: Heaviside step function I Q ∈ {−1, 0, 1}: quantization, Q(x) = −1 if x < −σ, 0 if −σ ≤ x < σ, +1 if x ≥ σ Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 51
  • 226. Results: Online vs. Online I Generalization performance (town 2, new weather), radius = training iteration I 45 different models varying dataset size, augmentation, architecture, etc. I Success rate correlates well with average completion and km per infraction Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 52
  • 227. Results: Online vs. Offline I All metrics not well correlated, Mean Square Error (MSE) performs worst I Absolute steering error improves, speed weighting is not important Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 53
  • 228. Results: Online vs. Offline I Cumulating the error over time does not improve the correlation I Quantized classification and thresholded relative error perform best Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 54
  • 229. Case Study I Model 1: Trained with a single camera and ℓ2 loss (= bad model) I Model 2: Trained with three cameras and ℓ1 loss (= good model) I Predictions of both models are noisy and their average prediction error is similar, but Model 1 occasionally predicts very large errors which lead to crashes Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 55
  • 230. Case Study I Model 1 crashes in every trial but model 2 can drive successfully I Illustrates the difficulty of using offline metrics for predicting online behavior Codevilla, Lopez, Koltun and Dosovitskiy: On Offline Evaluation of Vision-based Driving Models. ECCV, 2018. 56
  • 231. Summary I Direct perception predicts intermediate representations I Low-dimensional affordances or classic computer vision representations (e.g., semantic segmentation, depth) can be used as intermediate representations I Decouples perception from planning and control I Hybrid model between imitation learning and modular pipelines I Direct methods are more interpretable as the representation can be inspected I Effective visual abstractions can be learned using limited supervision I Planning can also be decoupled from control for better transfer I Offline metrics are not necessarily indicative of online driving performance 57
  • 232. Self-Driving Cars Lecture 4 – Reinforcement Learning Robotics, Computer Vision, System Software BE, MS, PhD (MMMTU, IISc, IIIT-Hyderabad) Kumar Bipin
  • 233. Agenda 4.1 Markov Decision Processes 4.2 Bellman Optimality and Q-Learning 4.3 Deep Q-Learning 2
  • 235. Reinforcement Learning So far: I Supervised learning, lots of expert demonstrations required I Use of auxiliary, short-term loss functions I Imitation learning: per-frame loss on action I Direct perception: per-frame loss on affordance indicators Now: I Learning of models based on the loss that we actually care about, e.g.: I Minimize time to target location I Minimize number of collisions I Minimize risk I Maximize comfort I etc. Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 4
  • 236. Types of Learning Supervised Learning: I Dataset: {(xi, yi)} (xi = data, yi = label) Goal: Learn mapping x 7→ y I Examples: Classification, regression, imitation learning, affordance learning, etc. Unsupervised Learning: I Dataset: {(xi)} (xi = data) Goal: Discover structure underlying data I Examples: Clustering, dimensionality reduction, feature learning, etc. Reinforcement Learning: I Agent interacting with environment which provides numeric reward signals I Goal: Learn how to take actions in order to maximize reward I Examples: Learning of manipulation or control tasks (everything that interacts) Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 5
  • 237. Introduction to Reinforcement Learning Agent Environment State st Action at Reward rt Next state st+1 I Agent observes environment state st at time t I Agent sends action at at time t to the environment I Environment returns the reward rt and its new state st+1 to the agent Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 6
  • 238. Introduction to Reinforcement Learning I Goal: Select actions to maximize total future reward I Actions may have long term consequences I Reward may be delayed, not instantaneous I It may be better to sacrifice immediate reward to gain more long-term reward I Examples: I Financial investment (may take months to mature) I Refuelling a helicopter (might prevent crash in several hours) I Sacrificing a chess piece (might help winning chances in the future) Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 7
  • 239. Example: Cart Pole Balancing I Objective: Balance pole on moving cart I State: Angle, angular vel., position, vel. I Action: Horizontal force applied to cart I Reward: 1 if pole is upright at time t https://gym.openai.com/envs/#classic_control 8
  • 240. Example: Robot Locomotion http://blog.openai.com/roboschool/ I Objective: Make robot move forward I State: Position and angle of joints I Action: Torques applied on joints I Reward: 1 if upright forward moving https://gym.openai.com/envs/#mujoco 9
  • 241. Example: Atari Games http://blog.openai.com/gym-retro/ I Objective: Maximize game score I State: Raw pixels of screen (210x160) I Action: Left, right, up, down I Reward: Score increase/decrease at t https://gym.openai.com/envs/#atari 10
  • 242. Example: Go www.deepmind.com/research/alphago/ I Objective: Winning the game I State: Position of all pieces I Action: Location of next piece I Reward: 1 if game won, 0 otherwise www.deepmind.com/research/alphago/ 11
  • 243. Example: Self-Driving I Objective: Lane Following I State: Image (96x96) I Action: Acceleration, Steering I Reward: - per frame, + per tile https://gym.openai.com/envs/CarRacing-v0/ 12
  • 244. Reinforcement Learning: Overview Agent Environment Action at State st Reward rt Next state st+1 I How can we mathematically formalize the RL problem? 13
  • 245. Markov Decision Process Markov Decision Process (MDP) models the environment and is defined by the tuple (S, A, R, P, γ) with I S : set of possible states I A: set of possible actions I R(rt|st, at): distribution of current reward given (state,action) pair I P(st+1|st, at): distribution over next state given (state,action) pair I γ: discount factor (determines value of future rewards) Almost all reinforcement learning problems can be formalized as MDPs 14
  • 246. Markov Decision Process Markov property: Current state completely characterizes state of the world I A state st is Markov if and only if P(st+1|st) = P(st+1|s1, ..., st) I ”The future is independent of the past given the present” I The state captures all relevant information from the history I Once the state is known, the history may be thrown away I The state is a sufficient statistic of the future 15
  • 247. Markov Decision Process Reinforcement learning loop: I At time t = 0: I Environment samples initial state s0 ∼ P(s0) I Then, for t = 0 until done: I Agent selects action at I Environment samples reward rt ∼ R(rt|st, at) I Environment samples next state st+1 ∼ P(st+1|st, at) I Agent receives reward rt and next state st+1 Agent Environment at st rt st+1 How do we select an action? 16
  • 248. Policy A policy π is a function from S to A that specifies what action to take in each state: I A policy fully defines the behavior of an agent I Deterministic policy: a = π(s) I Stochastic policy: π(a|s) = P(at = a|st = s) Remark: I MDP policies depend only on the current state and not the entire history I However, the current state may include past observations 17
  • 249. Policy How do we learn a policy? Imitation Learning: Learn a policy from expert demonstrations I Expert demonstrations are provided I Supervised learning problem Reinforcement Learning: Learn a policy through trial-and-error I No expert demonstrations given I Agent discovers itself which actions maximize the expected future reward I The agent interacts with the environment and obtains reward I The agent discovers good actions and improves its policy π 18
  • 250. Exploration vs. Exploitation How do we discover good actions? Answer: We need to explore the state/action space. Thus RL combines two tasks: I Exploration: Try a novel action a in state s , observe reward rt I Discovers more information about the environment, but sacrifices total reward I Game-playing example: Play a novel experimental move I Exploitation: Use a previously discovered good action a I Exploits known information to maximize reward, but sacrifice unexplored areas I Game-playing example: Play the move you believe is best Trade-off: It is important to explore and exploit simultaneously 19
  • 251. Exploration vs. Exploitation How to balance exploration and exploitation? ε-greedy exploration algorithm: I Try all possible actions with non-zero probability I With probability ε choose an action at random (exploration) I With probability 1 − ε choose the best action (exploitation) I Greedy action is defined as best action which was discovered so far I ε is large initially and gradually annealed (= reduced) over time 20
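A minimal sketch of ε-greedy action selection, assuming a tabular Q represented as a Python dict:

```python
import random

def epsilon_greedy(Q, state, actions, epsilon):
    """Pick a random action with probability epsilon (exploration), otherwise
    the action with the highest current Q-value (exploitation).
    Q is assumed to be a dict mapping (state, action) to a value estimate."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))
```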
  • 253. Value Functions How good is a state? The state-value function V^π at state st is the expected cumulative discounted reward (rt ∼ R(rt|st, at)) when following policy π from state st: V^π(st) = E[rt + γ rt+1 + γ² rt+2 + . . . | st, π] = E[ Σ_{k≥0} γ^k rt+k | st, π ] I The discount factor γ < 1 is the value of future rewards at current time t I Weights immediate reward higher than future reward (e.g., γ = 1/2 ⇒ γ^k = 1, 1/2, 1/4, 1/8, 1/16, . . . ) I Determines agent’s far/short-sightedness I Avoids infinite returns in cyclic Markov processes 22
  • 254. Value Functions How good is a state-action pair? The action-value function Q^π at state st and action at is the expected cumulative discounted reward when taking action at in state st and then following the policy π: Q^π(st, at) = E[ Σ_{k≥0} γ^k rt+k | st, at, π ] I The discount factor γ ∈ [0, 1] is the value of future rewards at current time t I Weights immediate reward higher than future reward (e.g., γ = 1/2 ⇒ γ^k = 1, 1/2, 1/4, 1/8, 1/16, . . . ) I Determines agent’s far/short-sightedness I Avoids infinite returns in cyclic Markov processes 23
  • 255. Optimal Value Functions The optimal state-value function V∗(st) is the best V^π(st) over all policies π: V∗(st) = max_π V^π(st), with V^π(st) = E[ Σ_{k≥0} γ^k rt+k | st, π ] The optimal action-value function Q∗(st, at) is the best Q^π(st, at) over all policies π: Q∗(st, at) = max_π Q^π(st, at), with Q^π(st, at) = E[ Σ_{k≥0} γ^k rt+k | st, at, π ] I The optimal value functions specify the best possible performance in the MDP I However, searching over all possible policies π is computationally intractable 24
  • 256. Optimal Policy If Q∗(st, at) would be known, what would be the optimal policy? π∗(st) = argmax_{a′∈A} Q∗(st, a′) I Unfortunately, searching over all possible policies π is intractable in most cases I Thus, determining Q∗(st, at) is hard in general (for most interesting problems) I Let’s have a look at a simple example where the optimal policy is easy to compute 25
  • 257. A Simple Grid World Example actions = { 1. right 2. left 3. up 4. down } states ? ? reward: r = −1 for each transition Objective: Reach one of terminal states (marked with ’?’) in least number of actions I Penalty (negative reward) given for every transition made 26
  • 258. A Simple Grid World Example ? ? Random Policy ? ? Optimal Policy I The arrows indicate equal probability of moving into each of the directions 27
  • 259. Solving for the Optimal Policy
  • 260. Bellman Optimality Equation I The Bellman Optimality Equation is named after Richard Ernest Bellman who introduced dynamic programming in 1953 I Almost any problem which can be solved using optimal control theory can be solved via the appropriate Bellman equation Richard Ernest Bellman Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 29
  • 261. Bellman Optimality Equation The Bellman Optimality Equation (BOE) decomposes Q∗ as follows: Q∗(st, at) = E[ rt + γ rt+1 + γ² rt+2 + . . . | st, at ] = E[ rt + γ max_{a′∈A} Q∗(st+1, a′) | st, at ] This recursive formulation comprises two parts: I Current reward: rt I Discounted optimal action-value of the successor state: γ max_{a′∈A} Q∗(st+1, a′) We want to determine Q∗(st, at). How can we solve the BOE? I The BOE is non-linear (because of the max operator) ⇒ no closed form solution I Several iterative methods have been proposed, most popular: Q-Learning Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 30
  • 262. Proof of the Bellman Optimality Equation Proof of the Bellman Optimality Equation for the optimal action-value function Q∗: Q∗(st, at) = E[ rt + γ rt+1 + γ² rt+2 + . . . | st, at ] = E[ Σ_{k≥0} γ^k rt+k | st, at ] = E[ rt + γ Σ_{k≥0} γ^k rt+k+1 | st, at ] = E[ rt + γ V∗(st+1) | st, at ] = E[ rt + γ max_{a′} Q∗(st+1, a′) | st, at ] Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 31
  • 263. Bellman Optimality Equation Why is it useful to solve the BOE? I A greedy policy which chooses the action that maximizes the optimal action-value function Q∗ or the optimal state-value function V ∗ takes into account the reward consequences of all possible future behavior I Via Q∗ and V ∗ the optimal expected long-term return is turned into a quantity that is locally and immediately available for each state / state-action pair I For V ∗, a one-step-ahead search yields the optimal actions I Q∗ effectively caches the results of all one-step-ahead searches Sutton and Barto: Reinforcement Learning: An Introduction. MIT Press, 2017. 32
  • 264. Q-Learning Q-Learning: Iteratively solve for Q∗: Q∗(st, at) = E[ rt + γ max_{a′∈A} Q∗(st+1, a′) | st, at ] by constructing an update sequence Q1, Q2, . . . using learning rate α: Q_{i+1}(st, at) ← (1 − α) Qi(st, at) + α (rt + γ max_{a′∈A} Qi(st+1, a′)) = Qi(st, at) + α (rt + γ max_{a′∈A} Qi(st+1, a′) − Qi(st, at)), where the term in parentheses (target minus prediction) is the temporal difference (TD) error I Qi will converge to Q∗ as i → ∞ Note: policy π learned implicitly via Q table! Watkins and Dayan: Technical Note Q-Learning. Machine Learning, 1992. 33
  • 265. Q-Learning Implementation: I Initialize Q table and initial state s0 randomly I Repeat: I Observe state st, choose action at according to ε-greedy strategy (Q-Learning is “off-policy” as the updated policy is different from the behavior policy) I Observe reward rt and next state st+1 I Compute TD error: rt + γ max_{a′∈A} Qi(st+1, a′) − Qi(st, at) I Update Q table What’s the problem with using Q tables? I Scalability: Tables don’t scale to high dimensional state/action spaces (e.g., GO) I Solution: Use a function approximator (neural network) to represent Q(s, a) Watkins and Dayan: Technical Note Q-Learning. Machine Learning, 1992. 34
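A tabular Q-learning sketch following the steps above; the environment interface (reset/step returning three values) is an assumption modeled loosely on Gym-style APIs:

```python
import random
from collections import defaultdict

def q_learning(env, actions, episodes=1000, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning sketch. 'env' is assumed to expose reset() -> state and
    step(action) -> (next_state, reward, done)."""
    Q = defaultdict(float)                       # Q[(s, a)], initialized to 0
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy action selection (cf. the earlier slide)
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a2: Q[(s, a2)])
            s_next, r, done = env.step(a)
            # TD target: r + gamma * max_a' Q(s', a') (zero bootstrap at episode end)
            target = r + gamma * max(Q[(s_next, a2)] for a2 in actions) * (not done)
            Q[(s, a)] += alpha * (target - Q[(s, a)])    # TD-error update
            s = s_next
    return Q
```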
  • 267. Deep Q-Learning Use a deep neural network with weights θ as function approximator to estimate Q: Q(s, a; θ) ≈ Q∗(s, a) [Two network variants: a network taking (s, a) as input and outputting Q(s, a; θ), or a network taking only s as input and outputting Q(s, a1; θ), . . . , Q(s, am; θ) for all actions] Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 36
  • 268. Training the Q Network Forward Pass: Loss function is the mean-squared error in Q-values: L(θ) = E[ (rt + γ max_{a′} Q(st+1, a′; θ) − Q(st, at; θ))² ], where the first term is the target and the second the prediction Backward Pass: Gradient update with respect to Q-function parameters θ: ∇θ L(θ) = ∇θ E[ (rt + γ max_{a′} Q(st+1, a′; θ) − Q(st, at; θ))² ] Optimize objective end-to-end with stochastic gradient descent (SGD) using ∇θL(θ). Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 37
  • 269. Experience Replay To speed-up training we like to train on mini-batches: I Problem: Learning from consecutive samples is inefficient I Reason: Strong correlations between consecutive samples Experience replay stores agent’s experiences at each time-step I Continually update a replay memory D with new experiences et = (st, at, rt, st+1) I Train on samples (st, at, rt, st+1) ∼ U(D) drawn uniformly at random from D I Breaks correlations between samples I Improves data efficiency as each sample can be used multiple times In practice, a circular replay memory of finite memory size is used. Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 38
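A minimal circular replay memory sketch (capacity is an illustrative value):

```python
import random
from collections import deque

class ReplayBuffer:
    """Circular replay memory: stores transitions and returns uniformly
    sampled minibatches to break correlations between consecutive samples."""
    def __init__(self, capacity=100_000):
        self.memory = deque(maxlen=capacity)   # oldest experiences are dropped automatically

    def push(self, s, a, r, s_next):
        self.memory.append((s, a, r, s_next))

    def sample(self, batch_size):
        return random.sample(self.memory, batch_size)
```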
  • 270. Fixed Q Targets Problem: Non-stationary targets I As the policy changes, so do our targets: rt + γ max_{a′} Q(st+1, a′; θ) I This may lead to oscillation or divergence Solution: Use fixed Q targets to stabilize training I A target network Q with weights θ− is used to generate the targets: L(θ) = E_{(st,at,rt,st+1)∼U(D)} [ (rt + γ max_{a′} Q(st+1, a′; θ−) − Q(st, at; θ))² ] I Target network Q is only updated every C steps by cloning the Q-network I Effect: Reduces oscillation of the policy by adding a delay Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 39
  • 271. Putting it together Deep Q-Learning using experience replay and fixed Q targets: I Take action at according to ε-greedy policy I Store transition (st, at, rt, st+1) in replay memory D I Sample random mini-batch of transitions (st, at, rt, st+1) from D I Compute Q targets using old parameters θ− I Optimize MSE between Q targets and Q network predictions L(θ) = E_{(st,at,rt,st+1)∼D} [ (rt + γ max_{a′} Q(st+1, a′; θ−) − Q(st, at; θ))² ] using stochastic gradient descent. Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 40
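One DQN update step combining experience replay and fixed Q targets, sketched with PyTorch (terminal-state masking is omitted for brevity; the network, optimizer, and tensor-valued transitions are assumed):

```python
import torch

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    """One DQN update sketch: q_net and target_net map states to Q-values for all
    actions; 'batch' is a list of (s, a, r, s_next) tuples of tensors
    (a: LongTensor action index, r: float reward)."""
    s, a, r, s_next = map(torch.stack, zip(*batch))
    q_pred = q_net(s).gather(1, a.view(-1, 1)).squeeze(1)      # Q(s_t, a_t; theta)
    with torch.no_grad():                                       # targets use frozen theta^-
        q_target = r + gamma * target_net(s_next).max(dim=1).values
    loss = torch.nn.functional.mse_loss(q_pred, q_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```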
  • 272. Case Study: Playing Atari Games Agent Environment Objective: Complete the game with the highest score Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 41
  • 273. Case Study: Playing Atari Games Q(s, a; θ): Neural network with weights θ I Architecture: 16 8x8 conv, stride 2 → 32 4x4 conv, stride 2 → FC-256 → FC-Out (Q values) Input: 84 × 84 × 4 stack of last 4 frames (after grayscale conversion, downsampling, cropping) Output: Q values for all (4 to 18) Atari actions (efficient: single forward pass computes Q for all actions) Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 42
  • 274. Case Study: Playing Atari Games Mnih et al.: Human-level control through deep reinforcement learning. Nature, 2015. 43
  • 275. Deep Q-Learning Shortcomings Deep Q-Learning suffers from several shortcomings: I Long training times I Uniform sampling from replay buffer ⇒ all transitions equally important I Simplistic exploration strategy I Action space is limited to a discrete set of actions (otherwise, expensive test-time optimization required) Various improvements over the original algorithm have been explored. 44
  • 276. Deep Deterministic Policy Gradients DDPG addresses the problem of continuous action spaces. Problem: Finding a continuous action requires optimization at every timestep. Solution: Use two networks, an actor (deterministic policy) and a critic. [Actor: s ↦ µ(s; θµ); Critic: (s, a = µ(s; θµ)) ↦ Q(s, a; θQ)] Lillicrap et al.: Continuous Control with Deep Reinforcement Learning. ICLR, 2016. 45
  • 277. Deep Deterministic Policy Gradients I Actor network with weights θµ estimates the agent’s deterministic policy µ(s; θµ) I Update deterministic policy µ(·) in the direction that most improves Q I Apply chain rule to the expected return (this is the policy gradient): ∇_{θµ} E_{st,at,rt,st+1∼D} [ Q(st, µ(st; θµ); θQ) ] = E[ ∇_{at} Q(st, at; θQ) ∇_{θµ} µ(st; θµ) ] I Critic estimates value of current policy Q(s, a; θQ) I Learned using the Bellman Optimality Equation as in Q-Learning: ∇_{θQ} E_{st,at,rt,st+1∼D} [ (rt + γ Q(st+1, µ(st+1; θµ−); θQ−) − Q(st, at; θQ))² ] I Remark: No maximization over actions required as this step is now learned via µ(·) Lillicrap et al.: Continuous Control with Deep Reinforcement Learning. ICLR, 2016. 46
  • 278. Deep Deterministic Policy Gradients Experience replay and target networks are again used to stabilize training: I Replay memory D stores transition tuples (st, at, rt, st+1) I Target networks are updated using “soft” target updates I Weights are not directly copied but slowly adapted: θQ− ← τ θQ + (1 − τ) θQ−, θµ− ← τ θµ + (1 − τ) θµ−, where 0 < τ < 1 controls the tradeoff between speed and stability of learning Exploration is performed by adding noise N to the policy µ(s): µ(s; θµ) + N Lillicrap et al.: Continuous Control with Deep Reinforcement Learning. ICLR, 2016. 47
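A minimal sketch of the “soft” target update above, assuming PyTorch modules (the value of τ is illustrative):

```python
def soft_update(target_net, online_net, tau=0.005):
    """Soft target update: slowly track the online network,
    theta_target <- tau * theta_online + (1 - tau) * theta_target."""
    for p_t, p in zip(target_net.parameters(), online_net.parameters()):
        p_t.data.copy_(tau * p.data + (1.0 - tau) * p_t.data)
```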
  • 279. Prioritized Experience Replay Prioritize experience to replay important transitions more frequently I Priority δ is measured by magnitude of the temporal difference (TD) error: δ = rt + γ max_{a′} Q(st+1, a′; θQ−) − Q(st, at; θQ) I TD error measures how “surprising” or unexpected the transition is I Stochastic prioritization avoids overfitting due to lack of diversity I Enables learning speed-up by a factor of 2 on Atari benchmarks Schaul et al.: Prioritized Experience Replay. ICLR, 2016. 48
  • 280. Learning to Drive in a Day Real-world RL demo by Wayve: I Deep Deterministic Policy Gradients with Prioritized Experience Replay I Input: Single monocular image I Action: Steering and speed I Reward: Distance traveled without the safety driver taking control (requires no maps / localization) I 4 Conv layers, 2 FC layers I Only 35 training episodes Kendall, Hawke, Janz, Mazur, Reda, Allen, Lam, Bewley and Shah: Learning to Drive in a Day. ICRA, 2019. 49
  • 281. Learning to Drive in a Day Kendall, Hawke, Janz, Mazur, Reda, Allen, Lam, Bewley and Shah: Learning to Drive in a Day. ICRA, 2019. 50
  • 282. Other flavors of Deep RL
  • 283. Asynchronous Deep Reinforcement Learning Execute multiple agents in separate environment instances: I Each agent interacts with its own environment copy and collects experience I Agents may use different exploration policies to maximize experience diversity I Experience is not stored but directly used to update a shared global model I Stabilizes training in similar way to experience replay by decorrelating samples I Leads to reduction in training time roughly linear in the number of parallel agents Mnih et al.: Asynchronous Methods for Deep Reinforcement Learning. ICML, 2016. 52
  • 284. Bootstrapped DQN Bootstrapping for efficient exploration: I Approximate a distribution over Q values via K bootstrapped ”heads” I At the start of each epoch, a single head Qk is selected uniformly at random I After training, all heads can be combined into a single ensemble policy Q1 QK θQ1 ... θQK θshared s Osband et al.: Deep Exploration via Bootstrapped DQN. NIPS, 2016. 53
  • 285. Double Q-Learning Double Q-Learning I Decouple Q function for selection and evaluation of actions to avoid Q overestimation and stabilize training. Target: DQN: rt + γ max_{a′} Q(st+1, a′; θ−) Double DQN: rt + γ Q(st+1, argmax_{a′} Q(st+1, a′; θ); θ−) I Online network with weights θ is used to determine greedy policy I Target network with weights θ− is used to determine corresponding action value I Improves performance on Atari benchmarks van Hasselt et al.: Deep Reinforcement Learning with Double Q-learning. AAAI, 2016. 54
  • 286. Deep Recurrent Q-Learning Add recurrency to a deep Q-network to handle partial observability of states: I Architecture: 16 8x8 conv, stride 2 → 32 4x4 conv, stride 2 → LSTM → FC-Out (Q-values) I Replace fully-connected layer with recurrent LSTM layer Hausknecht and Stone: Deep Recurrent Q-Learning for Partially Observable MDPs. AAAI, 2015 55
  • 288. Summary I Reinforcement learning learns through interaction with the environment I The environment is typically modeled as a Markov Decision Process I The goal of RL is to maximize the expected future reward I Reinforcement learning requires trading off exploration and exploitation I Q-Learning iteratively solves for the optimal action-value function I The policy is learned implicitly via the Q table I Deep Q-Learning scales to continuous/high-dimensional state spaces I Deep Deterministic Policy Gradients scales to continuous action spaces I Experience replay and target networks are necessary to stabilize training 57
  • 289. Self-Driving Cars Lecture 5 – Vehicle Dynamics Robotics, Computer Vision, System Software BE, MS, PhD (MMMTU, IISc, IIIT-Hyderabad) Kumar Bipin
  • 290. Agenda 5.1 Introduction 5.2 Kinematic Bicycle Model 5.3 Tire Models 5.4 Dynamic Bicycle Model 2
  • 292. Electronic Stability Program Knowledge of vehicle dynamics enables accurate vehicle control 5
  • 293. Kinematics vs. Kinetics Kinematics: I Greek origin: “motion”, “moving” I Describes motion of points and bodies I Considers position, velocity, acceleration, .. I Examples: Celestial bodies, particle systems, robotic arm, human skeleton Kinetics: I Describes causes of motion I Effects of forces/moments I Newton’s laws, e.g., F = ma 6
  • 294. Holonomic Constraints Holonomic constraints are constraints on the configuration: I Assume a particle in three dimensions (x, y, z) ∈ R3 I We can constrain the particle to the x/y plane via: z = 0 ⇔ f(x, y, z) = 0 with f(x, y, z) = z x/y plane I Constraints of the form f(x, y, z) = 0 are called holonomic constraints I They constrain the configuration space I But the system can move freely in that space I Controllable degrees of freedom equal total degrees of freedom (2) 7
  • 295. Non-Holonomic Constraints Non-Holonomic constraints are constraints on the velocity: I Assume a vehicle that is parameterized by (x, y, ψ) ∈ R2 × [0, 2π] I The 2D vehicle velocity is given by: ẋ = v cos(ψ) ẏ = v sin(ψ) ⇒ ẋ sin(ψ) − ẏ cos(ψ) = 0 I This non-holonomic constraint cannot be expressed in the form f(x, y, ψ) = 0 I The car cannot freely move in any direction (e.g., sideways) I It constrains the velocity space, but not the configuration space I Controllable degrees of freedom less than total degrees of freedom (2 vs. 3) 8
  • 296. Holonomic vs. Non-Holonomic Systems Holonomic Systems I Constrain configuration space I Can freely move in any direction I Controllable degrees of freedom equal to total degrees of freedom I Constraints can be described by f(x1, . . . , xN ) = 0 Example: 3D Particle z = 0 x/y plane Nonholonomic Systems I Constrain velocity space I Cannot freely move in any direction I Controllable degrees of freedom less than total degrees of freedom I Constraints cannot be described by f(x1, . . . , xN ) = 0 Example: Car ẋ sin(ψ) − ẏ cos(ψ) = 0 9
  • 297. Holonomic vs. Non-Holonomic Systems I A robot can be subject to both holonomic and non-holonomic constraints I A car (rigid body in 3D) is kept on the ground by 3 holonomic constraints I One additional non-holonomic constraint prevents sideways sliding 10
  • 298. Coordinate Systems Inertial Frame Horizontal Frame Horizontal Plane Vehicle Reference Point Vehicle Frame I Inertial Frame: Fixed to earth with vertical Z-axis and X/Y horizontal plane I Vehicle Frame: Attached to vehicle at fixed reference point; xv points towards the front, yv to the side and zv to the top of the vehicle (ISO 8855) I Horizontal Frame: Origin at vehicle reference point (like vehicle frame) but x- and y-axes are projections of xv- and yv-axes onto the X/Y horizontal plane 11
  • 299. Kinematics of a Point The position rP(t) ∈ R3 of point P at time t ∈ R is given by 3 coordinates. Velocity and acceleration are the first and second derivatives of the position rP(t): rP(t) = (x(t), y(t), z(t))ᵀ, vP(t) = ṙP(t) = (ẋ(t), ẏ(t), ż(t))ᵀ, aP(t) = r̈P(t) = (ẍ(t), ÿ(t), z̈(t))ᵀ Trajectory of point P 12
  • 300. Kinematics of a Rigid Body A rigid body refers to a collection of infinitely many infinitesimally small mass points which are rigidly connected, i.e., their relative position remains unchanged over time. Its motion can be compactly described by the motion of an (arbitrary) reference point C of the body plus the relative motion of all other points P with respect to C. I C: Reference point fixed to rigid body I P: Arbitrary point on rigid body I ω: Angular velocity of rigid body I Position: rP = rC + rCP I Velocity: vP = vC + ω × rCP I Due to rigidity, points P can only rotate wrt. C I Thus a rigid body has 6 DoF (3 pos., 3 rot.) 13
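A minimal sketch of the velocity relation vP = vC + ω × rCP (all numeric values are assumptions chosen for illustration):

```python
import numpy as np

v_C = np.array([1.0, 0.0, 0.0])    # velocity of reference point C [m/s]
omega = np.array([0.0, 0.0, 0.5])  # angular velocity of the body [rad/s]
r_CP = np.array([0.0, 2.0, 0.0])   # position of P relative to C [m]

# Rigid-body velocity relation from the slide.
v_P = v_C + np.cross(omega, r_CP)
print(v_P)  # [0. 0. 0.] -> this particular P is momentarily at rest
            # (it is the instantaneous center of rotation, see next slide)
```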
  • 301. Instantaneous Center of Rotation At each time instance t ∈ R, there exists a particular reference point O (called the instantaneous center of rotation) for which vO(t) = 0. Each point P of the rigid body performs a pure rotation about O: vP = vO + ω × rOP = ω × rOP Example 1: Turning Wheel I Wheel is completely lifted off the ground I Wheel does not move in x or y direction I Ang. vel. vector ω points into x/y plane I Velocity of point P: vP = ωR with radius R 14
  • 302. Instantaneous Center of Rotation At each time instance t ∈ R, there exists a particular reference point O (called the instantaneous center of rotation) for which vO(t) = 0. Each point P of the rigid body performs a pure rotation about O: vP = vO + ω × rOP = ω × rOP Example 2: Rolling Wheel I Wheel is rolling on the ground without slip I Ground is fixed in x/y plane I Ang. vel. vector ω points into x/y plane I Velocity of point P: vP = 2ωR with radius R 14
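A small numeric check of the rolling-wheel example, treating the contact point as the instantaneous center of rotation O (wheel radius, spin rate and the coordinate convention are assumptions):

```python
import numpy as np

R = 0.3                          # wheel radius [m] (assumed)
w = 2.0                          # spin rate [rad/s] (assumed)
omega = np.array([0.0, w, 0.0])  # rotation about the wheel axle (y-axis)

# Points relative to O = contact point; x points forward, z points up.
points = {
    "contact": np.array([0.0, 0.0, 0.0]),
    "center":  np.array([0.0, 0.0, R]),
    "top":     np.array([0.0, 0.0, 2 * R]),
}

for name, r_OP in points.items():
    v_P = np.cross(omega, r_OP)  # pure rotation about O: v_P = omega x r_OP
    print(name, v_P)             # speeds 0, w*R and 2*w*R, all pointing forward
```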
  • 304. Rigid Body Motion Rotation Center I Different points on the rigid body move along different circular trajectories 16
  • 305. Kinematic Bicycle Model I The kinematic bicycle model approximates the 4 wheels with 2 imaginary wheels 17
  • 307. Kinematic Bicycle Model (figure labels: rotation center, wheelbase, vehicle velocity, front/back wheel velocity, slip angle, front/back steering angle, center of gravity, heading angle, course angle, turning radius) Assumptions: - Planar motion (no roll, no pitch) - Low speed, i.e., no wheel slip (wheel orientation = wheel velocity direction) I The kinematic bicycle model approximates the 4 wheels with 2 imaginary wheels 17
  • 308. Kinematic Bicycle Model Motion Equations: Ẋ = v cos(ψ + β), Ẏ = v sin(ψ + β), ψ̇ = (v cos(β) / (ℓf + ℓr)) (tan(δf) − tan(δr)), β = tan⁻¹((ℓf tan(δr) + ℓr tan(δf)) / (ℓf + ℓr)) (proof as exercise) 18
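A direct transcription of these motion equations as a small helper function (a sketch; the function name, parameter names and the example values are placeholders, not from the slides):

```python
import numpy as np

def bicycle_model_derivatives(psi, v, delta_f, delta_r, l_f, l_r):
    """State derivatives of the kinematic bicycle model (front and rear steering)."""
    beta = np.arctan((l_f * np.tan(delta_r) + l_r * np.tan(delta_f)) / (l_f + l_r))
    X_dot = v * np.cos(psi + beta)
    Y_dot = v * np.sin(psi + beta)
    psi_dot = v * np.cos(beta) / (l_f + l_r) * (np.tan(delta_f) - np.tan(delta_r))
    return X_dot, Y_dot, psi_dot

# Example call with assumed values; delta_r = 0 gives the front-steering-only case.
print(bicycle_model_derivatives(psi=0.0, v=5.0, delta_f=np.deg2rad(10),
                                delta_r=0.0, l_f=1.2, l_r=1.6))
```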
  • 309. Kinematic Bicycle Model Motion Equations (only front steering): Ẋ = v cos(ψ + β), Ẏ = v sin(ψ + β), ψ̇ = v cos(β) tan(δ) / (ℓf + ℓr), β = tan⁻¹(ℓr tan(δ) / (ℓf + ℓr)). Derivation: tan δ = (ℓf + ℓr) / R0 ⇒ 1/R0 = tan δ / (ℓf + ℓr); tan β = ℓr / R0 = ℓr tan δ / (ℓf + ℓr); cos β = R0 / R ⇒ 1/R = cos β / R0 ⇒ ψ̇ = ω = v/R = v cos(β) / R0 = v cos(β) tan(δ) / (ℓf + ℓr) 18
  • 310. Kinematic Bicycle Model Motion Equations (assuming β and δ are very small): Ẋ = v cos(ψ), Ẏ = v sin(ψ), ψ̇ = v δ / (ℓf + ℓr) 19
  • 311. Kinematic Bicycle Model Motion Equations (time-discretized model): Xt+1 = Xt + v cos(ψt) ∆t, Yt+1 = Yt + v sin(ψt) ∆t, ψt+1 = ψt + (v δ / (ℓf + ℓr)) ∆t 19
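A minimal forward-Euler simulation of the time-discretized (small-angle) model above; wheelbase, speed, steering angle and step size are assumed values:

```python
import numpy as np

l_f, l_r = 1.2, 1.6        # distances from center of gravity to front/rear axle [m]
v, delta = 5.0, 0.05       # constant speed [m/s] and small steering angle [rad]
dt = 0.05                  # discretization step [s]

X, Y, psi = 0.0, 0.0, 0.0  # initial pose
trajectory = [(X, Y, psi)]

for _ in range(200):
    X = X + v * np.cos(psi) * dt
    Y = Y + v * np.sin(psi) * dt
    psi = psi + v * delta / (l_f + l_r) * dt
    trajectory.append((X, Y, psi))

print(trajectory[-1])      # the vehicle traces an approximately circular arc
```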
  • 312. Ackermann Steering Geometry I In practice, the left and right wheel steering angles are not equal if there is no wheel slip I The set of admissible steering angle combinations is called the Ackermann steering geometry I If the angles are small, the left/right steering wheel angles can be approximated (with wheelbase L, track B and turning radius R): δl ≈ tan⁻¹(L / (R + 0.5B)) ≈ L / (R + 0.5B), δr ≈ tan⁻¹(L / (R − 0.5B)) ≈ L / (R − 0.5B) 20
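A short sketch of the Ackermann angle computation above (the wheelbase, track and turning radius values are assumed examples):

```python
import numpy as np

L = 2.8    # wheelbase [m] (assumed)
B = 1.6    # track [m] (assumed)
R = 20.0   # turning radius [m] (assumed)

# Exact angles from the geometry and their small-angle approximations,
# using the same signs as on the slide.
delta_l = np.arctan(L / (R + 0.5 * B))
delta_r = np.arctan(L / (R - 0.5 * B))
print(np.rad2deg(delta_l), np.rad2deg(L / (R + 0.5 * B)))
print(np.rad2deg(delta_r), np.rad2deg(L / (R - 0.5 * B)))
```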
  • 313. Ackermann Steering Geometry (figure: trapezoidal geometry, left turn, right turn) I In practice, this setup can be realized using a trapezoidal tie rod arrangement 21
  • 315. Kinematics is not enough ... Which assumption of our model is violated in this case? 23