Machine Learning for Self-Driving Cars
Agenda
High-level Development Process for Autonomous Vehicles
1 Collect sensor data: Data Logger
2 Model Engineering: Data Center (Big Data, Trained Model)
3 Autonomous Driving: Control Unit
1 Collect sensor data
Sensors: Udacity Lincoln MKZ
Camera: 3x Blackfly GigE Camera, 20 Hz
Lidar: Velodyne HDL-32E, 9.5 Hz
IMU: Xsens, 400 Hz
GPS: 2x fixed, 1 Hz
CAN bus: 1.1 kHz
Framework: Robot Operating System
Data: 3 GB per minute
https://github.com/udacity/self-driving-car
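To put the 3 GB per minute figure in perspective, here is a minimal back-of-the-envelope sketch. It assumes decimal gigabytes and a hypothetical 8-hour recording day; neither assumption comes from the slides.

```python
# Back-of-the-envelope data volume for one recording vehicle.
# Assumptions (not from the slides): 1 GB = 10**9 bytes, 8 hours of driving per day.
GB = 10**9
gb_per_minute = 3  # figure from the slide

bytes_per_second = gb_per_minute * GB / 60   # sustained write rate
gb_per_hour = gb_per_minute * 60             # volume per hour of driving
tb_per_8h_day = gb_per_hour * 8 / 1000       # volume per hypothetical 8-hour day

print(f"{bytes_per_second / 10**6:.0f} MB/s sustained")
print(f"{gb_per_hour} GB per hour")
print(f"{tb_per_8h_day:.2f} TB per 8-hour day")
```

Under these assumptions a single car already produces more than a terabyte per day, which is why the data ends up in a data center rather than on a workstation.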
Robot Operating System
+ Popular open source robotics framework
+ Reliable distributed architecture
+ Wide use in the robotics research community
+ Huge selection of “off-the-shelf” software packages for hardware, algorithms, etc.
+ Used by Bosch, BMW, KUKA, Google, Siemens, etc.
https://roscon.ros.org/2015/presentations/ROSCon-Automated-Driving.pdf
Sensors Spec
Sensor      Blinding, sunlight, darkness  Rain, fog, snow  Non-metal objects  Wind / high velocity  Resolution  Range  Data
Ultrasonic  yes                           yes              yes                no                    +           +      +
Lidar       yes                           no               yes                yes                   +++         ++     +
Radar       yes                           yes              no                 yes                   ++          +++    +
Camera      no                            no               yes                yes                   +++         +++    +++
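The table suggests that no single sensor covers every condition. A minimal sketch of that idea follows, assuming that "yes" means the sensor still works under the given condition (the slide does not say this explicitly). The dictionary layout and function names are illustrative only.

```python
# The yes/no columns of the sensor table as a lookup structure.
# Assumption: True ("yes") means the sensor still works under that condition.
CONDITIONS = [
    "blinding/sunlight/darkness",
    "rain/fog/snow",
    "non-metal objects",
    "wind/high velocity",
]
SENSORS = {
    "Ultrasonic": [True, True, True, False],
    "Lidar": [True, False, True, True],
    "Radar": [True, True, False, True],
    "Camera": [False, False, True, True],
}

def sensors_for(condition):
    """Return the sensors marked 'yes' for one condition."""
    idx = CONDITIONS.index(condition)
    return sorted(name for name, flags in SENSORS.items() if flags[idx])

def uncovered(selected):
    """Return the conditions that none of the selected sensors handles."""
    return [
        cond
        for idx, cond in enumerate(CONDITIONS)
        if not any(SENSORS[name][idx] for name in selected)
    ]

print(sensors_for("rain/fog/snow"))
print(uncovered(["Camera"]))
print(uncovered(["Camera", "Radar"]))
```

Read this way, a camera alone leaves two conditions uncovered, while camera plus radar covers all four, which is one motivation for sensor fusion.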
Real-time Analysis of car data
Car Layer: car data from sensors and bus traces (CAN, FlexRay, Camera, Radar, Lidar, IMU, etc.)
Data Logger: pre-select signals, aggregate and prepare for sending; parse traces and signals (DBC, FIBEX, AUTOSAR...)
Data Center: receive signals, analysis, and machine learning (publish/subscribe, real-time)
Realtime Data Analytics: real-time or batch analysis based on sensor data
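The data logger step (pre-select signals, aggregate, prepare for sending) can be sketched as follows. This is a toy illustration, not the logger described in the talk: the signal names, the one-second window, and the choice of the mean as the aggregate are all assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical signal names; a real logger would take them from a DBC/FIBEX description.
SELECTED = {"speed", "steering_angle"}

def aggregate(samples, window_s=1.0):
    """Keep only pre-selected signals and average them per time window.

    samples: iterable of (timestamp_s, signal_name, value)
    returns: {(window_index, signal_name): mean value}
    """
    buckets = defaultdict(list)
    for t, name, value in samples:
        if name in SELECTED:  # pre-selection
            buckets[(int(t // window_s), name)].append(value)
    return {key: mean(values) for key, values in sorted(buckets.items())}

samples = [
    (0.1, "speed", 10.0), (0.6, "speed", 12.0), (0.7, "wiper", 1.0),
    (1.2, "speed", 14.0), (1.3, "steering_angle", 0.5),
]
summary = aggregate(samples)
print(summary)
```

The unselected "wiper" signal is dropped and five raw samples shrink to three aggregated values, which is the point of aggregating before sending.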
2 Model Engineering
Machine Learning in Robotics
Observations -> State Estimation -> Modeling & Prediction -> Planning -> Controls
Observations -> f(x) -> Controls
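The two paths on this slide, a staged pipeline and a single learned function f(x), can be contrasted with a toy example. Everything below is hypothetical: observations are noisy lateral offsets from the lane center, each stage is a one-line stand-in, and f(x) is a linear model with hand-picked weights rather than a trained network.

```python
from statistics import mean

# Staged pipeline: each block of the diagram is its own function.
def estimate_state(observations):
    """State estimation: average noisy lateral-offset readings."""
    return mean(observations)

def predict(offset, drift_per_step=0.0, steps=1):
    """Modeling & prediction: extrapolate the offset a few steps ahead."""
    return offset + drift_per_step * steps

def plan(predicted_offset):
    """Planning: desired correction back towards the lane center."""
    return -predicted_offset

def control(correction, gain=0.5, limit=1.0):
    """Controls: proportional steering command, clipped to the actuator limit."""
    return max(-limit, min(limit, gain * correction))

def modular_pipeline(observations):
    return control(plan(predict(estimate_state(observations))))

# End-to-end: one function maps observations directly to a control value.
def end_to_end(observations, weights):
    """f(x) as a linear model; real systems would learn the weights from data."""
    return sum(w * x for w, x in zip(weights, observations))

obs = [0.4, 0.6, 0.5]
staged = modular_pipeline(obs)
direct = end_to_end(obs, weights=[-1 / 6] * 3)  # hand-picked to mimic the pipeline
print(staged, direct)
```

Here the weights are chosen so that f(x) reproduces the staged result; in the end-to-end approach such a mapping is learned from recorded observations and controls instead of being engineered stage by stage.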
Machine Learning for Autonomous Driving
+ Sensor Fusion: clustering, segmentation, pattern recognition
+ Road: ego-motion, image processing and pattern recognition
+ Localization: simultaneous localization and mapping
+ Situation Understanding: detection and classification
+ Trajectory Planning: motion planning and control
+ Control Strategy: reinforcement and supervised learning
+ Driver Model: image processing and pattern recognition
Machine Learning Workflow
Ingest data -> Data Preprocessing -> Search, Analysis -> Model Training -> Model Testing -> Re-simulation -> Reports, Results -> Model Deployment
Train Test Loop: Training data -> Model Training, Test data -> Model Testing
Model Feedback Loop
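The train/test loop can be illustrated with a deliberately small example. This is a sketch under assumptions: the data is synthetic (a noisy line), the "model" is an ordinary least-squares line fit, and the 80/20 split is arbitrary.

```python
import random

def fit_line(points):
    """Model training: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx

def mse(model, points):
    """Model testing: mean squared error on held-out data."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in points) / len(points)

rng = random.Random(0)  # fixed seed so the run is reproducible
data = [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.1)) for x in range(50)]
rng.shuffle(data)
train, test = data[:40], data[40:]  # training data vs. test data

model = fit_line(train)
test_error = mse(model, test)
print(model, test_error)
```

In the workflow above, a poor test error would send the model back into training (the train/test loop), while results from re-simulation and deployment feed the outer model feedback loop.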
More Data + Bigger Models
[Figure, 1990s: accuracy versus scale (data size, model size) for neural networks and other approaches]
https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI
More Data + Bigger Models + More Computation
[Figure, now: accuracy versus scale (data size, model size) for neural networks and other approaches, with more compute]
https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI
Train and evaluate machine learning models at scale
Single machine -> Data center
How to run more experiments faster and in parallel?
How to share and reproduce research?
How to go from research to real products?
When to use Distributed Machine Learning
[Figure: Data Size versus Model Size, from single machine to data center]
+ Model parallelism: training very large models
+ Data parallelism: speeds up the training
+ Exploring several model architectures, hyperparameter optimization, training several independent models
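Data parallelism can be sketched in a few lines: each worker computes a gradient on its own shard of the data, and the averaged gradient drives one shared update. The example below is an illustration under assumptions, not the setup from the talk: the model is y ≈ w * x, the workers are simulated by a plain loop, and the shards have equal size (which makes the averaged gradient equal to the full-batch gradient).

```python
def gradient(w, batch):
    """Gradient of the mean squared error of y ≈ w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def data_parallel_step(w, shards, lr):
    """One synchronous step: per-shard gradients are averaged, then applied once."""
    grads = [gradient(w, shard) for shard in shards]  # would run on separate workers
    return w - lr * sum(grads) / len(grads)

data = [(float(x), 3.0 * x) for x in range(1, 9)]  # true weight is 3.0
shards = [data[0:4], data[4:8]]                    # two equally sized shards

# With equal shard sizes, averaging shard gradients matches the full-batch gradient.
full_grad = gradient(0.0, data)
sharded_grad = sum(gradient(0.0, s) for s in shards) / len(shards)

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
print(full_grad, sharded_grad, w)
```

Model parallelism, by contrast, splits the model itself across machines and is not shown here.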
Compute Workload for Training and Evaluation
[Figure: I/O-intensive versus compute-intensive workload, from single machine to data center]
I/O Workload for Simulation and Testing
[Figure: I/O-intensive versus compute-intensive workload, from single machine to data center]
Open Machine Learning Platform
Training & Test data
Compute + Network + Storage
Deploy model
ML Development & Catalog & REST API
ML-Specialists
Search
Analysis
Training
Evaluation
Re-Simulation
Testing
CaffeOnSpark
Sample Model Prediction Batch Regression Cluster
Dataset Correlation Centroid Anomaly Test Scores
✓ Mainly open source
✓ No vendor lock-in
✓ Scale-out architecture
✓ Multi-user support
✓ Resource management
✓ Job scheduling
✓ Speed-up training
✓ Speed-up simulation
ROS bag data structure
https://github.com/valtech/ros_hadoop
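As a rough illustration of the bag layout, the ROS bag format 2.0 starts with the line "#ROSBAG V2.0" followed by a sequence of records. Each record is a 4-byte little-endian header length, a header of length-prefixed name=value fields, a 4-byte data length, and the data. The sketch below parses that framing from an in-memory byte string. The helper names and the synthetic demo bag are assumptions for illustration; real bags begin with a padded bag header record and group messages into chunks, which this sketch does not handle.

```python
import struct

MAGIC = b"#ROSBAG V2.0\n"

def parse_header_fields(header):
    """Split a record header into {name: raw value bytes}."""
    fields, pos = {}, 0
    while pos < len(header):
        (field_len,) = struct.unpack_from("<I", header, pos)
        pos += 4
        name, _, value = header[pos:pos + field_len].partition(b"=")
        fields[name.decode("ascii")] = value
        pos += field_len
    return fields

def read_records(blob):
    """Yield (header fields, data) for every record after the magic line."""
    if not blob.startswith(MAGIC):
        raise ValueError("not a ROS bag v2.0 byte string")
    pos = len(MAGIC)
    while pos < len(blob):
        (header_len,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        header = parse_header_fields(blob[pos:pos + header_len])
        pos += header_len
        (data_len,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        yield header, blob[pos:pos + data_len]
        pos += data_len

def make_record(fields, data):
    """Hypothetical helper: build one record, used only for the demo below."""
    header = b"".join(
        struct.pack("<I", len(k) + 1 + len(v)) + k + b"=" + v
        for k, v in fields.items()
    )
    return struct.pack("<I", len(header)) + header + struct.pack("<I", len(data)) + data

# Synthetic single-record "bag"; op=0x02 marks a message data record.
demo_bag = MAGIC + make_record({b"op": b"\x02", b"conn": struct.pack("<I", 0)}, b"payload")
records = list(read_records(demo_bag))
print(records)
```

Because every record carries its own lengths, a reader can skip from record to record without decoding the payload, which is what makes the format splittable for parallel processing once record offsets are known.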
Hadoop InputFormat for ROS bags https://github.com/valtech/ros_hadoop
Search & Analysis
+ Hadoop InputFormat and Record Reader for Rosbag
+ Process Rosbag with Spark, Yarn, MapReduce, Hadoop Streaming API, …
+ Spark RDDs are cached and optimized for analysis
[Figure: Ros bag -> Record Reader -> RDD (Ros Msg) -> DataFrame, DataSet, SQL, Spark APIs, NumPy; Processing Engine for Advanced Analytics on Computer, Network, Storage]
Training & Evaluation
+ TensorFlow Record Reader
+ Protocol Buffers to serialize records
+ Saves time because no data conversion is needed
+ Saves storage because no data duplication is needed
[Figure: Ros bag -> Record Reader -> Ros msg -> Training Engine for Machine Learning on Computer, Network, Storage]
Re-Simulation & Testing
+ Use Spark for preprocessing, transformation, cleansing, aggregation, and time window selection before publishing to ROS topics
+ Use the Re-Simulation framework of choice to subscribe to the ROS topics
[Figure: Ros bag -> Engine -> publish -> Ros topic (core) -> subscribe -> Re-Simulation with framework of choice; on Computer, Network, Storage]
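The replay idea (select a time window, then publish to topics that a re-simulation subscribes to) can be sketched without Spark or ROS. The `TopicBus` class, the topic names, and the record layout below are all hypothetical stand-ins: the bus plays the role of the ROS topic layer and the list comprehension plays the role of the Spark preprocessing step.

```python
from collections import defaultdict

class TopicBus:
    """Minimal in-process publish/subscribe, standing in for ROS topics."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            callback(message)

def replay(bus, records, start, end):
    """Select records with start <= timestamp < end and publish them in time order."""
    selected = [r for r in records if start <= r[0] < end]
    for timestamp, topic, payload in sorted(selected, key=lambda r: r[0]):
        bus.publish(topic, (timestamp, payload))
    return len(selected)

# Recorded messages as (timestamp_s, topic, payload); deliberately out of order.
records = [
    (3.0, "/imu", "c"), (1.0, "/camera", "a"),
    (2.0, "/imu", "b"), (9.0, "/camera", "d"),
]

bus = TopicBus()
received = []
bus.subscribe("/imu", received.append)  # the re-simulation side

count = replay(bus, records, start=1.5, end=5.0)
print(count, received)
```

The subscriber only sees the messages inside the window, in timestamp order, and does not need to know whether they come from a live car or from a recorded bag.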
Time Travel
[Figure: timeline around time t with fold(left), fold(right), and reduce/shuffle]
3 Autonomous Driving
Architecture Building Blocks
http://www.bmw-carit.com/downloads/presentations/AutonomousDrivingNeedsROSScript.pdf
Hadoop InputFormat for ROS
License: Apache License 2.0
Download: https://github.com/valtech/ros_hadoop
Contact: jan.wiegelmann@valtech.de
Thank you
