2. Jan Wiegelmann
@janwgl
Data Analytics at Valtech
Data Science, Engineering
Distributed Deep Learning
Hadoop Ecosystem
Meetups in Munich
Robot Operating System
Big Data in Automotive
3. High-level Development Process for Autonomous Vehicles
Agenda
1 Collect sensor data
2 Model Engineering
3 Autonomous Driving
Diagram labels: Data Logger, Big Data, Data Center, Trained Model, Control Unit
4. High-level Development Process for Autonomous Vehicles
1 Collect sensor data
5. Sensors Udacity Lincoln MKZ
Camera: 3x Blackfly GigE Camera, 20 Hz
Lidar: Velodyne HDL-32E, 9.5 Hz
IMU: Xsens, 400 Hz
GPS: 2x fixed, 1 Hz
CAN bus: 1.1 kHz
Robot Operating System
Data: 3 GB per minute
https://github.com/udacity/self-driving-car
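To put the 3 GB per minute figure in perspective, the arithmetic below converts it into a sustained write rate and a per-drive volume. It is only a back-of-the-envelope sketch: decimal units (1 GB = 1000 MB) and the 8-hour recording session are assumptions, not figures from the slide.

```python
# Figure from the slide: the car records about 3 GB of sensor data per minute.
GB_PER_MINUTE = 3

# Assumption: decimal units, 1 GB = 1000 MB and 1 TB = 1000 GB.
mb_per_second = GB_PER_MINUTE * 1000 / 60

# Volume for one hour of driving.
gb_per_hour = GB_PER_MINUTE * 60

# Assumption: a hypothetical 8-hour recording session.
HOURS_PER_SESSION = 8
tb_per_session = gb_per_hour * HOURS_PER_SESSION / 1000

print(f"{mb_per_second:.0f} MB/s sustained")
print(f"{gb_per_hour} GB per hour")
print(f"{tb_per_session:.2f} TB per {HOURS_PER_SESSION}-hour session")
```

Even a single vehicle therefore fills terabyte-scale storage within a day, which is why the later slides move the processing to a data center.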
6. Robot Operating System
+ Popular open source robotics framework
+ Reliable distributed architecture
+ Wide use in the robotics research community
+ Huge selection of “off-the-shelf” software packages for hardware, algorithms, etc.
+ Used by Bosch, BMW, KUKA, Google, Siemens, etc.
https://roscon.ros.org/2015/presentations/ROSCon-Automated-Driving.pdf
7. Sensors Spec
Sensor     | blinding, sunlight, darkness | rain, fog, snow | non-metal objects | wind / high velocity | resolution | range | data
Ultrasonic | yes | yes | yes | no  | +   | +   | +
Lidar      | yes | no  | yes | yes | +++ | ++  | +
Radar      | yes | yes | no  | yes | ++  | +++ | +
Camera     | no  | no  | yes | yes | +++ | +++ | +++
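The sensor comparison above can be read as an argument for combining sensors. The sketch below encodes the four yes/no columns and checks which sensors remain usable under each condition. Reading "yes" as "works under this condition" is an assumption, since the slide does not define the entries; the variable names are illustrative.

```python
# Condition columns of the sensor table, in slide order.
CONDITIONS = [
    "blinding, sunlight, darkness",
    "rain, fog, snow",
    "non-metal objects",
    "wind / high velocity",
]

# Assumption: True means the sensor still works under that condition.
TABLE = {
    "Ultrasonic": [True, True, True, False],
    "Lidar": [True, False, True, True],
    "Radar": [True, True, False, True],
    "Camera": [False, False, True, True],
}

# For every condition, list the sensors that remain usable.
usable = {
    condition: [sensor for sensor, row in TABLE.items() if row[i]]
    for i, condition in enumerate(CONDITIONS)
}

# Sensors that work under all four conditions.
all_conditions = [sensor for sensor, row in TABLE.items() if all(row)]

for condition, sensors in usable.items():
    print(f"{condition}: {', '.join(sensors)}")
print("Works everywhere:", all_conditions or "none")
```

Under that reading no single sensor covers every condition, while each condition is covered by at least two sensors.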
8. High-level Development Process for Autonomous Vehicles
2 Model Engineering
11. AI history → Perceptron
1943 W. McCulloch, W. Pitts: “neuron” as logical element
1958 F. Rosenblatt: “Perceptron” model, neural networks
1969 M. Minsky, S. Papert: trigger first AI winter
Figures: feed-forward network; OR function vs. XOR function
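The OR and XOR figures on this slide refer to the classic limitation of a single-layer perceptron: it can learn linearly separable functions such as OR, but not XOR. The sketch below is a minimal illustration of that point using the standard perceptron learning rule. The function names, the zero initialization and the epoch count are choices made for this example, not taken from the slides.

```python
def predict(weights, bias, x):
    """Step activation on a weighted sum of two binary inputs."""
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0


def train_perceptron(samples, epochs=20):
    """Perceptron learning rule with integer weights, starting from zero."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights[0] += error * x[0]
            weights[1] += error * x[1]
            bias += error
    return weights, bias


def accuracy(samples, weights, bias):
    """Fraction of samples the trained perceptron classifies correctly."""
    hits = sum(predict(weights, bias, x) == target for x, target in samples)
    return hits / len(samples)


OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

or_acc = accuracy(OR_DATA, *train_perceptron(OR_DATA))
xor_acc = accuracy(XOR_DATA, *train_perceptron(XOR_DATA))

print("OR accuracy:", or_acc)
print("XOR accuracy:", xor_acc)
```

OR is learned perfectly, while XOR cannot reach full accuracy with any weights because no single line separates its two classes.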
12. AI history → AI winter
1943 W. McCulloch, W. Pitts: neuron as logical element
1958 F. Rosenblatt: Perceptron model, neural networks
1969 M. Minsky, S. Papert: trigger first AI winter
1980 Boom of expert systems: Q&A using logical rules, Prolog
1987-1993 Second AI winter: desktop computers, expensive LISP machines
1993-2011 Moore’s law, Deep Blue chess-playing, Stanford DARPA challenge
13. AI history
Chart: accuracy vs. scale (data size, model size) for neural networks and other approaches, 1990s
https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI
14. More Data + Bigger Models + More Computation
Chart: accuracy vs. scale (data size, model size) for neural networks and other approaches, now, with more compute
https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI
15. Machine Learning for Autonomous Driving
+ Sensor Fusion: clustering, segmentation, pattern recognition
+ Road: ego-motion, image processing and pattern recognition
+ Localization: simultaneous localization and mapping
+ Situation Understanding: detection and classification
+ Trajectory Planning: motion planning and control
+ Control Strategy: reinforcement and supervised learning
+ Driver Model: image processing and pattern recognition
16. Car data from sensors and bus traces
CAN, FlexRay, Camera, Radar, Lidar, IMU, etc.
Pre-select signals, aggregate and prepare for sending
Parse traces and signals (DBC, FIBEX, AUTOSAR, ...)
Receive signals, analysis, and machine learning
Real-time or batch analysis based on sensor data
publish/subscribe, real-time
Car Layer
Data Logger
Data Center
Realtime
Data Analytics
Real-time Analysis of car data
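The "parse traces and signals" step can be sketched as follows: a signal description, of the kind a DBC file provides, says where a value sits in the CAN payload and how to scale it. This is a simplified illustration that handles only unsigned little-endian (Intel byte order) signals. `SignalDef`, `decode_signal` and the `VehicleSpeed` definition are hypothetical names and values for this example, not part of any real DBC file or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalDef:
    """Hypothetical, simplified stand-in for one DBC signal definition."""
    name: str
    start_bit: int   # position of the least significant bit in the payload
    length: int      # number of bits
    factor: float    # physical = raw * factor + offset
    offset: float
    unit: str


def decode_signal(payload: bytes, sig: SignalDef) -> float:
    """Extract an unsigned little-endian signal and scale it to a physical value."""
    frame = int.from_bytes(payload, byteorder="little")
    raw = (frame >> sig.start_bit) & ((1 << sig.length) - 1)
    return raw * sig.factor + sig.offset


# Assumed example signal: 16 bits starting at bit 0, 0.01 km/h per count.
VEHICLE_SPEED = SignalDef("VehicleSpeed", 0, 16, 0.01, 0.0, "km/h")

# 8-byte CAN payload; the first two bytes hold 0x2710 = 10000 counts.
payload = bytes([0x10, 0x27, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00])
speed = decode_signal(payload, VEHICLE_SPEED)
print(f"{VEHICLE_SPEED.name} = {speed:.2f} {VEHICLE_SPEED.unit}")
```

A production parser would also need big-endian (Motorola) layouts, signed values and multiplexed signals, which this sketch leaves out.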
17. Train and evaluate machine learning models at scale
Single machine Data center
How to run more experiments faster and in parallel?
How to share and reproduce research?
How to go from research to real products?
18. Distributed Machine Learning
Axes: Data Size, Model Size (from single machine to data center)
Model parallelism: training very large models
Data parallelism: speeds up the training
Further workloads: exploring several model architectures, hyperparameter optimization, training several independent models
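Data parallelism can be sketched in a few lines: the batch is split into equal shards, each worker computes the gradient on its shard, and the shard gradients are averaged. The example below uses a one-parameter linear model and runs the "workers" sequentially. It is an illustration of the idea under these assumptions, not how any particular framework implements it.

```python
def gradient(w, batch):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_gradient(w, batch, n_workers):
    """Split the batch into equal shards, compute one gradient per shard, average."""
    if len(batch) % n_workers != 0:
        raise ValueError("batch size must be divisible by the number of workers")
    shard_size = len(batch) // n_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(n_workers)]
    # In a real cluster each shard gradient would be computed on a different node.
    shard_gradients = [gradient(w, shard) for shard in shards]
    return sum(shard_gradients) / n_workers


# Toy data for y = 3x, eight samples.
data = [(float(x), 3.0 * x) for x in range(1, 9)]

full = gradient(0.5, data)
parallel = data_parallel_gradient(0.5, data, n_workers=4)
print(full, parallel)
```

With equal shard sizes the averaged gradient matches the full-batch gradient, so the speed-up comes from computing the shards concurrently rather than from changing the result.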
19. Compute Workload for Training and Evaluation
Chart axes: I/O intensive, compute intensive; single machine, data center
20. I/O Workload for Simulation and Testing
Chart axes: I/O intensive, compute intensive; single machine, data center
21. Machine Learning Cycle
Data collection for training/test
1 Feature engineering (I/O workload)
Model development and architecture
2 Training and evaluation (compute workload)
3 Re-Simulation and testing (I/O workload)
Model deployment and versioning
Scaling and monitoring
Model tuning
22. Flux – Open Machine Learning Stack
Training & Test data
Compute + Network + Storage
Deploy model
ML Development & Catalog & REST API
ML-Specialists
Feature Engineering
Training, Evaluation (CaffeOnSpark)
Re-Simulation, Testing
Sample, Model, Prediction, Batch, Regression, Cluster
Dataset, Correlation, Centroid, Anomaly, Test, Scores
✓ Mainly open source
✓ No vendor lock-in
✓ Scale-out architecture
✓ Multi-user support
✓ Resource management
✓ Job scheduling
✓ Speed-up training
✓ Speed-up simulation
23. Flux – Open Machine Learning Stack
Feature
Engineering
24. Feature Engineering
+ Hadoop InputFormat and RecordReader for Rosbag
+ Process Rosbag with Spark, YARN, MapReduce, Hadoop Streaming API, …
+ Spark RDDs are cached and optimized for analysis
Diagram labels: Rosbag, ROS messages, RecordReader, RDD, Processing Engine, Advanced Analytics (RDD, DataFrame, Dataset, SQL, Spark APIs, NumPy); Compute, Network, Storage
25. ROS bag data structure
https://github.com/valtech/ros_hadoop
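As a rough illustration of what a record reader has to do, the sketch below walks the records of a bag in the ROS bag 2.0 layout: a magic line, then records made of a length-prefixed header (itself a sequence of length-prefixed `name=value` fields) followed by length-prefixed data. It is a simplified sketch and not the ros_hadoop implementation. The helper `make_record` and the tiny in-memory bag exist only for this demonstration, and real bags additionally pad the bag header record and wrap messages in chunks.

```python
import io
import struct

MAGIC = b"#ROSBAG V2.0\n"


def read_header_fields(header: bytes) -> dict:
    """Split a record header into its length-prefixed name=value fields."""
    fields, pos = {}, 0
    while pos < len(header):
        (field_len,) = struct.unpack_from("<I", header, pos)
        pos += 4
        name, _, value = header[pos:pos + field_len].partition(b"=")
        fields[name.decode("ascii")] = value
        pos += field_len
    return fields


def iter_records(stream):
    """Yield (header_fields, data) for every record in a bag stream."""
    if stream.read(len(MAGIC)) != MAGIC:
        raise ValueError("not a ROS bag v2.0 stream")
    while True:
        raw = stream.read(4)
        if len(raw) < 4:
            return  # end of stream
        (header_len,) = struct.unpack("<I", raw)
        fields = read_header_fields(stream.read(header_len))
        (data_len,) = struct.unpack("<I", stream.read(4))
        yield fields, stream.read(data_len)


def make_record(fields: dict, data: bytes) -> bytes:
    """Demo helper (hypothetical): serialize one record in the same layout."""
    header = b"".join(
        struct.pack("<I", len(name) + 1 + len(value)) + name.encode("ascii") + b"=" + value
        for name, value in fields.items()
    )
    return struct.pack("<I", len(header)) + header + struct.pack("<I", len(data)) + data


# Tiny synthetic bag: a bag header record (op 0x03) and one message record (op 0x02).
bag = (
    MAGIC
    + make_record({"op": b"\x03"}, b"")
    + make_record({"op": b"\x02", "conn": struct.pack("<I", 0)}, b"payload")
)
records = list(iter_records(io.BytesIO(bag)))
for fields, data in records:
    print(fields["op"], len(data))
```

Because every record carries its own lengths, a reader can skip ahead without decoding message payloads, which is what makes splitting a bag across parallel workers feasible.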
27. Flux – Open Machine Learning Stack
Training
Evaluation
CaffeOnSpark
28. Training & Evaluation
+ TensorFlow ROSRecordDataset
+ Protocol Buffers to serialize records
+ Saves time because no data conversion is needed
+ Saves storage because no data duplication is needed
Diagram labels: Rosbag, ROS messages, ROS Dataset, Training Engine, Machine Learning; Compute, Network, Storage
29. Flux – Open Machine Learning Stack
Re-Simulation
Testing
30. Re-Simulation & Testing
+ Use Spark for preprocessing, transformation, cleansing, aggregation, and time window selection before publishing to ROS topics
+ Use the Re-Simulation framework of choice to subscribe to the ROS topics
Diagram labels: Rosbag, Engine, publish, ROS core, ROS topic, subscribe, Re-Simulation with framework of choice; Compute, Network, Storage
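The publish/subscribe flow described on this slide can be sketched without a ROS installation. In the example below a tiny in-process `TopicBus` stands in for ROS topics, and `replay_window` performs the time window selection before publishing. Both names are hypothetical and the sketch does not use the ROS client API; it only illustrates the pattern.

```python
from collections import defaultdict


class TopicBus:
    """Minimal in-process stand-in for ROS topics (not the ROS API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            callback(message)


def replay_window(bus, records, start, end):
    """Publish all records with start <= timestamp < end, in time order."""
    selected = sorted(
        (record for record in records if start <= record[0] < end),
        key=lambda record: record[0],
    )
    for timestamp, topic, message in selected:
        bus.publish(topic, (timestamp, message))
    return len(selected)


# Recorded data as (timestamp in seconds, topic, message); values are made up.
records = [
    (0.0, "/camera", "frame0"),
    (1.5, "/imu", "imu0"),
    (0.5, "/camera", "frame1"),
    (3.0, "/camera", "frame2"),
]

received = []
bus = TopicBus()
bus.subscribe("/camera", received.append)  # the re-simulation side subscribes

published = replay_window(bus, records, start=0.0, end=2.0)
print(published, received)
```

The subscriber only sees camera frames inside the selected window, which mirrors how a re-simulation framework would consume a preprocessed slice of a recording.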
32. High-level Development Process for Autonomous Vehicles
3 Autonomous Driving
34. Flux – Open Machine Learning Stack
+ Native format support, e.g. rosbags (Robot Operating System)
+ End-to-end machine learning pipeline
+ Layered API (provisioning, operating, processing, storage)
+ Optimized for scale-out based on cost, time, space
+ One-click on-premises and cloud deployment
+ Apache License 2.0 – release Q4/2017
+ http://flux-project.org