Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonomous Driving

Gheorghe Pucea, BMW Group
Jennifer Reinelt, BMW Group
#UnifiedDataAnalytics #SparkAISummit

Outline
• Evaluation of Lane Detection
• Evaluation Pipeline
• AI Based Ground Truth
• Lessons Learned

BMW AUTONOMOUS DRIVING
Car Setup for Autonomous Driving

Evaluation of Lane Detection
How well does the car detect the lane markings?
[Figure: real vs. detected lane markings, compared at 1 m, 50 m, 100 m and 150 m ahead of the car]

Key Performance Indicator (KPI) – Lateral Offset
[Figure: lateral offset at 150 m plotted over functional development time (commits 70d9c31, c271a01, 4e0bcd3, 6e3bcd3), annotated "improvement"]
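
One way to read the lateral offset KPI is as the lateral distance between the detected and the real lane marking at fixed look-ahead distances. The sketch below assumes both markings are available as polylines of (x, y) points in vehicle coordinates, with x the distance ahead and y the lateral position in meters. The function names and the sample data are hypothetical and are not taken from the talk.

```python
from bisect import bisect_left

def lateral_at(polyline, x):
    """Linearly interpolate the lateral position y at longitudinal distance x.

    polyline: list of (x, y) points sorted by x. Returns None outside its range.
    """
    if not polyline or x < polyline[0][0] or x > polyline[-1][0]:
        return None
    xs = [p[0] for p in polyline]
    i = bisect_left(xs, x)          # first point with xs[i] >= x
    if xs[i] == x:
        return polyline[i][1]
    (x0, y0), (x1, y1) = polyline[i - 1], polyline[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def lateral_offsets(detected, ground_truth, distances=(1, 50, 100, 150)):
    """Absolute lateral offset per look-ahead distance (None if not covered)."""
    result = {}
    for d in distances:
        y_det = lateral_at(detected, d)
        y_gt = lateral_at(ground_truth, d)
        result[d] = None if y_det is None or y_gt is None else abs(y_det - y_gt)
    return result

# Hypothetical data: a straight real marking and a detection that drifts with distance.
ground_truth = [(0, 1.75), (200, 1.75)]
detected = [(0, 1.75), (100, 1.85), (200, 2.15)]
offsets = lateral_offsets(detected, ground_truth)
```

Tracking `offsets[150]` per commit would give a curve like the one on the slide; how the production KPI aggregates over frames and drives is not stated in the deck.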

Challenges:
• Where are the real lane markings? How do we get the ground truth?
• How do we avoid making the same mistakes as the car when looking for real lane markings?
• How do we scale this ground truth generation?

How do we get the ground truth?
• From manual labels: very accurate, but manual, slow, expensive to scale up, and bad for occlusions
• From additional sensors: automated, fast and accurate, but expensive to scale up
• Using sophisticated algorithms in the backend: scalable, automated, fast and cheap, but lower accuracy

Evaluation Pipeline
Datacenter:
• > 230 PB capacity and > 1,500 TB raw data/day
• > 100,000 cores and > 200 GPUs
[Diagram: pipeline stages Data Collection, Data Ingestion, Ros Converter, Reprocessing, Ground Truth Generation, KPI Calculation and InfluxDB; data formats Ros bag and ORC; other applications run in the same datacenter]

AI Based Ground Truth
• 3D lidar point clouds
• Lidar intensity in 2D bird's eye view
• Semantic segmentation with a deep neural network
• Output classes: Lane Marking / No Lane Marking
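
The "lidar intensity in 2D bird's eye view" step can be pictured as rasterizing the point cloud onto a ground-plane grid. The following is a minimal sketch under assumptions that are not from the talk: points arrive as (x, y, z, intensity) tuples in vehicle coordinates, the grid extent and cell size are arbitrary, and each cell keeps the maximum intensity it sees. The segmentation network that consumes such an image is not shown.

```python
def intensity_bev(points, x_range=(0.0, 150.0), y_range=(-10.0, 10.0), cell=0.5):
    """Rasterize lidar points into a bird's eye view intensity grid.

    points: iterable of (x, y, z, intensity); z is ignored for the projection.
    Returns a list of rows (indexed along x) of columns (indexed along y).
    """
    n_rows = int((x_range[1] - x_range[0]) / cell)
    n_cols = int((y_range[1] - y_range[0]) / cell)
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for x, y, _z, intensity in points:
        # Drop points outside the region of interest.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Clamp to guard against rounding at the far edge of the grid.
        row = min(int((x - x_range[0]) / cell), n_rows - 1)
        col = min(int((y - y_range[0]) / cell), n_cols - 1)
        grid[row][col] = max(grid[row][col], intensity)
    return grid

# Hypothetical points: two fall into the same cell, one lies outside the grid.
points = [(10.2, -1.6, 0.0, 0.9), (10.3, -1.7, 0.1, 0.4), (200.0, 0.0, 0.0, 1.0)]
bev = intensity_bev(points)
```

Lane paint is typically more reflective than asphalt, which is why an intensity image is a natural input for a per-pixel Lane Marking / No Lane Marking classifier.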

Motivation of Lessons Learned
Source: https://twitter.com/bigdataborat?lang=en

Motivation of Lessons Learned
Source: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf

Lessons Learned – Spark Testing

Typical integration test

Drawback of static ORC files committed to the source code

Test data generation library
• Type classes
• cats

Using test data generation library for integration tests
Cats FlatMap Type Class
Scalacheck generators available with Type Classes

Sensor data streams as Scala ADT

Example type class for generating CAN messages

Implementing cats.FlatMap type class

Lessons Learned – Testing
Advantages of using code instead of static Orc files
• Compiler helps with breaking changes
• Improves test understandability
• Flexible manipulation of data using monadic operations
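
The talk builds its generators in Scala with cats and Scalacheck; the code from those slides is not in this transcript. As a language-neutral illustration of "flexible manipulation of data using monadic operations", here is a small Python analogue. The `Gen` class, the `CanMessage` fields and all helper names are hypothetical and only mimic the idea of composing generators with map and flat_map.

```python
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class CanMessage:
    # Hypothetical record; the talk's actual ADT is not shown in the transcript.
    bus_id: int
    timestamp_ns: int
    payload: bytes

class Gen:
    """A generator wraps a function from a random source to a value."""
    def __init__(self, run):
        self.run = run

    def map(self, f):
        return Gen(lambda rng: f(self.run(rng)))

    def flat_map(self, f):
        # Run this generator, then run the generator that f builds from its value.
        return Gen(lambda rng: f(self.run(rng)).run(rng))

    def sample(self, seed=0):
        return self.run(random.Random(seed))

def choose(lo, hi):
    return Gen(lambda rng: rng.randint(lo, hi))

def list_of(n, gen):
    return Gen(lambda rng: [gen.run(rng) for _ in range(n)])

# Compose field generators into a message generator.
can_message = choose(0, 3).flat_map(
    lambda bus: choose(0, 10**9).flat_map(
        lambda ts: choose(0, 255).map(
            lambda byte: CanMessage(bus, ts, bytes([byte])))))

# A test that needs only bus 2 overrides one field in code
# instead of committing another static file.
bus2_stream = list_of(5, can_message.map(lambda m: replace(m, bus_id=2)))
messages = bus2_stream.sample(seed=42)
```

Because the data is described in code, renaming or retyping a field breaks the generator at build or test time rather than silently leaving a stale binary file behind, and the seed keeps the test reproducible.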

Lessons Learned – Catalyst Optimizations
[Diagram: evaluation pipeline (Data Collection, Data Ingestion, Ros Converter, Ground Truth Generation, InfluxDB), annotated "RDD"]

We wanted to measure the impact of converting between RDD, Dataset and DataFrame:
• Test with 1 GB of Flexray data, ~20 runs per experiment
• Count the data
• Filter data by specific busId

Running count on ~1 GB of Flexray data
[Bar chart: processing time (s), axis 0 to 350, for RDD, Dataset and DataFrame]

How about filtering by busId before counting?
[Bar chart: processing time (s), axis 0 to 350, for RDD, Dataset Typed, Dataset Untyped and DataFrame]

Running "explain" on Dataset yields:
[Two physical plans side by side: Dataset Untyped API (left), Dataset Typed API (right)]

Which version is applying push down filters?
a) left
b) right
c) both
d) none

busIds: Array[Long], but busId is of type Int

Lessons Learned – Optimizations
Catalyst optimizations
• Types matter for push down filters
• Conversion between Dataset Typed and Untyped API might hurt performance
• Always check assumptions by looking at metrics/physical execution plan

Lessons Learned – Spark Configuration
[Diagram: evaluation pipeline, annotated "> 1GB", "be available fast", "be sorted"]

Adding a feature to the rosbag converter for writing bags > 1 GB resulted in:
• Increased processing time
• shuffle.FetchFailedException
Spark UI showed:
• Lots of RACK_LOCAL tasks
• Tasks taking a long time

Spark locality parameters

Tuning Spark locality yields improved processing time
[Bar charts, old config vs. optimized Spark locality: number of RACK_LOCAL tasks (axis 0 to 40) and processing time in s (axis 0 to 450); labels "100%" and "20%"; ~140 GB image data, ~20 runs]

Tuning shuffling parameters, spark.reducer.maxReqsInFlight
[Bar chart: failed tasks (axis 0 to 4), old config vs. optimized maxReqsInFlight; label "40%"]

Lessons Learned – Configuration
Writing controlled-size files from Spark:
• Pay attention to data locality
• Writing controlled-size files is hard
• Tuning Spark configuration properly yields surprising results

Summary
• KPIs on lane marking detection
• DNN for lidar based lane detection
• Tips for testing, configuring and optimizing Spark

Video
https://youtu.be/wNAmxL25Bhk

Thank you for listening!
