A First Look at the Integration of Machine Learning
Models in Complex Autonomous Driving Systems
A Case Study on Apollo
Tse-Hsun (Peter) Chen, Lei Ma, Jinqiu Yang, Zi Peng
Health care, self-driving cars, social media, finance
AI-powered systems are everywhere
Software engineering plays a key role in
AI-powered systems
AI/ML is only a small part
of AI-powered systems
[Sculley, NeurIPS’15]
Software engineering is crucial to ensure the
functionality and quality of AI-powered systems
Software engineering
researcher
AI-powered systems
Studying ML model integration with source code in
self-driving car systems
Large and complex AI-powered systems
Failures would have life-critical
consequences
Strong interest due to high market value
Our goal is to understand how ML models interact
with each other and how they are integrated with code,
opening new avenues for future research and practice
on the quality assurance of self-driving car systems.
A case study on Apollo 5.0: an open-source
self-driving car system
35K files 566 KLOC
279 files 36 KLOC
251 files 19 KLOC (Proto Buffer)
Apollo is now providing limited services to
customers
How are ML models used
and integrated with code?
Studying usage and integration of ML models in
the code
How well tested is
ML-related code?
Manually studying ML model usages and system
architecture
Manual analysis
Search for ML model files based on extensions
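The extension-based search can be sketched roughly as follows. This is a minimal illustration, not the procedure used in the study: the extension list in `MODEL_EXTENSIONS` and the function names are assumptions, since the slides do not say which extensions were searched.

```python
from collections import Counter
from pathlib import Path

# Hypothetical list of ML model file extensions (an assumption for
# illustration; the slides do not list the extensions that were used).
MODEL_EXTENSIONS = {".caffemodel", ".pt", ".pth", ".onnx", ".pb"}


def is_model_file(name):
    """Return True if the file name ends with a known model extension."""
    return Path(name).suffix.lower() in MODEL_EXTENSIONS


def find_model_files(root):
    """Recursively collect candidate model files under a source tree."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and is_model_file(p.name)
    )


def count_by_extension(names):
    """Count candidate model files per extension."""
    return Counter(Path(n).suffix.lower() for n in names if is_model_file(n))
```

The candidates found this way would still need the manual analysis step, since an extension alone does not prove that a file is a model that the code actually loads.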
A simplified overview of Apollo ML components
Traffic light
recognition
Lane
detection
Obstacle
detection
Trajectory
prediction
4 ML models
4 ML models
3 ML frameworks
9 ML models
3 ML frameworks
4 sensor sources
14 ML models
9 program points
Apollo relies on multiple information sources and ML
models to make decisions. A single task may even
involve multiple ML models, which complicates the
integration between models and source code.
ML models are interconnected in Apollo
The outputs of some ML models are
used as inputs to other ML models.
ML model outputs are used to post-process,
or are combined with, the outputs of
other ML models.
ML models are chosen based on the
current scenario.
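The three interconnection patterns above can be illustrated with a toy pipeline. All model functions, scenario names, and numbers here are hypothetical stand-ins, not Apollo code; the sketch only shows how chaining, combining, and scenario-based selection end up as ordinary code paths around the models.

```python
from typing import Callable, Dict, List

Vector = List[float]
Model = Callable[[Vector], Vector]


def camera_detector(frame: Vector) -> Vector:
    # Stand-in for a camera-based detection model.
    return [2.0 * v for v in frame]


def lidar_detector(frame: Vector) -> Vector:
    # Stand-in for a LiDAR-based detection model.
    return [v + 1.0 for v in frame]


def fuse(a: Vector, b: Vector) -> Vector:
    # Combining: merge the outputs of two models (element-wise mean).
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def cruise_predictor(obstacles: Vector) -> Vector:
    # Stand-in for a prediction model used in one scenario.
    return [v + 10.0 for v in obstacles]


def junction_predictor(obstacles: Vector) -> Vector:
    # Stand-in for a prediction model used in another scenario.
    return [v - 10.0 for v in obstacles]


# Scenario-based selection: which model runs depends on the situation.
PREDICTORS: Dict[str, Model] = {
    "cruise": cruise_predictor,
    "junction": junction_predictor,
}


def predict(frame: Vector, scenario: str) -> Vector:
    obstacles = fuse(camera_detector(frame), lidar_detector(frame))
    predictor = PREDICTORS[scenario]
    # Chaining: the fused detection output is the predictor's input.
    return predictor(obstacles)
```

Even in this toy form, a test of `predict` exercises several models and the glue code between them at once, which is the integration surface the study looks at.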
Developers provide some safety
nets for ML model outputs
• Filtering out invalid outputs based on
heuristics
– E.g., the area of the detected traffic light
is negative.
• Using algorithms to correct ML model outputs
– E.g., considering time-sensitive constraints
to revise the detected traffic light color.
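Both kinds of safety net can be sketched in a few lines. This is an illustrative assumption, not Apollo's implementation: the `Box` type, the width/height check (which rules out negative-area boxes), and the "accept a new color only after it persists for `min_frames` frames" rule are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def is_valid(box):
    """Heuristic filter: reject boxes with non-positive width or height."""
    return (box.x_max - box.x_min) > 0 and (box.y_max - box.y_min) > 0


def filter_boxes(boxes):
    """Keep only detections that pass the heuristic check."""
    return [b for b in boxes if is_valid(b)]


class ColorReviser:
    """Revise per-frame colors using a simple temporal constraint.

    Hypothetical rule: a newly detected color replaces the current one
    only after it has been seen in `min_frames` consecutive frames.
    """

    def __init__(self, min_frames=2):
        self.min_frames = min_frames
        self.stable = "unknown"
        self.candidate = None
        self.count = 0

    def update(self, detected):
        if detected == self.stable:
            # Detection agrees with the current color: drop any candidate.
            self.candidate, self.count = None, 0
        elif detected == self.candidate:
            self.count += 1
        else:
            self.candidate, self.count = detected, 1
        if self.candidate is not None and self.count >= self.min_frames:
            self.stable = self.candidate
            self.candidate, self.count = None, 0
        return self.stable
```

With `min_frames=2`, a single-frame flicker from red to green and back is ignored, while two consecutive green frames switch the reported color.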
Most existing research focuses only on unit testing
individual ML models. More research is needed on
testing and quality assurance of model integration.
How are ML models used
and integrated with code?
Studying usage and integration of ML models in
the code
How well tested is
ML-related code?
The ML models are
configurable, interconnected,
and safeguarded by code logic.
Running Apollo in a simulator
• Profile Apollo to monitor method
execution in ML-related components.
• Run simulations using data
recorded from real test drives,
including camera and LiDAR data.
35.5% of the methods in ML-related components are
executed, while only 3.5% of the methods in other
components are executed.
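The percentages above are ratios of executed methods to all methods in a group of components. A minimal sketch of that calculation, assuming the profiler yields a set of executed method names (the function name and the example method names are hypothetical):

```python
def executed_ratio(all_methods, executed_methods):
    """Percentage of a component's methods that were executed.

    Executed names that are not part of the component are ignored.
    """
    all_set = set(all_methods)
    if not all_set:
        return 0.0
    hit = all_set & set(executed_methods)
    return 100.0 * len(hit) / len(all_set)


# Example: 1 of 4 methods in a component ran during the simulation.
component = ["Detect", "Track", "Init", "Reset"]
profile = ["Detect", "Plan"]
ratio = executed_ratio(component, profile)
```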
CI test coverage of code components
ML-related
More testing effort is needed to improve
the CI test coverage in Apollo.
How are ML models used
and integrated with code?
Studying usage and integration of ML models in
the code
How well tested is
ML-related code?
The ML models are
configurable, interconnected,
and safeguarded by code logic.
Despite its importance and
frequent execution, ML-related code
has test coverage with
room for improvement.
Tse-Hsun (Peter) Chen
https://petertsehsun.github.io
