AN AIRBUS INNOVATION CENTER
MLOps: Managing Data and
Workflows for Efficient
Model Development and
Deployment
Konstantinos Balafas – Head of AI Data
Carlo Dal Mutto – Director of Engineering
Acubed by Airbus
AN AIRBUS INNOVATION CENTER
ML Ops and Data Ops
AN AIRBUS INNOVATION CENTER
System Productization (MLOPS)
3
ML model
ML training
infra
Dev-ops
Data
labelling
MLOPS
Discipline that comprises a set of tools and principles to support progress through the lifecycle of a ML project
ML
deployment
Data
collection
Data
labelling Data
storage Data
verification
Pre/post
processing
Configuration
management
Process
management
Deployment
infrastructure
Monitoring
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
ML Project Lifecycle
4
Scoping Data Modeling Deployment
Define
project
Define data
and
establish
baseline
Label
and
organize
data
Select and
train model
Perform
data
analysis
Deploy in
production
Monitor
and
maintain
system
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
Understand if there is business value
● Fundamental questions:
○ Is the project providing any business value?
○ What system performances would make the business viable?
■ System requirements
Understand if the problem is technically achievable
● Establish a baseline and confirm that system requirements can be achieved
○ Human level performance (HLP)
○ Literature search
○ Quick and dirty implementation
If HLP is not satisfactory, improve the considered sensors or the instruction to the humans
Project Scoping
5
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
Data are first-class citizens in data-centric AI
New trend: big data → good data
Which data
● Raw vs processed data
● Metadata
○ Provenance: where the data came from
○ Lineage: which are the processing steps applied to the data
Data pipeline
Data
6
Labelling
Cleaning
Engineering
Raw data Partitioning
Dev. dataset
Test dataset
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
Data pipeline
Data
7
Labelling
Cleaning
Engineering
Raw data Partitioning
Dev. dataset
Test dataset
Diversity is more
important than quantity
Need to be deliberate
about new data
Leverage already
collected data in an
iterative fashion for
data driven decisions
Core ETL Tools and
Processes
Identify data that
cannot or should not be
used downstream
Validate quality of Data
Acquisition (time
syncing, outliers, etc.)
Potentially engineer ML
features
In-house manual
labeling typically yields
highest quality but is
not scalable
Algorithmic solutions
exist
ML models
Data labeling vendors
QA is paramount
Partitioning strategy
should be
requirements-driven
Prevent target leakage
Ensure that models are
evaluated on a
representative dataset
Dataset versioning is
critical
Several code version
control-inspired
solutions
Raw dataset or
metadata-driven
versioning
Many high-quality
vendors exist
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
Data pipeline - Key challenges
Data
8
Labelling
Cleaning
Engineering
Raw data Partitioning
Dev. dataset
Test dataset
Costly Significant upfront
investment and
ongoing maintenance
Difficult balance
between scalability
and quality
Dataset lineage
becomes very
complex
While automation steps are possible, it is very difficult to remove manual steps entirely from this loop
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
Model-centric AI development → data-centric AI development
Model development preliminary steps
1. Define development set, then split it into training and validation set
2. Define business-centric and model-centric metrics
○ Account for data distribution artifacts (skewness, unbalancing, fairness…)
Model development main steps
Modeling
9
Get an initial
dataset
Feature
engineering
(normalization,
dimensionality
reduction, etc.)
Data
augmentation
Architecture
exploration
Finetune model
Efficient
train + deploy!
Start simple! Squeeze more out
your data!
Be creative! Back to
business!
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
Types of deployment
• New product/problem capability
• Automate/assist human
• Replace a previous model/pipeline
Deployment
10
Human only
Monitoring
● Define which metrics should be tracked during deployment
● Implement monitoring dashboards (manage under- vs over-monitoring)
● Pipeline monitoring. For each component monitor: input metrics, output metrics, software metrics
Gradual deployment → progressively increase amount of automation
Shadow mode AI assistance
Partial
automation
Full
automation
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
Post-Deployment
11
Metrics on input data
(concept shift)
Metrics on model
performance (require
ground truth)
Decision point
implies continuous
monitoring and
event-triggered
outer loop
Avoid catastrophic
forgetting
Training data should
represent operational
conditions
Dataset versioning
is important to
ensure reliable
model evaluation
and avoid target
leakage
Necessary in the
absence of ground
truth
Diversity is key
New data need to
“look and feel”
like old data
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
Wayfinder
AN AIRBUS INNOVATION CENTER
Project Wayfinder
13
AN AIRBUS INNOVATION CENTER
• Develop a vision-based perception system for autonomous flight
• Development streams
• Sensing development (cameras, radars, INSs)
• SW development
• Perception development (including ML)
Problem Statement
14
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
• Goal: vision-based equivalent of the Instrument Landing System (ILS)
• INPUT: images, database of airport locations
• OUTPUT: position of the aircraft wrt the runway
Autonomous Landing
15
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
ML Ops and Data Ops at Wayfinder
AN AIRBUS INNOVATION CENTER
ML Ops and Data Ops
Data Ops
● Image quality analysis
● Image annotation
● Dataset/label versioning
● Data consumption and “insights”
ML Ops
● Train/validate/test split
● Model versioning
○ efficient experiment tracking
○ Verification & Validation
● Model monitoring
17
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
● Need to determine whether an image
is suitable for training (or testing!)
before an image is included in an
ML-bound dataset
● Leveraging “classical” CV techniques
(e.g. brightness, contrast)
● Exploit model predictions/confidence
as an additional dimension
Image Quality Analysis
18
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
● Requirements
○ Need to calculate ground truth for model training
○ Very high accuracy requirements
○ Need to scale to millions of images
● Solution
○ Leverage an ensemble of:
■ Classical CV techniques
■ Machine learning models
■ Human annotation
○ Establish both a per-image and per-algorithm “quality” metric
○ Develop a dense 3D database of airport features
Image Annotation
19
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
Image Annotation Pipeline
20
Image
collection
Selection of
images to
manually
label
Outsource
manual
labeling
Label audit
Labeling
using CV
techniques
Labeling
using ML
techniques
Label
quality
metrics
Label fusion ML training
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
● Data versioning is important for Verification and Validation of our autonomous flight system
○ Images used for training different ML versions and modules
○ Labels denoting ground truth
● Leveraging logging of runs of the aforementioned pipeline to ensure traceability of our
labels
● Development of a similar “delivery” pipeline for versioning datasets
○ Consider datasets “immutable”
Dataset/Label Versioning
21
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
● Collected data has value beyond ML
● Enable data-driven decisions across Wayfinder
○ Data labeling
○ Future data collection
○ Systems-level requirements
● Developing purpose-built dashboards
Data Consumption and Insights
22
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
● Train/validate/test split
○ Account for sequential nature of data
○ Ensure similar distributions of key
features across sets
○ Ensure sufficient coverage across
operational envelope
● Model versioning
○ Efficient experiment tracking
○ V&V efforts
ML Ops: Pre-Deployment
23
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
● Current state
○ Model is evaluated on a per-project pre-defined test set that has been curated to meet system
requirements
● Monitoring Constraints
○ Ground truth is unavailable “online”
○ Expected large volumes of diverse data arriving in multiple batches
● Moving forward
○ Establish an annotation and evaluation strategy
○ Objective is to maximize diversity of a continuously growing evaluation set based on
■ Image content
■ Key metadata “dimensions” (e.g. airports, weather conditions, aircraft position, etc.)
■ Current model performance (and/or model confidence on different predictions)
ML Ops: Post-Deployment
24
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
There are several operational and conceptual challenges associated with ML and Data Ops
● Labeling
● Data curation
● Monitoring
We try to address them by:
● Breaking down the data life-cycle in semantic stages
● Organizing our code in modular pipelines
● Leveraging both classical and ML-based Computer Vision algorithms
● Automating the different pipeline steps to the extent possible
Conclusions
25
© 2022 Airbus
AN AIRBUS INNOVATION CENTER
Thank you!
AN AIRBUS INNOVATION CENTER
We are hiring!
acubed.airbus.com/careers

“MLOps: Managing Data and Workflows for Efficient Model Development and Deployment,” a Presentation from Airbus

  • 1.
    AN AIRBUS INNOVATIONCENTER MLOps: Managing Data and Workflows for Efficient Model Development and Deployment Konstantinos Balafas – Head of AI Data Carlo Dal Mutto – Director of Engineering Acubed by Airbus
  • 2.
    AN AIRBUS INNOVATIONCENTER ML Ops and Data Ops
  • 3.
    AN AIRBUS INNOVATIONCENTER System Productization (MLOPS) 3 ML model ML training infra Dev-ops Data labelling MLOPS Discipline that comprises a set of tools and principles to support progress through the lifecycle of a ML project ML deployment Data collection Data labelling Data storage Data verification Pre/post processing Configuration management Process management Deployment infrastructure Monitoring © 2022 Airbus
  • 4.
    AN AIRBUS INNOVATIONCENTER ML Project Lifecycle 4 Scoping Data Modeling Deployment Define project Define data and establish baseline Label and organize data Select and train model Perform data analysis Deploy in production Monitor and maintain system © 2022 Airbus
  • 5.
    AN AIRBUS INNOVATIONCENTER Understand if there is business value ● Fundamental questions: ○ Is the project providing any business value? ○ What system performances would make the business viable? ■ System requirements Understand if the problem is technically achievable ● Establish a baseline and confirm that system requirements can be achieved ○ Human level performance (HLP) ○ Literature search ○ Quick and dirty implementation If HLP is not satisfactory, improve the considered sensors or the instruction to the humans Project Scoping 5 © 2022 Airbus
  • 6.
    AN AIRBUS INNOVATIONCENTER Data are first-class citizens in data-centric AI New trend: big data → good data Which data ● Raw vs processed data ● Metadata ○ Provenance: where the data came from ○ Lineage: which are the processing steps applied to the data Data pipeline Data 6 Labelling Cleaning Engineering Raw data Partitioning Dev. dataset Test dataset © 2022 Airbus
  • 7.
    AN AIRBUS INNOVATIONCENTER Data pipeline Data 7 Labelling Cleaning Engineering Raw data Partitioning Dev. dataset Test dataset Diversity is more important than quantity Need to be deliberate about new data Leverage already collected data in an iterative fashion for data driven decisions Core ETL Tools and Processes Identify data that cannot or should not be used downstream Validate quality of Data Acquisition (time syncing, outliers, etc.) Potentially engineer ML features In-house manual labeling typically yields highest quality but is not scalable Algorithmic solutions exist ML models Data labeling vendors QA is paramount Partitioning strategy should be requirements-driven Prevent target leakage Ensure that models are evaluated on a representative dataset Dataset versioning is critical Several code version control-inspired solutions Raw dataset or metadata-driven versioning Many high-quality vendors exist © 2022 Airbus
  • 8.
    AN AIRBUS INNOVATIONCENTER Data pipeline - Key challenges Data 8 Labelling Cleaning Engineering Raw data Partitioning Dev. dataset Test dataset Costly Significant upfront investment and ongoing maintenance Difficult balance between scalability and quality Dataset lineage becomes very complex While automation steps are possible, it is very difficult to remove manual steps entirely from this loop © 2022 Airbus
  • 9.
    AN AIRBUS INNOVATIONCENTER Model-centric AI development → data-centric AI development Model development preliminary steps 1. Define development set, then split it into training and validation set 2. Define business-centric and model-centric metrics ○ Account for data distribution artifacts (skewness, unbalancing, fairness…) Model development main steps Modeling 9 Get an initial dataset Feature engineering (normalization, dimensionality reduction, etc.) Data augmentation Architecture exploration Finetune model Efficient train + deploy! Start simple! Squeeze more out your data! Be creative! Back to business! © 2022 Airbus
  • 10.
    AN AIRBUS INNOVATIONCENTER Types of deployment • New product/problem capability • Automate/assist human • Replace a previous model/pipeline Deployment 10 Human only Monitoring ● Define which metrics should be tracked during deployment ● Implement monitoring dashboards (manage under- vs over-monitoring) ● Pipeline monitoring. For each component monitor: input metrics, output metrics, software metrics Gradual deployment → progressively increase amount of automation Shadow mode AI assistance Partial automation Full automation © 2022 Airbus
  • 11.
    AN AIRBUS INNOVATIONCENTER Post-Deployment 11 Metrics on input data (concept shift) Metrics on model performance (require ground truth) Decision point implies continuous monitoring and event-triggered outer loop Avoid catastrophic forgetting Training data should represent operational conditions Dataset versioning is important to ensure reliable model evaluation and avoid target leakage Necessary in the absence of ground truth Diversity is key New data need to “look and feel” like old data © 2022 Airbus
  • 12.
    AN AIRBUS INNOVATIONCENTER Wayfinder
  • 13.
    AN AIRBUS INNOVATIONCENTER Project Wayfinder 13
  • 14.
    AN AIRBUS INNOVATIONCENTER • Develop a vision-based perception system for autonomous flight • Development streams • Sensing development (cameras, radars, INSs) • SW development • Perception development (including ML) Problem Statement 14 © 2022 Airbus
  • 15.
    AN AIRBUS INNOVATIONCENTER • Goal: vision-based equivalent of the Instrument Landing System (ILS) • INPUT: images, database of airport locations • OUTPUT: position of the aircraft wrt the runway Autonomous Landing 15 © 2022 Airbus
  • 16.
    AN AIRBUS INNOVATIONCENTER ML Ops and Data Ops at Wayfinder
  • 17.
    AN AIRBUS INNOVATIONCENTER ML Ops and Data Ops Data Ops ● Image quality analysis ● Image annotation ● Dataset/label versioning ● Data consumption and “insights” ML Ops ● Train/validate/test split ● Model versioning ○ efficient experiment tracking ○ Verification & Validation ● Model monitoring 17 © 2022 Airbus
  • 18.
    AN AIRBUS INNOVATIONCENTER ● Need to determine whether an image is suitable for training (or testing!) before an image is included in an ML-bound dataset ● Leveraging “classical” CV techniques (e.g. brightness, contrast) ● Exploit model predictions/confidence as an additional dimension Image Quality Analysis 18 © 2022 Airbus
  • 19.
    AN AIRBUS INNOVATIONCENTER ● Requirements ○ Need to calculate ground truth for model training ○ Very high accuracy requirements ○ Need to scale to millions of images ● Solution ○ Leverage an ensemble of: ■ Classical CV techniques ■ Machine learning models ■ Human annotation ○ Establish both a per-image and per-algorithm “quality” metric ○ Develop a dense 3D database of airport features Image Annotation 19 © 2022 Airbus
  • 20.
    AN AIRBUS INNOVATIONCENTER Image Annotation Pipeline 20 Image collection Selection of images to manually label Outsource manual labeling Label audit Labeling using CV techniques Labeling using ML techniques Label quality metrics Label fusion ML training © 2022 Airbus
  • 21.
    AN AIRBUS INNOVATIONCENTER ● Data versioning is important for Verification and Validation of our autonomous flight system ○ Images used for training different ML versions and modules ○ Labels denoting ground truth ● Leveraging logging of runs of the aforementioned pipeline to ensure traceability of our labels ● Development of a similar “delivery” pipeline for versioning datasets ○ Consider datasets “immutable” Dataset/Label Versioning 21 © 2022 Airbus
  • 22.
    AN AIRBUS INNOVATIONCENTER ● Collected data has value beyond ML ● Enable data-driven decisions across Wayfinder ○ Data labeling ○ Future data collection ○ Systems-level requirements ● Developing purpose-built dashboards Data Consumption and Insights 22 © 2022 Airbus
  • 23.
    AN AIRBUS INNOVATIONCENTER ● Train/validate/test split ○ Account for sequential nature of data ○ Ensure similar distributions of key features across sets ○ Ensure sufficient coverage across operational envelope ● Model versioning ○ Efficient experiment tracking ○ V&V efforts ML Ops: Pre-Deployment 23 © 2022 Airbus
  • 24.
    AN AIRBUS INNOVATIONCENTER ● Current state ○ Model is evaluated on a per-project pre-defined test set that has been curated to meet system requirements ● Monitoring Constraints ○ Ground truth is unavailable “online” ○ Expected large volumes of diverse data arriving in multiple batches ● Moving forward ○ Establish an annotation and evaluation strategy ○ Objective is to maximize diversity of a continuously growing evaluation set based on ■ Image content ■ Key metadata “dimensions” (e.g. airports, weather conditions, aircraft position, etc.) ■ Current model performance (and/or model confidence on different predictions) ML Ops: Post-Deployment 24 © 2022 Airbus
  • 25.
    AN AIRBUS INNOVATIONCENTER There are several operational and conceptual challenges associated with ML and Data Ops ● Labeling ● Data curation ● Monitoring We try to address them by: ● Breaking down the data life-cycle in semantic stages ● Organizing our code in modular pipelines ● Leveraging both classical and ML-based Computer Vision algorithms ● Automating the different pipeline steps to the extent possible Conclusions 25 © 2022 Airbus
  • 26.
    AN AIRBUS INNOVATIONCENTER Thank you!
  • 27.
    AN AIRBUS INNOVATIONCENTER We are hiring! acubed.airbus.com/careers