ML Application Life Cycle
Srujana Merugu
Solving the right problems!
Formal approach to engineering ML systems?
Software Engineer:
• Programming languages: Java, Python, C++, Scala
• Data structures & algorithms: sorting, trees, lists
• Software design: OOP, design patterns, SOA, ...
• Software engineering: SDLC, Agile, SCRUM, SRS, ...

ML Engineer / Data Scientist:
• ML concepts & algorithms: linear models, neural models, bias-variance, ...
• ML tools/packages: Tensorflow, spark.ml, sklearn, R
• Offline modeling: preprocessing, feature engineering, learning, evaluation
• Engineering ML systems: ? ?
[Figure: ML Application Life Cycle: Problem formulation → Data definitions → Offline ML modeling → Production system design → Production system implementation → Pre-deployment testing → Deployment & maintenance → Online evaluation & evolution. Each stage is tagged with the skill mix it needs: Product + ML + Engg., Engg. + ML, Engg., or ML.]
[Figure: the life cycle diagram, highlighting Problem formulation: translating application requirements into ML & optimization problems.]
[Figure: the life cycle diagram, highlighting Data definitions: precise sources & definitions of all data elements, with checks for different types of leakage, data quality issues, and distributional violations.]
[Figure: the life cycle diagram, highlighting Offline ML modeling: offline training & evaluation of ML models, a multi-step iterative process.]
Offline Modeling
Data collection & integration → Data exploration → Data preprocessing → Data sampling/splitting → Feature engineering → Model training, evaluation & fine-tuning → Meet business goals? If not, iterate (a minimal sketch of this loop follows below).
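A minimal sketch of this loop in Python with scikit-learn; the data file and column names (orders.csv, amount, n_items, label) are hypothetical placeholders, not from the deck:

```python
# Offline modeling loop: collect/integrate -> explore -> preprocess ->
# split -> engineer features -> train & evaluate, iterating until the
# business goals are met. File/column names are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

df = pd.read_csv("orders.csv")                 # data collection & integration
print(df.describe())                           # data exploration

X, y = df[["amount", "n_items"]], df["label"]
X_tr, X_te, y_tr, y_te = train_test_split(     # data sampling/splitting
    X, y, test_size=0.2, stratify=y, random_state=0)

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # data preprocessing
    ("scale", StandardScaler()),                    # simple feature engineering
    ("model", LogisticRegression(max_iter=1000)),   # model training
])
pipe.fit(X_tr, y_tr)
auc = roc_auc_score(y_te, pipe.predict_proba(X_te)[:, 1])  # evaluation
print(f"AUC = {auc:.3f}")   # iterate on features/models until goals are met
```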
[Figure: the life cycle diagram, highlighting Production system design & implementation: a functional production system with scalability, responsiveness, fault tolerance, security, ...]
[Figure: the life cycle diagram, highlighting Pre-deployment testing: equivalence checks between offline modeling and production settings, covering the data fetch process, the entire model pipeline, and data distributions.]
[Figure: the life cycle diagram, highlighting Deployment & maintenance: automation of predictions for new instances, data quality monitoring, data logging & attribution, and periodic re-training.]
[Figure: the life cycle diagram, highlighting Online evaluation & evolution: A/B testing on key prediction quality & business metrics, assessment of system performance aspects, and diagnosis to find areas of improvement.]
System Objectives
• Effectiveness w.r.t. business metrics
• Ethical compliance
• Fidelity w.r.t. distributional assumptions
• Reproducibility
• Auditability
• Reusability
• Security
• Graceful failure
• ...
These objectives can be achieved only with a formal approach: checklists, templates & tests for each stage!
[Figure: the life cycle diagram, with each stage marked by how application-dependent the work is: High (application-specific), Partial, or Low (application-agnostic).]
ML & Data Science Learning Programs
[Figure: spectrum of life-cycle topics: Problem formulation, Data, Learning algorithms, ML pipelines, Modeling process, Deployment issues.]
Lots of emphasis on algorithms, ML tools & modeling!
Factors for Success of ML Systems
[Figure: the same spectrum of topics: Problem formulation, Data, Learning algorithms, ML pipelines, Modeling process, Deployment issues.]
Problem formulation & data become more critical!
Problem Formulation
Business Problem: Optimize a decision process to improve business metrics
• Sub-optimal decisions due to missing information
• Solution strategy: predict missing information from available data using ML
[Figure: ML models supply predicted missing information to a decision process; the resulting decisions trigger external responses, which drive the business metrics.]
Ask “why?” to arrive at the right ML problem(s)!
Reseller Fraud Example
• Bulk buys during sale days on e-commerce websites
• Later resale at higher prices or returns
Reseller Fraud Example
Objective: Automation of reseller fraud detection
Option 1: Learn a binary classifier using historical orders & human auditor labels (a minimal sketch follows below)
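A minimal sketch of what Option 1 looks like in scikit-learn, with hypothetical feature and label columns; the limitations listed next apply to exactly this setup:

```python
# Option 1 sketch: learn to imitate human auditors. The target is the
# auditor's decision, not ground-truth fraud, so the model inherits the
# auditors' biases and blind spots. Column names are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

orders = pd.read_csv("historical_orders.csv")
X = orders[["order_size", "discount_pct", "past_return_rate"]]
y = orders["auditor_flagged_fraud"]        # human auditor label

clf = GradientBoostingClassifier().fit(X, y)
fraud_scores = clf.predict_proba(X)[:, 1]  # scores mimic auditor behavior
```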
Limitations:
● Reverse-engineers human auditors’ decisions along with their biases and
shortcomings
● Can’t adapt to changes in fraudster tactics or data drifts
● No connection to “actual business metrics” that we want to optimize
Reseller Fraud Example
Objective: Reduce return shipping expenses; increase #users served (esp. during sale time)
Decision process:
• Partner with reseller in case of potential to expand user base
• Block fraudulent orders or introduce friction (e.g., disable COD/free returns)
Missing information relevant to the decision:
• Likelihood of the buyer reselling the products
• Likely return shipping costs
• Unserved demand for the product (during sale and overall)
• Likelihood of reseller serving an untapped customer base
Key elements of an ML Prediction Problem
• Instance definition
• Target variable to be predicted
• Input features
• Modeling metrics
• Ethical & fairness constraints
• Deployment constraints
• Sources of data
These elements group into the representation (instance, target, features), the objectives (metrics & constraints), and the observations (sources of data).
Instance Definition
• Is it the right granularity for the decision-making process?
• Is it feasible from the data collection perspective?
Multiple options (reseller fraud example):
• a customer
• a purchase order spanning multiple products
• a single product order (i.e., customer-product pair)
Target Variable to be Predicted
• Can we express the business metrics (approximately) in terms of the prediction quality of the target variable(s)?
• Will accurate predictions improve the business metrics substantially?
– estimate biz. metrics for different cases (ideal, current-baseline, likely)
• What is the data collection effort?
– manual labeling costs, joins with external data
• Is it possible to get high-quality observations?
– uncertainty in the definition, noise or bias in the labeling process
Input Features
• Is the feature predictive of the target?
• Are the features going to be available in the production setting?
– define exact time windows for features based on aggregates (sketched below)
– watch out for time lags in data availability
– be wary of target leakages (esp. conditional expectations of the target)
• How costly is it to compute or acquire the feature?
– monetary and computational costs
– might be different in training and deployment settings
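As one concrete instance of pinning down exact time windows, a pandas sketch that computes aggregates over a window ending strictly before prediction time (table and column names are hypothetical):

```python
# Aggregate features over an explicit time window that ends strictly before
# the prediction time, so neither future information nor the target leaks
# into the features. Table/column names are hypothetical.
import pandas as pd

events = pd.read_csv("order_events.csv", parse_dates=["event_time"])

def user_features(user_id: str, predict_time: pd.Timestamp) -> dict:
    window = events[
        (events["user_id"] == user_id)
        & (events["event_time"] < predict_time)                         # no future peeking
        & (events["event_time"] >= predict_time - pd.Timedelta("30D"))  # exact 30-day window
    ]
    return {
        "orders_30d": len(window),
        "returns_30d": int(window["returned"].sum()),
    }
```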
Sources of Data
• Is the distribution of training data similar to production data?
– at least the conditional distribution of the target given input signals?
– are there fairness issues that require sampling adjustments?
– can we re-train with “new data” in case production data evolves over time?
• Are there systemic biases in training data due to the collection process?
– fixed training filters?
• adjust the prediction scope to match the filter
– collection limited by the existing model?
• explore-exploit strategies & statistical bias correction approaches (sketched below)
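A sketch of one such explore-exploit strategy in Python: let a small random fraction of orders bypass the current filter so unbiased labels keep arriving, and log the sampling propensity for later statistical bias correction (e.g., inverse propensity weighting). The `log_decision` helper is hypothetical:

```python
# Explore step: a small random fraction of would-be-blocked orders is let
# through, so training data is not limited by the existing model.
import random

EXPLORE_RATE = 0.02  # fraction of high-score traffic allowed through

def route_order(order, model_score: float, threshold: float = 0.9) -> bool:
    explore = random.random() < EXPLORE_RATE
    blocked = model_score > threshold and not explore
    # Probability this order was allowed through; needed to re-weight labels.
    propensity = EXPLORE_RATE if model_score > threshold else 1.0
    log_decision(order, blocked=blocked, propensity=propensity)  # hypothetical logger
    return blocked
```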
Modeling Metrics - Classification
• Online metrics are meant to be computed on a live system
– can be defined directly in terms of the key business metrics (e.g., net revenue)
– typically measured via A/B tests & influenced by a lot of factors
• Offline metrics are meant to be computed on retrospective “labeled” data
– more closely tied to prediction quality (e.g., area under ROC curve)
– typically measured during offline experimentation
• Primary metrics are the ones we are actively trying to optimize
– e.g., losses due to fraud
• Secondary metrics are ones that can serve as constraints or guardrails
– e.g., customer base size (both kinds are sketched below)
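A sketch combining an offline prediction-quality metric with primary and guardrail business quantities for the fraud example; file/column names and the 0.9 threshold are hypothetical:

```python
# Offline primary & guardrail metrics from a labeled, scored evaluation set.
import pandas as pd
from sklearn.metrics import roc_auc_score

eval_df = pd.read_csv("eval_scored.csv")   # columns: score, is_fraud, order_value
is_fraud = eval_df["is_fraud"].astype(bool)
blocked = eval_df["score"] > 0.9

auc = roc_auc_score(is_fraud, eval_df["score"])                        # prediction quality
loss_prevented = eval_df.loc[blocked & is_fraud, "order_value"].sum()  # primary: fraud losses avoided
good_blocked = int((blocked & ~is_fraud).sum())                        # guardrail: good customers hurt
print(f"AUC={auc:.3f}, loss prevented={loss_prevented:.0f}, good orders blocked={good_blocked}")
```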
Modeling Metrics
• What are the key online metrics (primary/secondary)?
– a deep question related to system goals!
• Are the offline modeling metrics aligned with online metrics?
– relative goodness of models should reflect online metric performance
Ethical and Fairness Constraints
• What are the long-term secondary effects of the ML system?
• Is the system fair to different user segments?
These need to be incorporated in the modeling metrics!
Deployment Constraints
• What are the application constraints?
– user-interface-based restrictions (interaction mode, form factor)
– connectivity issues
• What are the hardware constraints?
– client-side or server-side computation
– memory, compute power
• What are the scalability requirements?
– size of data, frequency of processing (training/batch prediction)
– rate of arrival of prediction instances & latency bounds (online predictions)
Key Tenets for ML Applications
Data Definitions
• Precisely record all sources & definitions for all data elements
– (ids, features, targets, metric-factors) for (training, evaluation, production)
• Establish parity across training/evaluation/production
– definitions, level sets, units, time windows, missing value handling, correct snapshots
• Review for common data leakages
– peeking into the future, target leakage (checks sketched below)
• Proactively collect information on data quality issues & resolve them
– causes of missing/invalid values, data corruptions
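A sketch of automated checks derived from such a data-definition spec: dtype/level-set parity across training and production tables, plus a guard against peeking into the future. The spec format and column names are assumptions, not from the deck:

```python
# Automated data-definition checks: parity of types/level sets, and a guard
# that every feature is computed strictly before its label timestamp.
import pandas as pd

SPEC = {
    "amount": {"dtype": "float64"},                 # same units everywhere
    "country": {"dtype": "object", "levels": {"IN", "US"}},
}

def check_parity(df: pd.DataFrame) -> None:
    for col, rules in SPEC.items():
        assert str(df[col].dtype) == rules["dtype"], f"{col}: dtype mismatch"
        if "levels" in rules:
            extra = set(df[col].dropna().unique()) - rules["levels"]
            assert not extra, f"{col}: unseen levels {extra}"

def check_no_future_peeking(df: pd.DataFrame) -> None:
    assert (df["feature_time"] < df["label_time"]).all(), "feature uses future data"
```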
Offline Modeling
• Ensure data is of high quality
– Fix missing values, outliers, systemic bias
• Narrow down modeling options based on data characteristics
– Learn about the relative effectiveness of various preprocessing, feature engineering,
and learning algorithms for different types of data.
• Be smart on the trade-off between feature engg. effort & model complexity
– Sweet spot depends on the problem complexity, available data, domain knowledge,
and computational requirements
• Ensure offline evaluation is a good “proxy” for evaluation on real unseen data
– generate data splits similar to how data would arrive during deployment (sketched below)
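For example, a time-based split (train on the past, evaluate on the future) mimics deployment far better than a random shuffle; a short pandas sketch with hypothetical columns:

```python
# Time-based split: train on the past, evaluate on the "future", mirroring
# how the model will see data after deployment. Columns are hypothetical.
import pandas as pd

df = pd.read_csv("orders.csv", parse_dates=["order_time"]).sort_values("order_time")
cutoff = df["order_time"].quantile(0.8)     # hold out the last 20% of the timeline
train, test = df[df["order_time"] < cutoff], df[df["order_time"] >= cutoff]
```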
Engineering
• Work backwards from the application use case
– data/compute/ML framework choices based on deployment constraints
• Clear decoupling of modeling and production system responsibilities
– self-contained models (config, parameters, libs) from data scientists (bundle sketch below)
– application-agnostic pipelines for scoring, evaluation, re-training, data collection
• Maintain versioned repositories for data, models, experiments
– logs, feature factories
• Plan for ecosystems of connected ML models
– easy composition of ML workflows
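A sketch of a self-contained, versioned model bundle: parameters, config, and library versions shipped together so production pipelines can stay application-agnostic. Paths, feature list, and threshold are illustrative assumptions:

```python
# Self-contained model bundle: parameters + config + library pins saved
# together under an explicit version tag.
import json
import joblib
import sklearn

def save_bundle(pipe, path: str, version: str) -> None:
    joblib.dump(pipe, f"{path}/model-{version}.joblib")   # parameters
    meta = {
        "version": version,
        "sklearn_version": sklearn.__version__,          # library pin
        "features": ["amount", "n_items"],               # config
        "decision_threshold": 0.9,
    }
    with open(f"{path}/model-{version}.json", "w") as f:
        json.dump(meta, f)
```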
Deployment
• Establish offline modeling vs. production parity (parity-check sketch below)
– Checks on every possible component that could change
• Establish improvement in business metrics before scaling up
– A/B testing over random buckets of instances
• Trust the models, but always audit
– Insert safe-guards (automated monitoring) and manual audits
• View model building as a continuous process not a one-time effort
– Retrain periodically to handle data drifts & design for this need
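A sketch of one such parity check: score the same instances through the offline pipeline and the live endpoint, and fail if predictions diverge beyond tolerance; `production_predict` is a hypothetical client for the production scoring service:

```python
# Offline vs. production parity check: identical inputs must get
# (near-)identical scores from both paths.
import numpy as np
import pandas as pd

def check_parity(pipe, X_sample: pd.DataFrame, production_predict, tol: float = 1e-6) -> None:
    offline = pipe.predict_proba(X_sample)[:, 1]
    online = np.array([production_predict(row) for row in X_sample.to_dict("records")])
    max_diff = float(np.abs(offline - online).max())
    assert max_diff <= tol, f"offline/production score mismatch: {max_diff:.2e}"
```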
Main Takeaways
• Map out your org-specific ML application life cycle
• Introduce checklists, templates, and tests for each stage
• Invest effort in getting the problem formulation right (ask “why?”)
• Be proactive about data issues
Thank You!
Happy Modeling!
Contact: srujana@gmail.com
