Ramjee Ganti, dblue Inc.
Machine Learning Lifecycle
Ramjee Ganti
● Founder dblue.ai
● Engineer
● Scaled Tech at Startups
● No Sugar Advocate
● @gantir
Machine Learning Lifecycle
“The machine learning lifecycle refers to the iterative
process that spans cross-functional teams that
define, build, deploy, monitor, operate and improve
ML systems.”
Perception
Machine Learning Algorithms
Reality
● ML Algorithms
● Data Collection
● Configuration
● Serving Infrastructure
● Feature Extraction
● Data Validation
● Monitoring
● Process Management Tools
● Resource Management
● Analysis Tools
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
15%-20%
People
Why is it hard?
● Make it Discoverable and Accessible
● Version your data
● Feature Repository
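One way to picture "version your data" is a content hash: the same rows always get the same identifier, and any change yields a new one. The sketch below is only an illustration under that assumption; `dataset_version` is a hypothetical helper, not part of any tool named in this talk, and a real setup would more likely use a dedicated data versioning tool.

```python
import hashlib
import json


def dataset_version(rows):
    """Return a short content hash that identifies this exact dataset."""
    digest = hashlib.sha256()
    for row in rows:
        # Serialize each row deterministically so equal data gives equal hashes.
        digest.update(json.dumps(row, sort_keys=True).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()[:12]


# Two snapshots that differ in a single value get different versions.
v1 = dataset_version([{"user": 1, "clicks": 3}, {"user": 2, "clicks": 5}])
v2 = dataset_version([{"user": 1, "clicks": 3}, {"user": 2, "clicks": 6}])
```

Storing such an identifier next to each trained model makes it possible to say later which data a model was trained on.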
Data Processing
Model Exploration
● Training Multiple Models
● Dependent Models
● Serving the Model
○ Embedded
○ Service
○ Data
Testing and Validation
● Validating Data
● Validating Component Integration
● Validating model quality
● Validating model bias and fairness
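Two of the checks above, validating data and validating model quality, can be sketched as plain functions that a pipeline runs before promoting a model. This is a minimal illustration, not a prescribed method: the schema layout, the function names and the accuracy threshold are all assumptions made for the example.

```python
def validate_rows(rows, schema):
    """Return a list of problems found in the data; empty means it passed.

    schema maps a column name to (expected_type, low, high).
    """
    problems = []
    for i, row in enumerate(rows):
        for column, (expected_type, low, high) in schema.items():
            value = row.get(column)
            if not isinstance(value, expected_type):
                problems.append(
                    f"row {i}: {column} is missing or not {expected_type.__name__}"
                )
            elif not (low <= value <= high):
                problems.append(f"row {i}: {column}={value} outside [{low}, {high}]")
    return problems


def passes_quality_gate(predictions, labels, min_accuracy):
    """Return True when accuracy on a held-out set meets the agreed threshold."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels) >= min_accuracy


schema = {"age": (int, 0, 120)}  # hypothetical column and range
issues = validate_rows([{"age": 30}, {"age": 300}, {}], schema)
ok = passes_quality_gate([1, 0, 1, 1], [1, 0, 0, 1], min_accuracy=0.75)
```

A bias and fairness check could follow the same shape, for example by running the quality gate separately per user group and comparing the results.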
Deployment
● Multiple models
● Shadowing
● Split serving
● Online learning models
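Shadowing and split serving can be sketched with a small router. In this illustration the models are plain callables, and `bucket` and `serve` are hypothetical names; a production setup would normally do this in the serving infrastructure rather than in application code.

```python
import hashlib


def bucket(request_id, buckets=100):
    """Map a request id to a stable bucket so the same id always routes the same way."""
    digest = hashlib.sha256(str(request_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def serve(request_id, features, current_model, candidate_model,
          candidate_percent=0, shadow_log=None):
    """Serve one request with optional split serving and shadowing."""
    # Split serving: a fixed share of traffic is answered by the candidate.
    use_candidate = bucket(request_id) < candidate_percent
    primary = candidate_model if use_candidate else current_model
    response = primary(features)
    if shadow_log is not None and not use_candidate:
        # Shadowing: the candidate sees the same input, but its output is only logged.
        shadow_log.append((request_id, response, candidate_model(features)))
    return response


current = lambda features: "a"    # stand-in for the live model
candidate = lambda features: "b"  # stand-in for the new model
log = []
shadowed = serve("r1", {}, current, candidate, candidate_percent=0, shadow_log=log)
```

Hashing the request id keeps routing deterministic, so a given user or request does not flip between models from one call to the next.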
Model Monitoring & Observability
● Model inputs
● Model outputs and decisions
● User actions
● Model fairness
● Model interpretability outputs
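Monitoring model inputs usually means comparing live feature values with the values seen at training time. The sketch below uses the Population Stability Index as one possible drift measure; the talk does not name a specific metric, so this choice, the bin count and the function name `psi` are assumptions for illustration.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample and a live sample."""
    low, high = min(expected), max(expected)
    width = (high - low) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            index = int((v - low) / width)
            # Values outside the baseline range are clipped into the edge bins.
            counts[min(max(index, 0), bins - 1)] += 1
        # A small floor avoids taking log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [float(v) for v in range(100)]  # feature values seen in training
live = [v + 50 for v in baseline]          # the same feature, shifted in production
stable_score = psi(baseline, baseline)
drift_score = psi(baseline, live)
```

A larger score means the live distribution has moved further from the baseline; the alerting threshold is something each team has to choose for its own data.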
Take Away
1. It’s hard, accept it.
2. Take a software engineering approach
3. Implement a reproducible and auditable process
4. Version data, models and code
5. Think Platforms
References/Credits
● Continuous Delivery for Machine Learning
● Hidden Technical Debt in Machine Learning Systems
● Doing Machine Learning at Uber
Thank You
