Databricks Overview for MLOps
Clemens Mewald, Director of Product Management
The Databricks ML Platform
MLOps / Governance
Data Science Workspace: Data Ingestion, Data Versioning, Model Training, Model Tuning, Runtime and Environments, Monitoring, Batch Scoring, and Online Serving
Collaborative Data Science Workspace
For data engineers, data scientists, ML engineers, and data analysts.
Databricks Notebooks: Cloud-native Collaboration Features
Multi-language: Scala, SQL, Python, and R, all in one notebook.
Collaborative: real-time co-presence, co-editing, and commenting.
(Git-based) Projects
CI/CD integration: work moves between Development / Experimentation and Git / CI/CD systems for versioning, review, and testing, and from there into production jobs.
Integrates with the supported Git providers.
High Quality Data at Scale
Delta Lake underpins the Data Science Workspace (MLflow) and SQL Analytics, serving Business Intelligence, Data Science, and Machine Learning on structured, semi-structured, and unstructured data.
Ingest any format at any scale from any source.
ACID transactions guarantee data validity.
Versioning and time travel built in.
Automated logging of data and version information.
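The versioning and time-travel bullets can be sketched in miniature. This is a hedged pure-Python illustration of the idea (Delta Lake itself exposes it as table versions you can read back), not Delta Lake's implementation:

```python
class VersionedTable:
    """Toy illustration of Delta-style versioning: every write commits a
    new immutable version, and old versions stay readable (time travel)."""

    def __init__(self):
        self._versions = []  # list of full snapshots; index = version number

    def write(self, rows):
        snapshot = list(self._versions[-1]) if self._versions else []
        snapshot.extend(rows)
        self._versions.append(snapshot)   # commit is all-or-nothing
        return len(self._versions) - 1    # version number of this commit

    def read(self, version_as_of=None):
        if not self._versions:
            return []
        v = len(self._versions) - 1 if version_as_of is None else version_as_of
        return list(self._versions[v])

table = VersionedTable()
v0 = table.write([{"id": 1}])
v1 = table.write([{"id": 2}])
assert table.read(version_as_of=v0) == [{"id": 1}]   # time travel to v0
assert table.read() == [{"id": 1}, {"id": 2}]        # latest version
```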
Turnkey ML Training at Scale
ML Runtime: a DevOps-free environment optimized for machine learning.
Packages up the most popular ML toolkits.
Simplifies distributed ML/DL: distribute and scale any single-machine ML code to thousands of machines.
Built-in AutoML and auto-logging: hyperparameter tuning, AutoML, automated tracking, and visualizations with MLflow.
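The auto-logging idea (capture parameters and metrics for every training run without touching the training code) can be sketched as a decorator. This is a hypothetical stand-in, not the MLflow autologging API:

```python
import functools
import time

RUNS = []  # stand-in for a tracking server

def autolog(train_fn):
    """Toy sketch of auto-logging: record params, metrics, and timing
    for every training call, with no changes to the training code."""
    @functools.wraps(train_fn)
    def wrapper(**params):
        start = time.time()
        metrics = train_fn(**params)
        RUNS.append({
            "params": params,
            "metrics": metrics,
            "duration_s": round(time.time() - start, 3),
        })
        return metrics
    return wrapper

@autolog
def train(lr=0.1, epochs=5):
    # pretend "loss" improves with more epochs
    return {"loss": 1.0 / epochs}

train(lr=0.01, epochs=10)
assert RUNS[0]["params"] == {"lr": 0.01, "epochs": 10}
assert RUNS[0]["metrics"] == {"loss": 0.1}
```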
Distributed Training
▪ Built-in support in the ML Runtime:
  TensorFlow native Distribution Strategy (Spark TensorFlow Distributor)
  HorovodRunner (Keras, TensorFlow, and PyTorch)
The driver coordinates training tasks across the worker nodes.
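The data-parallel pattern behind HorovodRunner can be sketched in plain Python: each worker computes a gradient on its own data shard, the gradients are averaged (an allreduce), and the shared parameters are updated. A toy single-process sketch, not Horovod itself:

```python
def worker_gradient(shard, w):
    # gradient of mean squared error for a 1-D linear model y = w * x
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(grads):
    # stand-in for the allreduce that averages gradients across workers
    return sum(grads) / len(grads)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x
shards = [data[:2], data[2:]]                             # one shard per "worker"

w = 0.0
for _ in range(50):
    grads = [worker_gradient(s, w) for s in shards]  # runs in parallel in reality
    w -= 0.05 * allreduce_mean(grads)

assert abs(w - 2.0) < 1e-3  # converges to the true slope
```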
Distributed Tuning
▪ Built-in support in the ML Runtime
The driver distributes tuning trials across the worker nodes.
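The driver/trials split can be sketched with a thread pool standing in for worker nodes. A minimal illustration of fanning trials out and keeping the best, not the ML Runtime's actual tuning integration:

```python
from concurrent.futures import ThreadPoolExecutor

def objective(lr):
    # pretend validation loss, minimized at lr = 0.1
    return (lr - 0.1) ** 2

# the "driver" fans trials out to "workers" and collects the results
trials = [0.001, 0.01, 0.1, 0.5, 1.0]
with ThreadPoolExecutor(max_workers=4) as pool:
    losses = list(pool.map(objective, trials))

best_lr = trials[losses.index(min(losses))]
assert best_lr == 0.1
```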
Support for all Deployment Modes
Models, Tracking, and the Model Registry
Tracking: parameters, metrics, and artifacts.
Models: models and metadata, packaged as flavors (Flavor 1, Flavor 2, custom models).
Model Registry: model versions (v1, v2, v3) move through Staging, Production, and Archived stages, handed off between data scientists and deployment engineers.
Deployment options: in-line code, containers, batch and stream scoring, cloud inference services, and OSS serving solutions.
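The stage-transition flow can be sketched as a tiny registry. This is hypothetical code, not the MLflow Model Registry API (names like `ModelRegistry` and `transition` are made up for illustration):

```python
# Minimal sketch of a model registry with versions and stage transitions,
# illustrating the Staging / Production / Archived flow.
STAGES = {"None", "Staging", "Production", "Archived"}

class ModelRegistry:
    def __init__(self):
        self._models = {}  # name -> list of {"version", "stage", "uri"}

    def register(self, name, uri):
        versions = self._models.setdefault(name, [])
        versions.append({"version": len(versions) + 1, "stage": "None", "uri": uri})
        return versions[-1]["version"]

    def transition(self, name, version, stage):
        assert stage in STAGES
        for entry in self._models[name]:
            if entry["version"] == version:
                entry["stage"] = stage
            elif stage == "Production" and entry["stage"] == "Production":
                entry["stage"] = "Archived"  # keep a single Production version

    def get(self, name, stage):
        return next(e for e in self._models[name] if e["stage"] == stage)

reg = ModelRegistry()
v1 = reg.register("churn", "runs:/abc/model")
v2 = reg.register("churn", "runs:/def/model")
reg.transition("churn", v1, "Production")
reg.transition("churn", v2, "Production")  # v1 is auto-archived
assert reg.get("churn", "Production")["version"] == 2
```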
Deploying an MLlib model as a Spark UDF
Deploying a scikit-learn model as a Spark UDF
Deploying a TensorFlow model as a Spark UDF
Yes, they are all the same! As are the commands to deploy these models as Docker containers, etc.
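The "all the same" point rests on a flavor-agnostic model interface: every model, whatever its native API, is wrapped behind one `predict()` call (in MLflow this is the pyfunc abstraction, e.g. `mlflow.pyfunc.spark_udf`). A hedged pure-Python sketch of that idea:

```python
class PyFuncModel:
    """Flavor-agnostic wrapper: deployment code only ever sees predict(),
    so the deployment command is identical for every model flavor."""

    def __init__(self, predict_fn):
        self._predict_fn = predict_fn

    def predict(self, rows):
        return [self._predict_fn(r) for r in rows]

# Two "flavors" with different native behavior behind the same interface...
sklearn_like = PyFuncModel(lambda r: sum(r))  # stand-in for a scikit-learn model
tf_like = PyFuncModel(lambda r: max(r))       # stand-in for a TensorFlow model

# ...scored with the exact same call, as the slides emphasize:
batch = [[1, 2], [3, 4]]
assert sklearn_like.predict(batch) == [3, 7]
assert tf_like.predict(batch) == [2, 4]
```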
End-to-end MLOps / Governance
Data Governance, Experiment Tracking, Reproducibility, and Model Governance across the entire workflow, from data ingestion to online serving.
Automated data source capture and versioning (data source / lineage, data versioning)
Automated capture of feature usage (feature-level data lineage / usage)
Automated capture of ML metrics, parameters, models, and artifacts
Automated capture of hyperparameter search trials
Automated model interpretability
Automated capture of code, environment, and cluster specification (code versioning, environment configuration, cluster configuration)
Model sharing, reuse, and ACLs (model discoverability, model stage-based ACLs)
Automated model lineage and governance (approval process for stage transitions, audit log of model changes)
Turnkey model serving, integrated with model versions and stages
Model quality monitoring (quality / performance metric monitoring)
Auto-Logging Reproducibility Checklist (Reproduce Run feature)
✓ Code versioning
✓ Data versioning
✓ Cluster configuration
✓ Environment specification
The result: full end-to-end governance and reproducibility.
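The checklist can be read as a predicate on a run record: a run is reproducible only when all four items were captured. A hypothetical sketch (not the MLflow run schema; field names are made up for illustration):

```python
# The four checklist items from the slide, as required run-record fields.
CHECKLIST = ("code_version", "data_version", "cluster_config", "environment_spec")

def capture_run(**fields):
    """Build a run record and flag whether it is fully reproducible."""
    run = {k: fields.get(k) for k in CHECKLIST}
    run["reproducible"] = all(run[k] is not None for k in CHECKLIST)
    return run

run = capture_run(
    code_version="git:9f2c1ab",                           # hypothetical commit
    data_version="delta:v42",                             # hypothetical table version
    cluster_config={"workers": 8, "node_type": "m5.xlarge"},
    environment_spec={"python": "3.10", "mlflow": "2.x"},
)
assert run["reproducible"]

# Missing any checklist item means the run cannot be reproduced exactly.
assert not capture_run(code_version="git:9f2c1ab")["reproducible"]
```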