Domain specific pretrained models
To simplify solution development
Popular frameworks
To build advanced deep learning solutions
Productive services
To empower data science and development teams
Powerful infrastructure
To accelerate deep learning
Familiar Data Science tools
To simplify model development
From the Intelligent Cloud to the Intelligent Edge
Azure Databricks Machine Learning VMs
TensorFlow PyTorch ONNX
Language Speech
…
Search Vision
Scikit-Learn
Azure Notebooks Jupyter Visual Studio Code Command line
Azure Machine Learning
CPU GPU FPGA
Hardest Part of ML isn’t ML, it’s Data
ML Code
Configuration
Data Collection
Data Verification
Feature Extraction
Machine Resource Management
Analysis Tools
Process Management Tools
Serving Infrastructure
Monitoring
“Hidden Technical Debt in Machine Learning Systems,” Google NIPS 2015
Azure Databricks
Fast, easy, and collaborative Apache Spark™-based analytics platform
Built with your needs in mind
Role-based access controls
Effortless autoscaling
Live collaboration
Enterprise-grade SLAs
Best-in-class notebooks
Simple job scheduling
Seamlessly integrated with the Azure Portfolio
Increase productivity
Build on a secure, trusted cloud
Scale without limits
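
The autoscaling and ephemeral-cluster points above map onto a cluster specification. The following is a minimal sketch, assuming the Databricks Clusters API field names `autoscale`, `min_workers`, `max_workers`, and `autotermination_minutes`. The cluster name, the runtime and node-type placeholders, the numeric values, and the `validate` helper are hypothetical and were not part of the talk.

```python
import json

# Hypothetical cluster specification; every value is an illustrative placeholder.
cluster_spec = {
    "cluster_name": "demo-autoscaling-cluster",  # hypothetical name
    "spark_version": "<runtime-version>",         # fill in a runtime available in your workspace
    "node_type_id": "<node-type>",                # fill in a VM size available in your region
    # Autoscaling: workers are added or removed between these bounds,
    # so no fixed cluster size has to be forecast up front.
    "autoscale": {"min_workers": 2, "max_workers": 8},
    # Auto-termination: an idle cluster shuts down after this many minutes.
    "autotermination_minutes": 30,
}


def validate(spec):
    """Run basic sanity checks before a spec is submitted anywhere."""
    bounds = spec["autoscale"]
    if not 0 <= bounds["min_workers"] <= bounds["max_workers"]:
        raise ValueError("min_workers must be between 0 and max_workers")
    if spec["autotermination_minutes"] <= 0:
        raise ValueError("autotermination_minutes must be positive")
    return spec


payload = json.dumps(validate(cluster_spec), indent=2)
print(payload)
```

Keeping the bounds and the idle timeout in one reviewed specification is one way to make cluster cost behaviour explicit rather than something decided ad hoc per notebook.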
Azure Machine Learning service
Bring AI to everyone with an end-to-end, scalable, trusted platform
Built with your needs in mind
Support for open source frameworks
Managed compute
DevOps for machine learning
Simple deployment
Tool agnostic Python SDK
Automated machine learning
Seamlessly integrated with the Azure Portfolio
Boost your data science productivity
Increase your rate of experimentation
Deploy and manage your models everywhere
ML Lifecycle: Azure Databricks + Azure ML
SQL DB
Cosmos DB
Data warehouse
Data lake
Blob storage
…
Prepare Data Build & Train Deploy
Prepare Data
Prepare Data at Scale
ADLS Gen 2
Clean, Curate, Process
Data
Azure Databricks
Data sources
Azure Data Factory
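
The "Clean, Curate, Process" step can be illustrated on a small scale. The sketch below uses plain Python rather than Spark so that it runs anywhere. The column names (`mileage`, `brand`, `year`, `price`) and the sample rows are invented to match the car-price example in this deck; at scale the same logic would typically be written as DataFrame operations on Azure Databricks.

```python
import csv
import io

# Invented sample data; the columns mirror the car-price example in this deck.
RAW = """mileage,brand,year,price
42000, Toyota ,2015,11500
,Ford,2012,6400
42000, Toyota ,2015,11500
98000,ford,2010,not available
15000,audi,2018,27900
"""

COLUMNS = ("mileage", "brand", "year", "price")


def clean(rows):
    """Drop incomplete or unparsable rows, normalize text, remove duplicates."""
    seen = set()
    for row in rows:
        try:
            record = (
                int(row["mileage"]),
                row["brand"].strip().title(),
                int(row["year"]),
                float(row["price"]),
            )
        except (ValueError, AttributeError):
            continue  # missing or non-numeric field: skip the row
        if record in seen:
            continue  # exact duplicate of an earlier row
        seen.add(record)
        yield dict(zip(COLUMNS, record))


cleaned = list(clean(csv.DictReader(io.StringIO(RAW))))
for record in cleaned:
    print(record)
```

Of the five input rows, the one with a missing mileage, the exact duplicate, and the one with a non-numeric price are dropped, and brand names are trimmed and normalized.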
Build and Train
Train and Evaluate ML Models
Azure Databricks Runtime
Azure Databricks Notebooks
Deploy
Deploy and Manage ML Models
How much is this car worth?
Machine Learning Problem Example
Mileage
Condition
Car brand
Year of make
Regulations
…
Parameter 1
Parameter 2
Parameter 3
Parameter 4
…
Gradient Boosted
Nearest Neighbors
SVM
Bayesian Regression
LGBM
…
Mileage
Gradient Boosted
Criterion
Loss
Min Samples Split
Min Samples Leaf
Others
Model
Which algorithm? Which parameters? Which features?
Car brand
Year of make
Criterion
Loss
Min Samples Split
Min Samples Leaf
Others
N Neighbors
Weights
Metric
P
Others
Which algorithm? Which parameters? Which features?
Mileage
Condition
Car brand
Year of make
Regulations
…
Gradient Boosted
Nearest Neighbors
SVM
Bayesian Regression
LGBM
…
Nearest Neighbors
Model
Iterate
Gradient Boosted
Mileage
Car brand
Year of make
Car brand
Year of make
Condition
Which algorithm? Which parameters? Which features?
Iterate
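
The three questions on these slides multiply into a large search space, which is why manual iteration is slow. The following is a minimal sketch that enumerates that space, assuming the feature and algorithm names shown on the slides. The hyperparameter value grids are invented for illustration.

```python
from itertools import combinations, product

FEATURES = ["mileage", "condition", "car_brand", "year_of_make", "regulations"]

# Illustrative hyperparameter grids; names follow the slides, values are invented.
ALGORITHMS = {
    "gradient_boosted": {
        "min_samples_split": [2, 4, 8],
        "min_samples_leaf": [1, 2, 4],
    },
    "nearest_neighbors": {
        "n_neighbors": [3, 5, 7],
        "weights": ["uniform", "distance"],
        "p": [1, 2],
    },
}


def feature_subsets(features):
    """Yield every non-empty subset of the feature list."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)


def search_space():
    """Yield (algorithm, parameters, features) for every candidate model."""
    for name, grid in ALGORITHMS.items():
        keys = list(grid)
        for values in product(*(grid[key] for key in keys)):
            params = dict(zip(keys, values))
            for subset in feature_subsets(FEATURES):
                yield name, params, subset


candidates = list(search_space())
print(len(candidates), "candidate configurations")
```

Even this toy grid, with two algorithms and five features, yields 651 candidates: 31 feature subsets times 9 plus 12 parameter combinations.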
Input: Enter data, Define goals, Apply constraints
Intelligently test multiple models in parallel
Output: Optimized model
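
The flow on this slide (data, goal, and constraints in; optimized model out) can be sketched with the standard library alone. This is not the Azure automated ML API. The k-nearest-neighbors regressor, the toy data, and the `max_trials` constraint are assumptions chosen to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy data: (mileage in thousands of km, age in years) -> price. Values are invented.
TRAIN = [((20, 1), 30000), ((40, 3), 24000), ((60, 4), 20000),
         ((90, 6), 14000), ((120, 8), 9000), ((150, 10), 6000)]
VALID = [((30, 2), 27000), ((100, 7), 12000)]


def knn_predict(k, point):
    """Average the prices of the k training rows closest to `point`."""
    nearest = sorted(
        TRAIN,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point)),
    )[:k]
    return sum(price for _, price in nearest) / k


def evaluate(k):
    """Goal: minimize mean absolute error on the validation set."""
    errors = [abs(knn_predict(k, x) - y) for x, y in VALID]
    return k, sum(errors) / len(errors)


def auto_search(candidate_ks, max_trials):
    """Constraint: evaluate at most `max_trials` candidates, in parallel."""
    trials = list(candidate_ks)[:max_trials]
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(evaluate, trials))
    return min(results, key=lambda result: result[1])


best_k, best_error = auto_search(candidate_ks=[1, 2, 3, 4, 5], max_trials=4)
print(f"best k={best_k}, validation MAE={best_error:.1f}")
```

The caller states only the data, the metric, and a budget; which candidate wins is decided by the search rather than by hand.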
Prepare Data
Register and Manage Model
Train & Test Model (Manual or Automated)
Build Image
…
Build model (your favorite IDE)
Deploy a Scalable Service & Monitor Model
Prepare Experiment Deploy
ML Lifecycle with Azure Databricks and Azure ML (demo summary)
One Azure ML workspace for all of ML in Azure
Key Updates:
• Automated ML with Azure Databricks
• MLflow with Azure ML
• Azure Databricks workspace integrated with Azure ML workspace
https://aka.ms/build-bk3010 for new features
Start Free
Build, train, and deploy models with an Azure free account
https://azure.microsoft.com/free
Documentation
Dig into our technical documentation
https://aka.ms/AzureMLDocs
https://docs.azuredatabricks.net/
Give feedback
Tell us what you think, ask for a feature
https://aka.ms/AzureML_feedback
Learn More
“It is not the answer that enlightens, but the question”
Eugène Ionesco
Managing your ML lifecycle with Azure Databricks and Azure ML


Editor's Notes

  • #4 When we think about our machine learning platform on Azure, we think about it in five layers. The top layer is a set of pretrained models, exposed through Cognitive Services, that you can use out of the box to build end-to-end (E2E) applications. The next four layers are the realm of custom AI, where you build your own machine learning models. You can use the data science tools you are already familiar with, together with popular frameworks such as TensorFlow and PyTorch. The next layer is a set of services, such as Azure Databricks and Azure Machine Learning (AML), for creating E2E AI and ML applications. All of these run on hardware such as CPUs, GPUs, and FPGAs, which accelerates model training and inferencing.
  • #6 Azure Databricks is a fast, managed, and scalable version of Spark. It is designed cloud-first to lower the total cost of ownership in a few ways: making clusters ephemeral; eliminating the need for capacity forecasting by supporting autoscale; removing DevOps work; and building the most important packages into the runtime.
  • #7 Databricks (DB) differs from HDInsight (HDI) in that HDI is PaaS and allows working with more OSS tools. The advantage of DB is that it is SaaS: it is easier to use and has native AD integration, auto-scaling and auto-termination (like pause/resume), a workflow scheduler, real-time workspace collaboration, and performance improvements over Apache Spark. Demo: open the portal, show creating DB, show a cluster, show a notebook, show the steps to create a table and explain Delta tables, show a query against the table, show a plot. https://docs.azuredatabricks.net/getting-started/quick-start.html#step-2-create-a-cluster
  • #9 Azure Databricks (ADB) and AML are easy to use together, with AML as the single place to capture everything. Build and train models using Azure Machine Learning and Azure Databricks. Deploy, manage, and monitor models in the cloud, on the edge, or even on FPGA devices using AML.
  • #10 Connect to data from any source, across different data paradigms and formats. Integrate with all of your data sources using Azure Data Factory (ADF) and create hybrid pipelines. Spark itself supports more than 50 data sources and file types, for both batch and streaming workloads. Store and process data without limits: process data using ephemeral clusters with autoscale, and scale compute and storage separately. Productionize your pipelines by turning working notebooks into automated jobs.
  • #11 Move the framework at the beginning – remove directional arrow
  • #20 No lock-in: the solution is open source
  • #21 Automated ML with Azure Databricks; MLflow with Azure ML on Azure Databricks; MLflow with Azure ML on Notebook VMs
  • #22 Why: recap what we saw to tie it up