MLOps and Reproducible ML on AWS
with Kubeflow and Amazon SageMaker
Presented by:
Stepan Pushkarev, CTO @ Provectus
Qingwei Li, ML Specialist Solutions Architect @ AWS
1. Learn how to build a scalable and secure ML infrastructure on AWS with
Provectus
2. Explore best practices of using Amazon SageMaker with open source tools
for better experience and productivity
Webinar Objectives
1. Familiarity with AWS & Amazon SageMaker services
2. Familiarity with ML Workflow
3. Familiarity with Kubeflow & Kubeflow Pipelines
Webinar Prerequisites
1. Introductions
2. Case Study: GoCheck Kids
3. Overview of AWS Infrastructure for Machine Learning
4. Provectus ML Infrastructure on AWS
a. Experimentation
b. MLOps
c. Feature Store
Agenda
AI-First Consultancy & Solutions Provider
Clients ranging from
fast-growing startups to
large enterprises
450 employees and
growing
Established in 2010
HQ in Palo Alto
Offices across the US,
Canada, and Europe
We are obsessed with leveraging cloud, data, and AI to reimagine the way
businesses operate, compete, and deliver customer value
Innovative Tech Vendors
Seeking niche expertise to
differentiate and win the market
Midsize to Large Enterprises
Seeking to accelerate innovation,
achieve operational excellence
Our Clients
Introductions
Stepan Pushkarev
Chief Technology
Officer, Provectus
Iskandar Sitdikov
ML Solutions Architect,
Provectus
Rinat Gareev
ML Solutions Architect,
Provectus
Ilnur Garifullin
ML Solutions Architect,
Provectus
Qingwei Li
ML Specialist Solutions
Architect, AWS
The past few years have been like a dream come true for those who work in
analytics and big data. There is a new career path for platform engineers to learn
Hadoop, Scala and Spark. Java and Python programmers have a chance to move
to the Big Data world. There they find higher salaries and new challenges, and get
to scale up to distributed systems. But recently I have started to hear some
complaints and dashed hopes from engineers who have spent time working there.
1. Tools evolution: The Apache Spark/Hadoop ecosystem is great, but it is not stable or user-friendly enough
to just run and forget. Engineers and data scientists should contribute to existing open source projects and create
new tools to fill the gaps in day-to-day operations.
2. Education and cross skills: When data scientists write code, they need to think not just about abstractions,
but also about the practical issues of what is possible and what is reasonable. For example, they need to think about how
long their query will run and whether the data they extract will fit into the storage mechanism they are using.
3. Improve the process: DevOps might be a solution. Here DevOps does not just mean writing Ansible scripts
and installing Jenkins. We need DevOps working in an optimal fashion to reduce handoffs and invent new tools to
give everyone self-service to make them as productive as possible.
Why ML Infrastructure
GoCheck Kids Story: Secure, agile, and compliant ML
infrastructure for Deep Vision Screening
GoCheck Kids
Reduce manual overhead for child vision
screening.
Detect strabismus, crescent, and dark iris/pupil
population, and reject images where the
child is not looking straight into the camera.
Security and compliance requirements - Track
everything, do not touch anything.
Deep Vision Solution for GoCheck Kids
Business Problem Solution
End-to-end deep learning image classification
models to detect child gaze, strabismus,
crescent, and dark iris/pupil population.
Provectus has developed quite a few ML models:
● Different inputs (pre-processing, region cropping, single vs. two eyes, etc.): 6 options
● Different feature generation backbones (deep convolutional networks: ResNet,
MobileNet, EfficientNet, custom, etc.): 7 options
● Transfer learning from a synthetic dataset: 3 options
● Tweaks to objective functions to tackle data imbalance: 5 options
● Different dataset splits: 10 options
Modeling Hypothesis
6x7x3x5x10 = 6,300 combinations to test in 3 weeks!
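The size of that search space can be checked by enumerating the grid. The sketch below is illustrative only: the option counts come from the slide, while the dimension names and the placeholder option labels (`input_0`, `backbone_0`, and so on) are assumptions, not the names Provectus used.

```python
from itertools import product

# Option counts are taken from the slide; labels are hypothetical placeholders.
search_space = {
    "input_variant": [f"input_{i}" for i in range(6)],
    "backbone": [f"backbone_{i}" for i in range(7)],
    "transfer_learning": [f"transfer_{i}" for i in range(3)],
    "objective": [f"objective_{i}" for i in range(5)],
    "dataset_split": [f"split_{i}" for i in range(10)],
}


def enumerate_hypotheses(space):
    """Yield one dict per combination of options (the full Cartesian grid)."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


hypotheses = list(enumerate_hypotheses(search_space))
total = len(hypotheses)
print(total)
```

In practice only a small, prioritized subset of such a grid is run, which is why the hypotheses backlog and the pipeline tooling matter.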
Conducted ~100* experiments on the entire dataset using pipelines within 3 weeks
● 100 000+ images
● Each experiment takes 15 min – 6 hours on a single GPU (P3 instance type)
* not counting development runs and experiments in notebook instances
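A rough back-of-the-envelope budget shows why the full grid was out of reach and roughly 100 experiments was realistic. This is a sketch under stated assumptions: experiments run sequentially on each GPU, the per-experiment durations are the 15 min and 6 hour bounds from the slide, and the number of GPUs actually used is not given in the source.

```python
import math

HOURS_PER_WEEK = 7 * 24


def gpu_hours(n_experiments, hours_per_experiment):
    """Total GPU time if every experiment takes the given duration."""
    return n_experiments * hours_per_experiment


def gpus_needed(total_gpu_hours, wall_clock_hours):
    """Smallest number of GPUs that fits the work into the wall-clock window."""
    return math.ceil(total_gpu_hours / wall_clock_hours)


wall_clock_budget = 3 * HOURS_PER_WEEK          # 3 weeks of wall-clock time

full_grid_min = gpu_hours(6300, 0.25)           # every run takes 15 minutes
full_grid_max = gpu_hours(6300, 6)              # every run takes 6 hours
actual_min = gpu_hours(100, 0.25)
actual_max = gpu_hours(100, 6)

print(wall_clock_budget, full_grid_min, full_grid_max, actual_min, actual_max)
print(gpus_needed(full_grid_min, wall_clock_budget),
      gpus_needed(full_grid_max, wall_clock_budget))
```

Even at the optimistic 15-minute bound, the full grid needs more GPU time than three weeks on a single GPU provides, while about 100 runs stays within reach of a small team.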
We always had quite a few pending improvement hypotheses in the backlog
● Each good hypothesis needs several runs to determine the best hyperparameters
● Or an automatic hyperparameter optimizer
Data preparation took ~5 hours
● Had to parallelize and reuse outputs
Each experiment produces artifacts: models, metrics, predictions
Met security and compliance requirements
Benefits and Outcomes of ML Infrastructure
Results Summary
3X
Increase in ML
model’s recall
(same precision)
95%
ML Engineer’s time
was dedicated to
experimentation
100+
Large scale
experiments in 3
weeks by 3 ML
engineers
This could not be achieved without Provectus ML Infrastructure on AWS
100%
Secure and FDA
Compliant
Overview of AWS Infrastructure
for Machine Learning
AI service categories: Vision, Speech, Text, Search (new), Chatbots, Personalization, Forecasting, Fraud (new), Development (new), Contact Centers
AWS AI Services: Amazon Rekognition, Amazon Polly, Amazon Transcribe (+Medical), Amazon Comprehend (+Medical), Amazon Translate, Amazon Lex, Amazon Personalize, Amazon Forecast, Amazon Fraud Detector, Amazon CodeGuru, Amazon Textract, Amazon Kendra, Contact Lens for Amazon Connect
AWS ML Services: Amazon SageMaker Studio IDE; Amazon SageMaker Ground Truth, Amazon A2I, Amazon SageMaker Neo, built-in algorithms, SageMaker Notebooks (new), SageMaker Experiments (new), model tuning, SageMaker Debugger (new), SageMaker Autopilot (new), model hosting, SageMaker Model Monitor (new)
AWS ML Frameworks & Infrastructure: Deep Learning AMIs & Containers, GPUs & CPUs, Elastic Inference, Inferentia, FPGA
AWS AI/ML Stack
Amazon SageMaker - A Fully Managed Service for ML
Collect and prepare training data
Select or build ML algorithms
Set up and manage environments for training
Train, debug, and tune models
Deploy models in production
Manage training runs
Monitor models
Scale and manage the production environment
Validate predictions
© 2020, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Image registry: container image repository
Amazon Elastic Container Registry (Amazon ECR)
Compute: where the containers run
Amazon Elastic Compute Cloud (Amazon EC2)
ML services: fully managed service that covers the entire machine learning workflow
Amazon SageMaker: Jupyter notebook instances, high-performance algorithms, large-scale training, optimization, one-click deployment, fully managed with auto-scaling
Management: deployment, scheduling, scaling, and management of containerized applications
Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Elastic Container Service (Amazon ECS)
ML Infrastructure and Services
Kubernetes
Amazon SageMaker Operators
for Kubernetes
github.com/aws/amazon-sagemaker-operator-for-k8s
Kubeflow
Amazon SageMaker Components
for Kubeflow Pipelines
github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker
Scaling ML on Kubernetes with Amazon SageMaker
• Fully-managed infrastructure
• Ground Truth labeling
• Automatic model tuning
• Built-in optimized algorithms
• Managed Spot Training
• Scalable inference endpoints
• Model monitoring
• Easy scalability
• Portability
• Composability
• Scalability
• Shared infrastructure
• Repeatable pipelines
• Automation
• CI/CD
• Open-source
Open Source + Amazon SageMaker Value Proposition
Amazon SageMaker Kubeflow
Kubeflow Pipeline: a sequence of pipeline steps, each backed by a component
Component: input/output, implementation (container), metadata
Amazon ECR, Amazon SageMaker
Amazon SageMaker Components for Kubeflow Pipelines
Example pipeline:
1. Hyperparameter optimization
2. Select best hyperparameters and increase epochs
3. Training model using the best hyperparameters
4. Create an Amazon SageMaker model
5. Deploy the model
BYO container
BYO training scripts
Amazon SageMaker Components for Kubeflow Pipelines
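The five steps of the example pipeline can be sketched in plain Python to show how data flows between them. This is a local stand-in, not the Kubeflow Pipelines DSL or the SageMaker components themselves: every function, the toy scoring formula, and the names `demo-model` and `InService` are hypothetical and exist only so the sketch runs without any cloud resources.

```python
def train(learning_rate, epochs):
    """Stand-in for a training job; the score formula is a toy assumption."""
    score = 1.0 - abs(learning_rate - 0.01) * 10 + 0.001 * epochs
    return {"learning_rate": learning_rate, "epochs": epochs, "score": score}


def hyperparameter_optimization(learning_rates, epochs):
    """Step 1: run one short trial per candidate hyperparameter value."""
    return [train(lr, epochs) for lr in learning_rates]


def select_best(trials, epoch_multiplier):
    """Step 2: keep the best hyperparameters and increase the epoch count."""
    best = max(trials, key=lambda trial: trial["score"])
    return {"learning_rate": best["learning_rate"],
            "epochs": best["epochs"] * epoch_multiplier}


def create_model(training_result, name):
    """Step 4: register the trained artifact as a named model record."""
    return {"name": name, "source": training_result}


def deploy(model):
    """Step 5: expose the model behind a (simulated) endpoint."""
    return {"endpoint": f"{model['name']}-endpoint", "status": "InService"}


def run_pipeline():
    trials = hyperparameter_optimization([0.1, 0.01, 0.001], epochs=2)
    best = select_best(trials, epoch_multiplier=5)
    final = train(best["learning_rate"], best["epochs"])  # Step 3
    model = create_model(final, name="demo-model")
    return deploy(model), final


endpoint, final = run_pipeline()
print(endpoint, final)
```

In a real pipeline each function would become a component backed by a container, with the orchestrator passing the outputs of one step as inputs to the next.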
Model
development
Model
training
Model
tracking
Model
deployment
Hyper-param
tuning
Data
prep
Amazon SageMaker + Kubeflow for Machine Learning
Amazon SageMaker
Product | Architecture | Kubernetes | Orchestration | Dev Interface | GUI | Ease of Use
SageMaker Components | Kubeflow Pipeline Components | Yes | Self-hosted Kubeflow Pipelines | Python | KFP Dashboard | Medium
SageMaker Operators | Kubernetes Operators, Custom Resources | Yes | Kubernetes tools (e.g. Flyte, Argo) | YAML, or custom extension by customer | None, or custom | Advanced
Amazon SageMaker Operators for Kubernetes vs.
Components for Kubeflow Pipelines
Provectus ML Infrastructure
on AWS
Amazon SageMaker Services
How Provectus Adds Value
Feature Store
Store and reuse features to build ML models faster
ML Workflow Orchestrator
Reproduce and track the whole ML Workflow
Dataset Management
Track and govern training datasets
Dataset Sampling
Sample from production
streams
Advanced Monitoring
Detect drift in text & images
MLOps
Continuous Training & Delivery
The Core of MLOps and Reproducible Experimentation
Pipelines
1. Backbone of Experimentation flow
2. Essential part of Continuous Integration and Delivery flow
3. Major part of Continuous Retraining flow
4. Production workload (unlike traditional CI/CD)
5. Part of day-to-day model tuning and development process
6. Idempotent — Should produce the same results with the same inputs
ML Pipeline Characteristics
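Idempotency is also what makes it safe to reuse the outputs of expensive steps such as data preparation. The sketch below shows one common way to get there: cache each step's output under a hash of the step name and its inputs. The `StepCache` class and the `prepare_data` step are hypothetical illustrations, not part of Kubeflow or Amazon SageMaker, and a real setup would persist outputs to durable storage rather than memory.

```python
import hashlib
import json


class StepCache:
    """In-memory cache keyed by a hash of the step name and its inputs."""

    def __init__(self):
        self._store = {}
        self.executions = 0

    @staticmethod
    def key(step_name, inputs):
        # sort_keys makes the key independent of argument order.
        payload = json.dumps({"step": step_name, "inputs": inputs}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, step_name, func, inputs):
        """Execute the step only if these exact inputs were not seen before."""
        cache_key = self.key(step_name, inputs)
        if cache_key not in self._store:
            self._store[cache_key] = func(**inputs)
            self.executions += 1
        return self._store[cache_key]


def prepare_data(source, image_size):
    """Stand-in for a long-running, deterministic data preparation step."""
    return {"dataset": f"{source}@{image_size}px"}


cache = StepCache()
first = cache.run("prepare_data", prepare_data,
                  {"source": "raw-images", "image_size": 224})
second = cache.run("prepare_data", prepare_data,
                   {"image_size": 224, "source": "raw-images"})
third = cache.run("prepare_data", prepare_data,
                  {"source": "raw-images", "image_size": 128})
print(cache.executions)
```

The second call is served from the cache because its inputs are identical; only a changed input triggers a new execution. This only works if the step itself is deterministic, which is the point of the idempotency requirement.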
ML Pipeline Options
Component/Option | Amazon SageMaker Managed | AWS Native | Kubernetes Native
DSL | SageMaker Processing | Data Science SDK for Step Functions | Kubeflow Pipelines
Orchestrator | SageMaker Processing | Step Functions | Argo Workflow
Metadata Tracker & UI | Amazon SageMaker Experiments | N/A | Kubeflow Metadata
Integrations (Tuner, Debugger, TensorBoard, etc.) | Amazon SageMaker Services | DIY | Open source, Amazon SageMaker Components
Kubeflow: Orchestrator and Experiments Tracker of Choice
ML Engineer-Centric Flow
End-to-end
Amazon
SageMaker +
Kubeflow
Pipelines
MLOps with
Argo Workflows,
Amazon SageMaker,
& Kubeflow
Summary of Kubeflow on AWS
Best Practices:
● Invest in a library of reusable components
● Use Amazon SageMaker Components for Kubeflow
● Deploy on Amazon EKS, consider Provectus Swiss Army
Kube for a quick start
● Use Argo and Kubeflow for MLOps
Benefits:
● Metadata Tracker and Pipeline Orchestrator
● Minimal intervention into existing day-to-day ML routines
Feature Store
Value Proposition of Feature Store
A data management layer for machine learning features.
1. Better ROI from feature engineering — Facilitates collaboration,
sharing and reusing of features
2. Increases ML Engineer productivity — Storage is further
decoupled from ML pipelines
3. Prevents training-serving data skew by design
4. Can encapsulate or facilitate data versioning and features
quality monitoring
Good News: A properly designed Data Lake
covers 80% of requirements for Feature Store
Higher Level Operations:
● Fetch batch (take a sample)
● Get one
● Add / Deprecate feature
Lineage Metadata:
● Upstream Models
● Data Sources and transformations
Annotation Metadata:
● Agreements
● Judgements
● Annotation job parameters
Adding ML Awareness to Data Lake
Data Profiling Metadata:
● Min/max
● Uniqueness, missing values, etc.
Governance Metadata:
● Owner
● Description
● Version
● Last updated, SLA
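The operations and metadata listed on this slide can be illustrated with a small in-memory registry. This is a minimal sketch, not any existing feature store's API: the `FeatureMetadata` and `FeatureRegistry` classes, their method names, and the sample feature `pupil_diameter_mm` are all hypothetical.

```python
import random
from dataclasses import dataclass, field
from datetime import date


@dataclass
class FeatureMetadata:
    """Governance and lineage metadata attached to one feature."""
    name: str
    owner: str
    description: str
    version: int
    last_updated: date
    upstream_sources: list = field(default_factory=list)
    deprecated: bool = False


class FeatureRegistry:
    """Toy registry offering add/deprecate, get one, and fetch batch."""

    def __init__(self):
        self._meta = {}
        self._values = {}  # feature name -> {entity_id: value}

    def add_feature(self, meta, values):
        self._meta[meta.name] = meta
        self._values[meta.name] = dict(values)

    def deprecate_feature(self, name):
        self._meta[name].deprecated = True

    def get_one(self, name, entity_id):
        self._check_active(name)
        return self._values[name][entity_id]

    def fetch_batch(self, name, sample_size, seed=0):
        """Return a reproducible random sample of entity values."""
        self._check_active(name)
        entity_ids = sorted(self._values[name])
        rng = random.Random(seed)
        chosen = rng.sample(entity_ids, min(sample_size, len(entity_ids)))
        return {entity_id: self._values[name][entity_id] for entity_id in chosen}

    def profile(self, name):
        """Data profiling metadata: min/max, missing values, uniqueness."""
        values = list(self._values[name].values())
        present = [value for value in values if value is not None]
        return {"min": min(present), "max": max(present),
                "missing": len(values) - len(present),
                "unique": len(set(present))}

    def _check_active(self, name):
        if self._meta[name].deprecated:
            raise LookupError(f"feature {name!r} is deprecated")


registry = FeatureRegistry()
registry.add_feature(
    FeatureMetadata(name="pupil_diameter_mm", owner="ml-team",
                    description="Hypothetical example feature", version=1,
                    last_updated=date(2020, 1, 1),
                    upstream_sources=["raw-images"]),
    {"img-1": 3.2, "img-2": None, "img-3": 4.1, "img-4": 3.2},
)
stats = registry.profile("pupil_diameter_mm")
batch = registry.fetch_batch("pupil_diameter_mm", sample_size=2)
print(stats, batch)
```

Keeping the metadata next to the values is what turns a plain data lake table into something a pipeline can query for ownership, freshness, and quality before training on it.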
Feature Store: Options
Not a Store. General purpose Data Catalogue.
Adds nice UI, Governance and Searchability.
Great design. Early Stage. Nicely overlaps with Data Lake.
No extensive metadata management yet.
AWS support: https://github.com/feast-dev/feast/issues/367
By Ph.D for Ph.Ds. Tremendous amount of work,
very advanced concepts but overcomplicated.
By creators of Uber Michelangelo. Closed source.
1. Modern ML infrastructure accelerates time to value for ML initiatives and increases
trust from the business
2. Eliminates handoffs between Data Scientists, ML Engineers and IT
3. Must-have requirement for small ML shops and for large organizations. Spans from
straightforward “image classification” projects to more complex ML pipelines
4. Must-have requirement for secure and compliant environments
5. Minimizes growing technical debt in machine learning projects
6. Complements fully managed AWS services with Open Source projects for pipeline
orchestration, experiments tracking, dataset versioning, and feature store
Summary of ML Infrastructure
125 University Avenue
Suite 290, Palo Alto
California, 94301
hello@provectus.com
Questions, details?
We would be happy to answer!

 
How to implement authorization in your backend with AWS IAM
How to implement authorization in your backend with AWS IAMHow to implement authorization in your backend with AWS IAM
How to implement authorization in your backend with AWS IAMProvectus
 
Yurii Gavrilin | ML Interpretability: From A to Z | Kazan ODSC Meetup
Yurii Gavrilin | ML Interpretability: From A to Z | Kazan ODSC MeetupYurii Gavrilin | ML Interpretability: From A to Z | Kazan ODSC Meetup
Yurii Gavrilin | ML Interpretability: From A to Z | Kazan ODSC MeetupProvectus
 
Andrei Grigoriev | Version Control in Data Science | Kazan ODSC Meetup
Andrei Grigoriev | Version Control in Data Science | Kazan ODSC MeetupAndrei Grigoriev | Version Control in Data Science | Kazan ODSC Meetup
Andrei Grigoriev | Version Control in Data Science | Kazan ODSC MeetupProvectus
 

MLOps and Reproducible ML on AWS with Kubeflow and Amazon SageMaker

  • 1. MLOps and Reproducible ML on AWS with Kubeflow and Amazon SageMaker. Presented by: Stepan Pushkarev, CTO @ Provectus; Qingwei Li, ML Specialist Solutions Architect @ AWS
  • 2. Webinar Objectives: 1. Learn how to build a scalable and secure ML infrastructure on AWS with Provectus 2. Explore best practices for using Amazon SageMaker with open source tools for a better experience and productivity
  • 3. Webinar Prerequisites: 1. Familiarity with AWS & Amazon SageMaker services 2. Familiarity with ML Workflow 3. Familiarity with Kubeflow & Kubeflow Pipelines
  • 4. Agenda: 1. Introductions 2. Case Study: GoCheck Kids 3. Overview of AWS Infrastructure for Machine Learning 4. Provectus ML Infrastructure on AWS: a. Experimentation b. MLOps c. Feature Store
  • 5. AI-First Consultancy & Solutions Provider. Clients ranging from fast-growing startups to large enterprises. 450 employees and growing. Established in 2010. HQ in Palo Alto. Offices across the US, Canada, and Europe. We are obsessed with leveraging cloud, data, and AI to reimagine the way businesses operate, compete, and deliver customer value.
  • 6. Our Clients. Innovative Tech Vendors: seeking niche expertise to differentiate and win the market. Midsize to Large Enterprises: seeking to accelerate innovation and achieve operational excellence.
  • 7. Introductions: Stepan Pushkarev, Chief Technology Officer, Provectus; Iskandar Sitdikov, ML Solutions Architect, Provectus; Rinat Gareev, ML Solutions Architect, Provectus; Ilnur Garifullin, ML Solutions Architect, Provectus; Qingwei Li, ML Specialist Solutions Architect, AWS
  • 8. The past few years have been like a dream come true for those who work in analytics and big data. There is a new career path for platform engineers to learn Hadoop, Scala and Spark. Java and Python programmers have a chance to move to the Big Data world. There they find higher salaries and new challenges, and get to scale up to distributed systems. But recently I have started to hear complaints and dashed hopes from engineers who have spent time working there.
  • 9. 1. Tools evolution: The Apache Spark/Hadoop ecosystem is great, but it is not stable or user-friendly enough to just run and forget. Engineers and data scientists should contribute to existing open source projects and create new tools to fill the gaps in day-to-day operations. 2. Education and cross skills: When data scientists write code, they need to think not just about abstractions, but also about the practical issues of what is possible and what is reasonable. For example, they need to think about how long their query will run and whether the data they extract will fit into the storage mechanism they are using. 3. Improve the process: DevOps might be a solution. Here DevOps does not just mean writing Ansible scripts and installing Jenkins. We need DevOps working in optimal fashion to reduce handoffs and invent new tools that give everyone self-service and make them as productive as possible.
  • 10. Why ML Infrastructure. GoCheck Kids Story: Secure, agile, and compliant ML infrastructure for Deep Vision Screening
  • 12. Deep Vision Solution for GoCheck Kids. Business Problem: Reduce manual overhead for child vision screening. Detect strabismus, crescent, and dark iris/pupil population, and reject images where the child is not looking straight into the camera. Security and compliance requirements: track everything, do not touch anything. Solution: End-to-end deep learning image classification models to detect child gaze, strabismus, crescent, and dark iris/pupil population.
  • 13. Modeling Hypothesis. Provectus has developed quite a few ML models: ● Different inputs (pre-processing, region cropping, single vs. two eyes, etc.): 6 ● Different feature generation backbones (deep convolutional networks: ResNet, MobileNet, EfficientNet, custom, etc.): 7 ● Transfer learning from a synthetic dataset: 3 ● Tweaks to objective functions to tackle data imbalance: 5 ● Different dataset splits: 10. In total, 6 x 7 x 3 x 5 x 10 = 6,300 combinations to test in 3 weeks!
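The combination count on slide 13 can be checked with a few lines of Python. This is only a sketch of the arithmetic; the dictionary keys are shorthand labels chosen here for illustration, while the counts are the ones listed on the slide.

```python
import math

# Number of options per modeling dimension, as listed on slide 13.
# The key names are illustrative labels, not identifiers from the project.
dimensions = {
    "input variants": 6,
    "feature backbones": 7,
    "transfer learning settings": 3,
    "objective function tweaks": 5,
    "dataset splits": 10,
}

# A full grid search would need one experiment per combination.
total_combinations = math.prod(dimensions.values())
print(total_combinations)  # 6300
```

A grid of this size is why the deck argues for automated pipelines and hyperparameter optimization rather than manual runs.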
  • 14. Benefits and Outcomes of ML Infrastructure. Conducted ~100* experiments on the entire dataset using pipelines within 3 weeks ● 100,000+ images ● Each experiment takes 15 min to 6 hours on a single GPU (P3 instance type). We always had quite a few pending improvement hypotheses in the backlog ● Each good hypothesis needs several runs to determine the best hyperparameters ● OR an automatic hyperparameter optimizer. Data preparation took ~5 hours ● Had to parallelize and reuse outputs. Each experiment produces artifacts: models, metrics, predictions. Met security and compliance requirements. * Not counting development runs and experiments in notebook instances.
  • 15. Results Summary. 3X increase in the ML model’s recall (same precision). 95% of ML engineers’ time was dedicated to experimentation. 100+ large-scale experiments in 3 weeks by 3 ML engineers. 100% secure and FDA compliant. This could not be achieved without Provectus ML Infrastructure on AWS.
  • 16. Overview of AWS Infrastructure for Machine Learning
  • 17. AWS AI/ML Stack. AWS AI Services (vision, speech, text, search, chatbots, personalization, forecasting, fraud, development, contact centers): Amazon Rekognition, Amazon Polly, Amazon Transcribe +Medical, Amazon Comprehend +Medical, Amazon Translate, Amazon Lex, Amazon Personalize, Amazon Forecast, Amazon Fraud Detector, Amazon CodeGuru, Amazon Textract, Amazon Kendra, Contact Lens for Amazon Connect. AWS ML Services: Amazon SageMaker Studio IDE (new); Amazon SageMaker Ground Truth, Amazon A2I, Amazon SageMaker Neo, built-in algorithms, SageMaker Notebooks (new), SageMaker Experiments (new), model tuning, SageMaker Debugger (new), SageMaker Autopilot (new), model hosting, SageMaker Model Monitor (new). AWS ML Frameworks & Infrastructure: Deep Learning AMIs & Containers, GPUs & CPUs, Elastic Inference, Inferentia, FPGA.
  • 18. Amazon SageMaker: A Fully Managed Service for ML. Collect and prepare training data; select or build ML algorithms; set up and manage environments for training; train, debug, and tune models; deploy models in production; manage training runs; monitor models; scale and manage the production environment; validate predictions.
  • 21. ML Infrastructure and Services. Image registry (container image repository): Amazon Elastic Container Registry (Amazon ECR). Compute (where the containers run): Amazon Elastic Compute Cloud (Amazon EC2). ML services (fully managed service that covers the entire machine learning workflow): Amazon SageMaker, with Jupyter notebook instances, high-performance algorithms, large-scale training, optimization, one-click deployment, and fully managed hosting with auto-scaling. Management (deployment, scheduling, scaling, and management of containerized applications): Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Elastic Container Service (Amazon ECS). © 2020, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 22. Scaling ML on Kubernetes with Amazon SageMaker. Kubernetes: Amazon SageMaker Operators for Kubernetes (github.com/aws/amazon-sagemaker-operator-for-k8s). Kubeflow: Amazon SageMaker Components for Kubeflow Pipelines (github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker).
  • 24. Open Source + Amazon SageMaker Value Proposition. Amazon SageMaker: • Fully-managed infrastructure • Ground Truth labeling • Automatic model tuning • Built-in optimized algorithms • Managed Spot Training • Scalable inference endpoints • Model monitoring • Easy scalability. Kubeflow: • Portability • Composability • Scalability • Shared infrastructure • Repeatable pipelines • Automation • CI/CD • Open-source.
  • 25. Amazon SageMaker Components for Kubeflow Pipelines. Diagram labels: Kubeflow Pipeline; Pipeline step (three steps); Component (Input/Output, Implementation (container), Metadata); Other component (two); Amazon ECR; Amazon SageMaker.
  • 26. Amazon SageMaker Components for Kubeflow Pipelines. Example pipeline: 1. Hyperparameter optimization 2. Select best hyperparameters and increase epochs 3. Train the model using the best hyperparameters 4. Create an Amazon SageMaker model 5. Deploy the model. BYO container; BYO training scripts.
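The five steps of the example pipeline on slide 26 can be sketched as plain Python functions to show how each step's output feeds the next. This is a minimal sketch only: every function, name, and value below is a hypothetical stand-in invented for illustration, it does not call Kubeflow Pipelines or Amazon SageMaker, and the toy scoring rule is not from the deck.

```python
import random


def tune(search_space, trials, seed=0):
    """Step 1: stand-in for a hyperparameter optimization job (random search)."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        params = {name: rng.choice(values) for name, values in search_space.items()}
        # Toy objective invented for this sketch: prefer lr near 0.01, larger batches.
        score = -abs(params["learning_rate"] - 0.01) + params["batch_size"] / 10_000
        results.append((score, params))
    return results


def select_best(results, epochs):
    """Step 2: pick the best trial and raise the epoch count for the full run."""
    _, best_params = max(results, key=lambda item: item[0])
    return {**best_params, "epochs": epochs}


def train(params):
    """Step 3: stand-in for the full training job; returns a fake artifact reference."""
    return {"artifact": "model.tar.gz", "params": params}


def create_model(training_output):
    """Step 4: stand-in for registering the trained artifact as a model."""
    return {"model_name": "demo-model", **training_output}


def deploy(model):
    """Step 5: stand-in for deploying the model behind an endpoint."""
    return {"endpoint": model["model_name"] + "-endpoint", "status": "InService"}


def run_pipeline():
    space = {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [32, 64, 128]}
    results = tune(space, trials=20)
    best = select_best(results, epochs=50)
    endpoint = deploy(create_model(train(best)))
    return best, endpoint


best, endpoint = run_pipeline()
print(best, endpoint)
```

In a real Kubeflow pipeline each function would be a containerized component, and the values passed between them would be component inputs and outputs rather than in-process Python objects.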
  • 27. Amazon SageMaker + Kubeflow for Machine Learning. Diagram labels: Model development; Model training; Model tracking; Model deployment; Hyper-param tuning; Data prep; Amazon SageMaker.
  • 28. Scaling ML on Kubernetes with Amazon SageMaker. Kubernetes: Amazon SageMaker Operators for Kubernetes (github.com/aws/amazon-sagemaker-operator-for-k8s). Kubeflow: Amazon SageMaker Components for Kubeflow Pipelines (github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker).
  • 29. Amazon SageMaker Operators for Kubernetes vs. Components for Kubeflow Pipelines
      Product | Architecture | Kubernetes | Orchestration | Dev Interface | GUI | Ease of Use
      SageMaker Components | Kubeflow Pipeline Components | Yes | Self-hosted Kubeflow Pipelines | Python | KFP Dashboard | Medium
      SageMaker Operators | Kubernetes Operators, Custom Resources | Yes | Kubernetes tools (e.g., Flyte, Argo) | YAML, or custom extension by customer | None, or custom | Advanced
  • 32. How Provectus Adds Value. Feature Store: Store and reuse features to build ML models faster. ML Workflow Orchestrator: Reproduce and track the whole ML workflow. Dataset Management: Track and govern training datasets. Dataset Sampling: Sample from production streams. Advanced Monitoring: Detect drift in text & images. MLOps: Continuous Training & Delivery.
  • 33. Pipelines: The Core of MLOps and Reproducible Experimentation
  • 34. ML Pipeline Characteristics: 1. Backbone of the experimentation flow 2. Essential part of the Continuous Integration and Delivery flow 3. Major part of the Continuous Retraining flow 4. Production workload (unlike traditional CI/CD) 5. Part of the day-to-day model tuning and development process 6. Idempotent: should produce the same results with the same inputs
  • 35. ML Pipeline Options. Components compared: DSL; Orchestrator; Metadata Tracker & UI; Integrations (Tuner, Debugger, TensorBoard, etc.). Options compared: Amazon SageMaker Managed; AWS Native; Kubernetes Native.
  • 36. ML Pipeline Options
      Component / Option | Amazon SageMaker Managed | AWS Native | Kubernetes Native
      DSL | SageMaker Processing | Data Science SDK for Step Functions | Kubeflow Pipelines
      Orchestrator | SageMaker Processing | Step Functions | Argo Workflow
      Metadata Tracker & UI | Amazon SageMaker Experiments | N/A | Kubeflow Metadata
      Integrations (Tuner, Debugger, TensorBoard, etc.) | Amazon SageMaker Services | DIY | Open source, Amazon SageMaker Components
  • 37. Kubeflow: Orchestrator and Experiments Tracker of Choice
  • 40. MLOps with Argo Workflows, Amazon SageMaker, & Kubeflow
  • 42. Summary of Kubeflow on AWS. Best Practices: ● Invest in a library of reusable components ● Use Amazon SageMaker Components for Kubeflow ● Deploy on Amazon EKS; consider Provectus Swiss Army Kube for a quick start ● Use Argo and Kubeflow for MLOps. Benefits: ● Metadata Tracker and Pipeline Orchestrator ● Minimal intervention into existing day-to-day ML routines
  • 44. Value Proposition of Feature Store. A data management layer for machine learning features. 1. Better ROI from feature engineering: facilitates collaboration, sharing, and reuse of features 2. Increases ML engineer productivity: storage is further decoupled from ML pipelines 3. Prevents training-serving data skew by design 4. Can encapsulate or facilitate data versioning and feature quality monitoring
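Point 3 on slide 44 (preventing training-serving skew by design) usually comes down to having one definition of each feature that both the training path and the serving path call. The sketch below illustrates that idea with invented features; `compute_features`, `amount_bucket`, and `is_weekend` are hypothetical examples, not features from the GoCheck Kids project.

```python
def compute_features(raw):
    """Single source of truth for feature logic (hypothetical example features)."""
    return {
        # Bucket the amount into 0..9 by hundreds, capping at 9.
        "amount_bucket": min(int(raw["amount"] // 100), 9),
        # Days are numbered 0 (Monday) to 6 (Sunday) in this sketch.
        "is_weekend": raw["day_of_week"] in (5, 6),
    }


def build_training_rows(records):
    """Batch path: features for a historical dataset."""
    return [compute_features(record) for record in records]


def serve_features(request):
    """Online path: features for a single prediction request."""
    return compute_features(request)


record = {"amount": 250.0, "day_of_week": 6}
print(build_training_rows([record])[0] == serve_features(record))  # True
```

Because both paths share `compute_features`, a change to the feature logic cannot reach training without also reaching serving.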
  • 45. Good News: A properly designed Data Lake covers 80% of requirements for Feature Store
  • 46. Adding ML Awareness to Data Lake. Higher Level Operations: ● Fetch batch (take a sample) ● Get one ● Add / Deprecate feature. Lineage Metadata: ● Upstream Models ● Data Sources and transformations. Annotation Metadata: ● Agreements ● Judgements ● Annotation job parameters. Data Profiling Metadata: ● Min/max ● Uniqueness, missing values, etc. Governance Metadata: ● Owner ● Description ● Version ● Last updated, SLA
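The operations and metadata groups on slide 46 can be sketched as a small in-memory registry. This is an illustration under assumptions, not a description of the Provectus implementation: the class names, field names, and the `age_months` feature are hypothetical, and a real system would sit on top of the data lake rather than a Python dictionary.

```python
import random
from dataclasses import dataclass, field


@dataclass
class FeatureMetadata:
    # Governance metadata
    name: str
    owner: str
    description: str
    version: int = 1
    # Lineage metadata: upstream data sources and transformations
    sources: list = field(default_factory=list)
    # Data profiling metadata, filled in when the feature is added
    profile: dict = field(default_factory=dict)
    deprecated: bool = False


class FeatureRegistry:
    """In-memory stand-in for a metadata layer on top of a data lake."""

    def __init__(self):
        self._meta = {}
        self._values = {}  # feature name -> {entity id: value}

    def add_feature(self, meta, values):
        """Register a feature and profile its values (min/max, uniqueness, missing)."""
        present = [v for v in values.values() if v is not None]
        meta.profile = {
            "min": min(present) if present else None,
            "max": max(present) if present else None,
            "unique": len(set(present)),
            "missing": len(values) - len(present),
        }
        self._meta[meta.name] = meta
        self._values[meta.name] = dict(values)

    def deprecate_feature(self, name):
        self._meta[name].deprecated = True

    def describe(self, name):
        return self._meta[name]

    def get_one(self, name, entity_id):
        """Fetch a single feature value for one entity."""
        return self._values[name][entity_id]

    def fetch_batch(self, name, sample_size, seed=0):
        """Take a reproducible random sample of entities for a feature."""
        ids = sorted(self._values[name])
        chosen = random.Random(seed).sample(ids, min(sample_size, len(ids)))
        return {entity_id: self._values[name][entity_id] for entity_id in chosen}


registry = FeatureRegistry()
registry.add_feature(
    FeatureMetadata(name="age_months", owner="ml-team",
                    description="Age in months", sources=["screening_events"]),
    {"a": 14, "b": 30, "c": None, "d": 30},
)
print(registry.describe("age_months").profile)
```

Annotation metadata (agreements, judgements, annotation job parameters) is left out to keep the sketch short; it could be added as another field on `FeatureMetadata`.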
  • 47. Feature Store: Options. ● Not a Store. General purpose Data Catalogue. Adds nice UI, Governance and Searchability. ● Great design. Early stage. Nicely overlaps with Data Lake. No extensive metadata management yet. AWS support: https://github.com/feast-dev/feast/issues/367 ● By PhDs for PhDs. Tremendous amount of work, very advanced concepts, but overcomplicated. ● By creators of Uber Michelangelo. Closed source.
  • 48. Summary of ML Infrastructure: 1. Modern ML infrastructure accelerates time to value for ML initiatives and increases trust from the business 2. Eliminates handoffs between Data Scientists, ML Engineers and IT 3. Must-have requirement for small ML shops and for large organizations. Spans from straightforward “image classification” projects to more complex ML pipelines 4. Must-have requirement for secure and compliant environments 5. Minimizes growing technical debt in machine learning projects 6. Complements fully managed AWS services with Open Source projects for pipeline orchestration, experiment tracking, dataset versioning, and feature store
  • 49. Questions, details? We would be happy to answer! 125 University Avenue, Suite 290, Palo Alto, California, 94301. hello@provectus.com