SlideShare a Scribd company logo
1 of 50
Download to read offline
Challenges of Operationalizing Data Science in
Production
Machine Learning Operations Meet-Up #1
July 4
Real-world Data Science Challenges
• Section 1: Business Aspects
• Section 2: Technology and Operational Aspects
• Demo
Agenda
3
Santanu Dey is a Solutions Architect helping
customers with their Digital Transformations
journey, solutions involving Cloud, Analytics,
Microservices etc.
Over 18 years of proven track record of
designing and operationalizing high-volume,
mission critical, distributed systems.
Speakers
@Santanu_Dey
santanud@iguazio.com
Santanu Dey
Rasmi's primary background is in product and
technology management. His secondary
background covers business transformation
and operations functions across enterprise
and startup environment. He currently is a
Product Owner at Experian's APAC innovation
Hub - XLabs.
https://www.linkedin.com/in/ras
mi-m-428b3a46/
Rasmi Mohapatra
Section 1: Business Aspects
5
What is Data Science?
1. Domain Knowledge!
2. Actually understanding math!
3. Visualisation!
Business Issue #0
Don’t stop until you find your data scientist!
Business Issue #1
Giving into hype - Underestimating “small” data
Business Issue #2
Being unaware of regulation, compliance
Business Issue #3
Can’t explain it right to right people? You probably losing your hard work!!!
Business Issue #4
Accessing clean data
# 0- Focus on your domain and business requirements
# 1- Business cares about outcomes – not Big Data or Small
# 2– Be aware of regulatory implications of Data
# 3– Focus on Explain-ability & “Good enough” accuracy
# 4– Making the dataset usable for Data Science
Business Challenges Summery
Section 2: Technology Aspects
Data Science Journey
Sampledata set
at rest
Data Scientist
Workstation
Models
What does it take to Productionize AI Apps
Data Science Team
Large Data Sets
?Productionize Smart
Applications
Where are the key challenges?
Landmines!
Large Data Sets
Store &
access
large
training
data sets
access
fresh &
historical
data
Cleanse
and
prepare
large data
sets
support
for tools of
choice
distributed
training
including
GPUs
Track
models
and data
Make
models
portable
Serve
models at
scale
retrain
and
deploy
models
via CI/CD
multi-
tenancy
on a
shared
cluster
Data Engineer Data Scientist ML-Ops
Infra &
SLA aspects
Platform &
Tooling
Productionize Smart
Applications
Data Science Lifecycle
Load data
Pre-
processing
Training a
model
Deployment
Ongoing
monitoring
DS Lifecycle & IT Implications
Load data
Pre-
processing
Training a
model
Deployment
Ongoing
monitoring
SLA Driven
• Latency
• Automation
• Monitoring
• Accuracy
Compute Driven
• CPU
• GPU
• Multi-node
• Scheduling / Sharing
• Elasticity
I/O Intensive
• High Volume
• Legacy Touchpoints
• Often long wait
• Multiplesilos
DS Lifecycle & IT Implications
Load data
Pre-
processing
Training a
model
Deployment
Ongoing
monitoring
SLA Driven
• Latency
• Automation
• Monitoring
• Accuracy
Compute Driven
• CPU
• GPU
• Multi-node
• Scheduling / Sharing
• Elasticity
I/O Intensive
• High Volume
• Legacy Touchpoints
• Often long wait
• Multiplesilos
Customer
Driven
Data Science Lifecycle & Multiple Roles
Load data
Pre-
processing
Training a
model
Deployment
Ongoing
monitoring
Data Engineer Data Scientist ML-Ops
Friction Across Data Science Lifecycle & Roles
Load data
Pre-
processing
Training a
model
Deployment
Ongoing
monitoring
Data Engineer Data Scientist ML-Ops
✓ Get fresh and relevant
data from actual
system
✗ Dependency on IT for
data access and data
prep
✗ Old data, unclean,
wrong granularity
✓ Self-service data prep
by Data Scientist
✗ Am I forced to use
specific services /
tools?
✗ data prep code not
scaling for large
dataset
✓ Support Jupytar and
ALL my DS tools
✗ Rewrite the training
code to scale the
training
✗ Experiments are not
repeatable
✗ IT blocks my tools
✓ Portable model
artifact
✗ Too much CI/CD work
✗ Data input is different
for inference phase
✗ Difficult to
version/track/rollback
models
✓ scale and recover
elastically
✗ Security / Scalability/
Monitoring of
deployed models
✗ No feedback loop
DIY Approach
DIY Approach
Load data
Pre-
processing
Training a
model
Deploymen
t
Ongoing
monitoring
AWS S3
AWS Kinesis
AWS
Lambda
AWS
DynamoDB
AWS EMR
AWS
Batch
Load data
Pre-
processing
Training a
model
Deploymen
t
Ongoing
monitoring
AWS S3
AWS Kinesis
AWS
Lambda
AWS
DynamoDB
AWS EMR AWS S3
AWS SageMaker
Notebooks
ML Training
Cluster
AWS SageMaker
Models
AWS
Batch
DIY Approach
Load data
Pre-
processing
Training a
model
Deploymen
t
Ongoing
monitoring
AWS S3
AWS Kinesis
AWS
Lambda
AWS
DynamoDB
AWS EMR AWS S3
AWS SageMaker
Notebooks
ML Training
Cluster
AWS SageMaker
Models
AWS
Batch
AWS
Lambda
AWS SageMaker
Endpoint
AWS ECR
AWS
CodeCommit
AWS
CodePipeline
AWS
CodeBuild
DIY Approach
Load data
Pre-
processing
Training a
model
Deploymen
t
Ongoing
monitoring
AWS S3
AWS Kinesis
AWS
Lambda
AWS
DynamoDB
AWS EMR AWS S3
AWS SageMaker
Notebooks
ML Training
Cluster
AWS SageMaker
Models
AWS
Batch
AWS
Lambda
AWS SageMaker
Endpoint
AWS ECR
AWS
CodeCommit
AWS
CodePipeline
AWS
CodeBuild
Data Engineer Data Scientist ML-Ops
DIY Approach
Building
yourown
platform
Automation
Availability, Scalability, Cost
DIY Approach
Technical
Concerns
Automation
Availability, Scalability, Cost
Business
Focus
Data
Wrangling
for Patterns
Feature
Identification
Build, Train,
Test Model
Refining
Algorithm
Model
Parameter
Tuning
Self-Service
DIY Approach
Data
Ingestion
Technical
Concerns
Automation
Availability, Scalability, Cost
Business
Focus
Data
Wrangling
for Patterns
Feature
Identification
Build, Train,
Test Model
Refining
Algorithm
Model
Parameter
Tuning
Self-Service
DIY Approach
Data
Ingestion
Focos on the
business
challenges as
outlined in section 1
Platform Approach
Platform Approach
Data Science Platform Services
Platform Approach
Data Science Platform Services
Data
Wrangling
for Patterns
Feature
Identification
Build, Train,
Test Model
Refining
Algorithm
Model
Parameter
Tuning
Data
Ingestion
Platform Approach
Data Science Platform Services
Resource Sharing
Data
Wrangling
for Patterns
Feature
Identification
Build, Train,
Test Model
Refining
Algorithm
Model
Parameter
Tuning
Self-Service
Data
Ingestion
Platform Approach
Data Science Platform Services
Resource Sharing
Multi-Model Persistence CacheLong-Term Commodity Storage
Data
Wrangling
for Patterns
Feature
Identification
Build, Train,
Test Model
Refining
Algorithm
Model
Parameter
Tuning
Self-Service
Data
Ingestion
Business
Focus
A platform
should hidethe
technical
concerns
Cloud or Hybrid or On-Prem
Demo
Summary
41
Code/Model Development Is Just The FIRST Step
Develop/Experiment Package Scale-out Tune Instrument Automate
• Dependencies
• Parameters
• Run scripts
• Build
• Load-balance
• Data partitions
• Modeldistribution
• Hyper params
• Parallelism
• GPU support
• Query tuning
• Caching
• Monitoring
• Logging
• Versioning
• Security
• CI/CD
• Workflows
• Rolling upgrades
• A/B testing
Weeks with one
data scientist
Months with a large team of developers,
scientists, data engineers and DevOps
Every piece of code, data science algorithm, or data processing task must be built for production
Nuclio: Fast Serverless for Data Science & RT Analytics
magic commands from
notebook to function
Extending MLPipelines from batch:
1. Parallel processing
2. Code build/deployment
3. Stream processing
4. API/Model Serving
High-performance IO and Computation
+ GPU Optimizations Code + DevOps Automation:
1. Auto-scaling(to zero)
2. Automatedlogging & monitoring
3. Security hardening
4. Auto-build and CI/CD
5. Workloadmobility (cloud/edge/..)
42
Platform Approach - Summary
Resource Sharing
Multi-Model Persistence CacheLong-Term Commodity Storage
Self-Service
Cloud or Hybrid or On-Prem
PandasDaskTensorFlow PyTorch Spark Presto GrafanaPrometheusRapidsServerless
KubeFlow
Time SeriesStream TableObject
Jupyter Notebook
Platform Approach - Summary
Resource Sharing
Multi-Model Persistence CacheLong-Term Commodity Storage
Self-Service
Cloud or Hybrid or On-Prem
PandasDaskTensorFlow PyTorch Spark Presto GrafanaPrometheusRapidsServerless
KubeFlow
Time SeriesStream TableObject
Jupyter Notebook
No new skills to
learn – use most all
common ML tools
e.g. Jupy, Kubeflow
Platform Approach - Summary
Resource Sharing
Multi-Model Persistence CacheLong-Term Commodity Storage
Self-Service
Cloud or Hybrid or On-Prem
PandasDaskTensorFlow PyTorch Spark Presto GrafanaPrometheusRapidsServerless
KubeFlow
Time SeriesStream TableObject
Jupyter Notebook
Natively available
data services for
high IOPS ML
workloads
Platform Approach - Summary
Resource Sharing
Multi-Model Persistence CacheLong-Term Commodity Storage
Self-Service
Cloud or Hybrid or On-Prem
PandasDaskTensorFlow PyTorch Spark Presto GrafanaPrometheusRapidsServerless
KubeFlow
Time SeriesStream TableObject
Jupyter Notebook
Ability to support
CPU as well GPU
for training as well
as inferencing
Platform Approach - Summary
Resource Sharing
Multi-Model Persistence CacheLong-Term Commodity Storage
Self-Service
Cloud or Hybrid or On-Prem
PandasDaskTensorFlow PyTorch Spark Presto GrafanaPrometheusRapidsServerless
KubeFlow
Time SeriesStream TableObject
Jupyter Notebook
scaling, monitoring
and sharing of
platform resources
using K8S
Q&A
santanud@iguazio.com
▪ ML Pipelines for Production: KubeFlow
▪ Why use GPUs to Accelerate ML Projects
▪ How Serverless Simplifies ML Model Development & Deployment
Upcoming Meet-up Sessions
Thanks

More Related Content

What's hot

Productionalizing Machine Learning Solutions with Effective Tracking, Monitor...
Productionalizing Machine Learning Solutions with Effective Tracking, Monitor...Productionalizing Machine Learning Solutions with Effective Tracking, Monitor...
Productionalizing Machine Learning Solutions with Effective Tracking, Monitor...Databricks
 
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...Databricks
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkDatabricks
 
Ml infra at an early stage
Ml infra at an early stageMl infra at an early stage
Ml infra at an early stageNick Handel
 
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...Databricks
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in productionTuri, Inc.
 
How to Build a ML Platform Efficiently Using Open-Source
How to Build a ML Platform Efficiently Using Open-SourceHow to Build a ML Platform Efficiently Using Open-Source
How to Build a ML Platform Efficiently Using Open-SourceDatabricks
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsMárton Kodok
 
Empowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLEmpowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLDatabricks
 
Using Apache Spark for Predicting Degrading and Failing Parts in Aviation
Using Apache Spark for Predicting Degrading and Failing Parts in AviationUsing Apache Spark for Predicting Degrading and Failing Parts in Aviation
Using Apache Spark for Predicting Degrading and Failing Parts in AviationDatabricks
 
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneUsing H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneSri Ambati
 
Consolidating MLOps at One of Europe’s Biggest Airports
Consolidating MLOps at One of Europe’s Biggest AirportsConsolidating MLOps at One of Europe’s Biggest Airports
Consolidating MLOps at One of Europe’s Biggest AirportsDatabricks
 
Saving Energy in Homes with a Unified Approach to Data and AI
Saving Energy in Homes with a Unified Approach to Data and AISaving Energy in Homes with a Unified Approach to Data and AI
Saving Energy in Homes with a Unified Approach to Data and AIDatabricks
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionProvectus
 
Ml ops past_present_future
Ml ops past_present_futureMl ops past_present_future
Ml ops past_present_futureNisha Talagala
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)Jasjeet Thind
 
Advanced Model Comparison and Automated Deployment Using ML
Advanced Model Comparison and Automated Deployment Using MLAdvanced Model Comparison and Automated Deployment Using ML
Advanced Model Comparison and Automated Deployment Using MLDatabricks
 
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobilNLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobilDatabricks
 
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...Weaveworks
 

What's hot (20)

NextGenML
NextGenML NextGenML
NextGenML
 
Productionalizing Machine Learning Solutions with Effective Tracking, Monitor...
Productionalizing Machine Learning Solutions with Effective Tracking, Monitor...Productionalizing Machine Learning Solutions with Effective Tracking, Monitor...
Productionalizing Machine Learning Solutions with Effective Tracking, Monitor...
 
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache Spark
 
Ml infra at an early stage
Ml infra at an early stageMl infra at an early stage
Ml infra at an early stage
 
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in production
 
How to Build a ML Platform Efficiently Using Open-Source
How to Build a ML Platform Efficiently Using Open-SourceHow to Build a ML Platform Efficiently Using Open-Source
How to Build a ML Platform Efficiently Using Open-Source
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
 
Empowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLEmpowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETL
 
Using Apache Spark for Predicting Degrading and Failing Parts in Aviation
Using Apache Spark for Predicting Degrading and Failing Parts in AviationUsing Apache Spark for Predicting Degrading and Failing Parts in Aviation
Using Apache Spark for Predicting Degrading and Failing Parts in Aviation
 
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneUsing H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
 
Consolidating MLOps at One of Europe’s Biggest Airports
Consolidating MLOps at One of Europe’s Biggest AirportsConsolidating MLOps at One of Europe’s Biggest Airports
Consolidating MLOps at One of Europe’s Biggest Airports
 
Saving Energy in Homes with a Unified Approach to Data and AI
Saving Energy in Homes with a Unified Approach to Data and AISaving Energy in Homes with a Unified Approach to Data and AI
Saving Energy in Homes with a Unified Approach to Data and AI
 
MLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in ProductionMLOps and Data Quality: Deploying Reliable ML Models in Production
MLOps and Data Quality: Deploying Reliable ML Models in Production
 
Ml ops past_present_future
Ml ops past_present_futureMl ops past_present_future
Ml ops past_present_future
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
 
Advanced Model Comparison and Automated Deployment Using ML
Advanced Model Comparison and Automated Deployment Using MLAdvanced Model Comparison and Automated Deployment Using ML
Advanced Model Comparison and Automated Deployment Using ML
 
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobilNLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
 
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...
Overcoming Regulatory & Compliance Hurdles with Hybrid Cloud EKS and Weave Gi...
 

Similar to Challenges of Operationalising Data Science in Production

ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...DATAVERSITY
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data EngineeringDurga Gadiraju
 
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...Precisely
 
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersR+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersRevolution Analytics
 
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessAnant Corporation
 
Part 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to EndPart 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to EndCloudera, Inc.
 
What is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PMWhat is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PMProduct School
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionFlorian Wilhelm
 
Bridging the Gap: Analyzing Data in and Below the Cloud
Bridging the Gap: Analyzing Data in and Below the CloudBridging the Gap: Analyzing Data in and Below the Cloud
Bridging the Gap: Analyzing Data in and Below the CloudInside Analysis
 
Productionising Machine Learning Models
Productionising Machine Learning ModelsProductionising Machine Learning Models
Productionising Machine Learning ModelsTash Bickley
 
Near realtime AI deployment with huge data and super low latency - Levi Brack...
Near realtime AI deployment with huge data and super low latency - Levi Brack...Near realtime AI deployment with huge data and super low latency - Levi Brack...
Near realtime AI deployment with huge data and super low latency - Levi Brack...Sri Ambati
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the EnterpriseThe Hive
 
Options for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current MarketOptions for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current MarketDremio Corporation
 
SPSChicagoBurbs 2019 - What is CDM and CDS?
SPSChicagoBurbs 2019 - What is CDM and CDS?SPSChicagoBurbs 2019 - What is CDM and CDS?
SPSChicagoBurbs 2019 - What is CDM and CDS?Nicolas Georgeault
 
Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...DataWorks Summit
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Dataconomy Media
 
20160000 Cloud Discovery Event - Cloud Access Security Brokers
20160000 Cloud Discovery Event - Cloud Access Security Brokers20160000 Cloud Discovery Event - Cloud Access Security Brokers
20160000 Cloud Discovery Event - Cloud Access Security BrokersRobin Vermeirsch
 
Resume quaish abuzer
Resume quaish abuzerResume quaish abuzer
Resume quaish abuzerquaish abuzer
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsDenodo
 
Bitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSBitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSPhilip Filleul
 

Similar to Challenges of Operationalising Data Science in Production (20)

ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...
 
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersR+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
 
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
 
Part 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to EndPart 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to End
 
What is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PMWhat is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PM
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to Production
 
Bridging the Gap: Analyzing Data in and Below the Cloud
Bridging the Gap: Analyzing Data in and Below the CloudBridging the Gap: Analyzing Data in and Below the Cloud
Bridging the Gap: Analyzing Data in and Below the Cloud
 
Productionising Machine Learning Models
Productionising Machine Learning ModelsProductionising Machine Learning Models
Productionising Machine Learning Models
 
Near realtime AI deployment with huge data and super low latency - Levi Brack...
Near realtime AI deployment with huge data and super low latency - Levi Brack...Near realtime AI deployment with huge data and super low latency - Levi Brack...
Near realtime AI deployment with huge data and super low latency - Levi Brack...
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
 
Options for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current MarketOptions for Data Prep - A Survey of the Current Market
Options for Data Prep - A Survey of the Current Market
 
SPSChicagoBurbs 2019 - What is CDM and CDS?
SPSChicagoBurbs 2019 - What is CDM and CDS?SPSChicagoBurbs 2019 - What is CDM and CDS?
SPSChicagoBurbs 2019 - What is CDM and CDS?
 
Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
 
20160000 Cloud Discovery Event - Cloud Access Security Brokers
20160000 Cloud Discovery Event - Cloud Access Security Brokers20160000 Cloud Discovery Event - Cloud Access Security Brokers
20160000 Cloud Discovery Event - Cloud Access Security Brokers
 
Resume quaish abuzer
Resume quaish abuzerResume quaish abuzer
Resume quaish abuzer
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard Rails
 
Bitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FSBitkom Cray presentation - on HPC affecting big data analytics in FS
Bitkom Cray presentation - on HPC affecting big data analytics in FS
 

More from iguazio

Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUsiguazio
 
Webinar: Cutting Time, Complexity and Cost from Data Science to Production
Webinar: Cutting Time, Complexity and Cost from Data Science to ProductionWebinar: Cutting Time, Complexity and Cost from Data Science to Production
Webinar: Cutting Time, Complexity and Cost from Data Science to Productioniguazio
 
The Problem is Data: Gwen Shapira, Confluent, Serverless NYC 2018
The Problem is Data: Gwen Shapira, Confluent, Serverless NYC 2018The Problem is Data: Gwen Shapira, Confluent, Serverless NYC 2018
The Problem is Data: Gwen Shapira, Confluent, Serverless NYC 2018iguazio
 
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...iguazio
 
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018iguazio
 
Serverless real use cases and best practices: Asavari Tayal, Microsoft, Serve...
Serverless real use cases and best practices: Asavari Tayal, Microsoft, Serve...Serverless real use cases and best practices: Asavari Tayal, Microsoft, Serve...
Serverless real use cases and best practices: Asavari Tayal, Microsoft, Serve...iguazio
 
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018iguazio
 
Stac summit june 14th - goodbye datalakes
Stac summit june 14th - goodbye datalakesStac summit june 14th - goodbye datalakes
Stac summit june 14th - goodbye datalakesiguazio
 
Running High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclioRunning High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclioiguazio
 
iguazio - nuclio Meetup Nov 30th
iguazio - nuclio Meetup Nov 30thiguazio - nuclio Meetup Nov 30th
iguazio - nuclio Meetup Nov 30thiguazio
 
nuclio Overview October 2017
nuclio Overview October 2017nuclio Overview October 2017
nuclio Overview October 2017iguazio
 

More from iguazio (11)

Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUs
 
Webinar: Cutting Time, Complexity and Cost from Data Science to Production
Webinar: Cutting Time, Complexity and Cost from Data Science to ProductionWebinar: Cutting Time, Complexity and Cost from Data Science to Production
Webinar: Cutting Time, Complexity and Cost from Data Science to Production
 
The Problem is Data: Gwen Shapira, Confluent, Serverless NYC 2018
The Problem is Data: Gwen Shapira, Confluent, Serverless NYC 2018The Problem is Data: Gwen Shapira, Confluent, Serverless NYC 2018
The Problem is Data: Gwen Shapira, Confluent, Serverless NYC 2018
 
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
 
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018
Serverless and AI: Orit Nissan-Messing, Iguazio, Serverless NYC 2018
 
Serverless real use cases and best practices: Asavari Tayal, Microsoft, Serve...
Serverless real use cases and best practices: Asavari Tayal, Microsoft, Serve...Serverless real use cases and best practices: Asavari Tayal, Microsoft, Serve...
Serverless real use cases and best practices: Asavari Tayal, Microsoft, Serve...
 
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018
 
Stac summit june 14th - goodbye datalakes
Stac summit june 14th - goodbye datalakesStac summit june 14th - goodbye datalakes
Stac summit june 14th - goodbye datalakes
 
Running High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclioRunning High-Speed Serverless with nuclio
Running High-Speed Serverless with nuclio
 
iguazio - nuclio Meetup Nov 30th
iguazio - nuclio Meetup Nov 30thiguazio - nuclio Meetup Nov 30th
iguazio - nuclio Meetup Nov 30th
 
nuclio Overview October 2017
nuclio Overview October 2017nuclio Overview October 2017
nuclio Overview October 2017
 

Recently uploaded

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 

Recently uploaded (20)

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 

Challenges of Operationalising Data Science in Production

  • 1. Challenges of Operationalizing Data Science in Production Machine Learning Operations Meet-Up #1 July 4
  • 2. Real-world Data Science Challenges • Section 1: Business Aspects • Section 2: Technology and Operational Aspects • Demo Agenda
  • 3. 3 Santanu Dey is a Solutions Architect helping customers with their Digital Transformations journey, solutions involving Cloud, Analytics, Microservices etc. Over 18 years of proven track record of designing and operationalizing high-volume, mission critical, distributed systems. Speakers @Santanu_Dey santanud@iguazio.com Santanu Dey Rasmi's primary background is in product and technology management. His secondary background covers business transformation and operations functions across enterprise and startup environment. He currently is a Product Owner at Experian's APAC innovation Hub - XLabs. https://www.linkedin.com/in/ras mi-m-428b3a46/ Rasmi Mohapatra
  • 5. 5 What is Data Science?
  • 6.
  • 10. Business Issue #0 Don’t stop until you find your data scientist!
  • 11. Business Issue #1 Giving into hype - Underestimating “small” data
  • 12. Business Issue #2 Being unaware of regulation, compliance
  • 13. Business Issue #3 Can’t explain it right to right people? You probably losing your hard work!!!
  • 15. # 0- Focus on your domain and business requirements # 1- Business cares about outcomes – not Big Data or Small # 2– Be aware of regulatory implications of Data # 3– Focus on Explain-ability & “Good enough” accuracy # 4– Making the dataset usable for Data Science Business Challenges Summery
  • 17. Data Science Journey Sampledata set at rest Data Scientist Workstation Models
  • 18. What does it take to Productionize AI Apps Data Science Team Large Data Sets ?Productionize Smart Applications
  • 19. Where are the key challenges?
  • 20. Landmines! Large Data Sets Store & access large training data sets access fresh & historical data Cleanse and prepare large data sets support for tools of choice distributed training including GPUs Track models and data Make models portable Serve models at scale retrain and deploy models via CI/CD multi- tenancy on a shared cluster Data Engineer Data Scientist ML-Ops Infra & SLA aspects Platform & Tooling Productionize Smart Applications
  • 21. Data Science Lifecycle Load data Pre- processing Training a model Deployment Ongoing monitoring
  • 22. DS Lifecycle & IT Implications Load data Pre- processing Training a model Deployment Ongoing monitoring SLA Driven • Latency • Automation • Monitoring • Accuracy Compute Driven • CPU • GPU • Multi-node • Scheduling / Sharing • Elasticity I/O Intensive • High Volume • Legacy Touchpoints • Often long wait • Multiplesilos
  • 23. DS Lifecycle & IT Implications Load data Pre- processing Training a model Deployment Ongoing monitoring SLA Driven • Latency • Automation • Monitoring • Accuracy Compute Driven • CPU • GPU • Multi-node • Scheduling / Sharing • Elasticity I/O Intensive • High Volume • Legacy Touchpoints • Often long wait • Multiplesilos Customer Driven
  • 24. Data Science Lifecycle & Multiple Roles Load data Pre- processing Training a model Deployment Ongoing monitoring Data Engineer Data Scientist ML-Ops
  • 25. Friction Across Data Science Lifecycle & Roles Load data Pre- processing Training a model Deployment Ongoing monitoring Data Engineer Data Scientist ML-Ops ✓ Get fresh and relevant data from actual system ✗ Dependency on IT for data access and data prep ✗ Old data, unclean, wrong granularity ✓ Self-service data prep by Data Scientist ✗ Am I forced to use specific services / tools? ✗ data prep code not scaling for large dataset ✓ Support Jupytar and ALL my DS tools ✗ Rewrite the training code to scale the training ✗ Experiments are not repeatable ✗ IT blocks my tools ✓ Portable model artifact ✗ Too much CI/CD work ✗ Data input is different for inference phase ✗ Difficult to version/track/rollback models ✓ scale and recover elastically ✗ Security / Scalability/ Monitoring of deployed models ✗ No feedback loop
  • 27. DIY Approach Load data Pre- processing Training a model Deploymen t Ongoing monitoring AWS S3 AWS Kinesis AWS Lambda AWS DynamoDB AWS EMR AWS Batch
  • 28. Load data Pre- processing Training a model Deploymen t Ongoing monitoring AWS S3 AWS Kinesis AWS Lambda AWS DynamoDB AWS EMR AWS S3 AWS SageMaker Notebooks ML Training Cluster AWS SageMaker Models AWS Batch DIY Approach
  • 29. Load data Pre- processing Training a model Deploymen t Ongoing monitoring AWS S3 AWS Kinesis AWS Lambda AWS DynamoDB AWS EMR AWS S3 AWS SageMaker Notebooks ML Training Cluster AWS SageMaker Models AWS Batch AWS Lambda AWS SageMaker Endpoint AWS ECR AWS CodeCommit AWS CodePipeline AWS CodeBuild DIY Approach
  • 30. Load data Pre- processing Training a model Deploymen t Ongoing monitoring AWS S3 AWS Kinesis AWS Lambda AWS DynamoDB AWS EMR AWS S3 AWS SageMaker Notebooks ML Training Cluster AWS SageMaker Models AWS Batch AWS Lambda AWS SageMaker Endpoint AWS ECR AWS CodeCommit AWS CodePipeline AWS CodeBuild Data Engineer Data Scientist ML-Ops DIY Approach
  • 32. Technical Concerns Automation Availability, Scalability, Cost Business Focus Data Wrangling for Patterns Feature Identification Build, Train, Test Model Refining Algorithm Model Parameter Tuning Self-Service DIY Approach Data Ingestion
  • 33. Technical Concerns Automation Availability, Scalability, Cost Business Focus Data Wrangling for Patterns Feature Identification Build, Train, Test Model Refining Algorithm Model Parameter Tuning Self-Service DIY Approach Data Ingestion Focos on the business challenges as outlined in section 1
  • 35. Platform Approach Data Science Platform Services
  • 36. Platform Approach Data Science Platform Services Data Wrangling for Patterns Feature Identification Build, Train, Test Model Refining Algorithm Model Parameter Tuning Data Ingestion
  • 37. Platform Approach Data Science Platform Services Resource Sharing Data Wrangling for Patterns Feature Identification Build, Train, Test Model Refining Algorithm Model Parameter Tuning Self-Service Data Ingestion
  • 38. Platform Approach Data Science Platform Services Resource Sharing Multi-Model Persistence CacheLong-Term Commodity Storage Data Wrangling for Patterns Feature Identification Build, Train, Test Model Refining Algorithm Model Parameter Tuning Self-Service Data Ingestion Business Focus A platform should hidethe technical concerns Cloud or Hybrid or On-Prem
  • 39. Demo
  • 41. 41 Code/Model Development Is Just The FIRST Step Develop/Experiment Package Scale-out Tune Instrument Automate • Dependencies • Parameters • Run scripts • Build • Load-balance • Data partitions • Modeldistribution • Hyper params • Parallelism • GPU support • Query tuning • Caching • Monitoring • Logging • Versioning • Security • CI/CD • Workflows • Rolling upgrades • A/B testing Weeks with one data scientist Months with a large team of developers, scientists, data engineers and DevOps Every piece of code, data science algorithm, or data processing task must be built for production
  • 42. Nuclio: Fast Serverless for Data Science & RT Analytics magic commands from notebook to function Extending MLPipelines from batch: 1. Parallel processing 2. Code build/deployment 3. Stream processing 4. API/Model Serving High-performance IO and Computation + GPU Optimizations Code + DevOps Automation: 1. Auto-scaling(to zero) 2. Automatedlogging & monitoring 3. Security hardening 4. Auto-build and CI/CD 5. Workloadmobility (cloud/edge/..) 42
  • 43. Platform Approach - Summary Resource Sharing Multi-Model Persistence CacheLong-Term Commodity Storage Self-Service Cloud or Hybrid or On-Prem PandasDaskTensorFlow PyTorch Spark Presto GrafanaPrometheusRapidsServerless KubeFlow Time SeriesStream TableObject Jupyter Notebook
  • 44. Platform Approach - Summary Resource Sharing Multi-Model Persistence CacheLong-Term Commodity Storage Self-Service Cloud or Hybrid or On-Prem PandasDaskTensorFlow PyTorch Spark Presto GrafanaPrometheusRapidsServerless KubeFlow Time SeriesStream TableObject Jupyter Notebook No new skills to learn – use most all common ML tools e.g. Jupy, Kubeflow
  • 45. Platform Approach - Summary Resource Sharing Multi-Model Persistence CacheLong-Term Commodity Storage Self-Service Cloud or Hybrid or On-Prem PandasDaskTensorFlow PyTorch Spark Presto GrafanaPrometheusRapidsServerless KubeFlow Time SeriesStream TableObject Jupyter Notebook Natively available data services for high IOPS ML workloads
  • 46. Platform Approach - Summary Resource Sharing Multi-Model Persistence CacheLong-Term Commodity Storage Self-Service Cloud or Hybrid or On-Prem PandasDaskTensorFlow PyTorch Spark Presto GrafanaPrometheusRapidsServerless KubeFlow Time SeriesStream TableObject Jupyter Notebook Ability to support CPU as well GPU for training as well as inferencing
  • 47. Platform Approach - Summary Resource Sharing Multi-Model Persistence CacheLong-Term Commodity Storage Self-Service Cloud or Hybrid or On-Prem PandasDaskTensorFlow PyTorch Spark Presto GrafanaPrometheusRapidsServerless KubeFlow Time SeriesStream TableObject Jupyter Notebook scaling, monitoring and sharing of platform resources using K8S
  • 49. ▪ ML Pipelines for Production: KubeFlow ▪ Why use GPUs to Accelerate ML Projects ▪ How Serverless Simplifies ML Model Development & Deployment Upcoming Meet-up Sessions