DEVELOPING ML SOLUTIONS
WITH SAGEMAKER
Knowledge Share by Temiloluwa Adeoti | October 12
2
Strategies for Developing ML Solutions on AWS
01 AWS AI/ML Services
- Polly – Text to Speech
- Rekognition – Computer Vision
- Comprehend – Natural Language Processing
- Lex – Chatbots
02 Pretrained Models on AWS Marketplace
Subscribe to pre-trained models and deploy them to SageMaker from the web console
03 Accelerated ML development with SageMaker JumpStart
A collection of:
- Pre-trained models
- Solution Templates
04 Dev Environments for .py and .ipynb files
- EMR Jupyter or Zeppelin Notebooks
- Notebooks with Glue Dev endpoint
- Cloud9 development
- SageMaker Jupyter Notebooks
- SageMaker Studio
3
What is SageMaker?
SageMaker is a fully managed AWS service that supports almost every part of the ML
development lifecycle.
Data
• SageMaker Ground Truth – automated data labelling
• SageMaker Data Wrangler – data preprocessing pipelines
• SageMaker Clarify – bias detection
Training
• SageMaker Experiments – experiment tracking
• SageMaker Autopilot – AutoML
• SageMaker Feature Store – feature store
Inference
• SageMaker Model Monitor – detects drift in deployed models
• SageMaker Elastic Inference – lower-cost inference acceleration
• SageMaker Neo – compiles models for multiple hardware platforms
Full list of Features: https://docs.aws.amazon.com/sagemaker/latest/dg/whatis.html
4
My Personal Challenges with SageMaker
1. Poor documentation
2. APIs with a deluge of parameters
3. Docker containers are king, and debugging them is difficult
5
Ways to interact with SageMaker resources
1 AWS Console – web interface
2 AWS CLI – command line
3 AWS SDK for Python (Boto3) – low-level API
4 SageMaker Python SDK – high-level API
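To make the low-level versus high-level distinction concrete, here is a minimal sketch of the request that the Boto3 `create_training_job` call expects. The job name, image URI, role ARN and bucket below are placeholder assumptions, not values from the talk. The high-level SageMaker Python SDK assembles an equivalent request for you from an Estimator object.

```python
def build_training_job_request(job_name, image_uri, role_arn,
                               train_s3_uri, output_s3_uri,
                               instance_type="ml.c5.xlarge"):
    """Assemble keyword arguments for Boto3's SageMaker create_training_job."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # Docker image that runs the training
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def submit(request):
    """Send the request with Boto3 (needs boto3 installed and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    return boto3.client("sagemaker").create_training_job(**request)


# All values below are hypothetical placeholders.
request = build_training_job_request(
    job_name="demo-job",
    image_uri="123456789012.dkr.ecr.eu-west-1.amazonaws.com/example-image:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
)
print(sorted(request))
```

The number of nested fields in even this small request illustrates the "deluge of parameters" mentioned earlier; calling `submit(request)` is what would actually start a job.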
6
SageMaker Development Environments
01 SageMaker Jupyter Notebooks
Similar to Google Colab, but you have to create your notebook instance manually
02 SageMaker Studio
A JupyterLab-style experience that provides full access to SageMaker’s resources
7
Jupyter Notebooks Disadvantages
• They are slow to start up: 5–10 times slower than SageMaker Studio notebooks
• They lack the integrated notebook sharing features present in SageMaker Studio
• The development environment has a fixed instance type
o In SageMaker Studio, you can switch the instance type your notebook runs on
o This gives better cost savings with SageMaker Studio
8
SageMaker Studio Architecture
Source: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks.html
AWS Service Account
- JupyterServer App
- KernelGateway Apps
Customer Account
- Amazon EFS
9
Basic workflow with a SageMaker Studio Notebook
1 JupyterServerApp – created from Jupyter notebooks. Infrastructure: EFS, Jupyter Server
2 KernelGatewayApp – created from Docker images. Infrastructure: ml.t3.medium
3 CreateTrainingJob – Infrastructure: ml.c5.xlarge
Source: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks.html
10
SageMaker Instance Types
1 General purpose (no GPUs)
• ml.t3.medium (Free-Tier Eligible) >> Fast launch
• ml.m5.large >> Fast launch
2 Compute optimized (no GPUs)
• ml.c5.large >> Fast launch
3 Accelerated computing (1+ GPUs)
• ml.g4dn.xlarge >> Fast launch
4 Memory optimized (no GPUs)
• ml.r5.large
Fast launch: optimized to launch in under 2 minutes
Full list can be found here: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-available-instance-types.html
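As a quick illustration of how the instance type names encode the categories on this slide, the sketch below maps the family part of a name (the middle segment) to its category. The lookup table covers only the families listed on the slide, and the helper name `categorize` is made up for this example.

```python
# Families and categories as listed on the slide; other families map to "unknown".
FAMILY_CATEGORY = {
    "t3": "general purpose",
    "m5": "general purpose",
    "c5": "compute optimized",
    "g4dn": "accelerated computing",
    "r5": "memory optimized",
}


def categorize(instance_type):
    """Return the slide's category for a name such as 'ml.c5.large'."""
    prefix, family, _size = instance_type.split(".")
    if prefix != "ml":
        raise ValueError(f"not a SageMaker instance type: {instance_type}")
    return FAMILY_CATEGORY.get(family, "unknown")


for name in ["ml.t3.medium", "ml.c5.large", "ml.g4dn.xlarge", "ml.r5.large"]:
    print(name, "->", categorize(name))
```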
11
SageMaker Input
Source: https://aws.amazon.com/blogs/machine-learning/using-pipe-input-mode-for-amazon-sagemaker-algorithms/
Data Sources
• S3 is the first-class data source
• File mode: the dataset is downloaded to the training instance
• Pipe mode: the dataset is streamed to your training instance
o Gives higher throughput
o Preferable for large datasets
o Only supported for protobuf RecordIO-encoded data
• File systems: Amazon EFS or Amazon FSx are other data sources
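The choice between File mode and Pipe mode is made per input channel. Below is a minimal sketch of one `InputDataConfig` entry as the low-level CreateTrainingJob API expects it; the helper name `make_channel` and the bucket paths are assumptions for illustration only.

```python
def make_channel(name, s3_uri, input_mode="File", content_type=None):
    """Build one InputDataConfig entry for a training job request."""
    # Only the two modes discussed on the slide are accepted in this sketch.
    if input_mode not in ("File", "Pipe"):
        raise ValueError(f"unsupported input mode: {input_mode}")
    channel = {
        "ChannelName": name,
        "InputMode": input_mode,  # File = download first, Pipe = stream
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }
    if content_type is not None:
        channel["ContentType"] = content_type
    return channel


# Hypothetical bucket paths: a small CSV dataset in File mode and
# a large RecordIO-protobuf dataset streamed in Pipe mode.
small = make_channel("train", "s3://example-bucket/small/", content_type="text/csv")
large = make_channel(
    "train",
    "s3://example-bucket/large/",
    input_mode="Pipe",
    content_type="application/x-recordio-protobuf",
)
print(small["InputMode"], large["InputMode"])
```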
12
SageMaker Algorithms: JumpStart
Source: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks.html
Pre-trained Models and Solution Templates
• Pretrained models and frameworks are stored as Docker images
• Model API – load a pretrained model
• Preprocessor API – run preprocessing jobs
• Transform API – run batch transform jobs
• Predictor API – class supplied to the model to make real-time predictions using SageMaker endpoints
• Estimator API – train or fine-tune a pre-trained model
Docker Images: https://sagemaker.readthedocs.io/en/stable/doc_utils/pretrainedmodels.html
13
SageMaker Algorithms: Amazon Estimators, aka Amazon Algorithm Estimators
Source: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks.html
base class: sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase
1. Linear Learner
2. Latent Dirichlet Allocation (LDA)
3. Neural Topic Model (NTM)
4. Object2Vec
5. Principal Component Analysis (PCA)
6. Random Cut Forest
7. K-means
8. K-nearest neighbors (k-NN)
9. IP Insights
10. Factorization Machines
14
Amazon Estimator Example: Linear Learner
Source: https://sagemaker-examples.readthedocs.io/en/latest/scientific_details_of_algorithms/linear_learner_multiclass_classification/linear_learner_multiclass_classification.html
Easy to use: Supply inputs then call Estimator.fit()
Supported input formats: protobuf, csv, json
Common information: https://docs.aws.amazon.com/sagemaker/latest/dg/common-info-all-im-models.html
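Of the input formats listed, CSV is the simplest to produce by hand. The sketch below assumes the convention described on the common-information page for built-in algorithms: training CSV has no header row and carries the label in the first column. The helper name and the sample rows are made up for illustration.

```python
import csv
import io


def to_training_csv(features, labels):
    """Serialize rows as header-less CSV with the label in the first column."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for label, row in zip(labels, features):
        writer.writerow([label, *row])
    return buffer.getvalue()


# Two made-up samples with two features each and class labels 0 and 1.
csv_text = to_training_csv([[5.1, 3.5], [6.2, 2.9]], [0, 1])
print(csv_text)
```

The resulting text would be uploaded to S3 and passed to the estimator's `fit()` call as a `text/csv` input.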
15
SageMaker Algorithms: Frameworks
Source: https://sagemaker.readthedocs.io/en/v2.38.0/frameworks/index.html
base class: sagemaker.estimator.Framework
1. Scikit-learn
2. SparkML Serving
3. TensorFlow
4. XGBoost
5. PyTorch
6. Hugging Face
7. MXNet
8. Chainer
9. Reinforcement Learning
16
Frameworks Example: XGBoost
Source: https://sagemaker.readthedocs.io/en/v2.38.0/frameworks/index.html
Use as a framework: custom logic is defined in script mode
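In script mode, the custom logic lives in an entry-point script that SageMaker runs inside the framework container. The following is a minimal sketch of the argument handling such a script typically needs, assuming the usual container conventions: hyperparameters arrive as command-line flags, and the `SM_CHANNEL_TRAIN` and `SM_MODEL_DIR` environment variables point to the data and model directories. The hyperparameter names and defaults are illustrative, and the training step itself is left as a comment.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters and SageMaker-provided paths."""
    parser = argparse.ArgumentParser()
    # Hyperparameters set on the estimator are passed as command-line flags.
    parser.add_argument("--max_depth", type=int, default=5)
    parser.add_argument("--num_round", type=int, default=100)
    # Data and model locations come from environment variables in the container;
    # the fallbacks are the standard container paths.
    parser.add_argument(
        "--train",
        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"),
    )
    parser.add_argument(
        "--model-dir",
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"),
    )
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Custom logic goes here: load data from args.train, fit the model,
    # and save the artifact under args.model_dir.
    print(f"max_depth={args.max_depth} num_round={args.num_round}")
    return args


# In a real entry point, call main() with no arguments so sys.argv is used.
demo_args = main(["--max_depth", "3"])
```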
17
Frameworks Example: XGBoost
Source: https://sagemaker.readthedocs.io/en/v2.38.0/frameworks/index.html
Use as a built-in algorithm: retrieved from Docker images
The Estimator used here is the base SageMaker Estimator
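A rough sketch of the built-in route, written against the v2 SageMaker Python SDK that the source link refers to. The container version, instance type, hyperparameter values and function name are assumptions for illustration, and the function needs the `sagemaker` package plus AWS credentials to run.

```python
# Illustrative hyperparameters for a 3-class problem; not values from the talk.
BUILTIN_HYPERPARAMETERS = {
    "objective": "multi:softmax",
    "num_class": 3,
    "num_round": 50,
    "max_depth": 5,
}


def train_builtin_xgboost(role_arn, region, train_s3_uri, output_s3_uri):
    """Train the built-in XGBoost container using the generic Estimator."""
    # Imported lazily: requires the sagemaker package and AWS credentials.
    from sagemaker import image_uris
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput

    # Look up the Docker image of the built-in algorithm for this region.
    image_uri = image_uris.retrieve("xgboost", region, version="1.2-1")
    estimator = Estimator(
        image_uri=image_uri,
        role=role_arn,
        instance_count=1,
        instance_type="ml.c5.xlarge",
        output_path=output_s3_uri,
    )
    estimator.set_hyperparameters(**BUILTIN_HYPERPARAMETERS)
    # No training script is supplied: the container already holds the algorithm.
    estimator.fit({"train": TrainingInput(train_s3_uri, content_type="text/csv")})
    return estimator
```

Unlike the framework route, nothing here points at user code; only data, hyperparameters and infrastructure are supplied.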
18
Best Resources
Source: https://sagemaker.readthedocs.io/en/v2.38.0/frameworks/index.html
1. SageMaker Examples Documentation: https://sagemaker-examples.readthedocs.io/en/latest/intro.html
2. Amazon SageMaker Python SDK: https://sagemaker.readthedocs.io/en/stable/
19
Source: https://sagemaker.readthedocs.io/en/v2.38.0/frameworks/index.html
DEMO


Editor's Notes

  1. Some features are accessible directly from the AWS web console, while others are accessible programmatically
  2. Some features are accessible directly from the AWS web console, while others are accessible programmatically
  3. Switch to deployed Jupyter notebook
  4. Advantages of Studio over Jupyter notebooks:
     - Easy to spin up and shut down training instances
     - Cost savings because you can decide on the type of instance you want
     - Collaboration between user profiles, because sharing notebooks is easier than using GitHub
  5. Switch to show domain creation:
     - You create a domain, using AWS IAM Identity Center or AWS IAM to authenticate
     - Afterwards, you create user profiles: each corresponds to a single user with a unique home directory in the EFS
     Switch to show notebook creation from Docker images:
     - Show notebook creation from Docker images
     - Show terminals: system terminal and Docker image terminal
     - Show JumpStart
  6. Algorithms are already precoded. Supply your data to these algorithms
  7. Link to common properties of algorithms. Algorithms are already precoded. Supply your data to these algorithms
  8. Algorithms are already precoded. Supply your data to these algorithms
  9. Demo notes: sagemaker-python-sdk/scikit_learn_iris/