DEVELOPING ML SOLUTIONS
WITH SAGEMAKER
Knowledge Share by Temiloluwa Adeoti | October 12
2
Strategies for Developing ML Solutions on AWS
01 AWS AI/ML Services
- Polly – Text to Speech
- Rekognition – Computer Vision
- Comprehend – Natural Language Processing
- Lex – Chatbots
02 Pretrained Models on AWS Marketplace
Subscribe to pre-trained models and deploy them to SageMaker from the web console
03 Accelerated ML development with SageMaker JumpStart
A collection of:
- Pre-trained models
- Solution Templates
04 Dev Environments for .py and .ipynb files
- EMR Jupyter or Zeppelin Notebooks
- Notebooks with Glue Dev endpoint
- Cloud9 development
- SageMaker Jupyter Notebooks
- SageMaker Studio
3
What is SageMaker?
SageMaker is a fully managed AWS service that supports almost every part of the ML development lifecycle.
Data
• SageMaker Ground Truth – automated data labelling
• SageMaker Data Wrangler – data preprocessing pipelines
• SageMaker Clarify – bias detection
Training
• SageMaker Experiments – experiment tracking
• SageMaker Autopilot – AutoML
• SageMaker Feature Store – feature store
Inference
• SageMaker Model Monitor – detect drift in models
• SageMaker Elastic Inference – lower-cost inference acceleration
• SageMaker Neo – compile models for multiple hardware platforms
Full list of features: https://docs.aws.amazon.com/sagemaker/latest/dg/whatis.html
4
My Personal Challenges with SageMaker
1. Poor documentation
2. APIs with a deluge of parameters
3. Docker containers are king, and debugging them is difficult
5
Ways to interact with SageMaker resources
1 AWS Console – web interface
2 AWS CLI – command line
3 AWS SDK for Python (Boto3) – low-level API
4 SageMaker Python SDK – high-level API
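To give a feel for the low-level route, here is a minimal sketch of a Boto3 CreateTrainingJob request. The helper names (`build_training_job_request`, `submit_training_job`), the job name, and the angle-bracket placeholders are assumptions made for illustration and are not taken from the slides; the dictionary keys follow the Boto3 `create_training_job` request shape. The builder is plain Python, so it runs without AWS access; only `submit_training_job` needs the boto3 package and credentials.

```python
from typing import Any, Dict


def build_training_job_request(
    job_name: str,
    image_uri: str,
    role_arn: str,
    train_s3_uri: str,
    output_s3_uri: str,
    instance_type: str = "ml.c5.xlarge",
) -> Dict[str, Any]:
    """Assemble keyword arguments for the low-level CreateTrainingJob call."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # Docker image that runs training
            "TrainingInputMode": "File",     # download the data before training
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def submit_training_job(request: Dict[str, Any]) -> str:
    """Send the request with Boto3 (needs the boto3 package and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without boto3

    response = boto3.client("sagemaker").create_training_job(**request)
    return response["TrainingJobArn"]


# Placeholder values only; replace them with real resources before submitting.
request = build_training_job_request(
    job_name="demo-job",
    image_uri="<training-image-uri>",
    role_arn="<execution-role-arn>",
    train_s3_uri="s3://<bucket>/train/",
    output_s3_uri="s3://<bucket>/output/",
)
print(sorted(request))
```

The SageMaker Python SDK wraps this same request behind an Estimator object, which is why it is described as the high-level API.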
6
SageMaker Development Environments
01 SageMaker Jupyter Notebooks
Similar to Google Colab, but you have to create your notebook instance manually
02 SageMaker Studio
A JupyterLab-style experience that provides full access to SageMaker’s resources
7
Jupyter Notebooks Disadvantages
• They are slow to start up: 5–10 times slower than SageMaker Studio notebooks
• They lack the integrated notebook sharing features present in SageMaker Studio
• The development environment has a fixed instance type
o In SageMaker Studio, you can switch the instance type on which your notebook runs
o Better cost savings with SageMaker Studio
8
SageMaker Studio Architecture
Source: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks.html
AWS Service Account
- JupyterServer App
- KernelGateway Apps
Customer Account
- Amazon EFS
9
Basic workflow with a SageMaker Studio Notebook
• JupyterServerApp: created from Jupyter notebooks
o Infrastructure: EFS, Jupyter Server
• KernelGatewayApp: created from Docker images
o Infrastructure: ml.t3.medium
• CreateTrainingJob
o Infrastructure: ml.c5.xlarge
Source: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks.html
10
SageMaker Instance Types
1 General purpose (no GPUs)
• ml.t3.medium (Free-Tier eligible) >> Fast launch
• ml.m5.large >> Fast launch
2 Compute optimized (no GPUs)
• ml.c5.large >> Fast launch
3 Accelerated computing (1+ GPUs)
• ml.g4dn.xlarge >> Fast launch
4 Memory optimized (no GPUs)
• ml.r5.large
Fast launch: optimized to launch in under 2 minutes
Full list can be found here: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-available-instance-types.html
11
SageMaker Input
Source: https://aws.amazon.com/blogs/machine-learning/using-pipe-input-mode-for-amazon-sagemaker-algorithms/
Data Sources
• S3 is the first-class data source
o File mode: the dataset is downloaded to the training instance
o Pipe mode: the dataset is streamed to your training instance
- Gives higher throughput
- Preferable for large datasets
- Only supported for protobuf RecordIO-encoded data
• File systems: Amazon EFS or Amazon FSx are other data sources
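The choice between File mode and Pipe mode is made per input channel. The sketch below builds one S3 channel entry in the shape used by the low-level CreateTrainingJob request. The helper `build_channel` and the example bucket name are hypothetical, and the check that rejects Pipe mode for anything other than protobuf RecordIO simply mirrors the restriction stated on the slide; treat it as an assumption of this sketch rather than a rule the service enforces in this way.

```python
from typing import Any, Dict

# MIME type used for protobuf RecordIO-encoded training data.
RECORDIO_PROTOBUF = "application/x-recordio-protobuf"


def build_channel(
    name: str,
    s3_uri: str,
    content_type: str,
    input_mode: str = "File",
) -> Dict[str, Any]:
    """Build one InputDataConfig entry for an S3-backed training channel."""
    if input_mode not in ("File", "Pipe"):
        raise ValueError(f"unknown input mode: {input_mode!r}")
    if input_mode == "Pipe" and content_type != RECORDIO_PROTOBUF:
        # Assumption taken from the slide: Pipe mode only for RecordIO data.
        raise ValueError("Pipe mode is only used here for protobuf RecordIO data")
    return {
        "ChannelName": name,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
        "ContentType": content_type,
        "InputMode": input_mode,  # "File" downloads the data, "Pipe" streams it
    }


# A small CSV dataset is downloaded; a large RecordIO dataset is streamed.
csv_channel = build_channel("train", "s3://example-bucket/csv/", "text/csv")
pipe_channel = build_channel(
    "train", "s3://example-bucket/recordio/", RECORDIO_PROTOBUF, "Pipe"
)
print(csv_channel["InputMode"], pipe_channel["InputMode"])
```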
12
SageMaker Algorithms: JumpStart
Source: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks.html
Pre-trained Models and Solution Templates
• Pretrained models and frameworks are stored as Docker images
• Model API – load a pretrained model
• Preprocessor API – run preprocessing jobs
• Transform API – run batch transform jobs
• Predictor API – supply a predictor class to the model to make real-time predictions using SageMaker endpoints
• Estimator API – train or fine-tune a pre-trained model
Docker images: https://sagemaker.readthedocs.io/en/stable/doc_utils/pretrainedmodels.html
13
SageMaker Algorithms: Amazon Estimators aka Amazon Algorithm Estimator
Source: https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks.html
base class: sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase
1. Linear Learner
2. Latent Dirichlet Allocation (LDA)
3. Neural Topic Model (NTM)
4. Object2Vec
5. Principal Component Analysis (PCA)
6. Random Cut Forest
7. K-Means
8. K-Nearest Neighbors
9. IP Insights
10. Factorization Machines
14
Amazon Estimator Example: Linear Learner
Source: https://sagemaker-examples.readthedocs.io/en/latest/scientific_details_of_algorithms/linear_learner_multiclass_classification/linear_learner_multiclass_classification.html
Easy to use: Supply inputs then call Estimator.fit()
Supported input formats: protobuf, csv, json
Common information: https://docs.aws.amazon.com/sagemaker/latest/dg/common-info-all-im-models.html
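The "supply inputs, then call fit()" pattern might look roughly like the sketch below, which assumes version 2 of the SageMaker Python SDK. The helper names, the instance type, and the choice of predictor type from the class count are illustrative assumptions, not the code shown in the talk. The SDK import is kept inside the training function so that the argument helper can be run without the sagemaker package or AWS credentials.

```python
from typing import Any, Dict


def linear_learner_kwargs(role: str, num_classes: int) -> Dict[str, Any]:
    """Pick Linear Learner constructor arguments for a classification task."""
    if num_classes < 2:
        raise ValueError("a classifier needs at least two classes")
    kwargs: Dict[str, Any] = {
        "role": role,
        "instance_count": 1,
        "instance_type": "ml.c5.xlarge",  # assumed instance type
    }
    if num_classes == 2:
        kwargs["predictor_type"] = "binary_classifier"
    else:
        kwargs["predictor_type"] = "multiclass_classifier"
        kwargs["num_classes"] = num_classes
    return kwargs


def train_linear_learner(role: str, features, labels, num_classes: int):
    """Train on NumPy arrays (needs the sagemaker package and AWS credentials)."""
    from sagemaker.amazon.linear_learner import LinearLearner  # lazy import

    estimator = LinearLearner(**linear_learner_kwargs(role, num_classes))
    # record_set() uploads the arrays to S3 in protobuf RecordIO format and
    # returns a RecordSet that fit() accepts as its training input.
    records = estimator.record_set(
        features.astype("float32"), labels=labels.astype("float32")
    )
    estimator.fit(records)
    return estimator


# Only the pure helper is exercised here; "<execution-role-arn>" is a placeholder.
print(linear_learner_kwargs("<execution-role-arn>", num_classes=3))
```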
15
SageMaker Algorithms: Frameworks
Source: https://sagemaker.readthedocs.io/en/v2.38.0/frameworks/index.html
base class: sagemaker.estimator.Framework
1. Scikit-learn
2. SparkML Serving
3. TensorFlow
4. XGBoost
5. PyTorch
6. HuggingFace
7. MXNet
8. Chainer
9. Reinforcement Learning
16
Frameworks Example: XGBoost
Source: https://sagemaker.readthedocs.io/en/v2.38.0/frameworks/index.html
Use as a Framework: custom logic is defined in a training script (script mode)
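In script mode, the custom logic lives in an entry-point script that SageMaker runs inside the framework container. The sketch below shows the argument-parsing part of such a script plus a possible way to launch it, assuming version 2 of the SageMaker Python SDK. The script name `train.py`, the hyperparameter names and defaults, the framework version, and the instance type are assumptions for illustration; `SM_MODEL_DIR` and `SM_CHANNEL_TRAIN` are environment variables set inside SageMaker training containers.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the arguments a script-mode entry point typically receives."""
    parser = argparse.ArgumentParser()
    # Hyperparameters set on the estimator arrive as command-line arguments.
    parser.add_argument("--max_depth", type=int, default=5)
    parser.add_argument("--num_round", type=int, default=50)
    # Inside the container, SageMaker exposes paths through environment
    # variables; the fallbacks are the conventional container locations.
    parser.add_argument(
        "--model-dir", default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model")
    )
    parser.add_argument(
        "--train",
        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"),
    )
    return parser.parse_args(argv)


def make_framework_estimator(role: str, entry_point: str = "train.py"):
    """Create the framework estimator (needs the sagemaker package)."""
    from sagemaker.xgboost.estimator import XGBoost  # lazy import

    return XGBoost(
        entry_point=entry_point,        # hypothetical script containing parse_args
        framework_version="1.5-1",      # assumed container version
        hyperparameters={"max_depth": 5, "num_round": 50},
        role=role,
        instance_count=1,
        instance_type="ml.m5.xlarge",   # assumed instance type
    )


# Simulate the arguments SageMaker would pass to the script.
args = parse_args(["--num_round", "10"])
print(args.max_depth, args.num_round, args.train)
```

Calling `fit({"train": "s3://..."})` on the returned estimator would start the training job and make the channel available at the `--train` path.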
17
Frameworks Example: XGBoost
Source: https://sagemaker.readthedocs.io/en/v2.38.0/frameworks/index.html
Use as a built-in algorithm: the algorithm is retrieved from Docker images
The Estimator used here is the base SageMaker Estimator
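For the built-in route there is no training script: the algorithm's Docker image is looked up and handed to the base Estimator. The following is a sketch under the assumption of version 2 of the SageMaker Python SDK; the helper names, the container version, the instance type, and the hyperparameter values are illustrative and not taken from the slides.

```python
from typing import Any, Dict, Optional


def builtin_xgboost_hyperparameters(
    num_class: Optional[int] = None, num_round: int = 50
) -> Dict[str, Any]:
    """Choose XGBoost hyperparameters for binary or multiclass classification."""
    if num_class is None or num_class == 2:
        return {"objective": "binary:logistic", "num_round": num_round}
    return {
        "objective": "multi:softmax",
        "num_class": num_class,
        "num_round": num_round,
    }


def make_builtin_estimator(role: str, region: str, output_path: str):
    """Create a base Estimator from the built-in XGBoost image."""
    # Lazy imports: this function needs the sagemaker package installed.
    from sagemaker import image_uris
    from sagemaker.estimator import Estimator

    # Look up the registry path of the built-in XGBoost container.
    image_uri = image_uris.retrieve("xgboost", region, version="1.5-1")
    estimator = Estimator(
        image_uri=image_uri,
        role=role,
        instance_count=1,
        instance_type="ml.m5.xlarge",  # assumed instance type
        output_path=output_path,
    )
    estimator.set_hyperparameters(**builtin_xgboost_hyperparameters(num_class=3))
    return estimator


print(builtin_xgboost_hyperparameters(num_class=3))
```

Compared with the framework route, the only code you own here is the configuration; the training logic ships inside the image.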
18
Best Resources
1. SageMaker Examples Documentation: https://sagemaker-examples.readthedocs.io/en/latest/intro.html
2. Amazon SageMaker Python SDK: https://sagemaker.readthedocs.io/en/stable/
19
DEMO


Editor's Notes

  1. Some features are accessible directly from the AWS web console, while others are accessible programmatically.
  2. Some features are accessible directly from the AWS web console, while others are accessible programmatically.
  3. Switch to the deployed Jupyter notebook.
  4. Advantages of Studio over Jupyter notebooks: it is easy to spin up and shut down training instances; you save costs because you can choose the instance type you want; and collaboration between user profiles is easier because sharing notebooks is simpler than using GitHub.
  5. Switch to show domain creation: you create a domain using AWS IAM Identity Center or AWS IAM to authenticate. Afterwards, you create user profiles; each corresponds to a single user with a unique home directory in the EFS. Switch to show notebook creation from Docker images. Show terminals: the system terminal and the Docker image terminal. Show JumpStart.
  6. Algorithms are already precoded; supply your data to these algorithms.
  7. Link to common properties of algorithms. Algorithms are already precoded; supply your data to these algorithms.
  8. Algorithms are already precoded; supply your data to these algorithms.
  9. Demo notes: sagemaker-python-sdk/scikit_learn_iris/