Build, train and deploy your ML models with Amazon SageMaker

Talk by Sangeetha Krishnan, MTS at Adobe, on the topic "Build, train and deploy your ML models with Amazon SageMaker" at AWS Community Day, Bangalore 2018

Published in: Technology

  1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. BENGALURU
  2. Build, Train and Deploy your ML models with Amazon SageMaker
     Sangeetha Krishnan, Member of Technical Staff | Adobe
  3. ABOUT MYSELF
     ● Software Development Engineer at Adobe Systems.
     ● Part of the CloudTech & Adobe I/O Events development team.
     ● Areas of interest:
       ○ Machine Learning
       ○ Natural Language Processing
       ○ Music + Travel!
  4. WHY AMAZON SAGEMAKER?
     ★ It is a fully managed machine learning service
     ★ Quick and easy to build, train and deploy your ML models
     ★ Integrated Jupyter notebooks
     ★ Several built-in machine learning algorithms provided by Amazon SageMaker
     ★ Automatic tuning of machine learning models to find the best-performing version
     ★ Easy deployment to production
     ★ Automatic scaling for production variants
  5. AGENDA
     ● Getting started with Amazon SageMaker
     ● Built-in ML Algorithms
     ● Hyperparameter tuning
     ● Accessing the model endpoints
     ● Blue/Green Deployments
     ● Security and Best Practices
  6. Getting Started with Amazon SageMaker
  7. Storage and deployment dependencies:
     1. S3 bucket
     2. EC2 Container Registry (ECR)
     3. Notebook instance
  8. Setting up the prerequisites and Notebook instance
     ● Create an S3 bucket (preferably with a name starting with sagemaker-*)
     ● Create a notebook instance: https://console.aws.amazon.com/sagemaker/
     ● Create an IAM role that gives access to the S3 bucket created
     ● Create a Jupyter notebook
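The console steps on slide 8 can also be scripted. The following is a minimal sketch with boto3, assuming AWS credentials are already configured. The notebook name, instance type and role ARN are placeholders chosen for illustration and do not come from the talk.

```python
def sagemaker_bucket_name(suffix):
    """Return a bucket name carrying the 'sagemaker-' prefix suggested on the slide."""
    name = "sagemaker-" + suffix.lower()
    # S3 bucket names must be between 3 and 63 characters long.
    if not 3 <= len(name) <= 63:
        raise ValueError("bucket name must be 3-63 characters: " + name)
    return name


def create_prerequisites(suffix, role_arn, region="ap-southeast-2"):
    """Create the S3 bucket and a notebook instance (calls AWS; needs credentials)."""
    import boto3  # imported here so the naming helper works without boto3 installed

    bucket = sagemaker_bucket_name(suffix)
    s3 = boto3.client("s3", region_name=region)
    if region == "us-east-1":
        # us-east-1 does not accept an explicit LocationConstraint
        s3.create_bucket(Bucket=bucket)
    else:
        s3.create_bucket(
            Bucket=bucket,
            CreateBucketConfiguration={"LocationConstraint": region},
        )

    sm = boto3.client("sagemaker", region_name=region)
    sm.create_notebook_instance(
        NotebookInstanceName="spam-detection-notebook",  # hypothetical name
        InstanceType="ml.t2.medium",                     # assumed instance type
        RoleArn=role_arn,                                # IAM role with access to the bucket
    )
    return bucket
```

Calling `sagemaker_bucket_name("Spam-Demo")` only builds the name; `create_prerequisites` is the part that actually talks to AWS.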
  9. SMS Spam Detection
     ● Problem statement: given an SMS text, determine whether the message is spam or ham (non-spam)
     ● Kaggle dataset: https://www.kaggle.com/uciml/sms-spam-collection-dataset
     ● The first column in the dataset is the label (spam/ham); the second column is the SMS text.
     ● 5574 messages in the dataset: 87% ham (4850), 13% spam (724)
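Because the classes are imbalanced, it helps to know what accuracy a trivial classifier would reach before reading any accuracy figure for a trained model. The short calculation below uses only the counts from slide 9; the "always answer ham" baseline is an illustration added here, not something the talk reports.

```python
ham, spam = 4850, 724   # counts from the slide
total = ham + spam      # 5574 messages

ham_share = ham / total
spam_share = spam / total

# Accuracy of a trivial classifier that labels every message as ham
majority_baseline = max(ham, spam) / total

print(f"ham {ham_share:.1%}, spam {spam_share:.1%}, baseline accuracy {majority_baseline:.1%}")
```

A model therefore needs to score clearly above roughly 87% accuracy to show that it has learned anything about spam.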
  10. Built-in Machine Learning Algorithms
  11. Different Categories of Algorithms
      Supervised Learning
      ● DeepAR Forecasting
      ● Factorization Machines
      ● Linear Learner
      ● XGBoost
      Unsupervised Learning
      ● K-Means Algorithm
      ● K Nearest Neighbours
      ● PCA
      ● Random Cut Forest
      Image and Object Focused
      ● Image Classification algorithm that uses CNN
      ● Object Detection Algorithm
      Text Related
      ● BlazingText
      ● LDA
      ● Sequence2Sequence
      ● Neural Topic Model
  12. Problem categories
      Factorization Machines
      ● Recommendation Systems
      ● Ad-click predictions
      XGBoost
      ● Fraud predictions
      DeepAR Forecasting
      ● Traffic, electricity, pageviews
      Random Cut Forests
      ● Detecting anomalous data points
      K-Means algorithm and KNN
      ● Document clustering
      ● Identifying related articles
      Image and Object Detection and Classification
      BlazingText and Seq2Seq
      ● Sentiment Analysis
      ● Named Entity Recognition
      ● Machine Translation
  13. Advantages of using Built-in Algorithms
      ● Designed for solving problems in a commercial setting
      ● General-purpose algorithms
      ● Designed for training on huge datasets
      ● Can support terabytes of data
      ● Greater reliability
      ● Faster training and streaming of the datasets
      ● Training on multiple instances that share their state
  14. BlazingText Algorithm
      ● Provides highly optimized implementations of the Word2vec and text classification algorithms.
      ● Is used in problems like text classification, named entity recognition, machine translation, etc.
      ● Word2vec generates word embeddings
      ● Can generate meaningful vectors for out-of-vocabulary words
      ● Captures semantic relationships between words
      ● Useful for many NLP problems
      ● Can be trained on huge datasets in a couple of minutes
      ● Supports multi-core CPU and single-GPU modes for training
  15. Data format
      ● Supervised Learning
      ● Binary classification
      Text Classification:
      ● Training and validation set (text file):
          __label__spam sms_text1
          __label__ham sms_text2
      ● Test data (JSON request):
          {"instances": ["sms_text_1", "sms_text_2", ...]}
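The two formats on slide 15 can be produced with a few lines of Python. This is a sketch only: the helper names and the sample rows are made up for illustration, and no tokenization or lowercasing is applied beyond collapsing whitespace.

```python
import json


def to_blazingtext_line(label, text):
    """Format one labelled SMS as '__label__<label> <text>' on a single line."""
    # Collapse newlines and repeated spaces so each example stays on one line
    cleaned = " ".join(text.split())
    return f"__label__{label} {cleaned}"


def to_inference_request(texts):
    """Build the JSON request body for a batch of SMS texts."""
    return json.dumps({"instances": list(texts)})


# Made-up sample rows in (label, text) order, matching the dataset's column order
rows = [("spam", "Win a FREE prize now"), ("ham", "See you at  5 pm")]
train_lines = [to_blazingtext_line(label, text) for label, text in rows]
request_body = to_inference_request(["Win a FREE prize now"])
```

Writing `"\n".join(train_lines)` to a text file gives a training or validation file in the layout shown on the slide.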
  16. Hyperparameter Tuning
  17. Tunable parameters for BlazingText Classification Model
      Parameter       Range
      buckets         [1000000, 10000000]
      epochs          [5, 15]
      learning_rate   [0.005, 0.05]
      min_count       [0, 100]
      mode            ['supervised']
      vector_dim      [32, 300]
      word_ngrams     [1, 3]
  18. # Assumes container, role, sess and s3_output_location were defined earlier in the notebook.
      # Parameter names follow SageMaker Python SDK v1; v2 renamed train_instance_count to
      # instance_count, train_instance_type to instance_type, and so on.
      import sagemaker
      from sagemaker.tuner import HyperparameterTuner, IntegerParameter, ContinuousParameter

      bt_model = sagemaker.estimator.Estimator(container,
                                               role,
                                               train_instance_count=1,
                                               train_instance_type='ml.m4.xlarge',
                                               train_volume_size=30,
                                               train_max_run=360000,
                                               input_mode='File',
                                               output_path=s3_output_location,
                                               sagemaker_session=sess)

      # Fixed hyperparameters; the tuner varies only the ranges defined below
      bt_model.set_hyperparameters(mode="supervised",
                                   epochs=10,
                                   min_count=2,
                                   early_stopping=True,
                                   patience=4,
                                   min_epochs=5)

      hyperparameter_ranges = {'learning_rate': ContinuousParameter(0.01, 0.05),
                               'vector_dim': IntegerParameter(20, 50),
                               'word_ngrams': IntegerParameter(1, 4)}
  19. objective_metric_name = 'validation:accuracy'
      objective_type = 'Maximize'
      tuner = HyperparameterTuner(estimator=bt_model,
                                  objective_metric_name=objective_metric_name,
                                  objective_type=objective_type,
                                  hyperparameter_ranges=hyperparameter_ranges,
                                  max_jobs=4,
                                  max_parallel_jobs=2)
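The ranges passed to the tuner on slide 18 can be compared with the table on slide 17. The sketch below uses only the numbers from those two slides; the helper name is hypothetical. The slides do not say whether the service rejects values outside the table or merely recommends against them, so treat the result as a prompt to double-check rather than as an error report.

```python
# Ranges from the "Tunable parameters" table (slide 17)
DOCUMENTED_RANGES = {
    "learning_rate": (0.005, 0.05),
    "vector_dim": (32, 300),
    "word_ngrams": (1, 3),
}

# Ranges passed to the tuner (slide 18)
REQUESTED_RANGES = {
    "learning_rate": (0.01, 0.05),
    "vector_dim": (20, 50),
    "word_ngrams": (1, 4),
}


def out_of_range(requested, documented):
    """Return the parameters whose requested range is not inside the documented one."""
    problems = {}
    for name, (low, high) in requested.items():
        doc_low, doc_high = documented[name]
        if low < doc_low or high > doc_high:
            problems[name] = ((low, high), (doc_low, doc_high))
    return problems


mismatches = out_of_range(REQUESTED_RANGES, DOCUMENTED_RANGES)
for name, (asked, listed) in mismatches.items():
    print(f"{name}: requested {asked}, table lists {listed}")
```

Here `vector_dim` starts below the listed minimum and `word_ngrams` ends above the listed maximum, while `learning_rate` sits inside the table's range.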
  20. Hyperparameter Tuning Jobs
      blazingtext-181002-1157-001-073948fb
      blazingtext-181002-1157-002-fba7f0b8
      blazingtext-181002-1157-003-f7407aa1
      blazingtext-181002-1157-004-3a4b4863
  21. Maximize/Minimize Objective Metric
      ● In this example, the objective is to maximize 'validation:accuracy'
  22. Accessing the Model Endpoints
  23. Accessing the Model Endpoints via Postman
      ● Get the invocation endpoint from the model details in the Amazon SageMaker console
  24. Accessing the Model Endpoints via Postman
      ● Generate the access key ID and secret access key from your security credentials. These are used to create the Authorization token in Postman.
  25. Accessing the Model Endpoints via Lambda Functions
      Set up a Lambda function that calls the SageMaker endpoint and returns the response
      Set up an API Gateway that:
      1. Accepts the client request
      2. Forwards the request parameters to the Lambda function and waits for the response
      3. Forwards the response to the client
  26. Lambda Function
      import os
      import json
      import boto3

      # Endpoint name is read from the Lambda function's environment variables
      ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
      runtime = boto3.Session().client(service_name='sagemaker-runtime', region_name='ap-southeast-2')

      def lambda_handler(event, context):
          print("Received event: " + json.dumps(event, indent=2))
          # Build the body with json.dumps so quotes inside the SMS text are escaped correctly
          payload = json.dumps({"instances": [event]})
          print(payload)
          # BlazingText expects a JSON request body
          response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                             ContentType='application/json',
                                             Body=payload)
          predictions = response['Body'].read().decode('utf8')
          print(predictions)
          return predictions
  27. API Gateway Configuration
      Curl request:
        curl -X POST https://los599anje.execute-api.ap-southeast-2.amazonaws.com/test/spamdetection -d '"Hello from Airtel. For 1 months free access call 9113851022"'
      Response:
        "[{\"prob\": [0.880291223526001], \"label\": [\"__label__spam\"]}]"
  28. Blue/Green Deployments using Amazon SageMaker
  29. Deploying multiple models to the same endpoint
      ● Particularly useful in blue/green deployments
      ● Blue deployment -> the current deployment
      ● Green deployment -> the new model that is to be tested in production
      ● A small share of the traffic is diverted to the green deployment, using ProductionVariant in SageMaker.
      ● Easy rollback to the blue deployment if the green one fails.
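An endpoint configuration with two production variants might look as follows. This is a sketch under assumptions: the configuration name, model names and instance type are placeholders, and the variants are named A (blue) and B (green). The dictionary keys are those of the SageMaker `CreateEndpointConfig` API; the function only builds the arguments and does not call AWS.

```python
def two_variant_config(config_name, blue_model, green_model, green_share=0.1,
                       instance_type="ml.m4.xlarge"):
    """Build CreateEndpointConfig arguments with a blue (A) and a green (B) variant."""
    if not 0.0 <= green_share <= 1.0:
        raise ValueError("green_share must be between 0 and 1")

    def variant(name, model, weight):
        return {
            "VariantName": name,
            "ModelName": model,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
            # Traffic is split in proportion to the variant weights
            "InitialVariantWeight": weight,
        }

    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            variant("A", blue_model, 1.0 - green_share),  # blue: current model
            variant("B", green_model, green_share),       # green: new model
        ],
    }


# Hypothetical names, for illustration only
config = two_variant_config("spam-blue-green", "blazingtext-v1", "blazingtext-v2")
# With credentials configured, the arguments could be passed on as:
#   boto3.client("sagemaker").create_endpoint_config(**config)
```

With the default `green_share` of 0.1, about one request in ten is routed to the new model.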
  30. Steps in Blue/Green Deployment
      1. User -> SageMaker Endpoint: A 100%
      2. User -> SageMaker Endpoint: A 90%, B 10%
      3. User -> SageMaker Endpoint: A 0%, B 100%
      4. User -> SageMaker Endpoint: B 100%
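The traffic split on slide 30 can be changed without redeploying by updating variant weights. The sketch below builds the arguments for the SageMaker `UpdateEndpointWeightsAndCapacities` API for each step; the endpoint name is a placeholder and the helper name is hypothetical. Nothing here calls AWS.

```python
def weight_update(endpoint_name, green_percent):
    """Build UpdateEndpointWeightsAndCapacities arguments for one rollout step."""
    if not 0 <= green_percent <= 100:
        raise ValueError("green_percent must be between 0 and 100")
    return {
        "EndpointName": endpoint_name,
        "DesiredWeightsAndCapacities": [
            {"VariantName": "A", "DesiredWeight": float(100 - green_percent)},
            {"VariantName": "B", "DesiredWeight": float(green_percent)},
        ],
    }


# Rollout steps from the slide: A/B at 100/0, then 90/10, then 0/100
steps = [weight_update("spam-detection", pct) for pct in (0, 10, 100)]
# Each step could be applied with:
#   boto3.client("sagemaker").update_endpoint_weights_and_capacities(**step)
# Rolling back to the blue deployment means re-applying the first step.
```

Keeping the steps as plain data makes the rollout easy to review before any traffic is moved.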
  31. Switching to the new Model
      Production Variant Deployment
  32. Invocation Metrics
  33. Security and Best Practices
  34. Deployment Recommendations
      ● Deploy multiple instances for each production endpoint
      ● Deploy in a VPC with more than one subnet, spread across multiple Availability Zones
  35. Security Practices
      ● Specify a Virtual Private Cloud (VPC) for your notebook instance, with outbound connections routed through Network Address Translation (NAT)
      ● Exercise judgement when granting individuals access to notebook instances that are attached to a VPC that contains sensitive information.
  36. Securing Training jobs and Endpoints
      ● Run training jobs in a private VPC.
      ● Create a VPC endpoint to access S3
      ● Configure a custom policy that allows access to S3 only from your private VPC
      ● To deny access to certain resources, add explicit deny statements to the custom policy
      ● Configure a security group rule that allows inbound communication between members of the same security group
      ● Configure a NAT gateway that allows only outbound connections from the private VPC.
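A custom policy that limits S3 access to the private VPC could be sketched as below. It denies object access to a bucket unless the request arrives through a given VPC endpoint, using the `aws:sourceVpce` condition key. The bucket name and endpoint ID are placeholders, and the exact set of actions is an assumption to adapt to your own setup.

```python
import json


def vpc_only_bucket_policy(bucket, vpc_endpoint_id):
    """Return a bucket policy denying access unless it comes through the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # Requests that do not come through this VPC endpoint are denied
                "Condition": {
                    "StringNotEquals": {"aws:sourceVpce": vpc_endpoint_id}
                },
            }
        ],
    }


# Placeholder bucket name and VPC endpoint ID, for illustration only
policy = vpc_only_bucket_policy("sagemaker-spam-demo", "vpce-0123456789abcdef0")
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON can be attached to the bucket as its bucket policy. Note that a deny like this also blocks console access from outside the VPC, so test it on a non-critical bucket first.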
  37. BENGALURU THANK YOU
