AWS Clarify: Statistical Bias AWS Hyperparameter Tuning AWS Feature Store
Demonstrating My Projects With AWS
SageMaker Clarify, SageMaker Tuner & AWS
Feature Store
Varun Garg, Ph.D.
Algorithm Engineer at Magna
Ph.D. in Data Fusion
Department of Electrical and Computer Engineering
University of Massachusetts Lowell, Lowell, MA, USA
February 19, 2024
Garg.Kumar.Varun@Gmail.com
Proposal Summary
This presentation covers a few tasks I carried out on the AWS SageMaker service.
AWS SageMaker is a fully managed machine learning service offered by Amazon
Web Services.
SageMaker offers capabilities for data pre-processing, model training,
hyper-parameter tuning, and model deployment.
SageMaker provides a wide range of built-in algorithms and supports frameworks
including TensorFlow and PyTorch.
In this presentation, I will cover:
Bias calculation with SageMaker Clarify, using different bias metrics
Hyper-parameter tuning using the SageMaker hyper-parameter tuner
AWS Feature Store, where data scientists can share features with
other team members by writing AWS Athena queries against tables
cataloged in AWS Glue.
Outline of the Presentation
1 AWS Clarify: Statistical Bias
2 AWS Hyperparameter Tuning
3 AWS Feature Store
AWS Clarify Statistical Bias
Amazon SageMaker Clarify gives data scientists greater visibility into their
data and models by computing pre-training bias metrics (data bias, such as class
imbalance) and post-training bias metrics (model bias).
Detecting Statistical Bias: FlowChart
Amazon SageMaker Clarify can compute useful bias measurements at different
steps of a data science project, such as during data preparation, after model
training, and after deployment.
In the following slides we demonstrate the usage of SageMaker Clarify.
The metrics were computed on a Kaggle dataset [1].
Figure: Usage of SageMaker Clarify in the data preparation, model training,
and deployment phases of the ML pipeline. Figure credits: AWS documentation [2]
Configuring AWS Clarify
We use SageMaker Clarify to calculate pre-training bias metrics
such as class imbalance and KL divergence.
The configuration defines several variables for SageMaker Clarify, such as the
input data, the output path of the report, and the name of the column holding
the labels.
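As a rough illustration of the settings listed above, the sketch below assembles a Clarify-style analysis configuration as a plain Python dictionary. The key names follow the analysis configuration file format as described in the AWS documentation, to the best of my understanding. The S3 paths, the column names (`sentiment`, `review_body`, `product_category`), and the helper `build_pretraining_bias_config` are hypothetical placeholders, not the values used in the original project.

```python
import json


def build_pretraining_bias_config(headers, label, facet, methods):
    """Assemble a Clarify-style analysis configuration for pre-training bias.

    All arguments are illustrative; the real project values are not shown
    in the slides.
    """
    if label not in headers or facet not in headers:
        raise ValueError("label and facet must both be dataset columns")
    return {
        "dataset_type": "text/csv",
        "headers": list(headers),
        "label": label,                      # column holding the labels
        "facet": [{"name_or_index": facet}],  # column examined for bias
        "methods": {"pre_training_bias": {"methods": list(methods)}},
    }


# Hypothetical input data and report locations (placeholders only).
job_settings = {
    "input_data_uri": "s3://example-bucket/reviews/train.csv",
    "report_output_uri": "s3://example-bucket/clarify/report/",
}

analysis_config = build_pretraining_bias_config(
    headers=["sentiment", "review_body", "product_category"],
    label="sentiment",
    facet="product_category",
    methods=["CI", "KL", "LP"],  # class imbalance, KL divergence, Lp norm
)

print(json.dumps(analysis_config, indent=2))
```

In a real job this dictionary would be passed to the service together with the input and output locations; here it is only built and printed.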
AWS Clarify: Launching Task
In the SageMaker Clarify configuration we define the metrics to calculate, such
as class imbalance, KL divergence, and Lp norm.
Figure: Executing the SageMaker Clarify task on AWS
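To make the three metrics named above concrete, here is a minimal pure-Python sketch of how they can be computed for two facets (groups) of a dataset. The formulas follow the definitions in the SageMaker Clarify documentation: class imbalance compares facet sizes, while KL divergence and the Lp norm compare the label distributions of the two facets. The toy labels and function names are assumptions for illustration and are unrelated to the dataset used in this presentation.

```python
import math
from collections import Counter


def class_imbalance(n_a, n_d):
    """CI = (n_a - n_d) / (n_a + n_d); ranges from -1 to 1."""
    return (n_a - n_d) / (n_a + n_d)


def label_distribution(labels, outcomes):
    """Fraction of samples carrying each outcome, in the order of `outcomes`."""
    counts = Counter(labels)
    return [counts[o] / len(labels) for o in outcomes]


def kl_divergence(p_a, p_d):
    """KL = sum_y P_a(y) * log(P_a(y) / P_d(y)).

    Assumes P_d(y) > 0 wherever P_a(y) > 0; terms with P_a(y) = 0 add 0.
    """
    return sum(pa * math.log(pa / pd) for pa, pd in zip(p_a, p_d) if pa > 0)


def lp_norm(p_a, p_d, p=2):
    """Lp distance between the two label distributions (p=2 by default)."""
    return sum(abs(pa - pd) ** p for pa, pd in zip(p_a, p_d)) ** (1 / p)


# Toy labels for two facets (made up for illustration).
facet_a_labels = [1, 1, 1, 0, 1, 1, 0, 1]  # 8 samples, 6 positive
facet_d_labels = [1, 0]                    # 2 samples, 1 positive
outcomes = [0, 1]

p_a = label_distribution(facet_a_labels, outcomes)  # [0.25, 0.75]
p_d = label_distribution(facet_d_labels, outcomes)  # [0.5, 0.5]

ci = class_imbalance(len(facet_a_labels), len(facet_d_labels))  # (8 - 2) / 10 = 0.6
kl = kl_divergence(p_a, p_d)
lp = lp_norm(p_a, p_d)

print(f"CI={ci:.3f} KL={kl:.3f} LP={lp:.3f}")
```

A class imbalance close to 1 or -1 means one facet dominates the dataset, which is the situation reported for the unbalanced dataset in the results.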
Results: Bias Report
In the following table, for the column in the dataset called ’dresses’, we can
observe the calculated metrics such as class imbalance, KL divergence, and Lp
norm.
Since the dataset is unbalanced, the class imbalance score is high.
AWS SageMaker Hyperparameter Tuning
SageMaker tuner offers data scientists the ability to identify optimum
hyper-parameters. It offers state-of-the-art search strategies for tuning ML
models
Hyper-parameter Tuning Step
Figure: AWS Hyper-parameter tuning Flowchart. Figure Credits AWS [3]
Hyper-parameter Tuning: Steps
As per the AWS documentation, a hyper-parameter tuning job contains the
following components:
Tuning job settings
Training job definitions
Tuning job configuration
The following are a few types of tuning job settings:
Warm start: With this setting, the results from a previous tuning job are
used to improve the performance of a new tuning job.
Early stopping: A common setting that stops the execution of
the training when the performance of the model has not improved after
multiple consecutive epochs.
In the following slides we demonstrate a tuning job for parameters such
as learning rate and batch size. A random search strategy was used.
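The random search strategy mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the SageMaker implementation: the objective is a toy stand-in for validation accuracy, the ranges are hypothetical, and sampling the learning rate on a log scale is my own assumption.

```python
import math
import random


def sample_config(rng, lr_range, batch_sizes):
    """Draw one configuration: log-uniform learning rate, categorical batch size."""
    low, high = lr_range
    log_lr = rng.uniform(math.log10(low), math.log10(high))
    return {"learning_rate": 10 ** log_lr, "batch_size": rng.choice(batch_sizes)}


def random_search(objective, n_trials, lr_range, batch_sizes, seed=0):
    """Evaluate independently sampled configurations and rank them, best first."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        config = sample_config(rng, lr_range, batch_sizes)
        trials.append((objective(config), config))
    trials.sort(key=lambda trial: trial[0], reverse=True)
    return trials


def toy_objective(config):
    """Toy score that peaks at a learning rate of 1e-4 (no real training)."""
    return -abs(math.log10(config["learning_rate"]) + 4)


trials = random_search(
    toy_objective,
    n_trials=10,
    lr_range=(1e-5, 1e-3),        # hypothetical range
    batch_sizes=[64, 128, 256],   # hypothetical choices
)
best_score, best_config = trials[0]
print(best_config)
```

Each trial is independent of the others, which is why random search parallelizes well across training jobs.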
Training job definitions: Define Hyper-parameters
Figure: Defining the hyper-parameters (learning rate, batch size) to be tuned with
their corresponding ranges
Figure: Defining the metric for evaluating the performance of the model. This
metric is used to select which hyper-parameter values are optimal
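Metric definitions for a SageMaker training job are name and regular expression pairs that are matched against the training log. The sketch below mimics that matching locally with Python's `re` module. The metric names, regular expressions, and log lines are hypothetical, and keeping only the last reported value is a simplification made for this example.

```python
import re

# Hypothetical metric definitions in the {"Name", "Regex"} form.
metric_definitions = [
    {"Name": "validation:accuracy", "Regex": r"val_acc: ([0-9.]+)"},
    {"Name": "validation:loss", "Regex": r"val_loss: ([0-9.]+)"},
]


def extract_metrics(log_text, definitions):
    """Return the last value matched by each metric's regex in the log."""
    metrics = {}
    for definition in definitions:
        matches = re.findall(definition["Regex"], log_text)
        if matches:
            metrics[definition["Name"]] = float(matches[-1])
    return metrics


# Made-up training log in the format the regexes above expect.
sample_log = (
    "epoch 1 - val_loss: 0.61 - val_acc: 0.72\n"
    "epoch 2 - val_loss: 0.48 - val_acc: 0.81\n"
)

metrics = extract_metrics(sample_log, metric_definitions)
print(metrics)
```

The training script therefore has to print the metric in exactly the format the regular expression expects, otherwise the tuner has nothing to optimize.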
Training job definitions: Defining Pytorch Estimator
Figure: Creating a PyTorch estimator and passing input arguments such as the
instance type and the metric definitions defined earlier
Tuning job configuration: Defining Sagemaker Tuner
Figure: Creating a SageMaker tuner object and providing arguments such as the
tuning settings, the tuning metric, and the number of parallel jobs
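For readers who want to see what such a tuning job configuration contains, the sketch below builds one as a plain dictionary in the shape accepted by the low-level `CreateHyperParameterTuningJob` API. The helper name, the metric name, the job limits, and the parameter ranges are hypothetical placeholders; only the random strategy and the two tuned parameters come from this presentation.

```python
def build_tuning_job_config(metric_name, max_jobs, max_parallel_jobs):
    """Build a tuning job configuration dictionary (illustrative values only)."""
    if max_parallel_jobs > max_jobs:
        raise ValueError("max_parallel_jobs cannot exceed max_jobs")
    return {
        "Strategy": "Random",  # random search, as used in this presentation
        "HyperParameterTuningJobObjective": {
            "Type": "Maximize",
            "MetricName": metric_name,
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel_jobs,
        },
        "ParameterRanges": {
            # Range values are passed as strings in this API shape.
            "ContinuousParameterRanges": [
                {
                    "Name": "learning_rate",
                    "MinValue": "0.00001",   # hypothetical lower bound
                    "MaxValue": "0.001",     # hypothetical upper bound
                    "ScalingType": "Logarithmic",
                }
            ],
            "CategoricalParameterRanges": [
                {"Name": "batch_size", "Values": ["64", "128", "256"]}
            ],
        },
    }


tuning_config = build_tuning_job_config(
    metric_name="validation:accuracy",  # hypothetical metric name
    max_jobs=10,
    max_parallel_jobs=2,
)
print(tuning_config["ResourceLimits"])
```

The number of parallel jobs trades wall-clock time against cost, since each parallel training job runs on its own instance.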
RESULTS: AWS Sagemaker Tuner
Figure: Executing the AWS Sagemaker Tuner Object
Figure: Validation accuracy of the training jobs launched by the SageMaker tuner,
sorted in descending order, with the corresponding batch size and learning rate.
A learning rate of 0.000021 and a batch size of 128 were found to be the optimum
parameters
Feature Store
SageMaker Feature Store enables data science teams to efficiently reuse ML
features for various teams and models. This functionality helps with the smooth
delivery of features for large-scale predictive modeling.
AWS Feature Store: Overview
Feature engineering is an important part of ML pipeline development. AWS
Feature Store allows features to be shared and managed between multiple users.
This makes it more efficient to develop new models and to maintain or
troubleshoot existing data pipelines.
AWS Feature Store allows users to track metadata such as:
Data sources using the features
Models using the features
Transformations used for the calculation of the features
AWS Feature Store allows users to avoid reinventing features from scratch and
helps with troubleshooting existing models.
In the following slides we demonstrate the steps to create a new feature
store. We then use AWS Athena queries to store and retrieve features.
Feature Store: FlowChart
Figure: AWS Feature Store Pipeline showing the use of feature store by multiple
users for different applications and different data sources. Figure Credits AWS
Documentation [4]
AWS SageMaker: Initialization Feature Group
Figure: Creating a new feature store service client in AWS SageMaker by using
boto3
Figure: Initializing a new feature group in the AWS Feature Store
AWS Sagemaker Feature Group: Adding Features
Figure: Defining the features to be added to the store by providing each
feature's name and data type
Figure: Creating an AWS feature group object and providing input arguments
such as the feature definitions and the SageMaker session
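Feature definitions are name and type pairs, where the type is one of `String`, `Integral`, or `Fractional`. The sketch below derives such definitions from a sample record in plain Python. The record, its column names, and the helper functions are hypothetical; a real feature group additionally needs one feature designated as the record identifier and one as the event time.

```python
def infer_feature_type(value):
    """Map a Python value to a feature type name (bool counts as Integral)."""
    if isinstance(value, int):
        return "Integral"
    if isinstance(value, float):
        return "Fractional"
    return "String"


def build_feature_definitions(record):
    """Create one {"FeatureName", "FeatureType"} entry per field of a sample record."""
    return [
        {"FeatureName": name, "FeatureType": infer_feature_type(value)}
        for name, value in record.items()
    ]


# Made-up sample record used only to infer the types.
sample_record = {
    "review_id": "r-0001",                 # would serve as record identifier
    "sentiment": 1,
    "rating": 4.0,
    "event_time": "2024-01-10T00:00:00Z",  # would serve as event time
}

feature_definitions = build_feature_definitions(sample_record)
for definition in feature_definitions:
    print(definition)
```

Inferring types from a single record is fragile when columns contain missing values, so an explicit schema is the safer choice outside of a quick demonstration.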
Output: AWS Feature Store: Storing Features
Figure: Storing the features of the AWS feature group in an AWS Glue table that
is queried through AWS Athena. Athena serves as the interface to the Glue table
so that the features can be accessed by other team members
Output: AWS Feature Store: Reading Stored Features
Figure: Reading/Extracting the features in the AWS feature Group from the AWS
Glue table using AWS Athena query.
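The kind of Athena query used to read features back can be sketched as a small string-building helper. The database name, table name, and column names below are hypothetical placeholders; the actual Glue table name is generated when the feature group is created and is not shown in the slides.

```python
import re

# Accept only simple identifiers to avoid building malformed SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_feature_query(database, table, columns, limit=100):
    """Build a SELECT statement over the Glue table backing a feature group."""
    for identifier in [database, table, *columns]:
        if not _IDENTIFIER.match(identifier):
            raise ValueError(f"invalid identifier: {identifier!r}")
    column_list = ", ".join(f'"{column}"' for column in columns)
    return f'SELECT {column_list} FROM "{database}"."{table}" LIMIT {int(limit)}'


query = build_feature_query(
    database="sagemaker_featurestore",   # hypothetical database name
    table="reviews_feature_group",       # hypothetical table name
    columns=["review_id", "sentiment"],
    limit=10,
)
print(query)
```

The resulting string would then be submitted to Athena, and the result set loaded into a data frame for model development.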
Summary
In this presentation we covered different functionalities within AWS SageMaker.
We discussed pre-training bias calculation using SageMaker Clarify. We
computed class imbalance and KL divergence using the Women’s Clothing
Reviews dataset.
In the second section of the presentation we discussed how to perform
hyper-parameter tuning using the SageMaker tuner. We tuned parameters such as
learning rate and batch size for a PyTorch model.
In the third section of the presentation we discussed AWS Feature Store.
We discussed its benefits and demonstrated how to set up a new AWS feature
store and insert features into or retrieve features from it.
References I
[1] A. F. Agarap, “Women’s e-commerce clothing reviews.”
https://www.kaggle.com/datasets/nicapotato/womens-ecommerce-clothing-reviews
Accessed: 2024-01-10.
[2] N. M. Kado and K. Wadia, “What is Amazon SageMaker?”
https://community.aws/concepts/what-is-sagemaker#sagemaker-clarify
Accessed: 2024-01-10.
[3] D. Mbaya, “Amazon SageMaker automatic model tuning now supports
grid search.”
https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-now-supports-grid-search/
Accessed: 2024-01-10.
References II
[4] B. Lindsey, M. Pasappulatti, and M. Roy, “Extend model lineage to
include ML features using Amazon SageMaker Feature Store.”
https://aws.amazon.com/blogs/machine-learning/extend-model-lineage-to-include-ml-features-using-amazon-sagemaker
Accessed: 2024-01-15.
Garg.Kumar.Varun@Gmail.com
Demonstrating My Projects With AWS Sagemaker Clarify, Sagemaker Tuner & AWS Feature Store
B
y
V
a
r
u
n
G
a
r
g
,
P

More Related Content

Similar to AWS_projects related AWS services such as feature store store and clarify

AWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and DataAWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and DataChris Fregly
 
AWS Data Pipeline Tutorial | AWS Tutorial For Beginners | AWS Certification T...
AWS Data Pipeline Tutorial | AWS Tutorial For Beginners | AWS Certification T...AWS Data Pipeline Tutorial | AWS Tutorial For Beginners | AWS Certification T...
AWS Data Pipeline Tutorial | AWS Tutorial For Beginners | AWS Certification T...Edureka!
 
Managing Application Lifecycle using Jira and Bitbucket Cloud and AWS Tooling
Managing Application Lifecycle using Jira and Bitbucket Cloud and AWS ToolingManaging Application Lifecycle using Jira and Bitbucket Cloud and AWS Tooling
Managing Application Lifecycle using Jira and Bitbucket Cloud and AWS ToolingAtlassian
 
Predicting Hospital Readmissions Using Amazon Machine Learning (HLC304) - AWS...
Predicting Hospital Readmissions Using Amazon Machine Learning (HLC304) - AWS...Predicting Hospital Readmissions Using Amazon Machine Learning (HLC304) - AWS...
Predicting Hospital Readmissions Using Amazon Machine Learning (HLC304) - AWS...Amazon Web Services
 
Amazon EMR과 SageMaker를 이용하여 데이터를 준비하고 머신러닝 모델 개발 하기
Amazon EMR과 SageMaker를 이용하여 데이터를 준비하고 머신러닝 모델 개발 하기Amazon EMR과 SageMaker를 이용하여 데이터를 준비하고 머신러닝 모델 개발 하기
Amazon EMR과 SageMaker를 이용하여 데이터를 준비하고 머신러닝 모델 개발 하기Amazon Web Services Korea
 
AWS Advanced Analytics Automation Toolkit (AAA)
AWS Advanced Analytics Automation Toolkit (AAA)AWS Advanced Analytics Automation Toolkit (AAA)
AWS Advanced Analytics Automation Toolkit (AAA)CloudHesive
 
AWS Cloud School, Barcelona, Spain - intro and closing remarks
AWS Cloud School, Barcelona, Spain - intro and closing remarksAWS Cloud School, Barcelona, Spain - intro and closing remarks
AWS Cloud School, Barcelona, Spain - intro and closing remarksklamarv
 
What's New with Amazon Redshift - ADB202 - Anaheim AWS Summit
What's New with Amazon Redshift - ADB202 - Anaheim AWS SummitWhat's New with Amazon Redshift - ADB202 - Anaheim AWS Summit
What's New with Amazon Redshift - ADB202 - Anaheim AWS SummitAmazon Web Services
 
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...Amazon Web Services Korea
 
20 questions from digital cloud
20 questions from digital cloud20 questions from digital cloud
20 questions from digital cloudVishnu Sure
 
AWS AWSome Day - Getting Started Best Practices
AWS AWSome Day - Getting Started Best PracticesAWS AWSome Day - Getting Started Best Practices
AWS AWSome Day - Getting Started Best PracticesIan Massingham
 
Aws data analytics practice tests 2022
Aws data analytics practice tests 2022Aws data analytics practice tests 2022
Aws data analytics practice tests 2022SkillCertProExams
 
WhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter BotWhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter BotRandall Hunt
 
Data Transformation Patterns in AWS - AWS Online Tech Talks
Data Transformation Patterns in AWS - AWS Online Tech TalksData Transformation Patterns in AWS - AWS Online Tech Talks
Data Transformation Patterns in AWS - AWS Online Tech TalksAmazon Web Services
 
An Introduction to Amazon SageMaker (October 2018)
An Introduction to Amazon SageMaker (October 2018)An Introduction to Amazon SageMaker (October 2018)
An Introduction to Amazon SageMaker (October 2018)Julien SIMON
 
What's new in Serverless at AWS?
What's new in Serverless at AWS?What's new in Serverless at AWS?
What's new in Serverless at AWS?Daniel Zivkovic
 
[AWS Dev Day] 인공지능 / 기계 학습 | Intel on AWS, AI/ML Service 성능 향상을 위한 협력 모델 - 서...
[AWS Dev Day] 인공지능 / 기계 학습 | Intel on AWS,  AI/ML Service 성능 향상을 위한 협력 모델 - 서...[AWS Dev Day] 인공지능 / 기계 학습 | Intel on AWS,  AI/ML Service 성능 향상을 위한 협력 모델 - 서...
[AWS Dev Day] 인공지능 / 기계 학습 | Intel on AWS, AI/ML Service 성능 향상을 위한 협력 모델 - 서...Amazon Web Services Korea
 
AWS Machine Learning Week SF: End to End Model Development Using SageMaker
AWS Machine Learning Week SF: End to End Model Development Using SageMakerAWS Machine Learning Week SF: End to End Model Development Using SageMaker
AWS Machine Learning Week SF: End to End Model Development Using SageMakerAmazon Web Services
 
Cloud Expedition Technical1 - Día 1.pptx
Cloud Expedition Technical1 - Día 1.pptxCloud Expedition Technical1 - Día 1.pptx
Cloud Expedition Technical1 - Día 1.pptxalfredoagarciat2867
 

Similar to AWS_projects related AWS services such as feature store store and clarify (20)

AWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and DataAWS reInvent 2022 reCap AI/ML and Data
AWS reInvent 2022 reCap AI/ML and Data
 
AWS Data Pipeline Tutorial | AWS Tutorial For Beginners | AWS Certification T...
AWS Data Pipeline Tutorial | AWS Tutorial For Beginners | AWS Certification T...AWS Data Pipeline Tutorial | AWS Tutorial For Beginners | AWS Certification T...
AWS Data Pipeline Tutorial | AWS Tutorial For Beginners | AWS Certification T...
 
Managing Application Lifecycle using Jira and Bitbucket Cloud and AWS Tooling
Managing Application Lifecycle using Jira and Bitbucket Cloud and AWS ToolingManaging Application Lifecycle using Jira and Bitbucket Cloud and AWS Tooling
Managing Application Lifecycle using Jira and Bitbucket Cloud and AWS Tooling
 
Predicting Hospital Readmissions Using Amazon Machine Learning (HLC304) - AWS...
Predicting Hospital Readmissions Using Amazon Machine Learning (HLC304) - AWS...Predicting Hospital Readmissions Using Amazon Machine Learning (HLC304) - AWS...
Predicting Hospital Readmissions Using Amazon Machine Learning (HLC304) - AWS...
 
Amazon EMR과 SageMaker를 이용하여 데이터를 준비하고 머신러닝 모델 개발 하기
Amazon EMR과 SageMaker를 이용하여 데이터를 준비하고 머신러닝 모델 개발 하기Amazon EMR과 SageMaker를 이용하여 데이터를 준비하고 머신러닝 모델 개발 하기
Amazon EMR과 SageMaker를 이용하여 데이터를 준비하고 머신러닝 모델 개발 하기
 
AWS Advanced Analytics Automation Toolkit (AAA)
AWS Advanced Analytics Automation Toolkit (AAA)AWS Advanced Analytics Automation Toolkit (AAA)
AWS Advanced Analytics Automation Toolkit (AAA)
 
AWS Cloud School, Barcelona, Spain - intro and closing remarks
AWS Cloud School, Barcelona, Spain - intro and closing remarksAWS Cloud School, Barcelona, Spain - intro and closing remarks
AWS Cloud School, Barcelona, Spain - intro and closing remarks
 
What's New with Amazon Redshift - ADB202 - Anaheim AWS Summit
What's New with Amazon Redshift - ADB202 - Anaheim AWS SummitWhat's New with Amazon Redshift - ADB202 - Anaheim AWS Summit
What's New with Amazon Redshift - ADB202 - Anaheim AWS Summit
 
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
[AWS Innovate 온라인 컨퍼런스] Kubernetes와 SageMaker를 활용하여 Machine Learning 워크로드 관리하...
 
20 questions from digital cloud
20 questions from digital cloud20 questions from digital cloud
20 questions from digital cloud
 
AWSome Day Intro
AWSome Day IntroAWSome Day Intro
AWSome Day Intro
 
AWS AWSome Day - Getting Started Best Practices
AWS AWSome Day - Getting Started Best PracticesAWS AWSome Day - Getting Started Best Practices
AWS AWSome Day - Getting Started Best Practices
 
Aws data analytics practice tests 2022
Aws data analytics practice tests 2022Aws data analytics practice tests 2022
Aws data analytics practice tests 2022
 
WhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter BotWhereML a Serverless ML Powered Location Guessing Twitter Bot
WhereML a Serverless ML Powered Location Guessing Twitter Bot
 
Data Transformation Patterns in AWS - AWS Online Tech Talks
Data Transformation Patterns in AWS - AWS Online Tech TalksData Transformation Patterns in AWS - AWS Online Tech Talks
Data Transformation Patterns in AWS - AWS Online Tech Talks
 
An Introduction to Amazon SageMaker (October 2018)
An Introduction to Amazon SageMaker (October 2018)An Introduction to Amazon SageMaker (October 2018)
An Introduction to Amazon SageMaker (October 2018)
 
What's new in Serverless at AWS?
What's new in Serverless at AWS?What's new in Serverless at AWS?
What's new in Serverless at AWS?
 
[AWS Dev Day] 인공지능 / 기계 학습 | Intel on AWS, AI/ML Service 성능 향상을 위한 협력 모델 - 서...
[AWS Dev Day] 인공지능 / 기계 학습 | Intel on AWS,  AI/ML Service 성능 향상을 위한 협력 모델 - 서...[AWS Dev Day] 인공지능 / 기계 학습 | Intel on AWS,  AI/ML Service 성능 향상을 위한 협력 모델 - 서...
[AWS Dev Day] 인공지능 / 기계 학습 | Intel on AWS, AI/ML Service 성능 향상을 위한 협력 모델 - 서...
 
AWS Machine Learning Week SF: End to End Model Development Using SageMaker
AWS Machine Learning Week SF: End to End Model Development Using SageMakerAWS Machine Learning Week SF: End to End Model Development Using SageMaker
AWS Machine Learning Week SF: End to End Model Development Using SageMaker
 
Cloud Expedition Technical1 - Día 1.pptx
Cloud Expedition Technical1 - Día 1.pptxCloud Expedition Technical1 - Día 1.pptx
Cloud Expedition Technical1 - Día 1.pptx
 

Recently uploaded

The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Milind Agarwal
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...KarteekMane1
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 

Recently uploaded (20)

The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 

AWS projects related to AWS services such as Feature Store and Clarify

  • 1. By Varun Garg. AWS Clarify: Statistical Bias, AWS Hyperparameter Tuning, AWS Feature Store. Demonstrating My Projects With AWS SageMaker Clarify, SageMaker Tuner & AWS Feature Store. Varun Garg, Ph.D., Algorithm Engineer at Magna. Ph.D. in Data Fusion, Department of Electrical and Computer Engineering, University of Massachusetts Lowell, Lowell, MA, USA. February 19, 2024. Garg.Kumar.Varun@Gmail.com
  • 2. Proposal Summary. This presentation covers a few tasks I carried out on the AWS SageMaker service. AWS SageMaker is a fully managed machine learning service offered by Amazon Web Services. SageMaker offers capabilities for data pre-processing, model training, hyper-parameter tuning, and model deployment. SageMaker provides a wide range of built-in algorithms and frameworks, including TensorFlow and PyTorch. In this presentation, I will cover: (1) bias calculation with SageMaker Clarify using different bias metrics; (2) hyper-parameter tuning using AWS SageMaker's hyper-parameter tuner; (3) AWS Feature Store, where a data scientist can share features with other team members by writing AWS Athena queries that interface with AWS Glue.
  • 3. Outline of the Presentation. (1) AWS Clarify: Statistical Bias; (2) AWS Hyperparameter Tuning; (3) AWS Feature Store.
  • 4. AWS Clarify Statistical Bias. Amazon SageMaker Clarify gives data scientists greater visibility into their data and models by computing pre-training bias metrics (data bias, such as class imbalance) and post-training bias metrics (model bias, etc.).
  • 5. Detecting Statistical Bias: FlowChart. Amazon SageMaker Clarify can compute various useful measurements at different steps of a data science project, such as during data preparation and after model deployment. The following slides demonstrate the usage of SageMaker Clarify. These metrics were applied to a Kaggle dataset [1]. Figure: Usage of SageMaker Clarify in the data preparation, model training, and deployment phases of the ML pipeline. Figure credits: AWS documentation [2].
  • 6. Configuring AWS Clarify. We use SageMaker Clarify to calculate pre-training bias metrics such as class imbalance and KL divergence. The code snippet on this slide defines the variables that configure AWS Clarify, such as the input data, the output path of the report, and the name of the data column that holds the labels.
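The configuration described on slide 6 could be sketched roughly as follows. This is not the code from the original figure: the bucket, file, and column names are hypothetical placeholders, and the keyword names are assumed to match the `DataConfig` and `BiasConfig` classes in the SageMaker Python SDK's `sagemaker.clarify` module, which should be checked against the installed SDK version.

```python
# Hypothetical placeholders: none of these values come from the slides.
BUCKET = "my-example-bucket"
PREFIX = "clarify-demo"


def clarify_config_kwargs(label_column, facet_column, headers):
    """Collect keyword arguments for the Clarify data and bias configs."""
    if label_column not in headers or facet_column not in headers:
        raise ValueError("label and facet columns must appear in headers")
    data_kwargs = {
        # Input dataset and the location where the bias report is written.
        "s3_data_input_path": f"s3://{BUCKET}/{PREFIX}/input/reviews.csv",
        "s3_output_path": f"s3://{BUCKET}/{PREFIX}/bias-report",
        # Column that holds the labels, plus the full list of column names.
        "label": label_column,
        "headers": list(headers),
        "dataset_type": "text/csv",
    }
    bias_kwargs = {
        # Label value treated as the positive outcome (assumed to be 1).
        "label_values_or_threshold": [1],
        # Column whose groups are compared when measuring bias.
        "facet_name": facet_column,
    }
    return data_kwargs, bias_kwargs


data_kwargs, bias_kwargs = clarify_config_kwargs(
    label_column="sentiment",
    facet_column="product_category",
    headers=["sentiment", "review_body", "product_category"],
)
```

With the SDK installed, these dictionaries would typically be unpacked as `clarify.DataConfig(**data_kwargs)` and `clarify.BiasConfig(**bias_kwargs)`. Keeping them as plain dictionaries makes the settings easy to inspect before any AWS call is made.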
  • 7. AWS Clarify: Launching Task. In the AWS Clarify configuration we define the metrics to calculate, such as class imbalance, KL divergence, and Lp-norm. Figure: AWS executing the AWS Clarify task using AWS
  • 8. Results: Bias Report. The table on this slide shows the metrics calculated for the dataset column called 'dresses', such as class imbalance, KL divergence, and Lp-norm. Since the dataset is unbalanced, the class imbalance score is high.
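To make the three metrics named on slides 7 and 8 concrete, here is a minimal pure-Python sketch using their standard definitions: class imbalance as the normalized difference in facet sizes, KL divergence between the label distributions of two facets, and the Lp-norm of the difference between those distributions. The counts are made-up illustration values, not results from the presentation.

```python
import math


def class_imbalance(n_a, n_d):
    """CI = (n_a - n_d) / (n_a + n_d); 0 means both facets are the same size."""
    return (n_a - n_d) / (n_a + n_d)


def kl_divergence(p, q):
    """KL(p || q) over matching label categories; assumes q has no zero entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def lp_norm(p, q, order=2):
    """Lp distance between two label distributions (order=2 is Euclidean)."""
    return sum(abs(pi - qi) ** order for pi, qi in zip(p, q)) ** (1 / order)


# Made-up example: facet a has 80 rows (60 positive, 20 negative),
# facet d has 20 rows (10 positive, 10 negative).
n_a, n_d = 80, 20
p_a = [60 / 80, 20 / 80]  # label distribution in facet a
p_d = [10 / 20, 10 / 20]  # label distribution in facet d

ci = class_imbalance(n_a, n_d)
kl = kl_divergence(p_a, p_d)
lp = lp_norm(p_a, p_d)
```

In this toy case the class imbalance is 0.6, which reflects that one facet is four times larger than the other. Values close to +1 or -1 indicate a strongly unbalanced dataset.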
  • 9. AWS SageMaker Hyperparameter Tuning. The SageMaker tuner helps data scientists identify optimal hyper-parameters. It offers state-of-the-art search strategies for tuning ML models.
  • 10. Hyper-parameter Tuning Step. Figure: AWS hyper-parameter tuning flowchart. Figure credits: AWS [3].
  • 11. Hyper-parameter Tuning: Steps. As per the AWS documentation, a hyper-parameter tuning job contains the following components: tuning job settings, training job definitions, and tuning job configuration. The following are a few types of tuning job settings. Warm start: the results from a previous tuning job are used to improve performance in a new tuning job. Early stopping: a common setting that stops the execution of training when the performance of the model has not improved after multiple consecutive epochs. The following slides demonstrate a tuning job for parameters such as learning rate and batch size. A random search strategy was used.
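The random search strategy mentioned on slide 11 can be illustrated without any AWS services. The sketch below samples a learning rate on a log scale and a batch size from a fixed set, scores each configuration with a toy stand-in objective, and sorts the trials by score. The search ranges and the objective function are assumptions chosen for illustration; a real tuning job would train and evaluate a model for each trial.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration: log-uniform learning rate, categorical batch size."""
    log_lr = rng.uniform(math.log(1e-5), math.log(1e-3))
    return {
        "learning_rate": math.exp(log_lr),
        "batch_size": rng.choice([64, 128, 256]),
    }


def toy_validation_accuracy(config):
    """Stand-in objective (hypothetical); it replaces real model training."""
    lr_penalty = abs(math.log10(config["learning_rate"]) - math.log10(2e-5))
    bs_penalty = 0.0 if config["batch_size"] == 128 else 0.05
    return 0.95 - 0.1 * lr_penalty - bs_penalty


def random_search(num_trials, seed=0):
    """Evaluate independently sampled configurations, best score first."""
    rng = random.Random(seed)
    trials = []
    for _ in range(num_trials):
        config = sample_config(rng)
        trials.append((toy_validation_accuracy(config), config))
    trials.sort(key=lambda trial: trial[0], reverse=True)
    return trials


results = random_search(num_trials=20)
best_score, best_config = results[0]
```

Because each trial is sampled independently, random search is easy to run in parallel, which is one reason it pairs well with a setting such as the number of parallel jobs.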
  • 12. Training job definitions: Define Hyper-parameters. Figure: Defining the hyper-parameters (learning rate, batch size) to be tuned, with their corresponding ranges. Figure: Defining the metric used to evaluate the performance of the model. This metric is used to select the optimal hyper-parameter values.
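Metric definitions of the kind described on slide 12 are commonly expressed as a name plus a regular expression that is matched against the training logs. The sketch below imitates that extraction locally with Python's `re` module. The metric names, the regular expressions, and the log format are all assumptions for illustration and are not taken from the original figures.

```python
import re

# Assumed metric definitions: each pairs a metric name with a regex whose
# first capture group holds the numeric value printed by the training script.
METRIC_DEFINITIONS = [
    {"Name": "validation:accuracy", "Regex": r"val_acc: ([0-9.]+)"},
    {"Name": "validation:loss", "Regex": r"val_loss: ([0-9.]+)"},
]


def extract_metrics(log_lines, metric_definitions):
    """Return the most recent value captured for each metric name."""
    latest = {}
    for line in log_lines:
        for definition in metric_definitions:
            match = re.search(definition["Regex"], line)
            if match:
                latest[definition["Name"]] = float(match.group(1))
    return latest


# Hypothetical log lines in the assumed format.
logs = [
    "epoch 1 - val_loss: 0.61 val_acc: 0.71",
    "epoch 2 - val_loss: 0.48 val_acc: 0.79",
]
metrics = extract_metrics(logs, METRIC_DEFINITIONS)
```

Testing the regular expressions against sample log lines before launching a tuning job is a cheap way to catch a pattern that never matches, which would otherwise leave the tuner without an objective value.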
  • 13. Training job definitions: Defining PyTorch Estimator. Figure: Creating a PyTorch estimator and passing input arguments such as the instance type and the metric definitions defined earlier.
  • 14. Tuning job configuration: Defining SageMaker Tuner. Figure: Creating an AWS SageMaker tuner object and providing arguments such as tuning, tuning metrics, and the number of parallel jobs.
  • 15. Results: AWS SageMaker Tuner. Figure: Executing the AWS SageMaker tuner object. Figure: Validation accuracy of the AWS SageMaker tuner runs, sorted in descending order, with the corresponding batch size and learning rate. A learning rate of 0.000021 and a batch size of 128 were found to be the optimal parameters.
  • 16. Feature Store. SageMaker Feature Store enables data science teams to efficiently reuse ML features across teams and models. This functionality helps with the smooth delivery of features for large-scale predictive modeling.
  • 17. AWS Feature Store: Overview. Feature engineering is an important part of ML pipeline development. AWS Feature Store allows features to be shared and managed between multiple users. This enables efficient development of new models and makes it easier to maintain or troubleshoot existing data pipelines. AWS Feature Store allows users to track metadata such as: data sources using the features; models using the features; transformations used to calculate the features. AWS Feature Store lets users avoid reinventing features from scratch and helps them troubleshoot existing models. The following slides demonstrate the steps to create a new feature store. AWS Athena queries are then used to store and retrieve features.
  • 18. Feature Store: FlowChart. Figure: AWS Feature Store pipeline showing the use of the feature store by multiple users for different applications and different data sources. Figure credits: AWS documentation [4].
  • 19. AWS SageMaker: Initialization Feature Group. Figure: Creating a new feature store service in AWS SageMaker by using the boto3 client. Figure: Initializing a new feature group and AWS feature store.
  • 20. AWS SageMaker Feature Group: Adding Features. Figure: Defining the features to be added to the store by providing each feature's name and data type. Figure: Creating an AWS feature group object and providing input arguments such as the feature definitions and the AWS SageMaker session.
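Feature definitions of the kind described on slide 20 pair a feature name with a data type. A minimal sketch of deriving them from one example record is shown below. The record fields are hypothetical, and the type names `Integral`, `Fractional`, and `String` are assumed to be the feature types accepted by SageMaker Feature Store.

```python
def infer_feature_type(value):
    """Map a Python value to an assumed Feature Store type name."""
    if isinstance(value, int):
        return "Integral"
    if isinstance(value, float):
        return "Fractional"
    # Anything else (text, timestamps as ISO strings) is stored as a string.
    return "String"


def build_feature_definitions(record):
    """Create one name/type definition per field of an example record."""
    return [
        {"FeatureName": name, "FeatureType": infer_feature_type(value)}
        for name, value in record.items()
    ]


# Hypothetical example record; field names are illustrative only.
example_record = {
    "review_id": 101,
    "sentiment": 1,
    "rating": 4.5,
    "review_body": "Great fit",
    "event_time": "2024-01-10T00:00:00Z",
}
feature_definitions = build_feature_definitions(example_record)
```

A feature group typically also needs one field designated as the record identifier and one as the event time. In this sketch those roles would fall to the hypothetical `review_id` and `event_time` fields.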
  • 21. Output: AWS Feature Store: Storing Features. Figure: Storing the features of the AWS feature group in an AWS Glue table using an AWS Athena query. AWS Athena is used to interface with the AWS Glue table so that the features can be accessed by other team members.
  • 22. Output: AWS Feature Store: Reading Stored Features. Figure: Reading the features of the AWS feature group from the AWS Glue table using an AWS Athena query.
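Reading stored features as on slide 22 comes down to running a SQL statement against the table that backs the feature group. The sketch below only builds such a statement as a string. The table and column names are hypothetical, and the double-quoted identifiers follow the SQL dialect Athena is assumed to use.

```python
def build_feature_query(table_name, feature_names, limit=None):
    """Build a SELECT statement for the table behind a feature group."""
    # Reject anything that is not a plain identifier to avoid SQL injection.
    for name in [table_name, *feature_names]:
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {name!r}")
    columns = ", ".join(f'"{name}"' for name in feature_names)
    query = f'SELECT {columns} FROM "{table_name}"'
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return query


# Hypothetical table and feature names.
query_string = build_feature_query(
    table_name="reviews_feature_group_1700000000",
    feature_names=["review_id", "sentiment"],
    limit=5,
)
```

In the SageMaker Python SDK, a string like this would typically be handed to the Athena query helper of a feature group object together with an S3 output location. That step needs AWS credentials and is left out of this sketch.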
  • 23. Summary. In this presentation we presented different functionalities within AWS SageMaker. We discussed pre-training bias calculation using SageMaker Clarify and computed class imbalance and KL divergence using the Women's Clothing Reviews dataset. In the second section we discussed how to perform hyper-parameter tuning using AWS SageMaker's hyper-parameter tuner. We tuned parameters such as learning rate and batch size for a PyTorch model. In the third section we discussed AWS Feature Store. We covered its benefits and demonstrated how to set up a new AWS feature store and how to insert features into it and retrieve them.
  • 24. References I. [1] A. F. Agarap, "Women's e-commerce clothing reviews." https://www.kaggle.com/datasets/nicapotato/womens-ecommerce-clothing-reviews. Accessed: 2024-01-10. [2] N. M. Kado and K. Wadia, "What is amazon sagemaker?" https://community.aws/concepts/what-is-sagemaker#sagemaker-clarify. Accessed: 2024-01-10. [3] D. Mbaya, "Amazon sagemaker automatic model tuning now supports grid search." https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-automatic-model-tuning-now-supports-grid-search/. Accessed: 2024-01-10.
  • 25. References II. [4] B. Lindsey, M. Pasappulatti, and M. Roy, "Extend model lineage to include ml features using amazon sagemaker feature store." https://aws.amazon.com/blogs/machine-learning/extend-model-lineage-to-include-ml-features-using-amazon-sagemaker Accessed: 2024-01-15.