SlideShare a Scribd company logo
1 of 23
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
1
On Optimizing
Operational Efficiency in
Storage Systems via
Deep Reinforcement
Learning
Sunil Srinivasa1, Girish Kathalagiri1, Julu
Subramanyam Varanasi2, Luis Carlos
Quintela1, Mohamad Charafeddine1,
ChiHoon Lee1
1 Samsung SDS America
2 Samsung Semiconductors Inc
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
2
I. Motivation, Intro, & Problem Setup
II. Short Primer on RL
III. Problem Formulation
IV.Actor Critic Network & Algorithm
V. Results
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Motivation
• Data Centers as large sinkholes of energy.
• US data centers consumption is about 70 billion Kwh, 1.8% total US energy
consumption (Lawrence Berkeley Lab study 2016), ~ $8B
3
Image: www.aashe.orgImage: systemair.com
Building-level: Data Center energy usage optimization,
optimizing water pumps, cooling towers, ..
Server-level: energy usage optimization per server,
optimizing fans speed inside servers
focus of this talk
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Problem setup: operational efficiency of a storage server
4
24 SSDs solid state drives for storage
1 2 3 … 24
1 2 3 4 5
24 temperature reading for each SSD drive
Speed setting for the fans
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Problem setup: operational efficiency of a storage server
5
• High temperatures lower the performance of reading/writing
• High temperatures shorten the lifespan of the SSD
• Temperature varies based on the workload: read or write,
how many KB, and how often.
• Current status quo control is rule-based.
We would like to use Deep Reinforcement Learning to represent the state of operation and learn
the optimum policy for controlling the fan speeds, as a proxy of energy usage for cooling.
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Contributions
• Model-free approach, no need to understand to SSD server behavior
dynamics of workload and temperature
• We trained on the real environment (the server), not through a simulator
• We designed a Reward function for this problem to guide the algorithm
towards the desired operation behavior (servers temperature & fans
speeds)
6
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
7
I. Motivation, Intro, & Problem Setup
II. Short Primer on RL
III. Problem Formulation
IV.Actor Critic Network & Algorithm
V. Results
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Rewards
rt
Perceived
State of the
environment
st
8
RL Agent Environment
Action at | at+1
st
rt
| st+1
| rt+1 | rt+2 | …. rt+H-1
| at+2 | …. at+H-1
| st+2 | …. st+H-1
Policy π(at|st)
to optimize expected
return 𝑅 = 𝐸 𝑘=0
𝐻−1
𝛾 𝑘
𝑟𝑡 𝑠𝑡, 𝑎𝑡 , i.e.,
expected sum of discounted rewards
Reinforcement Learning
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Reinforcement Learning: Action Value, State Value, Advantage functions
• Action Value Function Qπ(st, at): is the scalar expected long-term return achieved
for following Policy π when in state st by performing action at.  How good is this
state-action pair.
• State Value Function Vπ(st): is the scalar expected long-term return achieved for
following Policy π when in state st averaged over all possible actions {a} from that
state.  How good is this state st.
• Advantage Function Aπ(st, at) = Qπ(st, at) - Vπ(st) is the advantage of taking action at
when in state st compared to the average of all possible actions from that state.
The Action and Value functions are unknown, and we need to learn them by interacting
with the environment. One approach is through training Deep Neural Network(s) which
takes the state as an input and outputs an approximation of Q and V.
9
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
10
I. Motivation, Intro, & Problem Setup
II. Short Primer on RL
III.Problem Formulation:
State, Action, Rewards
IV.Actor Critic Network & Algorithm
V. Results
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Problem setup using Deep Reinforcement Learning
11
State: function of workload (Reads or Writes) and temperature
RL Agent Environment
Actions:
24 SSD Drives
Fan Speeds
Reward: A function of Temperatures & Fans Speeds
Advantage Actor-Critic
(A2C)
Desired
operation
region
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Problem Formulation: State
12
Where normalization operation per field is Γ(𝑥) =
𝑥 − 𝑚𝑖𝑛𝑋
𝑚𝑎𝑥𝑋 −𝑚𝑖𝑛𝑋
1. Γ(# of I/O requests to the server per second)
2. Γ(# of Kbytes read, averaged over all SSDs per sec)
3. Γ(# of Kbytes written, averaged over all SSDs per sec)
4. Γ(# of Kbytes read in previous time slot, averaged over all SSDs)
5. Γ(# of Kbytes written in previous time slot, avg. over all SSDs)
6. Γ(Mean Temperate: averaged over all SSDs)
7. Γ(Mean Fan Speed: averaged over all 5 cooling fans)
State =
Time slot = 25 sec
Horizon = 10 slots
sec
Read via
iostat
Read via
ipmi-sensors
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Problem Formulation: Actions
13
sec
Raw Actions
Size of Actions space = 7
A = {6, 8, 10, .., 18}
Kilo rpm (revolutions per minute)
We tried 2 approaches
Incremental Actions
Size of Actions space = 3
A = {Decrease by 1000,
No change,
Increase by 1000} rpm
Allows for
smoother
transitions
Action is applied to all 5 fans via the ipmitool command
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
We chose this design above.
T is the mean temp across all SSDs.
F is the mean speed across fans.
Problem Formulation: Rewards
Desirable
operation
region
Reward shaping is an important design step which
requires domain knowledge and guides the RL agent
towards its optimal behavior
For this RL application, a good reward function should have:
1) High reward for low fan speeds and low temperature
2) Low reward for high fan speeds and low temperature –
as this is wasting energy
3) Low reward for low fan speed and high temperature –
as this will damage the SSDs
𝑅 = −𝑚𝑎𝑥
Γ(𝑇)
Γ(𝐹)
,
Γ(𝐹)
Γ(𝑇)
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
15
I. Motivation, Intro, & Problem Setup
II. Short Primer on RL
III. Problem Formulation
IV.Actor Critic Network & Algorithm
V. Results
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Actor Critic (A2C) Dueling Network Architecture
16
State:
Vector of length 7
Actor: Approximates optimal policy,
of choosing Best action for state s
Critic: approximates the state-value
function to estimate how attractive
is it being at state s
Policy Neural Network
Value Function Neural Network
A NN with 2 branches: one for the Policy
evaluation (what action to do: actor), and
one for policy estimation (how good is it being
in such state: critic)
Other option is to estimate the Actor in one NN,
and the Critic in another NN. Our approach has
the advantage of having: (1) fewer parameters, (2)
mitigates overfitting one function over the other,
(3) faster convergence, and (4) more accurate
approximations for the actor & the critic.
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
A2C algorithm pseudo-code
17
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
18
I. Motivation, Intro, & Problem Setup
II. Short Primer on RL
III. Problem Formulation
IV.Actor Critic Network & Algorithm
V. Results
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Learning… over few days
19
RL Agent
Fan Speeds
State:
Different
workloads,
Temperature
Reward
• Learning directly on the real environment
(no simulator)
• Model-free: does not require any knowledge
of the SSD server behavior dynamics
• Exposed to different stochastic workloads
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Performance for Idle Vs Heavy Periodic I/O workloads on the operational
contours
20
Status Quo controller Using Deep Reinforcement Learning
with Raw Actions
Desired
Operational
Region
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Performance for different stochastic workloads – Attempt 1
21
At the beginning of training, algorithm
is exploring and learning
Using Deep Reinforcement Learning
with Raw Actions – Suboptimal
performance, likely due to insufficient exploration
Resulting
distribution
from different
workloads
Desired
Operational
Region
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Performance for different stochastic workloads – Attempt 2
22
At the beginning of training, algorithm
is exploring and learning
Once finished learning right policy,
operational behavior is within desired region.
Used DRL with incremental actions.
Desired
Operational
Region
Resulting
distribution
from different
workloads
Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
23

More Related Content

Similar to On Optimizing Operational Efficiency in Storage Systems via Deep Reinforcement Learning

Machine Learning at the Edge (AIM302) - AWS re:Invent 2018
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018Machine Learning at the Edge (AIM302) - AWS re:Invent 2018
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018Amazon Web Services
 
Python tutorial for ML
Python tutorial for MLPython tutorial for ML
Python tutorial for MLBin Han
 
Presto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 UpdatesPresto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 UpdatesTaro L. Saito
 
How to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
How to Avoid Disasters via Software-Defined Storage Replication & Site RecoveryHow to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
How to Avoid Disasters via Software-Defined Storage Replication & Site RecoveryDataCore Software
 
AWS 기반 실시간 서비스 개발 및 운영 사례 - AWS Summit Seoul 2017
AWS 기반 실시간 서비스 개발 및 운영 사례 - AWS Summit Seoul 2017AWS 기반 실시간 서비스 개발 및 운영 사례 - AWS Summit Seoul 2017
AWS 기반 실시간 서비스 개발 및 운영 사례 - AWS Summit Seoul 2017Amazon Web Services Korea
 
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...Amazon Web Services
 
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28Amazon Web Services
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon Web Services
 
Chapter 1.pptx
Chapter 1.pptxChapter 1.pptx
Chapter 1.pptxclaudio48
 
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...Arghya Kusum Das
 
Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...DataWorks Summit
 
Amazon EC2 T Instances – Burstable, Cost-Effective Performance (CMP209) - AWS...
Amazon EC2 T Instances – Burstable, Cost-Effective Performance (CMP209) - AWS...Amazon EC2 T Instances – Burstable, Cost-Effective Performance (CMP209) - AWS...
Amazon EC2 T Instances – Burstable, Cost-Effective Performance (CMP209) - AWS...Amazon Web Services
 
Hazelcast Jet - January 08, 2018
Hazelcast Jet - January 08, 2018Hazelcast Jet - January 08, 2018
Hazelcast Jet - January 08, 2018Rahul Gupta
 
Office 365 Monitoring Best Practices
Office 365 Monitoring Best PracticesOffice 365 Monitoring Best Practices
Office 365 Monitoring Best PracticesThousandEyes
 
Caqa5e ch1 with_review_and_examples
Caqa5e ch1 with_review_and_examplesCaqa5e ch1 with_review_and_examples
Caqa5e ch1 with_review_and_examplesAravindharamanan S
 
Smart Grid Analytics: All That Remains to be Ready is You
Smart Grid Analytics: All That Remains to be Ready is YouSmart Grid Analytics: All That Remains to be Ready is You
Smart Grid Analytics: All That Remains to be Ready is YouLauren Watters
 
Predictive Maintenance - Portland Machine Learning Meetup
Predictive Maintenance - Portland Machine Learning MeetupPredictive Maintenance - Portland Machine Learning Meetup
Predictive Maintenance - Portland Machine Learning MeetupIan Downard
 

Similar to On Optimizing Operational Efficiency in Storage Systems via Deep Reinforcement Learning (20)

Machine Learning at the Edge (AIM302) - AWS re:Invent 2018
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018Machine Learning at the Edge (AIM302) - AWS re:Invent 2018
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018
 
Python tutorial for ML
Python tutorial for MLPython tutorial for ML
Python tutorial for ML
 
Presto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 UpdatesPresto At Arm Treasure Data - 2019 Updates
Presto At Arm Treasure Data - 2019 Updates
 
Speed Control of PMDCM Based GA and DS Techniques
Speed Control of PMDCM Based GA and DS TechniquesSpeed Control of PMDCM Based GA and DS Techniques
Speed Control of PMDCM Based GA and DS Techniques
 
How to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
How to Avoid Disasters via Software-Defined Storage Replication & Site RecoveryHow to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
How to Avoid Disasters via Software-Defined Storage Replication & Site Recovery
 
AWS 기반 실시간 서비스 개발 및 운영 사례 - AWS Summit Seoul 2017
AWS 기반 실시간 서비스 개발 및 운영 사례 - AWS Summit Seoul 2017AWS 기반 실시간 서비스 개발 및 운영 사례 - AWS Summit Seoul 2017
AWS 기반 실시간 서비스 개발 및 운영 사례 - AWS Summit Seoul 2017
 
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
 
CAQA5e_ch1 (3).pptx
CAQA5e_ch1 (3).pptxCAQA5e_ch1 (3).pptx
CAQA5e_ch1 (3).pptx
 
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)
 
Chapter 1.pptx
Chapter 1.pptxChapter 1.pptx
Chapter 1.pptx
 
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...
 
Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...
 
Amazon EC2 T Instances – Burstable, Cost-Effective Performance (CMP209) - AWS...
Amazon EC2 T Instances – Burstable, Cost-Effective Performance (CMP209) - AWS...Amazon EC2 T Instances – Burstable, Cost-Effective Performance (CMP209) - AWS...
Amazon EC2 T Instances – Burstable, Cost-Effective Performance (CMP209) - AWS...
 
Hazelcast Jet - January 08, 2018
Hazelcast Jet - January 08, 2018Hazelcast Jet - January 08, 2018
Hazelcast Jet - January 08, 2018
 
Siem tools
Siem toolsSiem tools
Siem tools
 
Office 365 Monitoring Best Practices
Office 365 Monitoring Best PracticesOffice 365 Monitoring Best Practices
Office 365 Monitoring Best Practices
 
Caqa5e ch1 with_review_and_examples
Caqa5e ch1 with_review_and_examplesCaqa5e ch1 with_review_and_examples
Caqa5e ch1 with_review_and_examples
 
Smart Grid Analytics: All That Remains to be Ready is You
Smart Grid Analytics: All That Remains to be Ready is YouSmart Grid Analytics: All That Remains to be Ready is You
Smart Grid Analytics: All That Remains to be Ready is You
 
Predictive Maintenance - Portland Machine Learning Meetup
Predictive Maintenance - Portland Machine Learning MeetupPredictive Maintenance - Portland Machine Learning Meetup
Predictive Maintenance - Portland Machine Learning Meetup
 

Recently uploaded

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 

Recently uploaded (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 

On Optimizing Operational Efficiency in Storage Systems via Deep Reinforcement Learning

  • 1. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets 1 On Optimizing Operational Efficiency in Storage Systems via Deep Reinforcement Learning Sunil Srinivasa1, Girish Kathalagiri1, Julu Subramanyam Varanasi2, Luis Carlos Quintela1, Mohamad Charafeddine1, ChiHoon Lee1 1 Samsung SDS America 2 Samsung Semiconductors Inc
  • 2. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets 2 I. Motivation, Intro, & Problem Setup II. Short Primer on RL III. Problem Formulation IV.Actor Critic Network & Algorithm V. Results
  • 3. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Motivation • Data Centers as large sinkholes of energy. • US data centers consumption is about 70 billion Kwh, 1.8% total US energy consumption (Lawrence Berkeley Lab study 2016), ~ $8B 3 Image: www.aashe.orgImage: systemair.com Building-level: Data Center energy usage optimization, optimizing water pumps, cooling towers, .. Server-level: energy usage optimization per server, optimizing fans speed inside servers focus of this talk
  • 4. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Problem setup: operational efficiency of a storage server 4 24 SSDs solid state drives for storage 1 2 3 … 24 1 2 3 4 5 24 temperature reading for each SSD drive Speed setting for the fans
  • 5. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Problem setup: operational efficiency of a storage server 5 • High temperatures lower the performance of reading/writing • High temperatures shorten the lifespan of the SSD • Temperature varies based on the workload: read or write, how many KB, and how often. • Current status quo control is rule-based. We would like to use Deep Reinforcement Learning to represent the state of operation and learn the optimum policy for controlling the fan speeds, as a proxy of energy usage for cooling.
  • 6. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Contributions • Model-free approach, no need to understand to SSD server behavior dynamics of workload and temperature • We trained on the real environment (the server), not through a simulator • We designed a Reward function for this problem to guide the algorithm towards the desired operation behavior (servers temperature & fans speeds) 6
  • 7. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets 7 I. Motivation, Intro, & Problem Setup II. Short Primer on RL III. Problem Formulation IV.Actor Critic Network & Algorithm V. Results
  • 8. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Rewards rt Perceived State of the environment st 8 RL Agent Environment Action at | at+1 st rt | st+1 | rt+1 | rt+2 | …. rt+H-1 | at+2 | …. at+H-1 | st+2 | …. st+H-1 Policy π(at|st) to optimize expected return 𝑅 = 𝐸 𝑘=0 𝐻−1 𝛾 𝑘 𝑟𝑡 𝑠𝑡, 𝑎𝑡 , i.e., expected sum of discounted rewards Reinforcement Learning
  • 9. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Reinforcement Learning: Action Value, State Value, Advantage functions • Action Value Function Qπ(st, at): is the scalar expected long-term return achieved for following Policy π when in state st by performing action at.  How good is this state-action pair. • State Value Function Vπ(st): is the scalar expected long-term return achieved for following Policy π when in state st averaged over all possible actions {a} from that state.  How good is this state st. • Advantage Function Aπ(st, at) = Qπ(st, at) - Vπ(st) is the advantage of taking action at when in state st compared to the average of all possible actions from that state. The Action and Value functions are unknown, and we need to learn them by interacting with the environment. One approach is through training Deep Neural Network(s) which takes the state as an input and outputs an approximation of Q and V. 9
  • 10. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets 10 I. Motivation, Intro, & Problem Setup II. Short Primer on RL III.Problem Formulation: State, Action, Rewards IV.Actor Critic Network & Algorithm V. Results
  • 11. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Problem setup using Deep Reinforcement Learning 11 State: function of workload (Reads or Writes) and temperature RL Agent Environment Actions: 24 SSD Drives Fan Speeds Reward: A function of Temperatures & Fans Speeds Advantage Actor-Critic (A2C) Desired operation region
  • 12. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Problem Formulation: State 12 Where normalization operation per field is Γ(𝑥) = 𝑥 − 𝑚𝑖𝑛𝑋 𝑚𝑎𝑥𝑋 −𝑚𝑖𝑛𝑋 1. Γ(# of I/O requests to the server per second) 2. Γ(# of Kbytes read, averaged over all SSDs per sec) 3. Γ(# of Kbytes written, averaged over all SSDs per sec) 4. Γ(# of Kbytes read in previous time slot, averaged over all SSDs) 5. Γ(# of Kbytes written in previous time slot, avg. over all SSDs) 6. Γ(Mean Temperate: averaged over all SSDs) 7. Γ(Mean Fan Speed: averaged over all 5 cooling fans) State = Time slot = 25 sec Horizon = 10 slots sec Read via iostat Read via ipmi-sensors
  • 13. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Problem Formulation: Actions 13 sec Raw Actions Size of Actions space = 7 A = {6, 8, 10, .., 18} Kilo rpm (revolutions per minute) We tried 2 approaches Incremental Actions Size of Actions space = 3 A = {Decrease by 1000, No change, Increase by 1000} rpm Allows for smoother transitions Action is applied to all 5 fans via the ipmitool command
  • 14. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets We chose this design above. T is the mean temp across all SSDs. F is the mean speed across fans. Problem Formulation: Rewards Desirable operation region Reward shaping is an important design step which requires domain knowledge and guides the RL agent towards its optimal behavior For this RL application, a good reward function should have: 1) High reward for low fan speeds and low temperature 2) Low reward for high fan speeds and low temperature – as this is wasting energy 3) Low reward for low fan speed and high temperature – as this will damage the SSDs 𝑅 = −𝑚𝑎𝑥 Γ(𝑇) Γ(𝐹) , Γ(𝐹) Γ(𝑇)
  • 15. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets 15 I. Motivation, Intro, & Problem Setup II. Short Primer on RL III. Problem Formulation IV.Actor Critic Network & Algorithm V. Results
  • 16. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Actor Critic (A2C) Dueling Network Architecture 16 State: Vector of length 7 Actor: Approximates optimal policy, of choosing Best action for state s Critic: approximates the state-value function to estimate how attractive is it being at state s Policy Neural Network Value Function Neural Network A NN with 2 branches: one for the Policy evaluation (what action to do: actor), and one for policy estimation (how good is it being in such state: critic) Other option is to estimate the Actor in one NN, and the Critic in another NN. Our approach has the advantage of having: (1) fewer parameters, (2) mitigates overfitting one function over the other, (3) faster convergence, and (4) more accurate approximations for the actor & the critic.
  • 17. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets A2C algorithm pseudo-code 17
  • 18. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets 18 I. Motivation, Intro, & Problem Setup II. Short Primer on RL III. Problem Formulation IV.Actor Critic Network & Algorithm V. Results
  • 19. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Learning… over few days 19 RL Agent Fan Speeds State: Different workloads, Temperature Reward • Learning directly on the real environment (no simulator) • Model-free: does not require any knowledge of the SSD server behavior dynamics • Exposed to different stochastic workloads
  • 20. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Performance for Idle Vs Heavy Periodic I/O workloads on the operational contours 20 Status Quo controller Using Deep Reinforcement Learning with Raw Actions Desired Operational Region
  • 21. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Performance for different stochastic workloads – Attempt 1 21 At the beginning of training, algorithm is exploring and learning Using Deep Reinforcement Learning with Raw Actions – Suboptimal performance, likely due to insufficient exploration Resulting distribution from different workloads Desired Operational Region
  • 22. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets Performance for different stochastic workloads – Attempt 2 22 At the beginning of training, algorithm is exploring and learning Once finished learning right policy, operational behavior is within desired region. Used DRL with incremental actions. Desired Operational Region Resulting distribution from different workloads
  • 23. Copyright © 2018 Samsung SDS, Inc. All rights reserved @mohamadtweets 23

Editor's Notes

  1. is an area of machine learning an agent learns from interaction with an environment what actions to take in order to optimize a long-term objective (expectation of cumulative rewards)