SlideShare a Scribd company logo
Contributions to Performance Modeling and
Management of Data Centers
Rerngvit Yanggratoke

Advisor: Prof. Rolf Stadler
Opponent: Prof. Filip De Turck

Licentiate Seminar
October 25, 2013
1
Introduction
Networked services – search and social-network services
Performance of such services is important
Data center

Internet

Backend
component

Performance management of backend components in a data center
1. Resource allocation for a large-scale cloud environment
2. Performance modeling of a distributed key-value store
2
Outline
1. Resource allocation for a large-scale cloud environment
• Motivation
• Problem
• Approach
• Evaluation results
• Related research / publications
2. Performance modeling of a distributed key-value store
• Motivation
• Problem
• Approach
• Evaluation results
• Related research / publications
– Contributions
– Open questions

3
Motivation – Large-scale Cloud
“U.S. Data centers Growing in Size But Declining in Numbers”
IDC Research, 2012.

• Apple's data center in North
Carolina, USA (500’000 feet2)
• Expected to host 150’000+
machines

• Amazon data center in
Virginia, USA (about 300’000
machines)
• Microsoft's mega data center
in Dublin, Ireland (300’000
feet2)
• Facebook planned for a
massive data center in
Iowa, USA (1.4M feet2)
• eBay's data center in
Utah, USA (at least 240’000
feet2)
4
Resource Allocation
Select the machine for executing an application that satisfies:
• Resource demands of all applications in the cloud
• Management objectives of the cloud provider

Resource allocation system computes the solution to the problem 5
Problem & Approach
Resource allocation system that supports
– Joint allocation of compute and network resources
– Generic and extensible for different management objectives
– Scalable operation (> 100’000 machines)
– Dynamic adaptation to changes in load patterns

Approach
– We assume the full-bisection-bandwidth network
– Formulate the problem as an optimization problem
– Distributed protocols - gossip-based algorithms
6
An Objective Function f(t)
An objective function f(t) expresses a management objective
“Smaller f(t) moves the system closer to achieve the objective”
Balance load objective

f (t) =

å

k

q

q Î{w ,g , l }

Energy efficiency objective

åq q

c u
n n

(t)

2

nÎN

Fairness objective
æ
ö
d
v
f (t) = - ç å kq å q m q m (t) ÷
ç
÷
q Î{w ,g }
mÎM (t )
è
ø

f (t) = å pn (t)
Pn(t) = energy of machine n at time t

Service differentiation objective
æ
ö
q
d
v
f (t) = - ç å k å q m q m (t) ÷
ç
÷
èq Î{w ,g } mÎb (t )
ø

q Î {w, g, l} = {CPU, Memory, Network}
N = set of machines, M = set of applications

7
Optimization Problem
Minimize
f(t)
Subject to:
net
(1) q n (t) +

å

r
q m (t) £ qnc for q Î {w, g , l } and n Î N

mÎAn (t )
d
m

r
(2) qm (t) ³ q for q Î {w, g, l } and m Î M

• (1) are capacity constraints
• (2) are demand constraints
• Choose a specific f(t) states the problem for a specific
management objective
• This problem is NP-hard
• We apply a heuristic solution
• How to distribute the computation?
8
Gossip Protocol
• A round-based protocol that relies on pairwise interactions between
nodes to accomplish a global objective
• Each node executes the same code and the size of the exchanged
message is limited
“During a gossip interaction, move an application that minimizes f(t)”

Balanced load objective

9
Generic Resource Management Protocol (GRMP)
A generic and scalable gossip protocol for resource allocation

<- f(t) is applied here
10
Evaluation Results
Balanced load objective

Energy efficiency objective

Protocol
Ideal

Protocol
Ideal
Relative power consumption

1

Unbalance

0.8
0.6
0.4
0.2
0
0.95
0.75
0.55
0.35
0.15
CPU Load Factor

0.3

0.6

0.9

1.2

1.5

1
0.8
0.6
0.4
0.2
0
0.95
0.75
0.55
0.35
0.15

CPU Load Factor

Network Load Factor

0.3

0.6

0.9

1.2

1.5

Network Load Factor

Scalability:
• System size up to 100’000
• Other parameters do not change

11
Related Research
• Centralized resource allocation system
– (D. Gmach et al., 2008), (A. Verma et al., 2009),
(C. Guo et al., 2010), (H. Hitesh et al., 2011),
(V. Shrivastava et al., 2011), (D. Breitgand and
A.Epstein, 2012), (J.W. Jiang et al., 2012), (O. Biran et al. 2012), …
• Distributed resource allocation system
– (G. Jung et al., 2010), (Yazir et al., 2010),
(D. Jayasinghe et al., 2011), (E. Feller et al., 2012), this thesis
• Initial placement
– (A. Verma et al., 2009), (M. Wang et al., 2011),
(D. Jayasinghe et al., 2011), (X. Zhang et al., 2012)
• Dynamic placement
– (F. Wuhib et al., 2012), (D. Gmach et al., 2008), (Yazir et
al., 2010), (J. Xu and Jose Fortes, 2011) , this thesis
12
Publications
• R. Yanggratoke, F. Wuhib and R. Stadler, “Gossip-based
resource allocation for green computing in large clouds,” In
Proc. 7th International Conference on Network and Service
Management (CNSM), Paris, France, October 24-28, 2011.
• F. Wuhib, R. Yanggratoke, and R. Stadler, “Allocating Compute
and Network Resources under Management Objectives in
Large-Scale Clouds,” Journal of Network and Systems
Management (JNSM), pages 126, 2013.

13
Outline
1. Resource allocation for a large-scale cloud environment
• Motivation
• Problem
• Approach
• Evaluation results
• Related research / publications

2. Performance modeling of a distributed key-value store
• Motivation
• Problem
• Approach
• Evaluation results
• Related research / publications
– Contributions
– Open questions

14
Spotify Backend
A distributed key-value store

• Motivation: low latency is key to the Spotify service
15
Problem & Approach
• Development of analytical models for
– Predicting response time distribution
– Estimating capacity under different object allocation policies
• Random policy
• Popularity-aware policy
• The models should be
– Accurate for Spotify’s operational range
– Obtainable
– Simple
– Efficient
• Approach
– Simplified architecture
– Probabilistic and stochastic modeling techniques
16
Simplified Architecture for a Particular Site

•
•
•
•

Model only Production Storage
AP selects a storage server uniformly at random
Ignore network and access point delays
Consider steady-state conditions and Poisson arrivals
17
Model for Response Time Distribution
Model for a single storage server

Probability that a request to a server is served below a latency t is

(

Pr(T £ t) = q +(1- q) 1- e

- md (1-(1-q)l /md nd )t

)

Model for a cluster of storage servers
Probability that a request to the cluster is served below a latency t is
1
l
Pr(Tc £ t) =
f (t, nd , m d,s , c , qs )
å
S sÎS
S
18
Model for Estimating Capacity
under Different Object Allocation Policies
Capacity
Ω : Max request rate to a server before a QoS is not satisfied
Ωc: Max request rate to the cluster such that the request rate to
each server is at most Ω
Capacity of a cluster under the popularity-aware policy

Wc (popularity) = W× S
Capacity of a cluster under the random policy
Oa
1/a
O +
a -1
Wc (random) = W ×
Om a
1/a
Om +
a -1
When

Om

O
»
+
S

2 O ln S æ
log log S
ç1ç
S
2 log S
è

ö
÷
÷
ø

(

, Assuming O >> S log S

19

)

3
Evaluation Results – Response Time Dist.
Spotify
Server

Spotify
Cluster

20
Evaluation Results – Estimating Capacity
Number
Random policy
of objects

Popularity-aware policy

Measurement

Model

Error (%)

Measurement

Model

Error (%)

1250

166.50

176.21

5.83

190.10

200

5.21

1750

180.30

180.92

0.35

191.00

200

4.71

2500

189.70

182.30

3.90

190.80

200

4.82

5000

188.40

186.92

0.78

191.00

200

4.71

10000

188.60

190.39

0.95

192.40

200

3.95

21
Related Research
• Distributed key-value store
– Amazon Dynamo (G. Decandia et al., 2007), Cassandra
(A. Lakshman, 2010), Riak, HyperDec, Scalaris
• Performance modeling of a storage device
– (J.D. Garcia et al., 2008), (A.S. Lebrecht., 2009),
(F. Cady et al., 2011), (S. Boboila and P. Desnoyers, 2011),
(P. Desnoyers, 2012), (S. Li and H.H. Huang, 2010)
• White-box vs. black-box performance model for a storage system
– Whitebox: (A.S. Lebrecht., 2009), (F. Cady et al., 2011),
(S. Boboila and P. Desnoyers, 2011), (P. Desnoyers, 2012),
this thesis
– Blackbox: (J.D. Garcia et al., 2008), (S. Li and H.H.
Huang, 2010), (A. Gulati et al., 2011)
22
Publications
• R. Yanggratoke, G. Kreitz, M. Goldmann and R.
Stadler, “Predicting response times for the Spotify backend,”
In Proc. 8th International conference on Network and service
management (CNSM), Las Vegas, NV, USA, October 2226, 2012. Best Paper award.
• R. Yanggratoke, G. Kreitz, M. Goldmann, R. Stadler and V.
Fodor, “On the performance of the Spotify backend,” accepted
to Journal of Network and Systems Management
(JNSM), 2013.

23
Contributions
• We designed, developed, and evaluated a generic protocol for
resource allocation that
– supports joint allocation of compute and network resources
– Is generic and extensible for different management
objectives
– enables scalable operation (> 100’000 machines)
– supports dynamic adaptation to changes in load patterns
• We designed, developed, and evaluated performance models
for predicting response time distribution and capacity of a
distributed key-value store
– Simple yet accurate for Spotify’s operational range
– Obtainable
– Efficient
24
Open Questions for Future Research
• Resource allocation for a large-scale cloud environment
– Centralized vs. decentralized resource allocation
– Multiple data centers & telecom clouds
• Performance modeling of a distributed key-value store
– Black-box models for performance predictions
– Online performance management using analytical models

25

More Related Content

What's hot

Smart Metrics for High Performance Material Design
Smart Metrics for High Performance Material DesignSmart Metrics for High Performance Material Design
Smart Metrics for High Performance Material Design
aimsnist
 
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
Jinho Choi
 
A survey on massively Parallelism for indexing multidimensional datasets on t...
A survey on massively Parallelism for indexing multidimensional datasets on t...A survey on massively Parallelism for indexing multidimensional datasets on t...
A survey on massively Parallelism for indexing multidimensional datasets on t...
Tejovat Technologies Pvt.Ltd.,Wakad
 
Polymer Genome: An Informatics Platform for Polymer Dielectrics Discovery and...
Polymer Genome: An Informatics Platform for Polymer Dielectrics Discovery and...Polymer Genome: An Informatics Platform for Polymer Dielectrics Discovery and...
Polymer Genome: An Informatics Platform for Polymer Dielectrics Discovery and...
aimsnist
 
Crystallization classification semisupervised
Crystallization classification semisupervisedCrystallization classification semisupervised
Crystallization classification semisupervised
Madhav Sigdel
 
Introduction to Reinforcement Learning for Molecular Design
Introduction to Reinforcement Learning for Molecular Design Introduction to Reinforcement Learning for Molecular Design
Introduction to Reinforcement Learning for Molecular Design
Dan Elton
 
PhD Defense Slides
PhD Defense SlidesPhD Defense Slides
PhD Defense Slides
Debasmit Das
 
Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibra...
Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibra...Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibra...
Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibra...
Debasmit Das
 
Meow Hagedorn
Meow HagedornMeow Hagedorn
Meow Hagedorn
MedicineAndDermatology
 
[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GAN[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GAN
NAVER Engineering
 
Automated Generation of High-accuracy Interatomic Potentials Using Quantum Data
Automated Generation of High-accuracy Interatomic Potentials Using Quantum DataAutomated Generation of High-accuracy Interatomic Potentials Using Quantum Data
Automated Generation of High-accuracy Interatomic Potentials Using Quantum Data
aimsnist
 
Failing Fastest: What an Effective HTE and ML Workflow Enables for Functional...
Failing Fastest: What an Effective HTE and ML Workflow Enables for Functional...Failing Fastest: What an Effective HTE and ML Workflow Enables for Functional...
Failing Fastest: What an Effective HTE and ML Workflow Enables for Functional...
aimsnist
 

What's hot (12)

Smart Metrics for High Performance Material Design
Smart Metrics for High Performance Material DesignSmart Metrics for High Performance Material Design
Smart Metrics for High Performance Material Design
 
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
 
A survey on massively Parallelism for indexing multidimensional datasets on t...
A survey on massively Parallelism for indexing multidimensional datasets on t...A survey on massively Parallelism for indexing multidimensional datasets on t...
A survey on massively Parallelism for indexing multidimensional datasets on t...
 
Polymer Genome: An Informatics Platform for Polymer Dielectrics Discovery and...
Polymer Genome: An Informatics Platform for Polymer Dielectrics Discovery and...Polymer Genome: An Informatics Platform for Polymer Dielectrics Discovery and...
Polymer Genome: An Informatics Platform for Polymer Dielectrics Discovery and...
 
Crystallization classification semisupervised
Crystallization classification semisupervisedCrystallization classification semisupervised
Crystallization classification semisupervised
 
Introduction to Reinforcement Learning for Molecular Design
Introduction to Reinforcement Learning for Molecular Design Introduction to Reinforcement Learning for Molecular Design
Introduction to Reinforcement Learning for Molecular Design
 
PhD Defense Slides
PhD Defense SlidesPhD Defense Slides
PhD Defense Slides
 
Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibra...
Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibra...Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibra...
Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibra...
 
Meow Hagedorn
Meow HagedornMeow Hagedorn
Meow Hagedorn
 
[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GAN[GAN by Hung-yi Lee]Part 1: General introduction of GAN
[GAN by Hung-yi Lee]Part 1: General introduction of GAN
 
Automated Generation of High-accuracy Interatomic Potentials Using Quantum Data
Automated Generation of High-accuracy Interatomic Potentials Using Quantum DataAutomated Generation of High-accuracy Interatomic Potentials Using Quantum Data
Automated Generation of High-accuracy Interatomic Potentials Using Quantum Data
 
Failing Fastest: What an Effective HTE and ML Workflow Enables for Functional...
Failing Fastest: What an Effective HTE and ML Workflow Enables for Functional...Failing Fastest: What an Effective HTE and ML Workflow Enables for Functional...
Failing Fastest: What an Effective HTE and ML Workflow Enables for Functional...
 

Viewers also liked

Gossip-based resource allocation for green computing in large clouds
Gossip-based resource allocation for green computing in large cloudsGossip-based resource allocation for green computing in large clouds
Gossip-based resource allocation for green computing in large clouds
Rerngvit Yanggratoke
 
Stack Mobile
Stack Mobile Stack Mobile
Stack Mobile
Rerngvit Yanggratoke
 
Learn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionLearn BEM: CSS Naming Convention
Learn BEM: CSS Naming Convention
In a Rocket
 
SEO: Getting Personal
SEO: Getting PersonalSEO: Getting Personal
SEO: Getting Personal
Kirsty Hulse
 
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job? Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Stanford GSB Corporate Governance Research Initiative
 
How to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media PlanHow to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media Plan
Post Planner
 
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika AldabaLightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
ux singapore
 
Study: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving CarsStudy: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving Cars
LinkedIn
 

Viewers also liked (8)

Gossip-based resource allocation for green computing in large clouds
Gossip-based resource allocation for green computing in large cloudsGossip-based resource allocation for green computing in large clouds
Gossip-based resource allocation for green computing in large clouds
 
Stack Mobile
Stack Mobile Stack Mobile
Stack Mobile
 
Learn BEM: CSS Naming Convention
Learn BEM: CSS Naming ConventionLearn BEM: CSS Naming Convention
Learn BEM: CSS Naming Convention
 
SEO: Getting Personal
SEO: Getting PersonalSEO: Getting Personal
SEO: Getting Personal
 
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job? Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
 
How to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media PlanHow to Build a Dynamic Social Media Plan
How to Build a Dynamic Social Media Plan
 
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika AldabaLightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
 
Study: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving CarsStudy: The Future of VR, AR and Self-Driving Cars
Study: The Future of VR, AR and Self-Driving Cars
 

Similar to Licentiate Defense Slide

Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Alessandro Suglia
 
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Claudio Greco
 
Wrokflow programming and provenance query model
Wrokflow programming and provenance query model  Wrokflow programming and provenance query model
Wrokflow programming and provenance query model
Rayhan Ferdous
 
Optimization of Continuous Queries in Federated Database and Stream Processin...
Optimization of Continuous Queries in Federated Database and Stream Processin...Optimization of Continuous Queries in Federated Database and Stream Processin...
Optimization of Continuous Queries in Federated Database and Stream Processin...
Zbigniew Jerzak
 
Scalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data ShardingScalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data Sharding
inside-BigData.com
 
probabilistic ranking
probabilistic rankingprobabilistic ranking
probabilistic ranking
FELIX75
 
rerngvit_phd_seminar
rerngvit_phd_seminarrerngvit_phd_seminar
rerngvit_phd_seminar
rerngvit yanggratoke
 
Semantics-enhanced Geoscience Interoperability, Analytics, and Applications
Semantics-enhanced Geoscience Interoperability, Analytics, and ApplicationsSemantics-enhanced Geoscience Interoperability, Analytics, and Applications
Semantics-enhanced Geoscience Interoperability, Analytics, and Applications
Artificial Intelligence Institute at UofSC
 
A Workshop on R
A Workshop on RA Workshop on R
A Workshop on R
Ajay Ohri
 
Provinance in scientific workflows in e science
Provinance in scientific workflows in e scienceProvinance in scientific workflows in e science
Provinance in scientific workflows in e science
bdemchak
 
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
The Statistical and Applied Mathematical Sciences Institute
 
Learning Content and Usage Factors Simultaneously
Learning Content and Usage Factors SimultaneouslyLearning Content and Usage Factors Simultaneously
Learning Content and Usage Factors Simultaneously
Arnab Bhadury
 
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAIDeep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Jack Clark
 
Training in Analytics, R and Social Media Analytics
Training in Analytics, R and Social Media AnalyticsTraining in Analytics, R and Social Media Analytics
Training in Analytics, R and Social Media Analytics
Ajay Ohri
 
Knowledge Discovery in Environmental Management
Knowledge Discovery in Environmental Management Knowledge Discovery in Environmental Management
Knowledge Discovery in Environmental Management
Dr. Aparna Varde
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
Mohit Garg
 
Latency-aware Elastic Scaling for Distributed Data Stream Processing Systems
Latency-aware Elastic Scaling for Distributed Data Stream Processing SystemsLatency-aware Elastic Scaling for Distributed Data Stream Processing Systems
Latency-aware Elastic Scaling for Distributed Data Stream Processing Systems
Zbigniew Jerzak
 
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
San Diego Supercomputer Center
 
Wheeler & Benedict -- Enabling the Preservation Relay
Wheeler & Benedict -- Enabling the Preservation RelayWheeler & Benedict -- Enabling the Preservation Relay
Wheeler & Benedict -- Enabling the Preservation Relay
National Information Standards Organization (NISO)
 
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
Jeff Z. Pan
 

Similar to Licentiate Defense Slide (20)

Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
 
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
Ask Me Any Rating: A Content-based Recommender System based on Recurrent Neur...
 
Wrokflow programming and provenance query model
Wrokflow programming and provenance query model  Wrokflow programming and provenance query model
Wrokflow programming and provenance query model
 
Optimization of Continuous Queries in Federated Database and Stream Processin...
Optimization of Continuous Queries in Federated Database and Stream Processin...Optimization of Continuous Queries in Federated Database and Stream Processin...
Optimization of Continuous Queries in Federated Database and Stream Processin...
 
Scalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data ShardingScalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data Sharding
 
probabilistic ranking
probabilistic rankingprobabilistic ranking
probabilistic ranking
 
rerngvit_phd_seminar
rerngvit_phd_seminarrerngvit_phd_seminar
rerngvit_phd_seminar
 
Semantics-enhanced Geoscience Interoperability, Analytics, and Applications
Semantics-enhanced Geoscience Interoperability, Analytics, and ApplicationsSemantics-enhanced Geoscience Interoperability, Analytics, and Applications
Semantics-enhanced Geoscience Interoperability, Analytics, and Applications
 
A Workshop on R
A Workshop on RA Workshop on R
A Workshop on R
 
Provinance in scientific workflows in e science
Provinance in scientific workflows in e scienceProvinance in scientific workflows in e science
Provinance in scientific workflows in e science
 
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
 
Learning Content and Usage Factors Simultaneously
Learning Content and Usage Factors SimultaneouslyLearning Content and Usage Factors Simultaneously
Learning Content and Usage Factors Simultaneously
 
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAIDeep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
 
Training in Analytics, R and Social Media Analytics
Training in Analytics, R and Social Media AnalyticsTraining in Analytics, R and Social Media Analytics
Training in Analytics, R and Social Media Analytics
 
Knowledge Discovery in Environmental Management
Knowledge Discovery in Environmental Management Knowledge Discovery in Environmental Management
Knowledge Discovery in Environmental Management
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
Latency-aware Elastic Scaling for Distributed Data Stream Processing Systems
Latency-aware Elastic Scaling for Distributed Data Stream Processing SystemsLatency-aware Elastic Scaling for Distributed Data Stream Processing Systems
Latency-aware Elastic Scaling for Distributed Data Stream Processing Systems
 
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
SciDB : Open Source Data Management System for Data-Intensive Scientific Anal...
 
Wheeler & Benedict -- Enabling the Preservation Relay
Wheeler & Benedict -- Enabling the Preservation RelayWheeler & Benedict -- Enabling the Preservation Relay
Wheeler & Benedict -- Enabling the Preservation Relay
 
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
The Rise of Approximate Ontology Reasoning: Is It Mainstream Yet? --- Revisit...
 

Recently uploaded

WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Project Management Semester Long Project - Acuity
Project Management Semester Long Project - AcuityProject Management Semester Long Project - Acuity
Project Management Semester Long Project - Acuity
jpupo2018
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
Mariano Tinti
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
Federico Razzoli
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 

Recently uploaded (20)

WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Project Management Semester Long Project - Acuity
Project Management Semester Long Project - AcuityProject Management Semester Long Project - Acuity
Project Management Semester Long Project - Acuity
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 

Licentiate Defense Slide

  • 1. Contributions to Performance Modeling and Management of Data Centers Rerngvit Yanggratoke Advisor: Prof. Rolf Stadler Opponent: Prof. Filip De Turck Licentiate Seminar October 25, 2013 1
  • 2. Introduction Networked services – search and social-network services Performance of such services is important Data center Internet Backend component Performance management of backend components in a data center 1. Resource allocation for a large-scale cloud environment 2. Performance modeling of a distributed key-value store 2
  • 3. Outline 1. Resource allocation for a large-scale cloud environment • Motivation • Problem • Approach • Evaluation results • Related research / publications 2. Performance modeling of a distributed key-value store • Motivation • Problem • Approach • Evaluation results • Related research / publications – Contributions – Open questions 3
  • 4. Motivation – Large-scale Cloud “U.S. Data centers Growing in Size But Declining in Numbers” IDC Research, 2012. • Apple's data center in North Carolina, USA (500’000 feet2) • Expected to host 150’000+ machines • Amazon data center in Virginia, USA (about 300’000 machines) • Microsoft's mega data center in Dublin, Ireland (300’000 feet2) • Facebook planned for a massive data center in Iowa, USA (1.4M feet2) • eBay's data center in Utah, USA (at least 240’000 feet2) 4
  • 5. Resource Allocation Select the machine for executing an application that satisfies: • Resource demands of all applications in the cloud • Management objectives of the cloud provider Resource allocation system computes the solution to the problem 5
  • 6. Problem & Approach Resource allocation system that supports – Joint allocation of compute and network resources – Generic and extensible for different management objectives – Scalable operation (> 100’000 machines) – Dynamic adaptation to changes in load patterns Approach – We assume the full-bisection-bandwidth network – Formulate the problem as an optimization problem – Distributed protocols - gossip-based algorithms 6
  • 7. An Objective Function f(t) An objective function f(t) expresses a management objective “Smaller f(t) moves the system closer to achieve the objective” Balance load objective f (t) = å k q q Î{w ,g , l } Energy efficiency objective åq q c u n n (t) 2 nÎN Fairness objective æ ö d v f (t) = - ç å kq å q m q m (t) ÷ ç ÷ q Î{w ,g } mÎM (t ) è ø f (t) = å pn (t) Pn(t) = energy of machine n at time t Service differentiation objective æ ö q d v f (t) = - ç å k å q m q m (t) ÷ ç ÷ èq Î{w ,g } mÎb (t ) ø q Î {w, g, l} = {CPU, Memory, Network} N = set of machines, M = set of applications 7
  • 8. Optimization Problem Minimize f(t) Subject to: net (1) q n (t) + å r q m (t) £ qnc for q Î {w, g , l } and n Î N mÎAn (t ) d m r (2) qm (t) ³ q for q Î {w, g, l } and m Î M • (1) are capacity constraints • (2) are demand constraints • Choose a specific f(t) states the problem for a specific management objective • This problem is NP-hard • We apply a heuristic solution • How to distribute the computation? 8
  • 9. Gossip Protocol • A round-based protocol that relies on pairwise interactions between nodes to accomplish a global objective • Each node executes the same code and the size of the exchanged message is limited “During a gossip interaction, move an application that minimizes f(t)” Balanced load objective 9
  • 10. Generic Resource Management Protocol (GRMP) A generic and scalable gossip protocol for resource allocation <- f(t) is applied here 10
  • 11. Evaluation Results Balanced load objective Energy efficiency objective Protocol Ideal Protocol Ideal Relative power consumption 1 Unbalance 0.8 0.6 0.4 0.2 0 0.95 0.75 0.55 0.35 0.15 CPU Load Factor 0.3 0.6 0.9 1.2 1.5 1 0.8 0.6 0.4 0.2 0 0.95 0.75 0.55 0.35 0.15 CPU Load Factor Network Load Factor 0.3 0.6 0.9 1.2 1.5 Network Load Factor Scalability: • System size up to 100’000 • Other parameters do not change 11
  • 12. Related Research • Centralized resource allocation system – (D. Gmach et al., 2008), (A. Verma et al., 2009), (C. Guo et al., 2010), (H. Hitesh et al., 2011), (V. Shrivastava et al., 2011), (D. Breitgand and A.Epstein, 2012), (J.W. Jiang et al., 2012), (O. Biran et al. 2012), … • Distributed resource allocation system – (G. Jung et al., 2010), (Yazir et al., 2010), (D. Jayasinghe et al., 2011), (E. Feller et al., 2012), this thesis • Initial placement – (A. Verma et al., 2009), (M. Wang et al., 2011), (D. Jayasinghe et al., 2011), (X. Zhang et al., 2012) • Dynamic placement – (F. Wuhib et al., 2012), (D. Gmach et al., 2008), (Yazir et al., 2010), (J. Xu and Jose Fortes, 2011) , this thesis 12
  • 13. Publications • R. Yanggratoke, F. Wuhib and R. Stadler, “Gossip-based resource allocation for green computing in large clouds,” In Proc. 7th International Conference on Network and Service Management (CNSM), Paris, France, October 24-28, 2011. • F. Wuhib, R. Yanggratoke, and R. Stadler, “Allocating Compute and Network Resources under Management Objectives in Large-Scale Clouds,” Journal of Network and Systems Management (JNSM), pages 126, 2013. 13
  • 14. Outline 1. Resource allocation for a large-scale cloud environment • Motivation • Problem • Approach • Evaluation results • Related research / publications 2. Performance modeling of a distributed key-value store • Motivation • Problem • Approach • Evaluation results • Related research / publications – Contributions – Open questions 14
  • 15. Spotify Backend A distributed key-value store • Motivation: low latency is key to the Spotify service 15
  • 16. Problem & Approach • Development of analytical models for – Predicting response time distribution – Estimating capacity under different object allocation policies • Random policy • Popularity-aware policy • The models should be – Accurate for Spotify’s operational range – Obtainable – Simple – Efficient • Approach – Simplified architecture – Probabilistic and stochastic modeling techniques 16
  • 17. Simplified Architecture for a Particular Site • • • • Model only Production Storage AP selects a storage server uniformly at random Ignore network and access point delays Consider steady-state conditions and Poisson arrivals 17
  • 18. Model for Response Time Distribution Model for a single storage server Probability that a request to a server is served below a latency t is ( Pr(T £ t) = q +(1- q) 1- e - md (1-(1-q)l /md nd )t ) Model for a cluster of storage servers Probability that a request to the cluster is served below a latency t is 1 l Pr(Tc £ t) = f (t, nd , m d,s , c , qs ) å S sÎS S 18
  • 19. Model for Estimating Capacity under Different Object Allocation Policies Capacity Ω : Max request rate to a server before a QoS is not satisfied Ωc: Max request rate to the cluster such that the request rate to each server is at most Ω Capacity of a cluster under the popularity-aware policy Wc (popularity) = W× S Capacity of a cluster under the random policy Oa 1/a O + a -1 Wc (random) = W × Om a 1/a Om + a -1 When Om O » + S 2 O ln S æ log log S ç1ç S 2 log S è ö ÷ ÷ ø ( , Assuming O >> S log S 19 ) 3
  • 20. Evaluation Results – Response Time Dist. Spotify Server Spotify Cluster 20
  • 21. Evaluation Results – Estimating Capacity Number Random policy of objects Popularity-aware policy Measurement Model Error (%) Measurement Model Error (%) 1250 166.50 176.21 5.83 190.10 200 5.21 1750 180.30 180.92 0.35 191.00 200 4.71 2500 189.70 182.30 3.90 190.80 200 4.82 5000 188.40 186.92 0.78 191.00 200 4.71 10000 188.60 190.39 0.95 192.40 200 3.95 21
  • 22. Related Research • Distributed key-value store – Amazon Dynamo (G. Decandia et al., 2007), Cassandra (A. Lakshman, 2010), Riak, HyperDec, Scalaris • Performance modeling of a storage device – (J.D. Garcia et al., 2008), (A.S. Lebrecht., 2009), (F. Cady et al., 2011), (S. Boboila and P. Desnoyers, 2011), (P. Desnoyers, 2012), (S. Li and H.H. Huang, 2010) • White-box vs. black-box performance model for a storage system – Whitebox: (A.S. Lebrecht., 2009), (F. Cady et al., 2011), (S. Boboila and P. Desnoyers, 2011), (P. Desnoyers, 2012), this thesis – Blackbox: (J.D. Garcia et al., 2008), (S. Li and H.H. Huang, 2010), (A. Gulati et al., 2011) 22
  • 23. Publications • R. Yanggratoke, G. Kreitz, M. Goldmann and R. Stadler, “Predicting response times for the Spotify backend,” In Proc. 8th International conference on Network and service management (CNSM), Las Vegas, NV, USA, October 2226, 2012. Best Paper award. • R. Yanggratoke, G. Kreitz, M. Goldmann, R. Stadler and V. Fodor, “On the performance of the Spotify backend,” accepted to Journal of Network and Systems Management (JNSM), 2013. 23
  • 24. Contributions • We designed, developed, and evaluated a generic protocol for resource allocation that – supports joint allocation of compute and network resources – Is generic and extensible for different management objectives – enables scalable operation (> 100’000 machines) – supports dynamic adaptation to changes in load patterns • We designed, developed, and evaluated performance models for predicting response time distribution and capacity of a distributed key-value store – Simple yet accurate for Spotify’s operational range – Obtainable – Efficient 24
  • 25. Open Questions for Future Research • Resource allocation for a large-scale cloud environment – Centralized vs. decentralized resource allocation – Multiple data centers & telecom clouds • Performance modeling of a distributed key-value store – Black-box models for performance predictions – Online performance management using analytical models 25

Editor's Notes

  1. Good morning every body, welcome to my licentiate seminar on the topic “Contributions to Performance Modeling and Management of Data Centers”So, let begin with the introduction to the topic.
  2. Let me start with the introduction to the topic by explaining to you about the networked service, which is the central theme of this thesis.1.) I believe many of you here have used or heard some networked services before, for example, the search-engine services like Google or a social network services like Facebook.===These services have become parts of our daily life, because we interact with them very often, e.g., when we want to search for information, or follow what our friends are doing.2.) We all agree that performance of such services are important. An example of such performance is the response time of services. For example, when you search for information, you want to be interactive, you want it to be fast! You don’t want to wait for half an hour for your query to come back.3.) In order to understand the performance of these services, let look into how these services work. They typically adopt two main components.3.1) The first one is the frontend component are the components that interact directly with the users like interfaces in your laptop, web technologies, mobile applications like in Android or iPhone. 3.2) The second component, which is equally important, which are not very visible to the user but also quite important is the backend components typically residing in a data center. The backend component performs operations for sharing states between applications, for example, web servers that serve data from data bases.4.) Performance management of this second component is the main focus of this thesis. In particular, this thesis focuses on two key problems in management of such backend components residing in a data center.I will go through each of them in subsequent slides.4.)
  3. Here is an outline of the talk today.For each problem, I will go through motivation, problem, approach, and results what we worked on.Then, I will summarize related research for both problems, conclude the presentation, and outline some open requests.
  4. This work is about large-scale cloud. So, let begin with the motivation of the work, or why should we consider a large-scale cloud?The key reason is that data centers are getting bigger. This is the quote of IDC Research last year that “US Data center is increasing in size but decreasing in numbers.” If we think about this, it also makes senses because it is more cost effectiveness when we have an “economy of scale”.What we see here is the Icloud data center from Apple. With a massive size of 500,000 square feet, it expects to host at least 150,000 machines.It is not just Apple. Many other Internet-based companies like Amazon, Microsoft, Facebook and eBay. All have plan for these kind of massive-scale data centers.So the key message here is that the large-scale center is the trend. Let move on to see what kind of problem that this thesis deal with.
  5. In particular, the first problem in this thesis relates to the resource allocation. SoWhat do I mean by resource allocation?I particularly focus on the “application placement problem”or answering the questionHow to select the machine for executing an application that satisfies - Resource demands of all applications in the cloud -&gt; Application need resources to operation for example some CPU cores or some memory for temporary computation- Management objectives of the cloud provider -&gt; The easiest ways to do this is to
  6. In particular, the first problem we consider is designing and evaluating the resource allocation system that
  7. So, this is our optimization problem. Essentially, it minimizes f(t) under the following constraints.
  8. A round-based protocol that relies on pairwise interactions between nodes to accomplish a global objectiveEach node executes the same code and the size of the exchanged message is limited
  9. Here is an outline of the talk today.For each problem, I will go through motivation, problem, approach, and results what we worked on.Then, I will summarize related research for both problems, conclude the presentation, and outline some open requests.