PhD Thesis: Performance Modeling of Cloud Computing Centers
1. Performance Modeling of Cloud Computing Centers
Hamzeh Khazaei
Advisor: Dr. Jelena Mišić
Co-Advisor: Dr. Rasit Eskicioglu
Department of Computer Science
University of Manitoba
January 18, 2013
Hamzeh Khazaei (UoM) Cloud Performance Modeling January 18, 2013 1 / 30
2. Overview
1 Introduction
2 Cloud Computing Performance Analysis
3 Monolithic Performance Models
Basic Assumptions
Batch Arrivals
Highly Virtualized Environment
4 Interacting Performance Models
A Fine-Grained Performance Model
A Pool Management Schema
Heterogeneous Requests and Resources
5 Summary & Future Work
3. Introduction
CC is a computing paradigm in which different computing resources,
such as infrastructure, platforms and software applications, are made
accessible over the Internet to remote users as services.
CC has a service-oriented architecture:
Software as a Service (SaaS)
Platform as a Service (PaaS)
Infrastructure as a Service (IaaS)
4. Introduction
Cloud services differ from traditional hosting in three principal
aspects:
On demand
Elastic
User transparent
As everything is delivered as a service, Quality of Service (QoS) is
critical.
Availability and QoS have been listed as top obstacles to the growth
and acceptance of CC.
5. Performance Analysis
Due to:
The dynamic nature of cloud environments
The diversity of user requests
The time dependency of load
it is hard to:
Provide the expected quality of service
While avoiding over-provisioning
Techniques and mechanisms are required that guarantee a minimum
level of QoS.
Our main tools are Queuing Theory and Probabilistic Analysis.
6. Related Research
Cloud computing has attracted considerable research attention;
however, only a few studies address performance issues.
A handful of analytical models exist, e.g.:
Classic open network (exponential service time)
M/M/m/m + r queuing system (exponential service time)
There is no evidence to support assuming exponentially distributed
service times for cloud requests.
Proposed abstract model: a multi-server queuing system (M/G/m)
with generally distributed service time
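Since the argument is that exponential service times (CoV exactly 1) are an unjustified assumption, a small closed-form check in Python may help: the coefficient of variation of a hyper-exponential mixture, a common model for service times with CoV above 1. The parameter values are illustrative only:

```python
import math

def cov_hyperexponential(probs, rates):
    """Coefficient of variation of a hyper-exponential mixture:
    with probability p_i the service time is Exp(rate_i)."""
    m1 = sum(p / r for p, r in zip(probs, rates))           # E[X]
    m2 = sum(2 * p / r ** 2 for p, r in zip(probs, rates))  # E[X^2]
    return math.sqrt(m2 - m1 ** 2) / m1

# A single phase reduces to an exponential distribution, whose CoV is 1;
# mixing two very different rates pushes the CoV above 1, exactly the
# regime where classical M/G/m approximations degrade.
print(cov_hyperexponential([1.0], [2.0]))             # 1.0 (exponential)
print(cov_hyperexponential([0.5, 0.5], [1.0, 10.0]))  # above 1
```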
7. Related Research
Many existing works about M/G/m queuing systems rely on
approximations.
In general, these are accurate enough only for:
A small number of servers (fewer than 10)
High offered load (ρ > 0.8)
A small coefficient of variation (CoV < 1)
With a large number of servers and CoV above 1, the approximation
models are virtually useless.
As a result, existing methods were not directly applicable to
performance analysis of cloud centers.
9. Basic Assumptions
We started by modeling the cloud environment as an M/G/m queuing system.
Assumptions:
Dynamic arrival process (Poisson process)
Single arrival
Generally distributed service time
Hypo-exponential service time
Scale (i.e., large number of servers)
Infinite buffer space
10. Approximate Analytical Model
The number of tasks in the system can be considered a stochastic
process (the original process).
The original process is hard to analyze directly (due to the general
service time).
We employ the embedded-process technique:
we identify a semi-Markov process (SMP) within the original process.
The SMP records the number of tasks in the system at the moments of
task arrivals.
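The embedded-process idea can be illustrated with a minimal discrete-event sketch (not the thesis model itself): simulate an M/G/m queue and record the number of tasks in the system immediately before each Poisson arrival. All parameter values below are illustrative assumptions:

```python
import heapq
import random

def embedded_states(lam, service, m, n_arrivals, seed=1):
    """Simulate an M/G/m queue and record the number of tasks in the
    system immediately before each Poisson arrival; the recorded
    sequence is the embedded process observed at arrival instants."""
    rng = random.Random(seed)
    t = 0.0
    departures = []   # min-heap of departure times of tasks in service
    waiting = 0       # tasks queued behind the m servers
    states = []
    for _ in range(n_arrivals):
        t += rng.expovariate(lam)              # next arrival instant
        while departures and departures[0] <= t:
            d = heapq.heappop(departures)      # a task completes at d
            if waiting:                        # a queued task starts at d
                waiting -= 1
                heapq.heappush(departures, d + service(rng))
        states.append(len(departures) + waiting)
        if len(departures) < m:                # a server is free
            heapq.heappush(departures, t + service(rng))
        else:
            waiting += 1
    return states

# Uniformly distributed (i.e. non-exponential) service times:
print(embedded_states(2.0, lambda r: r.uniform(0.5, 1.5), 4, 10))
```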
11. System Behavior
[Figure: system behavior over time. At super-task arrival A_n there
are q_n tasks in the system; between arrivals A_n and A_(n+1),
v_(n+1) tasks are serviced and depart, leaving q_(n+1) tasks in the
system at the next arrival.]
12. Adding More Details
Examine the hyper-exponential distribution for service time.
Examine a finite system buffer (M/G/m/m + r): blocking probability
and probability of immediate service become important.
Probability distributions of waiting and response times are also
obtained.
Using these distributions, the exact behavior of cloud centers under
different configurations, assumptions and parameter settings is captured.
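For intuition about blocking, one well-known special case: with no buffer (r = 0) the M/G/m/m loss system's blocking probability is given by the Erlang B formula and is insensitive to the service-time distribution. This is a standard textbook result, not the thesis's M/G/m/m + r analysis; a numerically stable recursive evaluation:

```python
def erlang_b(m, a):
    """Blocking probability of an M/G/m/m loss system (Erlang B),
    evaluated with the stable recursion
    B(0) = 1,  B(k) = a*B(k-1) / (k + a*B(k-1)).
    Only the offered load a = lambda * E[S] matters: the result is
    insensitive to the shape of the service-time distribution."""
    b = 1.0
    for k in range(1, m + 1):
        b = a * b / (k + a * b)
    return b

# 10 servers, offered load 8 Erlangs; the probability of immediate
# service in the loss system is 1 - erlang_b(m, a).
print(erlang_b(10, 8.0))
```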
13. Batch Arrivals - Super-Tasks
We add the assumption of arrivals that require several servers at the
same time (super-tasks).
There is no published work on the performance evaluation of clouds
under batch arrivals.
We employ an M[x]/G/m/m + r queuing system for analytical modeling.
Batch sizes are assumed to be generally distributed.
We adopt the total acceptance admission and servicing policy.
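The total acceptance policy admits a super-task only if the entire batch fits. A minimal sketch of the admission check (function and parameter names are illustrative):

```python
def admit(in_system, batch_size, m, r):
    """Total acceptance policy for an M[x]/G/m/m+r system: a super-task
    of batch_size tasks is admitted only if every one of its tasks fits
    within the m servers plus r buffer slots; otherwise the whole
    super-task is rejected."""
    return in_system + batch_size <= m + r

# 10 servers, 2 buffer slots, 10 tasks already in the system:
print(admit(10, 3, 10, 2))   # False: 10 + 3 > 12, whole batch rejected
print(admit(9, 3, 10, 2))    # True: 9 + 3 <= 12
```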
14. Homogenization
Our results indicate that response time is very sensitive to the CoV
of task service times and to the number of tasks in a super-task.
Homogenization
[Figure: a splitter divides the heterogeneous cloud center (M servers)
into two homogeneous sub-centers of M/2 servers each.]
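One way to picture the splitter is routing super-tasks by batch size so that each sub-center sees a narrower batch-size distribution. This is a hypothetical sketch (the threshold and the sample data are invented) showing the per-sub-stream CoV dropping:

```python
import statistics

def cov(xs):
    """Coefficient of variation of a sample (population std / mean)."""
    return statistics.pstdev(xs) / statistics.mean(xs)

def homogenize(batch_sizes, threshold):
    """Route small super-tasks to sub-center 1 and large ones to
    sub-center 2, so each sub-center sees a narrower batch-size
    distribution.  The threshold is an invented tuning knob."""
    sub1 = [b for b in batch_sizes if b <= threshold]
    sub2 = [b for b in batch_sizes if b > threshold]
    return sub1, sub2

batches = [1, 1, 2, 1, 8, 2, 9, 1, 10, 2]   # illustrative batch sizes
sub1, sub2 = homogenize(batches, 4)
# Each sub-stream has a lower CoV than the combined stream:
print(cov(batches), cov(sub1), cov(sub2))
```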
15. Highly Virtualized Environment
[Figure: cloud center architecture. Clients submit super-tasks to a
load-balancing server, which dispatches them to M physical machines
(PM #1 through PM #M). Each PM consists of hardware and a hypervisor
layer hosting m VMs, and is modeled as an M[x]/G/m/m + r queuing
system.]
16. Virtualization Effects
[Figure: performance vs. number of tiles. VMmark score (0 to 12) and
normalized service time (0.0 to 2.0) plotted against the number of
tiles (0 to 14).]
17. Research Contributions
For different values of offered load, number of servers, system
capacity, and distributions of service time and batch size, we obtain:
Distribution of the number of tasks in the system
Distributions of response and waiting times
Blocking probability and probability of immediate service for cloud
centers
We also propose techniques for improving the above performance metrics.
We measure performance metrics for highly virtualized environments.
19. A Fine-Grained Performance Model
Performance models ought to cover a vast parameter space while
remaining tractable.
A monolithic model may suffer from intractability and poor scalability
due to its large number of parameters.
We develop and evaluate tractable functional sub-models and their
interaction model, solving them iteratively.
Each sub-model captures a different servicing step in a complex cloud
center, and the overall solution is obtained by iterating over the
individual sub-model solutions.
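The iterate-until-convergence scheme can be sketched generically. The two toy sub-models below are hypothetical contractions standing in for the actual sub-models, exchanging a vector of quantities (e.g. blocking probabilities) until a fixed point is reached:

```python
def solve_interacting(submodels, x0, tol=1e-9, max_iter=1000):
    """Fixed-point iteration over interacting sub-models: each sub-model
    maps the current vector of exchanged quantities to updated values;
    iterate until successive sweeps agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_new = x
        for f in submodels:
            x_new = f(x_new)
        if all(abs(a - b) < tol for a, b in zip(x_new, x)):
            return x_new
        x = x_new
    raise RuntimeError("sub-model iteration did not converge")

# Two toy sub-models exchanging a pair of probabilities; the fixed
# point of this pair is (0.2, 0.2).
f1 = lambda x: (0.5 * x[1] + 0.1, x[1])
f2 = lambda x: (x[0], 0.5 * x[0] + 0.1)
print(solve_interacting([f1, f2], (0.0, 0.0)))
```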
20. High Level Architecture
[Figure: high-level architecture. Super-tasks arrive at a global FIFO
queue served by the Resource Assigning Module; a super-task may be
blocked due to lack of room in the queue or due to insufficient
capacity. Accepted super-tasks proceed to the VM Provisioning Module,
which maintains FIFO queues at the PMs of the hot, warm and cold
pools; instance creation and VM start-up follow, and the VM run-time
constitutes the actual service time.]
23. Interaction Diagram among Modules
[Figure: interaction among modules. The RASM exchanges blocking
probabilities (BP_q) and pool-selection probabilities (P_h, P_w, P_c)
with the VM provisioning sub-models VMPSM_hot, VMPSM_warm and
VMPSM_cold; an availability model also feeds the performance model.
Performance model outputs: effective task rejection probability and
effective total servicing delay.]
24. Pool Management Module
[Figure: pool management Markov chain. States (i, j, k) track the
number of PMs in the hot, warm and cold pools, ranging from the
initial configuration with no running task to the state in which all
PMs are busy; transitions are labeled SU (start-up), FR_w and FR_c
(involving warm and cold PMs) and RP_i.]
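A minimal pool-manager sketch may make the state space concrete. The transition semantics assumed here are illustrative guesses, not the thesis's definitions: when capacity is needed a warm PM is preferred over a cold one (shorter start-up delay), and a released PM is parked back in the warm pool.

```python
class PoolManager:
    """Sketch of a hot/warm/cold pool manager; the triple
    (hot, warm, cold) corresponds to a state (i, j, k)."""
    def __init__(self, hot, warm, cold):
        self.hot, self.warm, self.cold = hot, warm, cold

    def acquire(self):
        """Move one PM into the hot pool; return False if none is left."""
        if self.warm:
            self.warm -= 1       # bring up a warm PM
        elif self.cold:
            self.cold -= 1       # start up a cold PM (slower in reality)
        else:
            return False         # all PMs are busy
        self.hot += 1
        return True

    def release(self):
        """Return an idle hot PM to the warm pool."""
        if self.hot:
            self.hot -= 1
            self.warm += 1

pm = PoolManager(1, 1, 1)
pm.acquire()                     # warm PM promoted first
print(pm.hot, pm.warm, pm.cold)  # 2 0 1
```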
25. Scalability of Performance Models
Relationship between the size of sub-models and design parameters:

Sub-Model | Design Parameter | Number of States
RASM      | Lq               | f(Lq) = 3*Lq + 1
VMPSM     | m                | f(1) = 3; f(2) = 6; f(m) = 2*f(m-1) - f(m-2) + 1 for m > 2
PMSM      | Nw, Nc, σ        | f(Nw, Nc, σ) = Nw + Nc + 1 + Σ_{s=1..σ} (Nw + Nc - σ - s)
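The state-count formulas can be evaluated directly; this sketch just encodes the table's recurrences:

```python
def rasm_states(Lq):
    """States of the RASM sub-model as a function of queue size Lq."""
    return 3 * Lq + 1

def vmpsm_states(m):
    """States of the VMPSM sub-model from the recurrence
    f(1) = 3, f(2) = 6, f(m) = 2*f(m-1) - f(m-2) + 1 for m > 2."""
    if m == 1:
        return 3
    if m == 2:
        return 6
    f2, f1 = 3, 6
    for _ in range(3, m + 1):
        f2, f1 = f1, 2 * f1 - f2 + 1
    return f1

# The recurrence has constant second difference 1, so the state count
# grows only quadratically in m, which keeps the sub-model tractable:
print([vmpsm_states(m) for m in range(1, 6)])   # [3, 6, 10, 15, 21]
```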
26. Heterogeneous Requests and Resources
PMs differ among pools.
VMs differ in the number of virtual CPUs (vCPUs).
Specification of a typical super-task and PMs in each pool
A heterogeneous request specifies: number of vCPUs per VM, number of
cores per vCPU, RAM (GB), disk (GB), and number of VMs.

Heterogeneous physical machines:

Pool      | PM specification                             | vCPU options
Hot Pool  | 10 cores, 40 GB RAM, 500 GB disk, max 10 VMs | 1, 2, 3 or 4 cores
Warm Pool | 8 cores, 25 GB RAM, 250 GB disk, max 8 VMs   | 1, 2 or 4 cores
Cold Pool | 4 cores, 10 GB RAM, 200 GB disk, max 2 VMs   | 2 or 4 cores
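A feasibility check for matching a heterogeneous request to a PM can be sketched as follows. The packing rule (total vCPU cores must not exceed the PM's cores) and the field names are assumptions for illustration; RAM and disk checks are elided:

```python
def fits(pm, vm_count, vcpus_per_vm, cores_per_vcpu):
    """Can a super-task asking for vm_count VMs, each with vcpus_per_vm
    vCPUs of cores_per_vcpu cores, be hosted on one PM?"""
    if vm_count > pm["max_vms"]:
        return False
    return vm_count * vcpus_per_vm * cores_per_vcpu <= pm["cores"]

hot_pm = {"cores": 10, "max_vms": 10}   # hot-pool PM from the table
print(fits(hot_pm, 2, 2, 2))   # True: 2*2*2 = 8 cores <= 10
print(fits(hot_pm, 3, 2, 2))   # False: 12 cores needed
```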
27. Rejection Probability
[Figures: the rejection probability curve shows stable and transient
regimes; the total delay curve shows stable, transient and unstable
regimes, with the maximum gain and the maximum queuing delay marked
and processing delays dominant.]
28. Research Contributions
Propose and evaluate performance models suitable for large-sized IaaS
clouds.
The models are sufficiently detailed to capture the realistic aspects
of the resource allocation process, instance creation and
instantiation delays of a modern cloud center, while maintaining a
good tradeoff between accuracy and tractability.
Identify the stable, transient and unstable regimes of operation for
given configurations.
Describe the dynamics of server pools and provide the optimum pool
arrangement.
Make capacity planning a less challenging task for cloud providers.
Provide a power consumption strategy.
Introduce heterogeneity for the first time.
29. Conclusion
Summary
Introduction to cloud computing
Monolithic performance models
Basic assumptions
Super-task arrivals
Highly virtualized environments
Interacting performance sub-models
A fine-grained performance model
A pool management schema
Heterogeneous cloud centers
Results and research contributions
Future Research
Extend the heterogeneity
Live migration of VMs