PhD Thesis: Performance Modeling of Cloud Computing Centers
1. Performance Modeling of Cloud Computing Centers
Hamzeh Khazaei
Advisor: Dr. Jelena Mišić
Co-Advisor: Dr. Rasit Eskicioglu
Department of Computer Science
University of Manitoba
January 18, 2013
Hamzeh Khazaei (UoM) Cloud Performance Modeling January 18, 2013 1 / 30
2. Overview
1 Introduction
2 Cloud Computing Performance Analysis
3 Monolithic Performance Models
Basic Assumptions
Batch Arrivals
Highly Virtualized Environment
4 Interacting Performance Models
A Fine-Grained Performance Model
A Pool Management Schema
Heterogeneous Requests and Resources
5 Summary & Future Work
3. Introduction
CC is a computing paradigm in which different computing resources,
such as infrastructure, platforms and software applications, are made
accessible over the Internet to remote users as services.
CC has a service-oriented architecture:
Software as a Service (SaaS)
Platform as a Service (PaaS)
Infrastructure as a Service (IaaS)
4. Introduction
Cloud services differ from traditional hosting in three principal
aspects:
On demand
Elastic
User transparent
As everything is delivered as a service, Quality of Service (QoS) is
critical.
Availability and QoS have been listed as top obstacles to the growth
and acceptance of CC.
5. Performance Analysis
Due to:
The dynamic nature of cloud environments
The diversity of user requests
The time dependency of load
it is hard to:
Provide the expected quality of service
While avoiding over-provisioning
Techniques and mechanisms are required that guarantee a minimum
level of QoS.
Our main tools are Queuing Theory and Probabilistic Analysis.
6. Related Research
Cloud computing has attracted considerable research attention;
however, only a few studies address performance issues.
A handful of analytical models exist, e.g.:
Classic open network (exponential service time)
M/M/m/m + r queuing system (exponential service time)
There is no evidence to support assuming exponentially distributed
service times for cloud requests.
Proposed abstract model: a multi-server queuing system (M/G/m)
with generally distributed service time
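Since the argument is that exponential service times (CoV exactly 1) are an unjustified assumption, a small closed-form check in Python may help: the coefficient of variation of a hyper-exponential mixture, a common model for service times with CoV above 1. The parameter values are illustrative only:

```python
import math

def cov_hyperexponential(probs, rates):
    """Coefficient of variation of a hyper-exponential mixture:
    with probability p_i the service time is Exp(rate_i)."""
    m1 = sum(p / r for p, r in zip(probs, rates))           # E[X]
    m2 = sum(2 * p / r ** 2 for p, r in zip(probs, rates))  # E[X^2]
    return math.sqrt(m2 - m1 ** 2) / m1

# A single phase reduces to an exponential distribution, whose CoV is 1;
# mixing two very different rates pushes the CoV above 1, exactly the
# regime where classical M/G/m approximations degrade.
print(cov_hyperexponential([1.0], [2.0]))             # 1.0 (exponential)
print(cov_hyperexponential([0.5, 0.5], [1.0, 10.0]))  # above 1
```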
7. Related Research
Many existing works about M/G/m queuing systems rely on
approximations.
In general, these are accurate enough only for:
A small number of servers (fewer than 10)
High offered load (ρ > 0.8)
A small coefficient of variation (CoV < 1)
With a large number of servers and CoV above 1, the approximation
models are virtually useless.
As a result, existing methods were not directly applicable to
performance analysis of cloud centers.
9. Basic Assumptions
We started by modeling the cloud environment as an M/G/m queuing system.
Assumptions:
Dynamic arrival process (Poisson process)
Single arrival
Generally distributed service time
Hypo-exponential service time
Scale (i.e., large number of servers)
Infinite buffer space
10. Approximate Analytical Model
The number of tasks in the system can be considered a stochastic
process (the original process).
The original process is hard to analyze directly (due to the general
service time).
We employ the embedded-process technique:
we identify a semi-Markov process (SMP) within the original process.
The SMP records the number of tasks in the system at the moments of
task arrivals.
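The embedded-process idea can be illustrated with a minimal discrete-event sketch (not the thesis model itself): simulate an M/G/m queue and record the number of tasks in the system immediately before each Poisson arrival. All parameter values below are illustrative assumptions:

```python
import heapq
import random

def embedded_states(lam, service, m, n_arrivals, seed=1):
    """Simulate an M/G/m queue and record the number of tasks in the
    system immediately before each Poisson arrival; the recorded
    sequence is the embedded process observed at arrival instants."""
    rng = random.Random(seed)
    t = 0.0
    departures = []   # min-heap of departure times of tasks in service
    waiting = 0       # tasks queued behind the m servers
    states = []
    for _ in range(n_arrivals):
        t += rng.expovariate(lam)              # next arrival instant
        while departures and departures[0] <= t:
            d = heapq.heappop(departures)      # a task completes at d
            if waiting:                        # a queued task starts at d
                waiting -= 1
                heapq.heappush(departures, d + service(rng))
        states.append(len(departures) + waiting)
        if len(departures) < m:                # a server is free
            heapq.heappush(departures, t + service(rng))
        else:
            waiting += 1
    return states

# Uniformly distributed (i.e. non-exponential) service times:
print(embedded_states(2.0, lambda r: r.uniform(0.5, 1.5), 4, 10))
```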
11. System Behavior
[Figure: system behavior over time. At super-task arrival A_n there
are q_n tasks in the system; between arrivals A_n and A_(n+1),
v_(n+1) tasks are serviced and depart, leaving q_(n+1) tasks in the
system at the next arrival.]
12. Adding More Details
Examine the hyper-exponential distribution for service time.
Examine a finite system buffer (M/G/m/m + r): blocking probability
and probability of immediate service become important.
Probability distributions of waiting and response times are also
obtained.
Using these distributions, the exact behavior of cloud centers under
different configurations, assumptions and parameter settings is captured.
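For intuition about blocking, one well-known special case: with no buffer (r = 0) the M/G/m/m loss system's blocking probability is given by the Erlang B formula and is insensitive to the service-time distribution. This is a standard textbook result, not the thesis's M/G/m/m + r analysis; a numerically stable recursive evaluation:

```python
def erlang_b(m, a):
    """Blocking probability of an M/G/m/m loss system (Erlang B),
    evaluated with the stable recursion
    B(0) = 1,  B(k) = a*B(k-1) / (k + a*B(k-1)).
    Only the offered load a = lambda * E[S] matters: the result is
    insensitive to the shape of the service-time distribution."""
    b = 1.0
    for k in range(1, m + 1):
        b = a * b / (k + a * b)
    return b

# 10 servers, offered load 8 Erlangs; the probability of immediate
# service in the loss system is 1 - erlang_b(m, a).
print(erlang_b(10, 8.0))
```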
13. Batch Arrivals - Super-Tasks
We add the assumption of arrivals that require several servers at the
same time (super-tasks).
There is no published work on the performance evaluation of clouds
under batch arrivals.
We employ an M[x]/G/m/m + r queuing system for analytical modeling.
Batch sizes are assumed to be generally distributed.
We adopt the total acceptance admission and servicing policy.
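The total acceptance policy admits a super-task only if the entire batch fits. A minimal sketch of the admission check (function and parameter names are illustrative):

```python
def admit(in_system, batch_size, m, r):
    """Total acceptance policy for an M[x]/G/m/m+r system: a super-task
    of batch_size tasks is admitted only if every one of its tasks fits
    within the m servers plus r buffer slots; otherwise the whole
    super-task is rejected."""
    return in_system + batch_size <= m + r

# 10 servers, 2 buffer slots, 10 tasks already in the system:
print(admit(10, 3, 10, 2))   # False: 10 + 3 > 12, whole batch rejected
print(admit(9, 3, 10, 2))    # True: 9 + 3 <= 12
```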
14. Homogenization
Our results indicate that response time is very sensitive to the CoV
of task service times and to the number of tasks in a super-task.
Homogenization
[Figure: a splitter divides the heterogeneous cloud center (M servers)
into two homogeneous sub-centers of M/2 servers each.]
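One way to picture the splitter is routing super-tasks by batch size so that each sub-center sees a narrower batch-size distribution. This is a hypothetical sketch (the threshold and the sample data are invented) showing the per-sub-stream CoV dropping:

```python
import statistics

def cov(xs):
    """Coefficient of variation of a sample (population std / mean)."""
    return statistics.pstdev(xs) / statistics.mean(xs)

def homogenize(batch_sizes, threshold):
    """Route small super-tasks to sub-center 1 and large ones to
    sub-center 2, so each sub-center sees a narrower batch-size
    distribution.  The threshold is an invented tuning knob."""
    sub1 = [b for b in batch_sizes if b <= threshold]
    sub2 = [b for b in batch_sizes if b > threshold]
    return sub1, sub2

batches = [1, 1, 2, 1, 8, 2, 9, 1, 10, 2]   # illustrative batch sizes
sub1, sub2 = homogenize(batches, 4)
# Each sub-stream has a lower CoV than the combined stream:
print(cov(batches), cov(sub1), cov(sub2))
```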
15. Highly Virtualized Environment
[Figure: cloud center architecture. Clients submit super-tasks to a
load-balancing server, which dispatches them to M physical machines
(PM #1 through PM #M). Each PM consists of hardware and a hypervisor
layer hosting m VMs, and is modeled as an M[x]/G/m/m + r queuing
system.]
16. Virtualization Effects
[Figure: performance vs. number of tiles. VMmark score (0 to 12) and
normalized service time (0.0 to 2.0) plotted against the number of
tiles (0 to 14).]
17. Research Contributions
For different values of offered load, number of servers, system
capacity, and distributions of service time and batch size, we obtain:
Distribution of the number of tasks in the system
Distributions of response and waiting times
Blocking probability and probability of immediate service for cloud
centers
We also propose techniques for improving the above performance metrics.
We measure performance metrics for highly virtualized environments.
19. A Fine-Grained Performance Model
Performance models ought to cover a vast parameter space while
remaining tractable.
A monolithic model may suffer from intractability and poor scalability
due to its large number of parameters.
We develop and evaluate tractable functional sub-models and their
interaction model, solving them iteratively.
Each sub-model captures a different servicing step in a complex cloud
center, and the overall solution is obtained by iterating over the
individual sub-model solutions.
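The iterate-until-convergence scheme can be sketched generically. The two toy sub-models below are hypothetical contractions standing in for the actual sub-models, exchanging a vector of quantities (e.g. blocking probabilities) until a fixed point is reached:

```python
def solve_interacting(submodels, x0, tol=1e-9, max_iter=1000):
    """Fixed-point iteration over interacting sub-models: each sub-model
    maps the current vector of exchanged quantities to updated values;
    iterate until successive sweeps agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_new = x
        for f in submodels:
            x_new = f(x_new)
        if all(abs(a - b) < tol for a, b in zip(x_new, x)):
            return x_new
        x = x_new
    raise RuntimeError("sub-model iteration did not converge")

# Two toy sub-models exchanging a pair of probabilities; the fixed
# point of this pair is (0.2, 0.2).
f1 = lambda x: (0.5 * x[1] + 0.1, x[1])
f2 = lambda x: (x[0], 0.5 * x[0] + 0.1)
print(solve_interacting([f1, f2], (0.0, 0.0)))
```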
20. High Level Architecture
[Figure: high-level architecture. Super-tasks arrive at a global FIFO
queue served by the Resource Assigning Module; a super-task may be
blocked due to lack of room in the queue or due to insufficient
capacity. Accepted super-tasks proceed to the VM Provisioning Module,
which maintains FIFO queues at the PMs of the hot, warm and cold
pools; instance creation and VM start-up follow, and the VM run-time
constitutes the actual service time.]
23. Interaction Diagram among Modules
[Figure: interaction among modules. The RASM exchanges blocking
probabilities (BP_q) and pool-selection probabilities (P_h, P_w, P_c)
with the VM provisioning sub-models VMPSM_hot, VMPSM_warm and
VMPSM_cold; an availability model also feeds the performance model.
Performance model outputs: effective task rejection probability and
effective total servicing delay.]
24. Pool Management Module
[Figure: pool management Markov chain. States (i, j, k) track the
number of PMs in the hot, warm and cold pools, ranging from the
initial configuration with no running task to the state in which all
PMs are busy; transitions are labeled SU (start-up), FR_w and FR_c
(involving warm and cold PMs) and RP_i.]
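A minimal pool-manager sketch may make the state space concrete. The transition semantics assumed here are illustrative guesses, not the thesis's definitions: when capacity is needed a warm PM is preferred over a cold one (shorter start-up delay), and a released PM is parked back in the warm pool.

```python
class PoolManager:
    """Sketch of a hot/warm/cold pool manager; the triple
    (hot, warm, cold) corresponds to a state (i, j, k)."""
    def __init__(self, hot, warm, cold):
        self.hot, self.warm, self.cold = hot, warm, cold

    def acquire(self):
        """Move one PM into the hot pool; return False if none is left."""
        if self.warm:
            self.warm -= 1       # bring up a warm PM
        elif self.cold:
            self.cold -= 1       # start up a cold PM (slower in reality)
        else:
            return False         # all PMs are busy
        self.hot += 1
        return True

    def release(self):
        """Return an idle hot PM to the warm pool."""
        if self.hot:
            self.hot -= 1
            self.warm += 1

pm = PoolManager(1, 1, 1)
pm.acquire()                     # warm PM promoted first
print(pm.hot, pm.warm, pm.cold)  # 2 0 1
```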
25. Scalability of Performance Models
Relationship between the size of sub-models and design parameters:

Sub-Model | Design Parameter | Number of States
RASM      | Lq               | f(Lq) = 3*Lq + 1
VMPSM     | m                | f(1) = 3; f(2) = 6; f(m) = 2*f(m-1) - f(m-2) + 1 for m > 2
PMSM      | Nw, Nc, σ        | f(Nw, Nc, σ) = Nw + Nc + 1 + Σ_{s=1..σ} (Nw + Nc - σ - s)
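The state-count formulas can be evaluated directly; this sketch just encodes the table's recurrences:

```python
def rasm_states(Lq):
    """States of the RASM sub-model as a function of queue size Lq."""
    return 3 * Lq + 1

def vmpsm_states(m):
    """States of the VMPSM sub-model from the recurrence
    f(1) = 3, f(2) = 6, f(m) = 2*f(m-1) - f(m-2) + 1 for m > 2."""
    if m == 1:
        return 3
    if m == 2:
        return 6
    f2, f1 = 3, 6
    for _ in range(3, m + 1):
        f2, f1 = f1, 2 * f1 - f2 + 1
    return f1

# The recurrence has constant second difference 1, so the state count
# grows only quadratically in m, which keeps the sub-model tractable:
print([vmpsm_states(m) for m in range(1, 6)])   # [3, 6, 10, 15, 21]
```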
26. Heterogeneous Requests and Resources
PMs differ among pools.
VMs differ in the number of virtual CPUs (vCPUs).
Specification of a typical super-task and PMs in each pool
A heterogeneous request specifies: number of vCPUs per VM, number of
cores per vCPU, RAM (GB), disk (GB), and number of VMs.

Heterogeneous physical machines:

Pool      | PM specification                             | vCPU options
Hot Pool  | 10 cores, 40 GB RAM, 500 GB disk, max 10 VMs | 1, 2, 3 or 4 cores
Warm Pool | 8 cores, 25 GB RAM, 250 GB disk, max 8 VMs   | 1, 2 or 4 cores
Cold Pool | 4 cores, 10 GB RAM, 200 GB disk, max 2 VMs   | 2 or 4 cores
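A feasibility check for matching a heterogeneous request to a PM can be sketched as follows. The packing rule (total vCPU cores must not exceed the PM's cores) and the field names are assumptions for illustration; RAM and disk checks are elided:

```python
def fits(pm, vm_count, vcpus_per_vm, cores_per_vcpu):
    """Can a super-task asking for vm_count VMs, each with vcpus_per_vm
    vCPUs of cores_per_vcpu cores, be hosted on one PM?"""
    if vm_count > pm["max_vms"]:
        return False
    return vm_count * vcpus_per_vm * cores_per_vcpu <= pm["cores"]

hot_pm = {"cores": 10, "max_vms": 10}   # hot-pool PM from the table
print(fits(hot_pm, 2, 2, 2))   # True: 2*2*2 = 8 cores <= 10
print(fits(hot_pm, 3, 2, 2))   # False: 12 cores needed
```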
27. Rejection Probability
[Figures: the rejection probability curve shows stable and transient
regimes; the total delay curve shows stable, transient and unstable
regimes, with the maximum gain and the maximum queuing delay marked
and processing delays dominant.]
28. Research Contributions
Propose and evaluate performance models suitable for large-sized IaaS
clouds.
The models are sufficiently detailed to capture the realistic aspects
of the resource allocation process, instance creation and
instantiation delays of a modern cloud center, while maintaining a
good tradeoff between accuracy and tractability.
Identify the stable, transient and unstable regimes of operation for
given configurations.
Describe the dynamics of server pools and provide the optimum pool
arrangement.
Make capacity planning a less challenging task for cloud providers.
Provide a power consumption strategy.
Introduce heterogeneity for the first time.
29. Conclusion
Summary
Introduction to cloud computing
Monolithic performance models
Basic assumptions
Super-task arrivals
Highly virtualized environments
Interacting performance sub-models
A fine-grained performance model
A pool management schema
Heterogeneous cloud centers
Results and research contributions
Future Research
Extend the heterogeneity
Live migration of VMs