The Impact of Dynamic Monitoring Interval Adjustment on
Power Consumption in Virtualized Data Centers
Mark White BSc., H.Dip
Submitted in accordance with the requirements for the degree of
Master of Science in Computer Science and Information
Technology
Discipline of Information Technology, College of Engineering and Informatics
National University of Ireland, Galway
Research Supervisors: Dr. Hugh Melvin, Dr. Michael Schukat
Research Director: Prof. Gerard Lyons
September 2014
The candidate confirms that the work submitted is his own and that appropriate credit has been given where
reference has been made to the work of others
Contents
Chapter 1 Introduction.................................................................................................................. 1
1.1 The Hybrid Cloud...........................................................................................................1
1.2 Migration...................................................................................................................... 2
1.3 Energy Efficiency............................................................................................................ 2
1.4 Cooling.......................................................................................................................... 4
1.5 Research Objectives.......................................................................................................4
1.5.1 Hypothesis............................................................................................................. 4
1.5.2 CloudSim................................................................................................................ 5
1.5.3 Methodology..........................................................................................................6
1.6 Conclusion..................................................................................................................... 6
Chapter 2 Literature Review..........................................................................................................8
Introduction............................................................................................................................. 8
2.1 Performance versus Power............................................................................................. 8
2.2 Increased Density ..........................................................................................................9
2.3 Hardware.................................................................................................................... 12
2.3.1 Uninterruptible Power Supply (UPS) & Power Distribution...................................... 12
2.3.2 Servers, Storage Devices & Network Equipment..................................................... 13
2.3.3 Cooling................................................................................................................ 13
2.3.4 Industry Standards & Guidelines............................................................................ 15
2.3.5 Three Seminal Papers ........................................................................................... 17
2.4 Software ..................................................................................................................... 24
2.4.1 Virtualization........................................................................................................ 24
2.4.2 Migration............................................................................................................. 25
2.5 Monitoring Interval...................................................................................................... 34
2.5.1 Static Monitoring Interval ..................................................................................... 36
2.5.2 Dynamic Monitoring Interval................................................................................. 37
2.6 Conclusion................................................................................................................... 38
Chapter 3 CloudSim.................................................................................................................... 39
Introduction........................................................................................................................... 39
3.1 Overview..................................................................................................................... 39
3.2 Workload.................................................................................................................... 40
3.3 Capacity...................................................................................................................... 42
3.4 Local Regression / Minimum Migration Time (LR / MMT) ............................................... 44
3.5 Selection Policy – Local Regression(LR)......................................................................... 44
3.6 Allocation Policy – Minimum Migration Time (MMT) ..................................................... 45
3.7 Default LRMMT ........................................................................................................... 45
3.7.1 init() .................................................................................................................. 45
3.7.2 start() .................................................................................................................. 46
3.8 Over-utilization............................................................................................................ 48
3.9 Migration.................................................................................................................... 50
3.10 Reporting.................................................................................................................... 52
3.11 Conclusion................................................................................................................... 52
Chapter 4 Implementation.......................................................................................................... 54
Introduction........................................................................................................................... 54
4.1 Interval Adjustment Algorithm...................................................................................... 54
4.2 Comparable Workloads................................................................................................ 58
4.3 C# Calculator............................................................................................................... 61
4.4 Interval Adjustment Code............................................................................................. 64
4.5 Reporting.................................................................................................................... 69
4.6 Conclusion................................................................................................................... 70
Chapter 5 Tests, Results & Evaluation.......................................................................................... 71
Introduction........................................................................................................................... 71
5.1 Tests & Results ............................................................................................................ 71
5.2 Evaluation of Test Results............................................................................................. 75
5.2.1 Why are the Over-Utilized Hosts / Migration Counts Reduced?............................... 76
5.2.2 Result of Reduced Migration Count ....................................................................... 77
5.2.3 Scalability............................................................................................................. 77
5.3 Evaluation of CloudSim ................................................................................................ 78
5.3.1 Local Regression Sliding Window........................................................................... 78
5.3.2 RAM.................................................................................................................... 79
5.3.3 Dynamic RAM Adjustment .................................................................. 79
5.3.4 SLA-based Migration............................................................................................. 79
Chapter 6 Conclusions ................................................................................................................ 81
REFERENCES............................................................................................................................... 83
APPENDIX A .............................................................................................................................. 89
APPENDIX B................................................................................................................................ 90
List of Figures
Figure 1 Data Center Service Supply Chain.................................................................................... 3
Figure 2 Relative contributions to the thermal output of a typical DC............................................. 12
Figure 3 A Typical AHU Direct Expansion (DX) Cooling System ................................................. 14
Figure 4 A Typical DC Air Flow System...................................................................................... 15
Figure 5 Performance of web server during live migration (C. Clark)............................................. 30
Figure 6 Pre-Copy algorithm ....................................................................................................... 32
Figure 7 CloudSim Architecture .................................................................................................. 41
Figure 8 Flow Chart Depicting the LR / MMT simulation process ................................................. 48
Figure 9 Application of the Monitoring Interval Based on Weighted Utilization Average................ 57
Figure 10 A screenshot of the data generated for calculation of the default workload ...................... 59
Figure 11 Intervals calculated during the dynamic simulation ........................................................ 59
Figure 12 Calculation of the Average CPU Utilization for the Default Files.................................... 60
Figure 13 How the dynamic interval adjustment code interacts with CloudSim............................... 66
Figure 14 Interval Calculation for the Dynamic Simulation ........................................................... 72
Figure 15 VM decommissioning comparison................................................................................ 73
Figure 16 Operational Hosts - Default Simulation......................................................................... 74
Figure 17 Operational Hosts - Dynamic Simulation ...................................................................... 74
Figure 18 Average CPU Utilization - Dynamic Simulation............................................................ 75
Acknowledgements
My supervisor, Dr. Hugh Melvin, who identified at an early stage that (for the most part) I
could be left to my own devices to get on with the work required. His supervisory approach
resulted in the freedom to progress at my own pace knowing he was available as and when I
needed a ‘boost’. When clarity was requested, Hugh demonstrated an enviable ability to
extract the salient issue and point me in the right direction. Although typically performed
cycling up a steep hill on his way home from work, the momentary pauses during review
meetings while he reflected on the issues were often more productive than hours of reading
code. Future students should be so lucky to have him oversee their research endeavours.
My second supervisor, Dr. Michael Schukat, who is capable of clarifying a complicated issue
with a carefully worded question – followed (invariably) with a reassuring smile.
Dr. Ripduman Sohan & Dr. Sherif Akoush in the Computing Laboratory at Cambridge
University without whom I would not have identified the approach taken in this thesis. Over
the course of a few (all too brief) visits with them, I also became aware of the extent of my
intellectual abilities (and limitations!!!).
The principal author of the CloudSim framework, Dr. Anton Beloglazov. Despite his having
moved on from the University of Melbourne where he wrote CloudSim for his doctoral
thesis, his detailed responses to the countless queries I posed during the course of my
research were invaluable and generous to a fault.
My colleagues in the Discipline of IT at NUI Galway for timely coffee-breaks, lunch
invitations and encounters in the corridors – because the breaks are a vital constituent of the
work and the queries as to progress and words of support were more important than you
could possibly have imagined.
My parents, who repeatedly remind me that:
‘You are capable of anything you put your mind to’
Deirdre (Dee) O’Connor – if convention allowed your name would be on the title page!
Abstract
Virtualization is one of the principal data center technologies increasingly deployed in recent
years to meet the challenges of escalating costs, industry standards and the search for a
competitive edge. This thesis presents a novel approach to management of the virtualized
system which dynamically adjusts the monitoring interval with respect to the average CPU
utilization for the data center. The potential for reduced power consumption, by identifying
performance opportunities at an earlier stage than typical virtualized systems which use a
static interval, is analysed. It is proposed that the adjusted interval will result in analysis of
data center metrics being performed at a more appropriate level of granularity than current
static monitoring systems.
Chapter 1 Introduction
The availability of cloud-based Data Centers (DCs) in recent years has introduced significant
opportunities for enterprises to reduce costs. The initial Capital Expenditure (CapEx)
associated with setting up a DC has been prohibitively high in the past, but this may no
longer be the primary concern. For example, start-ups choosing to implement Infrastructure-
as-a-Service (IaaS) cloud architectures are free to focus on optimizing other aspects of the
business rather than worrying about raising the capital to build (and maintain) fully equipped
DCs. A young enterprise can now pay a relatively small monthly fee to Amazon (EC2) or
Microsoft (Azure), for example, in return for a scalable infrastructure on which to build their
new product or service. Existing companies are also availing of significant savings and
opportunities by moving to the cloud.
1.1 The Hybrid Cloud
In the future the architecture of cloud computing infrastructure will facilitate a business
moving the public portion of their services from one remote DC to another for cost or
efficiency gains. For example, a DC provider in one US state may be charging less for
compute time because energy costs in that state are lower than those in a neighbouring state.
Migration of enterprise services to the less expensive location could be facilitated. To enable
this type of migratory activity, the Distributed Management Task Force (DMTF) has created
the Open Virtualization Format (OVF) specification. The OVF standard “provides an
intermediary format for Virtual Machine (VM) images. It lets an organization create a VM
instance on top of one hypervisor and then export it to the OVF so that it can be run by
another hypervisor” [4]. With the exception of Amazon, all the major cloud providers (Citrix
Systems, IBM, Microsoft, Oracle and VMware) are involved in the development of OVF.
The short and medium term solution to the interoperability issue will certainly be
‘hybrid’ clouds where the enterprise maintains the private portion of their infrastructure on
their local network and the public portion is hosted on a federated cloud - facilitating indirect
(but not direct) movement between providers e.g. in a similar fashion to switching broadband
providers, a software development company may initially choose to lease a Microsoft data
center for their infrastructure but subsequently transfer to Google if the latter’s offering
becomes more suitable for their purposes (e.g. closer proximity to client requests or greater energy efficiency).
Development of new products may be performed securely on the enterprise Local
Area Network (LAN) and subsequently ‘released’ onto the public cloud for global
distribution. Movement from one provider to another is currently (and for the foreseeable
future will be) performed manually by the enterprise administrator using separate
management interfaces i.e. an Amazon API or a Microsoft API. The vision of the DMTF is a
unified interface known as Cloud Infrastructure Management Interface (CIMI). It is currently
a work-in-progress but ultimately hopes to facilitate direct transfer of data between cloud
providers.
The core technology upon which this data transfer between providers will be
facilitated is virtualization – most specifically, migration of VMs.
1.2 Migration
The practice of regularly migrating services between providers may well become feasible in
the future, providing enterprises with significant opportunities to dynamically reduce the
energy portion of their Operating Expenditure (OpEx) budget. It would also make operators more competitive: with all performance metrics being equal, the edge could come from superior energy efficiency. Hybrid cloud environments also facilitate
smaller IT teams, resulting in reduced staffing costs.
1.3 Energy Efficiency
Data centers currently account for close to 3% of all global energy consumed on an annual
basis. It is certain that the industry will continue to expand as increasing volumes of data are
generated, transmitted, stored and analysed. This expansion will require significantly more
energy than is currently used by the sector, energy which must be managed as responsibly as
possible. Energy management, however, is not possible without measurement.
The measurement of a DC’s energy efficiency helps staff and management focus on
the various subsystems of the operation with a view to improving the overall efficiency of the
data center. While advances in hardware and software continue apace, the DC industry has
only recently begun to consider the responsibility of ensuring that the energy it uses is not
wasted. The global economic downturn of 2007 played no small part in motivating DC
operators to review their practices. In an attempt to remain competitive, while constantly
upgrading infrastructure and services to meet the needs of their customers, data center
operators have since identified energy efficiency as a cost opportunity. The moral aspects of
managing energy for the future are all well and good. It appears more likely, however, that
the potential operational savings in the short to medium term have provided the primary
motivation for data center operators to take stock.
In addition to the operational savings achieved when the data center becomes more
energy efficient on a daily basis, additional capital savings may also be realized. All items of
IT equipment have a replacement interval which may be increased due to redundancies
discovered during the energy efficiency audit. For example, should the existing cooling
volume of the room be found to be in excess of requirements, additional air handling units
(AHUs) could be switched to standby, not only reducing the power consumed by that unit but
also increasing the interval before the unit needs to be repaired or replaced.
The amount of power and cooling that a DC uses on a day-to-day basis determines
how much irreplaceable fossil fuel it consumes and the quantity of carbon emissions for
which it is responsible.
Figure 1 Data Center Service Supply Chain
Within the supply chain of DC services, illustrated in Figure 1, the main emissions occur at
the power generation site. Location is a key factor for the CO2 intensity of the power
consumed by the data center. A gas- or coal-fired utility creates much more CO2
than a hydro- or wind-powered utility. For this reason, many green-field DCs are now being
located near low-cost, environmentally friendly power sources.
1.4 Cooling
Location is also a key factor with respect to cooling. A data center in a cool climate such as
Ireland requires less cooling power than a data center in a warmer climate such as Mexico.
To avail of climate-related opportunities, large-scale DCs have recently been built in
temperate locations such as Dublin (e.g. Google) and Sweden (e.g. Facebook), demonstrating
the significance of the cost reductions possible. This being the case, if migration of DC
services across Wide Area Networks (WANs) becomes cost-feasible in the future, concepts
such as ‘follow the moon’ / ‘follow the sun’ (where the services provided by a DC are
moved, across the network, closer to where they are most needed throughout the day) may
become prevalent. Migration of data center services across both Local Area Networks
(LANs) and WANs is discussed in more detail in Chapter 2.
1.5 Research Objectives
1.5.1 Hypothesis
While the effort to optimize the individual component costs (e.g. downtime) of a migration is
worthwhile, this research aims to investigate further opportunities for energy savings if,
rather than optimising the individual component costs, a migration is viewed as a single all-encompassing entity and focus is applied to reducing the total number of migrations taking
place in a DC. Throughout a migration both the source and the destination servers are
running. Quite apart from the extra CPU processing, RAM access and bandwidth required to
achieve a migration, there is an additional energy cost associated with simply keeping both
servers simultaneously powered for the duration of the migration. In addition, if the
destination server was not already running when the migration was initiated, the time and energy required to start it up (as a new host machine) must also be factored into any calculation of efficiency.
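As a rough illustration of this total-cost view (the symbols below are introduced purely for exposition and are not drawn from the CloudSim model), the energy attributable to a single migration can be sketched as:

E_{\text{mig}} \approx (P_{\text{src}} + P_{\text{dst}}) \cdot t_{\text{mig}} + E_{\text{copy}} + E_{\text{startup}}

where P_{src} and P_{dst} are the average power draws of the source and destination hosts while the migration is in progress, t_{mig} is the migration duration, E_{copy} captures the additional CPU, RAM and network activity of the copy itself, and E_{startup} is the cost of booting the destination host if it was not already running. Reducing the number of migrations attacks every term at once, rather than optimising any single component.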
The principal metric for monitoring the DC workload is CPU utilization, the CPU being one of the primary resources associated with servicing that workload. In a virtualized environment CPU
utilization is an indication of the processing capacity being used by a host while serving the
requirements of the VMs located on it. In current practice, the CPU utilization value
delivered to monitoring systems is averaged over a constant monitoring interval (e.g. 300 seconds). This interval is typically pre-configured (via a management interface) by the data
center operator, rendering it static. With a relatively small percentage of the host's CPU
concerned with running the virtualization hypervisor, CPU utilization is primarily dependent
on the workload being serviced by the VMs located on the host. This workload typically
varies with time as requests to the servers fluctuate outside the DC. As such, the frequency of
change of the CPU utilization value closely tracks the frequency of change of the incoming
workload.
This thesis investigates the merits of moving from a fixed interval to one which is
dynamically adjusted based on the overall CPU utilization average of the DC. At each
interval a weighted CPU utilization average for the DC is calculated and the next monitoring
interval is adjusted accordingly. By dynamically adjusting the monitoring interval with
respect to the average CPU utilization of the DC, this research analyses the potential for
reduced power consumption through identification of performance opportunities at an earlier
stage than systems which use a static 300 second interval. It is proposed that these
performance opportunities would otherwise have remained hidden mid-interval. Calculated
on the basis of how ‘busy’ the DC currently is, the adjusted interval is more likely to be at an
appropriate level of granularity than its static counterpart.
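A minimal sketch of this idea follows, written in Java to match the CloudSim extensions described in Chapter 4. The class name, smoothing weight and interval bounds are illustrative assumptions rather than the values used in the implementation; powered-off hosts are excluded from the average, as noted in Chapter 2.

public class DynamicIntervalSketch {

    private static final double BASE_INTERVAL = 300.0; // default static interval (seconds)
    private static final double MIN_INTERVAL  = 60.0;  // illustrative lower bound (seconds)
    private static final double SMOOTHING     = 0.7;   // weight given to the newest DC-wide sample

    private double weightedUtilization = 0.0;           // running weighted average, range 0..1

    /** Fold the latest per-host CPU utilization readings into the weighted DC average. */
    public void addSample(double[] hostUtilizations) {
        double sum = 0.0;
        int activeHosts = 0;
        for (double u : hostUtilizations) {
            if (u >= 0.0) {            // powered-off hosts (flagged here as negative) are excluded
                sum += u;
                activeHosts++;
            }
        }
        double dcAverage = (activeHosts == 0) ? 0.0 : sum / activeHosts;
        weightedUtilization = SMOOTHING * dcAverage + (1.0 - SMOOTHING) * weightedUtilization;
    }

    /** The busier the DC, the shorter (finer-grained) the next monitoring interval. */
    public double nextIntervalSeconds() {
        double interval = BASE_INTERVAL * (1.0 - weightedUtilization);
        return Math.max(MIN_INTERVAL, interval);
    }

    public static void main(String[] args) {
        DynamicIntervalSketch sketch = new DynamicIntervalSketch();
        sketch.addSample(new double[] {0.85, 0.60, -1.0, 0.75}); // one host powered off
        System.out.printf("next interval: %.0f s%n", sketch.nextIntervalSeconds());
    }
}

In this sketch a high weighted utilization shortens the next interval (finer-grained monitoring when the DC is busy), while a lightly loaded DC drifts back toward the 300 second default.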
1.5.2 CloudSim
A secondary objective of this research was to examine the efficacy of the CloudSim
framework with respect to simulation of power-aware DCs. Given the lack of access for
researchers to ‘real-world’ data center infrastructure, a robust simulator with which to
experiment is of paramount importance. CloudSim is one such framework and is currently
deployed by many researchers in the field of data center energy efficiency worldwide. It is
discussed in detail in Chapter 3.
The online forums relating to the CloudSim framework are very active with researchers
attempting to establish the best way to achieve their objectives. While the documentation for
the code is extensive (and there are a number of basic examples of how the software can be
used included in the CloudSim framework), there is little explanation of the methodologies used by the original author of the code. As a result, each individual researcher must spend an inordinate amount of time investigating the capabilities (and
limitations) of the framework. This can only be achieved by reviewing many thousands of
lines of code and testing to establish the functionality of each module and method.
Through the course of this research, a number of CloudSim issues were identified
which, it is hoped, will prove useful to future researchers. They are discussed chronologically
(at the point in development when they were identified) and relate to both the framework
code and the accuracy of virtual machine and migration simulation. They are also
summarized in Chapter 5.
1.5.3 Methodology
This thesis uses the CloudSim framework (described in more detail in Chapter 3) as the base
simulator for implementation and testing of the hypothesis. A considerable review of the
existing CloudSim code was required to establish the capabilities of the framework and also
to identify what additional code modules would be needed to meet the thesis objectives.
Ultimately it was found that no facility existed in CloudSim to test the hypothesis and thus a
number of extensions to the existing code were designed and developed. These were then
integrated with the framework such that the default CloudSim simulation could be reliably
compared with the dynamic extension created for this research, i.e. the version implementing dynamic interval adjustment.
1.6 Conclusion
The remainder of this thesis is structured as follows. The literature review in Chapter 2
describes current (and future) efforts to improve energy efficiency in the data center industry.
Both hardware and software approaches are discussed, with a focus on virtualized systems,
installed as standard in all green-field DCs and retro-fitted to the majority of existing brown-
field sites. Chapter 3 details the specific modules in the CloudSim framework required to
build the test bed for analysis of the hypothesis. An explanation as to how these modules
interact with each other is also provided. Chapter 4 specifies the new Java methods written to
create and test the hypothesis. Integration of the new code with the existing framework is also
described. Chapter 5 discusses the tests performed to evaluate the hypothesis and analyses the
results in the context of current energy efficiency efforts in the data center industry. Chapter 6
concludes this thesis with a summary of the limitations identified in the CloudSim
framework, in the hope that the work of future researchers can more effectively benefit from,
and build upon, its code-base.
Chapter 2 Literature Review
Introduction
By adjusting the DC monitoring interval with respect to the incoming workload, this thesis
investigates opportunities for more energy efficient management of data center resources.
Given this objective, extensive examination of the evolution of DC resource management
methods over the last few years was required in an effort to identify an approach which had
not been previously applied.
This thesis is primarily concerned with the energy efficiency of DCs when migrating
VMs across the LAN and WAN. To contextualize the research more completely, the
following literature review extends the introductory discussion in Chapter 1 to encompass the
entire data center infrastructure, analyzing current (and previous) efforts by operators and
academic researchers to reduce power consumption from, not only a software, but also a
hardware perspective. The chapter closes with an in-depth review of existing monitoring
interval approaches and technologies.
2.1 Performance versus Power
Most of the advances achieved by both the DC industry and academic researchers before
2006 paid particular attention to the performance of the infrastructure, with the principal
focus of operator efforts set firmly on keeping the DC available to clients 24/7. In fact, when
advertising and selling the services they offer, operators still choose to feature their ‘uptime’
percentage as their primary Unique Selling Point (USP). The ‘5 nines’ (i.e. 99.999% uptime),
denoting High Availability (HA), are seldom omitted from a typical DC operator’s marketing
material. However, the increase in power consumption required to boost performance seldom
received more attention than summary recognition as an additional expense. The power /
performance trade-off is undoubtedly a difficult hurdle to overcome, especially while cost
competitiveness is uppermost in the minds of data center operators. Invariably, before 2006,
most commercial development efforts to improve the operation of DCs were focussed solely
on performance.
In more recent years, increased consumer demand for faster traffic and larger, more flexible,
storage solutions has changed how the industry views the resources required to operate
competitively. More equipment (e.g. servers, routers) has been required to meet demand but
the space required to accommodate this equipment has already been allocated to existing
equipment. The strategy adopted, since 2006, by a DC industry looking to the future, was to
increase the density of IT equipment rather than the more expensive option of purchasing (or
renting) additional square footage. The solution combined new server technologies and
virtualization.
2.2 Increased Density
An analogy: increasing infrastructural density in a data center is similar to adding more
bedrooms to a house without extending the property. The house can now accommodate
private spaces for more people but each person has less space than before. In the data center
there are now more servers per square foot, resulting in more compute / storage capability.
Despite the space-saving advantages of VM technology and techniques (e.g. migration),
which reduced the number of servers required to host applications, the primary disadvantage
of increased density was that each new blade server required significantly more power than
its predecessor. A standard rack with 65-70 blades operating at high loads might require 20 -
30kW of power compared with previous rack consumptions of 2 - 5kW. This additional
power generates additional heat. In a similar manner to maintaining comfortable levels of
heat and humidity for people in a house, heat in the rack, and resultant heat in the server
room, must be removed to maintain the equipment at a safe operating temperature and
humidity. In summary, the introduction of increased server room density, from 2006 onwards,
resulted in increased power and cooling requirements for modern DCs.
At their 25th Annual Data Center Conference held in Las Vegas in late November
2006, Gartner analysts hypothesized that:
“…by 2008, 50% of current data centers will have insufficient
power and cooling capacity to meet the demands of high-density
equipment…” [1]
During his address to the conference, Gartner Research Vice President, Michael Bell
suggested that: “Although power and cooling challenges will not be a perpetual problem, it is
important for DC managers to focus on the electrical and cooling issue in the near term, and
adopt best practice to mitigate the problem before it results in equipment failure, downtime
and high remediation costs”. This was one of the first ‘shots across the bow’ for a data center
industry which, until then, had been solely focussed on improving performance (e.g. uptime,
response time) with little regard for escalating energy costs.
Based on data provided by IDC [2], Jonathan Koomey published a report [3] in
February 2007 estimating the electricity used by all DCs in both the US and globally for
2005. The executive summary states that:
“The total power demand in 2005 (including associated
infrastructure) is equivalent (in capacity terms) to about five 1000
MW power plants for the U.S. and 14 such plants for the world. The
total electricity bill for operating those servers and associated
infrastructure in 2005 was about $2.7 billion and $7.2 billion for the
U.S. and the world, respectively.”
A few months later the global economic downturn brought with it increasingly restrictive
operating budgets and higher energy prices. The competitive edge was becoming harder to
identify. Quite apart from the economic factors affecting the industry, the timely publication
by the EPA of its report to the US Congress [4] in August 2007 highlighted significant
opportunities to reduce both capital and operating costs by optimizing the power and cooling
infrastructure involved in data center operations. Industry analysts were once again
identifying an escalating power consumption trend which required immediate attention.
The report assessed the principal opportunities for energy efficiency improvements in
US DCs. The process of preparing the report brought all the major industry players together.
In an effort to identify a range of energy efficiency opportunities, 3 main improvement
scenarios were formulated:
1. Improved Operation: maximizes the efficiency of the existing data center
infrastructure by utilizing improvements such as ‘free cooling’ and raising
temperature / humidity set-points. Minimal capital cost (‘the low hanging fruit’) is
incurred by the operator
2. Best Practice: adopts practices and technologies used in the most energy-efficient
facilities
3. State-of-the-art: uses all available energy efficiency practices and technologies
The potential energy savings and associated capital cost calculated for each of the 3 scenarios
respectively were:
1. Improved Operation: 20% saving - least expensive
2. Best Practice: 45% saving
3. State-of-the-art: 55% saving - most expensive
Notably, a proviso was also offered by the report in that: “…due to local constraints, the best
strategy for a particular data center could only be ascertained by means of a site-specific
review - not all suggested scenarios apply to all data centers.” Regardless of which (if any)
subsequent strategy was adopted by a particular data center operator, a site-specific review invariably demonstrated that reducing power consumption was a viable opportunity not only to significantly cut both capital and operating costs but also to regain a competitive edge.
The economic downturn, the Gartner conference and the reports by both the EPA and
Koomey together acted as a catalyst: energy efficiency began to receive a level of attention closer, if not equal, to that given to performance in previous years.
Efficient management of power and cooling, while maintaining performance levels, became
the order of the day.
At the highest level, DC infrastructure can be subdivided into hardware and software.
While it is true that both are inextricably linked to the energy performance of the DC, it is
useful for the purposes of this review to examine them separately.
2.3 Hardware
Rasmussen [5] identified power distribution, conversion losses and cooling as representing
between 30% and 45% of the electricity bill in larger DCs. Cooling alone accounted for 30% of
this total.
Figure 2 Relative contributions to the thermal output of a typical DC
2.3.1 Uninterruptible Power Supply (UPS) & Power Distribution
The power being provided to the IT equipment in the racks is typically routed through an
Uninterruptible Power Supply (UPS) which feeds Power Distribution Units (PDUs) located
in or near the rack. Through use of better components, circuit design and right-sizing
strategies, manufacturers such as American Power Conversion (APC) and Liebert have
turned their attention to maximizing efficiency across the full load spectrum, without
sacrificing redundancy. Some opportunities may exist in efforts to re-balance the load across
the 3 phases supplying the power to the racks but efficiencies in the power supply &
distribution system are outside the scope of this research.
2.3.2 Servers, Storage Devices & Network Equipment
Manufacturers such as IBM and Intel are designing increasingly efficient server blades with
features such as chip-level thermal strategies (Dynamic Voltage & Frequency Scaling
(DVFS)), multicore processors and power management leading the way. Enterprise operators
such as Google and Facebook have recently designed and installed their own servers which
have demonstrated increased efficiencies but these servers are specifically ‘fit-for-purpose’.
They may not be sufficiently generic to be applicable to a majority of DC configurations.
2.3.3 Cooling
There are a variety of standard systems for cooling in data centers but all typically involve
Air Handling Units (AHUs) or Computer Room Air Handlers (CRAHs). Well-designed DCs
have aligned their racks in an alternating hot aisle / cold aisle configuration with cold air from
the AHU(s) entering the cold aisle through perforated or grated tiles above a sub-floor
plenum. Hot air is exhausted from the rear of the racks and removed from the room back to
the same AHU(s) forming a closed-loop system. The hot air is passed directly over an
evaporator (Figure 3: 4) in the AHU which contains a liquid refrigerant (e.g. ethylene glycol /
water solution). The amount of heat absorbed is determined by the speed of the air crossing
the coil and / or the flow rate of the refrigerant through the coil. The flow rate is controlled by
tandem scroll compressors (Figure 3: 1). A dead-band setting is applied to each AHU and is
divided equally between all the compressors in the system. As each dead-band range above
the set-point is reached a compressor will engage to increase the flow rate. As the
temperature returns (down through the dead-band increments) toward the set-point, the
compressors disengage – reducing the flow through the evaporator until the set-point is
reached again. The heat absorbed through the coil is transferred to an array of condensers outside the DC, where it is rejected to the atmosphere or reused in some other part of the
facility. The set point of the AHU is configured on installation of the unit and must (if
deemed appropriate) be changed manually by a member of staff following analysis and
review. Unfortunately these reviews happen all too seldom in typical DCs, despite the
inevitable changes taking place in the server room workload on a daily basis.
Figure 3 A Typical AHU Direct Expansion (DX) Cooling System
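The dead-band staging behaviour described above can be illustrated with a short sketch. The class name, set-point and dead-band values are hypothetical and serve only to illustrate the staging logic; this is not vendor firmware.

public class DeadBandStaging {

    private final double setPoint;      // AHU set-point (degrees C)
    private final double deadBand;      // total dead-band width (degrees C)
    private final int compressorCount;  // tandem scroll compressors available

    public DeadBandStaging(double setPoint, double deadBand, int compressorCount) {
        this.setPoint = setPoint;
        this.deadBand = deadBand;
        this.compressorCount = compressorCount;
    }

    /** Number of compressors that should be engaged for a given return-air temperature. */
    public int compressorsToEngage(double returnAirTemp) {
        if (returnAirTemp <= setPoint) {
            return 0; // at or below the set-point: refrigerant flow stays at minimum
        }
        // The dead-band is split equally between the compressors; each
        // increment above the set-point engages one more compressor.
        double increment = deadBand / compressorCount;
        int stages = (int) Math.ceil((returnAirTemp - setPoint) / increment);
        return Math.min(stages, compressorCount);
    }

    public static void main(String[] args) {
        DeadBandStaging ahu = new DeadBandStaging(22.0, 4.0, 4); // hypothetical values
        for (double t = 21.0; t <= 27.0; t += 1.0) {
            System.out.printf("return air %.1f C -> %d compressor(s)%n",
                    t, ahu.compressorsToEngage(t));
        }
    }
}

With a 4°C dead-band split across four compressors, each 1°C rise in return-air temperature above the set-point engages one additional compressor, and the compressors disengage in the same increments as the temperature falls back toward the set-point.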
Depending on the configuration, the heat removal system can consume up to 50% of
a typical DC’s energy. Industry is currently embracing a number of opportunities involving
temperature and airflow analysis:
1. aisle containment strategies
2. increasing the temperature rise (ΔT) across the rack
3. raising the operating temperature of the AHU(s)
4. repositioning AHU temperature and humidity sensors
5. thermal management by balancing the IT load layout [6, 7]
6. ‘free cooling’ – eliminating the high-consumption chiller from the system through
the use of strategies such as air- and water-side economizers
Figure 4 A Typical DC Air Flow System
In addition to temperature maintenance, the AHUs also vary the humidity of the air entering
the server room according to set-points. Low humidity (dry air) may cause static which has
the potential to short electronic circuits. High levels of moisture in the air may lead to faster
component degradation. Although less of a concern as a result of field experience and recent
studies performed by Intel and others, humidity ranges have been defined for the industry and
should be observed to maximize the lifetime of the IT equipment. Maintaining these humidity ranges increases the interval between equipment replacements and, as a result, has a net positive effect on capital expenditure budgets.
2.3.4 Industry Standards & Guidelines
2.3.4.1 Standards
Power Usage Effectiveness (PUE2) [8] is now the de facto standard used to measure a DC’s
efficiency. It is defined as the ratio of all electricity used by the DC to the electricity used just
by the IT equipment. In contrast to the original PUE [9] rated in kilowatts of power (kW),
PUE2 must be based on the highest measured kilowatt hour (kWh) reading taken during
analysis. In 3 of the 4 PUE2 categories now defined, the readings must span a 12 month
period, eliminating the effect of seasonal fluctuations in ambient temperatures:
PUE = \frac{\text{Total Data Centre Electricity (kWh)}}{\text{IT Equipment Electricity (kWh)}}
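As an illustrative calculation (the figures are hypothetical): a facility that draws 1,200,000 kWh in total over the measurement period while its IT equipment consumes 650,000 kWh reports

PUE = \frac{1{,}200{,}000 \text{ kWh}}{650{,}000 \text{ kWh}} \approx 1.85

which falls within the industry average range quoted below.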
A PUE of 2.0 suggests that for each kWh of IT electricity used another kWh is used by the
infrastructure to supply and support it. The most recent PUE averages [10] for the industry
fall within the range of 1.83 – 1.92 with worst performers coming in at 3.6 and a few top
performers publishing results below 1.1 in recent months. Theoretically, the best possible
PUE is 1.0 but a web-hosting company (Pair Networks) recently quoted a PUE of 0.98 for
one of its DCs in Las Vegas, Nevada. Their calculation was based on receipt of PUE ‘credit’
for contributing unused power (generated on-site) back to the grid. Whether additional PUE
‘credit’ should be allowed for contributing to the electricity grid is debatable. If this were the
case, with sufficient on-site generation, PUE could potentially reach 0.0 and cease to have
meaning. Most DCs are now evaluating their own PUE ratio to identify possible
improvements in their power usage. Lower PUE ratios have become a very marketable aspect
of the data center business and have been recognized as such. Other standards and metrics
(2.3.4.2.1 – 2.3.4.2.4) have been designed for the industry but, due for the most part to the
complex processes required to calculate them, have not as yet experienced the same wide-
spread popularity as PUE and PUE2.
2.3.4.2 Other Standards
2.3.4.2.1 Water Usage Effectiveness (WUE) measures DC water usage to provide an
assessment of the water used on-site for operation of the data center. This includes water used
for humidification and water evaporated on-site for energy production or cooling of the DC
and its support system.
2.3.4.2.2 Carbon Usage Effectiveness (CUE) measures DC-level carbon emissions.
CUE does not cover the emissions associated with the lifecycle of the equipment in the DC or
the building itself.
2.3.4.2.3 The Data Center Productivity (DCP) framework is a collection of metrics
which measure the consumption of a DC-related resource in terms of DC output. DCP looks
to define what a data center accomplishes relative to what it consumes.
2.3.4.2.4 Data Center Compute Efficiency (DCCE) enables data center operators to
determine the efficiency of compute resources. The metric makes it easier for data center
operators to discover unused servers (both physical and virtual) and decommission or
redeploy them.
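For orientation, the commonly cited formulations of the first two of these metrics are:

WUE = \frac{\text{Annual Site Water Usage (litres)}}{\text{IT Equipment Energy (kWh)}}
\qquad
CUE = \frac{\text{Total CO}_2 \text{ Emissions from DC Energy (kgCO}_2\text{eq)}}{\text{IT Equipment Energy (kWh)}}

Both share the IT equipment energy denominator used by PUE, which keeps the family of metrics comparable.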
Efforts to improve efficiency have not, however, been implemented to the extent one might expect. 73% of respondents to a recent Uptime Institute survey [11] stated that
someone outside of the data center (the real estate / facilities department) was responsible for
paying the utility bill. 8% of data center managers weren’t even aware who paid the bill. The
lack of accountability is obvious and problematic. If managers are primarily concerned with
maintaining the DC on a daily basis there is an inevitable lack of incentive to implement even
the most basic energy efficiency strategy in the short to medium term. It is clear that a
paradigm shift is required to advance the cause of energy efficiency monitoring at the ‘C-
level’ (CEO, CFO, CIO) of data center operations.
2.3.4.3 Guidelines
Data center guidelines are intermittently published by The American Society of Heating,
Refrigerating and Air-Conditioning Engineers (ASHRAE). These guidelines [12, 13] suggest
‘allowable’ and ‘recommended’ temperature and humidity ranges within which it is safe to
operate IT equipment. The most recent edition of the guidelines [14] suggests operating
temperatures of 18 – 27°C and a maximum humidity of 60% RH.
One of the more interesting objectives of the recent guidelines is to have the rack inlet
recognized as the position at which temperature and humidity should be measured. The
majority of DCs currently measure at the return inlet to the AHU, despite more relevant
temperature and humidity metrics being present at the inlet to the racks.
2.3.5 Three Seminal Papers
In the context of improving the hardware infrastructure of the DC post-2006, three academic
papers were found to be repeatedly referenced as forming a basis for the work of the most
prominent researchers in the field. They each undertake a similar methodology when
identifying solutions and are considered to have led the way for a significant number of
subsequent research efforts. The methodologies which are common to each of the papers (and
relevant to this thesis) include:
1. Identification of a power consumption opportunity within the DC and adoption of
a software-based solution
2. Demonstration of the absolute requirement for monitoring the DC environment as
accurately as possible without overloading the system with additional processing
Summary review of the three papers follows.
2.3.5.1 Paper 1: Viability of Dynamic Cooling Control in a Data Center
Environment (2006)
In the context of dynamically controlling the cooling system Boucher et al. [15] focused their
efforts on 3 requirements:
1. A distributed sensor network to indicate the local conditions of the data center.
Solution: a network of temperature sensors was installed at:
 Rack inlets
 Rack outlets
 Tile inlets
2. The ability to vary cooling resources locally. Solution: 4 actuation points, which exist
in a typical data center, were identified as having further potential in maintaining
optimal server room conditions:
2.1 CRAC supply temperature – this is the temperature of the conditioned air
entering the room. CRACs are typically operated on the basis of a single
temperature sensor at the return side of the unit. This sensor is responsible for
taking an average of the air temperature returning from the room. The CRAC then
correlates this reading with a set-point which is configured manually by data
center staff. The result of the correlation is the basis upon which the CRAC
decides by how much the temperature of the air sent back out into the room should
be adjusted. Variation is achieved in a Direct Expansion (DX) system with
variable capacity compressors varying the flow of refrigerant across the cooling
coil. In a water-cooled system chilled water supply valves modulate the
temperature.
2.2 The crucial element in the operational equation of the CRAC, regardless of the
system deployed, is the set-point. The set-point is manually set by data center staff
and generally requires considerable analysis of the DC environment before any
adjustment is made. Typically, the set-point is configured (when the CRAC is
initially installed) according to some prediction of the future cooling demand. Due
to a number of factors (including the cost of consultancy), analysis of the room’s thermal dynamics is all too often performed only rarely, if at all. This is
despite the installation of additional IT equipment (and increased workload on the
existing infrastructure) throughout the lifecycle of the data center. Clearly a very
static situation exists in this case.
2.3 CRAC fan speed – the speed at which the fans in the CRAC blow the air into
the room (via a sub-floor plenum). In 2006 (at the time of this paper), typical
CRACs had fans running at a set speed and without further analysis no
reconfiguration took place after installation. Most CRACs since then have been
designed with Variable Speed Drives (VSDs) - which can vary the speed of the
fan according to some set of rules. However, with no dynamic thermal analysis of
the DC environment taking place on a regular basis, the VSD rules are effectively
hardwired into the system. The VSDs are an unused feature of the CRAC as a
result.
2.4 Floor tile openings – the openings of the floor tiles in the cold aisle. The
velocity at which the cold air leaving the CRAC enters the room is dependent
upon a number of factors. Assuming it has passed through the sub-floor plenum
with minimal pressure loss, the air will rise into the room at some velocity (via the
floor tile openings). Floor tiles are either perforated or grated. Perforated tiles
typically have 25% of their surface area open whereas grated tiles may have 40 –
60% of their surface open. The more open surface area available on the tile the
higher the velocity with which the air will enter the room. The authors had
previously designed and implemented a new tile - featuring an electronically
controlled sliding damper mechanism which could vary the size of the opening
according to requirements.
It is evident, then, that as a typical DC matures and the thermodynamics of the
environment change with higher CPU loads and additional IT equipment, the
cooling system should have a dynamic cooling control system to configure it for
continuous maximum efficiency. Boucher et al. propose that this control system
should be based on the 4 available actuation points above.
3. Knowledge of each variable’s effect on the DC environment. Solution: the paper focused on how each of the actuator variables (2.1 – 2.4 above) can affect the thermal dynamics of the data center.
Included in the findings of the study were:
 CRAC supply temperatures have an approximate linear relationship with rack inlet
temperatures. An anomaly was identified where the magnitude of the rack inlet
response to a change in CRAC supply temperature was not of the same order. Further
study was suggested.
 Under-provisioned flow provided by the CRAC fans affects the Supply Heat Index
(SHI*) but overprovisioning has a negligible effect. SHI is a non-dimensional
measure of the local magnitude of hot and cold air mixing. Slower air flow rates cause
an increase in SHI (more mixing) whereas faster air flow rates have little or no effect.
*SHI is also referred to as Heat Density Factor (HDF). The metric is based on the
principle of a thermal multiplier which was formulated by Sharma et al. [16]
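In temperature terms the Sharma et al. formulation is usually written as (stated here in simplified form, with rack-level averaging omitted for brevity):

SHI = \frac{\delta Q}{Q + \delta Q} \approx \frac{T_{\text{rack,in}} - T_{\text{supply}}}{T_{\text{rack,out}} - T_{\text{supply}}}

where Q is the heat added by the rack, δQ is the heat picked up by the cold air (through mixing with hot exhaust) before it reaches the rack inlet, T_supply is the CRAC supply temperature, and T_rack,in and T_rack,out are the rack inlet and outlet temperatures; a higher SHI therefore indicates more mixing.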
The study concluded that significant energy savings (in the order of 70% in this case) were
possible where a dynamic cooling control system, controlled by software, was appropriately
deployed.
2.3.5.2 Paper 2: Impact of Rack-level Compaction on the Data Center Cooling
Ensemble (2008)
Shah et al. [17] deal with the impact on the data center cooling ensemble when the density of
compute power is increased. The cooling ‘ensemble’ is considered to be all elements of the
cooling system from the chip to the cooling tower.
Increasing density involves replacing low-density racks with high-density blade
servers and has been the chosen alternative to purchasing (or renting) additional space for
most DCs in recent years. New enterprise and co-location data centers also implement the
strategy to maximize the available space. Densification leads to increased power dissipation
and corresponding heat flux within the DC environment.
A typical cooling system performs two types of work:
1. Thermodynamic – removes the heat dissipated by the IT equipment
2. Airflow – moves the air through the data center and related systems
The metric chosen by Shah et al. for evaluation in this case is the ‘grand’ Coefficient of
Performance (COPG) which is a development of the original COP metric suggested by Patel
et al. [18, 19]. It measures the amount of heat removed by the cooling infrastructure per unit
of power input and does so at a more granular level than the traditional COP used in
thermodynamics, specifying heat removal at the chip, system, rack, room and facility levels.
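In general terms the underlying metric follows the standard thermodynamic definition, with the ‘grand’ variant aggregating the cooling power drawn at each level of the ensemble (the notation here is a simplified paraphrase rather than the exact formulation of Patel et al.):

COP = \frac{Q_{\text{removed}}}{W_{\text{cooling}}}, \qquad
COP_G = \frac{Q_{\text{DC heat load}}}{\sum_{i \in \{\text{chip, system, rack, room, facility}\}} W_{\text{cooling},i}}

so a higher COP_G means more heat removed per unit of power spent on the cooling ensemble as a whole.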
In order to calculate the COPG of the model used for the test case each component of
the cooling system needed to be evaluated separately, before applying each result to the
overall system. Difficulties arose where system-level data was either simply unavailable or,
due to high heterogeneity, impossible to infer. However, the model was generic enough that it
could be applied to the variety of cooling systems currently being used by ‘real world’ DCs.
Note: in a similar vein, the research for this thesis examines the CPU utilization of
each individual server in the data center such that an overall DC utilization metric at each
interval can be calculated. Servers which are powered-off at the time of monitoring have no
effect on the result and are excluded from the calculation.
The assumption that increased density leads to less efficiency in the cooling system is
incorrect. If elements of the cooling system were previously running at low loads they would
typically have been operating at sub-optimal efficiency levels. Increasing the load on a
cooling system may in fact increase its overall efficiency through improved operational
efficiencies in one or more of its subsystems.
For Shah’s study, 94 existing low-density racks were replaced with high-density Hewlett-Packard (HP) blades. The heat load increased from 1.9MW to 4.7MW. The new heat load was
still within the acceptable range for the existing cooling infrastructure. No modifications to
the ensemble were required.
Upon analysis of the results, COPG was found to have increased by 15%. This was, in part,
achieved with improved efficiencies in the compressor system of the CRACs. While it is
acknowledged that there is a crossover point at which compressors become less efficient, the
increase in heat flux of the test model resulted in raising the work of the compressor to a
point somewhere below this crossover. The improvement in compressor efficiency was
attributed to the higher density HP blade servers operating at a higher ΔT (reduced flow rates)
across the rack. The burden on the cooling ensemble was reduced - resulting in a higher
COPG.
With the largest individual source of DC power consumption (about 40% in this case)
typically coming from the CRAC - which contains the compressor - it makes sense to direct
an intelligent analysis of potential operational efficiencies at that particular part of the system.
The paper states that: “The continuously changing nature of the heat load distribution
in the room makes optimization of the layout challenging; therefore, to compensate for
recirculation effects, the CRAC units may be required to operate at higher speeds and lower
supply temperature than necessary. Utilization of a dynamically coupled thermal solution,
which modulates the CRAC operating points based on sensed heat load, can help reduce this
load”.
In this paper Shah et al. present a model for performing evaluation of the cooling
ensemble using COPG, filling a knowledge gap through detailed experimentation with
measurements across the entire system. They conclude that energy efficiencies are possible
via increased COP in one or more of the cooling infrastructure components. Where thermal
management strategies capable of handling increased density are in place, there is significant
motivation to increase density without any adverse impact on energy efficiency.
2.3.5.3 Paper 3: Data Center Efficiency with Higher Ambient Temperatures and
Optimized Cooling Control (2011)
Ahuja et al. [20] introduce the concept of ‘deviation from design intent’. When a data center
is first outfitted with a cooling system, best estimates are calculated for future use. The
intended use of the DC in the future is almost impossible to predict at this stage. As the
lifecycle of the DC matures, the IT equipment will deviate from the best estimates upon
which the cooling system was originally designed to operate. Without on-going analysis of
the DC’s thermal dynamics, the cooling system may become decreasingly ‘fit-for-purpose’.
As a possible solution to this deviation from intent, this paper proposes that cooling of
the DC environment should be controlled from the chip rather than a set of remote sensors in
the room or on the rack doors. Each new IT component would have chip-based sensing
already installed and therefore facilitate a “plug ‘n’ play” cooling system.
Intel processors since the Intel® Pentium® M feature an ‘on-die’ Digital Thermal
Sensor (DTS). DTS provides the temperature of the processor and
makes the result available for reading via Model Specific Registers (MSRs). The Intel white
paper [21] which describes DTS states that:
“… applications that are more concerned about power consumption
can use thermal information to implement intelligent power
management schemes to reduce consumption.”
While Intel is referring to power management of the server itself, DTS could
theoretically be extended to the cooling management system also.
Current DCs control the air temperature and flow rate from the chip to the chassis but
there is a lack of integration once the air has left the chassis. If the purpose of the data center
is to house, power and cool every chip then it has the same goal as the chassis and the chassis
is already taking its control data from the chip. This strategy needs to be extended to the
wider server room environment in an integrated manner.
The industry has recently been experimenting with positioning the cooling sensors at
the front of the rack rather than at the return inlet of the AHU. The motivation for this is to
sense the air temperature which matters most – the air which the IT equipment uses for
cooling. The disadvantage of these remote sensors (despite being better placed than sensors at
the AHU return inlet) is that they are statically positioned, a position which may later be
incorrect should changes in the thermal dynamics of the environment occur. The closer to the
server one senses, the more reliable the sensed data will be for thermal control purposes.
Ahuja et al. propose that the logical conclusion is to move the sensors even closer to the
server – in fact, right into the processor. If those sensors already exist (as is the case with the
Intel processors) then use should be made of them for a more accurate cooling management
system.
The paper investigates the possible gains by moving the temperature sensors (and
changing the set-point accordingly) to a variety of positions in the DC:
1. AHU return – 28°C
2. AHU supply – 18°C
3. Rack inlet – 23°C
4. Server – 30°C
The first test was carried out on a single isolated rack with those results then
extrapolated to a DC model with a cooling capacity of 100 kW. Four perimeter down-flow AHUs
(N + 1 redundancy) performed the heat removal. While the 4 rows in the DC were not
contained, they did follow the standard hot / cold aisle arrangement. The tests showed that use
of the server sensors resulted in more servers being maintained within the ASHRAE
guideline temperature range of 18–27°C. Controlling the cooling system at the server
yielded maximum benefit.
Ahuja et al. concluded that a processor-based set of metrics capable of controlling a
power management scheme on the server should, by extension, also be capable of controlling
a dynamic cooling control system outside the rack. If every server in a DC were intermittently
reporting its operating temperature (and air flow) to a cooling control system, the cooling
system would be operating on a more robust data set (i.e. more accurate readings), delivering
greater energy efficiency savings than were possible with previous DC configurations.
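As an illustration of the approach Ahuja et al. envisage, the following minimal Java sketch shows a server periodically reading an on-die temperature value and reporting it to a cooling controller. The CoolingController interface, the readDieTemperature() helper and the 30-second reporting period are hypothetical placeholders introduced only for this sketch; real DTS values are exposed via Model Specific Registers and would require platform-specific access.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a server reports its processor temperature to a cooling controller.
public class ServerTemperatureReporter {

    // Placeholder for whatever system the DC operator uses to modulate AHU / CRAC set-points.
    interface CoolingController {
        void report(String serverId, double dieTemperatureCelsius);
    }

    private final String serverId;
    private final CoolingController controller;
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public ServerTemperatureReporter(String serverId, CoolingController controller) {
        this.serverId = serverId;
        this.controller = controller;
    }

    // Placeholder: a real implementation would read the DTS value via an MSR or a vendor API.
    private double readDieTemperature() {
        return 55.0;
    }

    public void start() {
        // Intermittent reporting, as proposed by Ahuja et al.; the 30-second period is arbitrary.
        scheduler.scheduleAtFixedRate(
                () -> controller.report(serverId, readDieTemperature()),
                0, 30, TimeUnit.SECONDS);
    }
}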
2.4 Software
2.4.1 Virtualization
In a virtualized data center, multiple Virtual Machines (VMs) are typically co-located on a
single physical server, sharing the processing capacity of the server's CPU between them.
When, for example, increased demands on the CPU result in reduced performance of one of
the VMs to the point where a Service Level Agreement (SLA) may be violated, virtualization
technology facilitates a migration. Migration relocates the services being provided by the VM
on this 'over-utilized' host to a similar VM on another physical server, where sufficient
capacity (e.g. CPU) is available to maintain SLA performance.
Conversely, reduced demand on the CPU of a host introduces opportunities for server
consolidation, the objective of which is to minimize the number of operational servers
consuming power. The remaining VMs on an 'under-utilized' host are migrated so that the
host can be switched off, saving power. Server consolidation provides significant energy
efficiency opportunities.
There are numerous resource allocation schemes for managing VMs in a data center,
all of which involve the migration of a VM from one host to another to achieve one objective,
or a combination of objectives. Primarily these objectives will involve either increased
performance or reduced energy consumption - the former, until recently, receiving more of
the operator’s time and effort than the latter.
In particular, SLA@SOI has completed extensive research in recent years in the area
of SLA-focused (e.g. CPU, memory, location, isolation, hardware redundancy level) VM
allocation and re-provisioning [22]. The underlying concept is that VMs are assigned to the
most appropriate hosts in the DC according to both service level and power consumption
objectives. Interestingly, Hyser et al. [23] suggest that a provisioning scheme which also
includes energy constraints may choose to violate user-based SLAs ‘if the financial penalty
for doing so was [sic] less than the cost of the power required to meet the agreement’. In a
cost-driven DC it is clear that some trade-off (between meeting energy objectives and
compliance with strict user-based SLAs e.g. application response times) is required. A similar
power / performance trade-off may be required to maximize the energy efficiency of a host-
level migration.
2.4.2 Migration
The principal underlying technology which facilitates management of workload in a DC is
virtualization. Rather than each server hosting a single operating system (or application),
virtualization facilitates a number of VMs being hosted on a single physical server, each of
which may run a different operating system (or even different versions of the same operating
system). These VMs may be re-located (migrated) to a different host on the LAN for a
variety of reasons:
 Maintenance
Servers intermittently need to be removed from the network for maintenance. The
applications running on these servers may need to be kept running during the
maintenance period so they are migrated to other servers for the duration.
 Consolidation
In a virtualized DC some of the servers may be running at (or close to) idle – using
expensive power to maintain a machine which is effectively not being used to
capacity. To conserve power, resource allocation software moves the applications on
the under-utilized machine to a ‘busier’ machine - as long as the latter has the
required overhead to host the applications. The under-utilized machine can then be
switched off – saving on power and cooling.
 Energy Efficiency
Hotspots regularly occur in the server room i.e. the cooling system is working too
hard in the effort to eliminate the exhaust air from a certain area. The particular
workload which is causing the problem can be identified and relocated to a cooler
region in the DC to relieve the pressure in the overheated area.
Virtual Machines may also be migrated to servers beyond the LAN (i.e. across the Wide Area
Network (WAN)):
 Follow the sun - minimize network latency during office hours by placing VMs close
to where their applications are requested most often
 Where latency is not a primary concern there are a number of different strategies
which may apply:
 Availability of renewable energy / improved energy mix
 Less expensive cooling overhead (e.g. ‘free’ cooling in more temperate / cooler
climates)
 Follow the moon (less expensive electricity at night)
 Fluctuating electricity prices on the open market [24]
 Disaster Recovery (DR)
 Maintenance / Fault tolerance
 Bursting i.e. temporary provisioning of additional resources
 Backup / Mirroring
Regardless of the motivation, migration of virtual machines both within the DC and also to
other DCs (in the cloud or within the enterprise network) not only extends the opportunity for
significant cost savings but may also provide faster application response times if located
closer to clients. To maintain uptime and response Service Level Agreement (SLA)
parameters of 99.999% (or higher), these migrations must be performed ‘hot’ or ‘live’,
keeping the application available to users while the virtual machine hosting the application
(and associated data) is moved to the destination server. Once all the data has been migrated,
requests coming into the source VM are redirected to the new machine and the source VM
can be switched off or re-allocated. The most popular algorithm by which virtual machines
are migrated is known as pre-copy and is deployed by both Citrix and VMWare – currently
considered to be the global leaders in software solutions for migration and virtualized
systems. A variety of live migration algorithms have been developed in the years since 2007.
Some are listed below:
1. Pre-copy [25]
2. GA for Renewable Energy Placement [26]
3. pMapper: Power Aware Migration [27]
4. De-duplication, Smart Stop & Copy, Page Deltas & CBR (Content Based
Replication) [28]
5. Layer 3: IP LightPath [29]
6. Adaptive Memory Compression [30]
7. Parallel Data Compression [31]
8. Adaptive Pre-paging and Dynamic Self-ballooning [32]
9. Replication and Scheduling [33]
10. Reinforcement Learning [34]
11. Trace & Replay [35]
12. Distributed Replicated Block Device (DRBD) [36]
The LAN-based migration algorithm used by the Amazon EC2 virtualization hypervisor
product (Citrix XenMotion) is primarily based on pre-copy but also integrates some aspects
of the algorithms listed above. It serves as a good example of the live migration process. It is
discussed in the following section.
2.4.2.1 Citrix XenMotion Live Migration
The virtual machine on the source (or current) machine keeps running while transferring its
state to the destination. A helper thread iteratively copies the state needed while both end-
points keep evolving. The number of iterations determines the duration of live migration. As
a last step, a stop-and-copy approach is used. Its duration is referred to as downtime. All
implementations of live migration use heuristics to determine when to switch from iterating
to stop-and-copy.
Pre-copy starts by copying the whole source VM state to the destination system.
While copying, the source system keeps responding to client requests. As memory pages may
get updated (‘dirtied’) on the source system (Dirty Page Rate), even after they have been
copied to the destination system, the approach employs mechanisms to monitor page updates.
The performance of live VM migration is usually defined in terms of migration time
and system downtime. All existing techniques control migration time by limiting the rate of
memory transfers while system downtime is determined by how much state has been
transferred during the ‘live’ process. Minimizing both of these metrics is correlated with
optimal VM migration performance and it is achieved using open-loop control techniques.
With open-loop control, the VM administrator manually sets configuration parameters for the
migration service thread, hoping that these conditions can be met. The input parameters are a
limit to the network bandwidth allowed to the migration thread and the acceptable downtime
for the last iteration of the migration. Setting a low bandwidth limit while ignoring page
modification rates can result in a backlog of pages to migrate and prolong migration. Setting
a high bandwidth limit can affect the performance of running applications. Checking the
estimated downtime to transfer the backlogged pages against the desired downtime can keep
the algorithm iterating indefinitely. Approaches that impose limits on the number of iterations
or statically increase the allowed downtime can render live migration equivalent to pure
stop-and-copy migration.
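To make the open-loop behaviour described above concrete, the following simplified Java sketch shows the basic structure of an iterative pre-copy loop driven by the two operator-supplied parameters (a bandwidth limit and an acceptable downtime). The class, method and parameter names are illustrative assumptions for this sketch; it is not the Citrix or VMware implementation.

// Simplified sketch of iterative pre-copy with open-loop parameters (illustrative only).
public class PreCopyMigration {

    private final double bandwidthLimitMbitPerSec;   // operator-chosen cap for the migration thread
    private final double acceptableDowntimeSec;      // operator-chosen target for the final stop-and-copy
    private final int maxIterations = 30;            // guard against iterating indefinitely

    public PreCopyMigration(double bandwidthLimitMbitPerSec, double acceptableDowntimeSec) {
        this.bandwidthLimitMbitPerSec = bandwidthLimitMbitPerSec;
        this.acceptableDowntimeSec = acceptableDowntimeSec;
    }

    public void migrate(VmMemory memory) {
        // 1st iteration: send the whole state while the VM keeps running
        memory.transferAllPages(bandwidthLimitMbitPerSec);
        int iteration = 1;
        while (iteration < maxIterations) {
            // pages dirtied during the previous transfer form the backlog for this round
            double backlogMbits = memory.dirtySizeMbits();
            double estimatedDowntimeSec = backlogMbits / bandwidthLimitMbitPerSec;
            // heuristic: switch to stop-and-copy once the backlog fits within the target downtime
            if (estimatedDowntimeSec <= acceptableDowntimeSec) {
                break;
            }
            memory.transferDirtyPages(bandwidthLimitMbitPerSec);
            iteration++;
        }
        memory.stopVmAndCopyRemainder();   // downtime occurs here
    }

    // Placeholder abstraction of the VM memory being migrated.
    interface VmMemory {
        void transferAllPages(double bandwidthMbitPerSec);
        double dirtySizeMbits();
        void transferDirtyPages(double bandwidthMbitPerSec);
        void stopVmAndCopyRemainder();
    }
}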
2.4.2.2 Wide Area Network Migration
With WAN transmissions becoming increasingly feasible and affordable, live migration of
larger data volumes over significantly longer distances is becoming a realistic possibility [37,
38]. As a result, the existing algorithms, which have been refined for LAN migration, will be
required to perform the same functionality over the WAN. However, a number of constraints
present themselves when considering long distance migration of virtual machines. The
constraints unique to WAN migration are:
 Bandwidth (I/O throughput – lower over WANs)
 Latency (distance to destination VM – further on WANs)
 Disk Storage (transfer of SAN / NAS data associated with the applications running on
the source VM to the destination VM)
Bandwidth (and latency) becomes an increasingly pertinent issue during WAN migration
because of the volume of data being transmitted across the network. In the time it takes to
transmit a single iteration of pre-copy memory to the destination, there is an increased chance
(relative to LAN migration) that the same memory may have been re-written at the source.
The rate at which memory is rewritten is known as the Page Dirty Rate (PDR) - calculated by
dividing the number of pages dirtied in the last round by the time the last round took (Mbits /
sec). This normalizes PDR for comparison with bandwidth. Xen implements variable
bandwidth during the pre-copy phase based on this comparison. There are two main categories
of PDR to consider during live migration (a code sketch classifying them follows the list below):
1. Low / Typical PDR: Memory is being re-written slower than the rate at which those
changes can be transmitted to the destination i.e. PDR < Migration bandwidth
2. Diabolical PDR (DPDR): The rate at which memory is being re-written at the source
VM exceeds the rate at which that re-written memory can be migrated ‘live’ to the
destination (PDR > Migration bandwidth). As a result the pre-copy phase may not
converge at all: the PDR floods I/O and iterative pre-copy must be stopped, with all
remaining pages then transferred to the destination in a single stop-and-copy step. The
result is a longer downtime (while the pages are transferred), potential SLA violations
and, most notably for the purposes of this research, increased power consumption
while both hosts are running concurrently.
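A minimal sketch of this classification is given below: the PDR of the last round is normalized to Mbit/s, as described above, and compared with the migration bandwidth to decide whether the load is typical or diabolical. The class and parameter names are illustrative.

// Sketch: classify the Page Dirty Rate of the previous pre-copy round (illustrative names).
public final class PdrClassifier {

    public enum PdrCategory { TYPICAL, DIABOLICAL }

    // pagesDirtied: pages re-written during the last round; pageSizeBits: page size in bits;
    // roundDurationSec: how long the last round took; migrationBandwidthMbit: available Mbit/s.
    public static PdrCategory classify(long pagesDirtied, long pageSizeBits,
                                       double roundDurationSec, double migrationBandwidthMbit) {
        // normalize the PDR to Mbit/s so that it is directly comparable with the bandwidth
        double pdrMbitPerSec = (pagesDirtied * pageSizeBits) / (roundDurationSec * 1_000_000.0);
        return pdrMbitPerSec < migrationBandwidthMbit ? PdrCategory.TYPICAL : PdrCategory.DIABOLICAL;
    }
}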
2.4.2.3 PDR Analysis and Compression of Transmitted Pages
Current algorithms send the entire VM state on the 1st iteration (Figure 5: 62 seconds). To
reduce the time spent on the 1st iteration, pages frequently ‘dirtied’ should be identified
before the 1st iteration - the objective being to hold back these pages until the final iteration
(reducing the number of pages resent during iterative pre-copy) or at least hold them back
until some analysis calculates that they are ‘unlikely’ (with some confidence interval) to be
dirtied again. There is a reasonable assumption that there will be multiple iterations in a high
PDR environment – in the (rare) case where a VM has no dirty pages, only a single iteration
would be required to transfer the entire state. Pre-migration analysis would not be continuous
(due to the CPU overhead) but should begin at some short interval before the migration takes
place i.e. just after the decision to migrate has been made.
Figure 5 Performance of web server during live migration (C. Clark)
With a pre-migration analysis phase the time required for the 1st iteration will be reduced.
There may be an argument that downtime is increased due to the additional pages held back
during the first iteration. High PDR pages - which have not been sent in the 1st iteration –
would likely be identified in the 2nd (or subsequent) iterations anyway – resulting in a very
similar Writable Working Set (WWS) on the final iteration. In low PDR environments
research suggests that the WWS in the majority of cases is a small proportion of the entire
data set (perhaps approximately 10%) which needs to be transferred – resulting in minimal
iterations being required before a stop condition is reached i.e. subsequent iterations would
yield diminishing returns. This is not the case where an application may be memory intensive
i.e. the PDR is diabolical and floods the I/O rate.
Conversely, if the WWS is so small, is identifying it at the pre-iterative
stage worth the effort? If the algorithm can be applied to diabolical environments as well as
acceptable PDR environments then the answer is yes – the effort is worth it. There is an
inevitable trade-off between the time (and CPU overhead) required to identify the WWS on
each iteration and the resulting time saved during iterative pre-copy due to less pages being
transferred. However, identifying a minimal WWS will intrinsically save time.
Finding the ‘threshold’ (current research suggests a simple high/low threshold) is an
interesting research challenge! A bitmap indicating the Page Dirty Count is required to keep
track of pages being repeatedly dirtied. A count however is probably too simplistic. Would an
upper / lower bounded threshold be more applicable? A bounded threshold would ‘hold’ the
pages which are above the lower threshold boundary but below the upper threshold boundary
i.e. deemed least likely to be dirtied again. Boundary calculation should include a confidence
interval - to minimize the un-synced pages before the final iteration occurs. These categorized
‘hold’ pages might be held until the next iteration and if they are found to still have a ‘hold’
status (fall between the upper and lower threshold boundaries) they are then transferred. With
successive iterations more is known about recent PDR patterns. Analysis of these should
theoretically yield boundary calculations which are more accurate as a result.
Note: An additional parallel check, before the nth iteration takes place, of all the pages
which were transmitted from the threshold area would identify those pages which have been
subsequently dirtied. The compressed deltas of these pages would be re-transmitted in the
final iteration – along with those that were still above the upper threshold. The success of the
new algorithm could be judged on the percentage error at this stage i.e. how many pages were
sent from the ‘hold’ area but subsequently dirtied?
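A sketch of the bounded-threshold idea described above is shown below. Pages whose dirty counts fall between a lower and an upper bound are ‘held’ until a later iteration; everything else is sent in the current iteration. The data structure, bounds and method names are assumptions used only to illustrate the idea, not part of any existing migration algorithm.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch: bounded-threshold selection of pages to hold back from the current iteration.
public final class BoundedThresholdSelector {

    // dirtyCounts: per-page count of how often each page has been dirtied so far.
    // Pages whose counts fall between lowerBound and upperBound are deemed least likely
    // to be dirtied again and are held back; pages outside the band are transmitted now.
    public static List<Long> pagesToHold(Map<Long, Integer> dirtyCounts,
                                         int lowerBound, int upperBound) {
        List<Long> hold = new ArrayList<>();
        for (Map.Entry<Long, Integer> entry : dirtyCounts.entrySet()) {
            int count = entry.getValue();
            if (count > lowerBound && count < upperBound) {
                hold.add(entry.getKey());   // page number to be held until the next iteration
            }
        }
        return hold;
    }
}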
2.4.2.4 Parallel Identification of Dirty Pages and Multi-Threaded Adaptive
Memory Compression
In addition to the pre-migration analysis stage it may also be useful to examine the potential
of parallel dirty page identification and compression. In Figure 6 the blue area is when dirty
pages are identified for the next round and delta compression takes place. However, in the
time this phase is taking place more pages will be dirtied. If the same interval was moved
back (to be in parallel with the previous data transfer) would more pages be dirtied? The
answer appears to be no i.e. the PDR is independent of the process which actually calculates
it. The benefit of this parallelism is that the algorithm is ready to move immediately to
transfer n + 1 when transfer n has completed – reducing the iterative pre-copy time by
eliminating the blue interval in Figure 6.
It is probable that some overlap may be optimal rather than full parallelism. Time–
series analysis of dirtying patterns during the previous transfer interval might yield an
optimal overlap i.e. the best time to start identifying the new dirty pages, rather than waiting
until the transfer has completed. It would also be beneficial to investigate further whether, as
the number of dirty pages reduces with subsequent iterations, the time required to identify
(and compress) the dirty page deltas could also be reduced (research suggests cache access
times remain relatively constant). If this were true then the inner overlap could be sent deeper
back into the transfer time reducing the outer overlap further. Additionally, multi-threaded
compression would yield further reductions in the overlap interval.
Figure 6 Pre-Copy algorithm
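The overlap described in the preceding paragraphs could be sketched as follows: while the pages of round n are still being transferred, a second thread begins identifying (and delta-compressing) the pages dirtied for round n + 1. The classes, method parameters and the point at which the identification thread is started are assumptions used only to illustrate the idea.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: overlap dirty-page identification/compression with the ongoing transfer (illustrative).
public class OverlappedPreCopyRound {

    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    public void runRound(Runnable transferRoundN, Runnable identifyAndCompressRoundNPlus1)
            throws Exception {
        // start the transfer of round n
        Future<?> transfer = pool.submit(transferRoundN);
        // in parallel (or after an optimal overlap delay derived from time-series analysis),
        // identify and compress the dirty-page deltas that will form round n + 1
        Future<?> identification = pool.submit(identifyAndCompressRoundNPlus1);
        // the next round can begin as soon as both tasks have completed,
        // eliminating the serial identification interval shown in blue in Figure 6
        transfer.get();
        identification.get();
    }
}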
2.4.2.5 Throttling
The critical issue in high PDR environments is that the possibility of convergence is reduced
(if not eliminated altogether). It is similar to a funnel filling up too quickly. If the PDR
continues at a high rate the funnel will eventually overflow, resulting in service timeouts i.e.
the application will not respond to subsequent requests or response times will be significantly
increased. The current solution is to abandon pre-copy migration, stop the VM and transfer
all memory i.e. empty the funnel. Unfortunately, in the time it takes to empty the funnel,
more pages have been dirtied because requests to the application do not stop. This may
actually prohibit the migration altogether because the downtime is such that an unacceptable
level of SLA violations occur.
If, however, the application’s response thread can be artificially slowed down
(throttled) intermittently, then the funnel is given a better chance to empty its
current contents. This would be analogous to temporarily decreasing the flow from the tap to
reduce the volume in the funnel.
Previous solutions suggested that slowing response time to requests (known as
Dynamic Rate Limiting) would alter the rate at which I/O throughput was performed but
results proved that detrimental VM degradation tended to occur. In addition, other processes
on the same physical machine were negatively affected. Dedicated migration switches were
required to divert the additional load from the core. The focus was on the I/O throughput as
opposed to the incoming workload (PDR).
How the PDR could be intermittently throttled without adverse degradation of either
the VM in question, or other machine processes, is the central question.
Successful PDR throttling, in conjunction with threshold calculations and optimized
parallel adaptive memory compression / dirty page identification, would achieve a lower
PDR. However, the issue of PDR can be essentially circumvented if the number of migrations
taking place in a DC as a whole can be reduced.
In the majority of typical PDR environments Clark et al. [39] have shown that the
initial number of dirty pages i.e. the Writable Working Set (WWS), is a small proportion of
the entire page set (perhaps 10% or less) which needs to be transferred, typically resulting in
minimal iterations being required before a stop condition is reached i.e. subsequent iterations
would yield diminishing returns. This is not the case where an application may be particularly
memory-intensive i.e. the PDR is diabolical.
Degradation of application performance during live migration (due to DPDRs or for
other reasons) results in increased response times, threatening violation of SLAs and
increasing power consumption. For optimization of migration algorithms with DPDRs there
are 2 possible approaches for solving the DPDR problem:
1. Increase bandwidth
2. Decrease PDR
Typical applications only exhibit this DPDR-like behaviour as spikes or outliers in normal
write activity. Live migration was previously abandoned by commercial algorithms when
DPDRs were encountered. However, in its most recent version of vSphere (5.0), VMWare
has included an enhancement called ‘Stun During Page Send’ (SDPS) [40] which guarantees
that the migration will continue despite experiencing a DPDR (VMWare refers to DPDRs as
‘pathological’ loads). Tracking both the transmission rate and the PDR, a diabolical PDR can
be identified. When a DPDR is identified by VMWare, the response time of the virtual
machine is slowed down (‘stunned’) by introducing microsecond delays (sleep processes) to
the vCPU. This slows the VM’s responses to application requests and thus reduces the PDR to
less than the migration bandwidth, ensuring convergence (PDR < bandwidth).
Xen implements a simple equivalent – limiting ‘rogue’ processes (other applications
or services running parallel to the migration) to 40 write faults before putting them on a wait
queue.
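The stun mechanism can be illustrated with a minimal sketch: whenever the measured PDR exceeds the migration bandwidth, short sleeps are injected into the thread servicing application requests so that the dirtying rate falls back below the transfer rate. The microsecond delay value and all names below are illustrative; this is not VMware’s or Xen’s implementation.

import java.util.concurrent.TimeUnit;

// Sketch of 'stun during page send'-style throttling (illustrative, not a vendor implementation).
public class PdrThrottle {

    private final long stunMicros;   // length of each injected delay

    public PdrThrottle(long stunMicros) {
        this.stunMicros = stunMicros;
    }

    // Called on the thread servicing application requests (i.e. the thread dirtying memory).
    public void maybeStun(double pdrMbitPerSec, double migrationBandwidthMbit)
            throws InterruptedException {
        if (pdrMbitPerSec >= migrationBandwidthMbit) {
            // slow the response thread so that the PDR drops below the migration bandwidth,
            // giving the pre-copy phase a chance to converge
            TimeUnit.MICROSECONDS.sleep(stunMicros);
        }
    }
}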
2.5 Monitoring Interval
Much effort has been applied to optimizing the live migration process in recent years. During
migration, the primary factors impacting on a VM’s response SLA are the migration time
and, perhaps more importantly, the downtime. These are the metrics which define the
efficiency of a migration. If a DC operator intends to migrate the VM(s) hosting a client
application it must factor these constraints into its SLA guarantee. It is clear that every
possible effort should be made to minimize the migration time (and downtime) - so that the
best possible SLAs may be offered to clients. This can only be achieved by choosing the VM
with the lowest potential PDR for each migration.
However, response and uptime SLAs become increasingly difficult to maintain if
reduction (or at least minimization) of power consumption is a primary objective because
each migration taking place consumes additional energy (while both servers are running and
processing cycles, RAM, bandwidth are being consumed). Based on this premise, Voorsluys
et al. [41] evaluate the cost of live migration, demonstrating that DC power consumption can
be reduced if there is a reduction in migrations. The cost of a migration (as shown) is
dependent on a number of factors, including the amount of RAM being used by the source
VM (which needs to be transferred to the destination) and the bandwidth available for the
migration. The higher the bandwidth the faster data can be transferred. Additionally, power
consumption is increased because 2 VMs (source and destination) are running concurrently
for much of the migration process.
In order to reduce the migration count in a DC each migration should be performed
under the strict condition that the destination host is chosen such that power consumption in
the DC is minimized post-migration. This can only be achieved by examining all possible
destinations before each migration begins - to identify the optimal destination host for each
migrating VM from a power consumption point-of-view. The critical algorithm for resource
(VM) management is the placement algorithm.
These two conditions, i.e.
1. Migrate the VM with the lowest Page Dirty Rate
2. Choose the destination host for minimal power consumption post-migration
form the basis upon which the Local Regression / Minimum Migration Time (LRMMT)
algorithm in CloudSim [42] operates (cf. Chapter 3).
2.5.1 Static Monitoring Interval
Recent research efforts in energy efficiency perform monitoring of the incoming workload
but almost exclusively focus on techniques for analysis of the data being collected rather than
improving the quality of the data.
In their hotspot identification paper, Xu and Sekiya [43] select a monitoring interval
of 2 minutes. The interval is chosen on the basis of balancing the cost of the additional
processing required against the benefit of performing the migration. The 2 minute interval
remains constant during experimentation.
Using an extended version of the First Fit Decreasing algorithm, Takeda et al. [44]
are motivated by consolidation of servers, to save power. They use a static 60 second
monitoring interval for their work.
Xu and Chen et al. [45] monitor the usage levels of a variety of server resources
(CPU, memory, and bandwidth), polling metrics as often as they become available. Their
results show that monitoring at such a granular level may not only lead to excessive data
processing but the added volume of network monitoring traffic (between multiple hosts and
the monitoring system) may also be disproportionate to the accuracy required.
The processing requirements of DC hosts vary as the workload varies and are not
known until requests arrive at the VM for service. While some a priori analysis of the
workload may be performed to predict future demand, as in the work of Gmach et al. [46],
unexpected changes may occur which have not been established by any previously identified
patterns. A more dynamic solution is required which reacts in real-time to the incoming
workload rather than making migration decisions based on a priori analysis.
VMware vSphere facilitates a combination of collection intervals and levels [47].
The interval is the time between data collection points and the level determines which metrics
are collected at each interval. Examples of vSphere metrics are as follows:
 Collection Interval: 1 day
 Collection Frequency: 5 minutes (static)
 Level 1 data: 'cpuentitlement', 'totalmhz', 'usage', 'usagemhz'
 Level 2 data: 'idle', 'reservedCapacity' + all of Level 1 data (above)
VMware intervals and levels in a DC are adjusted manually by the operator as circumstances
require. Once chosen, they remain constant until the operator re-configures them. Manual
adjustment decisions, which rely heavily on the experience and knowledge of the operator,
may not prove as accurate and consistent over time as an informed, dynamically adjusted
system.
In vSphere, the minimum collection frequency available is 5 minutes. Real-time data
is summarized at each interval and later aggregated for more permanent storage and analysis.
2.5.2 Dynamic Monitoring Interval
Chandra et al. [48] focus on dynamic resource allocation techniques which are sensitive to
fluctuations in data center application workloads. Typically SLA guarantees are managed by
reserving a percentage of available resources (e.g. CPU, network) for each application. The
portion allocated to each application depends on the expected workload and the SLA
requirements of the application. The workload of many applications (e.g. web servers) varies
over time, presenting a significant challenge when attempting to perform a priori estimation
of such workloads. Two issues arise when considering provisioning of resources for web
servers:
1. Over-provisioning based on worst-case workload scenarios may result in
underutilization of resources e.g. higher CPU priority allocated to an application which
seldom requires it
2. Under-provisioning may result in violation of SLAs e.g. not enough CPU priority
given to an application which requires it
An alternate approach is to allocate resources to applications dynamically based on
observation of their behaviour in real-time. Any remaining capacity is later allocated to those
applications as and when they are found to require it. Such a system reacts in real-time to
unanticipated workload fluctuations (in either direction), meeting QoS objectives which may
include optimization of power consumption in addition to typical performance SLAs such as
response time.
While Chandra and others [49, 50] have previously used dynamic workload analysis
approaches, their focus was on resource management to optimize SLA guarantees i.e.
performance. No consideration is given in their work to the effect on power consumption
when performance is enhanced. This research differentiates itself in that dynamic analysis of
the workload is performed for the purpose of identifying power consumption opportunities
while also maintaining (or improving) the performance of the DC infrastructure. The search
for improved energy efficiency is driven in this research by DC cost factors which were not as
significant an issue 10-15 years ago as they are now.
2.6 Conclusion
This chapter provided an in-depth analysis of data center energy efficiency state-of-the-art.
Software solutions to energy efficiency issues were presented, demonstrating that many
opportunities still exist for improvement in server room power consumption using a software
approach to monitoring (and control) of the complex systems which comprise a typical DC.
The principal lesson to take from prior (and existing) research in the field is that most of the
DC infrastructure can be monitored using software solutions but that monitoring (and
subsequent processing of the data collected) should not overwhelm the monitoring /
processing system and thus impact negatively on the operation of the DC infrastructure. This
thesis proposes that dynamic adjustment of the monitoring interval with respect to the
incoming workload may represent a superior strategy from an energy efficiency perspective.
Chapter 3 discusses in more detail the capabilities provided by the Java-based CloudSim
framework (used for this research) and the particular code modules relevant to testing the
hypothesis presented herein.
Chapter 3 CloudSim
Introduction
Researchers working on data center energy efficiency from a software perspective are
typically hindered by lack of access to real-world infrastructure because it is infeasible to add
additional workload to a data center which already has a significant ‘real-world’ workload to
service on a daily basis. From a commercial perspective, DC operators are understandably
unwilling to permit experimentation on a network which, for the most part, has been fine-
tuned to manage their existing workload. In this chapter, details of the CloudSim framework
are presented with special emphasis on those aspects particularly related to this MSc research
topic.
The CloudSim framework [42] is a Java-based simulator developed at the University of
Melbourne; the power-aware extensions used here were designed and written by Anton
Beloglazov for his doctoral thesis. It provides a limited
software solution to the above issues and is deployed in this research to simulate a standalone
power-aware data center with LAN-based migration capabilities. The Eclipse IDE is used to
run (and edit) CloudSim.
3.1 Overview
Default power-aware algorithms in CloudSim analyse the state of the DC infrastructure at
static 300-second intervals. This reflects current industry practice where an average CPU
utilization value for each host is polled every 5 minutes (i.e. 300 seconds) by virtualization
monitoring systems (e.g. VMware). At each interval the CPU utilization of all hosts in the
simulation is examined to establish whether or not they are adequately servicing the workload
which has been applied to the VMs placed on them.
If a host is found to be over-utilized (i.e. the CPU does not have the capacity to
service the complete workload of all the VMs placed on it) a decision is made to migrate one
or more of the VMs to another host where the required capacity to service the workload is
available.
Conversely, if a host is found to be under-utilized (i.e. the CPU is operating at such a low
capacity that power could be saved by switching it off), the remaining VMs are migrated to
another host and the machine is powered off. The CloudSim modules used only implement
migration when a host is over-utilized, reflecting the focus of this research.
There are two primary steps in the power-aware CloudSim migration algorithm for
over-utilized hosts:
1. Migrate the VM with the lowest Page Dirty Rate
2. Choose the destination host for minimal power consumption post-migration
The default CPU utilization threshold for an over-utilized host in CloudSim is 100%. An
adjustable safety parameter is also provided by CloudSim, effectively acting as overhead
provision. As an example, if the CPU utilization value were 90% and was then multiplied by
a safety parameter of 1.2, the resulting value of 108% would exceed the over-utilization
threshold. A safety parameter of 1.1 would result in a final value of 99% (for the same initial
utilization), thus not exceeding the threshold.
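The effect of the safety parameter can be expressed in a couple of lines of Java; the variable names below are illustrative rather than the CloudSim field names, and the corresponding CloudSim code is shown in Section 3.8.

// Worked example of the safety parameter from the text (illustrative variable names).
double utilization = 0.90;          // 90% predicted CPU utilization
double safetyParameter = 1.2;       // CloudSim's adjustable overhead provision
double adjusted = utilization * safetyParameter;   // 1.08, i.e. 108%
boolean overUtilized = adjusted >= 1.0;            // true: 100% threshold exceeded
// with safetyParameter = 1.1 the adjusted value is 0.99 (99%) and the host is not over-utilized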
3.2 Workload
The workloads applied to the VMs on each host in a simulated DC for power-aware
CloudSim simulations are referred to as ‘cloudlets’. These are flat text files which contain
sample CPU utilization percentages gathered (per interval) from over 500 DC locations
worldwide. As long as no migration takes place (i.e. the host doesn’t become over-utilized),
the VM assigned to service the workload at the beginning of a simulation (depicted in Figure
7) remains associated with that workload until the cloudlet has been completed. However, if a
migration takes place (because the host has become over-utilized) the workload is then
applied to the VM on the destination host. Despite the term ‘VM Migration’, it is the
workload (not the VM) which changes location within the DC when a migration takes place.
Figure 7 CloudSim Architecture
The duration of default CloudSim simulations is 24 hours (i.e. 86,400 seconds). This equates
to 288 intervals of 5 minutes (300 seconds) each. Thus, each of the 1052 cloudlets (stored in
the PlanetLab directory) contains 288 values to make a value available for reading at each
interval of the simulation.
At the beginning of each simulation, the entire cloudlet is loaded into an array
(UtilizationModelPlanetLabInMemory.data[]) from which the values are read at each
interval throughout the simulation. Each cloudlet is assigned to a corresponding VM on a
one-to-one basis at the beginning of the simulation.
The values being read from the cloudlets are percentages which simulate ‘real-
world’ CPU utilization values. These need to be converted to a variable in CloudSim which is
related to actual work performed. CloudSim work performance is defined in MIs (Million
Instructions). The workload of the cloudlet (termed length) is a constant i.e. 2500 *
SIMULATION_LENGTH (2500 * 86400 = 216,000,000 MIs). CloudSim keeps track of the
VM workload already performed by subtracting the MIs completed during each interval from
the total cloudlet MI length. As such each cloudlet starts at t = 0 seconds with a workload of
216,000,000 MI and this load is reduced according to the work completed at each interval.
To check whether a cloudlet has been fully executed, the isFinished() method is
called at each interval.
// caller: checks whether this Cloudlet has finished or not
if (cl.isFinished())
{
    …
}
// within the isFinished() check: compare the work completed so far with the total cloudlet length
final long finish = resList.get(index).finishedSoFar;
final long result = cloudletLength - finish;
if (result <= 0.0)
{
    completed = true;
}
From the code extract above it can be seen that when (or if) the VM’s workload (represented by
the cloudletLength variable) is completed during the simulation the VM will be ‘de-
commissioned’.
3.3 Capacity
Each of the 4 VM types used in the CloudSim framework represents a ‘real-world’ virtual
machine. They are assigned a MIPS value (i.e. 500, 1000, 2000, 2500) before the simulation
begins. This value reflects the maximum amount of processing capacity on the host to which
a VM is entitled. Likewise each host CPU has an initial MIPS capacity of either 1860 or
2660, again reflecting ‘real-world’ servers. These configuration settings limit the number of
VMs which can be run on each host and also the volume of workload which can be
performed by each VM at each interval.
Example: A host has a capacity of 2660 MIPS. A VM (with a capacity of 500
MIPS) has just been started on the host and the first value read from the cloudlet array is 5%
of the host’s capacity (i.e. 2660 / 20 = 133 MIPS). If the next interval is 30 seconds long then
the amount of instructions processed by the VM is 133 * 30 = 3990MI.
This completed work is subtracted from the total cloudlet length (i.e. 216,000,000 –
3990 = 215,996,010MI). At each subsequent interval throughout the simulation the same
algorithm is applied until such time as the remaining workload to be processed is at (or
below) zero. At this stage the VM is de-commissioned because the workload is complete.
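The worked example above can be reproduced in a few lines of Java; the variable names are illustrative, not the CloudSim identifiers.

// Reproducing the worked example (illustrative names, not CloudSim identifiers).
double hostCapacityMips = 2660.0;
double cloudletUtilization = 0.05;                               // 5% read from the cloudlet array
double allocatedMips = hostCapacityMips * cloudletUtilization;   // 133 MIPS
double intervalSeconds = 30.0;
double instructionsProcessed = allocatedMips * intervalSeconds;  // 3,990 MI

double remainingWorkload = 216_000_000.0 - instructionsProcessed; // 215,996,010 MI
// the VM is de-commissioned once remainingWorkload <= 0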
In this example the 5% CPU percentage from the cloudlet (i.e. 133 MIPS) is
approximately 27% (133 / 500) of the CPU capacity allocated to the VM. If the original value
read from the cloudlet was greater than 18.79% (i.e. 500 / 2660), the VM would have
insufficient capacity to continue servicing the workload and SLA violations would occur. Two
options typically need to be considered when this happens:
1. Increase the VM’s capacity on the host – not facilitated in CloudSim
2. Apply the workload to a VM with a larger capacity on a different host, requiring a
migration. This will only occur if the host is also over-utilized, which is a significant
shortfall in the CloudSim modules used for testing the hypothesis. An over-utilized
VM (causing SLA violations) will not result in a migration in the version of CloudSim
being used for this research. Additionally, it is notable that the CloudSim reports
(generated at the end of the simulation) detail very low SLA violation averages, which
indicates that the particular workload (cloudlet) percentages being applied in this
version of CloudSim are insufficient to push the VMs beyond their capacity.
The difficulty of correctly sizing VM MIPS (and allocating appropriate host capacity to
them) so that they are capable of meeting their workload requirements can be seen from this
example. CloudSim goes some way to achieving this by applying VMs to hosts on a one-to-one
basis at the start of the simulation i.e. in a default simulation with 1052 VMs being placed on
800 hosts, the first 800 VMs are applied to the first 800 hosts and the remaining 252 VMs are
allocated to hosts 1 -> 252. Therefore, when the simulation starts, 252 hosts have 2 VMs and
the remainder host a single VM.
As processing continues the VM placement algorithm attempts to allocate as many
VMs to each host as capacity will allow. The remaining (empty) hosts are then powered off -
simulating the server consolidation effort typical of most modern DCs. It is clear that there is
a conflict of interests taking place. On the one hand there is an attempt to maximize
performance by migrating VMs to hosts with excess capacity but, on the other hand,
competition for CPU cycles is being created by co-locating VMs on the same host,
potentially creating an over-utilization scenario.
3.4 Local Regression / Minimum Migration Time (LR / MMT)
Beloglazov concludes from his CloudSim experiments that the algorithm which combines
Local Regression and Minimum Migration Time (LR / MMT) is most efficient for
maintaining optimal performance and maximizing energy efficiency. Accordingly this
research uses the LR / MMT algorithmic combination as the basis for test and evaluation.
3.5 Selection Policy – Local Regression (LR)
Having passed the most recent CPU utilization values through the Local Regression (LR)
algorithm, hosts are considered over-utilized if the next predicted utilization value exceeds
the threshold of 100% [Appendix A]. LR predicts this value using a sliding window, each
new value being added at each subsequent interval throughout the simulation. The size of the
sliding window is 10. Until initial filling of the window has taken place (i.e. 10 intervals have
elapsed since the simulation began), CloudSim relies on a 'fallback' algorithm [Appendix A]
which considers a host to be over-utilized if its CPU utilization exceeds 70%.
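A simplified sketch of the sliding-window prediction is shown below. For brevity it fits an ordinary least-squares line to the last ten utilization values rather than the local regression CloudSim actually uses, but the structure is the same: fill the window, otherwise fall back to the static 70% threshold. All names here are illustrative; the CloudSim code fragments for this check appear in Section 3.8.

// Simplified sketch: predict the next utilization value from a sliding window of 10 samples.
// CloudSim uses local regression; an ordinary least-squares fit is used here for brevity.
public final class UtilizationPredictor {

    private static final int WINDOW = 10;
    private static final double FALLBACK_THRESHOLD = 0.7;   // static 70% threshold

    public static boolean isHostOverUtilized(double[] history, double safetyParameter) {
        if (history.length < WINDOW) {
            // not enough samples yet: fall back to the static threshold on the latest value
            return history.length > 0 && history[history.length - 1] > FALLBACK_THRESHOLD;
        }
        double[] window = new double[WINDOW];
        System.arraycopy(history, history.length - WINDOW, window, 0, WINDOW);

        // least-squares fit of utilization against interval index 0..9
        double meanX = (WINDOW - 1) / 2.0;
        double meanY = 0;
        for (double y : window) {
            meanY += y / WINDOW;
        }
        double num = 0, den = 0;
        for (int i = 0; i < WINDOW; i++) {
            num += (i - meanX) * (window[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        double slope = num / den;
        double intercept = meanY - slope * meanX;
        double predicted = intercept + slope * WINDOW;   // extrapolate one interval ahead

        return predicted * safetyParameter >= 1.0;       // over-utilized if >= 100%
    }
}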
VMs are chosen for migration according to MMT i.e. the VM with the lowest
predicted migration time will be selected for migration to another host. Migration time is
based on the amount of RAM being used by the VM. The VM using the least RAM will be
chosen as the primary candidate for migration, simulating minimization of the Page Dirty
Rate (PDR) during VM transfer [39] as previously discussed in Section 2.4.2.2.
3.6 Allocation Policy – Minimum Migration Time (MMT)
The destination host for the migration is chosen on the basis of power consumption following
migration i.e. the host with the lowest power consumption (post migration) is chosen as the
primary destination candidate. In some cases, more than one VM may require migration to
reduce the host's utilization below the threshold. Dynamic RAM adjustment is not facilitated
in CloudSim as the simulation proceeds. Rather, RAM values are read (during execution of
the MMT algorithm) on the basis of the initial allocation to each VM at the start of the
simulation.
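The allocation step can be sketched as follows: each candidate host that has sufficient capacity is asked what its power draw would be after receiving the VM, and the host with the lowest estimated post-migration power is selected. The interface and method names below are illustrative placeholders rather than the actual CloudSim allocation-policy code.

import java.util.List;

// Sketch: choose the destination host with the lowest estimated post-migration power (illustrative).
public final class MinPowerPlacement {

    interface CandidateHost {
        boolean isSuitableFor(Object vm);           // enough CPU / RAM / bandwidth for the VM
        double estimatedPowerWattsWith(Object vm);  // estimated power draw after placement
    }

    public static CandidateHost findHostForVm(Object vm, List<CandidateHost> hosts) {
        CandidateHost best = null;
        double minPower = Double.MAX_VALUE;
        for (CandidateHost host : hosts) {
            if (!host.isSuitableFor(vm)) {
                continue;
            }
            double powerAfterPlacement = host.estimatedPowerWattsWith(vm);
            if (powerAfterPlacement < minPower) {
                minPower = powerAfterPlacement;
                best = host;
            }
        }
        return best;   // null if no host has sufficient capacity
    }
}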
3.7 Default LRMMT
The LRMMT algorithm begins in the main() method of the LrMmt.java class [Appendix B].
A PlanetLabRunner object is instantiated. The PlanetLabRunner class inherits from
the RunnerAbstract class and sets up the various parameters required to run the LRMMT
simulation. The parameters are passed to the initLogOutput() method in the default
constructor of the super class (RunnerAbstract) which creates the folders required for
saving the results of the simulation. Two methods are subsequently called:
3.7.1 init()
Defined in the sub-class (PlanetLabRunner), this method takes the location of the PlanetLab
workload (string inputFolder) as a parameter and initiates the simulation. A new
DatacenterBroker object is instantiated. Among other responsibilities, the broker will
create the VMs for the simulation, bind the cloudlets to those VMs and assign the VMs to the
data center hosts. The broker’s ‘id’ is now passed to the createCloudletListPlanetLab()
method which prepares the cloudlet files in the input folder for storage in a data[288] array.
It is from this array that each cloudlet value will be read so that an equivalent MI workload
value can be calculated for each VM. Having created the cloudletList, the number of
cloudlets (files) in the PlanetLab folder is now known and a list of VMs can be created with
each cloudlet being assigned to an individual VM i.e. there is a one-to-one relationship between a
cloudlet and a VM at the start of the simulation. The last call of the init() method creates a
hostList which takes as a parameter the number of hosts configured for the DC (i.e. 800)
from the PlanetLabConstants class. On completion of the init() method the cloudlets
(workload), hosts and VMs are all instantiated and ready for the data center to be created.
3.7.2 start()
The start() method creates the data center, binds all the components created in the init()
method to the new data center and starts the simulation.
The first helper call of the start() method is to createDatacenter() which sets
up a number of parameters related to the characteristics of the DC. These include:
 arch (string) – whether the DC has a 32 or 64 bit architecture
 os (string) – the operating system running on the hosts – e.g. Linux / Windows
 vmm (string) – the virtual machine manager running on the hosts e.g. Xen
 time_zone (double) – where the DC is located – e.g. 10.0
 cost (double) – the cost of processing in this resource – e.g. 3.0
 costPerMem (double) – the cost of using memory in this resource – e.g. 0.05
 costPerStorage (double) – the cost of using storage in this resource – e.g. 0.001
 costPerBw (double) – the cost of using bandwidth in this resource – e.g. 0.0
In the case of a simulation of a cloud network, where more than one data center would be
required, these values can be altered for the purposes of calculating different infrastructural
costs across the cloud. In this research a single data center is being simulated. The defaults
are not adjusted.
Once the data center has been created a boolean value
(PowerDatacenter.disableMigrations - indicating whether or not migrations are
disabled) is set to false i.e. migrations are enabled for this simulation. The VM and cloudlet
lists are submitted (by the broker) to the datacenter object and the simulation is started i.e.
double lastClock = CloudSim.startSimulation();
The startSimulation() method calls the run() method, which waits for completion of all
entities i.e. run() waits until the entities (cloudlets running on VMs) being run as threads
reach the ‘non-RUNNABLE’ state, or until there are no more events in the future event queue.
Once this point has been reached the clock time is returned to the calling method (i.e.
RunnerAbstract.start()) and the simulation is stopped.
Helper.printResults(datacenter, vmList, lastClock, experimentName,
Constants.OUTPUT_CSV, outputFolder);
Results (if enabled) are printed to both log and trace files and the simulation is completed.
3.8 Over-utilization
Figure 8 Flow Chart Depicting the LR / MMT simulation process
The full simulation process is depicted in Figure 8. The scheduling interval (i.e. how often
analysis will be performed) is set as a static variable in the Constants class. For the default
CloudSim simulation, this interval is 300 seconds. At each interval the CPU utilization of
every host is examined. Using a sliding window of the last 10 CPU utilization values, the
local regression algorithm predicts the CPU utilization value for the next interval. If this
value is below 100% no action is taken. However, if the CPU utilization is predicted to be greater than
100% (at the next interval) the host is considered over-utilized and the MMT portion of the
algorithm is called. As mentioned previously, a ‘fallback’ algorithm is used until the first 10
CPU values are available. The ‘fallback’ over-utilization threshold is 70%. The code for
testing a host for overutilization is shown below:
if (utilizationHistory.length < length)
{
    return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
}
The length of the sliding window is 10; the utilization values it contains are held in code in the utilizationHistory array.
try
{
    estimates = getParameterEstimates(utilizationHistoryReversed);
}
catch (IllegalArgumentException e)
{
    // if the regression cannot be fitted, revert to the fallback policy
    return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
}
The getParameterEstimates() call runs the local regression algorithm against the sliding
window and (after including the safety parameter as a multiplier) the predicted utilization of
the host is calculated.
predictedUtilization *= getSafetyParameter();
if (predictedUtilization >= 1)
{
    Constants.OverUtilizedHostsThisInterval++;
}
return predictedUtilization >= 1;
A Boolean indicating the utilization state of the host is returned to the calling function. If the
host is predicted to be over-utilized at the next interval, the value of the returned Boolean will
be true.
3.9 Migration
One or more VMs need to be migrated from the host in order to bring the CPU utilization
back below the threshold. The VM(s) to be migrated are chosen on the basis of the amount of
RAM they are using. Thus, the VM with the least RAM will be the primary candidate for
migration. The VM types used by CloudSim are listed below. It can be seen that (for the most
part †) different RAM values are configured for each VM at the start of the simulation.
CloudSim does not include dynamic RAM adjustment so the static values applied initially
remain the same for the duration. Cloud providers such as Amazon use the term ‘instance’ to
denote a spinning VM. The CloudSim VM types provided simulate some of the VM
instances available to customers in Amazon EC2.
1. High-CPU Medium Instance: 2.5 EC2 Compute Units, 0.85 GB
2. Extra Large Instance: 2 EC2 Compute Units, 3.75 GB †
3. Small Instance: 1 EC2 Compute Unit, 1.7 GB
4. Micro Instance: 0.5 EC2 Compute Unit, 0.633 GB
public final static int VM_TYPES = 4;
public final static int[] VM_RAM = { 870, 1740, 1740 /* † */, 613 };
All types are deployed when the VMs are being created at the beginning of the simulation.
Assuming all 4 are on a single host when the host is found to be over-utilized, the order in
which the VMs will be chosen for migration is:
1. 613 [index 3]
2. 870 [index 0]
3. 1740 [index 1]
4. 1740 [index 2] † (note: the VM_RAM value in the default CloudSim code is 1740.
This does not reflect the ‘real-world’ server [Extra Large Instance] being simulated,
which has a RAM value of 3.75 GB)
If there is more than one VM with RAM of 613 on the host they will be queued for migration
before the first ‘870’ enters the queue. The chosen VM is then added to a migration map
which holds a key-value pair of the:
 VM ID
 Destination Host ID
Once all the hosts have been analysed, the VMs in the migration map are migrated to their
chosen destinations using the VM placement algorithm. The destination for each VM is
chosen with the objective of optimizing power consumption i.e. the host which will use the
least power post-migration is deemed the most suitable.
public Vm getVmToMigrate(PowerHost host)
{
    List<PowerVm> migratableVms = getMigratableVms(host);
    if (migratableVms.isEmpty())
    {
        return null;
    }
    Vm vmToMigrate = null;
    double minMetric = Double.MAX_VALUE;
    for (Vm vm : migratableVms)
    {
        // skip VMs which are already in the process of being migrated
        if (vm.isInMigration())
        {
            continue;
        }
        // the metric is the VM's RAM; keep the smallest value seen so far
        double metric = vm.getRam();
        if (metric < minMetric)
        {
            minMetric = metric;
            vmToMigrate = vm;
        }
    }
    return vmToMigrate;
}
From the code above it can be seen that the VM with the least RAM (vm.getRam()) is
chosen for migration, the objective of which is to minimize the downtime required to transfer
the final RAM pages during the migration. Increased downtime during migration would
result in potential SLA violations as described in detail in Section 2.4.2.3. It is clear that, to a
certain extent, CloudSim is replicating the effort to minimize SLA violations which takes
place during ‘real-world’ live migrations.
3.10 Reporting
CloudSim facilitates reporting on various metrics available during the simulation. Reports are
generated as either flat text or MS Excel-type Comma Separated Values (CSV) file formats.
Additionally, metrics can be sent to the Eclipse console and read as the simulation progresses.
Below is a sample of the metrics summary from the trace file of the default CloudSim
simulation. Notable metrics include:
 Number of Hosts
 Number of VMs
 Energy Consumption
 Over-utilized Hosts
 Number of VM Migrations
 Average SLA Violation
Trace.printLine(String.format("Experiment name: " + experimentName));
Trace.printLine(String.format("Number of hosts: " + numberOfHosts));
Trace.printLine(String.format("Number of VMs: " + numberOfVms));
Trace.printLine(String.format("Total simulation time: %.2f sec",
totalSimulationTime));
Trace.printLine(String.format("Energy consumption: %.2f kWh", energy));
Trace.printLine(String.format("Overutilized Hosts: %d",
Constants.OverUtilizedHostsThisInterval));
Trace.printLine(String.format("Number of VM migrations: %d",
numberOfMigrations));
Trace.printLine(String.format("SLA: %.5f%%", sla * 100));
Trace.printLine(String.format("SLA perf degradation due to migration:
_%.2f%%", slaDegradationDueToMigration * 100));
Trace.printLine(String.format("SLA time per active host: %.2f%%",
slaTimePerActiveHost * 100));
Trace.printLine(String.format("Overall SLA violation: %.2f%%", slaOverall
_* 100));
Trace.printLine(String.format("Average SLA violation: %.2f%%", slaAverage
_* 100));
3.11 Conclusion
This chapter discussed some of the capabilities and limitations of the default CloudSim
framework being used for this research and identified the modules most related to testing the
hypothesis presented in this research. An explanation of how CloudSim processes the
workload applied from the cloudlets was also provided. A range of shortfalls and
possible errors were also identified. Chapter 4 details the changes made to the default
framework and the new code added, which were integrated into the CloudSim code to
evaluate the hypothesis.
Chapter 4 Implementation
Introduction
As dynamic adjustment of the monitoring interval is not provided in the default CloudSim
package, additional code was required to test its effect on power consumption. Chapter 3
provided an overview of the default power-aware CloudSim simulation and the related
modules. The specific capabilities (and limitations) of the framework, as they apply to
dynamic interval adjustment, were also outlined. In this chapter the changes which were
required to implement dynamic adjustment of the monitoring interval are described in more
detail.
The primary contribution of the thesis is to evaluate the impact of moving from a
static to a dynamic monitoring process whereby the predicted average utilization for the DC
at each interval is used to adjust the next interval. Before writing the code, which would
ultimately be integrated with the existing LR/MMT modules in CloudSim, an algorithm was
designed to clarify the steps involved.
4.1 Interval Adjustment Algorithm
The dynamic interval adjustment algorithm involves two principal steps:
1. Calculate the weighted mean of the predicted CPU utilization values for all operational hosts in the data center, as in Equation 1. Non-operational hosts are excluded from the calculation as they do not affect the average CPU utilization for the DC:
$$\text{weighted mean} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \qquad (1)$$

where $w_i$ is the weight applied to the range within which the predicted utilization value $x_i$ for each operational host falls, and $n$ is the number of operational hosts.
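For illustration (with arbitrary values): consider two operational hosts with predicted utilizations of 85% and 20%. From Table 4.1 these fall into the weight-9 and weight-2 bands respectively, so the weighted mean is (9 × 85 + 2 × 20) / (9 + 2) ≈ 73.2%, compared with a simple mean of 52.5%. The heavily loaded host dominates the result, which is the intended effect.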
2. Choose and set the next monitoring interval according to the range within which the resulting weighted mean falls (cf. Table 4.1 and Figure 9):
The premise upon which weightings are applied to calculate the average utilization of the DC is simplified for the purposes of this research. The primary objective is to adjust the monitoring interval with respect to the upper utilization threshold. As such, a straightforward set of weights (from 1 to 10) is applied to the CPU utilization of each host, such that hosts with a higher CPU utilization (i.e. closer to 100%) are given more priority in the calculation. If a simple average were taken of host utilization across the DC, this would have the effect of masking hosts that are close to the threshold, where SLAs are in danger of being violated. If the lower threshold were taken into consideration, a different set of weights would be appropriate, with increased importance applied in the regions closer to both thresholds and reduced importance at the center (e.g. 40-60% CPU utilization). There is certainly scope for further investigation of the simplified set of weights applied in this research, depicted in Table 4.1.
Table 4.1: Application of Weights to Predicted Utilization
Predicted Utilization (%) per host ( xi ) Weight Applied ( wi )
1 – 10 1
11 – 20 2
21 – 30 3
… …
91 – 100 10
The monitoring intervals applied to the resulting weighted average prediction are depicted in
Figure 9. As with the weights discussed above, the intervals were chosen somewhat
arbitrarily and would benefit from further analysis. The maximum interval is aligned with the
existing default interval in CloudSim i.e. 300 seconds. A minimum interval of 30 seconds
facilitates 10 intervals in total, each having a corresponding 10% CPU utilization range from
0 to 100.
However, if the minimum interval of 30 seconds was applied for the full 24 hour simulation,
2880 values would be required (i.e. (60 x 60 x 24) / 30) in each PlanetLab cloudlet file to
ensure a value could be read at each interval. The 288 values in the 1052 default files
provided by CloudSim were thus concatenated (using a C# program written specifically for
this purpose) to ensure sufficient values were available, resulting in a total of 105 complete
files with 2880 values each.
if(Constants.IsDefault)
{
data = new double[288]; // PlanetLab workload
}
else
{
data = new double[2880];
}
The code tract above demonstrates the difference between the two data[] arrays which hold the PlanetLab cloudlet values read at each interval during the simulation. The default array is 288 elements long while the dynamic array is 2880 elements long – providing sufficient indices to store the required number of cloudlet values throughout the simulation.
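The concatenation itself was performed outside CloudSim by a C# utility written for this purpose; that utility is not reproduced here. The following is a minimal Java sketch of the same step, under the assumption that ten 288-value files are simply appended in sequence to form one 2880-value file (directory names and the output naming scheme are placeholders):

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CloudletConcatenator
{
    public static void main(String[] args) throws IOException
    {
        Path inputDir = Paths.get("planetlab/default");   // 1052 files, 288 values each
        Path outputDir = Paths.get("planetlab/dynamic");  // 105 files, 2880 values each
        Files.createDirectories(outputDir);
        List<Path> sources = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir))
        {
            stream.forEach(sources::add);
        }
        Collections.sort(sources); // directory order is unspecified, so sort for repeatability
        // Each group of ten source files becomes one dynamic cloudlet file; incomplete
        // trailing groups are dropped, leaving 105 complete files.
        for (int group = 0; group + 10 <= sources.size(); group += 10)
        {
            List<String> combined = new ArrayList<>(2880);
            for (int i = 0; i < 10; i++)
            {
                combined.addAll(Files.readAllLines(sources.get(group + i)));
            }
            Files.write(outputDir.resolve("dynamic_" + (group / 10)), combined);
        }
    }
}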
Figure 9 Application of the Monitoring Interval Based on Weighted Utilization Average
It should be noted that the intervals and CPU ranges were chosen somewhat arbitrarily and
could be fine-tuned following further investigation. Additionally, as a result of the reduced
file count after concatenation, the number of hosts running in the simulation was reduced,
from the CloudSim default of 800 to 80, to maintain the ratio of VMs to hosts (Table 4.2):
Table 4.2: VMs to Hosts - Ratio Correction
Cloudlets / VMs Hosts
Default 1052 800
Dynamic 105 80
4.2 Comparable Workloads
This research compares the default CloudSim simulation with a dynamic version and therefore required comparable workloads. Ensuring that the workloads are comparable (in a simulation which monitors at different intervals) involves applying the same amount of processing to each VM during each interval. Accordingly, a new set of files was created for the default simulation. The values for these files were calculated from the average of the values used in the dynamic simulation by the time each 300-second interval had elapsed. This was achieved by running the dynamic simulation for 24
hours and recording the data (e.g. interval length, cumulative interval, average utilization per
interval) observed (shown in Figure 10). This data was initially written to a Microsoft Excel
worksheet from within the CloudSim reporting structure and then exported to a Microsoft
SQL Server database. The number (727) and length of the intervals in the dynamic simulation
can be seen in Figure 11.
From Figure 12, it is clear that a lower and/or upper offset may occur during the calculation, i.e. a dynamic interval may ‘straddle’ the 300-second mark. To maintain as much accuracy as possible in the calculation of the default file values, two new variables (offsetBelow300 and offsetAbove300) were introduced.
Figure 10 A screenshot of the data generated for calculation of the default workload
Figure 11 Intervals calculated during the dynamic simulation
Figure 12 Calculation of the Average CPU Utilization for the Default Files
The length of each interval is added to an accumulator until the total equals (or exceeds) 300
seconds. The average utilization for the accumulated intervals is then calculated. This average
includes (if required) the average for the final portion of any interval below the 300-second
mark. When an offset occurs above the 300-second mark (in the current accumulator), it is
‘held-over’ (i.e. added to the accumulator in the next 300-second interval). Some of the new
Java code written in CloudSim to monitor the interval and workload activity (generating the
data required for the calculator) is shown below – the code comments provide explanation:
if(Constants.IntervalGenerator)
{
    int intervalDifference = 0;
    int iOffsetBelowForPrinting = 0;
    int iOffsetAboveForPrinting = 0;
    int iAccumulatedIntervalsForPrinting = 0;
    if(Constants.accumulatedIntervals >= 300)
    {
        //Constants.accumulatedIntervals is exactly 300
        int accumulated = 300;
        //calculate offsets
        if(Constants.accumulatedIntervals > 300)
        {
            accumulated = (int) Constants.accumulatedIntervals - (int) dInterval;
        }
        Constants.offsetBelow = 300 - accumulated;
        Constants.offsetAbove = Constants.accumulatedIntervals - 300;
    }
}
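As an illustration of the hold-over logic (values chosen for clarity rather than taken from the simulation): if three successive dynamic intervals of 120, 150 and 90 seconds are read, the accumulator reaches 360 seconds when the third interval is added. Only the first 30 seconds of that 90-second interval belong to the current 300-second block (offsetBelow = 300 - (360 - 90) = 30), while the remaining 60 seconds (offsetAbove = 360 - 300 = 60) are held over and added to the accumulator of the next 300-second block.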
4.3 C# Calculator
Calculation of the new per-interval workloads was achieved using a separate ‘calculator’
program written in C#. The calculator implements the process depicted in Figure 12. The
principal C# method used to calculate the new averages for the default files in the calculator
program is CreateDefaultFiles(). The comments in the code explain each step of the
process:
private void CreateDefaultFiles()
{
    //read in first 727 values from each file - used in dynamic simulation
    FileInfo[] Files = dinfo.GetFiles();
    string currentNumber = string.Empty;
    int iOffsetAboveFromPrevious = 0;
    //initialize at max to ensure not used on first iteration
    int iIndexForOffsetAbove = 727;
    foreach (FileInfo filex in Files)
    {
        using (var reader = new StreamReader(filex.FullName))
        {
            //fill dynamicIn
            for (int i = 0; i < 727; i++)
            {
                dynamicIn[i] = Convert.ToInt32(reader.ReadLine());
            }
            int iCurrentOutputIndex = 0;
            //Calculate
            for (int k = 0; k < 727; k++)
            {
                //add each average used here - including any offset
                float iAccumulatedTotal = 0;
                //reached > 300 accumulated intervals
                int iReadCount = Convert.ToInt32(ds.Tables[0].Rows[k]["ReadCount"]);
                if (iReadCount > 0)
                {
                    //first interval
                    if (k == 0)
                    {
                        int iValue = dynamicIn[k];
                        iAccumulatedTotal += iValue;
                    }
                    else
                    {
                        //readCount == 1: just check for offsets
                        if (iReadCount > 1)
                        {
                            for (int m = 1; m < iReadCount; m++)
                            {
                                int iValue = dynamicIn[k - m];
                                int iInterval = Convert.ToInt32(ds.Tables[0].Rows[k - m]["Interval"]);
                                iAccumulatedTotal += iValue * iInterval;
                            }
                        }
                    }
                    //offset - read this interval
                    int iOffsetBelow = Convert.ToInt32(ds.Tables[0].Rows[k]["OffsetBelow300"]);
                    if (iOffsetBelow > 0)
                    {
                        iAccumulatedTotal += iOffsetBelow * dynamicIn[k];
                    }
                    //use previous offset above in this calculation
                    if (k >= iIndexForOffsetAbove)
                    {
                        iAccumulatedTotal += iOffsetAboveFromPrevious;
                        //reset
                        iOffsetAboveFromPrevious = 0;
                        iIndexForOffsetAbove = 727;
                    }
                    //use this offset above in next calculation
                    int iOffsetAbove = Convert.ToInt32(ds.Tables[0].Rows[k]["OffsetAbove300"]);
                    if (iOffsetAbove > 0)
                    {
                        //value for offset above to add to next accumulator
                        iOffsetAboveFromPrevious = iOffsetAbove * dynamicIn[k];
                        //use in next calculation - at a minimum
                        iIndexForOffsetAbove = k;
                    }
                    float fAverage = iAccumulatedTotal / 300;
                    int iAverage = Convert.ToInt32(iAccumulatedTotal / 300);
                    //first interval
                    if (k == 0)
                    {
                        iAverage = dynamicIn[k];
                    }
                    //save averaged value to array for writing
                    defaultOutput[iCurrentOutputIndex] = iAverage.ToString();
                    iCurrentOutputIndex++;
                }
            }
        }
        //Print to text file for default cloudlet
        System.IO.File.WriteAllLines("C:UsersscoobyDesktopDefaultNewFiles" + filex.Name, defaultOutput);
    }
}
The code above depicts the process by which the default file averages were calculated from the values recorded during the dynamic simulation. The results of the calculator program were written back out to flat text files, i.e. the same format as the original CloudSim cloudlet files. To compare the default and dynamic workloads, both simulations were then run using the new cloudlet files, with a few lines of additional code to monitor the workload during each simulation added to the default constructor of the UtilizationModelPlanetlabInMemory() class in CloudSim. This additional code ensured that the workload would be observed (and accumulated) as each cloudlet was processed.
As the data is being read into the data[] array from the PlanetLab cloudlet files, each value is added to a new accumulator variable, Constants.totalWorkload:
int n = data.length;
for (int i = 0; i < n - 1; i++)
{
data[i] = Integer.valueOf(input.readLine());
Constants.totalWorkload += data[i];
}
The Constants.totalWorkload value was then divided by the relevant number of intervals (288 for the default simulation, 727 for the dynamic) to calculate the average workload per interval. A difference of less than 1% between the per-interval workloads was observed, largely validating the results generated for the default cloudlet files by the C# program.
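As a minimal sketch of that check (the constant and trace call follow those already shown; the 288/727 divisors are the interval counts quoted above):

// Average workload per interval for the run just completed:
// 288 intervals for the default simulation, 727 for the dynamic simulation.
double averageWorkloadPerInterval =
        Constants.totalWorkload / (Constants.IsDefault ? 288.0 : 727.0);
Trace.printLine(String.format("Average workload per interval: %.2f",
        averageWorkloadPerInterval));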
The negligible difference may be explained by migrations taking place during
collection of the CPU utilization data prior to export. For example, if a workload on a VM is
calculated at 10% of its host’s capacity and then migrated to a host with a lower capacity, the
same workload would require more time to complete – skewing the average CPU utilization
that would have otherwise been calculated had the migration not taken place. This scenario
was not factored into calculation of the per-interval average CPU utilization, resulting in the
difference between the workloads of approximately 1%. This error margin was considered
acceptable in the context of the overall thesis objectives.
4.4 Interval Adjustment Code
The updateCloudletProcessing() method in the PowerDatacenter class is the principal cloudlet processing method, run at each interval and provided by default in CloudSim. As such, it is the ideal place from which to call the additional code required to implement interval adjustment.
To differentiate between the default and dynamic simulations at runtime, a constant
boolean variable (IsDefault) was created which indicates which type of simulation is being
run. Based on the value of the IsDefault variable, the code will fork either to the default
CloudSim code or the dynamic code written to adjust the monitoring interval. The fork
returns to the default CloudSim code once the AdjustInterval() method has been
executed:
if(Constants.IsDefault)
{
//run default simulation
}
else
{
//run dynamic simulation
AdjustInterval(currentTime);
}
The AdjustInterval() method (outlined below) is the entry point for the dynamic
monitoring interval adjustment simulation. Figure 13 depicts how the dynamic code interacts
with the CloudSim default:
protected void AdjustInterval(double currentTime)
{
double dTotalUsageForAverage = 0;
double dAverageUsage = 0;
int iDenominator = 0;
int iWeight = 0;
double timeDiff = currentTime - getLastProcessTime();
for (PowerHost host : this.<PowerHost> getHostList())
{
double utilizationOfCpu = host.getUtilizationOfCpu();
if(utilizationOfCpu > 0)
{
iWeight = GetWeight(utilizationOfCpu);
dTotalUsageForAverage += utilizationOfCpu * iWeight;
iDenominator += iWeight;
}
}
dAverageUsage = dTotalUsageForAverage / iDenominator;
//alter scheduling interval according to average utilization
SetSchedulingIntervalRelativeToUtilization(dAverageUsage);
}
Figure 13 How the dynamic interval adjustment code interacts with CloudSim
A host which is not running would have a CPU utilization of 0. As depicted by the code, only
hosts with CPU utilization greater than 0 will be included in the average CPU utilization for
the DC i.e.
if(utilizationOfCpu > 0)
A weighting is then applied (helper function: GetWeight() - below) to each result obtained.
This weighting (cf. Table 4.1) is based on the CPU utilization calculated for each host by the
getUtilizationOfCpu() method provided by default in CloudSim:
public int GetWeight(double utilization)
{
double iUtilization = utilization * 100;
int iWeight = 0;
//check utilization value range
if(iUtilization >= 0.00 && iUtilization <= 10.00)
{
iWeight = 1;
}
else if(iUtilization > 10.00 && iUtilization <= 20.00)
{
iWeight = 2;
}
else if(iUtilization > 20.00 && iUtilization <= 30.00)
{
iWeight = 3;
}
else if(iUtilization > 30.00 && iUtilization <= 40.00)
{
iWeight = 4;
}
else if(iUtilization > 40.00 && iUtilization <= 50.00)
{
iWeight = 5;
}
else if(iUtilization > 50.00 && iUtilization <= 60.00)
{
iWeight = 6;
}
else if(iUtilization > 60.00 && iUtilization <= 70.00)
{
iWeight = 7;
}
else if(iUtilization > 70.00 && iUtilization <= 80.00)
{
iWeight = 8;
}
else if(iUtilization > 80.00 && iUtilization <= 90.00)
{
iWeight = 9;
}
else if(iUtilization > 90.00 && iUtilization <= 100.00)
{
iWeight = 10;
}
return iWeight;
}
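The ten-branch chain above simply maps utilization to 10% bands. An equivalent arithmetic formulation is sketched below; this is a hypothetical alternative for reference, not the code used in the thesis, and the method name is illustrative:

// Equivalent band mapping for utilization values in [0, 1]:
// 0-10% -> 1, 10-20% -> 2, ..., 90-100% -> 10 (upper bounds inclusive).
public int getWeightCompact(double utilization)
{
    double percent = utilization * 100;
    return Math.min(10, Math.max(1, (int) Math.ceil(percent / 10.0)));
}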
The average utilization for the DC is then passed to another helper function
(SetSchedulingIntervalRelativeToUtilization() – shown below) which will adjust
the next monitoring interval (i.e. Constants.SCHEDULING_INTERVAL) based on the range
within which the utilization falls.
public void SetSchedulingIntervalRelativeToUtilization(double dAverageUsage)
{
double iUtilization = dAverageUsage * 100;
double dInterval = 300;
if(iUtilization >= 0.00 && iUtilization <= 10.00)
{
dInterval = 300;
}
else if(iUtilization > 10.00 && iUtilization <= 20.00)
{
dInterval = 270;
}
else if(iUtilization > 20.00 && iUtilization <= 30.00)
{
dInterval = 240;
}
else if(iUtilization > 30.00 && iUtilization <= 40.00)
{
dInterval = 210;
}
else if(iUtilization > 40.00 && iUtilization <= 50.00)
{
dInterval = 180;
}
else if(iUtilization > 50.00 && iUtilization <= 60.00)
{
dInterval = 150;
}
else if(iUtilization > 60.00 && iUtilization <= 70.00)
{
dInterval = 120;
}
else if(iUtilization > 70.00 && iUtilization <= 80.00)
{
dInterval = 90;
}
else if(iUtilization > 80.00 && iUtilization <= 90.00)
{
dInterval = 60;
}
else
{
dInterval = 30;
}
setSchedulingInterval(dInterval);
Constants.SCHEDULING_INTERVAL = dInterval;
}
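Because each 10% utilization band shortens the interval by 30 seconds, the same mapping can also be expressed arithmetically. The sketch below is a hypothetical alternative (not the thesis code) and assumes it sits alongside the GetWeight() helper shown earlier; the method name is illustrative:

// Sketch: weight 1 (0-10%) -> 300 s, weight 2 -> 270 s, ..., weight 10 (90-100%) -> 30 s.
public double GetIntervalForUtilization(double dAverageUsage)
{
    return 330 - 30 * GetWeight(dAverageUsage);
}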
The process then returns to the AdjustInterval() method which returns control back to the
default CloudSim code where the fork began. The default CloudSim code continues,
completing the (per-interval) updateCloudletProcessing() method and continuing the
simulation into the next interval, the length of which has now been adjusted with respect to
the predicted average CPU utilization for the DC.
4.5 Reporting
The metrics available in the default CloudSim reports (as described in Section 3.10) were
found to be sufficient for the purposes of the testing phase of this research (cf. Chapter 5).
However, some additional variables were needed during the design phase of the new
algorithm to adjust the monitoring interval. These were added as static Constants so that
they would be globally available across a range of classes and could be used without
requiring instantiation of any new objects. Most are associated with calculation of the per-
interval workload for the default PlanetLab cloudlet files. They include:
public static int fileBeingRead = 0;
public static boolean IntervalGenerator = false;
public static int previousIntervalCount = 0;
public static int intervalCount = 0;
public static int offsetBelow = 0;
public static int offsetAbove = 0;
public static int accumulatedOffsetTotal = 0;
public static int intervalLengthTotal = 1;
The list below depicts a typical trace file for the default CloudSim LRMMT simulation. It contains the output of the CloudSim reporting class, i.e. Helper(). It also includes some of the additional metrics added for the purposes of this research (e.g. Overutilized Hosts and Total Workload):
• Experiment name: default_lr_mmt_1.2
• Number of hosts: 80
• Number of VMs: 105
• Total simulation time: 86100.00 sec
• Energy consumption: 16.76 kWh
• Overutilized Hosts: 2249
• Number of VM migrations: 2305
• Total Workload: 3833.310000
• SLA: 0.00428%
• SLA perf degradation due to migration: 0.07%
• SLA time per active host: 5.97%
• Overall SLA violation: 0.53%
• Average SLA violation: 11.61%
• SLA time per host: 0.05%
• Number of host shutdowns: 1184
• Mean time before a host shutdown: 627.78 sec
• StDev time before a host shutdown: 1443.06 sec
• Mean time before a VM migration: 17.12 sec
• StDev time before a VM migration: 7.67 sec
4.6 Conclusion
This chapter has discussed the modifications required to CloudSim to implement dynamic
adjustment of the monitoring interval. A description of how the new code integrates with the
default code provided by CloudSim was also provided. In Chapter 5 the tests carried out to
compare the default with the dynamic simulations are described and the results analysed. Finally, potential opportunities for improvement of the CloudSim framework, identified during the course of this research, are suggested.
Chapter 5 Tests, Results & Evaluation
Introduction
Chapter 4 detailed the code changes that were required in the CloudSim framework to
implement dynamic adjustment of the monitoring interval. The new code integrates
seamlessly with the existing framework. No alterations were made to the underlying
CloudSim architecture. This chapter deals with the specifics of the simulations carried out to
test the hypothesis that opportunities for reduction of power consumption can be identified
when the length of the interval changes with respect to the varying workload experienced by
a typical DC.
5.1 Tests & Results
Using the dynamic PlanetLab cloudlet files and interval adjustment code, the simulation was
run for the full duration (i.e. 86100 seconds) and compared with the CloudSim default which
used the cloudlet files generated by the C# calculator. Key results are presented in Table 5.1, which shows a significant reduction in over-utilized hosts, migrations and power consumption.
Table 5.1: Simulation Results

            Interval (seconds)   Time (seconds)   Interval Count   Over-utilized Hosts   Migration Count   Power Consumption (kWh)
Static      300                  86100            287              2249                  2305              16.76
Dynamic     30 - 270             86100            727              1697                  979               8.23
Figure 14 depicts the intervals that were calculated during the dynamic simulation based on
the average CPU utilization for the DC. It can be seen that the interval ranges from a
minimum of 30 seconds to a maximum of 270 seconds - indicating that the average CPU
utilization for the DC did not exceed 90% nor drop below 10%. From Figure 14, and as
described in Chapter 4, the number of intervals in the dynamic simulation is 727, compared
with 288 in the static simulation.
Figure 14 Interval Calculation for the Dynamic Simulation
Figure 15 shows a comparison of the VM count during both simulations - indicating that
VMs are being constantly ‘de-commissioned’ as their workloads are completed during the
simulation. There are 34 VMs still running at the end of the default simulation, whereas only 6 VMs have not completed their workloads at the end of the dynamic simulation. This indicates that the VM placement algorithm has performed more efficiently in the dynamic simulation, i.e. more of the PlanetLab workload from the cloudlet files has been completed by the time the dynamic simulation finishes.
Figure 15 VM decommissioning comparison
Comparing Figures 16 & 17, which depict the operational hosts at each interval in the default
and dynamic simulations, it can be seen that more efficient use is made of the hosts when the
interval is adjusted dynamically. A minimal number of operational servers is achieved sooner
in the dynamic simulation and the power-on / power-off behaviour of the default simulation
(which consumes both time and energy) is primarily absent from the dynamic. This is
discussed further in Section 5.2.1 below.
Figure 16 Operational Hosts - Default Simulation
Figure 17 Operational Hosts - Dynamic Simulation
Figure 18 depicts the per-interval average CPU utilization in the DC for the dynamic
simulation. A cluster of values can be seen at approximately 99% between 17 – 19 hours in
the dynamic simulation. There is a single operational host in this time period (Figure 17) with
a range of 9 - 19 VMs running on it. The high average CPU utilization is a direct result of
all the remaining VMs being placed on this single host. The VM placement algorithm is most
efficient at this point in the dynamic simulation from a power consumption perspective,
optimizing energy efficiency by minimizing the number of operational hosts required to
service the DC workload. This placement configuration would not be possible if the
PlanetLab cloudlet workloads were higher. It is the relatively low CPU values being
allocated to the VMs from the cloudlet files that make the placement on a single host in this
time period possible.
Figure 18 Average CPU Utilization - Dynamic Simulation
5.2 Evaluation of Test Results
In Section 5.2 the results summarized above are investigated, based on an understanding of
CloudSim as derived from code, code comments and help documentation / user forums. It
explains the research findings using an ‘under-the-hood’ analysis. Section 5.3 identifies a
number of limitations in the CloudSim framework which would benefit from further
investigation and is tailored more towards future researchers using the CloudSim framework
than DC operators.
5.2.1 Why are the Over-Utilized Hosts / Migration Counts Reduced?
CloudSim performs migrations when an over-utilized host is identified. One or more VMs are
chosen for migration to bring the CPU utilization of the host back below the over-utilization
threshold. It is not explicitly clear from the CloudSim metrics available why the migration
count is reduced in dynamic mode relative to static mode. The VM placement algorithm
defined by CloudSim is a complex module within the framework. The logic upon which it
works is that the most appropriate destination for the migrating VM is the one which results
in the lowest power consumption post-migration. However, it is clear from the operational
hosts observed in the dynamic simulation (depicted in Figure 17) that the VM placement
algorithm is also performing consolidation (c.f. Sections 2.4.1, 3.3 & 5.1 above).
During the period when only 1 host is operational in the dynamic simulation (Figure
17: 17 – 24 hours) it was observed that there were as many as 19 VMs running on that host
(c.f. Section 5.1). As a result, some over-allocation is occurring. Over-allocation is when so
many VMs are placed on a host than the host has insufficient capacity to service the workload
of every VM at each time frame. In the effort to consolidate, VMs will sometimes be placed
on a host which is currently running, rather than switching on a new host. The effect is that,
due to the increased length of the CPU queue on the host (i.e. more VMs are ‘waiting’ for
processing time slices), some VMs will not receive the CPU cycles required to complete their
workload in the available interval. The expected action would be migration of the ‘starved’
VMs but it is evident (from Figure 17) that no migrations are taking place i.e. no other host is
switched on. This is due to one of the limitations identified in the framework - that CloudSim
only performs a VM migration when the entire host is over-utilized – not when an individual
VM requires more capacity (c.f. Section 5.2.7 below). Clearly, there is a trade-off between
consolidation and migration. The conclusion reached, based on the results observed, is that the reduced intervals in the dynamic simulation result in more frequent analysis. This handles the consolidation / migration trade-off more efficiently than the default simulation, resulting in fewer over-utilized hosts and a reduced migration count.
5.2.2 Result of Reduced Migration Count
Beloglazov et al. [51] show that decreased power consumption can be achieved in a DC if the
VM migration count can be reduced. Their work is based on the premise that additional
resources are consumed during a migration due to the extra processing required to move the
memory of the VM from its current host to another. Those processes may include:
• Identification of a suitable destination server i.e. VM placement algorithm
• Network traffic
• CPU processing on both source and destination servers whilst concurrently running two VMs
In the case of live migration, transfer of the VM’s memory image is performed by the Virtual
Machine Manager (VMM) which copies the RAM, associated with the VM service, across to
the destination while the service on the VM is still running. RAM which is re-written on the
source must be transferred again. This process continues iteratively until the remaining
volume of RAM needing to be transferred is such that the service can be switched off with
minimal interruption. This period of time, while the service is unavailable, is known as
downtime. Any attempt to improve migration algorithms must take live-copy downtime into
consideration to prevent (or minimize) violations of response-time SLAs. CloudSim achieves this (to some
extent) by choosing, for migration, the VM with the lowest RAM. However, the CloudSim
SLA metric in the modules used for this research does not take this downtime into
consideration.
Dynamic adjustment of the monitoring interval, however, minimizes this issue of
RAM transfer by reducing the need for the migration in the first place. The power consumed
as a result of the migrations is saved when additional migrations are not required.
5.2.3 Scalability
As outlined in Section 2.1 there is a trade-off between DC monitoring overhead costs and net
DC benefits. The issue here is that the additional volume of processing which takes place
when shorter monitoring intervals are applied may become such that it would not be
beneficial to apply dynamic interval adjustment at all.
Take, for example, Amazon’s EC2 EU West DC (located in Dublin, Ireland), which is estimated to contain over 52,000 operational servers [52]. Processing of the data (CPU utilization values) required to perform the interval adjustment is not an insignificant additional workload. The algorithm must calculate the average CPU utilization of some 52,000 servers and apply the new interval. As such, if this calculation were to take place every 30 seconds (in a DC with an average CPU utilization above 90%), rather than every 300 seconds, there would be a ten-fold increase in the total processing volume, which includes both collection and analysis of the data points. While it is unlikely that even the average CPU
utilization of the most efficient DC would exceed 90% for any extended period of time, it is
clear that the size of the DC (i.e. number of operational servers) does play a role in
establishing whether or not the interval adjustment algorithm described in this research
should be applied. Microsoft’s Chicago DC has approximately 140,000 servers installed.
With increasingly larger DCs being built to meet growing consumer demand, it is reasonable
to expect that DC server counts will reach 500,000 in the foreseeable future. Rather than
viewing the entire DC as a single entity from a monitoring perspective, perhaps the most
viable application of dynamic monitoring interval adjustment would be to sub-divide these
larger DCs into more manageable sections, calculating the monitoring interval for each
section separately.
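To put this in concrete terms (using the figures already quoted): at the default 300-second interval, a 52,000-server DC generates 52,000 × 288 ≈ 15 million CPU utilization samples per day for the monitoring process to collect and analyse; at a 30-second interval this rises to 52,000 × 2,880 ≈ 150 million samples per day.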
5.3 Evaluation of CloudSim
5.3.1 Local Regression Sliding Window
The adjusted interval in this research (discussed in Chapter 3, Section 5) results in the 'first fill' of the window occurring sooner than in the CloudSim default, i.e. the longest interval in the dynamic version (5 minutes) equals the static interval used by the CloudSim default. The first 10 values in the sliding window take 3000 seconds (i.e. 300 x 10) to gather in the default CloudSim simulation whereas, in the dynamic version, the window is filled after 1480 seconds. The result is a small increase in the accuracy of the utilization prediction at the beginning of the simulation because the less accurate ‘fallback’ algorithm is ‘discarded’ sooner.
The size of the sliding window in the default CloudSim framework is 10 i.e. the 10 most
recent CPU utilization values from the host are used each time the local regression algorithm
is performed. If there were more values in the window, the algorithm would be less sensitive
to short-term changes in the workload. Clearly the size of the sliding window should be
proportionate to the level of sensitivity required. The choice of this parameter would most
likely benefit from detailed sensitivity analysis.
5.3.2 RAM
Chapter 3 detailed the configuration settings provided by default in CloudSim in an effort to simulate ‘real-world’ VMs. However, for reasons unclear from the code, two of the VM types have the same RAM applied to them, i.e. 1740 MB. It would be preferable if either:
• four distinct VM types were configured, to better reflect ‘real-world’ scenarios and improve the VM selection policy deployed by default, or
• provision for dynamic RAM adjustment were included in the CloudSim framework (c.f. Section 5.3.3 below).
5.3.3 Dynamic RAM Adjustment
No facility is provided in CloudSim to adjust the amount of RAM available to a VM while
the simulation proceeds. A migration to another host with a higher-capacity VM is required
in CloudSim should a VM require more RAM. While this simulates many ‘real-world’
systems, the facility to dynamically adjust the amount of RAM allocated to a VM (without
requiring migration) would improve the VM selection algorithm.
5.3.4 SLA-based Migration
The basis upon which a migration takes place in the CloudSim module used for this research
is an over-utilized host. If a VM requires additional RAM to service its workload it must be
migrated to another host where a larger VM can be configured to run the workload. However
the module does not facilitate SLA-based migration. Rather, only VMs on a host which is
over-utilized are migrated. This is a significant limitation in the design of CloudSim. Even in this scenario, the VMs which need additional RAM may not be the ones migrated, because the algorithm for choosing the VMs to migrate selects first the VM with the fewest RAM pages requiring migration. This will typically leave ‘starved’ VMs on the source host - still
requiring additional RAM. Clearly, some improvement is required in the VM selection and
allocation policies deployed by the default CloudSim framework.
Chapter 6 Conclusions
In order to identify a potentially novel energy efficiency approach for virtualized DCs, a large
part of the research effort in this thesis was dedicated to evaluating the current state-of-the-
art. On completion of this investigative phase it was decided to focus on opportunities
relating to DC management software i.e. virtualization. Following this, the concept of a
dynamic monitoring interval was then proposed. Once CloudSim had been identified as the
most accessible framework in which to build a test bed, a significant amount of time was
spent reviewing the existing code to establish the capabilities (and limitations) of the
framework.
The dynamic simulation presented in this thesis is differentiated from the default
LR/MMT CloudSim in that the duration of the next interval is adjusted (with respect to the
weighted average of the DC CPU utilization) rather than maintaining a static interval of 300 seconds, which is the standard monitoring interval used in commercial applications (e.g. VMware, Citrix Xen).
The primary aim of this research (as outlined in the introductory chapter) was to
determine the impact on power consumption of dynamically adjusting the monitoring
interval. Analysis of DC metrics is performed more often suggesting that the DC is more
sensitive to changes in CPU utilization. The focus of this research was the over-utilization
threshold. In calculating the average CPU utilization for the DC, shorter intervals are applied
as the average utilization rate increases. Results indicated that power consumption could be
reduced when the monitoring interval is adjusted with respect to the incoming workload. As
indicated, future work should also examine the potential for reduced power consumption as
the average CPU utilization for the DC approaches some under-utilization threshold. This
would improve the CloudSim VM placement algorithm, providing a more accurate
simulation of the server consolidation efforts used by industry.
In addition, this research had a secondary objective – to evaluate the efficacy of
CloudSim as a simulator for power-aware DCs. During the course of reviewing existing code
and writing new modules the specific issues (outlined above) were found to exist in the
CloudSim framework. Discovery and documentation of them in this thesis will undoubtedly
prove both informative and useful for researchers undertaking CloudSim-based simulations in
the future.
Recent reports suggest that Microsoft’s Chicago DC has approximately 140,000
servers installed [53]. With increasingly larger DCs being built to meet growing consumer
demand, it is reasonable to expect that individual DC server counts will reach 250,000 -
500,000 in the foreseeable future. Rather than viewing the entire DC as a single entity from a
monitoring perspective, perhaps the most viable application of dynamic monitoring interval
adjustment would be to sub-divide these larger DCs into more manageable sections,
calculating (and adjusting) the monitoring interval for each section separately – ensuring that
the granularity of analysis most appropriately caters for all possible data center sizes and
configurations. This analysis would also make a valuable contribution to the state-of-the-art.
REFERENCES
[1] http://www.gartner.com/newsroom/id/499090 - last accessed on 19/09/2014
[2] http://www.idc.com - last accessed on 19/09/2014
[3] Koomey J.G., “Estimating Total Power Consumption by Servers in the U.S. and the
World”, 2007
[4] Energy Star Program - U.S. Environmental Protection Agency, “EPA Report to Congress
on Server and Data Center Energy Efficiency”, EPA, Aug 2007.
[5] N. Rasmussen, “Calculating Total Cooling Requirements for Data Centers,” American
Power Conversion, White Paper #25, 2007.
[6] Sharma, R.K.; Bash, C.E.; Patel, C.D.; Friedrich, R.J.; Chase, J.S., "Balance of Power:
Dynamic Thermal Management for Internet Data Centers," Internet Computing, IEEE, vol.9,
no.1, pp. 42-49, January 2005
[7] Moore, J.; Sharma, R.; Shih, R.; Chase, J.; Patel, C.; Ranganathan, P., “Going Beyond
CPUs: The Potential of Temperature-Aware Solutions for the Data Center”, Hewlett Packard
Labs, 2002
[8] Data Center Efficiency Task Force, “Recommendations for Measuring and Reporting
Version 2 – Measuring PUE for Data Centers”, 17th May 2011
[9] C. Belady, A. Rawson, J. Pfleuger, and T. Cader, "Green Grid Data Center Power
Efficiency Metrics: PUE and DCIE," The Green Grid, 2008
[10] Koomey J.G., “Growth in Data Center Electricity Use 2005 to 2010”, report by
Analytics Press, completed at the request of The New York Times, August 2011
[11] The Uptime Institute, “Inaugural Annual Uptime Institute Data Center Industry Survey”,
Uptime Institute, May 2011
[12] ASHRAE, “Datacom Equipment Power Trends and Cooling Applications”, ASHRAE
INC, 2005
[13] ASHRAE, "Environmental Guidelines for Datacom Equipment - Expanding the
Recommended Environmental Envelope", ASHRAE INC, 2008
[14] ASHRAE, “Thermal Guidelines for Data Processing Environments – Expanded Data
Center Classes and Usage Guidance”, ASHRAE INC, August 2011
[15] Boucher, T.D.; Auslander, D.M.; Bash, C.E.; Federspiel, C.C.; Patel, C.D., "Viability of
Dynamic Cooling Control in a Data Center Environment," Thermal and Thermomechanical
Phenomena in Electronic Systems, 2004. ITHERM '04. The Ninth Intersociety Conference
on, pp. 593- 600 Vol. 1, 1-4 June 2004
[16] Sharma, R.K.; Bash, C.E.; Patel, C.D.; Friedrich, R.J.; Chase, J.S.; , "Balance of Power:
Dynamic Thermal Management for Internet Data Centers," Internet Computing, IEEE , vol.9,
no.1, pp. 42- 49, Jan.-Feb. 2005
[17] Shah, A.; Patel, C.; Bash, C.; Sharma, R.; Shih, R.; , "Impact of Rack-level Compaction
on the Data Center Cooling Ensemble," Thermal and Thermomechanical Phenomena in
Electronic Systems, 2008. ITHERM 2008. 11th Intersociety Conference on, pp.1175-1182,
28-31 May 2008
[18] C. Patel, et al., “Energy Flow in the Information Technology Stack: Coefficient of
Performance of the Ensemble and its Impact on the Total Cost of Ownership,” Technical
Report No. HPL-2006-55, Hewlett Packard Laboratories, March 2006
[19] C. Patel, et al., “Energy Flow in the Information Technology Stack: Introducing the
Coefficient of Performance of the Ensemble,” Proc. ASME IMECE, November 2006
[20] Ahuja, N.; Rego, C.; Ahuja, S.; Warner, M.; Docca, A.; "Data Center Efficiency with
Higher Ambient Temperatures and Optimized Cooling Control," Semiconductor Thermal
Measurement and Management Symposium (SEMI-THERM), 2011 27th Annual IEEE,
pp.105-109, 20-24 March 2011
[21] Berktold, M.; Tian, T., “CPU Monitoring With DTS/PECI”, Intel Corporation,
September 2010
[22] M. Stopar, SLA@SOI XLAB, Efficient Distribution of Virtual Machines, March 24,
2011.
[23] C. Hyser, B. McKee, R. Gardner, and B. Watson. Autonomic Virtual Machine
Placement in the Data Center. Technical Report HPL-2007-189, HP Laboratories, Feb. 2008.
[24] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs, “Cutting the Electric
Bill for Internet-Scale Systems,” in Proc. ACM Conference on Data Communication
(SIGCOMM’09), New York, NY, USA, 2009, pp. 123–134
[25] Bolin Hu; Zhou Lei; Yu Lei; Dong Xu; Jiandun Li; , "A Time-Series Based Precopy
Approach for Live Migration of Virtual Machines," Parallel and Distributed Systems
(ICPADS), 2011 IEEE 17th International Conference on , vol., no., pp.947-952, 7-9 Dec.
2011
[26] Carroll, R, Balasubramaniam, S, Botvich, D and Donnelly, W, Application of Genetic
Algorithm to Maximise Clean Energy usage for Data Centers, to appear in proceedings of
Bionetics 2010, Boston, December 2010
[27] Akshat Verma, Puneet Ahuja, Anindya Neogi, “pMapper: Power and Migration Cost
Aware Application Placement in Virtualized Systems”, Middleware 2008: 243-264
[28] P. Riteau, C. Morin, T. Priol, “Shrinker: Efficient Wide-Area Live Virtual Machine
Migration using Distributed Content-Based Addressing,”
http://hal.inria.fr/docs/00/45/47/27/PDF/RR-7198.pdf, 2010
[29] F. Travostino, P. Daspit, L. Gommans, C. Jog, C. de Laat, J. Mambretti, I. Monga, B.
van Oudenaarde, S. Raghunath, and P. Wang. Seamless Live Migration of Virtual Machines
over the MAN/WAN. iGrid, 2006
[30] Hai Jin, Li Deng, Song Wu, Xuanhua Shi, and Xiaodong Pan. Live virtual machine
migration with adaptive memory compression. In Cluster, 2009
[31] Jonghyun Lee, MarianneWinslett, Xiaosong Ma, and Shengke Yu. Enhancing Data
Migration Performance via Parallel Data Compression. In Proceedings of the 16th
International Parallel and Distributed Processing Symposium (IPDPS), pages 47–54, April
2002
[32] M. R. Hines and K. Gopalan, “Post-copy based live virtual machine migration using
adaptive pre-paging and dynamic self-ballooning,” in Proceedings of the ACM/Usenix
international conference on Virtual execution environments (VEE’09), 2009, pp. 51–60
[33] Bose, S.K.; Brock, S.; Skeoch, R.; Rao, S.; , "CloudSpider: Combining Replication with
Scheduling for Optimizing Live Migration of Virtual Machines across Wide Area
Networks," Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM
International Symposium on , vol., no., pp.13-22, 23-26 May 2011
[34] Cioara, T.; Anghel, I.; Salomie, I.; Copil, G.; Moldovan, D.; Kipp, A.; , "Energy Aware
Dynamic Resource Consolidation Algorithm for Virtualized Service Centers Based on
Reinforcement Learning," Parallel and Distributed Computing (ISPDC), 2011 10th
International Symposium on , vol., no., pp.163-169, 6-8 July 2011
[35] H. Liu, H. Jin, X. Liao, L. Hu, and C. Yu, “Live migration of virtual machine based on
full system trace and replay,” in Proceedings of the 18th International Symposium on High
Performance Distributed Computing (HPDC’09), 2009, pp. 101–110.
[36] http://www.drbd.org - last accessed on 19/09/2014
[37] K. Nagin, D. Hadas, Z. Dubitzky, A. Glikson, I. Loy, B. Rochwerger, and L. Schour,
“Inter-Cloud Mobility of Virtual Machines,” in Proc. of 4th Int’l Conf. on Systems & Storage
(SYSTOR). ACM, 2011, pp. 3:1–3:12.
[38] R. Bradford, E. Kotsovinos, A. Feldmann, and H. Schiöberg. Live Wide-Area
Migration of Virtual Machines including Local Persistent State. In VEE '07: Proceedings of
the 3rd international conference on Virtual execution environments, pages 169-179, New
York, NY, USA, 2007. ACM.
[39] C. Clark, K. Fraser, A. Hand, J. Hansen, E. Jul, C. Limpach, I. Pratt, A. Warfield. Live
Migration of Virtual Machines. in Proceedings of the Symposium on Networked Systems
Design and Implementation, 2005.
[40] VMware vSphere® vMotion®, Architecture, Performance and Best Practices in
VMware vSphere® 5. Performance Study, Technical White Paper, Oct 2011.
[41] Voorsluys W., Broberg J., Venugopal S., Buyya R.: Cost of Virtual Machine Live
Migration in Clouds: a Performance Evaluation. In: Proceedings of the 1st International
Conference on Cloud Computing. Vol. 2009. Springer (2009)
[42] Buyya, R., Ranjan, R., Calheiros, R. N.: Modeling and Simulation of Scalable Cloud
Computing Environments and the CloudSim Toolkit: Challenges and Opportunities. In: High
Performance Computing & Simulation, 2009. HPCS'09. International Conference on, pp. 1-
11. IEEE (2009)
[43] Xu, Y., Sekiya, Y.: Scheme of Resource Optimization using VM Migration for
Federated Cloud. In: Proceedings of the Asia-Pacific Advanced Network, vol. 32, pp. 36-44.
(2011)
[44] Takeda, S., and Toshinori T.: A Rank-Based VM Consolidation Method for Power
Saving in Data Centers. IPSJ Online Transactions, vol. 3 pp. 88-96. J-STAGE (2010)
[45] Xu, L., Chen, W., Wang, Z., Yang, S.: Smart-DRS: A Strategy of Dynamic Resource
Scheduling in Cloud Data Center. In: Cluster Computing Workshops (CLUSTER
WORKSHOPS), IEEE International Conference on, pp. 120-127. IEEE (2012)
[46] Gmach, D., Rolia, J., Cherkasova, L., Kemper, A.: Workload Analysis and Demand
Prediction of Enterprise Data Center Applications. In: Workload Characterization, 2007.
IISWC 2007. IEEE 10th International Symposium on, pp. 171-180. IEEE (2007)
[47] VMware, http://pubs.vmware.com/vsphere-4-esx-
vcenter/index.jsp?topic=/com.vmware.vsphere.bsa.doc_40/vc_perfcharts_help/c_perfcharts_c
ollection_intervals.html - last accessed on 19/09/2014
[48] Chandra, A., W. Gong, et al. (2003). Dynamic Resource Allocation for Shared Data
Centers Using Online Measurements. Proceedings of the Eleventh International Workshop on
Quality of Service (IWQoS 2003), Berkeley, Monterey, CA, Springer. pp. 381-400.
[49] M. Aron, P. Druschel, and S. Iyer. A Resource Management Framework for Predictable
Quality of Service in Web Servers, 2001.
[50] J. Carlstrom and R. Rom. Application-Aware Admission Control and Scheduling in Web
Servers. In Proceedings of the IEEE Infocom 2002, June 2002.
[51] Beloglazov, A., and Rajkumar B.: Energy Efficient Resource Management in
Virtualized Cloud Data Centers. Proceedings of the 2010 10th IEEE/ACM International
Conference on Cluster, Cloud and Grid Computing. IEEE Computer Society, 2010.
[52] http://huanliu.wordpress.com/2012/03/13/amazon-data-center-size - last accessed on
19/09/2014
[53] http://www.datacenterknowledge.com/archives/2009/06/29/microsoft-to-open-two-
massive-data-centers - last accessed on 19/09/2014
APPENDIX A
The code which checks if a host is over-utilized. The ‘fallback’ algorithm is used until the
sliding window (length = 10) has been filled.
@Override
protected boolean isHostOverUtilized(PowerHost host)
{
    PowerHostUtilizationHistory _host = (PowerHostUtilizationHistory) host;
    double[] utilizationHistory = _host.getUtilizationHistory();
    int length = 10;
    if (utilizationHistory.length < length)
    {
        return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
    }
    double[] utilizationHistoryReversed = new double[length];
    for (int i = 0; i < length; i++)
    {
        utilizationHistoryReversed[i] = utilizationHistory[length - i - 1];
    }
    double[] estimates = null;
    try
    {
        estimates = getParameterEstimates(utilizationHistoryReversed);
    }
    catch (IllegalArgumentException e)
    {
        return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
    }
    double migrationIntervals = Math.ceil(getMaximumVmMigrationTime(_host) / Constants.SCHEDULING_INTERVAL);
    double predictedUtilization = estimates[0] + estimates[1] * (length + migrationIntervals);
    predictedUtilization *= getSafetyParameter();
    addHistoryEntry(host, predictedUtilization);
    if(predictedUtilization >= 1)
    {
        Constants.OverUtilizedHostsThisInterval++;
    }
    return predictedUtilization >= 1;
}
APPENDIX B
The main() method of the LRMMT algorithm.
public static void main(String[] args) throws IOException
{
    boolean enableOutput = true;
    boolean outputToFile = true;
    String inputFolder = "C:UsersscoobyDesktopEclipseCloudSimexamplesworkloadplanetlab";
    //default workload generated from dynamic averages
    String workload = "default";
    //dynamic workload
    if(!Constants.IsDefault)
    {
        workload = "dynamic"; // PlanetLab workload
    }
    if(Constants.IntervalGenerator)
    {
        Constants.SIMULATION_LIMIT = 86400;
    }
    String outputFolder = "C:UsersscoobyDesktopEclipseWorkspaceoutput";
    String vmAllocationPolicy = "lr"; // Local Regression (LR) VM allocation policy
    String vmSelectionPolicy = "mmt"; // Minimum Migration Time (MMT) VM selection policy
    String parameter = "1.2"; // the safety parameter of the LR policy
    new PlanetLabRunner(enableOutput, outputToFile, inputFolder, outputFolder, workload,
            vmAllocationPolicy, vmSelectionPolicy, parameter);
}

The Impact of Dynamic Monitoring Interval Adjustment on Power Consumption in Virtualized Data Centers

  • 1.
    The Impact ofDynamic Monitoring Interval Adjustment on Power Consumption in Virtualized Data Centers Mark White BSc., H.Dip Submitted in accordance with the requirements for the degree of Masters of Science in Computer Science and Information Technology Discipline of Information Technology, College of Engineering and Informatics National University of Ireland, Galway Research Supervisors: Dr. Hugh Melvin, Dr. Michael Schukat Research Director: Prof. Gerard Lyons September 2014 The candidate confirms that the work submitted is his own and that appropriate credit has been given where reference has been made to the work of others
  • 3.
    Contents Chapter 1 Introduction..................................................................................................................1 1.1 The Hybrid Cloud...........................................................................................................1 1.2 Migration...................................................................................................................... 2 1.3 Energy Efficiency............................................................................................................ 2 1.4 Cooling.......................................................................................................................... 4 1.5 Research Objectives.......................................................................................................4 1.5.1 Hypothesis............................................................................................................. 4 1.5.2 CloudSim................................................................................................................ 5 1.5.3 Methodology..........................................................................................................6 1.6 Conclusion..................................................................................................................... 6 Chapter 2 Literature Review..........................................................................................................8 Introduction............................................................................................................................. 8 2.1 Performance versus Power............................................................................................. 8 2.2 Increased Density ..........................................................................................................9 2.3 Hardware.................................................................................................................... 12 2.3.1 Uninterruptible Power Supply (UPS) & Power Distribution...................................... 12 2.3.2 Servers, Storage Devices & Network Equipment..................................................... 13 2.3.3 Cooling................................................................................................................ 13 2.3.4 Industry Standards & Guidelines............................................................................ 15 2.3.5 Three Seminal Papers ........................................................................................... 17 2.4 Software ..................................................................................................................... 24 2.4.1 Virtualization........................................................................................................ 24 2.4.2 Migration............................................................................................................. 25 2.5 Monitoring Interval...................................................................................................... 34 2.5.1 Static Monitoring Interval ..................................................................................... 36 2.5.2 Dynamic Monitoring Interval................................................................................. 37 2.6 Conclusion................................................................................................................... 
38 Chapter 3 CloudSim.................................................................................................................... 39 Introduction........................................................................................................................... 39 3.1 Overview..................................................................................................................... 39 3.2 Workload.................................................................................................................... 40 3.3 Capacity...................................................................................................................... 42
  • 4.
    3.4 Local Regression/ Minimum Migration Time (LR / MMT) ............................................... 44 3.5 Selection Policy – Local Regression(LR)......................................................................... 44 3.6 Allocation Policy – Minimum Migration Time (MMT) ..................................................... 45 3.7 Default LRMMT ........................................................................................................... 45 3.7.1 init()………………………………………………………. .............................................................. 45 3.7.2 start() .................................................................................................................. 46 3.8 Over-utilization............................................................................................................ 48 3.9 Migration.................................................................................................................... 50 3.10 Reporting.................................................................................................................... 52 3.11 Conclusion................................................................................................................... 52 Chapter 4 Implementation.......................................................................................................... 54 Introduction........................................................................................................................... 54 4.1 Interval Adjustment Algorithm...................................................................................... 54 4.2 Comparable Workloads................................................................................................ 58 4.3 C# Calculator............................................................................................................... 61 4.4 Interval Adjustment Code............................................................................................. 64 4.5 Reporting.................................................................................................................... 69 4.6 Conclusion................................................................................................................... 70 Chapter 5 Tests, Results & Evaluation.......................................................................................... 71 Introduction........................................................................................................................... 71 5.1 Tests & Results ............................................................................................................ 71 5.2 Evaluation of Test Results............................................................................................. 75 5.2.1 Why are the Over-Utilized Hosts / Migration Counts Reduced?............................... 76 5.2.2 Result of Reduced Migration Count ....................................................................... 77 5.2.3 Scalability............................................................................................................. 77 5.3 Evaluation of CloudSim ................................................................................................ 78 5.3.1 Local Regression Sliding Window........................................................................... 78 5.3.2 RAM.................................................................................................................... 
79 5.3.3 Dynamic RAMAdjustment .................................................................................... 79 5.3.4 SLA-based Migration............................................................................................. 79 Chapter 6 Conclusions ................................................................................................................ 81 REFERENCES............................................................................................................................... 83 APPENDIXA................................................................................................................................ 89 APPENDIX B................................................................................................................................ 90
List of Figures
Figure 1 Data Center Service Supply Chain.................................................................................... 3
Figure 2 Relative contributions to the thermal output of a typical DC............................................. 12
Figure 3 A Typical AHU Direct Expansion (DX) Cooling System ................................................. 14
Figure 4 A Typical DC Air Flow System...................................................................................... 15
Figure 5 Performance of web server during live migration (C. Clark)............................................. 30
Figure 6 Pre-Copy algorithm ....................................................................................................... 32
Figure 7 CloudSim Architecture .................................................................................................. 41
Figure 8 Flow Chart Depicting the LR / MMT simulation process ................................................. 48
Figure 9 Application of the Monitoring Interval Based on Weighted Utilization Average................ 57
Figure 10 A screenshot of the data generated for calculation of the default workload ...................... 59
Figure 11 Intervals calculated during the dynamic simulation ........................................................ 59
Figure 12 Calculation of the Average CPU Utilization for the Default Files.................................... 60
Figure 13 How the dynamic interval adjustment code interacts with CloudSim............................... 66
Figure 14 Interval Calculation for the Dynamic Simulation ........................................................... 72
Figure 15 VM decommissioning comparison................................................................................ 73
Figure 16 Operational Hosts - Default Simulation......................................................................... 74
Figure 17 Operational Hosts - Dynamic Simulation ...................................................................... 74
Figure 18 Average CPU Utilization - Dynamic Simulation............................................................ 75
Acknowledgements
My supervisor, Dr. Hugh Melvin, who identified at an early stage that (for the most part) I could be left to my own devices to get on with the work required. His supervisory approach resulted in the freedom to progress at my own pace, knowing he was available as and when I needed a ‘boost’. When clarity was requested, Hugh demonstrated an enviable ability to extract the salient issue and point me in the right direction. Although typically performed cycling up a steep hill on his way home from work, the momentary pauses during review meetings while he reflected on the issues were often more productive than hours of reading code. Future students should be so lucky to have him oversee their research endeavours.
My second supervisor, Dr. Michael Schukat, who is capable of clarifying a complicated issue with a carefully worded question – followed (invariably) by a reassuring smile.
Dr. Ripduman Sohan and Dr. Sherif Akoush in the Computing Laboratory at Cambridge University, without whom I would not have identified the approach taken in this thesis. Over the course of a few (all too brief) visits with them, I also became aware of the extent of my intellectual abilities (and limitations!!!).
The principal author of the CloudSim framework, Dr. Anton Beloglazov. Despite his having moved on from the University of Melbourne, where he wrote CloudSim for his doctoral thesis, his detailed responses to the countless queries I posed during the course of my research were invaluable and generous to a fault.
My colleagues in the Discipline of IT at NUI Galway for timely coffee-breaks, lunch invitations and encounters in the corridors – because the breaks are a vital constituent of the work, and the queries as to progress and words of support were more important than you could possibly have imagined.
My parents, who repeatedly remind me that: ‘You are capable of anything you put your mind to’.
Deirdre (Dee) O’ Connor – if convention allowed, your name would be on the title page!
Abstract
Virtualization is one of the principal data center technologies increasingly deployed in recent years to meet the challenges of escalating costs, industry standards and the search for a competitive edge. This thesis presents a novel approach to management of the virtualized system which dynamically adjusts the monitoring interval with respect to the average CPU utilization of the data center. The potential for reduced power consumption, by identifying performance opportunities at an earlier stage than typical virtualized systems which use a static interval, is analysed. It is proposed that the adjusted interval will result in analysis of data center metrics being performed at a more appropriate level of granularity than current static monitoring systems allow.
Chapter 1 Introduction
The availability of cloud-based Data Centers (DCs) in recent years has introduced significant opportunities for enterprises to reduce costs. The initial Capital Expenditure (CapEx) associated with setting up a DC has been prohibitively high in the past, but this may no longer be the primary concern. For example, start-ups choosing to implement Infrastructure-as-a-Service (IaaS) cloud architectures are free to focus on optimizing other aspects of the business rather than worrying about raising the capital to build (and maintain) fully equipped DCs. A young enterprise can now pay a relatively small monthly fee to Amazon (EC2) or Microsoft (Azure), for example, in return for a scalable infrastructure on which to build their new product or service. Existing companies are also availing of significant savings and opportunities by moving to the cloud.
1.1 The Hybrid Cloud
In the future the architecture of cloud computing infrastructure will facilitate a business moving the public portion of their services from one remote DC to another for cost or efficiency gains. For example, a DC provider in one US state may be charging less for compute time because energy costs in that state are lower than those in a neighbouring state. Migration of enterprise services to the less expensive location could be facilitated. To enable this type of migratory activity, the Distributed Management Task Force (DMTF) has created the Open Virtualization Format (OVF) specification. The OVF standard “provides an intermediary format for Virtual Machine (VM) images. It lets an organization create a VM instance on top of one hypervisor and then export it to the OVF so that it can be run by another hypervisor” [4]. With the exception of Amazon, all the major cloud providers (Citrix Systems, IBM, Microsoft, Oracle and VMware) are involved in the development of OVF.
The short and medium term solution to the interoperability issue will certainly be ‘hybrid’ clouds, where the enterprise maintains the private portion of their infrastructure on their local network and the public portion is hosted on a federated cloud - facilitating indirect (but not direct) movement between providers. For example, in a similar fashion to switching broadband providers, a software development company may initially choose to lease a Microsoft data center for their infrastructure but subsequently transfer to Google if the latter offering
becomes more suitable for their purposes (e.g. proximity to client requests or more energy efficient operation). Development of new products may be performed securely on the enterprise Local Area Network (LAN) and subsequently ‘released’ onto the public cloud for global distribution.
Movement from one provider to another is currently (and for the foreseeable future will be) performed manually by the enterprise administrator using separate management interfaces i.e. an Amazon API or a Microsoft API. The vision of the DMTF is a unified interface known as the Cloud Infrastructure Management Interface (CIMI). It is currently a work-in-progress but ultimately hopes to facilitate direct transfer of data between cloud providers. The core technology upon which this data transfer between providers will be facilitated is virtualization – most specifically, migration of VMs.
1.2 Migration
The practice of regularly migrating services between providers may well become feasible in the future, providing enterprises with significant opportunities to dynamically reduce the energy portion of their Operating Expenditure (OpEx) budget. This would also result in operators becoming more competitive, perhaps, all performance metrics being equal, gaining their edge from increased energy efficiency efforts. Hybrid cloud environments also facilitate smaller IT teams, resulting in reduced staffing costs.
1.3 Energy Efficiency
Data centers currently account for close to 3% of all global energy consumed on an annual basis. It is certain that the industry will continue to expand as increasing volumes of data are generated, transmitted, stored and analysed. This expansion will require significantly more energy than is currently used by the sector, energy which must be managed as responsibly as possible. Energy management, however, is not possible without measurement. The measurement of a DC’s energy efficiency helps staff and management focus on the various subsystems of the operation with a view to improving the overall efficiency of the data center.
While advances in hardware and software continue apace, the DC industry has only recently begun to consider the responsibility of ensuring that the energy it uses is not
wasted. The global economic downturn of 2007 played no small part in motivating DC operators to review their practices. In an attempt to remain competitive, while constantly upgrading infrastructure and services to meet the needs of their customers, data center operators have since identified energy efficiency as a cost opportunity. The moral aspects of managing energy for the future are all well and good. It appears more likely, however, that the potential operational savings in the short to medium term have provided the primary motivation for data center operators to take stock.
In addition to the operational savings achieved when the data center becomes more energy efficient on a daily basis, additional capital savings may also be realized. All items of IT equipment have a replacement interval which may be increased due to redundancies discovered during an energy efficiency audit. For example, should the existing cooling volume of the room be found to be in excess of requirements, additional air handling units (AHUs) could be switched to standby, not only reducing the power consumed by that unit but also increasing the interval before the unit needs to be repaired or replaced.
The amount of power and cooling that a DC uses on a day-to-day basis determines how much irreplaceable fossil fuel it consumes and the quantity of carbon emissions for which it is responsible.
Figure 1 Data Center Service Supply Chain
Within the supply chain of DC services, illustrated in Figure 1, the main emissions occur at the power generation site. Location is a key factor in the CO2 intensity of the power consumed by the data center. A gas- or coal-fired utility creates much more CO2 than a hydro- or wind-powered utility. For this reason, many green-field DCs are now being located near low-cost, environmentally friendly power sources.
1.4 Cooling
Location is also a key factor with respect to cooling. A data center in a cool climate such as Ireland requires less cooling power than a data center in a warmer climate such as Mexico. To avail of climate-related opportunities, large-scale DCs have recently been built in temperate locations such as Dublin (e.g. Google) and Sweden (e.g. Facebook), demonstrating the significance of the cost reductions possible. This being the case, if migration of DC services across Wide Area Networks (WANs) becomes cost-feasible in the future, concepts such as ‘follow the moon’ / ‘follow the sun’ (where the services provided by a DC are moved, across the network, closer to where they are most needed throughout the day) may become prevalent. Migration of data center services across both Local Area Networks (LANs) and WANs is discussed in more detail in Chapter 2.
1.5 Research Objectives
1.5.1 Hypothesis
While the effort to optimize the individual component costs (e.g. downtime) of a migration is worthwhile, this research investigates further opportunities for energy savings if, rather than optimising the individual component costs, a migration is viewed as a single all-encompassing entity and focus is applied to reducing the total number of migrations taking place in a DC.
Throughout a migration both the source and the destination servers are running. Quite apart from the extra CPU processing, RAM access and bandwidth required to achieve a migration, there is an additional energy cost associated with simply keeping both servers simultaneously powered for the duration of the migration. In addition, if the destination server was not already running before the migration was initiated, the time delay in starting it up (as a new host machine) must also be factored into any calculation of efficiency.
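One simple first-order way to express this additional cost (an illustrative formulation introduced here, not a formula taken from the thesis or the cited literature) is:

\[
E_{mig} \approx (P_{src} + P_{dst}) \cdot t_{mig} + E_{boot}
\]

where \(P_{src}\) and \(P_{dst}\) are the average power draws of the source and destination hosts over the migration, \(t_{mig}\) is the migration duration and \(E_{boot}\) is the start-up energy of a destination host that was not already powered on. Reducing the total number of migrations reduces every term at once, which is the motivation for treating each migration as a single entity.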
The principal metric for monitoring the DC workload is CPU utilization, which reflects one of the primary resources associated with servicing that workload. In a virtualized environment CPU utilization is an indication of the processing capacity being used by a host while serving the requirements of the VMs located on it. In current practice, the CPU utilization value delivered to monitoring systems is averaged over a constant monitoring interval (typically 300 seconds). This interval is typically pre-configured (via a management interface) by the data center operator, rendering it static.
With a relatively small percentage of the host's CPU concerned with running the virtualization hypervisor, CPU utilization is primarily dependent on the workload being serviced by the VMs located on the host. This workload typically varies with time as requests to the servers fluctuate outside the DC. As such, the frequency of change of the CPU utilization value closely tracks the frequency of change of the incoming workload.
This thesis investigates the merits of moving from a fixed interval to one which is dynamically adjusted based on the overall CPU utilization average of the DC. At each interval a weighted CPU utilization average for the DC is calculated and the next monitoring interval is adjusted accordingly. By dynamically adjusting the monitoring interval with respect to the average CPU utilization of the DC, this research analyses the potential for reduced power consumption through identification of performance opportunities at an earlier stage than systems which use a static 300 second interval. It is proposed that these performance opportunities would otherwise have remained hidden mid-interval. Calculated on the basis of how ‘busy’ the DC currently is, the adjusted interval is more likely to be at an appropriate level of granularity than its static counterpart. A minimal illustrative sketch of this calculation is provided below.
1.5.2 CloudSim
A secondary objective of this research was to examine the efficacy of the CloudSim framework with respect to simulation of power-aware DCs. Given the lack of access for researchers to ‘real-world’ data center infrastructure, a robust simulator with which to experiment is of paramount importance. CloudSim is one such framework and is currently deployed by many researchers in the field of data center energy efficiency worldwide. It is discussed in detail in Chapter 3.
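The following minimal sketch illustrates the interval-adjustment idea described in Section 1.5.1. The class and method names, the capacity-weighted scheme and the interval bounds are hypothetical choices made for illustration only; the actual extension to CloudSim is described in Chapter 4.

import java.util.List;

public class IntervalAdjustmentSketch {

    static class HostSample {
        final double utilization;   // fraction of CPU capacity in use, 0.0 - 1.0
        final double capacityMips;  // processing capacity of the host
        final boolean poweredOn;

        HostSample(double utilization, double capacityMips, boolean poweredOn) {
            this.utilization = utilization;
            this.capacityMips = capacityMips;
            this.poweredOn = poweredOn;
        }
    }

    // Weighted CPU utilization average for the whole DC. Each host contributes
    // in proportion to its capacity; powered-off hosts are excluded.
    static double weightedDcUtilization(List<HostSample> hosts) {
        double capacity = 0.0;
        double weighted = 0.0;
        for (HostSample h : hosts) {
            if (!h.poweredOn) {
                continue;
            }
            weighted += h.utilization * h.capacityMips;
            capacity += h.capacityMips;
        }
        return capacity == 0.0 ? 0.0 : weighted / capacity;
    }

    // The busier the DC, the shorter the next monitoring interval. 300 s is the
    // typical static interval; the 60 s floor is an arbitrary example bound.
    static double nextMonitoringInterval(double averageUtilization) {
        return Math.max(60.0, 300.0 * (1.0 - averageUtilization));
    }
}

Under this simple linear mapping a lightly loaded DC is monitored close to the familiar 300 second interval, while a heavily loaded DC is sampled more frequently, exposing over-utilized hosts (and migration opportunities) that a static interval would only reveal at its next boundary.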
The online forums relating to the CloudSim framework are very active, with researchers attempting to establish the best way to achieve their objectives. While the documentation for the code is extensive (and a number of basic examples of how the software can be used are included in the CloudSim framework), there is little by way of explanation of the methodologies used by the original author of the code. As a result, each individual researcher must spend an inordinate amount of time investigating the capabilities (and limitations) of the framework. This can only be achieved by reviewing many thousands of lines of code and testing to establish the functionality of each module and method. Through the course of this research, a number of CloudSim issues were identified which, it is hoped, will prove useful to future researchers. They are discussed chronologically (at the point in development when they were identified) and relate to both the framework code and the accuracy of virtual machine and migration simulation. They are also summarized in Chapter 5.
1.5.3 Methodology
This thesis uses the CloudSim framework (described in more detail in Chapter 3) as the base simulator for implementation and testing of the hypothesis. A considerable review of the existing CloudSim code was required to establish the capabilities of the framework and also to identify what additional code modules would be needed to meet the thesis objectives. Ultimately it was found that no facility existed in CloudSim to test the hypothesis and thus a number of extensions to the existing code were designed and developed. These were then integrated with the framework such that the default CloudSim simulation could be reliably compared with the dynamic extension created for this research, i.e. the implementation of dynamic interval adjustment.
1.6 Conclusion
The remainder of this thesis is structured as follows. The literature review in Chapter 2 describes current (and future) efforts to improve energy efficiency in the data center industry. Both hardware and software approaches are discussed, with a focus on virtualized systems, installed as standard in all green-field DCs and retro-fitted to the majority of existing brown-field sites. Chapter 3 details the specific modules in the CloudSim framework required to build the test bed for analysis of the hypothesis. An explanation as to how these modules
interact with each other is also provided. Chapter 4 specifies the new Java methods written to create and test the hypothesis. Integration of the new code with the existing framework is also described. Chapter 5 discusses the tests performed to evaluate the hypothesis and analyses the results in the context of current energy efficiency efforts in the data center industry. Chapter 6 concludes this thesis with a summary of the limitations identified in the CloudSim framework, in the hope that the work of future researchers can more effectively benefit from, and build upon, its code-base.
Chapter 2 Literature Review
Introduction
By adjusting the DC monitoring interval with respect to the incoming workload, this thesis investigates opportunities for more energy efficient management of data center resources. Given this objective, extensive examination of the evolution of DC resource management methods over the last few years was required in an effort to identify an approach which had not been previously applied. This thesis is primarily concerned with the energy efficiency of DCs when migrating VMs across the LAN and WAN. To contextualize the research more completely, the following literature review extends the introductory discussion in Chapter 1 to encompass the entire data center infrastructure, analyzing current (and previous) efforts by operators and academic researchers to reduce power consumption from, not only a software, but also a hardware perspective. The chapter closes with an in-depth review of existing monitoring interval approaches and technologies.
2.1 Performance versus Power
Most of the advances achieved by both the DC industry and academic researchers before 2006 paid particular attention to the performance of the infrastructure, with the principal focus of operator efforts set firmly on keeping the DC available to clients 24/7. In fact, when advertising and selling the services they offer, operators still choose to feature their ‘uptime’ percentage as their primary Unique Selling Point (USP). The ‘5 nines’ (i.e. 99.999% uptime), denoting High Availability (HA), are seldom omitted from a typical DC operator’s marketing material. However, the increase in power consumption required to boost performance seldom received more attention than summary recognition as an additional expense. The power / performance trade-off is undoubtedly a difficult hurdle to overcome, especially while cost competitiveness is uppermost in the minds of data center operators. Invariably, before 2006, most commercial development efforts to improve the operation of DCs were focussed solely on performance.
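To put the ‘5 nines’ figure in context, a simple back-of-the-envelope calculation (not a figure quoted by the operators themselves) gives the downtime it permits:

\[
(1 - 0.99999) \times 365 \times 24 \times 60 \approx 5.3 \text{ minutes of unavailability per year}
\]

which goes some way towards explaining why uptime, rather than energy, dominated operators' attention.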
In more recent years, increased consumer demand for faster traffic and larger, more flexible storage solutions has changed how the industry views the resources required to operate competitively. More equipment (e.g. servers, routers) has been required to meet demand, but the space required to accommodate this equipment has already been allocated to existing equipment. The strategy adopted, since 2006, by a DC industry looking to the future was to increase the density of IT equipment rather than take the more expensive option of purchasing (or renting) additional square footage. The solution combined new server technologies and virtualization.
2.2 Increased Density
An analogy: increasing infrastructural density in a data center is similar to adding more bedrooms to a house without extending the property. The house can now accommodate private spaces for more people but each person has less space than before. In the data center there are now more servers per square foot, resulting in more compute / storage capability. Despite the space-saving advantages of VM technology and techniques (i.e. migration), which reduced the number of servers required to host applications, the primary disadvantage of increased density was that each new blade server required significantly more power than its predecessor. A standard rack with 65-70 blades operating at high loads might require 20-30 kW of power, compared with previous rack consumptions of 2-5 kW. This additional power generates additional heat. In a similar manner to maintaining comfortable levels of heat and humidity for people in a house, heat in the rack, and the resultant heat in the server room, must be removed to maintain the equipment at a safe operating temperature and humidity.
In summary, the introduction of increased server room density from 2006 onwards resulted in increased power and cooling requirements for modern DCs. At their 25th Annual Data Center Conference held in Las Vegas in late November 2006, Gartner analysts hypothesized that:
“…by 2008, 50% of current data centers will have insufficient power and cooling capacity to meet the demands of high-density equipment…” [1]
During his address to the conference, Gartner Research Vice President Michael Bell suggested that:
“Although power and cooling challenges will not be a perpetual problem, it is important for DC managers to focus on the electrical and cooling issue in the near term, and adopt best practice to mitigate the problem before it results in equipment failure, downtime and high remediation costs”.
This was one of the first ‘shots across the bow’ for a data center industry which, until then, had been solely focussed on improving performance (e.g. uptime, response time) almost in deference to escalating energy costs. Based on data provided by IDC [2], Jonathan Koomey published a report [3] in February 2007 estimating the electricity used by all DCs in both the US and globally for 2005. The executive summary states that:
“The total power demand in 2005 (including associated infrastructure) is equivalent (in capacity terms) to about five 1000 MW power plants for the U.S. and 14 such plants for the world. The total electricity bill for operating those servers and associated infrastructure in 2005 was about $2.7 billion and $7.2 billion for the U.S. and the world, respectively.”
A few months later the global economic downturn brought with it increasingly restrictive operating budgets and higher energy prices. The competitive edge was becoming harder to identify. Quite apart from the economic factors affecting the industry, the timely publication by the EPA of its report to the US Congress [4] in August 2007 highlighted significant opportunities to reduce both capital and operating costs by optimizing the power and cooling infrastructure involved in data center operations. Industry analysts were once again identifying an escalating power consumption trend which required immediate attention.
The report assessed the principal opportunities for energy efficiency improvements in US DCs. The process of preparing the report brought all the major industry players together. In an effort to identify a range of energy efficiency opportunities, 3 main improvement scenarios were formulated:
1. Improved Operation: maximizes the efficiency of the existing data center infrastructure by utilizing improvements such as ‘free cooling’ and raising
temperature / humidity set-points. Minimal capital cost (‘the low hanging fruit’) is incurred by the operator
2. Best Practice: adopts practices and technologies used in the most energy-efficient facilities
3. State-of-the-art: uses all available energy efficiency practices and technologies
The potential energy savings and associated capital cost calculated for each of the 3 scenarios respectively were:
1. Improved Operation: 20% saving - least expensive
2. Best Practice: 45% saving
3. State-of-the-art: 55% saving - most expensive
Notably, a proviso was also offered by the report in that:
“…due to local constraints, the best strategy for a particular data center could only be ascertained by means of a site-specific review - not all suggested scenarios apply to all data centers.”
Regardless of which (if any) strategy was subsequently adopted by a particular data center operator, performance of a site-specific review invariably demonstrated that reducing power consumption was a viable opportunity, not only to significantly reduce both capital and operating costs, but also to regain a competitive edge.
The economic downturn, the Gartner conference and the reports by both the EPA and Koomey were a significant part of the catalyst for the energy approach beginning to receive a level of attention closer, if not equal, to that given to the performance approach in previous years. Efficient management of power and cooling, while maintaining performance levels, became the order of the day.
At the highest level, DC infrastructure can be subdivided into hardware and software. While it is true that both are inextricably linked to the energy performance of the DC, it is useful for the purposes of this review to examine them separately.
2.3 Hardware
Rasmussen [5] identified power distribution, conversion losses and cooling as representing between 30 – 45% of the electricity bill in larger DCs. Cooling alone accounted for 30% of this total.
Figure 2 Relative contributions to the thermal output of a typical DC
2.3.1 Uninterruptible Power Supply (UPS) & Power Distribution
The power being provided to the IT equipment in the racks is typically routed through an Uninterruptible Power Supply (UPS) which feeds Power Distribution Units (PDUs) located in or near the rack. Through use of better components, circuit design and right-sizing strategies, manufacturers such as American Power Conversion (APC) and Liebert have turned their attention to maximizing efficiency across the full load spectrum, without
sacrificing redundancy. Some opportunities may exist in efforts to re-balance the load across the 3 phases supplying power to the racks, but efficiencies in the power supply and distribution system are outside the scope of this research.
2.3.2 Servers, Storage Devices & Network Equipment
Manufacturers such as IBM and Intel are designing increasingly efficient server blades, with features such as chip-level thermal strategies (Dynamic Voltage & Frequency Scaling (DVFS)), multicore processors and power management leading the way. Enterprise operators such as Google and Facebook have recently designed and installed their own servers which have demonstrated increased efficiencies, but these servers are specifically ‘fit-for-purpose’. They may not be sufficiently generic to be applicable to a majority of DC configurations.
2.3.3 Cooling
There are a variety of standard systems for cooling in data centers but all typically involve Air Handling Units (AHUs) or Computer Room Air Handlers (CRAHs). Well-designed DCs have aligned their racks in an alternating hot aisle / cold aisle configuration, with cold air from the AHU(s) entering the cold aisle through perforated or grated tiles above a sub-floor plenum. Hot air is exhausted from the rear of the racks and removed from the room back to the same AHU(s), forming a closed-loop system.
The hot air is passed directly over an evaporator (Figure 3: 4) in the AHU which contains a liquid refrigerant (e.g. an ethylene glycol / water solution). The amount of heat absorbed is determined by the speed of the air crossing the coil and / or the flow rate of the refrigerant through the coil. The flow rate is controlled by tandem scroll compressors (Figure 3: 1). A dead-band setting is applied to each AHU and is divided equally between all the compressors in the system. As each dead-band increment above the set-point is reached, a compressor engages to increase the flow rate. As the temperature returns (down through the dead-band increments) toward the set-point, the compressors disengage – reducing the flow through the evaporator until the set-point is reached again. The heat absorbed through the coil is fed to an array of condensers outside the DC where it is released to the atmosphere as exhaust heat or is reused in some other part of the facility.
The set-point of the AHU is configured on installation of the unit and must (if deemed appropriate) be changed manually by a member of staff following analysis and review. Unfortunately these reviews happen all too seldom in typical DCs, despite the inevitable changes taking place in the server room workload on a daily basis.
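As a rough illustration of the dead-band staging described above, the sketch below maps a return-air temperature to a number of engaged compressors. The method name and the example set-point, dead-band width and compressor count are hypothetical values chosen for illustration; real AHU control logic varies by manufacturer.

public class DeadBandSketch {

    // How many compressors should be engaged for a given return-air temperature.
    // The dead band above the set-point is divided equally between the
    // compressors; each increment crossed engages one more of them.
    static int compressorsEngaged(double returnTempC, double setPointC,
                                  double deadBandC, int compressorCount) {
        if (returnTempC <= setPointC) {
            return 0; // at or below the set-point no additional flow is required
        }
        double increment = deadBandC / compressorCount;
        int engaged = (int) Math.ceil((returnTempC - setPointC) / increment);
        return Math.min(engaged, compressorCount);
    }

    public static void main(String[] args) {
        // Example: 22 degC set-point, 2 degC dead band shared by 4 compressors.
        for (double t = 22.0; t <= 24.5; t += 0.5) {
            System.out.printf("return air %.1f degC -> %d compressor(s)%n",
                    t, compressorsEngaged(t, 22.0, 2.0, 4));
        }
    }
}

The same staircase runs in reverse as the return-air temperature falls back through the dead-band increments toward the set-point.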
Figure 3 A Typical AHU Direct Expansion (DX) Cooling System
Depending on the configuration, the heat removal system can consume up to 50% of a typical DC’s energy. Industry is currently embracing a number of opportunities involving temperature and airflow analysis:
1. aisle containment strategies
2. increasing the temperature rise (ΔT) across the rack (see the relation following this list)
3. raising the operating temperature of the AHU(s)
4. repositioning AHU temperature and humidity sensors
5. thermal management by balancing the IT load layout [6, 7]
6. ‘free cooling’ – eliminating the high-consumption chiller from the system through the use of strategies such as air- and water-side economizers
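The motivation behind opportunities 2 and 3 can be seen from the standard sensible-heat relation (general thermodynamics rather than a result taken from [6, 7]):

\[
Q = \dot{m} \, c_p \, \Delta T
\]

where \(Q\) is the heat load removed, \(\dot{m}\) the mass flow rate of the cooling air, \(c_p\) its specific heat capacity and \(\Delta T\) the temperature rise across the rack. For a fixed IT heat load, a larger \(\Delta T\) allows a lower air flow rate (and therefore lower fan power), while a higher AHU operating temperature extends the conditions under which ‘free cooling’ is viable.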
Figure 4 A Typical DC Air Flow System
In addition to temperature maintenance, the AHUs also vary the humidity of the air entering the server room according to set-points. Low humidity (dry air) may cause static which has the potential to short electronic circuits. High levels of moisture in the air may lead to faster component degradation. Although less of a concern as a result of field experience and recent studies performed by Intel and others, humidity ranges have been defined for the industry and should be observed to maximize the lifetime of the IT equipment. Maintaining these humidity ranges increases the interval between equipment replacement schedules and, as a result, has a net positive outcome on capital expenditure budgets.
2.3.4 Industry Standards & Guidelines
2.3.4.1 Standards
Power Usage Effectiveness (PUE2) [8] is now the de facto standard used to measure a DC’s efficiency. It is defined as the ratio of all electricity used by the DC to the electricity used just by the IT equipment. In contrast to the original PUE [9], rated in kilowatts of power (kW), PUE2 must be based on the highest measured kilowatt hour (kWh) reading taken during analysis. In 3 of the 4 PUE2 categories now defined, the readings must span a 12 month period, eliminating the effect of seasonal fluctuations in ambient temperatures:

\[
PUE = \frac{\text{Total Data Centre Electricity (kWh)}}{\text{IT Equipment Electricity (kWh)}}
\]
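As a worked illustration (the figures below are invented for the example rather than measured values), a facility that draws 1,500,000 kWh in total over the measurement period, of which 800,000 kWh is consumed by the IT equipment, would report:

\[
PUE = \frac{1{,}500{,}000 \text{ kWh}}{800{,}000 \text{ kWh}} \approx 1.88
\]

placing it within the industry average range quoted below.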
A PUE of 2.0 suggests that for each kWh of IT electricity used, another kWh is used by the infrastructure to supply and support it. The most recent PUE averages [10] for the industry fall within the range of 1.83 – 1.92, with the worst performers coming in at 3.6 and a few top performers publishing results below 1.1 in recent months. Theoretically, the best possible PUE is 1.0, but a web-hosting company (Pair Networks) recently quoted a PUE of 0.98 for one of its DCs in Las Vegas, Nevada. Their calculation was based on receipt of PUE ‘credit’ for contributing unused power (generated on-site) back to the grid. Whether additional PUE ‘credit’ should be allowed for contributing to the electricity grid is debatable. If this were the case, with sufficient on-site generation, PUE could potentially reach 0.0 and cease to have meaning.
Most DCs are now evaluating their own PUE ratio to identify possible improvements in their power usage. Lower PUE ratios have become a very marketable aspect of the data center business and have been recognized as such. Other standards and metrics (2.3.4.2.1 – 2.3.4.2.4) have been designed for the industry but, due for the most part to the complex processes required to calculate them, have not as yet experienced the same widespread popularity as PUE and PUE2.
2.3.4.2 Other Standards
2.3.4.2.1 Water Usage Effectiveness (WUE) measures DC water usage to provide an assessment of the water used on-site for operation of the data center. This includes water used for humidification and water evaporated on-site for energy production or cooling of the DC and its support system.
2.3.4.2.2 Carbon Usage Effectiveness (CUE) measures DC-level carbon emissions. CUE does not cover the emissions associated with the lifecycle of the equipment in the DC or the building itself.
2.3.4.2.3 The Data Center Productivity (DCP) framework is a collection of metrics which measure the consumption of a DC-related resource in terms of DC output. DCP looks to define what a data center accomplishes relative to what it consumes.
2.3.4.2.4 Data Center Compute Efficiency (DCCE) enables data center operators to determine the efficiency of compute resources. The metric makes it easier for data center operators to discover unused servers (both physical and virtual) and decommission or redeploy them.
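For reference, the first two of these metrics are conventionally expressed, like PUE, as ratios against IT equipment energy. The forms below follow the commonly cited Green Grid definitions and are recalled from general knowledge of the metrics rather than taken from the sources cited in this chapter:

\[
WUE = \frac{\text{Annual site water usage (litres)}}{\text{IT equipment energy (kWh)}}, \qquad
CUE = \frac{\text{Total CO}_2 \text{ emissions from DC energy (kg CO}_2\text{eq)}}{\text{IT equipment energy (kWh)}}
\]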
Surprisingly, efforts to improve efficiency have not been implemented to the extent one would expect. 73% of respondents to a recent Uptime Institute survey [11] stated that someone outside of the data center (the real estate / facilities department) was responsible for paying the utility bill. 8% of data center managers were not even aware who paid the bill. The lack of accountability is obvious and problematic. If managers are primarily concerned with maintaining the DC on a daily basis, there is an inevitable lack of incentive to implement even the most basic energy efficiency strategy in the short to medium term. It is clear that a paradigm shift is required to advance the cause of energy efficiency monitoring at the ‘C-level’ (CEO, CFO, CIO) of data center operations.
2.3.4.3 Guidelines
Data center guidelines are intermittently published by The American Society of Heating, Refrigeration and Air Conditioning Engineers (ASHRAE). These guidelines [12, 13] suggest ‘allowable’ and ‘recommended’ temperature and humidity ranges within which it is safe to operate IT equipment. The most recent edition of the guidelines [14] suggests operating temperatures of 18 – 27°C. The maximum for humidity is 60% RH. One of the more interesting objectives of the recent guidelines is to have the rack inlet recognized as the position from where the temperature and humidity should be measured. The majority of DCs currently measure at the return inlet to the AHU, despite more relevant temperature and humidity readings being available at the inlet to the racks.
2.3.5 Three Seminal Papers
In the context of improving the hardware infrastructure of the DC post-2006, three academic papers were found to be repeatedly referenced as forming a basis for the work of the most prominent researchers in the field. They each undertake a similar methodology when identifying solutions and are considered to have led the way for a significant number of subsequent research efforts. The methodologies which are common to each of the papers (and relevant to this thesis) include:
1. Identification of a power consumption opportunity within the DC and adoption of a software-based solution
2. Demonstration of the absolute requirement for monitoring the DC environment as accurately as possible without overloading the system with additional processing
A summary review of the three papers follows.
2.3.5.1 Paper 1: Viability of Dynamic Cooling Control in a Data Center Environment (2006)
In the context of dynamically controlling the cooling system, Boucher et al. [15] focused their efforts on 3 requirements:
1. A distributed sensor network to indicate the local conditions of the data center.
Solution: a network of temperature sensors was installed at:
• Rack inlets
• Rack outlets
• Tile inlets
2. The ability to vary cooling resources locally.
Solution: 4 actuation points, which exist in a typical data center, were identified as having further potential in maintaining optimal server room conditions:
2.1 CRAC supply temperature – this is the temperature of the conditioned air entering the room. CRACs are typically operated on the basis of a single temperature sensor at the return side of the unit. This sensor takes an average of the air temperature returning from the room. The CRAC then correlates this reading with a set-point which is configured manually by data center staff. The result of the correlation is the basis upon which the CRAC decides by how much the temperature of the air sent back out into the room should be adjusted. Variation is achieved in a Direct Expansion (DX) system with variable capacity compressors varying the flow of refrigerant across the cooling coil. In a water-cooled system chilled water supply valves modulate the temperature.
2.2 The crucial element in the operational equation of the CRAC, regardless of the system deployed, is the set-point. The set-point is manually set by data center staff and generally requires considerable analysis of the DC environment before any
adjustment is made. Typically, the set-point is configured (when the CRAC is initially installed) according to some prediction of the future cooling demand. Due to a number of factors (including the cost of consultancy), it is all too common that no regular analysis of the room’s thermal dynamics is performed, despite the installation of additional IT equipment (and increased workload on the existing infrastructure) throughout the lifecycle of the data center. Clearly a very static situation exists in this case.
2.3 CRAC fan speed – the speed at which the fans in the CRAC blow the air into the room (via a sub-floor plenum). In 2006 (at the time of this paper), typical CRACs had fans running at a set speed and, without further analysis, no reconfiguration took place after installation. Most CRACs since then have been designed with Variable Speed Drives (VSDs) - which can vary the speed of the fan according to some set of rules. However, with no dynamic thermal analysis of the DC environment taking place on a regular basis, the VSD rules are effectively hardwired into the system. The VSDs are an unused feature of the CRAC as a result.
2.4 Floor tile openings – the openings of the floor tiles in the cold aisle. The velocity at which the cold air leaving the CRAC enters the room is dependent upon a number of factors. Assuming it has passed through the sub-floor plenum with minimal pressure loss, the air will rise into the room at some velocity (via the floor tile openings). Floor tiles are either perforated or grated. Perforated tiles typically have 25% of their surface area open whereas grated tiles may have 40 – 60% of their surface open. The more open surface area available on the tile, the higher the velocity with which the air will enter the room. The authors had previously designed and implemented a new tile - featuring an electronically controlled sliding damper mechanism which could vary the size of the opening according to requirements.
So it is evident that as a typical DC matures and the thermodynamics of the environment change with higher CPU loads and additional IT equipment, the cooling system should be governed by a dynamic control system which configures it for
continuous maximum efficiency. Boucher et al. propose that this control system should be based on the 4 available actuation points above.
3. The knowledge of each variable’s effect on the DC environment.
Solution: the paper focused on how each of the actuator variables (2.1, 2.2, 2.3 and 2.4 above) can affect the thermal dynamic of the data center. Included in the findings of the study were:
• CRAC supply temperatures have an approximately linear relationship with rack inlet temperatures. An anomaly was identified where the magnitude of the rack inlet response to a change in CRAC supply temperature was not of the same order. Further study was suggested.
• Under-provisioned flow provided by the CRAC fans affects the Supply Heat Index (SHI*) but overprovisioning has a negligible effect. SHI is a non-dimensional measure of the local magnitude of hot and cold air mixing. Slower air flow rates cause an increase in SHI (more mixing) whereas faster air flow rates have little or no effect.
*SHI is also referred to as Heat Density Factor (HDF). The metric is based on the principle of a thermal multiplier which was formulated by Sharma et al. [16]
The study concluded that significant energy savings (in the order of 70% in this case) were possible where a dynamic cooling control system, controlled by software, was appropriately deployed.
2.3.5.2 Paper 2: Impact of Rack-level Compaction on the Data Center Cooling Ensemble (2008)
Shah et al. [17] deal with the impact on the data center cooling ensemble when the density of compute power is increased. The cooling ‘ensemble’ is considered to be all elements of the cooling system, from the chip to the cooling tower.
Increasing density involves replacing low-density racks with high-density blade servers and has been the chosen alternative to purchasing (or renting) additional space for
most DCs in recent years. New enterprise and co-location data centers also implement the strategy to maximize the available space. Densification leads to increased power dissipation and corresponding heat flux within the DC environment.
A typical cooling system performs two types of work:
1. Thermodynamic – removes the heat dissipated by the IT equipment
2. Airflow – moves the air through the data center and related systems
The metric chosen by Shah et al. for evaluation in this case is the ‘grand’ Coefficient of Performance (COPG), which is a development of the original COP metric suggested by Patel et al. [18, 19]. It measures the amount of heat removed by the cooling infrastructure per unit of power input and does so at a more granular level than the traditional COP used in thermodynamics, specifying heat removal at the chip, system, rack, room and facility levels. In order to calculate the COPG of the model used for the test case, each component of the cooling system needed to be evaluated separately, before applying each result to the overall system. Difficulties arose where system-level data was either simply unavailable or, due to high heterogeneity, impossible to infer. However, the model was generic enough that it could be applied to the variety of cooling systems currently being used by ‘real world’ DCs.
Note: in a similar vein, the research for this thesis examines the CPU utilization of each individual server in the data center such that an overall DC utilization metric can be calculated at each interval. Servers which are powered off at the time of monitoring have no effect on the result and are excluded from the calculation.
The assumption that increased density necessarily leads to less efficiency in the cooling system is incorrect. If elements of the cooling system were previously running at low loads they would typically have been operating at sub-optimal efficiency levels. Increasing the load on a cooling system may in fact increase its overall efficiency through improved operational efficiencies in one or more of its subsystems.
For Shah’s research, 94 existing low-density racks were replaced with high-density Hewlett Packard (HP) blades. The heat load increased from 1.9MW to 4.7MW. The new heat load was still within the acceptable range for the existing cooling infrastructure. No modifications to the ensemble were required.
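For interpreting the result reported below, the conventional coefficient of performance that COPG generalizes can be written as follows (a textbook definition rather than the exact formulation used in [17-19]):

\[
COP = \frac{Q_{removed}}{W_{input}}
\]

so a higher COPG simply means that more heat is removed from the IT equipment per unit of electrical energy spent on the cooling ensemble.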
Upon analysis of the results, COPG was found to have increased by 15%. This was, in part, achieved with improved efficiencies in the compressor system of the CRACs. While it is acknowledged that there is a crossover point at which compressors become less efficient, the increase in heat flux of the test model raised the work of the compressor to a point somewhere below this crossover. The improvement in compressor efficiency was attributed to the higher density HP blade servers operating at a higher ΔT (reduced flow rates) across the rack. The burden on the cooling ensemble was reduced - resulting in a higher COPG. With the largest individual source of DC power consumption (about 40% in this case) typically coming from the CRAC - which contains the compressor - it makes sense to direct an intelligent analysis of potential operational efficiencies at that particular part of the system. The paper states that:
“The continuously changing nature of the heat load distribution in the room makes optimization of the layout challenging; therefore, to compensate for recirculation effects, the CRAC units may be required to operate at higher speeds and lower supply temperature than necessary. Utilization of a dynamically coupled thermal solution, which modulates the CRAC operating points based on sensed heat load, can help reduce this load”.
In this paper Shah et al. present a model for performing evaluation of the cooling ensemble using COPG, filling the gap in knowledge through detailed experimentation with measurements across the entire system. They conclude that energy efficiencies are possible via increased COP in one or more of the cooling infrastructure components. Where thermal management strategies capable of handling increased density are in place, there is significant motivation to increase density without any adverse impact on energy efficiency.
2.3.5.3 Paper 3: Data Center Efficiency with Higher Ambient Temperatures and Optimized Cooling Control (2011)
Ahuja et al. [20] introduce the concept of ‘deviation from design intent’. When a data center is first outfitted with a cooling system, best estimates are calculated for future use. The intended use of the DC in the future is almost impossible to predict at this stage. As the lifecycle of the DC matures, the IT equipment will deviate from the best estimates upon
which the cooling system was originally designed to operate. Without on-going analysis of the DC’s thermal dynamics, the cooling system may become decreasingly ‘fit-for-purpose’.
As a possible solution to this deviation from intent, this paper proposes that cooling of the DC environment should be controlled from the chip rather than a set of remote sensors in the room or on the rack doors. Each new IT component would have chip-based sensing already installed and therefore facilitate a “plug ‘n’ play” cooling system.
The newest Intel processors (since Intel® Pentium® M) on the market feature an ‘on-die’ Digital Thermal Sensor (DTS). DTS provides the temperature of the processor and makes the result available for reading via Model Specific Registers (MSRs). The Intel white paper [21] which describes DTS states that:
“… applications that are more concerned about power consumption can use thermal information to implement intelligent power management schemes to reduce consumption.”
While Intel is referring to power management of the server itself, DTS could theoretically be extended to the cooling management system also. Current DCs control the air temperature and flow rate from the chip to the chassis but there is a lack of integration once the air has left the chassis. If the purpose of the data center is to house, power and cool every chip then it has the same goal as the chassis, and the chassis is already taking its control data from the chip. This strategy needs to be extended to the wider server room environment in an integrated manner.
The industry has recently been experimenting with positioning the cooling sensors at the front of the rack rather than at the return inlet of the AHU. The motivation for this is to sense the air temperature which matters most – the air which the IT equipment uses for cooling. The disadvantage of these remote sensors (despite being better placed than sensors at the AHU return inlet) is that they are statically positioned, a position which may later be incorrect should changes in the thermal dynamics of the environment occur. The closer to the server one senses, the more reliable the sensed data will be for thermal control purposes. Ahuja et al. propose that the logical conclusion is to move the sensors even closer to the
server – in fact, right into the processor. If those sensors already exist (as is the case with the Intel processors) then use should be made of them for a more accurate cooling management system.
The paper investigates the possible gains by moving the temperature sensors (and changing the set-point accordingly) to a variety of positions in the DC:
1. AHU return – 28°C
2. AHU supply – 18°C
3. Rack inlet – 23°C
4. Server – 30°C
The first test was carried out on a single isolated rack, with those results then extrapolated to a DC model with a cooling capacity of 100kW. 4 perimeter down-flow AHUs (N + 1 redundancy) performed the heat removal. While the 4 rows in the DC were not contained, they did follow the standard hot / cold aisle arrangement. The tests showed that use of the server sensors resulted in more servers being maintained within the ASHRAE guideline temperature range of 18 – 27°C. Controlling the cooling system at the server yielded maximum benefit.
Ahuja et al. concluded that a processor-based set of metrics capable of controlling a power management scheme on the server should, by extension, also be capable of controlling a dynamic cooling control system outside the rack. If every server in a DC was intermittently reporting its operating temperature (and air flow) to a cooling control system, the cooling system would be operating on a more robust data set, i.e. more accurate readings, delivering higher energy efficiency savings than possible with previous DC configurations.
2.4 Software
2.4.1 Virtualization
In a virtualized data center, multiple Virtual Machines (VMs) are typically co-located on a single physical server, sharing the processing capacity of the server's CPU between them. When, for example, increased demands on the CPU result in reduced performance of one of the VMs to the point where a Service Level Agreement (SLA) may be violated, virtualization technology facilitates a migration. Migration relocates the services being provided by the VM
on this ‘over-utilized’ host to a similar VM on another physical server, where sufficient capacity (e.g. CPU) is available to maintain SLA performance. Conversely, reduced demand on the CPU of a host introduces opportunities for server consolidation, the objective of which is to minimize the number of operational servers consuming power. The remaining VMs on an ‘under-utilized’ host are migrated so that the host can be switched off, saving power. Server consolidation provides significant energy efficiency opportunities.
There are numerous resource allocation schemes for managing VMs in a data center, all of which involve the migration of a VM from one host to another to achieve one, or a combination of, objectives. Primarily these objectives will involve either increased performance or reduced energy consumption - the former, until recently, receiving more of the operator’s time and effort than the latter. In particular, SLA@SOI has completed extensive research in recent years in the area of SLA-focused (e.g. CPU, memory, location, isolation, hardware redundancy level) VM allocation and re-provisioning [22]. The underlying concept is that VMs are assigned to the most appropriate hosts in the DC according to both service level and power consumption objectives.
Interestingly, Hyser et al. [23] suggest that a provisioning scheme which also includes energy constraints may choose to violate user-based SLAs ‘if the financial penalty for doing so was [sic] less than the cost of the power required to meet the agreement’. In a cost-driven DC it is clear that some trade-off (between meeting energy objectives and compliance with strict user-based SLAs, e.g. application response times) is required. A similar power / performance trade-off may be required to maximize the energy efficiency of a host-level migration.
2.4.2 Migration
The principal underlying technology which facilitates management of workload in a DC is virtualization. Rather than each server hosting a single operating system (or application), virtualization facilitates a number of VMs being hosted on a single physical server, each of which may run a different operating system (or even different versions of the same operating
system). These VMs may be re-located (migrated) to a different host on the LAN for a variety of reasons:
• Maintenance
Servers intermittently need to be removed from the network for maintenance. The applications running on these servers may need to be kept running during the maintenance period, so they are migrated to other servers for the duration.
• Consolidation
In a virtualized DC some of the servers may be running at (or close to) idle – using expensive power to maintain a machine which is effectively not being used to capacity. To conserve power, resource allocation software moves the applications on the under-utilized machine to a ‘busier’ machine - as long as the latter has the required overhead to host the applications. The under-utilized machine can then be switched off – saving on power and cooling.
• Energy Efficiency
Hotspots regularly occur in the server room, i.e. the cooling system is working too hard in the effort to eliminate the exhaust air from a certain area. The particular workload which is causing the problem can be identified and relocated to a cooler region in the DC to relieve the pressure in the overheated area.
Virtual Machines may also be migrated to servers beyond the LAN (i.e. across the Wide Area Network (WAN)):
• Follow the sun - minimize network latency during office hours by placing VMs close to where their applications are requested most often
• Where latency is not a primary concern there are a number of different strategies which may apply:
  - Availability of renewable energy / improved energy mix
  - Less expensive cooling overhead (e.g. ‘free’ cooling in more temperate / cooler climates)
  - Follow the moon (less expensive electricity at night)
  - Fluctuating electricity prices on the open market [24]
    Chapter2 Literature Review 27 Disaster Recovery (DR)  Maintenance / Fault tolerance  Bursting i.e. temporary provisioning of additional resources  Backup / Mirroring Regardless of the motivation, migration of virtual machines both within the DC and also to other DCs (in the cloud or within the enterprise network) not only extends the opportunity for significant cost savings but may also provide faster application response times if located closer to clients. To maintain uptime and response Service Level Agreement (SLAs) parameters of 99.999% (or higher), these migrations must be performed ‘hot’ or ‘live’, keeping the application available to users while the virtual machine hosting the application (and associated data) is moved to the destination server. Once all the data has been migrated, requests coming into the source VM are redirected to the new machine and the source VM can be switched off or re-allocated. The most popular algorithm by which virtual machines are migrated is known as pre-copy and is deployed by both Citrix and VMWare – currently considered to be the global leaders in software solutions for migration and virtualized systems. A variety of live migration algorithms have been developed in the years since 2007. Some are listed below: 1. Pre-copy [25] 2. GA for Renewable Energy Placement [26] 3. pMapper: Power Aware Migration [27] 4. De-duplication, Smart Stop & Copy, Page Deltas & CBR (Content Based Replication) [28] 5. Layer 3: IP LightPath [29] 6. Adaptive Memory Compression [30] 7. Parallel Data Compression [31] 8. Adaptive Pre-paging and Dynamic Self-ballooning [32] 9. Replication and Scheduling [33]
10. Reinforcement Learning [34]
11. Trace & Replay [35]
12. Distributed Replicated Block Device (DRBD) [36]

The LAN-based migration algorithm used by the Amazon EC2 virtualization hypervisor product (Citrix XenMotion) is primarily based on pre-copy but also integrates some aspects of the algorithms listed above. It serves as a good example of the live migration process and is discussed in the following section.

2.4.2.1 Citrix XenMotion Live Migration

The virtual machine on the source (or current) machine keeps running while transferring its state to the destination. A helper thread iteratively copies the state needed while both end-points keep evolving. The number of iterations determines the duration of live migration. As a last step, a stop-and-copy approach is used. Its duration is referred to as downtime. All implementations of live migration use heuristics to determine when to switch from iterating to stop-and-copy. Pre-copy starts by copying the whole source VM state to the destination system. While copying, the source system keeps responding to client requests. As memory pages may get updated ('dirtied') on the source system even after they have been copied to the destination system (the rate of such updates is the Dirty Page Rate), the approach employs mechanisms to monitor page updates. The performance of live VM migration is usually defined in terms of migration time and system downtime. All existing techniques control migration time by limiting the rate of memory transfers, while system downtime is determined by how much state remains to be transferred when the 'live' process stops. Minimizing both of these metrics is correlated with optimal VM migration performance and is achieved using open-loop control techniques. With open-loop control, the VM administrator manually sets configuration parameters for the migration service thread, hoping that these conditions can be met. The input parameters are a limit to the network bandwidth allowed to the migration thread and the acceptable downtime for the last iteration of the migration. Setting a low bandwidth limit while ignoring page modification rates can result in a backlog of pages to migrate and prolong migration. Setting a high bandwidth limit can affect the performance of running applications. Checking the
estimated downtime to transfer the backlogged pages against the desired downtime can keep the algorithm iterating indefinitely. Approaches that impose limits on the number of iterations, or that statically increase the allowed downtime, can render live migration equivalent to pure stop-and-copy migration.

2.4.2.2 Wide Area Network Migration

With WAN transmissions becoming increasingly feasible and affordable, live migration of larger data volumes over significantly longer distances is becoming a realistic possibility [37, 38]. As a result, the existing algorithms, which have been refined for LAN migration, will be required to perform the same functionality over the WAN. However, a number of constraints present themselves when considering long distance migration of virtual machines. The constraints unique to WAN migration are:

• Bandwidth (I/O throughput – lower over WANs)
• Latency (distance to destination VM – further on WANs)
• Disk Storage (transfer of SAN / NAS data associated with the applications running on the source VM to the destination VM)

Bandwidth (and latency) becomes an increasingly pertinent issue during WAN migration because of the volume of data being transmitted across the network. In the time it takes to transmit a single iteration of pre-copy memory to the destination, there is an increased chance (relative to LAN migration) that the same memory may have been re-written at the source. The rate at which memory is rewritten is known as the Page Dirty Rate (PDR), calculated by dividing the volume of memory dirtied in the last round by the duration of that round and expressed in Mbit/s. This normalizes PDR for comparison with the migration bandwidth, and Xen varies the bandwidth allocated during the pre-copy phase based on this comparison. There are 2 main categories of PDR when live migration is being considered:

1. Low / Typical PDR: Memory is being re-written slower than the rate at which those changes can be transmitted to the destination i.e. PDR < Migration bandwidth
2. Diabolical PDR (DPDR): The rate at which memory is being re-written at the source VM exceeds the rate at which that re-written memory can be migrated 'live' to the destination (PDR > Migration bandwidth). The result of this is that the pre-copy phase may not converge at all. The PDR floods I/O and the pre-copy migration must be
immediately stopped i.e. pre-copy migration will not converge. All remaining pages are then transferred to the destination. The result of this is a longer downtime (while the pages are transferred), potential SLA violations and, most notably for the purposes of this research, increased power consumption while both hosts are running concurrently.

2.4.2.3 PDR Analysis and Compression of Transmitted Pages

Current algorithms send the entire VM state on the 1st iteration (Figure 5: 62 seconds). To reduce the time spent on the 1st iteration, pages frequently 'dirtied' should be identified before the 1st iteration - the objective being to hold back these pages until the final iteration (reducing the number of pages resent during iterative pre-copy) or at least hold them back until some analysis calculates that they are 'unlikely' (with some confidence interval) to be dirtied again. There is a reasonable assumption that there will be multiple iterations in a high PDR environment – in the (rare) case where a VM has no dirty pages, only a single iteration would be required to transfer the entire state. Pre-migration analysis would not be continuous (due to the CPU overhead) but should begin at some short interval before the migration takes place i.e. just after the decision to migrate has been made.

Figure 5: Performance of a web server during live migration (C. Clark)

With a pre-migration analysis phase the time required for the 1st iteration will be reduced. There may be an argument that downtime is increased due to the additional pages held back during the first iteration. High PDR pages - which have not been sent in the 1st iteration –
would likely be identified in the 2nd (or subsequent) iterations anyway – resulting in a very similar Writable Working Set (WWS) on the final iteration. In low PDR environments research suggests that the WWS in the majority of cases is a small proportion of the entire data set (perhaps approximately 10%) which needs to be transferred – resulting in minimal iterations being required before a stop condition is reached i.e. subsequent iterations would yield diminishing returns. This is not the case where an application may be memory intensive i.e. the PDR is diabolical and floods the I/O rate. Conversely, if the WWS is so small, is the effort to identify it at the pre-iterative stage worthwhile? If the algorithm can be applied to diabolical environments as well as acceptable PDR environments then the answer is yes. There is an inevitable trade-off between the time (and CPU overhead) required to identify the WWS on each iteration and the resulting time saved during iterative pre-copy due to fewer pages being transferred. However, identifying a minimal WWS will intrinsically save time. Finding the 'threshold' (current research suggests a simple high/low threshold) is an interesting research challenge.

A bitmap indicating the Page Dirty Count is required to keep track of pages being repeatedly dirtied. A count on its own, however, is probably too simplistic; an upper / lower bounded threshold may be more applicable. A bounded threshold would 'hold' the pages which are above the lower threshold boundary but below the upper threshold boundary i.e. those deemed least likely to be dirtied again. Boundary calculation should include a confidence interval, to minimize the un-synced pages before the final iteration occurs. These categorized 'hold' pages might be held until the next iteration and, if they are found to still have a 'hold' status (fall between the upper and lower threshold boundaries), they are then transferred. With successive iterations more is known about recent PDR patterns, and analysis of these should theoretically yield boundary calculations which are more accurate as a result.

Note: an additional parallel check, before the nth iteration takes place, of all the pages which were transmitted from the threshold area would identify those pages which have been subsequently dirtied. The compressed deltas of these pages would be re-transmitted in the final iteration – along with those that were still above the upper threshold. The success of the new algorithm could be judged on the percentage error at this stage i.e. how many pages were sent from the 'hold' area but subsequently dirtied.
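To make the bounded-threshold idea above concrete, the following minimal Java sketch classifies pages into a 'hold' set based on a per-page dirty count. All names and the specific boundary values are hypothetical; the confidence-interval-based boundary calculation discussed above is not implemented here.

import java.util.BitSet;

// Hypothetical sketch: pages whose dirty count falls between a lower and an
// upper boundary are 'held' back from the current pre-copy iteration and
// re-examined on the next one.
class HoldThresholdClassifier {
    private final int[] dirtyCount;   // per-page dirty counter
    private final int lowerBound;     // at or below: send now (rarely dirtied)
    private final int upperBound;     // at or above: defer to the final stop-and-copy

    HoldThresholdClassifier(int pageCount, int lowerBound, int upperBound) {
        this.dirtyCount = new int[pageCount];
        this.lowerBound = lowerBound;
        this.upperBound = upperBound;
    }

    // Called by the (hypothetical) dirty-page tracking mechanism.
    void pageDirtied(int page) {
        dirtyCount[page]++;
    }

    // Pages flagged here are held until the next iteration and re-checked.
    BitSet holdSet() {
        BitSet hold = new BitSet(dirtyCount.length);
        for (int page = 0; page < dirtyCount.length; page++) {
            if (dirtyCount[page] > lowerBound && dirtyCount[page] < upperBound) {
                hold.set(page);
            }
        }
        return hold;
    }
}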
2.4.2.4 Parallel Identification of Dirty Pages and Multi-Threaded Adaptive Memory Compression

In addition to the pre-migration analysis stage it may also be useful to examine the potential of parallel dirty page identification and compression. In Figure 6 the blue area represents the phase in which dirty pages are identified for the next round and delta compression takes place. However, in the time this phase is taking place more pages will be dirtied. If the same interval were moved back (to run in parallel with the previous data transfer) would more pages be dirtied? The answer appears to be no i.e. the PDR is independent of the process which actually calculates it. The benefit of this parallelism is that the algorithm is ready to move immediately to transfer n + 1 when transfer n has completed – reducing the iterative pre-copy time by eliminating the blue interval in Figure 6. It is probable that some overlap may be optimal rather than full parallelism. Time-series analysis of dirtying patterns during the previous transfer interval might yield an optimal overlap i.e. the best time to start identifying the new dirty pages, rather than waiting until the transfer has completed. It would also be beneficial to investigate further whether, as the number of dirty pages reduces with subsequent iterations, the time required to identify (and compress) the dirty page deltas could also be reduced (research suggests cache access times remain relatively constant). If this were true then the inner overlap could be pushed deeper back into the transfer time, reducing the outer overlap further. Additionally, multi-threaded compression would yield further reductions in the overlap interval.

Figure 6: Pre-Copy algorithm
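As a rough illustration of the overlap discussed above, the following Java sketch compresses the deltas for round n + 1 on a worker pool while round n is still being transmitted. The structure, names and the use of java.util.zip Deflater compression are illustrative assumptions only; they are not drawn from any existing migration implementation.

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

class OverlappedPrecopySketch {
    private final ExecutorService compressor =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    // Compress the deltas for the next round while the current round is being sent.
    void iterate(byte[][] currentRoundPages, byte[][] nextRoundDeltas) throws Exception {
        Future<byte[][]> nextRoundCompressed = compressor.submit(() -> {
            byte[][] compressed = new byte[nextRoundDeltas.length][];
            for (int i = 0; i < nextRoundDeltas.length; i++) {
                compressed[i] = compress(nextRoundDeltas[i]);
            }
            return compressed;
        });
        transfer(currentRoundPages);          // network send of round n (placeholder)
        transfer(nextRoundCompressed.get());  // round n + 1 is ready as soon as round n finishes
    }

    private byte[] compress(byte[] delta) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos =
                     new DeflaterOutputStream(out, new Deflater(Deflater.BEST_SPEED))) {
            dos.write(delta);
        }
        return out.toByteArray();
    }

    private void transfer(byte[][] pages) {
        // placeholder for the actual network transmission
    }
}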
2.4.2.5 Throttling

The critical issue in high PDR environments is that the possibility of convergence is reduced (if not eliminated altogether). It is similar to a funnel filling up too quickly. If the PDR continues at a high rate the funnel will eventually overflow, resulting in service timeouts i.e. the application will not respond to subsequent requests or response times will degrade significantly. The current solution is to abandon pre-copy migration, stop the VM and transfer all memory i.e. empty the funnel. Unfortunately, in the time it takes to empty the funnel, more pages have been dirtied because requests to the application do not stop. This may actually prohibit the migration altogether because the downtime is such that an unacceptable level of SLA violations occurs. If, however, the application's response thread can be artificially slowed down (throttled) intermittently, then the funnel is given a better chance to empty its current contents. This would be analogous to temporarily decreasing the flow from the tap to reduce the volume in the funnel.

Previous solutions suggested that slowing response time to requests (known as Dynamic Rate Limiting) would alter the rate at which I/O throughput was performed, but results proved that detrimental VM degradation tended to occur. In addition, other processes on the same physical machine were negatively affected. Dedicated migration switches were required to divert the additional load from the core. The focus was on the I/O throughput as opposed to the incoming workload (PDR). How the PDR could be intermittently throttled without adverse degradation of either the VM in question, or other machine processes, is the central question. Successful PDR throttling, in conjunction with threshold calculations and optimized parallel adaptive memory compression / dirty page identification, would achieve a lower PDR. However, the issue of PDR can be essentially circumvented if the number of migrations taking place in a DC as a whole can be reduced.

In the majority of typical PDR environments Clark et al. [39] have shown that the initial number of dirty pages i.e. the Writable Working Set (WWS), is a small proportion of the entire page set (perhaps 10% or less) which needs to be transferred, typically resulting in minimal iterations being required before a stop condition is reached i.e. subsequent iterations
would yield diminishing returns. This is not the case where an application may be particularly memory-intensive i.e. the PDR is diabolical. Degradation of application performance during live migration (due to DPDRs or for other reasons) results in increased response times, threatening violation of SLAs and increasing power consumption. When optimizing migration algorithms for DPDRs there are 2 possible approaches to solving the problem:

1. Increase bandwidth
2. Decrease PDR

Typical applications only exhibit this DPDR-like behaviour as spikes or outliers in normal write activity. Live migration was previously abandoned by commercial algorithms when DPDRs were encountered. However, in its most recent version of vSphere (5.0), VMware has included an enhancement called 'Stun During Page Send' (SDPS) [40] which guarantees that the migration will continue despite experiencing a DPDR (VMware refers to DPDRs as 'pathological' loads). By tracking both the transmission rate and the PDR, a diabolical PDR can be identified. When a DPDR is identified by VMware, the virtual machine is slowed down ('stunned') by introducing microsecond delays (sleep processes) to the vCPU. This slows the rate at which the VM services application requests and thus reduces the PDR to less than the migration bandwidth, ensuring convergence (PDR < bandwidth). Xen implements a simple equivalent – limiting 'rogue' processes (other applications or services running parallel to the migration) to 40 write faults before putting them on a wait queue.

2.5 Monitoring Interval

Much effort has been applied to optimizing the live migration process in recent years. During migration, the primary factors impacting on a VM's response SLA are the migration time and, perhaps more importantly, the downtime. These are the metrics which define the efficiency of a migration. If a DC operator intends to migrate the VM(s) hosting a client application it must factor these constraints into its SLA guarantee. It is clear that every possible effort should be made to minimize the migration time (and downtime) - so that the
best possible SLAs may be offered to clients. This can only be achieved by choosing the VM with the lowest potential PDR for each migration. However, response and uptime SLAs become increasingly difficult to maintain if reduction (or at least minimization) of power consumption is a primary objective, because each migration taking place consumes additional energy (while both servers are running, processing cycles, RAM and bandwidth are being consumed). Based on this premise, Voorsluys et al. [41] evaluate the cost of live migration, demonstrating that DC power consumption can be reduced if there is a reduction in migrations. The cost of a migration, as they demonstrate, is dependent on a number of factors, including the amount of RAM being used by the source VM (which needs to be transferred to the destination) and the bandwidth available for the migration. The higher the bandwidth, the faster data can be transferred. Additionally, power consumption is increased because the source and destination VMs are running concurrently for much of the migration process. In order to reduce the migration count in a DC, each migration should be performed under the strict condition that the destination host is chosen such that power consumption in the DC is minimized post-migration. This can only be achieved by examining all possible destinations before each migration begins - to identify the optimal destination host for each migrating VM from a power consumption point of view. The critical algorithm for resource (VM) management is the placement algorithm.
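The following back-of-the-envelope Java sketch illustrates the dependencies just described. It is a deliberate simplification (not the cost model of Voorsluys et al. [41], nor CloudSim's internal model), and all names are illustrative.

class MigrationCostSketch {
    // Rough migration time: RAM in use on the source VM divided by the usable bandwidth.
    static double migrationTimeSeconds(double ramInUseMB, double bandwidthMbitPerSec) {
        return (ramInUseMB * 8.0) / bandwidthMbitPerSec;   // MB -> Mbit, then divide by Mbit/s
    }

    // Extra energy consumed while source and destination hosts run concurrently.
    static double concurrentEnergyJoules(double migrationTimeSeconds,
                                         double sourcePowerWatts, double destinationPowerWatts) {
        return migrationTimeSeconds * (sourcePowerWatts + destinationPowerWatts);
    }
}

Under this simplification, migrating a VM using 1,740 MB of RAM over a 1 Gbit/s link would imply roughly 14 seconds of concurrent operation on both hosts, before any re-sent dirty pages are accounted for.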
These 2 conditions, i.e.

1. Migrate the VM with the lowest Page Dirty Rate
2. Choose the destination host for minimal power consumption post-migration

form the basis upon which the Local Regression / Minimum Migration Time (LRMMT) algorithm in CloudSim [42] operates (cf. Chapter 3).

2.5.1 Static Monitoring Interval

Recent research efforts in energy efficiency perform monitoring of the incoming workload but almost exclusively focus on techniques for analysis of the data being collected rather than improving the quality of the data. In their hotspot identification paper, Xu and Sekiya [43] select a monitoring interval of 2 minutes. The interval is chosen on the basis of balancing the cost of the additional processing required against the benefit of performing the migration. The 2 minute interval remains constant during experimentation. Using an extended version of the First Fit Decreasing algorithm, Takeda et al. [44] are motivated by consolidation of servers to save power. They use a static 60 second monitoring interval for their work. Xu and Chen et al. [45] monitor the usage levels of a variety of server resources (CPU, memory, and bandwidth), polling metrics as often as they become available. Their results show that monitoring at such a granular level may not only lead to excessive data processing but the added volume of network monitoring traffic (between multiple hosts and the monitoring system) may also be disproportionate to the accuracy required. The processing requirements of DC hosts vary as the workload varies and are not known until requests arrive at the VM for service. While some a priori analysis of the workload may be performed to predict future demand, as in the work of Gmach et al. [46], unexpected changes may occur which have not been established by any previously identified patterns. A more dynamic solution is required which reacts in real-time to the incoming workload rather than making migration decisions based on a priori analysis.
VMware vSphere facilitates a combination of collection intervals and levels [47]. The interval is the time between data collection points and the level determines which metrics are collected at each interval. Examples of vSphere metrics are as follows:

• Collection Interval: 1 day
• Collection Frequency: 5 minutes (static)
• Level 1 data: 'cpuentitlement', 'totalmhz', 'usage', 'usagemhz'
• Level 2 data: 'idle', 'reservedCapacity' + all of Level 1 data (above)

VMware intervals and levels in a DC are adjusted manually by the operator as circumstances require. Once chosen, they remain constant until the operator re-configures them. Manual adjustment decisions, which rely heavily on the experience and knowledge of the operator, may not prove as accurate and consistent over time as an informed, dynamically adjusted system. In vSphere, the minimum collection frequency available is 5 minutes. Real-time data is summarized at each interval and later aggregated for more permanent storage and analysis.

2.5.2 Dynamic Monitoring Interval

Chandra et al. [48] focus on dynamic resource allocation techniques which are sensitive to fluctuations in data center application workloads. Typically SLA guarantees are managed by reserving a percentage of available resources (e.g. CPU, network) for each application. The portion allocated to each application depends on the expected workload and the SLA requirements of the application. The workload of many applications (e.g. web servers) varies over time, presenting a significant challenge when attempting to perform a priori estimation of such workloads. Two issues arise when considering provisioning of resources for web servers:

1. Over-provisioning based on worst case workload scenarios may result in potential underutilization of resources e.g. higher CPU priority allocated to an application which seldom requires it
2. Under-provisioning may result in violation of SLAs e.g. not enough CPU priority given to an application which requires it
An alternative approach is to allocate resources to applications dynamically based on observation of their behaviour in real-time. Any remaining capacity is later allocated to those applications as and when they are found to require it. Such a system reacts in real-time to unanticipated workload fluctuations (in either direction), meeting QoS objectives which may include optimization of power consumption in addition to typical performance SLAs such as response time. While Chandra and others [49, 50] have previously used dynamic workload analysis approaches, their focus was on resource management to optimize SLA guarantees i.e. performance. No consideration is given in their work to the effect on power consumption when performance is enhanced. This research differentiates itself in that dynamic analysis of the workload is performed for the purpose of identifying opportunities to reduce power consumption while also maintaining (or improving) the performance of the DC infrastructure. The search for improved energy efficiency is driven in this research by DC cost factors which were not as significant an issue 10-15 years ago as they are now.

2.6 Conclusion

This chapter provided an in-depth analysis of the state of the art in data center energy efficiency. Software solutions to energy efficiency issues were presented, demonstrating that many opportunities still exist for improvement in server room power consumption using a software approach to monitoring (and control) of the complex systems which comprise a typical DC. The principal lesson to take from prior (and existing) research in the field is that most of the DC infrastructure can be monitored using software solutions, but that monitoring (and subsequent processing of the data collected) should not overwhelm the monitoring / processing system and thus impact negatively on the operation of the DC infrastructure. This thesis proposes that dynamic adjustment of the monitoring interval with respect to the incoming workload may represent a superior strategy from an energy efficiency perspective. Chapter 3 discusses in more detail the capabilities provided by the Java-based CloudSim framework (used for this research) and the particular code modules relevant to testing the hypothesis presented herein.
Chapter 3 CloudSim

Introduction

Researchers working on data center energy efficiency from a software perspective are typically hindered by lack of access to real-world infrastructure because it is infeasible to add additional workload to a data center which already has a significant 'real-world' workload to service on a daily basis. From a commercial perspective, DC operators are understandably unwilling to permit experimentation on a network which, for the most part, has been fine-tuned to manage their existing workload. In this chapter, details of the CloudSim framework are presented with special emphasis on those aspects particularly related to this MSc research topic. The CloudSim framework [42] is a Java-based simulator developed at the University of Melbourne; the power-aware extensions used here were designed and written by Anton Beloglazov for his doctoral thesis. It provides a limited software solution to the above issues and is deployed in this research to simulate a standalone power-aware data center with LAN-based migration capabilities. The Eclipse IDE is used to run (and edit) CloudSim.

3.1 Overview

Default power-aware algorithms in CloudSim analyse the state of the DC infrastructure at static 300-second intervals. This reflects current industry practice where an average CPU utilization value for each host is polled every 5 minutes (i.e. 300 seconds) by virtualization monitoring systems (e.g. VMware). At each interval the CPU utilization of all hosts in the simulation is examined to establish whether or not they are adequately servicing the workload which has been applied to the VMs placed on them. If a host is found to be over-utilized (i.e. the CPU does not have the capacity to service the complete workload of all the VMs placed on it) a decision is made to migrate one or more of the VMs to another host where the required capacity to service the workload is available.
Conversely, if a host is found to be under-utilized (i.e. the CPU is operating at such a low capacity that power could be saved by switching it off), the remaining VMs are migrated to another host and the machine is powered off. The CloudSim modules used only implement migration when a host is over-utilized, reflecting the focus of this research. There are two primary steps in the power-aware CloudSim migration algorithm for over-utilized hosts:

1. Migrate the VM with the lowest Page Dirty Rate
2. Choose the destination host for minimal power consumption post-migration

The default CPU utilization threshold for an over-utilized host in CloudSim is 100%. An adjustable safety parameter is also provided by CloudSim, effectively acting as overhead provision. As an example, if the CPU utilization value were 90% and was then multiplied by a safety parameter of 1.2, the resulting value of 108% would exceed the over-utilization threshold. A safety parameter of 1.1 would result in a final value of 99% (for the same initial utilization), thus not exceeding the threshold.

3.2 Workload

The workloads applied to the VMs on each host in a simulated DC for power-aware CloudSim simulations are referred to as 'cloudlets'. These are flat text files which contain sample CPU utilization percentages gathered (per interval) from over 500 DC locations worldwide. As long as no migration takes place (i.e. the host doesn't become over-utilized), the VM assigned to service the workload at the beginning of a simulation (depicted in Figure 7) remains associated with that workload until the cloudlet has been completed. However, if a migration takes place (because the host has become over-utilized) the workload is then applied to the VM on the destination host. Despite the term 'VM Migration', it is the workload (not the VM) which changes location within the DC when a migration takes place.
Figure 7: CloudSim Architecture

The duration of default CloudSim simulations is 24 hours (i.e. 86,400 seconds). This equates to 288 intervals of 5 minutes (300 seconds) each. Thus, each of the 1052 cloudlets (stored in the PlanetLab directory) contains 288 values to make a value available for reading at each interval of the simulation.
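As a minimal sketch of the arithmetic involved (86,400 s / 300 s = 288 samples), the following hypothetical Java method shows how a utilization value could be looked up for a given simulation time; the actual UtilizationModelPlanetLabInMemory implementation may differ, for example by interpolating between adjacent samples.

// Hypothetical lookup: map a simulation time to the corresponding cloudlet sample.
double utilizationAt(double[] data, double timeSec, double schedulingIntervalSec) {
    int index = (int) (timeSec / schedulingIntervalSec);      // 0 .. 287 for the default run
    return data[Math.min(index, data.length - 1)] / 100.0;    // percentage -> fraction of capacity
}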
At the beginning of each simulation, the entire cloudlet is loaded into an array (UtilizationModelPlanetLabInMemory.data[]) from which the values are read at each interval throughout the simulation. Each cloudlet is assigned to a corresponding VM on a 1-to-1 basis at the beginning of the simulation. The values being read from the cloudlets are percentages which simulate 'real-world' CPU utilization values. These need to be converted to a variable in CloudSim which is related to actual work performed. CloudSim work performance is defined in MI (Million Instructions). The workload of the cloudlet (termed length) is a constant i.e. 2500 * SIMULATION_LENGTH (2500 * 86400 = 216,000,000 MI). CloudSim keeps track of the VM workload already performed by subtracting the MI completed during each interval from the total cloudlet MI length. As such, each cloudlet starts at t = 0 seconds with a workload of 216,000,000 MI and this load is reduced according to the work completed at each interval. To check whether a cloudlet has been fully executed the isFinished() method is called at each interval.

// checks whether this Cloudlet has finished or not
if (cl.isFinished()) {
    …
}

final long finish = resList.get(index).finishedSoFar;
final long result = cloudletLength - finish;
if (result <= 0.0) {
    completed = true;
}

From the code tract above it can be seen that when (or if) the VM's workload (represented by the cloudletLength variable) is completed during the simulation, the VM will be 'de-commissioned'.

3.3 Capacity

Each of the 4 VM types used in the CloudSim framework represents a 'real-world' virtual machine. They are assigned a MIPS value (i.e. 500, 1000, 2000, 2500) before the simulation begins. This value reflects the maximum amount of processing capacity on the host to which
a VM is entitled. Likewise each host CPU has an initial MIPS capacity of either 1860 or 2660, again reflecting 'real-world' servers. These configuration settings limit the number of VMs which can be run on each host and also the volume of workload which can be performed by each VM at each interval.

Example: A host has a capacity of 2660 MIPS. A VM (with a capacity of 500 MIPS) has just been started on the host and the first value read from the cloudlet array is 5% of the host's capacity (i.e. 2660 / 20 = 133 MIPS). If the next interval is 30 seconds long then the amount of instructions processed by the VM is 133 * 30 = 3990 MI. This completed work is subtracted from the total cloudlet length (i.e. 216,000,000 – 3990 = 215,996,010 MI). At each subsequent interval throughout the simulation the same algorithm is applied until such time as the remaining workload to be processed is at (or below) zero. At this stage the VM is de-commissioned because the workload is complete. In this example the 5% CPU percentage from the cloudlet (i.e. 133 MIPS) is approximately 27% (133 / 500) of the CPU capacity allocated to the VM. If the original value read from the cloudlet were greater than 18.79% of the host's capacity (i.e. 500 / 2660), the VM would have insufficient capacity to continue servicing the workload and SLA violations would occur. 2 options typically need to be considered when this happens:

1. Increase the VM's capacity on the host – not facilitated in CloudSim
2. Apply the workload to a VM with a larger capacity on a different host, requiring a migration. This will only occur if the host is also over-utilized, which is a significant shortfall in the CloudSim modules used for testing the hypothesis.

An over-utilized VM (causing SLA violations) will not result in a migration in the version of CloudSim being used for this research. Additionally, it is notable that the CloudSim reports (generated at the end of the simulation) detail very low SLA violation averages, which indicates that the particular workload (cloudlet) percentages being applied in this version of CloudSim are insufficient to push the VMs beyond their capacity. The difficulty of correctly sizing VM MIPS (and allocating appropriate host capacity) so that VMs are capable of meeting their workload requirements can be seen from this example. CloudSim goes some way to achieving this by applying VMs to hosts on a 1-to-1
basis at the start of the simulation i.e. in a default simulation with 1052 VMs being placed on 800 hosts, the first 800 VMs are applied to the first 800 hosts and the remaining 252 VMs are allocated to hosts 1 to 252. Therefore, when the simulation starts, 252 hosts have 2 VMs and the remainder host a single VM. As processing continues the VM placement algorithm attempts to allocate as many VMs to each host as capacity will allow. The remaining (empty) hosts are then powered off - simulating the server consolidation effort typical of most modern DCs. It is clear that there is a conflict of interests taking place. On the one hand there is an attempt to maximize performance by migrating VMs to hosts with excess capacity but, on the other hand, competition for CPU cycles is being created by co-locating VMs on the same host, potentially creating an over-utilization scenario.

3.4 Local Regression / Minimum Migration Time (LR / MMT)

Beloglazov concludes from his CloudSim experiments that the algorithm which combines Local Regression and Minimum Migration Time (LR / MMT) is most efficient for maintaining optimal performance and maximizing energy efficiency. Accordingly this research uses the LR / MMT algorithmic combination as the basis for test and evaluation.

3.5 Selection Policy – Local Regression (LR)

Having passed the most recent CPU utilization values through the Local Regression (LR) algorithm, hosts are considered over-utilized if the next predicted utilization value exceeds the threshold of 100% [Appendix A]. LR predicts this value using a sliding window, with each new value being added at each subsequent interval throughout the simulation. The size of the sliding window is 10. Until initial filling of the window has taken place (i.e. 10 intervals have elapsed since the simulation began), CloudSim relies on a 'fallback' algorithm [Appendix A] which considers a host to be over-utilized if its CPU utilization exceeds 70%. VMs are chosen for migration according to MMT i.e. the VM with the lowest predicted migration time will be selected for migration to another host. Migration time is based on the amount of RAM being used by the VM. The VM using the least RAM will be
chosen as the primary candidate for migration, simulating minimization of the Page Dirty Rate (PDR) during VM transfer [39] as previously discussed in Section 2.4.2.2.

3.6 Allocation Policy – Minimum Migration Time (MMT)

The destination host for the migration is chosen on the basis of power consumption following migration i.e. the host with the lowest power consumption (post-migration) is chosen as the primary destination candidate. In some cases, more than one VM may require migration to reduce the host's utilization below the threshold. Dynamic RAM adjustment is not facilitated in CloudSim as the simulation proceeds. Rather, RAM values are read (during execution of the MMT algorithm) on the basis of the initial allocation to each VM at the start of the simulation.

3.7 Default LRMMT

The LRMMT algorithm begins in the main() method of the LrMmt.java class [Appendix B]. A PlanetLabRunner object is instantiated. The PlanetLabRunner class inherits from the RunnerAbstract class and sets up the various parameters required to run the LRMMT simulation. The parameters are passed to the initLogOutput() method in the default constructor of the super class (RunnerAbstract), which creates the folders required for saving the results of the simulation. Two methods are subsequently called:

3.7.1 init()

Defined in the sub-class (PlanetLabRunner), this method takes the location of the PlanetLab workload (String inputFolder) as a parameter and initiates the simulation. A new DatacenterBroker object is instantiated. Among other responsibilities, the broker will create the VMs for the simulation, bind the cloudlets to those VMs and assign the VMs to the data center hosts. The broker's 'id' is now passed to the createCloudletListPlanetLab() method which prepares the cloudlet files in the input folder for storage in a data[288] array. It is from this array that each cloudlet value will be read so that an equivalent MI workload value can be calculated for each VM. Having created the cloudletList, the number of cloudlets (files) in the PlanetLab folder is now known and a list of VMs can be created with each cloudlet being assigned to an individual VM i.e. there is a 1-to-1 relationship between a cloudlet and a VM at the start of the simulation. The last call of the init() method creates a
hostList which takes as a parameter the number of hosts configured for the DC (i.e. 800) from the PlanetLabConstants class. On completion of the init() method the cloudlets (workload), hosts and VMs are all instantiated and ready for the data center to be created.

3.7.2 start()

The start() method creates the data center, binds all the components created in the init() method to the new data center and starts the simulation. The first helper call of the start() method is to createDatacenter() which sets up a number of parameters related to the characteristics of the DC. These include:

• arch (String) – whether the DC has a 32 or 64 bit architecture
• os (String) – the operating system running on the hosts – e.g. Linux / Windows
• vmm (String) – the virtual machine manager running on the hosts e.g. Xen
• time_zone (double) – where the DC is located – e.g. 10.0
• cost (double) – the cost of processing in this resource – e.g. 3.0
• costPerMem (double) – the cost of using memory in this resource – e.g. 0.05
• costPerStorage (double) – the cost of using storage in this resource – e.g. 0.001
• costPerBw (double) – the cost of using bandwidth in this resource – e.g. 0.0

In the case of a simulation of a cloud network, where more than one data center would be required, these values can be altered for the purposes of calculating different infrastructural costs across the cloud. In this research a single data center is being simulated. The defaults are not adjusted. Once the data center has been created, a boolean value (PowerDatacenter.disableMigrations, indicating whether or not migrations are disabled) is set to false i.e. migrations are enabled for this simulation. The VM and cloudlet lists are submitted (by the broker) to the datacenter object and the simulation is started i.e.

double lastClock = CloudSim.startSimulation();
The startSimulation() method calls the run() method, which waits for completion of all entities: the entities (cloudlets running on VMs) are executed as threads, and the stop condition for startSimulation() is reached when the threads reach the 'non-RUNNABLE' state or when there are no more events in the future event queue. Once this point has been reached the clock time is returned to the calling method (i.e. RunnerAbstract.start()) and the simulation is stopped.

Helper.printResults(datacenter, vmList, lastClock, experimentName, Constants.OUTPUT_CSV, outputFolder);

Results (if enabled) are printed to both log and trace files and the simulation is completed.
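For reference, a minimal sketch of how the characteristics listed in the previous subsection are typically assembled is shown below. The thesis code delegates this work to RunnerAbstract/Helper, so the exact wiring differs; the constructor signatures shown are those of the public CloudSim 3.x API, and the parameter values are the example defaults listed above.

import java.util.LinkedList;
import java.util.List;
import org.cloudbus.cloudsim.DatacenterCharacteristics;
import org.cloudbus.cloudsim.Host;
import org.cloudbus.cloudsim.Storage;
import org.cloudbus.cloudsim.VmAllocationPolicy;
import org.cloudbus.cloudsim.power.PowerDatacenter;

class DatacenterFactorySketch {
    static PowerDatacenter createDatacenter(String name, List<? extends Host> hostList,
                                            VmAllocationPolicy allocationPolicy,
                                            double schedulingInterval) throws Exception {
        String arch = "x86";           // 32/64-bit architecture
        String os = "Linux";           // operating system running on the hosts
        String vmm = "Xen";            // virtual machine manager
        double timeZone = 10.0;        // where the DC is located
        double cost = 3.0;             // cost of processing
        double costPerMem = 0.05;      // cost of using memory
        double costPerStorage = 0.001; // cost of using storage
        double costPerBw = 0.0;        // cost of using bandwidth

        DatacenterCharacteristics characteristics = new DatacenterCharacteristics(
                arch, os, vmm, hostList, timeZone, cost, costPerMem, costPerStorage, costPerBw);

        // The scheduling interval passed here is the monitoring interval discussed in Section 3.1.
        return new PowerDatacenter(name, characteristics, allocationPolicy,
                new LinkedList<Storage>(), schedulingInterval);
    }
}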
3.8 Over-utilization

Figure 8: Flow chart depicting the LR / MMT simulation process

The full simulation process is depicted in Figure 8. The scheduling interval (i.e. how often analysis will be performed) is set as a static variable in the Constants class. For the default CloudSim simulation, this interval is 300 seconds. At each interval the CPU utilization of every host is examined. Using a sliding window of the last 10 CPU utilization values, the local regression algorithm predicts the CPU utilization value for the next interval. If this value is below 100% no action is taken. However, if the CPU is predicted to be greater than
100% (at the next interval) the host is considered over-utilized and the MMT portion of the algorithm is called. As mentioned previously, a 'fallback' algorithm is used until the first 10 CPU values are available. The 'fallback' over-utilization threshold is 70%. The code for testing a host for over-utilization is shown below:

if (utilizationHistory.length < length) {
    return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
}

The sliding window, known in the code as the utilizationHistory, holds the last 10 values.

try {
    estimates = getParameterEstimates(utilizationHistoryReversed);
}

The getParameterEstimates() call runs the local regression algorithm against the sliding window and (after including the safety parameter as a multiplier) the predicted utilization of the host is calculated.

predictedUtilization *= getSafetyParameter();

if (predictedUtilization >= 1) {
    Constants.OverUtilizedHostsThisInterval++;
}
return predictedUtilization >= 1;

A Boolean indicating the utilization state of the host is returned to the calling function. If the host is predicted to be over-utilized at the next interval, the value of the returned Boolean will be true.
3.9 Migration

One or more VMs need to be migrated from the host in order to bring the CPU utilization back below the threshold. The VM(s) to be migrated are chosen on the basis of the amount of RAM they are using. Thus, the VM with the least RAM will be the primary candidate for migration. The VM types used by CloudSim are listed below. It can be seen that (for the most part †) different RAM values are configured for each VM at the start of the simulation. CloudSim does not include dynamic RAM adjustment so the static values applied initially remain the same for the duration. Cloud providers such as Amazon use the term 'instance' to denote a running VM. The CloudSim VM types provided simulate some of the VM instances available to customers in Amazon EC2.

1. High-CPU Medium Instance: 2.5 EC2 Compute Units, 0.85 GB
2. Extra Large Instance: 2 EC2 Compute Units, 3.75 GB †
3. Small Instance: 1 EC2 Compute Unit, 1.7 GB
4. Micro Instance: 0.5 EC2 Compute Unit, 0.633 GB

public final static int VM_TYPES = 4;
public final static int[] VM_RAM = { 870, 1740 /* † */, 1740, 613 };

All types are deployed when the VMs are being created at the beginning of the simulation. Assuming all 4 are on a single host when the host is found to be over-utilized, the order in which the VMs will be chosen for migration is:

1. 613 [index 3]
2. 870 [index 0]
3. 1740 [index 1]
4. 1740 [index 2]

† Note: the VM_RAM value for the Extra Large Instance in the default CloudSim code is 1740. This does not reflect the 'real-world' server being simulated, which has a RAM value of 3.75 GB.

If there is more than one VM with RAM of 613 on the host they will be queued for migration before the first '870' enters the queue. The chosen VM is then added to a migration map which holds a key-value pair of the:
• VM ID
• Destination Host ID

Once all the hosts have been analysed, the VMs in the migration map are migrated to their chosen destinations using the VM placement algorithm. The destination for each VM is chosen with the objective of optimizing power consumption i.e. the host which will use the least power post-migration is deemed the most suitable.

public Vm getVmToMigrate(PowerHost host) {
    List<PowerVm> migratableVms = getMigratableVms(host);
    if (migratableVms.isEmpty()) {
        return null;
    }
    Vm vmToMigrate = null;
    double minMetric = Double.MAX_VALUE;
    for (Vm vm : migratableVms) {
        if (vm.isInMigration()) {
            continue;
        }
        double metric = vm.getRam();
        if (metric < minMetric) {
            minMetric = metric;
            vmToMigrate = vm;
        }
    }
    return vmToMigrate;
}

From the code above it can be seen that the VM with the least RAM (vm.getRam()) is chosen for migration, the objective of which is to minimize the downtime required to transfer the final RAM pages during the migration. Increased downtime during migration would result in potential SLA violations as described in detail in Section 2.4.2.3. It is clear that, to a certain extent, CloudSim is replicating the effort to minimize SLA violations which takes place during 'real-world' live migrations.
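The key-value pairing described above can be pictured with the simplified sketch below, which maps a VM id to its chosen destination host id. CloudSim itself records the selected VM and host objects in a list of maps rather than by id; this is purely an illustration of the pairing, with hypothetical names.

import java.util.LinkedHashMap;
import java.util.Map;

class MigrationMapSketch {
    // VM id -> destination host id, populated while over-utilized hosts are analysed.
    private final Map<Integer, Integer> migrationMap = new LinkedHashMap<>();

    // For each over-utilized host: record the VM chosen by MMT (least RAM) together with
    // the destination host expected to consume the least power post-migration.
    void schedule(int vmId, int destinationHostId) {
        migrationMap.put(vmId, destinationHostId);
    }

    // Once all hosts have been analysed, the placement algorithm performs the migrations.
    Map<Integer, Integer> plannedMigrations() {
        return migrationMap;
    }
}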
3.10 Reporting

CloudSim facilitates reporting on various metrics available during the simulation. Reports are generated as either flat text or MS Excel-compatible Comma Separated Values (CSV) file formats. Additionally, metrics can be sent to the Eclipse console and read as the simulation progresses. Below is a sample of the metrics summary from the trace file of the default CloudSim simulation. Notable metrics include:

• Number of Hosts
• Number of VMs
• Energy Consumption
• Over-utilized Hosts
• Number of VM Migrations
• Average SLA Violation

Trace.printLine(String.format("Experiment name: " + experimentName));
Trace.printLine(String.format("Number of hosts: " + numberOfHosts));
Trace.printLine(String.format("Number of VMs: " + numberOfVms));
Trace.printLine(String.format("Total simulation time: %.2f sec", totalSimulationTime));
Trace.printLine(String.format("Energy consumption: %.2f kWh", energy));
Trace.printLine(String.format("Overutilized Hosts: %d", Constants.OverUtilizedHostsThisInterval));
Trace.printLine(String.format("Number of VM migrations: %d", numberOfMigrations));
Trace.printLine(String.format("SLA: %.5f%%", sla * 100));
Trace.printLine(String.format("SLA perf degradation due to migration: %.2f%%", slaDegradationDueToMigration * 100));
Trace.printLine(String.format("SLA time per active host: %.2f%%", slaTimePerActiveHost * 100));
Trace.printLine(String.format("Overall SLA violation: %.2f%%", slaOverall * 100));
Trace.printLine(String.format("Average SLA violation: %.2f%%", slaAverage * 100));

3.11 Conclusion

This chapter discussed some of the capabilities and limitations of the default CloudSim framework being used for this research and identified the modules most related to testing the hypothesis presented in this research. An explanation of how CloudSim processes the
workload applied from the cloudlets was also provided, and a range of shortfalls and possible errors was identified. Chapter 4 details the changes made to the default framework and the new code that was integrated into CloudSim to evaluate the hypothesis.
Chapter 4 Implementation

Introduction

As this capability is not provided in the default CloudSim package, additional code was required to test the effect on power consumption when the monitoring interval is adjusted. Chapter 3 provided an overview of the default power-aware CloudSim simulation and the related modules. The specific capabilities (and limitations) of the framework, as they apply to dynamic interval adjustment, were also outlined. In this chapter the changes which were required to implement dynamic adjustment of the monitoring interval are described in more detail. The primary contribution of the thesis is to evaluate the impact of moving from a static to a dynamic monitoring process whereby the predicted average utilization for the DC at each interval is used to adjust the next interval. Before writing the code, which would ultimately be integrated with the existing LR/MMT modules in CloudSim, an algorithm was designed to clarify the steps involved.

4.1 Interval Adjustment Algorithm

The dynamic interval adjustment algorithm involves two principal steps:

1. Calculate the weighted mean of the CPU utilization values for all operational hosts in the data center as in Equation 1. Non-operational hosts are excluded from the calculation as they do not affect the average CPU utilization for the DC:

$\text{weighted mean} = \dfrac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$    (1)

where $w_i$ is the weight applied to the range within which the predicted utilization value $x_i$ for each operational host falls and $n$ is the number of operational hosts.
2. Choose and set the next monitoring interval with respect to the appropriate weighted mean from Table 4.1.

The premise upon which weightings are applied to calculate the average utilization of the DC is simplified for the purposes of this research. The primary objective is to adjust the monitoring interval with respect to the upper utilization threshold. As such, a straightforward set of weights (from 1 to 10) is applied to the CPU utilization of each host, such that hosts which have a higher CPU utilization (i.e. are closer to 100%) are given more priority in the calculation. If a simple average were taken of host utilization across the DC, this would have the effect of masking hosts that are close to the threshold, where SLAs are in danger of being violated. If the lower threshold were taken into consideration, a different set of weights would be appropriate, with increased importance applied in the regions closer to both thresholds and reduced importance at the center (e.g. 40-60% CPU utilization). There is certainly scope for further investigation of the simplified set of weights applied in this research, depicted in Table 4.1.

Table 4.1: Application of Weights to Predicted Utilization

Predicted Utilization (%) per host (xi)    Weight Applied (wi)
1 – 10                                     1
11 – 20                                    2
21 – 30                                    3
…                                          …
91 – 100                                   10

The monitoring intervals applied to the resulting weighted average prediction are depicted in Figure 9. As with the weights discussed above, the intervals were chosen somewhat arbitrarily and would benefit from further analysis. The maximum interval is aligned with the existing default interval in CloudSim i.e. 300 seconds. A minimum interval of 30 seconds facilitates 10 intervals in total, each having a corresponding 10% CPU utilization range from 0 to 100.
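As a brief worked example of Equation 1 and the weights in Table 4.1 (host values chosen purely for illustration): with two operational hosts predicted at 95% utilization (weight 10) and 15% utilization (weight 2), the weighted mean is (10 × 95 + 2 × 15) / (10 + 2) = 980 / 12 ≈ 81.7%, whereas a simple mean of 55% would largely mask the host approaching the over-utilization threshold.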
However, if the minimum interval of 30 seconds were applied for the full 24 hour simulation, 2880 values would be required (i.e. (60 x 60 x 24) / 30) in each PlanetLab cloudlet file to ensure a value could be read at each interval. The 288 values in the 1052 default files provided by CloudSim were thus concatenated (using a C# program written specifically for this purpose) to ensure sufficient values were available, resulting in a total of 105 complete files with 2880 values each.

if (Constants.IsDefault) {
    data = new double[288];   // default PlanetLab workload
} else {
    data = new double[2880];  // dynamic workload
}

The code tract above demonstrates the difference between the two data[] arrays which hold the PlanetLab cloudlet values read at each interval during the simulation. The default array is 288 in length while the dynamic array is 2880 – providing sufficient indices in the array to store the required number of cloudlet values throughout the simulation.
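The concatenation utility itself was written in C# and is not reproduced in the thesis. The following Java sketch illustrates the same idea under assumed file and directory names: appending ten 288-value PlanetLab files produces one 2880-value file, which is consistent with 1052 default files yielding 105 complete dynamic files (the two left over being discarded).

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CloudletConcatenatorSketch {
    public static void main(String[] args) throws IOException {
        Path in = Paths.get("planetlab-default");    // 1052 files x 288 values (illustrative path)
        Path out = Paths.get("planetlab-dynamic");   // 105 files x 2880 values (illustrative path)
        Files.createDirectories(out);

        List<Path> sources;
        try (Stream<Path> stream = Files.list(in)) {
            sources = stream.sorted().collect(Collectors.toList());
        }

        int completeFiles = sources.size() / 10;     // 1052 / 10 -> 105 complete files
        for (int i = 0; i < completeFiles; i++) {
            Path target = out.resolve("cloudlet_" + i);
            for (int j = 0; j < 10; j++) {
                // Append the 288 values of each source file to the concatenated target file.
                Files.write(target, Files.readAllLines(sources.get(i * 10 + j)),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            }
        }
    }
}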
Figure 9: Application of the Monitoring Interval Based on Weighted Utilization Average

It should be noted that the intervals and CPU ranges were chosen somewhat arbitrarily and could be fine-tuned following further investigation. Additionally, as a result of the reduced file count after concatenation, the number of hosts running in the simulation was reduced, from the CloudSim default of 800 to 80, to maintain the ratio of VMs to hosts (Table 4.2):

Table 4.2: VMs to Hosts – Ratio Correction

           Cloudlets / VMs    Hosts
Default    1052               800
Dynamic    105                80
4.2 Comparable Workloads

This research focuses on a comparison of the default CloudSim simulation with a dynamic version and was required, therefore, to use comparable workloads. Ensuring that the workloads are comparable (in a simulation which monitors at different intervals) involves applying the same amount of processing to each VM during each interval. Accordingly, a new set of files was created for the default simulation. The values for these files were calculated based on the average of the values used in the dynamic simulation by the time each 300-second interval had elapsed. This was achieved by running the dynamic simulation for 24 hours and recording the data observed (e.g. interval length, cumulative interval, average utilization per interval), as shown in Figure 10. This data was initially written to a Microsoft Excel worksheet from within the CloudSim reporting structure and then exported to a Microsoft SQL Server database. The number (727) and length of the intervals in the dynamic simulation can be seen in Figure 11. From Figure 12, it is clear that a lower and/or upper offset may occur during calculation i.e. the dynamic interval 'straddles' the 300-second mark. To maintain as much accuracy as possible for the calculation of the default file values, two new variables (i.e. offsetBelow300, offsetAbove300) were introduced.
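As an illustrative example (interval lengths assumed): if the dynamic simulation produces intervals of 210 s and then 120 s, the accumulator reaches 330 s on the second interval; the first 90 s of that interval (its offsetBelow300 portion) complete the current 300-second window, while the remaining 30 s (its offsetAbove300 portion, weighted by that interval's utilization) are held over and added to the next 300-second window.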
Figure 10: A screenshot of the data generated for calculation of the default workload

Figure 11: Intervals calculated during the dynamic simulation
Figure 12: Calculation of the Average CPU Utilization for the Default Files

The length of each interval is added to an accumulator until the total equals (or exceeds) 300 seconds. The average utilization for the accumulated intervals is then calculated. This average includes (if required) the average for the final portion of any interval below the 300-second mark. When an offset occurs above the 300-second mark (in the current accumulator), it is 'held over' (i.e. added to the accumulator in the next 300-second interval). Some of the new Java code written in CloudSim to monitor the interval and workload activity (generating the data required for the calculator) is shown below – the code comments provide explanation:

if (Constants.IntervalGenerator) {
    int intervalDifference = 0;
    int iOffsetBelowForPrinting = 0;
    int iOffsetAboveForPrinting = 0;
    int iAccumulatedIntervalsForPrinting = 0;

    if (Constants.accumulatedIntervals >= 300) {
        // Constants.accumulatedIntervals is exactly 300
        int accumulated = 300;
        // calculate offsets
        if (Constants.accumulatedIntervals > 300) {
            accumulated = (int) Constants.accumulatedIntervals - (int) dInterval;
        }
        Constants.offsetBelow = 300 - accumulated;
        Constants.offsetAbove = Constants.accumulatedIntervals - 300;
    }
}

4.3 C# Calculator

Calculation of the new per-interval workloads was achieved using a separate 'calculator' program written in C#. The calculator implements the process depicted in Figure 12. The principal C# method used to calculate the new averages for the default files is CreateDefaultFiles(). The comments in the code explain each step of the process:

private void CreateDefaultFiles() {
    // read in the first 727 values from each file - used in the dynamic simulation
    FileInfo[] Files = dinfo.GetFiles();
    string currentNumber = string.Empty;
    int iOffsetAboveFromPrevious = 0;
    // initialize at max to ensure it is not used on the first iteration
    int iIndexForOffsetAbove = 727;
    foreach (FileInfo filex in Files) {
        using (var reader = new StreamReader(filex.FullName)) {
            // fill dynamicIn
            for (int i = 0; i < 727; i++) {
                dynamicIn[i] = Convert.ToInt32(reader.ReadLine());
            }
            int iCurrentOutputIndex = 0;
            // Calculate
            for (int k = 0; k < 727; k++) {
                // add each average used here - including any offset
                float iAccumulatedTotal = 0;
                // reached >= 300 accumulated intervals
                int iReadCount = Convert.ToInt32(ds.Tables[0].Rows[k]["ReadCount"]);
                if (iReadCount > 0) {
                    // first interval
                    if (k == 0) {
                        int iValue = dynamicIn[k];
                        iAccumulatedTotal += iValue;
                    } else {
                        // readCount == 1: just check for offsets
                        if (iReadCount > 1) {
                            for (int m = 1; m < iReadCount; m++) {
                                int iValue = dynamicIn[k - m];
                                int iInterval = Convert.ToInt32(ds.Tables[0].Rows[k - m]["Interval"]);
                                iAccumulatedTotal += iValue * iInterval;
                            }
                        }
                    }

                    // offset - read this interval
                    int iOffsetBelow = Convert.ToInt32(ds.Tables[0].Rows[k]["OffsetBelow300"]);
                    if (iOffsetBelow > 0) {
                        iAccumulatedTotal += iOffsetBelow * dynamicIn[k];
                    }

                    // use the previous offset above in this calculation
                    if (k >= iIndexForOffsetAbove) {
                        iAccumulatedTotal += iOffsetAboveFromPrevious;
                        // reset
                        iOffsetAboveFromPrevious = 0;
                        iIndexForOffsetAbove = 727;
                    }

                    // use this offset above in the next calculation
                    int iOffsetAbove = Convert.ToInt32(ds.Tables[0].Rows[k]["OffsetAbove300"]);
                    if (iOffsetAbove > 0) {
                        // value for offset above to add to the next accumulator
                        iOffsetAboveFromPrevious = iOffsetAbove * dynamicIn[k];
                        // use in the next calculation - at a minimum
                        iIndexForOffsetAbove = k;
                    }
                    float fAverage = iAccumulatedTotal / 300;
                    int iAverage = Convert.ToInt32(iAccumulatedTotal / 300);

                    // first interval
                    if (k == 0) {
                        iAverage = dynamicIn[k];
                    }

                    // save averaged value to array for writing
                    defaultOutput[iCurrentOutputIndex] = iAverage.ToString();
                    iCurrentOutputIndex++;
                }
            }
        }
        // Print to text file for the default cloudlet
        System.IO.File.WriteAllLines("C:\\Users\\scooby\\Desktop\\DefaultNewFiles\\" + filex.Name, defaultOutput);
    }
}

The code above depicts the process by which the values required for the default file averages were calculated. The results of the calculator program were then written back out to flat text files i.e. the same format as the original CloudSim cloudlet files. To compare the difference between the default and dynamic workloads, both the default and dynamic simulations were run using the new cloudlet files, with a few lines of additional code (added to the default constructor of the UtilizationModelPlanetLabInMemory class in CloudSim) to monitor the workload during each simulation. This additional code ensured that the workload would be observed (and accumulated) as each cloudlet was processed. As the data is being read into the data[] array from the PlanetLab cloudlet files, each value is added to a new accumulator variable i.e. Constants.totalWorkload:

int n = data.length;
for (int i = 0; i < n - 1; i++) {
    data[i] = Integer.valueOf(input.readLine());
    Constants.totalWorkload += data[i];
}
The Constants.totalWorkload value was then divided by the relevant number of intervals (default: 288; dynamic: 727) to calculate the average workload per interval. A difference of less than 1% between the per-interval workloads was observed, validating, for the most part, the results generated for the default cloudlet files by the C# program. The small difference may be explained by migrations taking place during collection of the CPU utilization data prior to export. For example, if a workload on a VM is measured at 10% of its host's capacity and the VM is then migrated to a host with a lower capacity, the same workload will require more time to complete, skewing the average CPU utilization that would otherwise have been calculated had the migration not taken place. This scenario was not factored into the calculation of the per-interval average CPU utilization, resulting in the difference of approximately 1% between the workloads. This error margin was considered acceptable in the context of the overall thesis objectives.

4.4 Interval Adjustment Code

The updateCloudletProcessing() method in the PowerDatacenter class is the principal cloudlet-processing method; it is run at each interval and provided by default in CloudSim. As such, it is the ideal place from which to call the additional code required to implement interval adjustment. To differentiate between the default and dynamic simulations at runtime, a boolean constant (IsDefault) was created to indicate which type of simulation is being run. Based on the value of IsDefault, the code forks either to the default CloudSim code or to the dynamic code written to adjust the monitoring interval. The fork returns to the default CloudSim code once the AdjustInterval() method has been executed:

if (Constants.IsDefault) {
    //run default simulation
}
else {
    //run dynamic simulation
    AdjustInterval();
}
The AdjustInterval() method (outlined below) is the entry point for the dynamic monitoring interval adjustment simulation. Figure 13 depicts how the dynamic code interacts with the CloudSim default:

protected void AdjustInterval(double currentTime) {
    double dTotalUsageForAverage = 0;
    double dAverageUsage = 0;
    int iDenominator = 0;
    int iWeight = 0;
    double timeDiff = currentTime - getLastProcessTime();
    for (PowerHost host : this.<PowerHost> getHostList()) {
        double utilizationOfCpu = host.getUtilizationOfCpu();
        if (utilizationOfCpu > 0) {
            iWeight = GetWeight(utilizationOfCpu);
            dTotalUsageForAverage += utilizationOfCpu * iWeight;
            iDenominator += iWeight;
        }
    }
    dAverageUsage = dTotalUsageForAverage / iDenominator;
    //alter the scheduling interval according to the average utilization
    SetSchedulingIntervalRelativeToUtilization(dAverageUsage);
}
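As a concrete illustration of the weighted average computed above, consider three hosts at 15%, 45% and 85% CPU utilization (hypothetical values, not taken from the simulations), with the weights returned by the GetWeight() helper described later in this section (one band per 10% of utilization):

// Hypothetical illustration of the weighted DC average computed by AdjustInterval().
// The utilizations are invented; the weights follow the 10% bands of GetWeight() (Table 4.1).
public class WeightedAverageExample {
    public static void main(String[] args) {
        double[] utilization = {0.15, 0.45, 0.85}; // three hosts at 15%, 45% and 85%
        int[] weight = {2, 5, 9};                  // corresponding GetWeight() values
        double total = 0;
        int denominator = 0;
        for (int i = 0; i < utilization.length; i++) {
            total += utilization[i] * weight[i];
            denominator += weight[i];
        }
        System.out.println(total / denominator);   // ~0.64
    }
}

The weighted average (approximately 64%) is noticeably higher than the unweighted mean of the same three hosts (approximately 48%), so the weighting biases the DC average towards the busier hosts and, in turn, selects a shorter monitoring interval.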
Figure 13: How the dynamic interval adjustment code interacts with CloudSim
A host which is not running would have a CPU utilization of 0. As depicted by the code, only hosts with a CPU utilization greater than 0 are included in the average CPU utilization for the DC, i.e.

if (utilizationOfCpu > 0)

A weighting is then applied (by the helper function GetWeight(), below) to each result obtained. This weighting (cf. Table 4.1) is based on the CPU utilization calculated for each host by the getUtilizationOfCpu() method provided by default in CloudSim:

public int GetWeight(double utilization) {
    double iUtilization = utilization * 100;
    int iWeight = 0;
    //check utilization value range
    if (iUtilization >= 0.00 && iUtilization <= 10.00) {
        iWeight = 1;
    }
    else if (iUtilization > 10.00 && iUtilization <= 20.00) {
        iWeight = 2;
    }
    else if (iUtilization > 20.00 && iUtilization <= 30.00) {
        iWeight = 3;
    }
    else if (iUtilization > 30.00 && iUtilization <= 40.00) {
        iWeight = 4;
    }
    else if (iUtilization > 40.00 && iUtilization <= 50.00) {
        iWeight = 5;
    }
    else if (iUtilization > 50.00 && iUtilization <= 60.00) {
        iWeight = 6;
    }
    else if (iUtilization > 60.00 && iUtilization <= 70.00) {
        iWeight = 7;
    }
    else if (iUtilization > 70.00 && iUtilization <= 80.00) {
        iWeight = 8;
    }
    else if (iUtilization > 80.00 && iUtilization <= 90.00) {
        iWeight = 9;
    }
    else if (iUtilization > 90.00 && iUtilization <= 100.00) {
        iWeight = 10;
    }
    return iWeight;
}

The average utilization for the DC is then passed to another helper function (SetSchedulingIntervalRelativeToUtilization(), shown below) which adjusts the next monitoring interval (i.e. Constants.SCHEDULING_INTERVAL) based on the range within which the utilization falls.

public void SetSchedulingIntervalRelativeToUtilization(double dAverageUsage) {
    double iUtilization = dAverageUsage * 100;
    double dInterval = 300;
    if (iUtilization >= 0.00 && iUtilization <= 10.00) {
        dInterval = 300;
    }
    else if (iUtilization > 10.00 && iUtilization <= 20.00) {
        dInterval = 270;
    }
    else if (iUtilization > 20.00 && iUtilization <= 30.00) {
        dInterval = 240;
    }
    else if (iUtilization > 30.00 && iUtilization <= 40.00) {
        dInterval = 210;
    }
    else if (iUtilization > 40.00 && iUtilization <= 50.00) {
        dInterval = 180;
    }
    else if (iUtilization > 50.00 && iUtilization <= 60.00) {
        dInterval = 150;
    }
    else if (iUtilization > 60.00 && iUtilization <= 70.00) {
        dInterval = 120;
    }
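    //Note: each successive 10% utilization band shortens the next monitoring interval by
    //30 seconds, so for utilizations in the 0 - 100% range this chain of conditions is
    //equivalent to dInterval = 330 - 30 * GetWeight(dAverageUsage), with a floor of
    //30 seconds once the average utilization exceeds 90%.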
    else if (iUtilization > 70.00 && iUtilization <= 80.00) {
        dInterval = 90;
    }
    else if (iUtilization > 80.00 && iUtilization <= 90.00) {
        dInterval = 60;
    }
    else {
        dInterval = 30;
    }
    setSchedulingInterval(dInterval);
    Constants.SCHEDULING_INTERVAL = dInterval;
}

The process then returns to the AdjustInterval() method, which passes control back to the default CloudSim code where the fork began. The default CloudSim code continues, completing the (per-interval) updateCloudletProcessing() method and carrying the simulation into the next interval, the length of which has now been adjusted with respect to the predicted average CPU utilization for the DC.

4.5 Reporting

The metrics available in the default CloudSim reports (as described in Section 3.10) were found to be sufficient for the purposes of the testing phase of this research (cf. Chapter 5). However, some additional variables were needed during the design phase of the new interval adjustment algorithm. These were added as static fields in the Constants class so that they would be globally available across a range of classes and could be used without instantiating any new objects. Most are associated with calculation of the per-interval workload for the default PlanetLab cloudlet files. They include:

public static int fileBeingRead = 0;
public static boolean IntervalGenerator = false;
public static int previousIntervalCount = 0;
public static int intervalCount = 0;
public static int offsetBelow = 0;
public static int offsetAbove = 0;
public static int accumulatedOffsetTotal = 0;
public static int intervalLengthTotal = 1;
The list below depicts a typical trace file for the default CloudSim LR/MMT simulation. It contains the output of the CloudSim reporting class, i.e. Helper(), together with some of the additional metrics added for the purposes of this research (shown in bold green italics in the original formatting):

- Experiment name: default_lr_mmt_1.2
- Number of hosts: 80
- Number of VMs: 105
- Total simulation time: 86100.00 sec
- Energy consumption: 16.76 kWh
- Overutilized Hosts: 2249
- Number of VM migrations: 2305
- Total Workload: 3833.310000
- SLA: 0.00428%
- SLA perf degradation due to migration: 0.07%
- SLA time per active host: 5.97%
- Overall SLA violation: 0.53%
- Average SLA violation: 11.61%
- SLA time per host: 0.05%
- Number of host shutdowns: 1184
- Mean time before a host shutdown: 627.78 sec
- StDev time before a host shutdown: 1443.06 sec
- Mean time before a VM migration: 17.12 sec
- StDev time before a VM migration: 7.67 sec

4.6 Conclusion

This chapter has discussed the modifications required to CloudSim to implement dynamic adjustment of the monitoring interval, along with a description of how the new code integrates with the default code provided by CloudSim. In Chapter 5 the tests carried out to compare the default and dynamic simulations are described and the results analysed. Finally, potential opportunities for improvement of the CloudSim framework, identified during the course of this research, are suggested.
Chapter 5 Tests, Results & Evaluation

Introduction

Chapter 4 detailed the code changes that were required in the CloudSim framework to implement dynamic adjustment of the monitoring interval. The new code integrates seamlessly with the existing framework and no alterations were made to the underlying CloudSim architecture. This chapter deals with the specifics of the simulations carried out to test the hypothesis that opportunities for reduction of power consumption can be identified when the length of the interval changes with respect to the varying workload experienced by a typical DC.

5.1 Tests & Results

Using the dynamic PlanetLab cloudlet files and the interval adjustment code, the simulation was run for the full duration (i.e. 86100 seconds) and compared with the CloudSim default, which used the cloudlet files generated by the C# calculator. Key results are presented in Table 5.1: a significant reduction in over-utilized hosts, migrations and power consumption was observed (power consumption fell from 16.76 kWh to 8.23 kWh, a reduction of approximately 51%).

Table 5.1: Simulation Results

          Interval (seconds)    Time (seconds)   Interval Count   Over-utilized Hosts   Migration Count   Power Consumption (kWh)
Static    300                   86100            287              2249                  2305              16.76
Dynamic   30 - 270 (variable)   86100            727              1697                  979               8.23

Figure 14 depicts the intervals that were calculated during the dynamic simulation based on the average CPU utilization for the DC. It can be seen that the interval ranges from a minimum of 30 seconds to a maximum of 270 seconds, indicating that the average CPU utilization for the DC never dropped below 10% (so the 300-second maximum was never selected) and at times rose above 90% (producing the 30-second minimum). From Figure 14, and as described in Chapter 4, the number of intervals in the dynamic simulation is 727, compared with 288 in the static simulation.
Figure 14: Interval Calculation for the Dynamic Simulation

Figure 15 shows a comparison of the VM count during both simulations, indicating that VMs are constantly 'de-commissioned' as their workloads are completed during the simulation. There are 34 VMs still running at the end of the default simulation, whereas only 6 VMs have not completed their workloads at the end of the dynamic simulation. This indicates that the VM placement algorithm has performed more efficiently in the dynamic simulation, i.e. more of the PlanetLab workload from the cloudlet files has been completed by the time the dynamic simulation finishes.
Figure 15: VM decommissioning comparison

Comparing Figures 16 and 17, which depict the operational hosts at each interval in the default and dynamic simulations, it can be seen that more efficient use is made of the hosts when the interval is adjusted dynamically. A minimal number of operational servers is reached sooner in the dynamic simulation, and the power-on / power-off behaviour of the default simulation (which consumes both time and energy) is largely absent from the dynamic simulation. This is discussed further in Section 5.2.1 below.
Figure 16: Operational Hosts - Default Simulation

Figure 17: Operational Hosts - Dynamic Simulation
Figure 18 depicts the per-interval average CPU utilization in the DC for the dynamic simulation. A cluster of values can be seen at approximately 99% between 17 and 19 hours into the dynamic simulation. There is a single operational host in this period (Figure 17), with between 9 and 19 VMs running on it. The high average CPU utilization is a direct result of all the remaining VMs being placed on this single host. From a power consumption perspective, the VM placement algorithm is at its most efficient at this point in the dynamic simulation, optimizing energy efficiency by minimizing the number of operational hosts required to service the DC workload. This placement configuration would not be possible if the PlanetLab cloudlet workloads were higher; it is the relatively low CPU values being allocated to the VMs from the cloudlet files that make placement on a single host possible in this period.

Figure 18: Average CPU Utilization - Dynamic Simulation

5.2 Evaluation of Test Results

In Section 5.2 the results summarized above are investigated, based on an understanding of CloudSim derived from its code, code comments and help documentation / user forums, and the research findings are explained using an 'under-the-hood' analysis. Section 5.3 identifies a number of limitations in the CloudSim framework which would benefit from further
investigation, and is tailored more towards future researchers using the CloudSim framework than towards DC operators.

5.2.1 Why are the Over-Utilized Hosts / Migration Counts Reduced?

CloudSim performs migrations when an over-utilized host is identified. One or more VMs are chosen for migration to bring the CPU utilization of the host back below the over-utilization threshold. It is not explicitly clear from the available CloudSim metrics why the migration count is reduced in dynamic mode relative to static mode. The VM placement algorithm defined by CloudSim is a complex module within the framework. The logic upon which it works is that the most appropriate destination for the migrating VM is the one which results in the lowest power consumption post-migration. However, it is clear from the operational hosts observed in the dynamic simulation (depicted in Figure 17) that the VM placement algorithm is also performing consolidation (cf. Sections 2.4.1, 3.3 and 5.1 above). During the period when only one host is operational in the dynamic simulation (Figure 17: 17 - 24 hours) there were as many as 19 VMs running on that host (cf. Section 5.1). As a result, some over-allocation is occurring. Over-allocation occurs when so many VMs are placed on a host that the host has insufficient capacity to service the workload of every VM in each time frame. In the effort to consolidate, VMs will sometimes be placed on a host which is already running rather than switching on a new host. The effect is that, due to the increased length of the CPU queue on the host (i.e. more VMs are 'waiting' for processing time slices), some VMs will not receive the CPU cycles required to complete their workload in the available interval. The expected action would be migration of the 'starved' VMs, but it is evident from Figure 17 that no migrations are taking place, i.e. no other host is switched on. This is due to one of the limitations identified in the framework: CloudSim only performs a VM migration when the entire host is over-utilized, not when an individual VM requires more capacity (cf. Section 5.3.4 below). Clearly, there is a trade-off between consolidation and migration. The conclusion reached, based on the results observed, is that the shorter intervals in the dynamic simulation result in more frequent analysis, performing this consolidation / migration trade-off more efficiently than the default simulation and resulting in fewer over-utilized hosts and a reduced migration count.
5.2.2 Result of Reduced Migration Count

Beloglazov et al. [51] show that decreased power consumption can be achieved in a DC if the VM migration count can be reduced. Their work is based on the premise that additional resources are consumed during a migration due to the extra processing required to move the memory of the VM from its current host to another. Those processes may include:

- Identification of a suitable destination server, i.e. the VM placement algorithm
- Network traffic
- CPU processing on both the source and destination servers while two copies of the VM are running concurrently

In the case of live migration, transfer of the VM's memory image is performed by the Virtual Machine Manager (VMM), which copies the RAM associated with the VM's service across to the destination while the service on the VM is still running. RAM which is re-written on the source must be transferred again. This process continues iteratively until the remaining volume of RAM to be transferred is small enough that the service can be switched over with minimal interruption. This period of time, while the service is unavailable, is known as downtime. Any attempt to improve migration algorithms must take live-copy downtime into consideration to prevent (or minimize) violations of response-time SLAs. CloudSim achieves this (to some extent) by choosing for migration the VM with the lowest RAM. However, the CloudSim SLA metric in the modules used for this research does not take this downtime into consideration. Dynamic adjustment of the monitoring interval minimizes this issue of RAM transfer by reducing the need for the migration in the first place: the power that would have been consumed by the additional migrations is saved when they are not required.

5.2.3 Scalability

As outlined in Section 2.1 there is a trade-off between DC monitoring overhead costs and net DC benefits. The issue here is that the additional volume of processing which takes place when shorter monitoring intervals are applied may become so large that it would not be beneficial to apply dynamic interval adjustment at all.
Take, for example, Amazon's EC2 EU West DC (located in Dublin, Ireland), which is estimated to contain over 52,000 operational servers [52]. Processing the data (CPU utilization values) required to perform the interval adjustment is not an insignificant additional workload: the algorithm must calculate the average CPU utilization of some 52,000 servers and then apply the new interval. If this calculation were to take place every 30 seconds (in a DC with an average CPU utilization above 90%) rather than every 300 seconds, there would be a ten-fold increase in the total processing volume, which includes both collection and analysis of the data points. While it is unlikely that even the average CPU utilization of the most efficient DC would exceed 90% for any extended period of time, it is clear that the size of the DC (i.e. the number of operational servers) does play a role in establishing whether or not the interval adjustment algorithm described in this research should be applied. Microsoft's Chicago DC has approximately 140,000 servers installed. With increasingly larger DCs being built to meet growing consumer demand, it is reasonable to expect that DC server counts will reach 500,000 in the foreseeable future. Rather than viewing the entire DC as a single entity from a monitoring perspective, perhaps the most viable application of dynamic monitoring interval adjustment would be to sub-divide these larger DCs into more manageable sections, calculating the monitoring interval for each section separately.

5.3 Evaluation of CloudSim

5.3.1 Local Regression Sliding Window

The adjusted interval in this research (discussed in Section 3.5) results in the 'first fill' of the sliding window occurring sooner than in the CloudSim default, because the longest interval in the dynamic version (5 minutes) is the minimum (static) interval in the CloudSim default. The first 10 values in the sliding window take 3000 seconds (i.e. 300 x 10) in the default CloudSim simulation whereas, in the dynamic version, the window is filled after 1480 seconds. The result is a small increase in the accuracy of the utilization prediction at the beginning of the simulation, because the less accurate 'fallback' algorithm is 'discarded' sooner. The size of the sliding window in the default CloudSim framework is 10, i.e. the 10 most recent CPU utilization values from the host are used each time the local regression algorithm
is performed. If there were more values in the window, the algorithm would be less sensitive to short-term changes in the workload. Clearly, the size of the sliding window should be proportionate to the level of sensitivity required, and the choice of this parameter would most likely benefit from a detailed sensitivity analysis.

5.3.2 RAM

Chapter 3 detailed the configuration settings provided by default in CloudSim in an effort to simulate 'real-world' VMs. However, for reasons unclear from the code, two of the VM types have the same RAM allocation, i.e. 1740 MB. It would be preferable if either:

- four distinct VM types were configured, to better reflect 'real-world' scenarios and improve the VM selection policy deployed by default, or
- provision for dynamic RAM adjustment were included in the CloudSim framework (cf. Section 5.3.3 below).

5.3.3 Dynamic RAM Adjustment

No facility is provided in CloudSim to adjust the amount of RAM available to a VM while the simulation proceeds. Should a VM require more RAM, a migration to another host, where a higher-capacity VM can be configured, is required. While this simulates many 'real-world' systems, the facility to dynamically adjust the amount of RAM allocated to a VM (without requiring a migration) would improve the VM selection algorithm.

5.3.4 SLA-based Migration

The basis upon which a migration takes place in the CloudSim module used for this research is an over-utilized host. If a VM requires additional RAM to service its workload it must be migrated to another host where a larger VM can be configured to run the workload. However, the module does not facilitate SLA-based migration; rather, only VMs on a host which is over-utilized are migrated. This is a significant limitation in the design of CloudSim. Even in this scenario, the VMs which need additional RAM may not be the ones migrated, because the algorithm for choosing which VMs to migrate selects first the VM with the fewest RAM pages requiring migration. This will typically leave 'starved' VMs on the source host, still
requiring additional RAM. Clearly, some improvement is required in the VM selection and allocation policies deployed by the default CloudSim framework.
Chapter 6 Conclusions

In order to identify a potentially novel energy efficiency approach for virtualized DCs, a large part of the research effort in this thesis was dedicated to evaluating the current state-of-the-art. On completion of this investigative phase it was decided to focus on opportunities relating to DC management software, i.e. virtualization, and the concept of a dynamic monitoring interval was then proposed. Once CloudSim had been identified as the most accessible framework in which to build a test bed, a significant amount of time was spent reviewing the existing code to establish the capabilities (and limitations) of the framework. The dynamic simulation presented in this thesis is differentiated from the default LR/MMT CloudSim simulation in that the duration of the next interval is adjusted (with respect to the weighted average of the DC CPU utilization) rather than maintaining a static interval of 300 seconds, which is the standard monitoring interval used in commercial applications (e.g. VMware, Citrix Xen). The primary aim of this research (as outlined in the introductory chapter) was to determine the impact on power consumption of dynamically adjusting the monitoring interval. Analysis of DC metrics is performed more frequently, meaning that the DC is more responsive to changes in CPU utilization. The focus of this research was the over-utilization threshold: in calculating the average CPU utilization for the DC, shorter intervals are applied as the average utilization rate increases. Results indicated that power consumption could be reduced when the monitoring interval is adjusted with respect to the incoming workload. As indicated, future work should also examine the potential for reduced power consumption as the average CPU utilization for the DC approaches some under-utilization threshold. This would improve the CloudSim VM placement algorithm, providing a more accurate simulation of the server consolidation efforts used by industry. In addition, this research had a secondary objective: to evaluate the efficacy of CloudSim as a simulator for power-aware DCs. During the course of reviewing existing code and writing new modules, the specific issues outlined above were found to exist in the CloudSim framework. Discovery and documentation of them in this thesis will undoubtedly
prove both informative and useful for researchers undertaking CloudSim-based simulations in the future. Recent reports suggest that Microsoft's Chicago DC has approximately 140,000 servers installed [53]. With increasingly larger DCs being built to meet growing consumer demand, it is reasonable to expect that individual DC server counts will reach 250,000 - 500,000 in the foreseeable future. Rather than viewing the entire DC as a single entity from a monitoring perspective, perhaps the most viable application of dynamic monitoring interval adjustment would be to sub-divide these larger DCs into more manageable sections, calculating (and adjusting) the monitoring interval for each section separately, ensuring that the granularity of analysis caters appropriately for all possible data center sizes and configurations. Such an analysis would also make a valuable contribution to the state-of-the-art.
REFERENCES

[1] http://www.gartner.com/newsroom/id/499090 - last accessed on 19/09/2014
[2] http://www.idc.com - last accessed on 19/09/2014
[3] Koomey, J.G., "Estimating Total Power Consumption by Servers in the U.S. and the World", 2007
[4] Energy Star Program - U.S. Environmental Protection Agency, "EPA Report to Congress on Server and Data Center Energy Efficiency", EPA, Aug 2007
[5] N. Rasmussen, "Calculating Total Cooling Requirements for Data Centers," American Power Conversion, White Paper #25, 2007
[6] Sharma, R.K.; Bash, C.E.; Patel, C.D.; Friedrich, R.J.; Chase, J.S., "Balance of Power: Dynamic Thermal Management for Internet Data Centers," Internet Computing, IEEE, vol. 9, no. 1, pp. 42-49, January 2005
[7] Moore, J.; Sharma, R.; Shih, R.; Chase, J.; Patel, C.; Ranganathan, P., "Going Beyond CPUs: The Potential of Temperature-Aware Solutions for the Data Center", Hewlett Packard Labs, 2002
[8] Data Center Efficiency Task Force, "Recommendations for Measuring and Reporting Version 2 - Measuring PUE for Data Centers", 17th May 2011
[9] C. Belady, A. Rawson, J. Pfleuger, and T. Cader, "Green Grid Data Center Power Efficiency Metrics: PUE and DCiE," The Green Grid, 2008
[10] Koomey, J.G., "Growth in Data Center Electricity Use 2005 to 2010", report by Analytics Press, completed at the request of The New York Times, August 2011
[11] The Uptime Institute, "Inaugural Annual Uptime Institute Data Center Industry Survey", Uptime Institute, May 2011
[12] ASHRAE, "Datacom Equipment Power Trends and Cooling Applications", ASHRAE Inc., 2005
[13] ASHRAE, "Environmental Guidelines for Datacom Equipment - Expanding the Recommended Environmental Envelope", ASHRAE Inc., 2008
[14] ASHRAE, "Thermal Guidelines for Data Processing Environments - Expanded Data Center Classes and Usage Guidance", ASHRAE Inc., August 2011
[15] Boucher, T.D.; Auslander, D.M.; Bash, C.E.; Federspiel, C.C.; Patel, C.D., "Viability of Dynamic Cooling Control in a Data Center Environment," Thermal and Thermomechanical Phenomena in Electronic Systems, 2004. ITHERM '04. The Ninth Intersociety Conference on, pp. 593-600, Vol. 1, 1-4 June 2004
[16] Sharma, R.K.; Bash, C.E.; Patel, C.D.; Friedrich, R.J.; Chase, J.S., "Balance of Power: Dynamic Thermal Management for Internet Data Centers," Internet Computing, IEEE, vol. 9, no. 1, pp. 42-49, Jan.-Feb. 2005
[17] Shah, A.; Patel, C.; Bash, C.; Sharma, R.; Shih, R., "Impact of Rack-level Compaction on the Data Center Cooling Ensemble," Thermal and Thermomechanical Phenomena in Electronic Systems, 2008. ITHERM 2008. 11th Intersociety Conference on, pp. 1175-1182, 28-31 May 2008
[18] C. Patel, et al., "Energy Flow in the Information Technology Stack: Coefficient of Performance of the Ensemble and its Impact on the Total Cost of Ownership," Technical Report No. HPL-2006-55, Hewlett Packard Laboratories, March 2006
[19] C. Patel, et al., "Energy Flow in the Information Technology Stack: Introducing the Coefficient of Performance of the Ensemble," Proc. ASME IMECE, November 2006
[20] Ahuja, N.; Rego, C.; Ahuja, S.; Warner, M.; Docca, A., "Data Center Efficiency with Higher Ambient Temperatures and Optimized Cooling Control," Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM), 2011 27th Annual IEEE, pp. 105-109, 20-24 March 2011
[21] Berktold, M.; Tian, T., "CPU Monitoring With DTS/PECI", Intel Corporation, September 2010
[22] M. Stopar, SLA@SOI XLAB, Efficient Distribution of Virtual Machines, March 24, 2011
[23] C. Hyser, B. McKee, R. Gardner, and B. Watson, Autonomic Virtual Machine Placement in the Data Center, Technical Report HPL-2007-189, HP Laboratories, Feb. 2008
[24] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs, "Cutting the Electric Bill for Internet-Scale Systems," in Proc. ACM Conference on Data Communication (SIGCOMM'09), New York, NY, USA, 2009, pp. 123-134
[25] Bolin Hu; Zhou Lei; Yu Lei; Dong Xu; Jiandun Li, "A Time-Series Based Precopy Approach for Live Migration of Virtual Machines," Parallel and Distributed Systems (ICPADS), 2011 IEEE 17th International Conference on, pp. 947-952, 7-9 Dec. 2011
[26] Carroll, R., Balasubramaniam, S., Botvich, D. and Donnelly, W., Application of Genetic Algorithm to Maximise Clean Energy usage for Data Centers, to appear in proceedings of Bionetics 2010, Boston, December 2010
[27] Akshat Verma, Puneet Ahuja, Anindya Neogi, "pMapper: Power and Migration Cost Aware Application Placement in Virtualized Systems", Middleware 2008, pp. 243-264
[28] P. Riteau, C. Morin, T. Priol, "Shrinker: Efficient Wide-Area Live Virtual Machine Migration using Distributed Content-Based Addressing," http://hal.inria.fr/docs/00/45/47/27/PDF/RR-7198.pdf, 2010
[29] F. Travostino, P. Daspit, L. Gommans, C. Jog, C. de Laat, J. Mambretti, I. Monga, B. van Oudenaarde, S. Raghunath, and P. Wang, Seamless Live Migration of Virtual Machines over the MAN/WAN, iGrid, 2006
[30] Hai Jin, Li Deng, Song Wu, Xuanhua Shi, and Xiaodong Pan, Live virtual machine migration with adaptive memory compression, in Cluster, 2009
[31] Jonghyun Lee, Marianne Winslett, Xiaosong Ma, and Shengke Yu, Enhancing Data Migration Performance via Parallel Data Compression, in Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS), pages 47-54, April 2002
[32] M. R. Hines and K. Gopalan, "Post-copy based live virtual machine migration using adaptive pre-paging and dynamic self-ballooning," in Proceedings of the ACM/Usenix International Conference on Virtual Execution Environments (VEE'09), 2009, pp. 51-60
[33] Bose, S.K.; Brock, S.; Skeoch, R.; Rao, S., "CloudSpider: Combining Replication with Scheduling for Optimizing Live Migration of Virtual Machines across Wide Area Networks," Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM International Symposium on, pp. 13-22, 23-26 May 2011
[34] Cioara, T.; Anghel, I.; Salomie, I.; Copil, G.; Moldovan, D.; Kipp, A., "Energy Aware Dynamic Resource Consolidation Algorithm for Virtualized Service Centers Based on Reinforcement Learning," Parallel and Distributed Computing (ISPDC), 2011 10th International Symposium on, pp. 163-169, 6-8 July 2011
[35] H. Liu, H. Jin, X. Liao, L. Hu, and C. Yu, "Live migration of virtual machine based on full system trace and replay," in Proceedings of the 18th International Symposium on High Performance Distributed Computing (HPDC'09), 2009, pp. 101-110
[36] http://www.drbd.org - last accessed on 19/09/2014
[37] K. Nagin, D. Hadas, Z. Dubitzky, A. Glikson, I. Loy, B. Rochwerger, and L. Schour, "Inter-Cloud Mobility of Virtual Machines," in Proc. of 4th Int'l Conf. on Systems & Storage (SYSTOR), ACM, 2011, pp. 3:1-3:12
[38] R. Bradford, E. Kotsovinos, A. Feldmann, and H. Schiöberg, Live Wide-Area Migration of Virtual Machines including Local Persistent State, in VEE '07: Proceedings of the 3rd International Conference on Virtual Execution Environments, pages 169-179, New York, NY, USA, 2007, ACM
[39] C. Clark, K. Fraser, S. Hand, J. Hansen, E. Jul, C. Limpach, I. Pratt, A. Warfield, Live Migration of Virtual Machines, in Proceedings of the Symposium on Networked Systems Design and Implementation, 2005
[40] VMware vSphere® vMotion®: Architecture, Performance and Best Practices in VMware vSphere® 5, Performance Study, Technical White Paper, Oct 2011
[41] Voorsluys, W., Broberg, J., Venugopal, S., Buyya, R.: Cost of Virtual Machine Live Migration in Clouds: a Performance Evaluation. In: Proceedings of the 1st International Conference on Cloud Computing. Vol. 2009. Springer (2009)
[42] Buyya, R., Ranjan, R., Calheiros, R. N.: Modeling and Simulation of Scalable Cloud Computing Environments and the CloudSim Toolkit: Challenges and Opportunities. In: High
Performance Computing & Simulation, 2009. HPCS'09. International Conference on, pp. 1-11. IEEE (2009)
[43] Xu, Y., Sekiya, Y.: Scheme of Resource Optimization using VM Migration for Federated Cloud. In: Proceedings of the Asia-Pacific Advanced Network, vol. 32, pp. 36-44 (2011)
[44] Takeda, S., Takemura, T.: A Rank-Based VM Consolidation Method for Power Saving in Data Centers. IPSJ Online Transactions, vol. 3, pp. 88-96. J-STAGE (2010)
[45] Xu, L., Chen, W., Wang, Z., Yang, S.: Smart-DRS: A Strategy of Dynamic Resource Scheduling in Cloud Data Center. In: Cluster Computing Workshops (CLUSTER WORKSHOPS), IEEE International Conference on, pp. 120-127. IEEE (2012)
[46] Gmach, D., Rolia, J., Cherkasova, L., Kemper, A.: Workload Analysis and Demand Prediction of Enterprise Data Center Applications. In: Workload Characterization, 2007. IISWC 2007. IEEE 10th International Symposium on, pp. 171-180. IEEE (2007)
[47] VMware, http://pubs.vmware.com/vsphere-4-esx-vcenter/index.jsp?topic=/com.vmware.vsphere.bsa.doc_40/vc_perfcharts_help/c_perfcharts_collection_intervals.html - last accessed on 19/09/2014
[48] Chandra, A., Gong, W., et al. (2003). Dynamic Resource Allocation for Shared Data Centers Using Online Measurements. Proceedings of the Eleventh International Workshop on Quality of Service (IWQoS 2003), Berkeley, Monterey, CA, Springer, pp. 381-400
[49] M. Aron, P. Druschel, and S. Iyer, A Resource Management Framework for Predictable Quality of Service in Web Servers, 2001
[50] J. Carlstrom and R. Rom, Application-Aware Admission Control and Scheduling in Web Servers, in Proceedings of the IEEE Infocom 2002, June 2002
[51] Beloglazov, A., and Buyya, R.: Energy Efficient Resource Management in Virtualized Cloud Data Centers. In: Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, IEEE Computer Society, 2010
[52] http://huanliu.wordpress.com/2012/03/13/amazon-data-center-size - last accessed on 19/09/2014
APPENDIX A

The code which checks whether a host is over-utilized. The 'fallback' algorithm is used until the sliding window (length = 10) has been filled.

@Override
protected boolean isHostOverUtilized(PowerHost host) {
    PowerHostUtilizationHistory _host = (PowerHostUtilizationHistory) host;
    double[] utilizationHistory = _host.getUtilizationHistory();
    int length = 10; // length of the sliding window
    if (utilizationHistory.length < length) {
        return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
    }
    double[] utilizationHistoryReversed = new double[length];
    for (int i = 0; i < length; i++) {
        utilizationHistoryReversed[i] = utilizationHistory[length - i - 1];
    }
    double[] estimates = null;
    try {
        estimates = getParameterEstimates(utilizationHistoryReversed);
    } catch (IllegalArgumentException e) {
        return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
    }
    double migrationIntervals = Math.ceil(getMaximumVmMigrationTime(_host) / Constants.SCHEDULING_INTERVAL);
    double predictedUtilization = estimates[0] + estimates[1] * (length + migrationIntervals);
    predictedUtilization *= getSafetyParameter();
    addHistoryEntry(host, predictedUtilization);
    if (predictedUtilization >= 1) {
        Constants.OverUtilizedHostsThisInterval++;
    }
    return predictedUtilization >= 1;
}
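To illustrate the prediction step above, the following is a minimal, hypothetical worked example; the intercept, slope, migration span and safety parameter values are invented, and only the formula mirrors the code above:

// Hypothetical worked example of the local regression prediction performed above.
// All numeric values are invented for illustration; only the formula mirrors the code.
public class LrPredictionExample {
    public static void main(String[] args) {
        double intercept = 0.60;            // estimates[0] fitted over the reversed window
        double slope = 0.03;                // estimates[1]: utilization change per interval
        int length = 10;                    // sliding-window size
        double migrationIntervals = 2;      // ceil(max VM migration time / scheduling interval)
        double predicted = intercept + slope * (length + migrationIntervals); // 0.96
        predicted *= 1.2;                   // safety parameter used in this research (LR 1.2)
        System.out.println(predicted);      // 1.152 >= 1, so the host would be flagged over-utilized
    }
}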
APPENDIX B

The main() method of the LR/MMT algorithm.

public static void main(String[] args) throws IOException {
    boolean enableOutput = true;
    boolean outputToFile = true;
    String inputFolder = "C:\\Users\\scooby\\Desktop\\Eclipse\\CloudSim\\examples\\workload\\planetlab";
    //default workload generated from the dynamic averages
    String workload = "default";
    //dynamic workload
    if (!Constants.IsDefault) {
        workload = "dynamic"; // PlanetLab workload
    }
    if (Constants.IntervalGenerator) {
        Constants.SIMULATION_LIMIT = 86400;
    }
    String outputFolder = "C:\\Users\\scooby\\Desktop\\Eclipse\\Workspace\\output";
    String vmAllocationPolicy = "lr";  // Local Regression (LR) VM allocation policy
    String vmSelectionPolicy = "mmt";  // Minimum Migration Time (MMT) VM selection policy
    String parameter = "1.2";          // the safety parameter of the LR policy
    new PlanetLabRunner(enableOutput, outputToFile, inputFolder, outputFolder, workload,
            vmAllocationPolicy, vmSelectionPolicy, parameter);
}