Sustainability: Datacentres Part of the Problem or Part of the Solution?

ABSTRACT
Datacentres globally consume roughly 2% of all generated electricity and contribute 1.5% of
anthropogenic atmospheric CO2 pollution. While much has been written within the ICT
domain concerning the sustainability of datacentres, there is much less literature looking at
datacentre sustainability from a wider interdisciplinary perspective. By reviewing an
extensive body of literature, this paper explores the following multifaceted question: is it
possible to construct wholly sustainable datacentres; datacentres that are powered and cooled
sustainably, and that house sustainable computer hardware running sustainable software?
Along with the findings presented, this paper argues that whatever alterations are made to the
physical attributes of a datacentre or its software, any efficacious form of sustainability will
remain elusive unless uncontrolled economic growth (external to the datacentre) is put in
check. Far from suggesting that we should stop building datacentres, this paper attempts to
show that we should start using them to bring economic growth into a steadier and more
sustainable state.

To explore and attempt to answer the research question, we shall examine how energy is
resourced and distributed, how it is consumed by the datacentre as a whole, and how the
computers and associated hardware within consume that energy. In addition, several
alternative and developing technologies will be explored, along with techniques for
quantifying the efficiency and the global impact of datacentres in economic terms.
Kevin Anderson

©Kevin Anderson, 2017
Table of Contents
Acknowledgements
Abstract
Introduction
1 Sustainability
    Figure 1 The semantics of sustainable development (Lélé 1991, figure 1, p.2)
    1.2 Metrics
2 Power and Energy
    2.1 Renewables and the conventional grid
    Figure 2 Electrical Transmission Grid illustrating KCL where ign represents grid current,
        iHn = current drawn by housing, iIn = current drawn by industry, iRn = current supplied by
        renewables, iNn = current supplied by nuclear, and iFn = current supplied by fossil
        sources (Anderson 2014)
    Figure 3 Sankey diagram shows Wishfully Purchasing an Energy Crop with electrons magically
        crossing a shared grid (Anderson 2014a)
    2.3 Smart grid
3 Optimisation
    3.1 Software
        3.1.1 PowerApi
4 Computer Hardware
    4.1 How computers consume energy
    Table 3 Power Dissipation (Weste and Harris 2011, p.184)
    Figure 9 CMOS logic inverter states of operation (Anderson 2014c)
    Figure 10 Electron Micrograph of a Silicon on Insulator device (Weste and Harris 2011, p.361)
    Figure 11 Clock Gating (Weste and Harris 2011, p.186)
    4.2 Utilisation
    4.3 PowerNap
    4.4 Virtualisation
    Table 4 Compares N+1 to N+1+1 cluster configurations
    4.5 System on a Chip (SOC)
5 Datacentre Facility Hardware
    5.1 Datacentre Facility Electricity Supply
        5.1.1 General Trend
        5.1.2 Conversion Losses
        5.1.3 Conversion Losses and their impact on redundant systems
        5.1.4 Facility Level Direct Current
        5.1.5 Possible road map to greater PSU efficiency
        5.1.6 RAILS
    5.2 Cooling
        5.2.1 Current trends in cooling technology
        5.2.3 Free cooling
    5.3 Corporate Policy in the Datacentre
6 Economics
    Figure 22 Gordon Moore's plot which started it all (Moore and others 1965, p.83)
7 Conclusions
Appendices
    1 Metrics
    2 Terms and Abbreviations
References

Acknowledgements
While this dissertation has allowed me to develop both intellectually and professionally, it
has also been somewhat exhausting and painful. Yet the tireless support and patience
provided throughout by my module leader, Chris Johnson, has given me energy and helped
me overcome the pain. In addition, I would like to thank all of my other module leaders,
including:
Chris Sturley, who inspired and taught me much about taking a broader look at the
world, while suffering my endless interruptions;
Liz Stuart, who taught me much about prioritisation and personal energy
management;
My personal tutor John Forde, for his advice and endless support.
I would also like to thank Claro, especially my two aides, Richard Weeks and Elaine
Slingsby, for their cheerful and very helpful input and advice.
Of course, no man is an island, and the support of my family and loved ones has been as
essential to me as oxygen. Therefore, in addition, I would like to thank the following:
• My Mother;
• My three lovely daughters, Esther, Miriam, and Rebekka;
• My Brother, Martin, and my good friends Becky and Gavin;
• A very warm ‘thank you’ to Claudia, for her great support and her ability to mop
up my tears of frustration.
Overall this has been a wonderful and fruitful journey.

Introduction
Hardly a day passes without some mention of sustainability or efficiency. Perhaps becoming
more common still (in computing at least) is the term “cloud computing”, which seems to
encourage one to think that the computing somehow gets done by some weightless entity
that requires no feeding and no space in which to exist. Seemingly, entire organisations can
float effortlessly among the clouds, treading lightly and leaving nothing but a sweet and
gentle breeze. The more pragmatic among us, who are interested in the mechanisms which
create and float clouds, will realise that it is the datacentre industry which controls and hosts
the real earthly domain of clouds.
It is true that clouds offer savings by enabling one to quickly configure and deploy massive
resources that may be used only for a short time before rapidly evaporating, with the user
paying only for the time they are in use. In truth, however, cloud technology does not tread
lightly: it requires a great deal of feeding, and leaves behind CO2-polluted air (among other
wastes). Compounding this, there is evidence that with the coming of age of “Big-Data” and
the Internet of Things (IoT), datacentre construction is set to explode.
Yet datacentres and the cloud do present opportunities to reduce duplication of effort and
improve efficiency in the wider economy. Therefore, to better answer the research question,
we must look not just at the datacentre and its technologies, but at its resources and the
wider effects datacentres have on the global economy. First, however, we must explore what
sustainability is all about.
1 Sustainability
Two terms, green and sustainable, are often dropped into conversations, promotional
material and political manifestos. However, despite the passion that surrounds them,
neither term can have any useful substance without definition. In fact, the deeper one
explores the subject of sustainability, the more uncertain one becomes as to its identity.
Discovering the true meaning is difficult, as even the role of the scholar will affect the
definition: “1) Sustained yield of resources that derive from the exploitation of populations
and ecosystems (applied biologist's definition); 2) Sustained abundance and genotypic
diversity of individual species in ecosystems subject to human exploitation or, more generally,
intervention (ecologist's definition); and 3) Sustained economic development, without
compromising the existing resources for future generations (economist's definition)” (Gatto
1995). Hence what appears to be a simple term can easily be factorised into
subtle and sometimes contradictory semantics.
The WCED’s version is that sustainable development is development that meets the needs of
the present without compromising the ability of future generations to meet their own needs.
It contains within it two key concepts:
• the concept of 'needs', in particular, the essential needs of the world's poor, to which
overriding priority should be given; and
• the idea of limitations imposed by the state of technology and social organisation on
the environment's ability to meet present and future needs.
Lélé cautions that we must not make the mistake of using the term sustainability too literally,
for this leads us into chaos and contradiction. More meaningfully, our definition should
contain elements from development, its processes of change, growth, and objectives, and of
meeting our basic needs and objectives.

Figure 1 The semantics of sustainable development (Lélé 1991, figure 1, p.2)
MacKay dedicates his book concerning sustainable energy “to those who will not have the
benefit of two billion years’ accumulated energy reserves” (MacKay 2009). This should remind
us that the effects of our actions (or inactions) will be felt long after current generations
have left the stage.
Our actions (big or small) have intergenerational effects, in that the actions we perform now
will have positive and/or negative effects upon our future selves and descendants.
Furthermore, the planet cannot be saved, for it is not in any jeopardy, though the friendly
environment that is favourable to our kind and the other life forms upon which we depend
could well be in danger.
1.1 What sort of state is our world in, why should we care about what ICT is
doing to it?
CO2 contributed by ICT is now above 2% of the total anthropogenic greenhouse gas (GHG)
emissions in the atmosphere (Petty 2007). The total may, of course, be much higher, as GHGs
from the embodied energy contained in datacentre structures and hardware are hard to
quantify; even mining raw materials releases GHGs. Furthermore, recycling datacentre
products to recover those materials poses another challenge which, if done carelessly,
causes further pollution. Hilty categorises environmental effects in three orders:
• First order (AKA primary effects) caused by the physical existence of ICT hardware,
stemming from its production, use, and disposal phases.
• Second order (AKA secondary effects) relate to any decrease or increase in
environmental impact caused indirectly by ICT’s forces of change, e.g. changes to
transport usage, downloads replacing physical music and video media, etc.
• Third order (AKA tertiary effects) medium or long term behavioural (e.g.
consumption habits) adaptation caused by the availability of ICTs and their services.
(Hilty 2011, pp.16–19)
Lélé points out that apart from residing in the domains of the ecological and physical
sciences, sustainability is equally rooted in social issues.
One must not conclude that sustainability lives purely on the supply side of things; demand
(in both magnitude and structure) has to be matched to supply (Lélé 1991, p.610). Therefore
it is unlikely that a sustainable society is going to be a spontaneous one. Indeed, we have
most of the technological artefacts, and it is likely that we still have the resources; what is
still needed, however, is a catalyst around which a reaction of change can occur. Göhring
notes that the Information Society is motivated by developments, chiefly technical progress,
economic growth, and change in lifestyles. He goes on to cite economic competition as the
driving force behind all three developments. Yet although there are daily opportunities to
positively influence development and tempt it toward sustainable development, Göhring finds
evidence that these opportunities are missed, and points out that positive effects would be
more likely given a proper framework. Perhaps it is in such a framework, somewhere
between the political and scientific worlds, that such a catalyst is to be found (Göhring 2004,
p.281).
In short, one cannot classify ICTs as either totally good or totally bad; the truth lies
between those extremes, and it may even be possible to use modelling to find the ICTs with
good or bad qualities. However, as Hilty states, “Information and communication technologies
should be used in such a way that the material and energy flows avoided by the ICT
application are clearly greater than the material and energy flows caused by the
application.” (Hilty 2011, p.59). These flows are not obvious, and it is necessary to involve
some sort of life cycle assessment (LCA) to understand and quantify them. Despite ICTs’
dematerialisation ability (where growth can be driven by information rather than natural
resources), “The level of economic activity in western countries is still correlated to their per
capita consumption of natural resources. If we do not succeed in changing this correlation,
we will be losing the life-sustaining services of Nature” (Hilty 2011 quoting WRF - Factor Ten
Institute and Empa 2008).
Ultimately we need to bring the resource flows from and to the environment into check.
ICTs (and therefore datacentres) stand in the middle and, if we choose to, we can use them
to effect dematerialisation by creating a more information-rich economy and lowering our
resource demand, which in 2008 stood at 100 billion tonnes per year. Equally, though, we
could use our ICTs to drive those material flows ever higher by ignoring the delicate interplay
of efficiency and sufficiency.
1.2 Metrics
Aiming for efficiency in the datacentre is one thing, but achieving it is not straightforward;
an early hurdle is simply measuring it.
If any attempt to economise and/or improve efficiency is to succeed, then an effective set of
metrics is vital. Many metrics have already been defined, but a disciplined approach is also
required when applying them, taking into account the functional breakdown of energy usage
and accounting for the consumption cost of each piece of software, each OS, each server, and
each peripheral. In fact, we need to know the impact of each and every datacentre device
(physical or virtual) before we can gain any sort of meaningful insight. As yet it is still very
hard to compare two dissimilar datacentres (although boasting about PUE values seems to
have sparked competitive efficiency drives).
See appendices for further explanation of common metrics.
2 Power and Energy
2.1 Renewables and the conventional grid
Electrical energy is generated by a range of techniques, some more polluting than others.
Unfortunately, unless one is well placed and can connect directly to a private non-polluting
generator, one has to accept that the energy supplied has a mixed pedigree. Simply placing a
given amount of renewable generation on the supply grid does not equate to an equivalent
reduction in pollution. This is partly caused by the design of the supply grid, and partly due
to the intermittent nature of those renewable sources.
Assuming that the grid does not possess any significant storage and is not a smart grid, then
renewable input will be shared between all consumers (domestic and industrial)
according to Kirchhoff’s 1st law, which states that the current entering any given node in a
circuit is equal to the current leaving that node. Stated another way, for a junction with three
connections, and counting currents leaving the node as negative, the arrangement will be
$$i_1 + i_2 + i_3 = 0$$
or more formally, using sigma notation for a node with n connections:
$$\sum_{k=1}^{n} i_k = 0$$
Using this law we can build a simple model of a conventional electricity transmission grid.
Figure 2 represents three generating plants (Renewable, Nuclear, and Fossil), and two load
communities (Housing, and Industry). If we now add some arbitrary figures and current flow
arrows to the model we can begin to see the relevance of Kirchhoff’s current law (aka KCL).

Figure 2 Electrical Transmission Grid illustrating KCL where ign represents grid current, iHn = current drawn by housing, iIn =
current drawn by industry, iRn = current supplied by renewables, iNn = current supplied by nuclear, and iFn = current
supplied by fossil sources (Anderson 2014)
We will assign both the Nuclear and Fossil plants 250A of current each, and the Renewable
plant will get 100A. Since all of this current is drawn by the two load communities, the loads
must together be using all 600A, so we will arbitrarily assign Housing 100A and Industry
500A. We can now show that the balance of all the currents on the grid is as follows.
Table 1 Balance of line currents on the model grid
Formula               Sum of Line Currents      Sources & Loads Current
Ig1 = IF1             250A                      Fossil Feed Line
Ig2 = Ig1 + IN1       250 + 250 = 500A          Nuclear Feed Line
Ig3 = Ig2 + IR1       500 + 100 = 600A          Renewable Feed Line
Ig4 = Ig3 + (-II1)    600 + (-500) = 100A       Industry Feed Line
IH1 = Ig4             100A                      Housing Feed Line
IH1 = IH2             100A                      Housing Community Load
IH2 = Ig5             100A                      Housing Return Line
II2 = II1             500A                      Industry Community Load
Ig6 = Ig5 + II2       100 + 500 = 600A          Industry Return Line
IR2 = IR1             100A                      Renewable Generator Current
Ig7 = Ig6 + (-IR2)    600 + (-100) = 500A       Renewable Return Line
IN2 = IN1             250A                      Nuclear Generator Current
Ig8 = Ig7 + (-IN2)    500 + (-250) = 250A       Nuclear Return Line
IF2 = IF1             250A                      Fossil Generator Current
Ig9 = IF2             250A                      Fossil Return Line
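The feed-line half of this balance can be checked with a few lines of Python. This is only a minimal sketch of the DC analogy, using the arbitrary currents assigned above; the names SOURCES, LOADS, feed_line_currents and kcl_balanced are illustrative and do not come from the model itself.

```python
# Minimal DC sketch of the model grid. The currents are the arbitrary
# values assigned in the text; all names here are illustrative only.
SOURCES = {"fossil": 250.0, "nuclear": 250.0, "renewable": 100.0}  # amperes supplied
LOADS = {"industry": 500.0, "housing": 100.0}                      # amperes drawn


def feed_line_currents(sources, loads):
    """Walk along the feed line, adding each source and subtracting each load.

    Returns a list of (node name, grid current just after that node) pairs.
    """
    current = 0.0
    trace = []
    for name, amps in sources.items():
        current += amps  # a generator injects current into the grid
        trace.append((name, current))
    for name, amps in loads.items():
        current -= amps  # a load community draws current out of the grid
        trace.append((name, current))
    return trace


def kcl_balanced(sources, loads, tolerance=1e-9):
    """KCL for the whole grid: total current supplied equals total current drawn."""
    return abs(sum(sources.values()) - sum(loads.values())) <= tolerance


if __name__ == "__main__":
    for node, amps in feed_line_currents(SOURCES, LOADS):
        print(f"after {node:<9} node: {amps:.0f} A")
    print("KCL balanced:", kcl_balanced(SOURCES, LOADS))
```

The running total rises to 600A after the three generators and falls back to zero once both load communities have been served, which is the same bookkeeping the table performs line by line.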
There are some important values that have to be maintained in a conventional grid: the end
user’s voltage at the point of delivery and the AC frequency (50 Hz for domestic power in the
UK and many parts of the world). To achieve this, conventional rotating AC generators have to
maintain a constant speed; to do that, the energy input driving each generator shaft has to
match the energy that is being drawn from the transmission grid.
Power demand, however, is far from constant and only partially predictable. With
conventional thermally driven generators such as coal or gas fired plants the solution is fairly
straightforward: simply increase the heat input to meet increased demand, or reduce it to
match a drop in demand. If this were not done and demand increased notably, the generators
would slow, and both voltage and frequency would fall.
Normally only some of the generators are adjusted, as certain kinds of generators are not
suited to constant load changes, e.g. nuclear plants cannot easily be turned down (this is
possible when fuel loads are relatively new; however, this sort of treatment contaminates
the fuel rods (Smith 2012, p.12), and there is little financial incentive and no decrease in
greenhouse gas emissions). To accommodate this variable state, a portfolio of different
generator types supplies the transmission grid. Some of that portfolio is throttled,
contributing only partial effort, while others are throttled so far back that none of their input
energy gets to the grid; these are generators held in spinning reserve during periods of excess
capacity. Anything throttled back is deemed to be held in despatch.
Intermittency is a normal state of affairs even in conventional generating systems:
everything has reliability problems, and everything requires maintenance at some point. To a
great extent, this can be planned for, and despatch is the mechanism that mitigates any loss
of supply that would otherwise occur, overriding the problem of variable supply, which is an
unavoidable fact of life.
While renewable energy sources are very much in vogue, an unfortunate state of affairs can
sometimes arise with renewables when there is an overabundant supply not matched by
demand. In an ideal world, the sensible thing would be to simply turn off the cheapest
producer (regarding environmental impact), but this is possibly counter-intuitive.
Renewables in Europe have guaranteed access to grids, and as such are not despatched in
times of negative demand (Schaps and Ecckert 2014); instead, other (fossil fuel based)
generators are despatched in their place. One is, of course, tempted to see this as a triumph
of sustainability, but as Smith argues, despatched generators cannot be turned off
completely, just throttled back, lest their boilers lose too much heat. A cold boiler would
take hours and large amounts of energy to come back into production. Thus, while it appears
that during peaks in renewable energy production we all live clean and free, we
are in fact burning fuel and generating CO2 in our despatched plants.
Currently negative demand is not a massive problem, but in future it may become much more
of one. Between 2006 and 2008 in Western Denmark, wind power exceeded demand
for 2% of that period, yet Hedegaard and Meibom’s forecasts for 2025, when
half of the generating capacity will be wind-powered, show that there will be negative net
loads for up to 22% of the time (Hedegaard and Meibom 2012, p.319).

Kevin Anderson University of Plymouth Student Number:10346863
©Kevin Anderson, 2014/15 COMP306 Dissertation Page 11 of 57
Futurists look forward to days when all of our power comes from renewables, and many see
this as a simple choice. As a recent Greenpeace email puts it, “do we choose glorious, green
renewables; or climate-wrecking fossil fuels?” (Casson 2014). Some large commercial
enterprises seem to agree with Greenpeace (at least in publicity bulletins). Google
announced in January 2014 that they had succeeded in purchasing the entire 59-megawatt
energy crop of four (so far unbuilt) Swedish wind farms to power their Finnish datacentre
(Google Inc. 2014).
Figure 3 Sankey diagram shows Wishfully Purchasing an Energy Crop with electrons magically crossing a shared grid.
(Anderson 2014a)
Purchasing the entire energy crop of a renewable source such as wind or solar seems at first
to be a simple way of writing off one's CO2 emissions. Unfortunately (or fortunately if we
expect any electrical device to operate at all) physics does not support claims such as these.

To gain a better idea of what is actually going on we need to refer back to the diagram in
Figure 2 (and the tongue-in-cheek Sankey diagram in Figure 3). We again ignore the fact that
the grid runs on high voltage AC and uses transforming substations to provide medium and
low voltages (as used in homes), and again propose a DC analogy where the generators and
consumers all share a single voltage and system losses are ignored.
Looking again at Kirchhoff’s circuit laws, the current law (KCL) shows us that electrical
current is, in fact, a flow that disperses across all routes; it is also a non-storable good. The
voltage law (KVL) states that the directed sum of all electromotive forces around any closed
network is zero (which follows from the fundamental principle of the conservation of energy)
and has the following sigma notation:
$$\sum_{k=1}^{n} V_k = 0$$
For this to be true there must be absolute equality between supply and demand, yet there is
no rule that tells any of those ‘green’ electrons to take specific routes. Therefore all the
supply inputs are shared by the entire collective load (Glachant and Pignon 2005, p.156).
A simple arithmetic model can show that any generated renewable power must be divided
into a number of shares, each representing the amount of power provided to any single
customer, i.e. if a consumer is provided with 2% of the total grid power and renewables have
contributed 5% of that grid total, then the same 5% renewable share must be part of the
consumer’s share. Of course, no datacentre consumer is, hopefully, ever going to demand 2%
of our generating capacity, and the ability of renewables to contribute is extremely
variable. However, the simple fact of the matter is that regardless of whether one buys
none or all of a grid-connected renewable’s output, it will always be shared with all the other
consumers.
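The share arithmetic can be sketched as follows. This is a minimal illustration only: it assumes a lossless shared grid with no storage and no routing, and the 1,000 MW grid total is a hypothetical figure chosen purely to make the numbers easy to read.

```python
def renewable_share(consumer_fraction, renewable_fraction, grid_total_mw):
    """Split one consumer's draw into renewable and non-renewable parts.

    Assumes (as in the DC analogy) that every consumer receives the same
    generation mix as the grid as a whole, whatever they have contracted for.
    """
    consumer_mw = consumer_fraction * grid_total_mw   # what the consumer draws
    renewable_mw = renewable_fraction * consumer_mw   # the renewable part of that draw
    other_mw = consumer_mw - renewable_mw             # everything else in the mix
    return consumer_mw, renewable_mw, other_mw


if __name__ == "__main__":
    # 2% of the grid total drawn by the consumer, 5% of the grid total
    # supplied by renewables; the 1000 MW total is hypothetical.
    drawn, green, other = renewable_share(0.02, 0.05, 1000.0)
    print(f"consumer draws {drawn:.1f} MW: {green:.1f} MW renewable, {other:.1f} MW other")
```

Note that the function takes no argument for what the consumer has purchased: under these assumptions a purchase contract changes the accounting, not the physical mix delivered.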
An enterprise such as a datacentre could, of course, choose to bypass the grid and build its
own renewable power generators. However, to be totally self-sufficient the owners would
need to look into the relative power densities (the amount of power concentrated per unit
area) of the various technologies and compare them against their load requirements. MacKay
argues that these densities are insufficient to replace conventional generators to any
significant degree.
Table 2 Power densities (source: MacKay 2009, p.112)
Source                  Power per unit land or water area
Wind (onshore)          2 W/m²
Wind (offshore)         3 W/m²
Tidal pools             3 W/m²
Tidal stream            6 W/m²
Solar PV                5–20 W/m²
Plants                  0.5 W/m²
Highland rain-water     0.24 W/m²
Hydroelectric           11 W/m²
Geothermal              0.017 W/m²
The above figures become interesting when one considers a rack of IT equipment
which, if it contains servers, consumes power at an average rate of 8–10 kW, or for blade
servers 12–16 kW (and in some cases up to 28 kW), with all that load spread
across little more than half a square metre of datacentre floor space. Generating enough
power for the lightest of these loads using photovoltaics alone would take, in the best case,
400 m² or, in the worst case, 1,600 m² of solar cells (this, of course, assumes that the sun
never sets) (MacKay 2009, p.112; Wiersma 2013). Few datacentres have only a single rack to
power, and all need some way to cool the equipment; therefore, the above calculation would
have to be adjusted significantly to provide power to all the datacentre cooling and
ventilation plant.
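The area calculation is a single division and can be sketched as below. The 5 and 20 W/m² photovoltaic densities used here are an assumption in the sense that they are the bounds implied by the 1,600 m² and 400 m² figures quoted for the lightest (8 kW) rack; the function name is illustrative.

```python
def pv_area_m2(load_kw, density_w_per_m2):
    """Area of solar cells needed to supply a constant load.

    Like the figures in the text, this ignores night time, weather,
    storage and the additional load of cooling and ventilation plant.
    """
    return load_kw * 1000.0 / density_w_per_m2


if __name__ == "__main__":
    # Rack loads quoted in the text: lightest server rack and heaviest blade rack.
    for rack_kw in (8.0, 28.0):
        best = pv_area_m2(rack_kw, 20.0)   # assumed best-case PV density, W per square metre
        worst = pv_area_m2(rack_kw, 5.0)   # assumed worst-case PV density, W per square metre
        print(f"{rack_kw:.0f} kW rack: {best:.0f} to {worst:.0f} square metres of PV")
```

Set against the half a square metre of floor space the rack itself occupies, even the best case needs a collecting area several hundred times larger than the load.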
Taking the power density argument in a slightly different direction, perhaps it would be
better to invest capital in the construction of new nuclear power plants, which, although
having rather large construction costs,
• enjoy very low fuelling costs (suitable for the constant base loads that datacentres
represent),
• have minuscule footprints compared to renewables (high power density),
• are extremely low GHG emitters,
• are extremely safe when compared to other conventional energy sources.
(Smith 2012; Kharecha and Hansen 2013, p.4892)
Although atomic generators contribute comparatively little to the GHG problem, they are
nonetheless unpopular, and many would rather see them shut down permanently.
Unpopular they may be, but the truth is there are no ‘base-load’ generators as clean and
safe as atomic power stations. Without atomics, we shall be forced to balance intermittent
renewable power sources by keeping carbon-based thermal stations in spinning reserve (hot
standby) just in case the wind drops or the sun goes in behind a cloud. Thus getting green
power from source to destination without significant waste is a major obstacle to GHG
reduction!
2.2 Carbon credits
Carbon credits are currently popular with those keen to reduce their overall GHG impact.
Controversially, some consumers seek to offset their carbon footprint by purchasing carbon
credits. There is, unfortunately, scant if any scientific evidence for the efficacy of such
indulgences, and it is unlikely that there is any further benefit for offset consumers other
than (possibly) good PR. Google have introduced many radical approaches to datacentre
sustainability, yet even they have invested in carbon offsets, particularly those designed to
destroy landfill gases (Google Inc 2011).
A landfill site receiving 7.5 million tonnes of household waste per year can generate 50,000
m³ of methane per hour. Google gains carbon credits by paying certain landfill sites to burn
off this methane (a GHG many times more potent than CO2), emitting only H2O and a
portion of much less harmful CO2. This CH4, however, is also an effective fuel, which
according to MacKay is capable of generating useful electricity. Although Google is
seemingly offsetting their GHG to some extent this way, Smith sees offsetting as simply a
modern indulgence (Smith 2007).
If a datacentre were conveniently located close to a landfill site and used some sort of micro
power station, useful electric power could be extracted. Google already employ methane
fuel cells. However, most landfill sites are nowhere near datacentres, and simply dumping
that power on the grid will only return value to the datacentre via a feed-in tariff. Even so,
making something useful from the gas rather than simply burning it off does make sense.
If, however, a datacentre owner were to build or subscribe to a remote generator at a
landfill site, there would still be the problem of sharing those ‘green’ electrons with all
the other grid users (see Kirchhoff argument). This illustrates the need for some more
effective means to arbitrate the power grid. By giving the grid some intelligence, it becomes
possible for a smart grid to orchestrate energy inputs and loads.
2.3 Smart grid
There is a host of Smart Grid initiatives being pursued around the world (IEA 2011), the
majority of which employ signalling systems to help balance demand throughout the grid.
Perhaps instead of searching for ways to meet an ever increasing power demand, we should
be looking to economise, even ration the power that is supplied. One way to achieve this is
by the use of a Smart Grid (SG) (Tang et al. 2014). SGs are often proposed to solve issues
caused by intermittency of renewable generators connected to the grid, and to reduce
significantly the total input energy required.
In brief, by controlling when devices around the grid can draw power, energy supplies can
be conserved. A heating element (in an industrial boiler perhaps) can be powered off for a
few moments while an electric motor driving a locomotive accelerates, and powered on
again when that vehicle is up to speed, thus in effect regulating the maximum current draw
from the grid by means of orchestration (Kats and Seal 2012). Of course, a sense of priority
is key in such a system; it would not be acceptable to momentarily power down a hospital
life support system.
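The orchestration described above can be illustrated with a minimal sketch. It is not taken
from any of the cited smart grid initiatives; the class, the priority scale, the load names
and all of the kW figures are assumptions chosen purely for illustration. Loads that may be
paused are shed, least critical first, until demand fits the available capacity, while loads
marked as non-deferrable (such as life support) are never touched.

```python
from dataclasses import dataclass


@dataclass
class Load:
    name: str
    demand_kw: float
    priority: int      # hypothetical scale: lower number = more critical
    deferrable: bool   # may this load be paused for a short period?


def orchestrate(loads, capacity_kw):
    """Return (names of paused loads, remaining demand in kW).

    Only deferrable loads are paused, least critical first, until the
    total demand no longer exceeds the available capacity.
    """
    total = sum(load.demand_kw for load in loads)
    paused = []
    candidates = sorted(
        (load for load in loads if load.deferrable),
        key=lambda load: load.priority,
        reverse=True,
    )
    for load in candidates:
        if total <= capacity_kw:
            break
        paused.append(load.name)
        total -= load.demand_kw
    return paused, total


# Illustrative figures only.
loads = [
    Load("hospital life support", 50.0, priority=0, deferrable=False),
    Load("locomotive traction motor", 400.0, priority=1, deferrable=False),
    Load("industrial boiler element", 300.0, priority=5, deferrable=True),
    Load("cold store compressor", 100.0, priority=4, deferrable=True),
]
paused, remaining = orchestrate(loads, capacity_kw=600.0)
print(paused, remaining)
```

With these figures the boiler element alone is paused while the locomotive accelerates. If
the non-deferrable loads by themselves exceeded capacity, the function would simply return
a remaining demand above the limit, which is the point at which stored or reserve power
would have to be called upon.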
By giving the grid a degree of intelligence it is possible to control maximum demand,
bringing it in line with available generation capacity without resorting to expensive spinning
reserve generators. At times when demand is likely to outstrip the capability of any virtual
generating capacity, the smart grid can switch on stored power resources, perhaps
calling on parked electric vehicles to sell back some of their reserves, or by using more
heavyweight centralised stores such as pumped hydro or flow cell batteries
(MacKay 2009, pp.191 – 200).
Smart power grids seem to make perfectly logical sense, yet there is little to show that any
useful consensus has formed as to how they should operate. Before they can become a
reality we need to answer some significant questions, such as by whom and how they will be
controlled, and how security can be guaranteed.
It is evident, however, that the sheer volumes of information necessary to drive them will
test current data processing facilities, thus requiring many new data centres to be built.
New datacentres need not be as power hungry as their predecessors. In concert with a
smart grid there is the possibility to store energy in enlarged UPS batteries, so that when the
grid demands more energy the datacentre (sensitive to electricity price signals) can help
balance the grid load. To this end, Nissan, working with GreenDataNet (‘Eaton Launches
GreenDataNet’ 2014), is working on a project to evaluate the reuse of electric vehicle
batteries once their road life has ended. It is expected that these units will still be able to
store at least 70% of their original capacity and could be used as energy storage in
datacentres. Other technologies such as flywheel stores can offer a rapid response at very
low cost. More conventionally (but still polluting) the datacentre’s standby diesel generators
can also be used to top up the grid. Loss of efficiency is still a risk with smart grids; many
small generators increase the percentage of energy loss due to inefficiencies caused by
friction and incomplete combustion in heat engines.
Uncertainty regarding the development of smart energy systems, however, poses
barriers to successful adoption; chiefly, no information infrastructure exists. “A crucial
prerequisite of a smarter energy grid is an energy information infrastructure, which does not
yet exist. In comparison to the existing internet infrastructure, the energy information
infrastructure is critical and an outage might, depending on its size, be much more damaging
than would be the case for the breakdown of the internet. Hence, the energy information
infrastructure requires high safety standards and constant availability of its critical parts.
This yet missing information infrastructure is considered as a structural deficit that must be
overcome so that the implementation of SG technologies can be successful” (Muench et
al. 2014). On this note, such an information infrastructure could still be thought of as part
of the greater internet, as it is unreasonable to imagine that any net can remain
unreachable from the internet. Even a network behind a firewall is accessible from the
outside, particularly by those authorised, but also by those technically capable and
unauthorised. Therefore even with the eyes of national security monitoring the grid there
would still be the potential for abuse and/or attack, thus affecting the social dimension of
sustainability.

3 Optimisation
This section covers optimisation that can be performed inside the datacentre, although
certain aspects of this happen outside, e.g. regulating the use of IT to restrict misuse,
needless use, and non-optimal use of non-despatchable green power.
Wherever there is complexity there is often room for optimisation. ICT is one field where,
although much of the complexity is hidden (think of the World Wide Web hiding the internet,
and of how cloud computing hides the datacentre along with its costs and complexities), there
is likely to be a significant portion that is over-resourced, overly complex, and far from
optimal. Some of that potential for optimisation lies outside the datacentre, in boardrooms,
in OEMs, and in homes by way of ICT’s social consumption. This dissertation, however, is
primarily concerned with those optimisation quanta that lie inside the datacentre, that can
or should be optimised, and the actors and conditions both inside and out that affect the
datacentre’s sustainability.
3.1 Software
“Hardware dissipates energy because software tells it to” (Ferreira et al. 2013, p.30).
Consequently, the programmer is in part responsible for the energy dissipated by that
hardware.
Unlike an aeroplane pilot that knows her/his aircraft intimately and can adjust the surfaces
and engine settings to perfectly match the demands and the conditions, the programmer
has imperfect knowledge regarding the conditions in which s/he will deploy the software,
the demands that will be placed upon it, and possibly even the design and configuration of
the physical machine. Regardless, however, the modern application programmer must not
yield to the temptation of simply drawing on seemingly endless computing resources; doing
so will likely be wasteful and may even cancel out any gains made at the lower levels of the
compute stack or among hardware (Pinto et al. 2014, p.22). In support, developers should
be in a position to draw upon a wealth of proven best practices and to effectively benchmark
their software for its energy use. Noureddine et al identify three basic categories of
software energy measurement:
1. Hardware – high precision but tends to be coarse grained,
2. Power Models – may be too generic, or coarse grained, or platform based,
3. Software Measurement – Energy Application Profiling (may be most promising).
Beyond benchmarking and improving software, we may additionally want to build systems
that can dynamically adjust (and potentially quiesce) their own hardware to minimise
energy consumption, yet such a system would need to guarantee that each task has just
enough resource to meet its own SLA and/or throughput. Such a system could draw on a
database of observed task performance to calculate any given task’s resource requirement;
however, utilising such a technique would likely lead to conservative savings at best. Brown
and Reams suggest that a more effective approach would be to let the applications decide
what their own throughput should be, based on their service level requirements or
deadlines, before passing such data to the OS via an interface. The OS would then be in a
position to make more aggressive reductions in energy use (Brown and Reams 2010, p.55).
Improving energy efficiency is an optimisation problem; to this end hardware resources
need to be adjusted dynamically so that only the required resources are employed in task
completion. Further to this, any optimisation needs to take account of whether the task
needs to be completed on time, or whether it is the task’s throughput that is important to
the SLA. Either way “performance and efficiency are not mutually exclusive” (Brown and
Reams 2010, p.53); even when a task requires the maximum performance from a system,
that system will still probably have resources that can be quiesced without impacting the
task. Brown and Reams identify two performance levels that must be maintained:
1. Tasks with deadlines must be completed on time. Task deadlines that are no later than
the system’s best achievable completion time (using whatever resources it may have) are
effectively asking for ASAP completion; with such tasks the system can therefore only
optimise by deactivating resources that are not concerned with the processing of those tasks.
a. If, however, a deadline later than the system’s best achievable deadline is
stipulated, then the system has the freedom to complete the task at any time
up to the deadline. In such circumstances the system is able to reduce its energy
consumption much more aggressively and find an energy minimum for that
workload.
b. Deadlines may also be considered “hard” or “soft”; soft deadlines can be met
by best efforts, while hard deadlines may be significantly more difficult for an
energy optimiser to meet.
2. Services that must operate at a required throughput. Brown et al point out that,
for online services, “the notion of throughput” would better
characterise the performance level than a “completion deadline”. “Since services, in
their implementation, can ultimately be decomposed into individual tasks that do
complete, we expect there to be a technical analogue (although the most suitable
means of specifying its performance constraint might be different)”.
An effective system would need to be responsive to demand, i.e. it will have to possess
dynamic response capabilities so that it can manage maximum latencies within bounds
imposed by the available hardware and relative to the task/workload performance
requirements. There are, however, several ways one can interpret throughput; transactions
per second (databases) and transfer rates on an I/O channel are a couple of examples.
Brown et al suggest “Instantaneous power must never exceed power limit (P)” as a possible
constraint for a system or even an entire datacentre.
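To make the deadline and power-limit constraints concrete, the following is a minimal sketch
of how an optimiser might choose among hardware operating points. It is not Brown and Reams’
implementation: the operating points, the capacitance and the static power figure are
hypothetical, and power is estimated with the simple CMOS relation P = C·f·V² + P_static. The
sketch picks the lowest-energy point that still finishes before the deadline and stays under
the power limit P.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    freq_hz: float
    volts: float


def task_energy(point, cycles, capacitance_f, static_w):
    """Return (energy in joules, runtime in seconds) for `cycles` clock cycles."""
    runtime_s = cycles / point.freq_hz
    dynamic_w = capacitance_f * point.freq_hz * point.volts ** 2
    return (dynamic_w + static_w) * runtime_s, runtime_s


def pick_point(points, cycles, deadline_s, power_limit_w, capacitance_f, static_w):
    """Lowest-energy point meeting the deadline and the power limit, or None."""
    best = None
    for point in points:
        energy_j, runtime_s = task_energy(point, cycles, capacitance_f, static_w)
        power_w = energy_j / runtime_s
        if runtime_s > deadline_s or power_w > power_limit_w:
            continue  # violates a constraint, so not a candidate
        if best is None or energy_j < best[1]:
            best = (point, energy_j)
    return best


# Hypothetical frequency/voltage pairs and device constants.
points = [
    OperatingPoint(1e9, 0.8),
    OperatingPoint(2e9, 1.0),
    OperatingPoint(3e9, 1.2),
]
choice = pick_point(points, cycles=2e9, deadline_s=1.5, power_limit_w=10.0,
                    capacitance_f=1e-9, static_w=0.5)
print(choice)
```

With a 1.5 second deadline the slowest point is ruled out and the middle point is chosen;
relaxing the deadline to 3 seconds would allow the slowest and, with these figures, cheapest
point. The sketch ignores what the hardware does after the task completes, so it says nothing
about cases where finishing quickly and then sleeping would be cheaper overall.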
Having said all that about task deadlines and throughput, we are still left with the biggest
question: how can we best create a solution? In pursuit of an answer to this, Brown et al
identify three required aspects:

1. The system must be able to construct a power model that identifies what and how
power is consumed, including how to control it.
2. There needs to be a method of determining an application’s task/workload performance
requirements (either by explicit communication with the application or by
observation). Brown et al call this the constraints-determination and assessment
component.
3. The system needs to implement an energy optimiser, with which the system will
configure and reconfigure the hardware while operating.
Figure 4 describes graphically the above three aspects. Even after boiling the problem down
to three principal aspects, however, it is not easy to see how this can be modelled. Brown et al
suggest a division of responsibility in that the application should be responsible for assessing
its requirements, while the operating system provides an interface to which the application
can communicate those requirements. Upon receiving the requirements, the OS will be in a
position to “aggressively optimise” the hardware.

Figure 5 shows the principle modelled according to a message-bus principle. Where OS = Operating System.
Such optimisation facilities would likely need to be built into the OS kernel (possibly as a
module), along with some improvement to the computer’s power management facilities at
chip level so that devices can be ‘varied’ between ‘power states’ rapidly and at minimal cost.
The kernel and the application could be loosely coupled via a message bus to maintain a
simple and standardisable approach.
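The loose coupling suggested above might be sketched as follows. This is an in-process
illustration only, not an operating system interface and not Brown et al’s design: the class
names, the topic name "task.requirements" and the message fields are all hypothetical. The
application publishes its own constraint (a deadline or a throughput), and an OS-side
optimiser subscribes to those messages.

```python
from collections import defaultdict


class MessageBus:
    """Minimal publish/subscribe bus (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self._subscribers[topic]:
            handler(message)


class EnergyOptimiser:
    """Stand-in for the OS-side optimiser; it only records requirements."""

    def __init__(self, bus):
        self.requirements = {}
        bus.subscribe("task.requirements", self.on_requirements)

    def on_requirements(self, message):
        # A real optimiser would reconfigure hardware at this point; this
        # sketch only stores the latest constraint declared by each task.
        self.requirements[message["task"]] = message


bus = MessageBus()
optimiser = EnergyOptimiser(bus)
# The application declares its own constraint rather than the OS guessing it.
bus.publish("task.requirements", {"task": "nightly-report", "deadline_s": 3600})
bus.publish("task.requirements", {"task": "web-frontend", "min_throughput_tps": 200})
print(optimiser.requirements)
```

Because publisher and subscriber share only a topic name and a message shape, either side
could be replaced without changing the other, which is the property that makes a message bus
attractive for a standardisable interface.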
While it is important to have an effective strategy to control a task’s energy usage during
its operational lifetime, it is equally important to have an effective means of probing task
energy. Measurement is important both from an end user’s perspective and from a developer’s
perspective. Whereas the end user will be rather restricted to choosing the application, the
developer has the opportunity, if armed with suitably effective measuring facilities, to
determine the application’s eventual energy use. Choosing whether to approach a problem
in an iterative or a recursive style may have an impact on the overall energy effectiveness of
the code. Assuming the developer chooses to develop code that is to be compiled, s/he has
the opportunity to use optimisation switches at compile time (GCC used without switches
will try to reduce the cost of compilation rather than improve the code’s performance
(‘Optimize Options - Using the GNU Compiler Collection (GCC)’ 2014)) (Noureddine et al.
2012, p.5).
3.1.1 PowerAPI
PowerAPI is a modular general architecture that monitors power by mixing sensor data and
mathematical energy models, giving energy information per software process. Figure 6 gives
an overview of the structure.

Figure 6 PowerAPI Reference Architecture (Noureddine et al. 2012, p.22)
Figure 7 PowerAPI architecture showing Pub/Sub Event Bus concept (Adapted from Bourdon et al. 2013)
The central glue of PowerAPI is a common event bus, which modules use to publish or
subscribe to events. Obtaining a task’s energy consumption is a simple process
requiring a process id (123), a time interval (500 milliseconds) and the name of the target
listener (CpuListener).
PowerAPI.startMonitoring(
Process(123),
500 milliseconds,
classOf[CpuListener]
)
The user maintains complete control over which modules are involved in any measurement.
The API does, however, need to be configured to the target hardware as the ‘Formulae’
modules require local values such as the CPU’s thermal design power and an array
containing operating frequencies for each voltage.

Noureddine et al and Brown et al have both demonstrated that reducing software energy
consumption need not be a blind guessing exercise. Given that researchers have been
exploring low energy software concepts for several years now (and business owners have
always been interested in cost reduction), it may seem surprising that relatively few
programmers outside the field of portable computing have developed any appropriate low
energy programming skills (though it may be that some techniques for improving execution
speed are relevant). There also seems to be little available material that gives simple
advice comparing software languages by their relative energy consumption. Pinto et al
mined StackOverflow in order to understand how far users of the Q&A site are interested in
software energy consumption. They were able to show that developers are aware of
energy consumption problems, finding that developers were asking diverse and challenging
questions, yet answers were often vague and/or flawed. They went on to identify three
recurring problems in the answers given; “(i) misconceptions about software energy
consumption and how it can be reduced, (ii) solutions that are applicable in certain contexts
being presented as universal, and (iii) lack of tools, in particular, measurement tools” (Pinto
et al. 2014, p.28).
Perhaps part of the problem lies in a failure to educate programmers regarding the
connection between software and energy use. It may seem forgivable that a programmer
may have only a vague understanding of the underlying concepts, after all we would not
demand that a car driver has to understand the car’s engineering along with the mechanics
of motion and thermodynamics. Yet we only expect a driver to be operating a single car at
any given time, so any poor husbandry will have limited effect. In contrast, computer
software could at any given time be run millions of times, and although the energy
consumed by any single process may seem insignificant when compared to the consumption
of a car, once it has been multiplied by a few million it may greatly outweigh that of the car.
Notwithstanding, many developers, even if they were aware of the general processes in a
computer, may never know what sort of machine is going to execute their code; web
developers working in scripted environments could be forgiven to some extent.
Even the choice of language should be taken into consideration. Using a simple Tower of
Hanoi program tested in different languages using recursive methods, Noureddine et al.
(2012, p.27) were able to demonstrate that Perl and Python were around two orders of
magnitude more expensive in energy consumption than Java or C++. Although this is a single
example comparing energy consumption, it does suggest that large savings could perhaps
be made if developers were able to compile more of their code, and if such savings
were universal then the implications for web serving datacentres may be huge.
By introducing some sort of dynamic energy profiling to the IDE that can highlight energy
‘hotspots’, developers may be further encouraged to refactor their code towards lower
energy consumption.
Programmers and users may be responsible for ordering the hardware to consume energy
but that does not let the system architects off the hook. Computer hardware is far from
optimal in terms of energy efficiency, yet until recently it seems that most of the drive has
been toward improving processing performance and reducing hardware cost. In recent
times, however, the cost of the energy consumed by computing hardware has begun to
outstrip the acquisition cost.
4 Computer Hardware
4.1 How computers consume energy
“Power consumption is a function of load capacitance, frequency of operation, and supply
voltage. A reduction of any one of these is beneficial” (Sarwar 1997, p.12).
The power consumption of a CMOS computing device can be found with the
following simple formula (the voltage V being that of VDD):
$$P = C f V^{2} + P_{\mathrm{static}}$$
where P is power in watts, C is the load capacitance, f is the clock frequency, V² is the
square of the supply voltage, and P_static is the static power consumption. The two terms are:
$$P_{\mathrm{dynamic}} = C V^{2} f$$
$$P_{\mathrm{static}} = I_{\mathrm{static}} V_{DD}$$
Table 3 Power Dissipation (Weste and Harris 2011, p.184)
Sources of Power Dissipation | Cause
Dynamic Dissipation | Load capacitances charging and discharging during gate switching
Dynamic Dissipation | “Short circuit” caused by the partial ON of pMOS and nMOS pairs
Static Dissipation | Sub-threshold leakage of OFF transistors
Static Dissipation | Gate dielectric leakage
Static Dissipation | Source/drain diffusion causing junction leakage
Static Dissipation | Contention current of ratioed circuits

Figure 8 Logic Inverter using switches to show states of operation (Anderson 2014b).
We can gain an understanding of how computers use energy by studying the operation of a
simple logic inverter. Figure 8 shows three basic states of a device that uses simple switches
in place of transistors. In state 1 the input has risen to logic 1, the top switch has opened
disconnecting positive power line (VDD), while the bottom switch has closed thus pulling the
voltage at Vout to near ground potential. In state 3 the opposite has occurred with a logic 0
input, and current is charging up the output circuit which has been isolated from ground.
State 2 shows what happens for a brief period while switching between states, which
involves both switches being momentarily closed. The resulting short-circuit current in state
2 is the cause of some energy loss. Although it happens only for a short time when both
transistors are only weakly on, short-circuit current can consume up to 10% of dynamic
power (Weste and Harris 2011, p.192).

Static power is another important factor; “static power arises from sub-threshold, gate, and
junction leakage currents and contention current” (Weste and Harris 2011, p.194).
Figure 9 CMOS logic inverter states of operation (Anderson 2014c)

Since the introduction of sub-90nm processes, static power loss has become a growing
phenomenon (Weste and Harris 2011, pp.80–85,194). Figure 9 depicts a CMOS inverter
using conventional symbols. State 4 has been added to show current that passes through
transistors that are in the open (off) state. This loss, known as leakage, is an important
factor and a growing problem with each reduction in the device’s physical size. Above 90
nanometres leakage was insignificant and was only considered an issue during sleep mode,
but as the dielectrics forming the insulators in devices below that size are only a few atoms
thick, electrons often tunnel through (quantum tunnelling), making leakage a significant loss
factor.
Figure 10 Electron Micrograph of a Silicon on Insulator device (Weste and Harris 2011, p.361)
The CLoad capacitor symbolised in figures 8 and 9 is a virtual device which represents the
capacitance of the entire physical device. Generally this capacitance is caused by transistor
gate capacitance and by the chip’s internal wiring, which in scale far outweighs that of any
active device in the chip (see figure 10). Wiring capacitance is anything but insignificant. If
we managed to lower the voltage on those wires down to a few millivolts, CLoad would be less
of a problem to us; however, this is thus far out of reach as the transistors only operate
reliably with voltages an order of magnitude greater. This impedance mismatch between
semiconductors and their wiring is currently an area of intense research that if fruitful could
pave the way forward to significantly lower power consumption (Agarwal and Yablonovitch
2014).
Adjusting any value in the $P = C f V^{2} + P_{\mathrm{static}}$ formula will affect the
consumption, but as the voltage is squared, reducing the voltage may have the greatest
impact. This is the approach of dynamic voltage and frequency scaling (DVFS). DVFS has been
a good strategy for reducing energy consumption, but since the introduction of devices that
require much lower supply voltages the efficiency gain offered by DVFS has begun to erode
(Le Sueur and Heiser 2010).
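The effect of the squared voltage term can be seen with a few lines of arithmetic. The
capacitance, frequency, voltage and static power values below are hypothetical and chosen
only to give round numbers; in real devices frequency and voltage cannot be varied
independently, which this sketch deliberately ignores.

```python
def cmos_power(capacitance_f, freq_hz, volts, static_w):
    """Total power in watts: P = C * f * V^2 + P_static."""
    return capacitance_f * freq_hz * volts ** 2 + static_w


# Hypothetical device: 1 nF switched capacitance, 2 GHz, 1.0 V, 0.5 W static.
base = cmos_power(1e-9, 2e9, 1.0, 0.5)
half_freq = cmos_power(1e-9, 1e9, 1.0, 0.5)   # dynamic term halves
half_volts = cmos_power(1e-9, 2e9, 0.5, 0.5)  # dynamic term falls to a quarter

print(f"baseline      {base:.2f} W")
print(f"half f        {half_freq:.2f} W")
print(f"half V        {half_volts:.2f} W")
```

Halving the frequency halves only the dynamic term, whereas halving the voltage cuts the
dynamic term to a quarter. In both cases the static term is untouched, which is one way of
seeing why the gain from DVFS shrinks as static power grows.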
As in nature, hibernation is another common approach we can use to reduce power. Two
major approaches to this are clock gating and power gating. Clock gating removes the clock
timing signal from a section of the chip, while power gating powers down a section with the
use of switching transistors that isolate the supply voltage. Both of these approaches are
often employed, but neither is without cost. Clock gating, although reducing the dynamic
power of the chip, does nothing to reduce static power. Power gating can (if used
correctly) reduce both static and dynamic power, but it costs energy when the logic elements
have to be powered up again; therefore, if logic elements are powered off for too short a
period then the advantage is lost (Weste and Harris 2011, p.186,197).
Figure 11 Clock Gating (Weste and Harris 2011, p.186)
Figure 12 Power Gating (Weste and Harris 2011, p.197)
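The point at which power gating stops paying for itself can be estimated with a simple
break-even calculation. This is a simplification under stated assumptions: gating is taken to
remove all leakage in the gated block, the wake-up cost is treated as a single fixed energy,
and the example figures are hypothetical.

```python
def break_even_seconds(wake_energy_j, leakage_w):
    """Shortest off-period for which power gating saves energy."""
    return wake_energy_j / leakage_w


def gating_saves_energy(idle_s, wake_energy_j, leakage_w):
    # Gating avoids leakage_w * idle_s joules but pays wake_energy_j once.
    return leakage_w * idle_s > wake_energy_j


# Hypothetical block: 0.1 W of leakage, 2 microjoules to power back up.
print(break_even_seconds(2e-6, 0.1))           # about 20 microseconds
print(gating_saves_energy(5e-6, 2e-6, 0.1))    # idle period too short
print(gating_saves_energy(1e-3, 2e-6, 0.1))    # idle period long enough
```

An idle period shorter than the break-even time costs more energy to recover from than it
saves, which is the loss of advantage described above.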
4.2 Utilisation
Meisner reports that “reducing power at low-utilization is critical to increasing server
efficiency” and divides power saving techniques for servers into two categories: active low
power modes and idle low power modes. From modelling, Meisner finds that “full system
average power is approximately linear with respect to CPU utilization”:
$$P_{\mathrm{Total}} = P_{\mathrm{Dyn}} \times U_{\mathrm{Avg}} + P_{\mathrm{Idle}}$$
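A short sketch of this linear model shows why low utilisation is so costly. The 150 W
figures for dynamic range and idle power are hypothetical, chosen only to make the
arithmetic easy to follow.

```python
def server_power(p_dyn_w, p_idle_w, utilisation):
    """Linear model: P_total = P_dyn * U_avg + P_idle, with U_avg in [0, 1]."""
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    return p_dyn_w * utilisation + p_idle_w


def power_per_unit_work(p_dyn_w, p_idle_w, utilisation):
    """Watts drawn per unit of utilisation; undefined for an idle server."""
    if utilisation <= 0.0:
        raise ValueError("utilisation must be greater than 0")
    return server_power(p_dyn_w, p_idle_w, utilisation) / utilisation


# Hypothetical server: 150 W idle, a further 150 W at full load.
for u in (0.1, 0.5, 0.8):
    total = server_power(150.0, 150.0, u)
    per_unit = power_per_unit_work(150.0, 150.0, u)
    print(f"U={u:.0%}  total={total:.1f} W  per unit of work={per_unit:.1f} W")
```

Under these assumptions a server at 10% utilisation draws well over half of its full-load
power while doing a tenth of the work, because the idle term is paid regardless of load.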
4.3 PowerNap
PowerNap is an architecture proposed by Meisner that seeks to aggressively power down
hardware for brief periods when CPUs become idle (several microseconds at a time). The
approach requires hardware to be constructed with PowerNap in mind and does not suit
multiprocessing architectures well, but within a single processor system the approach is
promising (Meisner et al. 2011).
4.4 Virtualisation
Virtualisation is an often cited cure for poor utilisation. It does not, however, suit every
kind of workload (particularly workloads such as transaction processing facilities that need
access to physical hardware such as RTCs), but when it does suit, virtualisation can
consolidate a number of workloads while reducing the total number of physical servers. As
virtual servers are effectively time-sliced, the average utilisation tends to improve
somewhat.

Even so, virtualisation is not a simple cure, and even when payloads suit, the end
product may consume more than first imagined.
Consider a high availability (HA) cluster configuration with 2 physical hosts running 2
virtual machines (N+1). In this arrangement it would be foolish to allow any VM to use more
than 50% of the physical CPU or RAM, since if one physical host should fail we would then
be running both VMs on a single host; thus in normal running we lose 50% of each server’s
capacity (if one server exceeded this then failover could not occur). If one of the VMs needs
to grow in size, then both hosts will require upgrades; we then strand more capacity and
thus utilisation suffers. The larger N+1 HA clusters get, the greater the stranded capacity
across the entire cluster, which includes not only physical hardware, but also software
licences and a great deal of energy. If, however, we deliberately increase the cluster size and
adopt a strategy such as N+X+1, where X represents the number of failures we are
prepared to withstand, then we can claw back most of the stranded capacity (see the green
shaded cells in table 4, which shows an N+1+1 strategy). On small clusters N+X+1
seems extravagant, but as cluster membership grows, the relative cost falls and the savings
increase.
Physical Hosts in a Virtual Machine Cluster
Capacity Requirement: 3 VMs, 1 requiring 72GB, 2 requiring 128GB.
Physical Server Node | Installed Physical RAM (GB) | VM RAM (GB) | VM RAM (GB) | VM RAM (GB) | VM RAM if using N+1+1 (GB) | Physical CPU Cores
0 | 512 | 460.8 | 409.6 | 307.2 | 460.8 | 12
1 | 512 | 460.8 | 409.6 | 307.2 | 460.8 | 12
2 | 512 | 460.8 | 409.6 | 307.2 | 460.8 | 12
3 | 512 | 460.8 | 409.6 | 307.2 | 460.8 | 12
4 | 512 | 460.8 | 409.6 | 307.2 | 460.8 | 12
5 | 512 | 460.8 | 409.6 | 307.2 | 460.8 | 12
6 | 512 | 460.8 | 409.6 | 307.2 | 460.8 | 12
7 | 512 | 460.8 | 409.6 | 307.2 | 460.8 | 12
8 | 512 | 460.8 | 409.6 | 307.2 | 460.8 | 12
9 | 512 | 460.8 | 409.6 | 307.2 | 460.8 | 12
Totals with N+1 | 5120GB | 4608GB, to keep below 90% utilization | 4096GB, allows for a single host failure | 3072GB, allows for the ability to restart one 128GB VM | | 120
N+1+1 host for redundancy: 10 | 512 | | | | 460.8, gives 4608GB | 12, brings CPU core total to 132
Stranded Capacity | 0% | 10% | 20% | 40% | 10% |
Table 4 Compares N+1 to N+1+1 cluster configurations
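The way the relative cost of redundancy falls as a cluster grows can be shown with a
deliberately simplified calculation. The sketch assumes identical hosts and a pooled
reserve equal to the capacity of the hosts whose failure must be tolerated; it does not
attempt to reproduce the figures in table 4, which also allow for a 90% utilisation ceiling
and VM restart headroom. The function name is hypothetical.

```python
def stranded_fraction(hosts, tolerated_failures=1):
    """Fraction of cluster capacity held back to absorb host failures.

    Assumes identical hosts and a reserve pooled across the whole cluster.
    """
    if hosts <= tolerated_failures:
        raise ValueError("cluster must have more hosts than tolerated failures")
    return tolerated_failures / hosts


for hosts in (2, 4, 11):
    print(hosts, f"{stranded_fraction(hosts):.1%}")
```

For the two-host cluster described in the text, half of the capacity is held in reserve; with
eleven hosts and the same single-failure tolerance, the reserve is roughly a tenth.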
As with most solutions, virtualisation cannot solve every issue and does not fit all workloads,
the following cases being of significance:
• Where access to the physical hardware is required,
• Transaction systems, which often require real-time clock access,
• Where licence restrictions preclude VM usage,
• Where security requires physical separation, etc.
Virtualisation is not the only strategy available for improving utilisation, however. Others
include:
• Putting the computer to sleep, waking it quickly when there is work to be done,
• Improving storage throughput by reducing I/O latency; this can be achieved by
attaching DASD flash devices directly to the memory bus (Sandisk 2013),
• Moving main memory onto the CPU die; this is still somewhat a future technology,
but at some point this is likely to be a necessary move (Ruch et al.
2013).
4.5 System on a Chip (SOC)
Although seemingly impressive when compared to early computers, current hardware
technologies are on the whole extremely wasteful. Garimella et al, citing (Michel.B 2012),
state “that 99% of the overall energy consumption in an IT system goes toward
communication, with only 1% being used for computation.” They go on to state that “In a
typical air-cooled system, about 1 part per million of the volume is taken up by transistors
and 96% of the volume is used for thermal transport. As communication costs become more
important, so does this 96% waste in system volume. A liquid cooled system has smaller
system volume demands, making it an enabling technology for 3D chip architectures and
higher system transistor density.”
A limit to the connectivity that we can give an individual chip can be found according to
Rent’s Rule, which is expressed as:
$$C = k N^{p}$$
If one were to partition a chunk containing N nodes out of any network of nodes, the
number of connections that link that chunk to the rest of the network is represented by C.
The terms k and p are characteristic of the entire network, k being the average number of
connections per node, and p being the Rent exponent, which ranges between 0 and 1
(in computers this is typically in the range 0.45 to 0.75).
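A brief numerical sketch shows what the exponent implies. The values k = 4 and p = 0.5 are
hypothetical, chosen only to keep the arithmetic exact.

```python
def rent_connections(nodes, k, p):
    """External connections C = k * N**p for a block of N nodes."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("Rent exponent p must lie between 0 and 1")
    return k * nodes ** p


small = rent_connections(100, k=4, p=0.5)
large = rent_connections(10_000, k=4, p=0.5)
print(small, large, large / small)
```

With these values a block one hundred times larger needs only ten times as many external
connections, so the number of connections per node falls as blocks grow. A higher exponent
narrows that gap, and at p = 1 the connections grow in direct proportion to the node count.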
Rent’s rule, first identified at IBM in the 1960s while researching the connectivity of
integrated circuits, has also been identified in biological brains (Freiberger 2010).
Short of any way of magically making communication virtually cost free, computers will have
to become much more memory-centric: “Caches, pipelining, and multithreading improve
performance but reduce efficiency. Multitasking violates the temporal Rent rule because it
puts processes in spatial or temporal proximity that do not communicate. If the underlying
hardware follows Rent’s rule, a system optimisation segregates the tasks.” However, “the
breakdown of Rent’s rule at the chip boundary leads to a serialization of the data stream
and, accordingly, to penalties in performance and efficiency” (Ruch et al. 2011, p.9).
As each chip boundary limits the Rent coefficient by essentially partitioning the network of
connections, the only way to improve this situation is to effectively bring the network of
nodes within the bounds of the chip. A further use of the Rent coefficient, however, is to
determine the wiring cost of a given network; merely expanding the chip to encompass the
entire compute network using current IC technologies would result in a high wiring cost.
“Today, all microprocessors suffer from a break-down of Rent’s Rule for high logic block
counts because the number of interconnects does not scale beyond the chip edge” (Ruch et
al. 2013, p.162).
Concerned by these limitations, researchers are looking into ways of fabricating three-dimensional
blocks of silicon wafers (known as 3DIC packages). Apart from obvious benefits
such as being able to marry the CPU with large amounts of main storage, thus dispensing
with any need for cache RAM, the most significant improvement may come in “terms of
energy, 3-D integration can reduce energy consumption by a factor of 30 and, with improved
devices, by a factor of 100.” (Ruch et al. 2013, p.163)
3-D devices bring with them many opportunities as well as some problems; foremost is likely
to be the problem of cooling them. Devices are being studied that incorporate engraved
microchannel waterways inside 3-D chips that are constructed of stacks of silicon wafers.

Figure 13 Showing stacked integrated circuit dies and water cooling pathways (Sridhar et al. 2014, p.2579)
5 Datacentre Facility Hardware
5.1 Datacentre Facility Electricity Supply
5.1.1 General Trend
The majority of server hall computers are powered by some form of AC supply. In the EU
supply voltages are nominally 230V ± 10% single phase and 400V ± 10% three phase. Up until
the early 2000s three phase power predominated in the datacentre, but more recently, with
the growing dominance of rack mounted PC style servers, few computers or peripherals now
directly employ three phase power. Three phase power is still widely used in datacentres,
but tends either to power the cooling and ventilation plant or to be terminated at the
power distribution units (PDU), which then distribute the lower 230V power to busbars
from which individual servers and peripherals draw power.
The distribution of electrical power throughout the datacentre is not without some loss, and
there has been a fair amount of research in this area and a certain amount of controversy
regarding some of the findings, particularly when it comes to the use of bulk high voltage DC
or HVDC as it is becoming known.
Despite major changes in the technologies at the compute level of the hardware stack, the
design of datacentre electrical power systems has tended to be somewhat conservative,
though considering the value tied up in operation this is perhaps not too surprising. In more
recent times, however, with hitherto unprecedented levels of power demand, efficiency is
fast becoming the main driver of change, demanding more than conventional (largely
passive) designs can produce. On the whole, conventional datacentre supply systems
are quite effective: datacentre UPS efficiency figures of up to around 95% have been
published (Rasmussen and Spitaels 2007, p.9). It must be said, though, that these figures are
found on relatively new installations; older equipment may achieve less than half of that.
In addition, Rasmussen and Spitaels point out that efficiency should be expressed as a curve
detailing efficiencies at varying loads. For 2N redundant systems it is customary to use the
figure found at the 50% load factor of such a curve, and at 33% for 2(N+1) systems. These
load factor percentages give one a clue as to the worst case efficiency and impose
constraints on the design engineers. Ideally the efficiency curve should be as flat as possible
between the nominal operating point and the fully loaded point.
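The 50% and 33% load factors quoted above follow from simple arithmetic. The sketch below assumes that the full design load needs N units and is shared equally among all installed units; the 33% figure then corresponds to N = 2, which is an assumption made here for illustration. The function name is hypothetical.

```python
def ups_load_fraction(n_required: int, n_installed: int) -> float:
    """Fraction of its rating each UPS carries when the full design load,
    which needs n_required units, is shared equally by n_installed units."""
    if n_required <= 0 or n_installed < n_required:
        raise ValueError("need 0 < n_required <= n_installed")
    return n_required / n_installed


n = 2  # assumed number of units needed to carry the full load
print(f"2N:     {ups_load_fraction(n, 2 * n):.0%}")
print(f"2(N+1): {ups_load_fraction(n, 2 * (n + 1)):.0%}")
```

A facility that is not yet fully populated will sit below even these figures, which is why the shape of the efficiency curve at low load matters.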
When a datacentre is in the planning stage, architects have to decide how best to supply it
with power. If they install enough supply capacity for a fully provisioned datacentre, then in
the period before the facility reaches that capacity the supply will be under-loaded and will
operate far below the optimal band. An ideal situation is portrayed in figure 14, where the
curve marked B has a flat section from around 33% load up to 100%. In contrast, the curve
marked A begins a steep decline below 50% load factor. Engineers would ideally like to keep
a UPS with the A efficiency curve loaded above 50%; as such it could be used in a 2N
system, but would perform badly in a 2(N+1) arrangement, which would load it at the 33%
point.
Owners of datacentre facilities that have just come online with a large proportion
of unused capacity will find themselves paying for equipment that is producing little more
than waste heat. To avoid this scenario, power plant architects often build out from minimal
installations, increasing plant capacity as the need arises. The downside to this approach
may be that some of the equipment installed early is not fit for its new purpose in the
more demanding environment. An alternative approach is to modularise the datacentre, matching
smaller supply plants to limited subsections of the facility (such subsets are often effectively
mini datacentres, comprising compute, cooling and power, and may even be containerised).
Figure 14 Efficiency plot to load of a typical UPS (Anderson 2014d)
Regardless of the installation scale, supply plant in the conventional form will often have
several successive stages where losses occur. Consider figure 15, which shows a highly
simplified non-redundant (N redundancy) datacentre electricity supply.

1. Medium voltage AC is supplied to the building, where it is reduced to 400V AC by a
voltage transformer.
2. The 400V AC is then rectified into DC, which feeds the power backup battery.
3. The DC is then fed into an inverter, which converts it back to 400V AC.
4. The 400V AC then arrives at another transformer, which lowers the voltage to 220V
AC.
5. The 220V AC is fed into a Power Distribution Unit (PDU).
6. The PDU feeds various cabinet power distribution units, into which the computer
equipment PSUs are connected.
7. The final stage, inside the computer equipment PSUs, rectifies the AC into DC and
lowers the voltage to levels useful to the servers and peripherals (generally
employing switch mode principles).
5.1.2 Conversion Losses
The system described in Figure 15 has several stages where conversion losses occur. The
first of these is the voltage transformer receiving the input from the power utility
company, which drops the voltage to 400V AC (in Europe); a second
transformer stage later in the diagram drops the voltage still further to 220V AC.
Both of these transformers suffer (as do all transformers) from the effects of:
1. Resistance of the windings - 'copper loss'.
2. Magnetic friction in the core - 'hysteresis'.
3. Electric currents induced in the core - 'eddy currents'.
4. Physical vibration and noise of the core and windings.
5. Electromagnetic radiation.
6. Dielectric loss in materials used to insulate the core and windings.
To minimise the above losses, transformers have to be designed carefully with the
application in mind; if the load factor changes, the transformer becomes less
efficient (Clarke 2014).
Losses also occur in the DC to AC conversion step which depends on large inverters.
The final step in the conversion chain has a very significant impact on overall efficiency
as most “1U volume servers in the market have an average of 70% PSU power efficiency.
Therefore, 30% energy loss occurs when changing from AC to DC” (Kwon 2010, p.2). A 30%
loss sounds bad enough, yet it is easy to miss its significance when one looks at
the supply as a linear chain from utility – datacentre facility supply – server. The
important fact to understand is that each individual server wastes 30%,
contributing to the heat load that the building’s cooling system has to contend with.
Having to work harder, the cooling system also incurs losses, which further compound
the problem by imposing greater load on the facility supplies, which themselves produce
heat.
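Because the stages are in series, their efficiencies multiply. The following sketch shows the arithmetic. The 95% UPS and 70% PSU figures are those cited in the text; the 98% transformer efficiencies are placeholder assumptions, not measured values, and the function names are hypothetical.

```python
from math import prod


def chain_efficiency(stage_efficiencies):
    """Overall efficiency of conversion stages connected in series."""
    return prod(stage_efficiencies)


def utility_power_w(it_load_w, stage_efficiencies):
    """Power drawn from the utility to deliver it_load_w of DC to the servers."""
    return it_load_w / chain_efficiency(stage_efficiencies)


stages = [
    0.98,  # input transformer (assumed placeholder)
    0.95,  # UPS, rectifier plus inverter (upper figure cited in the text)
    0.98,  # second transformer (assumed placeholder)
    0.70,  # server PSU (figure cited in the text)
]
overall = chain_efficiency(stages)
print(f"overall efficiency: {overall:.1%}")
print(f"utility power for 1 kW of IT load: {utility_power_w(1000, stages):.0f} W")
```

Every watt lost in the chain also becomes heat inside the building, so the true cost is higher once the energy used by the cooling plant is included.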
Key 1 Terms and symbols used in Figures 15, 16, and 17
Figure 15 Traditional AC supply with UPS (Anderson 2014e), see Key 1.
5.1.3 Conversion Losses and their impact on redundant systems
In many situations server owners require their service to be available constantly; they see
any downtime as either lost revenue or a threat to operation. Traditionally uptime is
measured in ‘nines’, meaning the number of nines in the availability factor
(often stated in service level agreements or SLAs), e.g. three nines = an availability of 99.9%.
Table 5 shows a comparison of uptimes. Unfortunately, focusing on hardware for the moment,
significantly improving the reliability of the physical components of datacentre computer
systems (which are already very reliable) is a nontrivial matter. As such, moving from four
nines to five nines requires much more than replacing one server with a more costly one; in
fact, one usually has to modify the architecture to make such a reliability leap.
Often, to gain reliability, 2(N+1) facility power supplies are installed, yet although they give a
good level of redundancy, they are also relatively inefficient. If we use a facility with two
supplies as an example, each supply has to be capable of running the entire datacentre, and as
detailed above such supplies are inefficient when not fully loaded.
Increasing reliability is not a simple matter of buying better equipment, but involves duplication of
assets, increased direct power consumption and a host of embodied energies.
Service failure, on the other hand, depending on the application can result in increased
energy usage as organisations can be forced to rebuild assets (loss of data, loss of business,
and loss of dematerialised resources).
Table 5 Uptime expressed as percentage and in human readable format
Number of Nines    % Uptime    Downtime (seconds per year)    Human Form
2                  99          315,400.00                     3 days 15h 36m 40s
3                  99.9        31,540.00                      8h 45m 40s
4                  99.99       3,154.00                       52m 34s
5                  99.999      315.40                         5m 15.4s
6                  99.9999     31.54                          31.54s
7                  99.99999    3.15                           3.15s
There being 3.154 × 10^7 seconds in each non-leap year
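The figures in Table 5 can be reproduced with a few lines of arithmetic. The sketch below uses the same rounded value of 3.154 × 10^7 seconds per year as the table; the function names are illustrative.

```python
SECONDS_PER_YEAR = 3.154e7  # rounded non-leap-year value used in Table 5


def downtime_seconds(nines: int) -> float:
    """Permitted downtime per year for an availability with this many nines."""
    return SECONDS_PER_YEAR / 10 ** nines


def human_form(seconds: float) -> str:
    """Break a duration in seconds into days, hours, minutes and seconds."""
    days, rem = divmod(seconds, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, secs = divmod(rem, 60)
    return f"{int(days)}d {int(hours)}h {int(minutes)}m {secs:.2f}s"


for nines in range(2, 8):
    availability = 100 * (1 - 10 ** -nines)
    secs = downtime_seconds(nines)
    print(nines, f"{availability:.5f}%", f"{secs:,.2f}", human_form(secs))
```

Each additional nine cuts the permitted downtime by a factor of ten, which is why each step up the table is so much harder to engineer than the one before.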
With careful planning, it is possible to achieve a good level of supply redundancy and
minimise (but not eliminate) losses. One such configuration is a Tri-isolated, Triple
redundant supply (see figure 16). To effect such a configuration the three input UPS systems
each have to be capable of sustaining half of the load, yet all three are nominally online,
each supplying a third of the load. The output of these UPS systems is fed into ‘static transfer’
switches that can switch the supplied power in less than a quarter of an AC power cycle. By
studying the diagram one can see that with this arrangement each rack has multiple supply
path options. The equipment inside the racks is all ‘dual corded’, each item possessing two
power supply units (Robert J. Yester 2006). Such configurations trade off efficiency against
reliability; maximising one tends to compromise the other.
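The loading implied by this arrangement can be checked with a short calculation. In the sketch below the 300 kW load is a hypothetical figure chosen only to make the arithmetic visible; each UPS is rated for half of the load, as described above.

```python
def ups_loading(total_load_kw: float, units_online: int, unit_rating_kw: float) -> float:
    """Fraction of its rating carried by each UPS when the load is shared equally."""
    return (total_load_kw / units_online) / unit_rating_kw


load_kw = 300.0          # hypothetical facility load
rating_kw = load_kw / 2  # each UPS can sustain half of the load

normal = ups_loading(load_kw, 3, rating_kw)      # all three units online
one_failed = ups_loading(load_kw, 2, rating_kw)  # one unit lost
print(f"normal operation: {normal:.0%} of rating per UPS")
print(f"after one failure: {one_failed:.0%} of rating per UPS")
```

In normal operation each UPS therefore runs at about two thirds of its rating, noticeably higher than the 50% of a 2N design, while a single failure can still be absorbed by the remaining two units.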

Figure 16 Tri-Isolated Triple-Redundant Electricity Supply (Anderson 2014f), see Key 1.

5.1.4 Facility Level Direct Current
In recent years there has been growing interest in ways to tackle the efficiency/reliability
trade-off. With this in mind engineers are beginning to build datacentres that dispense with
AC almost entirely, using elevated DC voltage to feed the entire datacentre.
Figure 17 High Voltage DC Datacentre Supply (Anderson 2014g), see Key 1.
Since there are no phase differences to worry about, power factor correction requirements
are entirely eliminated.
“Reliability is the foremost advantage of 380 Vdc in the data centre. By eliminating the extra
conversions, the PDUs and transformers, the front end of the PSU, and, most especially, the
inverter that was on the output of the UPS in the ac design, the reliability of the power
delivery chain is improved by 1,000%”. In addition the Facility Level HVDC concept has
several potential benefits:
• Only a single transformer (assuming a single non-redundant medium voltage power
input to the datacentre), thus fewer transformer losses – copper loss, hysteresis, and
eddy currents (also vibration, EM radiation and dielectric losses).
• No need to use DC-AC conversion when running on battery power, inverting DC to
AC is also known to cause losses.
• Simpler and cheaper wiring, because “of skin effect and the reactive current
overhead, ac conductors need to be sized bigger than dc for the same voltage and
power capacity”(AlLee and Tschudi 2012, p.56).
• Reduced component count and accordingly simpler maintenance.
• Ideally DC will feed directly into computers, dispensing with the computer’s own PSU.
IBM began offering mainframes (z196 and subsequent models) ready fitted to accept
HVDC in the range of 350 to 550V DC (Andres et al. 2012, p.5).
• A further option is to place batteries in racks near computing hardware to improve
local fault tolerance.
Opponents often cite problems such as ‘high voltage switching is difficult as
there is no zero crossing point to reduce arcing’; however, this is only a problem for the
component suppliers to overcome. A shortage of skilled high voltage DC technicians is also
given as a reason to avoid the technology (though there must be plenty of expertise in
the electric railway and tramway fields, where HVDC has been in use for over a hundred
years).

5.1.5 Possible road map to greater PSU efficiency
5.1.6 RAILS
One approach for improving efficiency without resorting to converting the datacentre to
direct current is to ensure that all switch mode power supplies are operating at the load
factor they are designed for.
Figure 18 Power supply efficiency (Meisner 2012, p.88, fig.8.1)
Strategies such as ‘dual cording’, giving each computer or peripheral two (sometimes more)
switch-mode PSUs, give an effective level of redundancy, though this approach is extremely
inefficient. Using a two-PSU machine as an example, each supply has to be capable of running
the entire machine, yet switch mode power supplies are very inefficient when not fully
loaded, and as both PSUs would effectively be sharing the load their efficiency is likely to be
at or below 20%. Power wasted as a consequence of improved reliability may result in
stranded capacity, whereby locations in the datacentre have to remain unpopulated due to
shortages in supply power and/or cooling capacity.
To improve power supply utilisation the RAILS (Redundant Array for Inexpensive Load
Sharing) concept (figure 19) has been proposed (Meisner 2012, p.103). Comprising a set of
very basic switch-mode power supplies that are capable only of operating in the ‘green
zone’ while the load is at minimum (figure 18), a RAILS device tracks the load (in the study
the load was a blade centre array). As load increases, PSU devices are switched in to meet
demand; when load falls, successive PSUs are isolated from both input and output. In this
way PSUs operate only in the ‘green zone’ (see figure 18), while the overall system gains
reliability from the fact that there are a number of redundant PSUs.
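The load-tracking idea can be sketched as a simple sizing rule. This is not the control scheme from the cited study, only a minimal illustration under stated assumptions: the load is shared equally among active PSUs, and the ‘green zone’ is represented by a single hypothetical upper limit on per-PSU loading (`green_zone_max`). All names and numbers are illustrative.

```python
import math


def psus_to_switch_in(load_w: float, psu_rating_w: float, installed: int,
                      green_zone_max: float = 0.9) -> int:
    """Smallest number of PSUs that carries load_w without any PSU exceeding
    green_zone_max of its rating (equal load sharing is assumed)."""
    if load_w <= 0:
        return 0
    needed = math.ceil(load_w / (psu_rating_w * green_zone_max))
    if needed > installed:
        raise ValueError("load exceeds the capacity of the installed PSUs")
    return needed


def per_psu_load_fraction(load_w: float, psu_rating_w: float, active: int) -> float:
    """Fraction of its rating carried by each active PSU."""
    return load_w / (psu_rating_w * active)


# Hypothetical array of six 200 W supplies feeding a varying load.
for load in (100.0, 450.0, 900.0):
    active = psus_to_switch_in(load, 200.0, installed=6)
    print(load, active, f"{per_psu_load_fraction(load, 200.0, active):.0%}")
```

Because only as many PSUs are active as the load requires, each one stays well loaded, while the idle units remain available as spares.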
5.2 Cooling
5.2.1 Current trends in cooling technology
Most datacentre servers have evolved from the humble desktop PC, and as a consequence
modern server hardware still carries many artefacts perhaps more suited to the desktop
than the computer room.
While early PC servers began to expand into DC space, mainframe systems were employing
water cooling technologies to more effectively control the high temperatures of ECL TTL
(Emitter Coupled Logic – Transistor Transistor Logic) technologies (Pittler et al. 1982). The
first PC servers gained market share in the LAN file server segment, and possibly as a result
of their low cost they could be rapidly deployed using department budgets, thus avoiding IT
department ‘red tape’ and cross charges. Later as PC client/servers increased in complexity
such systems were often inherited by the IT departments to be redeployed inside the
datacentre. In the 1990s the computer halls of datacentres were populated mainly with
mainframe and/or mini (midrange) computers along with their associated peripherals
(mostly direct attached storage – DASD and tape units), but from the late 90s into the early
2000s racks of PC servers began to make up a significant portion of the DC floor space.
The changing internal landscape of the datacentre highlights an important fact: the
datacentre building architecture is designed for the medium to long term, yet the hardware
that it is designed to support can change radically within the short term. Mismatching the
building and computer hardware can cause severe energy and cooling problems, yet
over-specifying the support infrastructure is both financially wasteful and environmentally
unsustainable.
Perhaps, though, we are being unfair to the datacentre architects in expecting them to house
technology that has yet to be designed; even so, the modular datacentre concept may offer
some flexibility to overcome technological uncertainty.
Aspects of PC server design that can become problematic are power supply and cooling.
Generally PC servers are designed to be air cooled; therefore they have a contingent of air
moving devices and are built with widely separated components to maximise airflow and to
reduce drag. For air cooling to be effective, hot surfaces have to be as large as practicable
owing to the poor heat capacity of air, which is the primary coolant (Zimmermann, Tiwari, et
al. 2012); this property forces designers to set hot components apart. In addition the intake
air has to be chilled sufficiently to allow enough heat to be removed by the air stream.
Chilling and moving large volumes of air into and around a datacentre often requires large
energy inputs, and where so-called ‘free air cooling’ techniques are unsuitable this
cooling may consume more energy than does the data processing (see PUE and DCiE).
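To make the reference to PUE and DCiE concrete, the sketch below uses their standard definitions: PUE is total facility energy divided by IT equipment energy, and DCiE is its reciprocal expressed as a percentage. The energy figures are hypothetical and chosen only to show a facility in which cooling consumes more than the data processing.

```python
def pue(total_facility_kwh: float, it_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy over IT energy."""
    return total_facility_kwh / it_kwh


def dcie(total_facility_kwh: float, it_kwh: float) -> float:
    """Data Centre infrastructure Efficiency, as a percentage (100 / PUE)."""
    return 100.0 * it_kwh / total_facility_kwh


# Hypothetical annual figures in kWh.
it_energy = 1000.0
cooling_energy = 1100.0  # cooling exceeds the data processing energy
other_energy = 100.0     # distribution losses, lighting and so on
total = it_energy + cooling_energy + other_energy

print(f"PUE  = {pue(total, it_energy):.2f}")
print(f"DCiE = {dcie(total, it_energy):.1f}%")
```

Whenever cooling alone uses more energy than the IT equipment, the PUE must exceed 2 and the DCiE must fall below 50%.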
Figure 20 (Garimella et al. 2013, p.68) gives a proportional view of typical air cooled datacentre energy use.
It is possible, however, to water cool PC servers and there are a number of technologies in
the market providing such solutions (some even call for the total immersion of servers). Yet
simply adopting liquid cooling may only improve the cooling efficiency, and while fluid
coolants may be cheaper to circulate and chill than air, there may be greater benefits to be
gained by capturing the waste heat for reuse.
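The difference between air and water as coolants can be illustrated with the sensible heat relation Q = ρ · c_p · V · ΔT, where V is the volumetric flow rate. The sketch below uses approximate room-temperature property values, and the 10 kW rack load and 10 K coolant temperature rise are hypothetical.

```python
# Approximate properties near room temperature (rounded textbook values).
AIR = {"density": 1.2, "cp": 1005.0}      # kg/m^3, J/(kg K)
WATER = {"density": 998.0, "cp": 4182.0}  # kg/m^3, J/(kg K)


def volumetric_flow_m3_s(heat_w: float, density: float, cp: float,
                         delta_t_k: float) -> float:
    """Coolant flow needed to carry away heat_w with a temperature rise of delta_t_k."""
    return heat_w / (density * cp * delta_t_k)


heat_w = 10_000.0  # hypothetical rack heat load
delta_t = 10.0     # hypothetical coolant temperature rise in kelvin

air_flow = volumetric_flow_m3_s(heat_w, AIR["density"], AIR["cp"], delta_t)
water_flow = volumetric_flow_m3_s(heat_w, WATER["density"], WATER["cp"], delta_t)
print(f"air:   {air_flow:.3f} m^3/s")
print(f"water: {water_flow * 1000:.3f} litres/s")
print(f"ratio: {air_flow / water_flow:.0f}")
```

For the same heat load and temperature rise, air needs a volumetric flow several thousand times greater than water, which goes some way to explaining the energy spent on moving it.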

To facilitate energy reuse the entropy must be suitably low. Each conversion of energy
tends to increase the entropy (hot things become colder, ordered things become
disordered) and “An important first step for data center heat reuse is to reduce the number
of thermal interfaces between components and their thermal resistance in the thermal
path” (Brunschwiler et al. 2009, p.11:4). In an atmospheric cooling system the exhaust air,
although warm, contains little useful heat energy (it has high entropy) and has few
secondary uses.
Before CMOS transistors began to displace bipolar TTL (Transistor-Transistor Logic) in
computing around the turn of the 1980s, engineers were searching for ways to cool VLSI
chips. By etching micro channel grooves directly into IC substrates, Tuckerman and Pease were
able to remove 800 W/cm² while keeping devices below 85 °C (Tuckerman and Pease 1981).
Interest in such technologies, however, had waned by the end of the 1990s when CMOS
technologies, having matured, brought significantly lower power consumption and
consequently lower thermal dissipation than their TTL rivals. Without any further significant
ROI, technologies tend to fall out of use. Unfortunately, power densities have increased
markedly in the last few years, with heat fluxes exceeding 100 W/cm² (Sridhar et al. 2014,
p.2577). With such high energy densities the possibility of reusing a significant portion of the
energy suggests that ROI may now be an effective driver toward DC sustainability.
In support of this view, there are several papers investigating the use of hot water as a
computer coolant, and HPC clusters have consequently been built as proof of concept,
notably ‘Aquasar’, which features the micro-channel grooves developed by Tuckerman and
Pease. An important aspect of Aquasar is its simplicity: the entire cooling system operates
without the need for any chillers. Zimmermann et al. conclude that hot water entering a
system at 60 °C is an effective coolant: “The benefits of the liquid cooled solution were the
higher exergetic output and the possibility of a direct use of up to 80% of the recovered heat
for space heating.”
Support for this concept comes from an interesting comparison in that “a brain is hot-water
cooled, and waste heat is used to raise the body core temperature to achieve better
efficiency. The pumping power for the circulatory system, provided by the heart, is about 5%
of the total energy budget; a similar level of overhead could be envisaged for such a
bioinspired IT system”(Garimella et al. 2013, p.72).
Unfortunately, as with a lot of products, the designs of PC servers suffer a high degree of
inertia. While much within PC servers has changed (e.g. chip and bus architecture), the basic
construction pattern is conservative and has altered little. Glass reinforced plastic (GRP) printed
circuit boards carrying discrete components soldered in place are so commonplace that few
ever think there could be any other solution; yet in order for energy consumption to be
reduced to an acceptable level more adventurous design philosophies will have to be explored,
especially if technologies such as hot water cooling are to be successful. To this end
motherboard construction will need to change radically. Incidentally, GRP printed circuits
tend to be shipped to poorer countries outside of legal boundaries, where the more
precious elements are extracted by burning, which causes further pollution and harm to
those involved in the recycling process (Hilty 2011, pp.128–129).
In fact, long before PC servers were popular, mainframe computers based on water cooled
multi-chip modules (MCM) were already successfully deployed. Apart from a large reduction
in volume, MCM technology reduces the number of interconnections between logic
elements and brings a huge improvement in reliability; even when failures do occur, depending
on the MCM technology chosen, rework and repair is feasible (Blodgett and Barbour 1982,
p.34).
One may speculate that there may be a lack of trust when it comes to mixing expensive
electronics with water. Despite this, during the late 1980s and 1990s there was no technology
on the market that could offer the power densities of computers based on water cooled
MCM technology, and at the time few would have doubted the reliability of those systems.
A worry that is often heard regarding water cooling is that water and electronics don’t mix,
which may be true to some extent (in circumstances where the electronics are
unprotected), but given that systems and procedures are engineered to work reliably within
known conditions, such worries are not rational (Stephen J. Bigelow 2014).
In most situations it is better to engineer to avoid the possibility of incident than to
eliminate a risk. Considering how often sophisticated electronic systems are used in harsh
environments (marine, automotive, avionic, and industrial applications), it may seem odd to
an industry outsider that datacentre systems are so delicate. Given that electronic systems
can be and are routinely hardened against harsh environments, this situation may be a
hangover from the times when computer halls were required to be ‘clean room’
environments owing to the fact that many computer parts such as removable disc drive
packs were not hermetically sealed and even tiny amounts of fine dust could cause fatal disc
crashes and expensive data loss.
Power density is already a driving factor, yet disruptive technologies such as ‘Big data’,
the ‘Cloud’, and the ‘Internet of Things’ (IoT) are set to deliver hitherto unprecedented volumes
of data into the datacentre, and floor space is becoming premium. To tackle these
massively increased workloads the datacentre industry is moving relentlessly toward higher
computing power densities, and we will soon move beyond the cooling capacity of air.
Modern day datacentre managers therefore need to rid themselves of such worries if they
are to achieve the power densities the market is coming to demand (Sridhar et al. 2014).
5.2.2.2 Heat Recovery
The value of any heat recovered depends on there being a market for such a good; an
organisation with a local datacentre may wish to heat offices, for example, or sell the heat
into a community heating scheme.

The economic value is also dependent on the temperature of the waste heat and the
secondary reuse application. Space heating and desalination appear to offer the greatest
value; use in refrigeration and for generating electricity is also possible but less
efficient (Zimmermann, Meijer, et al. 2012, p.244; Zimmermann, Tiwari, et al. 2012, p.6399).
The market price of heat has to be assessed correctly before the value of the waste
heat can be known. To this end Zimmermann et al. propose a new metric to gauge the
economic value of waste heat (VH), expressed as;
The cost of the heat recovered is related to the value of the electricity the datacentre
consumed and the efficiency of the heat recovery, which is expressed as follows;
The final value of the waste heat is, therefore, a function of the datacentre’s operating
temperature (Zimmermann, Tiwari, et al. 2012, p.6398).
5.2.3 Free cooling
There are some seemingly simpler cooling technologies available. Both Google and
Facebook have built datacentres that are cooled without any refrigeration plant, employing
instead filtered ambient air to cool racks arranged in ‘hot/cold’ aisle containment. Facebook
have gone as far as to make their solutions open source in the OpenCompute Project.
Opencompute is rather radical in that the entire datacentre, including the compute
hardware (servers and racks), has been custom designed to reduce the amount of energy
required for air cooling. Facebook claim that this approach is very economical. However, the
overall approach is more suited to green-field than to inner city sites, and takes some
advantage of Prineville’s location and its relatively dry ambient air. Figure 21 depicts a
vertical cross section of an Opencompute datacentre which, although dispensing with
conventional items such as false floors and air ducting, has a relatively low density of
computer hardware, with computers confined in hot aisles in the bottom third of the building.

Figure 21 Vertical cross section of Facebook's Opencompute datacentre (Frachtenberg 2012, p.2)
Conventionally air cooled datacentres utilise some form of air conditioning, based either on
compressor or evaporative techniques. Although chilling air warmed by computers is an
indirect and potentially wasteful approach to cooling, the cooling engineers need only
understand the dynamics of air cooling, whereas with directly liquid cooled systems the
cooling engineers also need to be aware of computer engineering problems. Overcoming
this compartmentalised situation is going to become crucial if real energy savings are to be
made.
According to Garimella et al., datacentre development progress is slower than it could be
owing to the fact that the challenges are multidisciplinary and that some “problems are
poorly understood across boundaries which is where synergy is needed. The IT community is
diverse and scattered, consisting of end users, data center operators, chip manufacturers,
equipment integrators and others, and there is a limited understanding of the perspectives
of each. Similarly, conventionally separate electrical design for power consumption reduction
(via supply voltage reduction) and improved thermal management technologies have
provided only incremental performance extensions to current approaches”(Garimella et al.
2013, p.72).
For many years massive amounts of energy were expended keeping computer rooms within
temperature ranges dating from the days of the water-cooled mainframe, when the standards
were intended to make it comfortable for people to work in those environments. “When
air-cooled servers became prevalent, data center operators continued in the previous mode,
using ever more powerful air conditioning units to maintain a comfortable working
environment. The recent ASHRAE standards allow for a higher room temperature, which
enables more energy efficient air-based cooling. This evolution has come about because of a
closer dialog between data center operators and server designers” (Garimella et al. 2013).
5.3 Corporate Policy in the Datacentre
In some situations the political framework behind a datacentre may need to change in order
for real improvements to take place. Organisations with a short-sighted view of investment
may fail to achieve any real level of sustainability by affording only the cheapest solution
and not looking at the real total cost of ownership. Incorrectly assigning budgets can restrict
organisations to less than optimal architectures that run with poor efficiency and/or poor
reliability; after all, a broken computer system still consumes energy. The Opencompute
project is a good example of how policy can positively affect both the organisation and the
community as a whole. Facebook’s initial bold investment is feeding back into the
organisation and has resulted in wider industry collaboration as well as the creation of open
standards.
6 Economics
Datacentres do not stand alone in our environment independent of constraints; even a five
bar gate at the edge of a field has been shaped by a multitude of factors. The gate has just
enough design, structural strength and material to be effective at the lowest possible cost. If
