Flyways To De-Congest Data Center Networks

Srikanth Kandula    Jitendra Padhye    Paramvir Bahl

Abstract– A study of application demands from a production datacenter shows that, except for a few outliers, application demands can generally be met by a network that is only slightly oversubscribed. Eliminating over-subscription is hence needless overkill. In a significant departure from recent proposals that do so, we advocate a hybrid architecture. The base network is provisioned for the average case, is oversubscribed, and can be built with any of the existing network designs. To tackle the hotspots that remain, we add extra links on an on-demand basis. These links, called flyways, provide additional capacity where and when it is needed. Our results show that even a few additional flyways substantially improve performance (by over 50%), as long as they are added at the right place in the network. We consider two design alternatives for adding flyways at negligible additional cost: one that uses wireless links (60 GHz or 802.11n) and another that uses commodity switches to add capacity in a randomized manner.

[Table 1: Comparison of three data center networking architectures (Tree, FatTree, VL2): oversubscription ratio, number of aggregation and commodity top-of-rack switches, and approximate network cost, for a configuration with 1 Gbps server NICs, 10 Gbps aggregation links, and commodity switches for FatTree. Notice the number of links required for the FatTree topology.]

1. INTRODUCTION

As cloud-based services gain popularity, many businesses continue to invest in large data centers. Large datacenters provide economies of scale, large resource pools, simplified IT management and the ability to run large data mining jobs (e.g., indexing the web) [2]. One of the key challenges in building large data centers is that the cost of providing the same communication bandwidth between an arbitrary pair of servers grows in proportion to the size of the cluster [1, 6].

Production networks use a tree-like topology (see Fig. 1a) with a few tens of servers per rack, increasingly powerful links and switches as one goes up the tree, and large over-subscription factors at higher levels in the tree.¹ High oversubscription ratios put a premium on communication with non-local servers (i.e., those outside the rack), and application developers are forced to be cognizant of this limitation.

In contrast, recent research proposals [1, 6, 7] combine many more links and switches with variants of multipath routing such that the core of the network is not oversubscribed. At any point in the network, sufficient bandwidth is always available to forward all incoming traffic. In such a network, any server in the cluster can talk to any other server at full NIC bandwidth, regardless of the location of the servers in the cluster or any other ongoing traffic. Needless to say, this benefit comes with a large material cost (see Table 1) and implementation complexity (see Fig. 1b, c). Some [1] require so many wires that laying out cables becomes challenging, while others [6, 7] require updates to server and switch software and firmware in order to achieve multipath routing.

Eliminating oversubscription is a noble goal. For some workloads, such as the so-called "all-pairs shuffle",² it is even necessary. Yet, as the cost and complexity of non-oversubscribed networks is quite high, it is important to ask: how much bandwidth do typical applications really demand? The answer to this question may point towards an intermediate alternative that bridges the gap between today's production networks and the ideal, non-oversubscribed proposals.

To answer the question, we gathered application demands by measuring all network events in a production data center that supports map-reduce style data mining jobs [3]. Figure 2 shows a sample matrix of demands between every pair of top-of-rack switches. A few trends are readily apparent. First, at any time only a few top-of-rack switches are hot, i.e., send or receive a large volume of traffic (dark horizontal rows and vertical columns). Second, the matrix is quite sparse, i.e., even the hot ToRs exchange much of their data with only a few other ToRs. The implications are interesting. Figure 3 shows the completion time of a typical demand matrix in a conventional tree topology that has 1:2 over-subscription at the top-of-rack switches. The sparse nature of the demand matrix translates into skewed bottlenecks: just a few of the ToRs lag behind the rest and hold back the entire network from completion. Providing extra capacity to just these few ToRs can significantly improve overall performance.

Demand matrices exhibit these patterns because of the characteristics of the underlying applications. Specifically, the map-reduce workload that runs in the examined cluster causes, at worst, a few tens of ToRs to be simultaneously bottlenecked. We expect this observation to hold for many data center workloads, including those that host web services,³ except perhaps for rare scientific computing applications.

Based on these observations, we advocate a hybrid network. Since the demand matrix is quite sparse, the base network need only be provisioned for the average case and can be oversubscribed. Any hotspots that occur can be tackled by adding extra links between pairs of ToRs that can benefit from them. We call these links flyways. Flyways can be realized in a variety of ways, including wireless links that are set up on demand and commodity switches that interconnect random subsets of the ToR switches. We primarily investigate 60 GHz wireless technology for creating flyways. This technology can support short-range (several meters), high-bandwidth (multi-Gbps) wireless links.

We now make several observations about flyways, which we will justify in the rest of the paper. First, only a few flyways, with relatively low bandwidth, can significantly improve the performance of an oversubscribed data center network. Often, the performance of a flyway-enhanced oversubscribed network is equal to that of a non-oversubscribed network. Second, the key to achieving the most benefit is to place flyways at appropriate locations. Finally, traffic demands are predictable at short time scales, allowing flyways to keep up with changing demand. We will describe a preliminary design for a central controller that gathers demands, adapts flyways in a dynamic manner, and uses MPLS label switched paths to route traffic.

Footnotes: (1) Servers with 1 Gbps NICs per rack, a commodity Cisco switch at the top of the rack (ToR) with 10 Gbps uplinks, and larger Cisco switches at the root result in over-subscription at the ToR's uplink. (2) In an all-pairs shuffle, every server sends a large amount of data to every other server. (3) We note that the internal network is rarely the bottleneck for clusters that support external web traffic.

[Figure 1: Tree, VL2 and FatTree topologies. (a) Conventional Tree — commodity (few-port, low-cost) top-of-rack switches, aggregation switches, high-bandwidth links, and a flyway between racks; (b) FatTree [1]; (c) VL2 [6].]

[Figure 2: Matrix of application demands (normalized) between top-of-rack switches. Only a few ToRs are hot, and most of their traffic goes to a few other ToRs.]

[Figure 3: Providing some surplus capacity for just the top few ToRs can significantly speed up completion of demands (tree with 1:2 oversubscription; ToRs sorted by traffic).]
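The over-subscription arithmetic described above (aggregate NIC bandwidth of a rack versus its ToR uplink capacity) can be made concrete with a small calculation. The rack size and uplink counts below are illustrative assumptions, not measurements from this paper:

```python
# Illustrative over-subscription arithmetic for a tree topology.
# The rack size and uplink counts are assumed values for the example.
def oversubscription(servers, nic_gbps, uplinks, uplink_gbps):
    """Worst-case traffic a rack can offer vs. what its ToR uplinks carry."""
    return (servers * nic_gbps) / (uplinks * uplink_gbps)

# 40 servers with 1 Gbps NICs behind 2 x 10 Gbps uplinks: ratio 2.0, i.e., 1:2
print(oversubscription(servers=40, nic_gbps=1, uplinks=2, uplink_gbps=10))
```

With these assumed numbers, a rack can offer 40 Gbps toward the aggregation layer but its uplinks carry only 20 Gbps, giving the 1:2 ratio used as the baseline later in the paper.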
Wireless flyways, by being able to form links on demand, can distribute the available capacity to whichever ToR pairs need it. Further, the high capacity and limited interference range of 60 GHz make it an apt choice. Though less intuitive, wired flyways provide equivalent benefit. When inexpensive switches are connected to subsets of the ToRs, the limited backplane bandwidth at these switches can be divided among whichever of the many connected ToR pairs need it. Wired flyways are more restrictive: only if a ToR pair happens to be connected via one of the flyway switches will the pair benefit from a flyway. Yet, wired flyways are more likely to keep up with wired speeds as NIC and link bandwidths continue to increase. Either of these methods performs better than the alternative of spreading the same bandwidth across all racks, as much of that bandwidth will go unused on links that do not need it.

We stress that this flyway architecture is not a replacement for architectures such as VL2 and FatTree that eliminate oversubscription. Rather, our thesis is that for practical traffic patterns one can get equivalent performance from a slightly oversubscribed network (of any design) that is augmented with flyways. Further, flyways can be deployed today on top of the existing tree-like topologies of production data centers and, in many cases, flyways are also likely to be cost-effective.

2. THE CASE FOR FLYWAYS

We examine the traffic demands from a production cluster by instrumenting its servers. Together, these servers comprise a complete data mining cluster that supports replicated distributed storage (e.g., GFS [5]) as well as parallel execution of data mining jobs (e.g., MapReduce [3]).

We collected all socket-level events at each of the instrumented servers using the Event Tracing for Windows [4] framework. Over a few months, our instrumentation collected several petabytes of data. The topology of the cluster is akin to the typical tree topology (see Fig. 1a). To compute how much traffic the applications have to exchange (i.e., the demands) independent of the topology that the traffic is currently carried on, we accumulate traffic at the time scale of the applications (e.g., the duration of a job). For the map-reduce application in our data center, we accumulate over a 5 minute period since most maps and reduces finish within that time [3].

Traffic can be binned into two categories: traffic between servers in the same rack, and traffic between servers in different racks. As the backplane of the ToR switch has ample capacity to handle the intra-rack traffic, we focus only on the inter-rack traffic, which is subject to oversubscription and experiences congestion higher up the tree.

[Figure 4: Demand matrices are neither dominated by a few ToR pairs nor uniformly spread out. None of the ToR pairs contributes more than a small fraction of the total (left), and the typical ToR pair is well off the mean (right). x-axis: demand matrix at time (hours); left y-axis: fraction of total traffic.]

What do the demand matrices look like? If the matrices are uniform, i.e., every ToR pair needs to exchange the same amount of traffic, then the solution is to provide uniformly high bandwidth between every pair of ToRs. On the other hand, if only a few ToR pairs consistently contribute most of the total traffic, then the network can be engineered to provide large bandwidth only between these few pairs. We find that neither extreme happens often. Fig. 4 (left) plots the maximum entry in demand matrices over an entire day. The largest entry contributes only a few percent of the total demand on average. Fig. 4 (right) plots the average gap between a demand entry and the mean demand, which is typically a large fraction of the mean.

[Figure 5: The hot ToRs, i.e., those that either send or receive a lot of traffic, exchange most of it with just a few other ToRs (left), and there aren't too many hot ToRs (right).]

Let us now consider the ToR switches that either send or receive large amounts of traffic and examine the fraction of each ToR's traffic that is exchanged with its top few correspondents (other ToRs). Figure 5 shows that among the ToR switches that contribute a sizable share of the total traffic, i.e., the ToRs shown in Fig. 5 (left), the median ToR exchanges most of its traffic with just ten other ToRs. This result has several implications. Providing additional capacity between a hot ToR and the other ToRs that it exchanges a lot of data with would improve the completion time for those pairs. By removing the traffic of these pairs from competing with the other traffic at the hot ToR, completion times for the other correspondents improve as well. Even better, since we picked a hot ToR to begin with, speeding up completion of this ToR's demands (i.e., local improvements) will lower the completion time of the entire demand matrix (global impact). It turns out that the number of hot ToRs that need the surplus capacity is small: in a typical demand matrix, the top 10 ToRs account for a large share of the total traffic (see Fig. 5 right).

Suppose we do want to add flyways to provide extra capacity between hot ToRs and some other ToRs that they exchange traffic with. We need to answer two questions. First, which pairs should one select to get the most speedup? And second, how much capacity does each flyway need to have?

[Figure 6: Where to place flyways for the most speedup? Compares placing 1 flyway each for the top 50 ToRs, 5 flyways each for the top 10 ToRs, and 10 flyways each for the top 5 ToRs; the better choice cuts remaining traffic at the hot ToRs by about 50%.]

Placing the first flyway between the ToR that is the most congested and the ToR that it exchanges the most data with is clearly the right choice. But subsequent choices are less clear; for example, should one place the next flyway at the same ToR or elsewhere? Fig. 6 examines different ways of placing the same number of flyways. Neither spreading flyways too thinly nor concentrating them at the top few ToRs works well. For example, placing one flyway each between the top 50 ToRs and their largest correspondents does not reduce the completion time of the hot ToRs enough. Conversely, placing flyways between the top five ToRs and each of their ten largest correspondents does eliminate congestion at the top five, only for the sixth ToR to end up as the bottleneck. Achieving a proper balance between helping more ToRs and reducing enough congestion at every one of the hot ToRs obtains the most speedup. (See §3.4 for our algorithm.)

[Figure 7: How much capacity should each flyway have? x-axis: capacity of the flyway as a fraction of the total ToR bandwidth.]

How much capacity does each flyway need to have? Suppose we add flyways between the top ten ToRs and each of the five other ToRs that they exchange the most data with (i.e., the best placement option above). Fig. 7 plots how much traffic each flyway needs to support. Most flyways need only a fraction of the ToR's uplink bandwidth to be useful. The reason is that while the ToR's uplink carries traffic to all of the other ToRs, a flyway has to carry traffic to only one other ToR.

The usefulness of flyways stems directly from application characteristics that cause sparse demand matrices. In data centers that support web services, the request traffic is load balanced across servers, each of which in turn assembles a response page by perhaps asking a few other servers to generate parts of the response (e.g., advertisements). The reduce part of map-reduce in data mining jobs perhaps comes closest to being the worst case, with each reducer pulling data from all the mappers. The job is bottlenecked until all the reducers complete. Even then, it is rare to have so many mappers and reducers that all ToRs are congested simultaneously. Though flyways will provide little benefit for demands like the all-pairs shuffle, we believe that a large set of practical applications stand to gain from flyways.
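The hot-ToR analysis above can be sketched in a few lines. The demand matrix here is synthetic, and the hotspot values are arbitrary choices made only to mimic the skew seen in the measured matrices:

```python
# Sketch of the hot-ToR / sparsity analysis on a synthetic demand matrix.
# The matrix values are invented; they only mimic the measured skew.
import numpy as np

rng = np.random.default_rng(0)
n = 100                                    # number of ToR switches (assumed)
demand = rng.exponential(1.0, (n, n))      # light background traffic
np.fill_diagonal(demand, 0)                # intra-rack traffic is ignored
demand[3, [7, 19, 42]] += 500              # ToR 3 is "hot" toward three peers

# Hot ToRs: those that send or receive the most traffic.
volume = demand.sum(axis=1) + demand.sum(axis=0)
hot = np.argsort(volume)[::-1][:5]

# Sparsity: how much of a hot ToR's outgoing traffic goes to its top 5 peers?
for t in hot:
    row = np.sort(demand[t])[::-1]
    print(f"ToR {t}: {row[:5].sum() / row.sum():.2f} to its top 5 correspondents")
```

For the planted hotspot, nearly all outgoing traffic goes to a handful of peers, which is exactly the pattern that makes a few well-placed flyways effective.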
3. REALIZING FLYWAYS

We present the design of a flyway-based network. We consider both wireless and wired flyway architectures. For wireless, we explore both 60 GHz and 802.11n technologies, but 60 GHz appears better suited to our purposes. Since 60 GHz technology is relatively new, we begin with some background.

3.1 60GHz Background

Millimeter-wavelength wireless communication is an active research topic with rapidly improving technology. Here, we briefly review the properties of 60 GHz communication and explain why we believe it is a suitable technology for constructing flyways in a data center.

The 60 GHz band is a 7 GHz wide band of spectrum (57–64 GHz) that was set aside as unlicensed by the FCC in 2001. In contrast, the ISM band at 2.4 GHz, which supports the IEEE 802.11b/g/n networks, is well under 100 MHz wide; the 60 GHz band is nearly two orders of magnitude wider. The higher bandwidth facilitates higher-capacity links. For example, a simple encoding that achieves 1 bps/Hz makes possible links with a nominal bandwidth of 7 Gbps. (The 802.11b/g/n links use far more complex encodings that achieve several bits per Hz.) Most regulators allow several watts of radiated power for transmissions in this band and, per Shannon's law, higher transmission power facilitates higher-capacity links. Since this band includes the absorption frequency of the oxygen molecule, the signal strength falls off rapidly with distance. However, in the constrained environs of a datacenter, this short range is helpful; it allows for significant spatial reuse while being long enough to span tens of racks. The short wavelength of 60 GHz (5 mm) facilitates compact antennas. From Friis' law, the effective area of an antenna decreases as frequency squared. Thus, a one-square-inch antenna can provide a gain of roughly 25 dBi at 60 GHz [11]. Taken together, these characteristics allow placing one or more 60 GHz devices atop each of the racks in a datacenter to provide surplus link capacity, spatial reuse and viable range.

Numerous startups (SiBeam [10], Sayana [9]) have demonstrated prototype 60 GHz devices that sustain multi-Gbps data rates over distances of several meters with a power draw of a few watts or less. Fig. 8 shows a prototype SiBeam device. The typical usage scenario for 60 GHz networks, so far, has been to replace the wires connecting home entertainment devices, and a few industry standards (WiGig [12], WirelessHD [13]) support this usage.

Existing 60 GHz devices are usable in datacenters today. Given standard equipment racks that are 24 inches wide, a range of a few meters allows communication across several racks (see Figure 9). The small power draw and the form factor of the devices (a few cubic inches) allow easy mounting on top of racks. Some devices include electronically steerable phased-array antennas that form directional beams and can be steered within milliseconds. Further customization of the MAC and PHY layers for the data center environment (e.g., more sophisticated encodings that provide more bits/Hz, higher power, etc.) would yield greater cumulative capacity.

Needless to say, some challenges remain. First, due to the absorption characteristics, and also because 60 GHz waves are weakly diffracted [11], non-line-of-sight communication remains difficult to achieve. This is less of a problem in a data center environment, where antennas can be mounted atop the racks and out of the way of human operators. Second, the technology to build power amplifiers at these high frequencies is still in flux. Until recently, amplifiers could only be built with Gallium-Arsenide substrates (instead of silicon), causing 60 GHz radio front-ends to be expensive. Recent advances in CMOS technologies have allowed companies like SiBeam and Sayana to develop 60 GHz devices using silicon, which lowered prices and reduced power draw.

[Figure 8: 60 GHz wireless NIC (RF chip, baseband/MAC, digital). Courtesy SiBeam.]

[Figure 9: View from the top of a (partial) data center. Each box represents a rack; racks are arranged in rows of ten. The circle represents the range of a 60 GHz device mounted atop a rack in the center, which covers tens of other racks.]

3.2 Flyway links

Wireless: One can construct wireless flyways by placing one or more devices atop each rack in the datacenter. To form a flyway between a pair of ToRs, the devices atop the corresponding racks create a wireless link. The choice of technology affects the available bandwidth, the number of channels available for spatial re-use, interference patterns, and the range of the flyway. The antenna technology dictates the time needed to set up and tear down a flyway. We evaluate a few of these constraints in §4 and defer others to future work.

Wired: We suggest that wired flyways be constructed by using additional switches, of the same make as today's ToR switches, to inter-connect random subsets of the ToR switches. For example, one could use 24-port commodity Cisco switches to inter-connect ToRs with 1 Gbps links each. To keep links short, we have the flyway switches preferentially connect racks that are close by in the datacenter (see Fig. 9).

Regardless of which of the above technologies one uses for flyways, the additional cost due to flyways is a small fraction of today's network cost. From Table 1, we note that adding a few tens of flyway switches, a few hundred 1 Gbps links, or a few wireless devices per ToR increases cost only marginally.

The two classes of flyways are qualitatively different. When deploying wired flyways, one does not have to worry about spectrum allocation or interference. At the same time, their random construction constrains wired flyways; ToR pairs that exchange a lot of traffic and could benefit from surplus capacity might end up without a wired flyway.

We note, however, that either method of flyway construction is strictly better than dividing the same amount of bandwidth uniformly across all pairs of ToRs. Rather than spread bandwidth uniformly and have much of it wasted, as would happen when the demand matrix is sparse, flyways provide a way to use the spare bandwidth to target the parts of the demand matrix that can benefit the most from surplus capacity.

[Figure 10: Impact of adding flyways (0–50 flyways).]

[Figure 11: Distributing surplus capacity among all oversubscribed links (x-axis: total extra bandwidth added, in Gbps).]

[Figure 12: Impact of flyway bandwidth, from 100 Mbps to 2 Gbps, with 50 flyways.]

[Figure 13: With technology constraints: (left) wireless flyways limited to short range; (right) wired flyways (24-port switches) that can only provide capacity among the randomly chosen subsets.]

3.3 A Network with Flyways

We propose to use a central controller to gather estimates of the demands between pairs of ToRs. The information can be gathered from lightweight instrumentation at the end servers themselves or by polling SNMP counters at switches. Using these estimates, the controller periodically runs the placement algorithm (see §3.4) to place the available flyways.

The topology of a flyway-based network is dynamic and requires multipath routing. Towards this end, we leverage ideas from prior work that tackles similar problems [8]. The controller determines how much of the traffic between a pair of ToRs should go along the base network or take a flyway from the sending ToR to the receiving ToR, if one exists. The ToR switch splits traffic as per this ratio by assigning different flows onto different MPLS label switched paths. We note that only a few flyways, if any, are available at each ToR. Hence, the number of LSPs required at each switch is small, and the problem of splitting traffic across the base network and flyways that are one hop long is significantly simpler than standard multipath routing.
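As one illustration of the flow-level splitting just described, a ToR could hash each flow and pin it to either the flyway LSP or the base path in proportion to the controller's ratio. The sketch below is our own illustration of such hash-based flow pinning under that assumption; the function names and flow-tuple format are invented, not part of the paper's design:

```python
# Hash-based splitting of flows between the base path and a flyway LSP.
# Our own sketch; the names and flow-tuple format are invented.
import hashlib

def path_for_flow(flow, flyway_share):
    """Pin a flow to the flyway LSP with probability ~flyway_share.

    Hashing the flow tuple keeps every packet of a flow on one path,
    which avoids TCP reordering while matching the controller's split
    ratio in aggregate.
    """
    digest = hashlib.sha1(repr(flow).encode()).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64   # uniform in [0, 1)
    return "flyway" if point < flyway_share else "base"

flows = [("10.0.1.%d" % (i % 250), "10.0.9.1", "tcp", 5000 + i, 80)
         for i in range(1000)]
share = sum(path_for_flow(f, 0.3) == "flyway" for f in flows) / len(flows)
print(share)   # close to the requested 0.3
```

Because the split is per flow rather than per packet, a single elephant flow cannot be subdivided; the controller's ratio is only matched when a ToR pair carries many flows.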
3.4 Placing Flyways Appropriately

The problem of creating optimal flyways can be cast as an optimization problem. Given the demand D_ij between ToRs i, j and the capacity C_l of each link l, the optimal routing is the one that minimizes the maximum completion time:

    minimize   max_{i,j}  D_ij / r_ij

such that, for every ToR pair (i, j) and every node n,

    Σ_{l ∈ out(n)} r^l_ij − Σ_{l ∈ in(n)} r^l_ij  =   r_ij  at ToR i,   −r_ij  at ToR j,   0  at all other ToRs,

    Σ_{i,j} r^l_ij  ≤  C_l   for all links l,

where r_ij is the rate achieved for ToR pair i, j and r^l_ij is the portion of that pair's traffic on link l.

Computing the optimal flyway placement involves suitably changing the topology and re-solving the above optimization problem. For example, we could add all possible flyways together with the constraint that no more than a certain number can be simultaneously active, or that none of the flyways can have a capacity larger than a certain amount. Not all variants of the above optimization problem are tractable. Instead, our results are based on a procedure that adds one flyway at a time: it solves the above optimization problem and then greedily adds the flyway that reduces completion time the most. This procedure is not optimal, and improving it is future work.

4. PRELIMINARY RESULTS

We now present simulation results that demonstrate the value of flyways under different settings. The simulations are driven by the demand matrices obtained from a production datacenter, as described in §2. The servers in the production network have 1 Gbps interfaces and are divided among racks. Hence, in the simulations here, we evaluate different ways of inter-connecting the ToR switches. We route the observed demands with the constraint that the traffic in or out of a ToR cannot exceed the ToR's bandwidth. Our primary metric is the completion time of the demands (CTD), which is defined as the maximum completion time over all the flows in that demand matrix. For ease of comparison, we report normalized completion times, where

    NormalizedCTD = CTD / CTD_ideal,

and CTD_ideal is the completion time with the ideal, non-oversubscribed network. As we present results from different ways of adding flyways to a 1:2 oversubscribed tree network, note that the baseline has NormalizedCTD = 2, and obtaining NormalizedCTD = 1 implies that, with flyways, the network has routed demands as well as the ideal, non-oversubscribed network.

For the simulations in this section, we assume that wireless links are narrow-beam, half-duplex and point-to-point. We ignore antenna steering overhead. We also assume that, given the narrow beamwidth, the limited range and the wide spectrum band available at 60 GHz, the impact of interference is negligible.

4.1 Beneﬁt of using ﬂyways

Figure 10 shows the median normalized CTD (error bars mark the tail percentiles) for different numbers of flyways added to a 1:2 oversubscribed tree topology. Each flyway has a bandwidth of 1 Gbps. The simulations were run over a day's worth of demand matrices.

Without any flyways, the median completion time of the tree topology is twice that of the ideal topology. As more flyways are added, the difference between the two topologies narrows. The take-away from this figure is that with just 50 flyways, the median CTD comes close to that of an ideal topology, while the cost of establishing flyways is negligible compared to that of the ideal topologies. For many of the demand matrices, 50 flyways bring CTD on par with that of the ideal topology. Further, Figure 11 shows that distributing equivalent additional capacity uniformly among all the oversubscribed links achieves little speedup. This simulation validates the key thesis behind flyways: adding low-bandwidth links between ToRs that are congested improves the performance of oversubscribed network topologies.
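The greedy flyway-placement procedure of §3.4, together with the completion-time metric, can be illustrated with a toy model. We make a deliberately crude assumption of our own here: each ToR pair's completion time is simply its demand divided by the capacity provisioned for it, and a flyway adds a unit of capacity to one pair. The real procedure instead re-solves the routing optimization at every step:

```python
# Toy greedy flyway placement. Simplifying assumption (ours, for
# illustration): a pair's completion time is demand / capacity, where
# capacity = base + flyway_cap * (flyways given to that pair).
def ctd(demand, flyways, base=1.0, flyway_cap=1.0):
    """Completion time of a demand matrix: the worst pair's time."""
    return max(d / (base + flyway_cap * flyways.get(pair, 0))
               for pair, d in demand.items())

def greedy_place(demand, budget, base=1.0, flyway_cap=1.0):
    """Add one flyway at a time, each time picking the pair whose extra
    flyway reduces the overall completion time the most."""
    placed = {}
    for _ in range(budget):
        best = min(demand, key=lambda pair: ctd(
            demand, {**placed, pair: placed.get(pair, 0) + 1}, base, flyway_cap))
        placed[best] = placed.get(best, 0) + 1
    return placed

demand = {("T1", "T2"): 8.0, ("T1", "T3"): 4.0, ("T4", "T5"): 2.0}
placed = greedy_place(demand, budget=3)
print(placed)               # flyways concentrate on the hottest pairs
print(ctd(demand, placed))  # down from 8.0 with no flyways
```

On this toy input, the procedure relieves the hottest pair twice before helping the second pair, mirroring the balance, discussed in §2, between helping more ToRs and sufficiently decongesting each hot one.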
4.2 How much bandwidth?

How much bandwidth do we need for each flyway? To answer this question, we repeat the above simulations with flyway capacities set to 100 Mbps (802.11g with channel bonding), 600 Mbps (the best nominal bandwidth offered by 802.11n), and 2 Gbps. Figure 12 shows the median and tail percentiles of completion times from adding 50 flyways. The graph indicates that while it may be possible to use 600 Mbps links to create flyways, the performance of 100 Mbps flyways would be quite poor. Further, 2 Gbps flyways provide little marginal benefit over 1 Gbps flyways.

4.3 Constraints due to Technology

So far, we have ignored the constraints of the technology. Wireless flyways are constrained by range, and wired flyways, which are constructed by inter-connecting random subsets of the ToR switches, can only provide surplus capacity within these random subsets. Fig. 13 repeats the above simulations with 1 Gbps flyways and these practical constraints. We assume that 60 GHz flyways span a distance of a few meters and use the (to-scale) datacenter layout (Fig. 9) from a production data center. For wired flyways, we use 24-port, 1 Gbps switches. We see that both constraints lower the benefit of flyways, but the gains are still significant.

Note that many more wired flyways need to be added to obtain the same benefit accrued from wireless flyways. For example, with fifty 24-port switches, we add 50 × 24 = 1200 duplex links to the network. Though switches of higher port density can achieve equivalent performance with fewer links, wireless flyways do so with a small number of half-duplex links. This is because the targeted addition of wireless flyways speeds up exactly those pairs of ToRs that need additional capacity, whereas wired flyways are added at random and benefit only those ToR pairs that happen to be connected via a flyway switch.

5. DISCUSSION

Our results are meant primarily to demonstrate the viability of the flyway concept. While we considered a few practical limits on building flyways, many others remain. We list a few of those issues here. First, the number of flyways that each ToR can participate in is limited by the number of wireless NICs available at the ToR. In our simulations, we find that we need a handful of wireless links at the busiest ToRs, some of which are between the same ToR pair. Second, we assume that all flyways have the same capacity. In practice, the capacity of a flyway is determined by several environmental factors, such as interference from other flyways, the antenna gain, and the distance between the two wireless NICs, but also by the amount of spectrum dedicated to the flyway. Being able to vary the capacity of a flyway can use the available spectrum more efficiently with fewer interfaces per ToR. Third, we assume that there is no interference between flyways. While we believe that this is a reasonable assumption for 60 GHz links, we plan to relax it in the future by making the flyway placement algorithm aware of interference patterns.

6. CONCLUSION

Prior research has addressed how to scale data center networks, but to the best of our knowledge none has studied application demands. Our data shows that a map-reduce style data mining workload results in sparse demand matrices. At any time, only a few ToR switches are bottlenecked, and these ToRs exchange most of their data with only a few other ToRs. This leads us to the concept of flyways. By providing additional capacity when and where congestion happens, flyways improve performance at negligible additional cost. We show that wireless links, especially those in the 60 GHz band, are an apt choice for implementing flyways. We expect that, pending a revolution in the types of applications that run within datacenters, the sparse nature of inter-rack demand matrices will persist. Hence, the flyways concept should remain useful. We have listed some practical and theoretical problems that need to be solved to make flyway-based networks a reality.

Acknowledgments

We would like to thank Albert Greenberg, Dave Maltz and Parveen Patel for feedback on early versions of this paper.

References

[1] M. Al-Fares, A. Loukissas, and A. Vahdat. A Scalable Commodity Data Center Network Architecture. In SIGCOMM, 2008.
[2] L. A. Barroso and U. Holzle. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines. Morgan & Claypool, 2009.
[3] J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In OSDI, 2004.
[4] Event Tracing for Windows. http://msdn.microsoft.com/en-us/library/ms….aspx.
[5] S. Ghemawat, H. Gobioff, and S.-T. Leung. The Google File System. In SOSP, 2003.
[6] A. Greenberg, J. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. Maltz, P. Patel, and S. Sengupta. VL2: A Scalable and Flexible Data Center Network. In SIGCOMM, 2009.
[7] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, and S. Lu. BCube: High Performance, Server-centric Network Architecture for Data Centers. In SIGCOMM, 2009.
[8] S. Kandula, D. Katabi, B. Davie, and A. Charny. Walking the Tightrope: Responsive Yet Stable Traffic Engineering. In SIGCOMM, 2005.
[9] Sayana Networks.
[10] SiBeam. http://sibeam.com/whitepapers/.
[11] P. Smulders. Exploiting the 60 GHz Band for Local Wireless Multimedia Access: Prospects and Future Directions. IEEE Communications Magazine, January 2002.
[12] Wireless Gigabit Alliance. http://wirelessgigabitalliance.org/.
[13] WirelessHD. http://wirelesshd.org/.