Containing Chaos: How Networks Must Reduce Complexity to Adapt to the Demands of Next-Generation Data Centers

By Johna Till Johnson
President & Sr. Founding Partner, Nemertes Research

Executive Summary

Data centers' network requirements are changing dramatically, driven by new applications, ongoing data-center consolidation, and the increasing workload dynamism and volatility introduced by virtualization. At the same time, traditional requirements for high speed, low latency, and high reliability continue to ratchet inexorably upwards. In response to these demands, data-center architects need to consider designs that limit complexity and reduce the possibility of chaotic behavior.

The Issue

Data-center consolidation, server virtualization, and an increase in real-time, high-bandwidth applications (such as video) and performance-sensitive applications (such as Voice over IP and desktop virtualization) are driving a paradigm shift in network architecture. A few years back, servers and users shared the same campus network, had similar requirements, and therefore relied on the same networking technologies. These days, servers are increasingly virtualized and consolidated in data centers. Users, in contrast, are distributed across branches and administrative offices.

In other words, yesterday's one-size-fits-all campus LAN has bifurcated into two LANs: an access network that primarily interconnects users, and a data-center network that primarily interconnects virtualized servers and connected storage.

Servers have very different network usage characteristics than users. Typically they require orders-of-magnitude greater bandwidth coupled with very low latency. That means data-center networks are under intense pressure to scale up performance and reduce latency, and to scale out to handle increased interconnections and bandwidth.

©Nemertes Research 2011 · 888-241-2685 · DN1420
And that's not all. Virtualization introduces two new challenges: dynamism and complexity. With virtualization, it's no longer possible to predict which workloads need to communicate with which users—or, more importantly, with which other workloads—or where either end of the conversation will be located. It used to be possible to engineer a network based on the expected traffic flows, giving well-traveled paths (either between users and an application, or between applications) higher bandwidth and lower latency. Now the physical location of the virtualized workload is unknown, and in fact usually varies with time. IT staffs can launch applications anywhere in the data center, and those applications can even jump to other data centers. This any-to-any behavior doesn't work well across a traditional hierarchical data-center network.

The second challenge virtualization introduces is increased complexity—both operational and architectural. If you think of each virtualized workload as an end-node, the number of end-nodes connected through each network device goes up by at least an order of magnitude in a virtualized environment versus a physical one. And complexity increases geometrically with scale, which means a network designed to handle virtualized workloads isn't 10 times more complex than one handling physical ones, but closer to 50-100 times. An increase in complexity translates directly into an increase in management overhead (including costs)—and a decrease in reliability. Complexity also limits agility, particularly in a virtualized environment in which the goal is to build dynamic pools of compute resources. A complex network is harder to modify rapidly—meaning that the network gets in the way of rapidly provisioning resources.
The challenge facing data-center architects, therefore, lies in designing a network that scales performance while simultaneously reducing complexity.

Key Technology Trends and Business Drivers

To better understand these challenging data-center requirements, it helps to take a closer look at some of the critical technology trends and business drivers that produced them.

Data-Center Consolidation

First and foremost is data-center consolidation: over the past few years, IT organizations have increasingly consolidated from dozens of data centers down to a handful (typically three), with the goal of optimizing costs by reducing real-estate footprint. This means that the remaining data centers are housing an order of magnitude more computing and storage resources—and that networks need to scale accordingly. That's even before the added impact of server virtualization (see below).

Server Virtualization

As noted, server virtualization is also a critical trend. Nearly every organization (97%) has adopted some degree of server virtualization. (Please see Figure 1.) For organizations that have fully deployed virtualization, 78% of
workloads are virtualized. But most companies are still in the process of virtualizing: just 68% of workloads, on average, are fully virtualized, meaning this is a trend that's ongoing. And as noted, virtualization injects specific challenges into an architecture, increasing performance requirements and complexity by an order of magnitude or more.

Bandwidth Increases

As all this is going on, bandwidth requirements are increasing dramatically, driven by increases in application density and type. 10 Gbit Ethernet has become the de facto standard in data-center networks. (Please see Figure 2.) And major router manufacturers have announced 100-Gbit interfaces. The bottom line? Get ready for yet another step-function increase in data-center bandwidth.

Figure 1: Server Virtualization Adoption

Emerging Real-time Applications

Along with the structural changes in the data center, IT organizations are coping with a dramatic influx of real-time applications. These include growing use of video, both conferencing and streaming. (Approximately 74% of companies are deploying, planning to deploy, or evaluating streaming video.)
Another major application is desktop virtualization, deployed by 51% of companies in 2010 and projected to rise to 74% of companies by 2012.

These applications drive the need for both high bandwidth and extremely low latency in the data-center core, since the servers for these applications are increasingly instantiated as virtual machines located in the data center.

Figure 2: Growth in 10-G Ethernet

The Impact of Technology Trends on Network Design

Overall, the impact of these technology trends is to shift the fundamental job of the data center. With virtualization, the major challenge of data-center networking is to provide an interconnection capability across which administrators can create virtual machines and manage them dynamically.

These virtual machines have two main problematic characteristics. First, they're dynamic: they appear and disappear unpredictably as servers and applications are provisioned (increasingly, by the users themselves), and they move. Second, they increasingly generate traffic flows that are device-to-device, rather than client-to-server. "We're seeing a 20% increase in any-to-any traffic in the data center," says the CIO of a midsize university, who notes that a driving factor is the increased use of video streaming applications. Users (including but not
limited to students) often store videos on one server, then move them server-to-server to process them.

Yet another trend is the emergence of SOA, Web 2.0, and collaboration applications, which also require real-time performance and drive server-to-server traffic flows.

In other words, data-center traffic flows are changing from statically defined, top-down (client-to-server) towards dynamic server-to-server—and the data-center architecture must change along with them.

At the same time, performance and reliability requirements continue to scale up. As the sheer volume of data-center traffic increases, due to data-center consolidation and the emergence of high-bandwidth applications, capacity is also a design factor. And applications are increasingly intolerant of delay, making latency another design factor. Finally, since data centers increasingly are consolidated, failure is no longer an option, meaning that data-center networks need to get bigger, faster, and more able to handle unpredictable workloads while becoming even more reliable than previously.

The Complexity Challenge

The challenge in re-architecting the data-center core is fundamentally this: to support the design requirements of dynamic any-to-any traffic flows and high performance, while also reducing complexity. Why worry about complexity? In any large-scale system (such as a network), increasing complexity tends to do two things: increase the cost of managing the system and decrease reliability.

The catch is that to reduce complexity, one first has to understand and define it. Although there's an entire science devoted to complexity theory, there's no fixed definition of complexity, or of complex systems. A good working definition is the following: complex systems are built out of a myriad of simple components which interact, and they exhibit behavior that is not a simple consequence of pairwise interactions but rather emerges from the combination of interactions at some scale.
For networked systems, one can begin to think about complexity in terms of the number of devices or agents in the system and the potential interconnections between them. If there are N agents in a system, it takes N*(N-1)/2 interconnections to interlink these agents directly to each other, meaning that the number of interconnections scales geometrically with N.

In a data-center network, "agents" are switching and routing elements, and "connections" are the logical paths between them. Controlling complexity therefore involves minimizing the number of interactions between agents, which, as we'll see, is easier said than done.
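The N*(N-1)/2 figure is easy to verify directly. As an illustrative sketch (not part of the paper), the snippet below counts the links in a full mesh and shows that a tenfold increase in agents drives roughly a hundredfold increase in interconnections, consistent with the geometric scaling described above and with the 50-100x complexity estimate cited earlier:

```python
def full_mesh_links(n: int) -> int:
    """Links needed to connect n agents directly to every other
    agent: n choose 2, i.e. n*(n-1)/2."""
    return n * (n - 1) // 2

# The link count grows quadratically: a 10x jump in agents
# (e.g., 10 physical servers -> 100 virtual workloads) yields
# roughly a 100x jump in potential interconnections.
for n in (10, 100, 1000):
    print(n, full_mesh_links(n))

print(full_mesh_links(100) // full_mesh_links(10))  # 110, i.e. ~100x
```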
Complexity, Chaos, and Dynamism

One consequence of complexity is that it can generate chaotic behavior. Mathematically speaking, chaotic behavior is behavior that's neither predictable nor random: infinitesimally small changes in a starting state can produce arbitrarily large changes in a later state. Obviously this is undesirable in a system (or network) that's designed to consistently deliver a specific function predictably and manageably. For example, in a networked environment, a minor difference in configuration could trigger a downstream failure that's unpredictable and thus unpreventable.

These types of problems arise in virtually any complex environment (including nuclear power plants and airplanes in flight). Interestingly, chaotic behavior often arises from very simple relationships. In other words, a complex system that is constructed of simple, deterministic building blocks can nonetheless display chaotic behavior. (Surprisingly, this mathematical conception of chaos was accurately captured back in 1945 by the poet Edna St. Vincent Millay, who described it as "something simple yet not understood.")

The challenge of reducing complexity therefore becomes, in essence, the challenge of containing chaos. Some approaches to doing so can be drawn from other fields in which chaotic behavior arises; others are specific to networks.

As noted, the complexity—and propensity for chaotic behavior—increases dramatically with a system's "dynamism," the need to change quickly from state to state. As Duncan Watts, applied mathematician and principal research scientist at Yahoo! Research, puts it more eloquently, dynamic behavior dramatically changes the game when it comes to chaos and complexity:

"Next to the mysteries of dynamics on a network…the problems of networks we have encountered up to now are just pebbles on the seashore." —Duncan Watts, applied mathematician and principal research scientist, Yahoo! Research

How does all this apply to data-center networks?
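As an illustrative aside (the example is a standard one from chaos theory, not from the paper), the logistic map shows how a trivially simple deterministic rule produces exactly this sensitivity: two trajectories whose starting states differ by one part in a billion stay indistinguishable at first, then diverge completely.

```python
def logistic_trajectory(x0: float, r: float = 4.0, steps: int = 60) -> list[float]:
    """Iterate the logistic map x -> r*x*(1-x), a minimal
    deterministic system that is chaotic for r = 4."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.300000000)
b = logistic_trajectory(0.300000001)  # starting state differs by 1e-9

# Early on the two runs are effectively identical...
print(abs(a[5] - b[5]))
# ...but the tiny initial difference is amplified at every step,
# and within a few dozen iterations the trajectories are uncorrelated.
print(max(abs(x - y) for x, y in zip(a[40:], b[40:])))
```

The analogy to the network case: a "minor difference in configuration" plays the role of the perturbed starting state, and the downstream failure is the arbitrarily large later divergence.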
In a nutshell, they need to be architected in a way that limits complexity and reduces, or eliminates, the possibility of chaotic behavior—particularly in light of the non-deterministic nature of virtualized servers and applications.

Architecting to Control Complexity

As noted, controlling complexity in a network boils down to minimizing N, the number of networking elements. With traditional architectures, that's not exactly easy. Traditional architectures are built around a core-distribution-edge design. (Please see Figure 3.) With such an architecture, connecting from a virtual workload ("V") executing in server farm A to a virtual workload in server farm C requires traversal up and down the hierarchy (six hops). Similarly, a virtual workload in server farm B is three hops away from server farm C, and eight hops away from server farm A.
This poses three challenges. First, to scale to support an increasing number of virtual workloads, this architecture must increase the number of network elements, exactly the opposite of the goal. Second, injecting multiple switching elements between virtual machines increases the path length, and therefore the latency, between virtual machines, which can adversely affect performance. Finally, a hierarchical network design is ill equipped to handle highly dynamic endpoints, such as virtual workloads.

Figure 3: Traditional Network Architecture

The solution is to "collapse the core," or flatten the traditional hierarchical structure as much as possible, ideally to provide a single hop between every site. That reduces complexity and also reduces latency. This means, for example, that if a user is processing a video-streaming application, he or she can dynamically provision a video-processing server across the data-center network from the video server with the confidence that the data-center network will inject no more than a single hop's delay.

There's an additional step that can reduce complexity even more dramatically. In other complex systems—aerospace engineering, for example—the solution to untrammeled complexity is to create "black boxes" that provide a simple and predictable set of inputs and outputs to the rest of the system, thereby bounding the complexity (and curtailing potential chaos). That is, the possible number of interactions between elements within the black box and the rest of the system is reduced, since the black box appears to the outside system as a single element.
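Figure 3 is not reproduced here, so the topology below is a hypothetical stand-in (all node names are invented for illustration). The sketch counts switch hops with a breadth-first search to contrast a three-tier hierarchy, where server-farm traffic must climb to the core and back down, with a collapsed single-fabric design:

```python
from collections import deque

def hops(adj: dict[str, list[str]], src: str, dst: str) -> int:
    """Shortest number of links between src and dst, via BFS."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("unreachable")

# Hypothetical core-distribution-edge tree: farms A and C hang off
# different edge switches, so their traffic traverses the full hierarchy.
three_tier = {
    "core":  ["dist1", "dist2"],
    "dist1": ["core", "edgeA"],
    "dist2": ["core", "edgeC"],
    "edgeA": ["dist1", "farmA"],
    "edgeC": ["dist2", "farmC"],
    "farmA": ["edgeA"],
    "farmC": ["edgeC"],
}

# Collapsed core: every farm attaches to one logical switching fabric.
flat = {
    "fabric": ["farmA", "farmC"],
    "farmA": ["fabric"],
    "farmC": ["fabric"],
}

print(hops(three_tier, "farmA", "farmC"))  # 6 links up and down the tree
print(hops(flat, "farmA", "farmC"))        # 2 links: a single switch hop
```

In this (assumed) layout, flattening cuts the farm-to-farm path from six links to a single switch traversal, which is the latency argument the text makes for collapsing the core.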
In the case of network complexity, the corresponding approach is to reduce the entire network to a single, consistently managed switching element. In other words, replacing N independently managed switches with a single switch reduces inherent complexity as low as possible (N = 1). In some ways, this is analogous to grid computing, in which interconnected server farms can be managed and controlled as if they were a single server. Such approaches have succeeded in reducing cost and complexity at the same time they dramatically increase reliability (for example, in the data centers of Google and other major Internet sites).

Figure 4: Flattening the Core

The Impact on Network Evolution

Collapsing the core in data-center network design means data-center routers and switches are no longer simply larger versions of the same network devices that exist in the rest of the campus network. Data-center networks are becoming larger and more centralized, scaling to the transport of terabit/s traffic with nanosecond latencies. Additionally, these networks are seeing the convergence of Ethernet and Fibre Channel, interconnecting server and storage systems rather than users. Finally, they're supporting an increased density of virtual workloads, which migrate dynamically across servers.

Access LANs (the networks in campus and branch offices), in contrast, increasingly feature centralized management, policy, and provisioning for a high density of (relatively) low-bandwidth users; support the convergence of wired and wireless infrastructure; interconnect users (rather than servers and storage); and support user (rather than server) density and dynamism. (Please see Figure 5.)
The bottom line, again, is that today's one-size-fits-all network architecture is segmenting into multiple, specialized networks.

Figure 5: Network Evolution

Conclusions and Recommendations

Given the tectonic shifts in requirements, IT managers should be thinking differently about how to architect data-center networks. Thanks to consolidation, data centers are handling more traffic, and applications like video and desktop virtualization are driving bandwidth and latency performance requirements. Finally, with the widespread advent of server virtualization, data centers today are increasingly interconnecting a mobile, dynamic population of virtual workloads. These changes collectively are driving the need for an architecture that delivers performance and scalability without increasing cost or complexity, which means, in turn, that IT managers should think in terms of bounding complexity and reducing the potential for chaotic behavior.

That means reassessing the traditional core, distribution, and edge architecture with an eye to minimizing switch count and maximizing the ability to manage multiple network components as a whole.

Data-center network architects should take the following steps:

• Recognize that application volatility and dynamism are just beginning. As more and more data centers begin supporting autoprovisioning, virtual machines will start popping up all over the data center (and disappearing just as quickly).
• Plan for any-to-any. Even if the predominant application flows today are client-to-server, anticipate a rapid rise in peer-to-peer traffic in the data center.

• As much as possible, collapse the core (flatten the number of networking tiers). Even a partial "collapse" is better than none. If the current environment has the canonical three tiers, strive to reduce them to two.

• Seek an integrated approach to managing switching elements. Managing and provisioning routers and switches as a single entity reduces complexity (and the possibility for chaotic behavior).

    I shall put Chaos into fourteen lines
    And keep him there; and let him thence escape
    If he be lucky; let him twist, and ape
    Flood, fire, and demon—his adroit designs
    Will strain to nothing in the strict confines
    Of this sweet order, where, in pious rape
    I hold his essence and amorphous shape,
    Till he with order mingles and combines.
    Past are the hours, the years of our duress,
    His arrogance, our awful servitude:
    I have him. He is nothing more nor less
    Than something simple not yet understood;
    I shall not even force him to confess;
    Or answer. I will only make him good.
    —Edna St. Vincent Millay, Mine the Harvest, a collection of new poems

About Nemertes Research: Nemertes Research is a research-advisory firm that specializes in analyzing and quantifying the business value of emerging technologies. You can learn more about Nemertes Research at our Website,, or contact us directly at

2000388-001 Feb 2011