
Data Center Interconnects: An Overview



Report ID: S6970513

Data Center Interconnects: An Overview

A DCI lets companies link two or more data centers together for disaster recovery or business continuity, but it's not easy. This report provides an overview of the major DCI technologies and describes their pros and cons.

By Greg Ferro
Reports.InformationWeek.com, May 2013, $99
Table of Contents

Author's Bio
Executive Summary
The DCI Problem (Figure 1: Ingress Routing Problems)
Loop Prevention
Three Options
Software-Defined Networking
Vendor and Standards-Based Technologies (Figure 2: Leaf and Spine; Figure 3: Partial Mesh)
Custom Options (Figure 4: MLAG)
DCI: Weighing the Choices
Related Reports

About Us: InformationWeek Reports' analysts arm business technology decision-makers with real-world perspective based on qualitative and quantitative research, business and technology assessment and planning tools, and adoption best practices gleaned from experience.

Our Staff: Lorna Garey, content director; Heather Vallis, managing editor, research; Elizabeth Chodak, copy chief; Tara DeFilippo, associate art director.
© 2013 InformationWeek. Reproduction Prohibited.

Author's Bio: Greg Ferro

Greg Ferro has spent more than 20 years in IT sales, technical and management roles, but his main work is in network engineering and design. Today he works as a freelance consultant for Fortune 100 companies in the U.K., focusing on data centers, security and operational automation. Greg believes that cloud computing is just a new focus on technical operations, that software quality is vital and that networking is the unrealized future of everything because it's all about bandwidth.

At night he finds it cathartic to write about networking at Network Computing, his blog at Ethereal Mind and at Packet Pushers. He is known for practical opinions, technical viewpoints and being graceful when getting it wrong. Mostly.
Executive Summary

It's common business policy for organizations of a certain size to have two data centers as part of a disaster recovery or business continuity plan. However, most enterprise applications are not designed for or intended to use systems in two different locations. For example, a MySQL database is designed to exist on a single server with a single storage location. Building a resilient MySQL server requires an advanced infrastructure or complex software.

Enter the notion of a data center interconnect, which extends an Ethernet network between two physically separate data centers. While the idea is simple, Ethernet wasn't designed to run across a wide area network. Thus, a DCI implementation requires a variety of technological fixes to work around Ethernet's limitations.

This report outlines the issues that complicate DCIs, such as loops that can bring down networks and traffic trombones that eat up bandwidth. It also examines the variety of options companies have to connect two or more data centers, including dark fiber, MPLS services and MLAG, as well as vendor-specific options such as Cisco OTV and HP EVI. The report looks at the pros and cons of each option.
The DCI Problem

The most reliable method to connect two data centers together for high availability and disaster recovery is to route IP traffic between the data centers. However, it's become more common to extend the Ethernet network over the WAN, called a data center interconnect, or DCI. This allows for the use of features such as virtual machine migration. For instance, by connecting two data centers via Ethernet, administrators can move a SQL Server instance via VM migration without changing the IP address of the operating system. This is attractive to the server teams because the IP address is a key part of the directory service or configuration database. Maintaining the same IP address means that application settings remain the same and reduces the chance of errors when migration occurs. Service continuity is simpler if the IP address is unchanged.

[Figure 1: Ingress Routing Problems. Without careful planning after a server migration, traffic may unnecessarily traverse one data center and the data center interconnect to connect to servers in another data center. Source: Greg Ferro]

A VMware ESXi server can perform vMotion for up to eight virtual machines at once (provided that you have a 10 Gbps network adapter, or four at 1 Gbps). Given that it's common to have 20 to 40 VMs per physical server, you can see that evacuating a server will take some time. vMotion performance requires very low latency, typically less than 50 milliseconds to achieve control transfer (although options exist). A larger DCI bandwidth will result in faster vMotion and reduce the risk of a traffic trombone.

Note that you can create a cascading failure as more servers move, so you need increasing amounts of bandwidth for interserver application traffic, which makes the remaining vMotion tasks progressively slower. Bandwidth will eventually reach a peak and can prevent vMotion from occurring at the peak point of transition.

One problem with server migration is that storage must be synchronized between the data centers using storage replication technology. Replication is usually performed by the storage array, but it's an expensive option and will consume additional bandwidth between the sites.

Provided that the storage is replicated between the sites, extending the Ethernet network between data centers results in the simplest possible server migration between each data center, though it incurs significant technical debt. It is networking best practice to use Layer 3 routing between geographically diverse locations and to limit Layer 2 connectivity wherever possible, thus improving network stability and limiting risk domains to a single data center. We'll look at some of the technological challenges of DCI and discuss the pros and cons of various DCI options.

Loop Prevention

Ethernet introduces several technical hurdles in building a DCI. Ethernet was created some 30 years ago as a local area network protocol, with no practical concept of scaling past a few machines. By design, Ethernet is a multiaccess technology that allows all Ethernet broadcast frames to be received by all endpoints on the network. Thus, an Ethernet broadcast frame must be forwarded across all Ethernet networks, including the DCI.
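A toy model shows why this flood-everywhere rule becomes catastrophic once a forwarding loop exists. The replication factor below is an illustrative assumption, not a measured value; the point is the exponential shape of the growth.

```python
# Toy model of broadcast multiplication in a looped Layer 2 network:
# each trip around the loop causes switches to reflood copies of a
# broadcast they have already forwarded, so copies multiply rather
# than dying out. Real growth depends on topology.

def frames_after(cycles, replication_factor=2, initial_frames=1):
    """Copies of one broadcast frame after `cycles` trips around the loop."""
    return initial_frames * replication_factor ** cycles

for c in (1, 5, 10, 20):
    print(c, frames_after(c))
# After 20 loop traversals, a single frame has become over a million
# copies, which is why a loop can saturate a DCI link in seconds.
```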
If a broadcast frame is looped back into an Ethernet network, it will be forwarded by all switches even though it was already broadcast. This creates a race condition that rapidly consumes all network bandwidth and usually results in catastrophic network failure as the volume of broadcasts expands to consume all resources.

The Spanning Tree Protocol was designed to address the loop problem with Ethernet and has generally served its purpose on the LAN. However, it's not suitable for controlling the flooding of packets between data centers, because Spanning Tree is not easily scalable and risk domains grow as network diameter grows. STP has no domain isolation, so a problem in a single data center can propagate between data centers. In addition, first-hop resolution and inbound routing selection can cause verbose inter-data center traffic over the DCI.

When a server is migrated between data centers, traffic to and from the server must be intentionally designed. Outbound flows from the server will default to a router that may or may not be in the same data center. In this instance, traffic from a server in DC B may traverse the DCI link to reach the router in DC A and then return back over the DCI link to another resource in DC B. This is not optimal because the network link between data centers is bearing all traffic from external users as well as traffic from the relocated server to other servers (see Figure 1).

The resulting traffic pattern is sometimes called a traffic trombone. Consider a Web server with a Java runtime and a MySQL database in DC A. After migration of the Web server to DC B, the traffic flows over the DCI are:

>> External flows from the WAN and/or Internet
>> Flows from the Web server to the database
>> Administrative traffic flows such as backups, monitoring and patching

Consider performing a backup of the migrated server in DC B to a backup system in DC A. How much bandwidth do you need so that the backup will complete within the backup window? For example, moving 2 TB within a four-hour window requires a sustained 1.1 Gbps before any protocol overhead. And will the backup impact critical traffic like the database queries or customer Web traffic?

Three Options

Today there are three common methods for modifying traffic flow between data centers: first-hop bypass, LISP and load balancing. We'll look at each in turn.

First-hop bypass relates to the many options for establishing a local default gateway or router hop for the server. The server will require the same default gateway address in each data center, but sending the traffic from DC B to DC A leads to failure. Therefore, methods based around MAC address filtering for HSRP IP gateways are common. There are several ways to handle this, specific to each router vendor's software implementation.

Locator/ID Separation Protocol (LISP) is an IETF standard proposed by Cisco that modifies the concept of routing location. Instead of routing to a subnet in the network, traffic is forwarded to a specific router using a tunnel. The router will then forward the traffic to an identifier, which is the IP address of the server. You can find more about LISP at the IETF LISP Working Group. LISP works for inbound and outbound traffic.
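The locator/identifier split can be sketched in a few lines. This is a conceptual illustration only, not LISP wire format: the mapping table and the addresses are invented, documentation-range examples.

```python
# Conceptual sketch of LISP forwarding: the server keeps its identifier
# (EID) across a migration; only the mapping from EID to routing locator
# (RLOC) changes. Addresses are invented for illustration.

mapping_system = {
    "192.0.2.20": "198.51.100.1",   # EID -> RLOC of DC A's tunnel router
}

def forward(eid):
    """Describe how an ingress tunnel router would handle traffic to `eid`."""
    rloc = mapping_system[eid]
    return f"encapsulate to {rloc}, inner destination {eid}"

print(forward("192.0.2.20"))        # traffic tunnelled to DC A

# After migrating the server to DC B, only the mapping is updated;
# the EID (the server's own IP address) never changes:
mapping_system["192.0.2.20"] = "198.51.100.2"
print(forward("192.0.2.20"))        # traffic now tunnelled to DC B
```

Because only the mapping system is updated, inbound and outbound flows both follow the server to its new location, which is why the text notes that LISP works in both directions.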
Load balancing involves using the source NAT features on a load-balancing VIP so that traffic will be sourced from a device within the data center. However, this only works for inbound flows and must be combined with other traffic controls such as first-hop bypass for a complete solution.

A fourth option is route injection, which involves triggering a dynamic route injection into the network routing based on certain trigger options. This method has proved less reliable in wider use because routing protocols have limited trigger capabilities. It works for inbound flows and partially for outbound flows.

These technologies address the traffic trombone problem using legacy or existing network tools, but you may also wish to consider vendor-specific technologies such as Cisco's OTV or Hewlett-Packard's EVI, which we'll discuss later.

Software-Defined Networking

In the future, SDN and controller-based networking will likely provide new capabilities that do not rely on the configuration of individual devices or require manual overrides of routing configuration. If you're planning a DCI deployment, you should consider evaluating the new SDN technologies, including Juniper Contrail, VMware NSX and Alcatel-Lucent's Nuage Networks.

[Sidebar — Research: The Next-Gen WAN. Respondents to our Next Generation WAN Survey are a highly connected bunch: 44% have 16 or more branch or remote offices linked to their primary data centers. And Ethernet-based services like MPLS outstripped ISDN among current users, 73% to 56%. What's next? Demand for dark fiber and private clouds, among other things.]

Vendor and Standards-Based Technologies

There has been significant demand for DCI products, and this has led to a number of technological developments by vendors and by standards bodies. I'll look at five options: dark fiber, MPLS pseudowires, MLAG, TRILL/ECMP and custom vendor protocols.

Dark fiber: Dark fiber is a broad term used to describe dedicated fiber-optic cables or services that closely emulate dedicated cables. For some geographies, it's possible to lay your own fiber between data centers and own the right of way, or to purchase a dedicated fiber from a provider. Physical cables are usually capped at around 50 to 75 kilometers (the distance of a long-haul, single-mode laser transmitter).

More commonly, your local carrier provides a dense wavelength division multiplexing service that presents fiber at each site and appears as a dedicated fiber cable to the sites. The customer can use any signal on that fiber because the laser is physically multiplexed and the DWDM drop has limited awareness of your protocols. The DWDM service provides additional circuit redundancy through the use of ring technologies, and the carrier can provision multiple services over a single fiber pair.

MPLS pseudowires: When it comes to MPLS, most organizations will choose to purchase a Layer 2 virtual private network service from a service provider. Service providers use the MPLS protocol internally in their networks to provide a wide range of IP routed services such as WAN and Internet circuits.

[Figure 2: Leaf and Spine. A leaf-and-spine architecture, with 40 Gbps uplinks, can be used in a DCI. Source: Greg Ferro]
Typically, your service provider will deliver an Ethernet port to your premises, and this will connect to the provider's MPLS backbone. MPLS standards have organically grown into a messy bunch of protocols that can provide Layer 2 Ethernet emulation over an MPLS network. Technologies such as VPLS, EoMPLS, GREoMPLS and L2TPv3 all provide ways of emulating Ethernet networks. Your provider's MPLS network must be configured to support one or more of these technologies. These technologies are incorrectly but widely referred to as "pseudowires" because their original purpose was to emulate ATM and frame relay circuits in the early 2000s, before being modified for Ethernet.

Large enterprises may build their own MPLS backbones to have greater control over the services and security of the WAN, but for most companies this won't be a viable option. MPLS is a relatively complex group of protocols that requires a significant amount of time to learn and comprehend. Building mission-critical business services on MPLS is hard and should generally be avoided.

MLAG: Multichassis link aggregation describes the logical bonding of two physical switches into a single unit, as shown in Figure 4. The logical switch control plane is a single software entity. This prevents loop conditions from occurring and reduces operational risk. It's simple to use, configure and maintain compared with other approaches, and is less expensive. Your service provider can supply Layer 2 services (probably using dark fiber or MPLS pseudowires, as discussed previously).

Note that MLAG is not a standard. Each vendor has its own name for the technology, such as Cisco's vPC, HP's IRF, Juniper's MC-LAG and Brocade's Multi-Chassis Trunk.

To use MLAG for DCI, connect each port on the MLAG switches to the Layer 2 service to prevent loops. It's recommended not to use MLAG features on core switches in each data center; instead, use fixed switches in a modular design for control and better support. MLAG can handle up to eight point-to-point circuits. A service provider failure would reduce the available bandwidth and will require careful design if you're using quality of service to protect key applications.

ECMP/TRILL: Equal-cost multipath, or ECMP, is a more recent addition to the options for DCI. The IETF TRILL standard provides a multipath protocol that "routes" Ethernet frames across up to 16 paths that have the same bandwidth or cost.

[Figure 3: Partial Mesh. TRILL can be used to create a Layer 2 partial-mesh DCI topology. Source: Greg Ferro]

Although intended for data center backbones to implement a Clos tree switch fabric (sometimes known as leaf/spine), TRILL can be used as a DCI technology. It provides high availability because dual switches are used at all sites, and it also provides native STP isolation. A unique feature of TRILL as a DCI technology is that it supports a partial-mesh topology for multiple data centers because the Layer 2 traffic is routed over the TRILL core.

Although the core features are complete, the TRILL protocol continues to be developed. Many mainstream vendors have not released a fully standards-compliant implementation, so while you can build a TRILL fabric from a single vendor's gear, you may run into interoperability problems in a heterogeneous environment. Some vendors are also extending the standard to add proprietary features. Brocade VCS and Cisco FabricPath are two of the available options today.
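The multipath "routing" of frames depends on hashing a flow's headers to pick one of the equal-cost paths, so each flow stays in order on a single path while flows in aggregate spread across up to 16 paths. The sketch below is an illustrative assumption, not a vendor algorithm: real switches hash different header fields and use different functions.

```python
import zlib

# Illustrative flow-hash path selection over up to 16 equal-cost paths.
# CRC32 over the MAC address pair stands in for whatever hash a real
# switch applies; the key property is determinism per flow.

def pick_path(src_mac, dst_mac, num_paths=16):
    """Map a flow to one of `num_paths` equal-cost paths."""
    flow_key = f"{src_mac}:{dst_mac}".encode()
    return zlib.crc32(flow_key) % num_paths

# The same flow always hashes to the same path, preserving frame order:
p1 = pick_path("00:1b:21:aa:01:02", "00:1b:21:bb:03:04")
p2 = pick_path("00:1b:21:aa:01:02", "00:1b:21:bb:03:04")
assert p1 == p2 and 0 <= p1 < 16
```

Per-flow hashing is why ECMP spreads load well only when there are many flows; a single large flow (a storage replication stream, say) still rides one path.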
[Figure 4: MLAG. MLAG logically bonds two physical switches to make them appear as a single unit (physical MLAG vs. logical MLAG). Source: Greg Ferro]

Custom Options

As you can see, there are complex technical challenges to extending Ethernet networks between data centers. The effort often brings more risk than customers are willing to accept. However, vendors are developing proprietary protocols to address these risks. Cases in point are Cisco's
Overlay Transport Virtualization (OTV) and HP's Ethernet Virtual Interconnect (EVI). These protocols encapsulate Ethernet in IP for transport over WAN services. Software agents for these protocols in the edge network devices provide features such as Spanning Tree isolation in each data center, reduced configuration effort and multisite setups. Compared with MPLS, OTV and EVI are very simple to configure and maintain, though you will incur a substantial licensing fee on specific hardware platforms. The simplicity of these approaches makes them attractive options for most enterprises.

DCI: Weighing the Choices

Before embarking on a DCI project, consider your disaster recovery plan carefully. Can you meet your disaster recovery requirements by a cold start from storage array replication, or even by restoring from a backup? If so, you may not need to make the investment in a DCI. On the other hand, if you are looking for disaster avoidance, where server instances can be evacuated between data centers when a specific event, such as a major storm or political intervention, occurs, then a DCI may be the way to go.

Perhaps the best advice is to consider carefully your actual business requirements. Migrating virtual workloads between data centers creates unique technical problems due to the complexity of traffic flows. The following technical concerns are just a few of the less obvious problems created by DCI:

>> Tracing application problems can be difficult when servers might be in two locations
>> Applications incur latency over the DCI for just one or two servers, resulting in unpredictable performance
>> Loop topology failure leads to outages in both data centers
>> Bandwidth exhaustion results in service loss and cannot be easily controlled

Layer 2 DCI is a last-resort technology that allows legacy applications to behave as if they were in the same Ethernet domain.
The correct solution is to deploy applications that are designed to run active/active in two or more data centers and to avoid deploying a DCI. If you choose to implement DCI, you should strictly limit its use to critical applications.
Related Reports

Want More Like This? InformationWeek creates more than 150 reports like this each year, and they're all free to registered users. We'll help you sort through vendor claims, justify IT projects and implement new systems by providing analysis and advice from IT professionals. Right now on our site you'll find:

Strategy: OpenFlow vs. Traditional Networks: OpenFlow and SDN have the potential to simplify network operations and management while driving down hardware costs. But they would also require IT to rethink traditional network architectures. At the same time, other protocols are available or emerging that can provide many of the same benefits without requiring major changes. We'll look at the pros and cons of OpenFlow and SDN and how they stack up against existing options to simplify networking.

SDN Buyer's Guide: SDN products are finally hitting the enterprise market. Do you have a strategy? This report, the companion to our online comparison, explains key factors to consider in four areas: software-defined networking controllers, applications, physical or virtual switches, and other compatible hardware.

Research: IT Pro Ranking: Data Center Networking: Cisco has an iron grip on the data center network. One reason is its reputation for quality: The company scores a 4.3 out of 5 for reliability, a rating no other vendor matched. That said, technology and market changes are loosening Cisco's hold. Will the shift to virtual infrastructure, next-gen Ethernet and commodity switching components change the vendor pecking order? More than 500 IT pros weighed in to evaluate seven vendors.
PLUS: Find signature reports, such as the InformationWeek Salary Survey, InformationWeek 500 and the annual State of Security report; full issues; and much more.