Global Server Load Balancing
Dima Krioukov [dima@nortelnetworks.com]
Alex Kit [akit@winstar.com]
October 24, 2000
Purpose
  Existing methods
  New technique
  Analysis
  Applicability considerations
Plan
  Introduction
    What are ASPs?
    Requirements to IDCs
  LSLB
    Load Sharing NAT (LSNAT)
    Direct Server Return (DSR)
    Tunneling
  GSLB
    DNS Based
    Host Route Injection (HRI)
    Triangle Data Flow (TDF)
    Latest Trends
  New Technique – Virtual Block Injection (VBI)
    Description
    Testing
    Analysis
  Applicability Considerations
  Conclusions and References
Abbreviations
  LB = Load Balancing/Balancer
  SLB = Server LB
  LSLB = Local SLB
  GSLB = Global SLB
  HA = High Availability
  RS = Real Server/Service
  VS = Virtual Server/Service
  VIP = VS IP address
  LSNAT = Load Sharing NAT
  DSR = Direct Server Return
  PRP = Proximity Report Protocol
  LRP = Load Report Protocol
  LPRP = PRP + LRP
  HRI = Host Route Injection
  VBI = Virtual Block Injection
  TDF = Triangle Data Flow
  IDC = Internet Data Center
  CDN = Content Delivery Network
  ASP = Application Service Provider
  CASP = Content/Collocation and Application Service Provider
  AIP = Application Infrastructure Provider
  xyP = ?
1. Introduction Logic: GSLB    IDC    ASP    Hosting
Hosting Infrastructure [diagram: Web User, Content Owner, IDC Owner, ISP, OSS]
ASP Infrastructure [diagram: End Customer, ASP, Applications, Operations, ISP/Backbone, Access, IDC]
IDC [diagram of tiers: IDC Core (Routing); Distribution (L3 Switching); LB Tier, Load Balancing (L4 Switching); Port Density (L2 Switching); Servers; SAN]
Requirements to IDCs
  Load Balancing (LB)
    Local
    Global
      Proximity (“including” congestion)
      Load
  High Availability (HA)
    Local
    Global
  [diagram: Client, IDC1, IDC2; HA, LB]
2. Generic SLB and LSLB
  SLB = VS -> RS
  Health Checking
    Layer 2
    Layer 3
    Layer 4
    Layer 7
  SLB Algorithm
    Round Robin
    Least Connections
    Server Response Time
    Server Load
    Hashing
  SLB Forwarding
    Session Tables
    Timers
LSLB Forwarding
  LSNAT
  DSR
  Tunneling
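Three of the SLB algorithms named on this slide (Round Robin, Least Connections, Hashing) can be sketched as below. This is a minimal illustration, not any vendor's implementation: the `RealServer` class, its `healthy` and `sessions` fields, and the choice of SHA-256 for hashing are all assumptions made for the example.

```python
import hashlib


class RealServer:
    """Hypothetical real server (RS) record kept by the load balancer."""

    def __init__(self, name):
        self.name = name
        self.healthy = True   # result of the latest health check
        self.sessions = 0     # currently open sessions


class RoundRobin:
    """Rotate through the servers, skipping unhealthy ones."""

    def __init__(self, servers):
        self.servers = servers
        self.next_index = 0

    def pick(self):
        count = len(self.servers)
        for offset in range(count):
            index = (self.next_index + offset) % count
            if self.servers[index].healthy:
                self.next_index = (index + 1) % count
                return self.servers[index]
        return None  # no healthy server available


def least_connections(servers):
    """Pick the healthy server with the fewest open sessions."""
    healthy = [s for s in servers if s.healthy]
    return min(healthy, key=lambda s: s.sessions) if healthy else None


def hash_pick(servers, client_ip):
    """Map a client address to a healthy server deterministically."""
    healthy = [s for s in servers if s.healthy]
    if not healthy:
        return None
    digest = hashlib.sha256(client_ip.encode()).digest()
    return healthy[int.from_bytes(digest[:4], "big") % len(healthy)]
```

Note that the hash-based choice changes for some clients whenever the set of healthy servers changes, which is one reason session tables are still needed.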
LSNAT
[diagram: Router, segment X, LB, segment Y, servers S1, S2, S3]

Ingress
Segment  Layer  src          dst
X        L2     Router_MAC   Virtual_MAC
X        L3     Client_IP    Virtual_IP
X        L4     Client_Port  Virtual_Port
Y        L2     LB_MAC       S1_MAC
Y        L3     Client_IP    S1_IP
Y        L4     Client_Port  S1_Port

Egress
Segment  Layer  src           dst
Y        L2     S1_MAC        LB_MAC
Y        L3     S1_IP         Client_IP
Y        L4     S1_Port       Client_Port
X        L2     Virtual_MAC   Router_MAC
X        L3     Virtual_IP    Client_IP
X        L4     Virtual_Port  Client_Port
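The L3/L4 rewriting that LSNAT performs can be sketched as follows. This is an illustration under stated assumptions: the `Packet` and `Lsnat` names are hypothetical, L2 (MAC) rewriting is left out, and the session table is a plain dictionary with no timers.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class Lsnat:
    """Hypothetical LSNAT translator for one virtual service."""

    def __init__(self, virtual_ip, virtual_port):
        self.virtual = (virtual_ip, virtual_port)
        # (client_ip, client_port) -> (server_ip, server_port)
        self.sessions = {}

    def ingress(self, pkt, server):
        """Client to VS: rewrite the destination to the real server.

        `server` is the (ip, port) chosen by the SLB algorithm; it is only
        used when the client has no session table entry yet.
        """
        key = (pkt.src_ip, pkt.src_port)
        server_ip, server_port = self.sessions.setdefault(key, server)
        return replace(pkt, dst_ip=server_ip, dst_port=server_port)

    def egress(self, pkt):
        """Real server to client: rewrite the source back to the VS."""
        if (pkt.dst_ip, pkt.dst_port) not in self.sessions:
            return None  # no matching session, do not translate
        return replace(pkt, src_ip=self.virtual[0], src_port=self.virtual[1])
```

Because both directions are rewritten, return traffic has to pass through the LB, which is the constraint DSR and tunneling remove.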
LSNAT + Source NAT
[diagram: Router, segment X, LB, segment Y, servers S1, S2, S3]

Ingress
Segment  Layer  src          dst
X        L2     Router_MAC   Virtual_MAC
X        L3     Client_IP    Virtual_IP
X        L4     Client_Port  Virtual_Port
Y        L2     LB_V_MAC     S1_MAC
Y        L3     LB_V_IP      S1_IP
Y        L4     LB_V_Port    S1_Port

Egress
Segment  Layer  src           dst
Y        L2     S1_MAC        LB_V_MAC
Y        L3     S1_IP         LB_V_IP
Y        L4     S1_Port       LB_V_Port
X        L2     Virtual_MAC   Router_MAC
X        L3     Virtual_IP    Client_IP
X        L4     Virtual_Port  Client_Port
DSR
[diagram: Router, LB, servers S1, S2, S3; packets 1, 2, 3]

Packet  Layer  src           dst
1       L2     Router_MAC    Virtual_MAC
1       L3     Client_IP     Virtual_IP
1       L4     Client_Port   Virtual_Port
2       L2     Virtual_MAC   S1_MAC
2       L3     Client_IP     Virtual_IP
2       L4     Client_Port   Virtual_Port
3       L2     S1_MAC        Router_MAC
3       L3     Virtual_IP    Client_IP
3       L4     Virtual_Port  Client_Port
Tunneling
[diagram: Router, LB, servers S1, S2, S3; packets 1, 2, 3]

Packet  Layer  src                      dst
1       L2     R_MAC                    V_MAC
1       L3     C_IP                     V_IP
1       L4     C_Port                   V_Port
2       L2     LB_MAC                   S1_MAC
2       L3     Ext: LB_IP; Int: C_IP    Ext: S1_IP; Int: V_IP
2       L4     C_Port                   V_Port
3       L2     S1_MAC                   R_MAC
3       L3     V_IP                     C_IP
3       L4     V_Port                   C_Port
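The tunneling slide shows packet 2 carrying an external header (LB_IP to S1_IP) around an unchanged internal header (C_IP to V_IP). A rough sketch of that encapsulation step, assuming an IP-in-IP style tunnel and using hypothetical names (`IpPacket`, `encapsulate`, `decapsulate`, `server_reply`):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IpPacket:
    src_ip: str
    dst_ip: str
    payload: object  # application data, or another IpPacket when tunneled


def encapsulate(inner, lb_ip, server_ip):
    """LB side: wrap the client's packet in an external header (packet 2)."""
    return IpPacket(src_ip=lb_ip, dst_ip=server_ip, payload=inner)


def decapsulate(outer):
    """Server side: strip the external header and recover the client's packet."""
    if not isinstance(outer.payload, IpPacket):
        raise ValueError("not a tunneled packet")
    return outer.payload


def server_reply(request, data):
    """Server answers the client directly, sourced from the VIP (packet 3)."""
    return IpPacket(src_ip=request.dst_ip, dst_ip=request.src_ip, payload=data)
```

Unlike DSR, the LB and the server do not need to share an L2 segment here, since the external header is routable; this is what makes the same mechanism usable between IDCs.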
3. GSLB
  DNS Based
  HRI
  TDF
  Latest Trends
3.1 DNS Based
  GSLB = Name -> VS (DNS+)
  Smart DNS
    Load and availability awareness -> Load Report Protocol (LRP)
    Proximity and congestion awareness -> Proximity Report Protocol (PRP)
  LB DNS Functionality
    DNS Server
    DNS Proxy
    Caching
    DNS Traffic Intercept
LPRP
  Transport
    UDP
    TCP
    HTTP
  Operation
    Periodic Updates
    Periodic Requests
    Triggered Updates
  [diagram: IDC1 LB, IDC2 LB, IDC3 LB]
PRP
  RTT
  Effective bandwidth
  Number of hops
  Number of AS hops
  IGP metric
  Proximity to the client LDNS, not to the client
LRP
  VS Health
    Up
    Down
    Backup only
  VS Load
    Number of sessions
    Response Time
  LB Load
    Number of sessions
    Capacity threshold
    CPU
  RS/Content Load
  Network Load
    bps
    pps
  QoS
  Security
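How a DNS-based GSLB might combine LRP and PRP data when answering a query can be sketched as below. The `SiteReport` fields and the selection rule (drop unhealthy or full sites, then prefer the lowest RTT to the client's LDNS, breaking ties by load) are assumptions chosen for illustration; the slides do not prescribe a specific rule.

```python
from dataclasses import dataclass


@dataclass
class SiteReport:
    """Hypothetical per-IDC state assembled from LRP and PRP reports."""
    vip: str
    vs_up: bool         # LRP: VS health
    sessions: int       # LRP: current number of sessions
    max_sessions: int   # LRP: capacity threshold (assumed > 0)
    rtt_ms: float       # PRP: RTT to the client's LDNS, not to the client


def choose_vip(reports):
    """Return the VIP to place in the DNS answer, or None if no site is usable."""
    candidates = [r for r in reports if r.vs_up and r.sessions < r.max_sessions]
    if not candidates:
        return None
    best = min(candidates, key=lambda r: (r.rtt_ms, r.sessions / r.max_sessions))
    return best.vip
```

The `rtt_ms` field reflects the limitation noted on the PRP slide: the measurement targets the LDNS, so the answer can be wrong for clients far from their LDNS.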
How it works [diagram: Client, LDNS, RDNS, ADNS, Customer, IDC1, IDC2 LB, IDC3 LB; steps 1 to 6]
How it works [diagram: Client, LDNS, RDNS, ADNS, Customer, IDC1, IDC2 LB, IDC3 LB; steps 7 to 11]
Analysis
  Pros
    Accurate load info
    Accurate proximity info
    Perfect solution… in some cases and if certain conditions are met
  Cons
    DNS – wrong target
      Proximity between client and its LDNS
    Caching
      LB
      LDNS
      Application
    Complexity
      Hard to find optimal values for various timers (TTL, cache timeouts, etc.) and prefix lengths
3.2 HRI
  GSLB = Routing+
  To what?
    BGP
    IGP
  By what?
    RS
    Router
    LB
To what
  IGP?
  BGP
    Route filtering (both ways)
    No ECMP
  [diagram: Client, Router, IDC1, IDC2]
By what: RS [diagram: IDC1 with Router and RS, BGP; IDC2 with Router and RS, BGP]
By what: Router [diagram: IDC1 with Router and RS; IDC2 with Router, RS, RS, LB]
By what: LB [diagram: IDC1 with Router, RS, RS, LB, BGP; IDC2 with Router, RS, RS, LB, BGP]
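When the LB is the injecting device, the core of HRI is a mapping from VS health to host-route announcements and withdrawals. A minimal sketch of that bookkeeping, with a hypothetical `route_updates` helper; how the resulting updates are handed to a BGP or IGP speaker is outside the sketch.

```python
import ipaddress


def route_updates(previous, vs_health):
    """Compute host-route changes from the latest VS health checks.

    previous  -- set of host routes (strings like "192.0.2.10/32") now announced
    vs_health -- dict mapping VIP string to True (up) or False (down)
    Returns (announce, withdraw, current).
    """
    current = {
        f"{ipaddress.ip_address(vip)}/32"   # also validates the address
        for vip, healthy in vs_health.items()
        if healthy
    }
    announce = sorted(current - previous)
    withdraw = sorted(previous - current)
    return announce, withdraw, current
```

Every VS contributes its own /32 in this scheme, which is where the "too many routes" concern comes from.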
Analysis
  Pros
    Simplicity
      No new protocols are needed
    Proximity is handled by routing
    Load handling?
  Cons
    Single backbone*
      Its own
      Single ISP
    Too many routes
    Less accurate load and proximity info
      Only local load
      Optimal routing?
    Route flapping*
3.3 TDF
  GSLB = X + TDF
  NAT Based
  Tunneling
  [diagram: Client; IDC1, “wrong”; IDC2, “right”]
Why “wrong” IDC?
  Failed, disabled, or unimplemented LPRP
  Cached DNS records
  Other delay effects (LPRP, BGP)
NAT Based
[diagram: Client; IDC1, “wrong”, with V1.1 and V1.2; IDC2, “right”, with V2.1 and V2.2; packets 1, 2, 3]

Packet  Layer  src   dst
1       L3     C     V1.1
2       L3     C     V2.2
3       L3     V1.1  C
“Remote Servers”
[diagram: Client; IDC1, “wrong”, with V1.1; IDC2, “right”, with V2.1; packets 1, 2, 3, 4]

Packet  Layer  src   dst
1       L3     C     V1.1
2       L3     V1.1  V2.1
3       L3     V2.1  V1.1
4       L3     V1.1  C
Tunneling Next section
Analysis
  Pros
    Fixes errors optimally
  Cons
    ip verify reverse-path
  [diagram: Client, Router, Router; IDC1, “wrong”; IDC2, “right”]
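The `ip verify reverse-path` con can be illustrated with a strict reverse-path check. The sketch assumes the problem is that the "right" IDC replies to the client using the "wrong" IDC's VIP as the source address, so a router in front of the "right" IDC sees a source that it routes elsewhere. The routing table, interface names, and addresses are hypothetical.

```python
import ipaddress


def strict_rpf_accepts(routes, src_ip, ingress_if):
    """Accept a packet only if the best route back to its source
    points out of the interface the packet arrived on.

    routes -- list of (ipaddress network, interface name) pairs
    """
    src = ipaddress.ip_address(src_ip)
    matches = [(net, ifc) for net, ifc in routes if src in net]
    if not matches:
        return False
    _, best_if = max(matches, key=lambda m: m[0].prefixlen)  # longest prefix wins
    return best_if == ingress_if


# Hypothetical router in front of IDC2: IDC2's own block is reached via the
# "to_idc2" interface, everything else (including IDC1's VIP) via "backbone".
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "backbone"),
    (ipaddress.ip_network("203.0.113.0/24"), "to_idc2"),
]

# A TDF reply leaving IDC2 with IDC1's VIP (192.0.2.10) as source fails the check.
tdf_reply_ok = strict_rpf_accepts(ROUTES, "192.0.2.10", "to_idc2")
# Ordinary traffic sourced from IDC2's own block passes.
local_reply_ok = strict_rpf_accepts(ROUTES, "203.0.113.5", "to_idc2")
```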
3.4 Latest Trends, Radicalism
  Internet infiltration
    Going to the client edge
    Going to the client
    Modifying the client
  LB presence in strategic locations (HydraGPS, Speedera)
  LDNS modifications (Speedera)
  Application modifications (SRV RRs)
Internet Infiltrations [diagram, two frames: IDC1 LB, IDC2 LB, Customer, Client, additional LBs]
LDNS modifications in CDNs [diagram: IDC1 LB, IDC2 LB, Customer, LDNS, Client, ASP Backbone]
4. Virtual Block Injection (VBI)
  Inject not VS host routes, but blocks of GSLB’ed VSs -> IDC (LB) failures are handled by the routing protocol
  Use tunneling TDF in case of individual VS failure
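The two VBI rules on this slide can be sketched as below: the block advertisement follows LB (IDC) availability only, and an individual VS failure is handled per packet by tunneling to another IDC. Function names, the health dictionaries, and the example block are hypothetical.

```python
import ipaddress

# Hypothetical /20 block of GSLB'ed VIPs, standing in for V/20 on the slides.
VIRTUAL_BLOCK = ipaddress.ip_network("198.18.0.0/20")


def should_advertise_block(lb_up):
    """The block is advertised whenever the LB is up, whatever the state
    of individual VSs; an LB failure withdraws it via the routing protocol."""
    return lb_up


def handle_packet(vip, local_health, peer_health):
    """Decide what the LB does with a packet destined to `vip`.

    local_health -- dict VIP -> bool for this IDC
    peer_health  -- dict peer IDC name -> (dict VIP -> bool)
    Returns ("local", None), ("tunnel", peer_name) or ("drop", None).
    """
    if ipaddress.ip_address(vip) not in VIRTUAL_BLOCK:
        return ("drop", None)
    if local_health.get(vip, False):
        return ("local", None)
    for peer, health in peer_health.items():
        if health.get(vip, False):
            return ("tunnel", peer)  # tunneling TDF to a healthy copy
    return ("drop", None)
```

One /20 advertisement covers `VIRTUAL_BLOCK.num_addresses` (4096) addresses, compared with one /32 per VS under HRI.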
How it works [diagram, three frames: Client, AS1, AS2, ISP1, ISP2, IDC1 (R1/20), IDC2 (R2/20); “V/20, AS3” shown twice in frames 1 and 3, once in frame 2]
Testing
  Needed
    LB
    BGP
    Tunnels
  Linux
    Linux Virtual Server (LVS, Wensong Zhang, Julian Anastasov)
    Zebra
    Tunnels
Test Network
Analysis
  Pros
    All of HRI, plus
      No host route injection
      Working TDF
      Perfect VS health handling
      VS load -> LRP
    Obvious simplifications in more “ideal” cases
  Cons
    LB load -> stop advertisement?
    BGP – proximity tool?
    Discontinuous AS?
    Route flapping!
Route Flapping [diagram: Client, Router, AS1, AS2, ISP1, ISP2, IDC1 (R1/20), IDC2 (R2/20); “V/20, AS3” shown twice; UDP, TCP]
Solution for UDP
  Session table entry exchange for long sessions
  [diagram: Client, Router, AS1, AS2, ISP1, ISP2, IDC1 (R1/20), IDC2 (R2/20); “V/20, AS3” shown twice]
Solution for TCP
  If LB receives packet
    Destined to a VS
    No SYN
    No session table entry
    Not via the tunnels
  Forward via all the tunnels
  [diagram: Client, Router, AS1, AS2, ISP1, ISP2, IDC1 (R1/20), IDC2 (R2/20); “V/20, AS3” shown twice]
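The TCP rule on this slide is a four-part condition followed by a single action, which can be written out directly. The `TcpSegment` fields and the tunnel names are hypothetical; what the receiving LBs do with the forwarded copy is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcpSegment:
    dst_is_vs: bool    # destined to one of this LB's virtual services
    syn: bool          # SYN flag set (start of a new connection)
    via_tunnel: bool   # arrived over an inter-IDC tunnel


def tunnels_to_forward(segment, has_session, tunnels):
    """Return the tunnels over which the segment should be forwarded.

    A mid-connection segment for a VS that this LB has no session for,
    and that did not itself arrive over a tunnel, most likely belongs to
    a connection opened at another IDC before a route flap, so it is
    sent over all tunnels. Otherwise nothing is forwarded this way.
    """
    if (segment.dst_is_vs and not segment.syn
            and not has_session and not segment.via_tunnel):
        return list(tunnels)
    return []
```

The "not via the tunnels" condition keeps a forwarded segment from being flooded again by the LB that receives it.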
5. Applicability Considerations
  GSLB of
    Small number of VSs (or RSs)
      by an ISP*
      by its customer
    Big number of VSs (between IDCs)
      CASP    ISP
      CASP    ISP
      CASP has its own backbone
        CASP does not have control over customer access
        CASP has control over customer access**
      CASP does not have its own backbone
        CASP is multihomed to the same ISP
        CASP is multihomed to different ISPs*
6. Conclusions
  No ideal GSLB method
  For some “ideal” network scenarios, there are some “ideal” solutions
  For realistic network scenarios, there are rapidly improving realistic solutions
  Good competition
  Lack of comparative testing in production-like environments
References
  On ASPs: Nortel, ASP Industry Consortium, Network Magazine, IRG
  Vendors: Alteon, ArrowPoint, Foundry, F5, Cisco, Nortel, Radware, HydraWEB, Speedera, Resonate
  RFCs: LSNAT, SRV, DNS for LB, SLB draft (work in progress)
  Open Source: LVS, http://www.linuxvirtualserver.org/
  VBI Testing: http://www.krioukov.net/~dima/VBI/
