Scalable Internet Servers and Load Balancing

Published in: Technology

  1. Computer Networks, 11/11/2009 (CSC 257/457 - Fall 2009)

     Scalable Internet Servers and Load Balancing
     Kai Shen

     Internet Online Applications
       • Internet online applications
           • Applications accessible to online users through the Internet.
       • Examples
           • Online keyword search engine: Google.
           • Web email: Gmail.
           • News: CNN, NBC news.
           • Web directory: Yahoo!, MSN.
       • Scalability requirements
           • Many simultaneous user accesses; large amount of hosted data, …
       • Internet servers
           • Computer systems that host these online applications.

     Internet Servers are at the Application Layer
       • Normally on the end hosts, involving no routers
       • Function on transport-layer protocols TCP/UDP
       [Figure: Google, Yahoo!, and CNN attached to the Internet]

     Search Engine as An Example: Step 1 – Crawling
       • Crawling – get all these Web pages out there:
           • First retrieve some root pages;
           • Parse their content and follow hyperlinks to retrieve more pages;
           • Depth-first search or breadth-first search?
           • Remove duplicates.
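The crawling steps above (start from root pages, follow hyperlinks, remove duplicates) can be sketched as a breadth-first traversal. This is a minimal sketch under stated assumptions, not the course's implementation: `crawl`, `fetch_links`, `max_pages`, and `TOY_WEB` are hypothetical names, and the in-memory dictionary stands in for real HTTP fetching and HTML parsing.

```python
from collections import deque

def crawl(roots, fetch_links, max_pages=1000):
    """Breadth-first crawl from the root pages, visiting each URL at most once."""
    seen = set()       # duplicate removal: URLs already queued or visited
    queue = deque()    # FIFO queue gives breadth-first order
    for root in roots:
        if root not in seen:
            seen.add(root)
            queue.append(root)

    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        # fetch_links stands in for "retrieve the page, parse it, extract hyperlinks"
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

# Hypothetical toy "web": page name -> hyperlinks found on that page.
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}

def toy_fetch_links(url):
    return TOY_WEB.get(url, [])

if __name__ == "__main__":
    print(crawl(["a"], toy_fetch_links))
```

Replacing `queue.popleft()` with `queue.pop()` turns the same loop into a stack-based, depth-first-style crawl, which is the choice the slide asks about.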
  2. Performance Analysis for Crawling
       • What are the resources involved?
           • CPU processing for TCP/HTTP protocol handling and the parsing of page content
           • writing to disk storage
           • network bandwidth to remote web sites
       • Assume average page size 10KB
           • raw processing power of a single CPU → 1000 requests/sec
           • I/O to a single disk → 100 seeks/sec → up to 100 requests/sec
           • network bandwidth from/to the Internet
               • T1 link (1.5Mbit/s) → 12 requests/sec
               • T3 link (45Mbit/s) → 360 requests/sec

     Search Engine as An Example: Step 2 – Indexing
       • Indexing
           • crawled raw web pages are not easy to search.
           • we index them to formats that are easy to search.
       • As part of indexing, we need to give each page an ID using a hash function.
       [Figure: index entries mapping words to page IDs. Computer: Page #123, Page #357, ……; Networks: Page #124, Page #468, ……]

     Search Engine as An Example: Step 3 – Online Search
       [Figure: Internet, firewall/router, local-area network, web server/query handler, index server, page server]

     Partitioning and Replication
       [Figure: Internet, firewall/router, local-area network, web server/query handlers, index servers (partition 1), index servers (partition 2), page servers]
       • Scalability, reliability
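The per-resource rates in the crawling analysis above imply that the slowest resource caps overall crawl throughput. A minimal sketch of that bottleneck reasoning follows, taking the slide's requests/sec figures as given; the `bottleneck` helper and the dictionary keys are hypothetical names introduced for illustration.

```python
def bottleneck(rates):
    """Return (resource, requests_per_sec) for the slowest resource.

    A crawler that needs every resource for each page cannot go faster
    than the resource with the lowest sustainable request rate.
    """
    name = min(rates, key=rates.get)
    return name, rates[name]

# Rates copied from the slide (requests/sec, assuming 10KB average pages).
WITH_T1 = {"cpu": 1000, "disk": 100, "network_t1": 12}
WITH_T3 = {"cpu": 1000, "disk": 100, "network_t3": 360}

if __name__ == "__main__":
    print(bottleneck(WITH_T1))  # the T1 link limits the crawler
    print(bottleneck(WITH_T3))  # with a T3 link, the single disk becomes the limit
```

Under these numbers, upgrading the link from T1 to T3 moves the bottleneck from the network to the disk rather than removing it.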
  3. Load Balancing on Internet Servers
       • Popular sites like Google or CNN receive tens or hundreds of millions of hits per day.
       • A large number of replicated servers are used at these sites.
       • Key question: how to balance client requests over these servers?

     Load Balancing over Internet Servers: Technique 1 - DNS Rotation
       [Figure: clients ask "IP address of CNN.com?"; the DNS server for CNN.com answers 128.111.1.2 to one query and 128.111.1.3 to another; requests pass through the firewall/router to the web servers for CNN.com at 128.111.1.2, 128.111.1.3, and 128.111.1.4]

     Load Balancing on Internet Servers: Technique 2 – Cooperative Offloading
       [Figure: Internet, firewall/router, web servers for CNN.com at 128.111.1.2, 128.111.1.3, and 128.111.1.4]

     Discussions on DNS Rotation
       • Advantages
           • Require almost no change on the existing Internet architecture
       • Problems
           • DNS caching
           • Rigid load balancing policy
               • can't balance based on runtime load changes
               • slow or no adjustment in response to failures
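DNS rotation and the DNS caching problem noted above can be illustrated with a toy model. This is a sketch under stated assumptions, not real DNS code: `RotatingDNS` and `CachingResolver` are hypothetical classes, no network traffic is involved, and the cache never expires (real resolvers honor a TTL).

```python
from collections import deque

class RotatingDNS:
    """Toy authoritative server that hands out server addresses in rotation."""

    def __init__(self, addresses):
        self._addrs = deque(addresses)

    def resolve(self):
        addr = self._addrs[0]
        self._addrs.rotate(-1)  # move the address just returned to the back
        return addr

class CachingResolver:
    """Toy local resolver that caches the first answer and reuses it."""

    def __init__(self, upstream):
        self._upstream = upstream
        self._cached = None

    def resolve(self):
        if self._cached is None:
            self._cached = self._upstream.resolve()
        return self._cached

if __name__ == "__main__":
    dns = RotatingDNS(["128.111.1.2", "128.111.1.3", "128.111.1.4"])
    print([dns.resolve() for _ in range(4)])    # rotation spreads the answers
    cache = CachingResolver(dns)
    print([cache.resolve() for _ in range(3)])  # caching pins clients to one server
```

Every client behind the caching resolver is sent to the same web server, so the rotation at the authoritative server no longer balances their requests.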
  4. Discussions on Cooperative Offloading
       • Can be combined with the DNS rotation.
       • Advantages:
           • More flexible policy is possible
           • Be more responsive to runtime workload and server failures (to a certain degree)
       • Problems:
           • Need software changes on servers
           • Longer delay

     Cooperative Offloading with TCP Handoff [Pai et al. ASPLOS1998]
       [Figure: Internet, firewall/router, web servers for CNN.com at 128.111.1.2, 128.111.1.3, and 128.111.1.4; packets are labeled with the client IP and server addresses 1.3 and 1.4]
       • What does 1.3 do?
       • What does 1.4 do?
       • All packets in a TCP connection must offload to one server?

     Cooperative Offloading vs. TCP Handoff
       • Software changes on the servers
       • Delays

     Load Balancing on Internet Servers: Technique 3 – Load Balancing Router
       [Figure: Internet, firewall, LB router at 128.111.1.1, web servers for CNN.com at 128.111.1.2, 128.111.1.3, and 128.111.1.4; packets are labeled with the client IP and addresses 1.1 and 1.2]
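The question above of whether all packets in a TCP connection must reach one server can be made concrete with a per-connection table in a dispatcher. The sketch below is only an illustration under stated assumptions: `ConnectionTable`, `route`, and `close` are hypothetical names, connections are identified by client IP and port alone, and no address rewriting or TCP handoff as described by Pai et al. is implemented.

```python
from itertools import cycle

class ConnectionTable:
    """Toy dispatcher: choose a server for a new connection, then stick to it."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # simple rotation for new connections
        self._table = {}                    # (client_ip, client_port) -> server

    def route(self, client_ip, client_port):
        key = (client_ip, client_port)
        if key not in self._table:
            # First packet of a connection: pick a server and remember the choice.
            self._table[key] = next(self._next_server)
        # Later packets of the same connection follow the recorded choice.
        return self._table[key]

    def close(self, client_ip, client_port):
        # Forget the mapping once the connection ends.
        self._table.pop((client_ip, client_port), None)

if __name__ == "__main__":
    lb = ConnectionTable(["128.111.1.2", "128.111.1.3", "128.111.1.4"])
    print(lb.route("10.0.0.1", 5000))  # new connection
    print(lb.route("10.0.0.2", 5000))  # another new connection
    print(lb.route("10.0.0.1", 5000))  # same connection as the first call
```

Because the server holding the TCP state is the only one that can interpret later segments, the table keeps each connection on the server first chosen for it. The client addresses in the usage lines are made up for the example.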
  5. More About Load Balancing Router
       • How deep do we look into the network protocol stack?
           • Network layer (IP)?
           • Transport layer (TCP/UDP)?
           • Application layer?
       • Load balancing policies in LB routers (Goal: transparency, plug-and-play)
           • Simple rotation
           • Least number of active requests
           • Shortest response time

     Summary
       • Scalable Internet servers
           • partitioning
           • replication
       • Load balancing for Internet servers
           • DNS rotation
           • cooperative offloading (w. TCP handoff)
           • Load balancing router
       • Changes required on the components:
           • DNS server??
           • Web server??
           • client??
           • router??
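The three load balancing policies listed above can be sketched as small selection functions. This is a hedged illustration, not a router implementation: the function names, the server labels, and the assumption that the router already tracks active request counts and recent response times per server are all hypothetical.

```python
from itertools import cycle

def make_rotation(servers):
    """Simple rotation: return a picker that cycles through the servers in order."""
    it = cycle(servers)
    return lambda: next(it)

def least_active(active_requests):
    """Pick the server with the fewest active requests (first listed wins ties)."""
    return min(active_requests, key=active_requests.get)

def shortest_response(response_times):
    """Pick the server with the shortest recently observed response time."""
    return min(response_times, key=response_times.get)

if __name__ == "__main__":
    pick = make_rotation(["s2", "s3", "s4"])
    print([pick() for _ in range(4)])
    print(least_active({"s2": 3, "s3": 1, "s4": 2}))
    print(shortest_response({"s2": 0.20, "s3": 0.50, "s4": 0.10}))
```

Simple rotation needs no runtime state from the servers, while the other two policies depend on measurements the router must collect, which ties back to the question of how deep into the protocol stack the router looks.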
