HTTP at your local BigCo
    HTTP at your local BigCo: Presentation Transcript

    • HTTP at your local BigCo: How the internet sausage gets made
      Peter Griess
      @pgriess
    • Goals and non-goals
      Basics of TCP/IP, DNS and HTTP and how they work together; pitfalls and optimizations
      A 1,000-foot view of scaling out HTTP infrastructure
      All manner of load balancing / traffic shaping
      Living on the edge
      Not: how to make a fast application (database access, rendering performance, etc)
    • Background: DNS
      Map hostnames to IP(s)
      www.facebook.com => 69.171.229.12, 69.171.228.40
      Resolution process
      Recursion (and what does the DNS server see?)
      Caching
      Latencies: on-host, cached in LAN, cached at ISP, miss
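The caching bullet can be illustrated with a minimal sketch of a TTL-honouring cache, roughly what each layer (host, LAN, ISP) does before falling through to the next. `TtlCache` and `fake_upstream` are hypothetical names, and the 30-second TTL is an assumption; only the IPs come from the slide.

```python
import time


class TtlCache:
    """Tiny hostname -> IPs cache that honours a per-record TTL."""

    def __init__(self, upstream, clock=time.monotonic):
        self.upstream = upstream   # callable: hostname -> (ips, ttl_seconds)
        self.clock = clock
        self.entries = {}          # hostname -> (ips, expires_at)
        self.misses = 0

    def resolve(self, hostname):
        now = self.clock()
        hit = self.entries.get(hostname)
        if hit and hit[1] > now:
            return hit[0]          # served locally, no upstream round trip
        self.misses += 1           # expired or never seen: ask upstream
        ips, ttl = self.upstream(hostname)
        self.entries[hostname] = (ips, now + ttl)
        return ips


def fake_upstream(hostname):
    # Hypothetical answer standing in for a real recursive lookup.
    return (["69.171.229.12", "69.171.228.40"], 30)
```

Every miss costs a trip to the next, slower layer, which is why the latency tiers above differ so much.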
    • Background: TCP
      Stateful protocol
      Negotiated by a synchronous 3-way handshake:
      2xRTT before the first response byte arrives (1 for the handshake, 1 for the request)!
      e.g. USA => South America ~250ms RTT
      Seamless failover is hard (but not impossible)
      Load balancing must be aware of flows
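A rough sketch of the handshake arithmetic, assuming zero server think time, no TLS and no packet loss. The function name is made up for illustration; the 250 ms figure is the one from the slide.

```python
def time_to_first_response_byte_ms(rtt_ms, reuse_connection=False):
    """Estimate the wait until the first response byte reaches the client."""
    # SYN out, SYN/ACK back: one full round trip unless a connection is reused.
    handshake = 0 if reuse_connection else rtt_ms
    # Request out (sent along with the final ACK), first response byte back.
    request_response = rtt_ms
    return handshake + request_response
```

With a 250 ms RTT, `time_to_first_response_byte_ms(250)` gives 500, and reusing an open connection halves that to 250.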
    • Background: HTTP
      Layered on top of TCP/TLS
      Has some useful bits
      Compression
      Connection re-use
      Pipelining
      Caching
      Kind of sucks
      Headers on all requests/responses
      Compression on bodies only
      Pipelining has to be disabled most of the time
      Pipelining suffers from head-of-line blocking
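Head-of-line blocking can be sketched as a toy model, assuming the server works on all pipelined requests concurrently but has to return responses in request order. The function name and the timings are hypothetical.

```python
def pipelined_completion_times(ready_times):
    """ready_times[i] is when response i is ready on the server.

    On one pipelined connection, response i cannot be delivered
    before response i-1, so one slow response delays all later ones.
    """
    delivered_at = 0.0
    result = []
    for ready in ready_times:
        delivered_at = max(delivered_at, ready)
        result.append(delivered_at)
    return result
```

For example, `pipelined_completion_times([5.0, 0.1, 0.1])` returns `[5.0, 5.0, 5.0]`: two fast responses sit behind the slow one.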
    • mycutekittens.tv
      68.193.17.4
      Big bad internet
      HTTP
    • Problem?
    • Problem
      Availability
      Server goes down (kernel panic?)
      Network goes down (cable cut?)
      Datacenter goes down (EC2?)
      Overload
      Shed load (good, can be transparent)
      Get infinitely slow (not good)
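One simple way to shed load instead of getting infinitely slow is to cap in-flight requests and reject the rest immediately. This is a minimal single-threaded sketch; `LoadShedder` and its limit are hypothetical, and a real server would need locking plus a sensible rejection response (typically a 503).

```python
class LoadShedder:
    """Admit at most max_in_flight requests; reject the rest right away."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.in_flight = 0

    def try_acquire(self):
        if self.in_flight >= self.max_in_flight:
            return False       # caller should fail fast instead of queueing
        self.in_flight += 1
        return True

    def release(self):
        self.in_flight -= 1    # call when an admitted request finishes
```

Rejected requests cost almost nothing, so the requests that are admitted keep a bounded latency.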
    • mycutekittens.tv: multi-server
      Big bad internet
      ???
    • We have options
      DNS load balancing
      IP load balancing
      HTTP load balancing
    • DNS load balancing
      mycutekittens.tv resolves to IPs: A, B, C, D
      Add new IPs to scale out
      Remove IPs when hosts go down
      Benefits
      Don’t need extra hardware to do load balancing
      Can span datacenters
      DNS servers are cheap / fast
      Drawbacks
      Hotspots due to caching
      Hotspots due to ordering in result list
      Hotspots due to resolver size (one big ISP resolver puts many users behind a single cached answer)
      TTL / flexibility trade-off
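A minimal sketch of round-robin DNS, assuming the server rotates the record list on every query so that clients which pick the first record spread out. `RoundRobinZone` is a hypothetical name; A to D stand in for the IPs as on the slide.

```python
from collections import deque


class RoundRobinZone:
    """Hands out the same record set, rotated by one on each query."""

    def __init__(self, ips):
        self.ips = deque(ips)

    def answer(self):
        result = list(self.ips)
        self.ips.rotate(-1)    # next query sees a different first record
        return result

    def remove(self, ip):
        self.ips.remove(ip)    # host went down: stop advertising it
```

Removal only helps once cached answers expire, which is the TTL / flexibility trade-off: a low TTL reacts quickly but sends more queries to the DNS servers.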
    • mycutekittens.tv: DNS
      Big bad internet
      DNS Server
      DNS
      68.193.17.4
      68.193.17.5
      68.193.17.6
    • IP load balancing (1)
      mycutekittens.tv resolves to 1 public IP owned by an IP load balancer
      Add new backend hosts w/ private IPs to scale out
      Load balancer health-checks hosts actively or passively to avoid dead hosts
      Scheduling policies vs. failover
      Direct server return (DSR)
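A sketch of one possible flow-aware scheduling policy: hash the connection 4-tuple so every packet of a TCP flow lands on the same healthy backend. The function name and the client address are assumptions for illustration; real balancers usually also keep a connection table or use consistent hashing, because plain modulo hashing remaps flows whenever pool membership changes.

```python
import hashlib


def pick_backend(flow, backends, healthy):
    """flow is (src_ip, src_port, dst_ip, dst_port)."""
    candidates = [b for b in backends if b in healthy]
    if not candidates:
        raise RuntimeError("no healthy backends")
    # A stable hash of the flow keeps all of its packets on one backend.
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return candidates[int.from_bytes(digest[:8], "big") % len(candidates)]
```

Usage: `pick_backend(("203.0.113.7", 51000, "68.193.17.4", 80), ["10.0.0.1", "10.0.0.2", "10.0.0.3"], healthy={"10.0.0.1", "10.0.0.3"})` never returns the unhealthy `10.0.0.2`.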
    • IP load balancing (2)
      Benefits
      Only 1 public IP (high DNS TTL)
      Backend network capacity/membership transparent to the internet
      Cheap-ish
      Failover is possible, not insanely difficult
      Drawbacks
      Can’t do what you can with HTTP
    • mycutekittens.tv: IP
      10.0.0.1
      Big bad internet
      10.0.0.2
      GW
      68.193.17.4
      10.0.0.3
      LB
    • HTTP load balancing (1)
      mycutekittens.tv resolves to 1 public IP owned by an HTTP load balancer
      Largely same as IP load balancing
      Terminates TCP connections (sees all bytes)
      Can make routing decisions based on HTTP
      Can autonomously serve requests (caching, access control, etc)
      Examples:
      Send requests for /foo/* to pool A
      401 requests without cookie Q
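The two example rules could look roughly like this in an HTTP load balancer. This is a sketch, assuming the cookie check runs first and applies to every request; `pool-default` is a hypothetical fallback pool that is not on the slide.

```python
def route(path, cookies):
    """Return ("reject", status) or ("forward", pool_name)."""
    if "Q" not in cookies:
        return ("reject", 401)             # answered by the LB itself
    if path.startswith("/foo/"):
        return ("forward", "pool-A")       # /foo/* goes to pool A
    return ("forward", "pool-default")     # hypothetical catch-all pool
```

Neither rule is possible at the IP layer, since the balancer has to terminate the connection and parse the request to see the path and cookies.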
    • HTTP load balancing (2)
      Benefits
      Largely the same as IP
      More flexible rules
      Can terminate TLS (security+, cost+)
      Drawbacks
      No DSR
      Failover difficult
      Not as performant as IP
    • mycutekittens.tv: HTTP
      10.0.0.1
      Big bad internet
      10.0.0.2
      68.193.17.4
      LB
      HTTP(S)
      10.0.0.3
    • mycutekittens.tv: MOAR
      Eventually a single LB is going to be a problem
      Not enough capacity
      Availability
      Turtles all the way down
      LB of LBs!
      DNS load balancing between datacenters

    • HTTPS: myths and reality
      Too computationally expensive
      Only a few percent (imperialviolet.org); is your webserver actually CPU bound? doubt it
      SSL acceleration cards, GPUs, etc
      Too much latency
      Handshaking is 5-7xRTT
      Session resume
      False start
      Snap start
      Caching breaks
    • My latency is huge in Japan
      RTT to the USA (or any single DC) can be huge
      Re-use connections (Connection: keep-alive)
      Send work in parallel (pipelining)
      Use compression (Content-Encoding)
      Lots of tricks for static resources (bundling, CDNs, caching, etc)
      Pre-fetch data
    • Let’s get crazy: SPDY
      Don’t limit yourself to HTTP; use a different protocol
      SPDY developed by Google, supported by Chrome, google.com (and soon facebook.com)
      Connection re-use w/o head-of-line blocking
      Headers always compressed
      Always SSL (but breaks caching)
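To show why always-compressed headers pay off, here is a sketch that pushes repeated header blocks through one shared zlib stream, which is the general idea behind SPDY's header compression. It is not the SPDY wire format, and the header values are made up.

```python
import zlib


def compress_header_blocks(blocks):
    """Compress each block with one shared zlib context; return the sizes."""
    comp = zlib.compressobj()
    sizes = []
    for block in blocks:
        # Z_SYNC_FLUSH emits the block now but keeps the history window,
        # so later blocks can back-reference earlier ones.
        out = comp.compress(block) + comp.flush(zlib.Z_SYNC_FLUSH)
        sizes.append(len(out))
    return sizes


headers = (b"host: mycutekittens.tv\r\n"
           b"user-agent: ExampleBrowser/1.0 (hypothetical)\r\n"
           b"accept: text/html,application/xhtml+xml\r\n"
           b"cookie: session=0123456789abcdef\r\n")
sizes = compress_header_blocks([headers, headers])
```

The second entry in `sizes` comes out much smaller than the first, because headers that repeat on every request compress to little more than a back-reference.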
    • Let’s get crazy: TCP termination
      Synchronous RTTs: the silent killer
      Opening new TCP connections is very costly
      Run proxies close to users and proxy traffic back to core using optimized protocol
      Low RTT to proxy
      Do SPDY-like tricks between edge + core
      Potentially faster network to core than public internet
      Advertise these proxies via DNS
      Geo-targeting
      AS-adjacency
      Akamai CDN does this, sort of
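A back-of-the-envelope sketch of why terminating TCP at the edge helps, assuming the edge already holds a warm connection to the core and ignoring server time and TLS. The function names are hypothetical.

```python
def direct_ms(rtt_user_core):
    # TCP handshake plus request/response, both across the long path.
    return 2 * rtt_user_core


def via_edge_ms(rtt_user_edge, rtt_edge_core):
    # Handshake and request/response with the nearby edge proxy, plus
    # one trip over the already-open edge -> core connection.
    return 2 * rtt_user_edge + rtt_edge_core
```

With made-up figures of 250 ms user to core, 20 ms user to edge and 240 ms edge to core, `direct_ms(250)` is 500 while `via_edge_ms(20, 240)` is 280: the synchronous round trips happen over the short hop.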
    • Let’s get crazy: DNS anycast
      Remember how DNS resolutions were slow?
      DNS servers could be far away from a user
      Advertise multiple network routes for the same DNS IP, let the IP stack pick the closest one
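Anycast route selection can be caricatured as "shortest advertised path wins". This is a toy sketch with hypothetical site names and path lengths; real BGP route selection weighs more than path length.

```python
def pick_route(advertisements):
    """advertisements: (site, as_path_length) pairs, all for the same IP."""
    return min(advertisements, key=lambda ad: ad[1])[0]
```

For a user in Japan, `pick_route([("us-east", 4), ("tokyo", 1), ("london", 3)])` returns `"tokyo"`, so the DNS query never crosses the ocean.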