Katz, Stoica F04 EE122: DNS and the Web



  1. 1. EE122: DNS and the Web (after a little more multicast) October 27, 2003
  2. 2. EECS 122: Introduction to Computer Networks DNS and WWW Computer Science Division Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720-1776
  3. 3. Barriers to Multicast <ul><li>Hard to change IP </li></ul><ul><ul><li>multicast means change to IP </li></ul></ul><ul><ul><li>details of multicast were very hard to get right </li></ul></ul><ul><li>Not always consistent with ISP economic model </li></ul><ul><ul><li>charging done at edge, but single packet from edge can explode into millions of packets within network </li></ul></ul><ul><li>Troublesome security model </li></ul><ul><ul><li>Anyone can send to a group </li></ul></ul><ul><ul><li>Denial-of-service attacks on known groups </li></ul></ul>
  4. 4. Application Layer Multicast (ALM) <ul><li>Let the hosts do all the “special” work </li></ul><ul><ul><li>only require unicast from infrastructure </li></ul></ul><ul><li>Basic idea: </li></ul><ul><ul><li>hosts do the copying of packets </li></ul></ul><ul><ul><li>set up tree between hosts </li></ul></ul><ul><li>Example: Narada [Chu et al., 2000] </li></ul><ul><ul><li>Small group sizes (up to hundreds of nodes) </li></ul></ul><ul><ul><li>Typical application: chat </li></ul></ul>
  5. 5. Narada: End System Multicast [Figure: physical topology with hosts Stan1 and Stan2 (Stanford), Berk1 and Berk2 (Berkeley), CMU, and Gatech, shown next to the “overlay” tree built among the same hosts]
  6. 6. Algorithmic Challenge <ul><li>Choosing replication/forwarding points among hosts </li></ul><ul><ul><li>how do the hosts know about each other </li></ul></ul><ul><ul><li>and know which hosts should forward to other hosts </li></ul></ul>
  7. 7. Advantages of ALM <ul><li>No need for changes to IP or routers </li></ul><ul><li>No need for ISP cooperation </li></ul><ul><li>End hosts can prevent other hosts from sending </li></ul><ul><li>Easy to implement reliability </li></ul><ul><ul><li>use hop-by-hop retransmissions </li></ul></ul>
  8. 8. Performance Concerns <ul><li>Stretch </li></ul><ul><ul><li>ratio of latency in the overlay to latency in the underlying network </li></ul></ul><ul><li>Stress </li></ul><ul><ul><li>number of duplicate packets sent over the same physical link </li></ul></ul>
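The two metrics above can be made concrete with a small sketch. The topology, the router names (`R1`, `R2`, `R3`) and the latency figures are hypothetical and chosen only for illustration; stress is counted here per undirected physical link.

```python
from collections import Counter

def stretch(overlay_latency_ms, unicast_latency_ms):
    """Ratio of latency through the overlay to latency of the direct path."""
    return overlay_latency_ms / unicast_latency_ms

def stress(overlay_edges, physical_path):
    """Count how many overlay edges carry the same packet over each physical link."""
    load = Counter()
    for edge in overlay_edges:
        for link in physical_path[edge]:
            load[frozenset(link)] += 1  # treat links as undirected
    return load

# Hypothetical mapping from overlay edges to the physical links they traverse.
PHYSICAL_PATH = {
    ("CMU", "Stan1"): [("CMU", "R1"), ("R1", "R2"), ("R2", "Stan1")],
    ("Stan1", "Berk1"): [("Stan1", "R2"), ("R2", "R3"), ("R3", "Berk1")],
    ("CMU", "Gatech"): [("CMU", "R1"), ("R1", "Gatech")],
}

link_load = stress(list(PHYSICAL_PATH), PHYSICAL_PATH)
worst_stress = max(link_load.values())
cmu_to_berk1_stretch = stretch(overlay_latency_ms=50.0, unicast_latency_ms=20.0)
```

In this made-up topology the access links of CMU and Stan1 each carry the packet twice, which is the bandwidth wastage the slide refers to.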
  9. 9. Performance Concerns [Figure: the overlay tree among CMU, Stan1, Stan2, Berk1, Berk2, and Gatech drawn over the physical topology (Stanford, Berkeley, CMU, Gatech), annotated “Duplicate Packets: Bandwidth Wastage” and “Delay from CMU to Berk1 increases”]
  10. 10. Single Sender Multicast <ul><li>Many problems with IP multicast disappear if each group is associated with a single source </li></ul><ul><li>Hosts joining multicast group can send join messages to source </li></ul><ul><ul><li>this sets up delivery tree </li></ul></ul><ul><ul><li>no worry about “root” being in wrong place </li></ul></ul><ul><li>This solves several problems: </li></ul><ul><ul><li>better security and charging model </li></ul></ul><ul><ul><li>simple algorithm </li></ul></ul>
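  11. 11. Example <ul><li>Group members: M1, M2, M3 </li></ul>[Figure: source and members M1, M2, M3; control (join) messages flow from the members toward the source, data flows from the source to the members]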
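A minimal sketch of how join messages could set up the delivery tree, assuming each node already knows its unicast next hop toward the source. The routers `R0`, `R1`, `R2` and the `next_hop` table are hypothetical; only the source and members M1, M2, M3 come from the slide.

```python
def join(tree, next_hop, member, source):
    """Walk from member toward source, adding forwarding state hop by hop.

    tree maps a node to the set of children it forwards group data to.
    The walk stops early when it reaches a node that is already on the tree.
    """
    node = member
    while node != source:
        parent = next_hop[node]
        already_on_tree = parent in tree
        tree.setdefault(parent, set()).add(node)
        if already_on_tree:
            break  # graft onto the existing branch; no need to go further
        node = parent
    return tree

# Hypothetical unicast next hops toward the source.
NEXT_HOP = {"M1": "R1", "M2": "R1", "M3": "R2",
            "R1": "R0", "R2": "R0", "R0": "source"}

tree = {}
for member in ("M1", "M2", "M3"):
    join(tree, NEXT_HOP, member, "source")
```

Data then flows from the source down the `tree` entries, in the reverse direction of the joins.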
  12. 12. What’s Wrong with SSM? <ul><li>Multiple sources? </li></ul><ul><ul><li>can set up group per source, or... </li></ul></ul><ul><ul><li>source can serve as relay for other senders </li></ul></ul><ul><li>Algorithm? </li></ul><ul><ul><li>trivial </li></ul></ul><ul><li>So, why isn’t SSM the answer? </li></ul><ul><ul><li>multicast no longer serves as “rendezvous” </li></ul></ul><ul><ul><li>ok for “broadcast” apps, not good for “meeting” apps </li></ul></ul>
  13. 13. What Do You Need to Know? <ul><li>DVMRP </li></ul><ul><li>CBT </li></ul><ul><li>SSM </li></ul><ul><li>How they compare </li></ul>
  14. 14. Today’s Lecture: 17 [Figure: protocol stack (Application, Transport, Network (IP), Link, Physical) annotated with the lecture numbers that cover each layer]
  15. 15. Internet Names & Addresses <ul><li>Names: e.g. ariachne.berkeley.edu </li></ul><ul><ul><li>human-usable labels for machines </li></ul></ul><ul><ul><li>conforms to “organizational” structure </li></ul></ul><ul><li>Addresses: e.g. </li></ul><ul><ul><li>router-usable labels for machines </li></ul></ul><ul><ul><li>conforms to “network” structure </li></ul></ul><ul><li>How do you map from one to another? </li></ul><ul><ul><li>Domain Name System (DNS) </li></ul></ul>
  16. 16. DNS: History <ul><li>Initially all host-address mappings were in a file called hosts.txt (in /etc/hosts) </li></ul><ul><ul><li>Changes were submitted to SRI by email </li></ul></ul><ul><ul><li>New versions of hosts.txt ftp’d periodically from SRI </li></ul></ul><ul><ul><li>An administrator could pick names at their discretion </li></ul></ul><ul><li>As the internet grew this system broke down because: </li></ul><ul><ul><li>SRI couldn’t handle the load </li></ul></ul><ul><ul><li>Names were not unique </li></ul></ul><ul><ul><li>Many hosts had inaccurate copies of hosts.txt </li></ul></ul><ul><li>Internet growth was threatened! </li></ul><ul><ul><li>Domain Name System (DNS) was born </li></ul></ul>
  17. 17. Basic DNS Features <ul><li>Hierarchical namespace </li></ul><ul><ul><li>as opposed to original flat namespace </li></ul></ul><ul><li>Distributed storage architecture </li></ul><ul><ul><li>as opposed to centralized storage (plus replication) </li></ul></ul><ul><li>Client--server interaction on UDP Port 53 </li></ul><ul><ul><li>but can use TCP if desired </li></ul></ul>
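To make the client-server interaction concrete, here is a sketch that builds the bytes of a standard DNS query for an A record, following the RFC 1035 message layout. The function name `build_query` and the query ID are illustrative choices; the sketch only constructs the message and does not send it.

```python
import struct

def build_query(name, query_id=0x1234):
    """Build a DNS query message asking for the A record of name."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN, the Internet class).
    return header + qname + struct.pack("!HH", 1, 1)

query = build_query("www.berkeley.edu")
```

Sending it would be a single UDP datagram to port 53 of a name server, for example with `socket.sendto` on a `SOCK_DGRAM` socket.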
  18. 18. Naming Hierarchy <ul><li>“Top Level Domains” are at the top </li></ul><ul><li>Depth of tree is arbitrary (limit 128) </li></ul><ul><li>Domains are subtrees </li></ul><ul><ul><li>E.g. .edu, berkeley.edu, eecs.berkeley.edu </li></ul></ul><ul><li>Name collisions avoided </li></ul><ul><ul><li>E.g. berkeley.edu and berkeley.com can coexist, but uniqueness is job of domain </li></ul></ul>[Figure: name tree with root at the top; top-level domains edu, com, gov, mil, org, net, uk, fr; below them berkeley, mit, eecs, sims, argus, etc.]
  19. 19. Host names are administered hierarchically <ul><li>A zone corresponds to an administrative authority that is responsible for that portion of the hierarchy </li></ul><ul><li>eecs controls names: x.eecs.berkeley.edu </li></ul><ul><li>berkeley controls names: x.berkeley.edu and y.sims.berkeley.edu </li></ul>[Figure: two copies of the name tree (root; edu, com, gov, mil, org, net, uk, fr; berkeley, mit, eecs, sims, argus) with the zones marked]
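The zone rule above amounts to a longest-suffix match: a name belongs to the most specific enclosing domain that is run as its own zone. A small sketch, assuming the zone set implied by the slide (eecs is its own zone, sims is not); the function names are illustrative.

```python
def enclosing_domains(name):
    """List the domains that contain name, from most to least specific."""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]

def responsible_zone(name, zones):
    """Return the most specific zone in zones that contains name."""
    for domain in enclosing_domains(name):
        if domain in zones:
            return domain
    return "."  # nothing more specific: the root zone

# Assumed zone cuts, following the slide's example.
ZONES = {"edu", "berkeley.edu", "eecs.berkeley.edu"}
```

With these zones, `x.eecs.berkeley.edu` falls under `eecs.berkeley.edu`, while `y.sims.berkeley.edu` falls under `berkeley.edu`.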
  20. 20. Server Hierarchy <ul><li>Each server has authority over a portion of the hierarchy </li></ul><ul><ul><li>A server maintains only a subset of all names </li></ul></ul><ul><li>Each server contains all the records for the hosts in its zone </li></ul><ul><ul><li>might be replicated for robustness </li></ul></ul><ul><li>Each server needs to know other servers that are responsible for the other portions of the hierarchy </li></ul><ul><ul><li>Every server knows the root </li></ul></ul><ul><ul><li>Root server knows about all top-level domains </li></ul></ul>
  21. 21. DNS Name Servers <ul><li>Local name servers: </li></ul><ul><ul><li>Each ISP (company) has local default name server </li></ul></ul><ul><ul><li>Host DNS query first goes to local name server </li></ul></ul><ul><li>Authoritative name servers: </li></ul><ul><ul><li>For a host: stores that host’s (name, IP address) </li></ul></ul><ul><ul><li>Can perform name/address translation for that host’s name </li></ul></ul><ul><li>Can also do IP to name translation, but won’t discuss </li></ul>
  22. 22. DNS: Root Name Servers <ul><li>Contacted by local name server that cannot resolve name </li></ul><ul><li>Root name server: </li></ul><ul><ul><li>Contacts authoritative name server if name mapping not known </li></ul></ul><ul><ul><li>Gets mapping </li></ul></ul><ul><ul><li>Returns mapping to local name server </li></ul></ul><ul><li>About a dozen root name servers worldwide </li></ul>
  23. 23. Simple DNS Example <ul><li>Host whistler.cs.cmu.edu wants IP address of www.berkeley.edu </li></ul><ul><li>1. Contacts its local DNS server, mango.srv.cs.cmu.edu </li></ul><ul><li>2. mango.srv.cs.cmu.edu contacts root name server, if necessary </li></ul><ul><li>3. Root name server contacts authoritative name server, ns1.berkeley.edu, if necessary </li></ul>[Figure: messages 1-6 exchanged among requesting host whistler.cs.cmu.edu, local name server mango.srv.cs.cmu.edu, root name server, and authoritative name server ns1.berkeley.edu]
  24. 24. Example of Recursive DNS Query <ul><li>Root name server: </li></ul><ul><li>May not know authoritative name server </li></ul><ul><li>May know intermediate name server: who to contact to find authoritative name server? </li></ul><ul><li>Recursive query: </li></ul><ul><li>Puts burden of name resolution on contacted name server </li></ul><ul><li>Heavy load? </li></ul>[Figure: messages 1-8 exchanged among requesting host whistler.cs.cmu.edu, local name server mango.srv.cs.cmu.edu, root name server, intermediate name server (edu server), and authoritative name server ns1.berkeley.edu]
  25. 25. Example of Iterated DNS Query <ul><li>Iterated query: </li></ul><ul><li>Contacted server replies with name of server to contact </li></ul><ul><li>“I don’t know this name, but ask this server” </li></ul>[Figure: messages 1-8 exchanged among requesting host whistler.cs.cmu.edu, local name server mango.srv.cs.cmu.edu, root name server, intermediate name server (edu server), and authoritative name server ns1.berkeley.edu, with the iterated queries issued by the local name server]
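The difference between the two query styles can be sketched with an in-memory model of the three servers from the figures. The `SERVERS` table, the label `edu-server`, and the address `192.0.2.10` (a documentation-only address) are assumptions for illustration; no real DNS traffic is involved.

```python
# Each server either holds the record or knows which server to refer to.
SERVERS = {
    "root": {"referrals": {"edu": "edu-server"}, "records": {}},
    "edu-server": {"referrals": {"berkeley.edu": "ns1.berkeley.edu"}, "records": {}},
    "ns1.berkeley.edu": {"referrals": {},
                         "records": {"www.berkeley.edu": "192.0.2.10"}},
}

def answer(server, name):
    """One query to one server: ('answer', address) or ('referral', next_server)."""
    data = SERVERS[server]
    if name in data["records"]:
        return ("answer", data["records"][name])
    for suffix, next_server in data["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            return ("referral", next_server)
    raise LookupError(name)

def resolve_iterated(name):
    """The local name server follows every referral itself."""
    server, contacted = "root", []
    while True:
        contacted.append(server)
        kind, value = answer(server, name)
        if kind == "answer":
            return value, contacted
        server = value  # "I don't know this name, but ask this server"

def resolve_recursive(name, server="root"):
    """Each contacted server takes on the work of finishing the resolution."""
    kind, value = answer(server, name)
    if kind == "answer":
        return value
    return resolve_recursive(name, value)
```

Both styles return the same mapping; they differ in which server carries the load, which is why the recursive slide asks about heavy load at the root.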
  26. 26. DNS Records <ul><li>Four fields: (name, value, type, TTL) </li></ul><ul><li>Type = A: </li></ul><ul><ul><li>name = hostname </li></ul></ul><ul><ul><li>value = IP address </li></ul></ul><ul><li>Type = NS: </li></ul><ul><ul><li>name = domain </li></ul></ul><ul><ul><li>value = name of DNS server for domain </li></ul></ul>
  27. 27. DNS Records (cont’d) <ul><li>Type = CNAME: </li></ul><ul><ul><li>name = hostname </li></ul></ul><ul><ul><li>value = canonical name </li></ul></ul><ul><li>Type = MX: </li></ul><ul><ul><li>name = domain in email address </li></ul></ul><ul><ul><li>value = canonical name of mail server </li></ul></ul>
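A sketch of how the four record types fit together, using the (name, value, type, TTL) tuple from the slides. The hosts `web1.berkeley.edu` and `mail.berkeley.edu`, all addresses, and all TTLs are made up; real MX records also carry a preference value that this simplified model omits.

```python
from typing import NamedTuple

class Record(NamedTuple):
    name: str
    value: str
    type: str
    ttl: int  # seconds

# Hypothetical zone contents for illustration only.
RECORDS = [
    Record("berkeley.edu", "ns1.berkeley.edu", "NS", 86400),
    Record("ns1.berkeley.edu", "192.0.2.53", "A", 86400),
    Record("www.berkeley.edu", "web1.berkeley.edu", "CNAME", 3600),
    Record("web1.berkeley.edu", "192.0.2.10", "A", 3600),
    Record("berkeley.edu", "mail.berkeley.edu", "MX", 3600),
    Record("mail.berkeley.edu", "192.0.2.25", "A", 3600),
]

def lookup(name, rtype):
    """Return the values of all records matching name and type."""
    return [r.value for r in RECORDS if r.name == name and r.type == rtype]

def address_of(name, max_aliases=8):
    """Resolve a hostname to an address, following CNAME aliases."""
    for _ in range(max_aliases):
        addresses = lookup(name, "A")
        if addresses:
            return addresses[0]
        aliases = lookup(name, "CNAME")
        if not aliases:
            return None
        name = aliases[0]  # continue with the canonical name
    return None

def mail_server_address(domain):
    """Find the address of the mail server for the domain part of an email address."""
    servers = lookup(domain, "MX")
    return address_of(servers[0]) if servers else None
```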
  28. 28. DNS as Indirection Service <ul><li>Can refer to machines by name, not address </li></ul><ul><ul><li>not only easier for humans </li></ul></ul><ul><ul><li>also allows machines to change IP addresses without having to change way you refer to machine </li></ul></ul><ul><li>Can refer to machines by alias </li></ul><ul><ul><li>www.berkeley.edu can be generic web server </li></ul></ul><ul><ul><li>but DNS can point this to particular machine that can change over time </li></ul></ul><ul><li>But, this flexibility applies only within domain! </li></ul>
  29. 29. Special Topics <ul><li>DNS caching </li></ul><ul><ul><li>improve performance by saving results of previous lookups </li></ul></ul><ul><li>DNS “hacks” </li></ul><ul><ul><li>return records based on requesting IP address </li></ul></ul><ul><li>dynamic DNS </li></ul><ul><ul><li>allows remote updating of IP address for mobile hosts </li></ul></ul><ul><li>DNS politics (ICANN) and branding battles </li></ul>
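DNS caching can be sketched as a table whose entries expire after the TTL carried in each record. The class name `DnsCache` and the injectable clock are illustrative design choices, not part of any DNS specification.

```python
import time

class DnsCache:
    """Remember lookup results until their TTL runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # name -> (value, expiry time)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None          # never cached: must query a server
        value, expires = entry
        if self._clock() >= expires:
            del self._entries[name]
            return None          # stale: must query again
        return value

# Usage with a fake clock (times in seconds).
now = [0.0]
cache = DnsCache(clock=lambda: now[0])
cache.put("www.berkeley.edu", "192.0.2.10", ttl=300)
fresh = cache.get("www.berkeley.edu")
now[0] = 301.0
stale = cache.get("www.berkeley.edu")
```

A short TTL lets a name be repointed quickly at the cost of more queries; a long TTL does the opposite.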
  30. 30. Important Properties of DNS <ul><li>Administrative delegation and distributed server architecture result in: </li></ul><ul><li>Easy unique naming </li></ul><ul><li>Fate sharing for network failures </li></ul><ul><li>Reasonable trust model </li></ul>
  31. 31. The Web <ul><li>A distributed database of “pages” </li></ul><ul><li>Core components: </li></ul><ul><ul><li>Servers: store files and execute remote commands </li></ul></ul><ul><ul><li>Browsers: retrieve and display “pages” </li></ul></ul><ul><ul><li>URLs: way to refer to pages </li></ul></ul><ul><li>Need a protocol to transfer information between clients and servers </li></ul><ul><ul><li>HTTP </li></ul></ul>
  32. 32. Uniform Resource Locator <ul><li>protocol://host-name:port/directory-path/resource </li></ul><ul><li>Extend the idea of hierarchical namespaces to include anything in a file system </li></ul><ul><ul><li>ftp://www.eecs.berkeley.edu/122/Lecture6/presentation.ppt </li></ul></ul><ul><li>Extend to program executions as well… </li></ul><ul><ul><li>http://us.f413.mail.yahoo.com/ym/ShowLetter?box=%40B%40Bulk&MsgId=2604_1744106_29699_1123_1261_0_28917_3552_1289957100&Search=&Nhead=f&YY=31454&order=down&sort=date&pos=0&view=a&head=b </li></ul></ul><ul><ul><li>Server side processing can be incorporated in the name </li></ul></ul>
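The URL structure above can be taken apart with Python's standard `urllib.parse` module. The example URL is a hypothetical combination of the slide's host and path with an added port and two of the query parameters from the mail example.

```python
from urllib.parse import urlsplit, parse_qs

def split_url(url):
    """Break a URL into the protocol://host-name:port/path?query pieces."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,                # None when the URL omits the port
        "path": parts.path,
        "params": parse_qs(parts.query),   # percent-escapes are decoded
    }

pieces = split_url(
    "http://www.eecs.berkeley.edu:8080/122/Lecture6/presentation.ppt"
    "?box=%40B%40Bulk&sort=date"
)
```

The query part is how input to server-side processing gets folded into the name: `%40` decodes to `@`, so the `box` parameter arrives as `@B@Bulk`.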
  33. 33. Web and DNS <ul><li>URLs use hostnames </li></ul><ul><li>Thus, content names are tied to specific hosts </li></ul><ul><li>This is bad! </li></ul><ul><li>URNs are one proposal to achieve persistence </li></ul>
  34. 34. Hyper Text Transfer Protocol <ul><li>Client-server architecture </li></ul><ul><li>Synchronous request/reply protocol </li></ul><ul><ul><li>Runs over TCP, Port 80 </li></ul></ul><ul><li>Stateless </li></ul><ul><li>Uses unicast </li></ul>
  35. 35. Big Picture [Figure: client-server timeline: client sends TCP SYN, server replies TCP SYN + ACK, client sends TCP ACK + HTTP GET; phases labeled establish connection, client request, request response, ..., close connection]
  36. 36. Hyper Text Transfer Protocol Commands <ul><li>GET – transfer resource from given URL </li></ul><ul><li>HEAD – GET resource metadata (headers) only </li></ul><ul><li>PUT – store/modify resource under given URL </li></ul><ul><li>DELETE – remove resource </li></ul><ul><li>POST – provide input for a process identified by the given URL (usually used to post CGI parameters) </li></ul>
  37. 37. Response Codes <ul><li>1xx informational </li></ul><ul><li>2xx success </li></ul><ul><li>3xx redirection </li></ul><ul><li>4xx client error in request </li></ul><ul><li>5xx server error; can’t satisfy the request </li></ul>
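Since the class of a response is given by the first digit of its three-digit status code, a client can classify any code with integer division. A minimal sketch; the function name is illustrative.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def status_class(code):
    """Map a three-digit HTTP status code to its class via the leading digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")
```

For example, 200 is a success, 304 ("not modified") is a redirection-class response, and 404 is a client error.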
  38. 38. Client Request <ul><li>Steps to get the resource: http://www.eecs.berkeley.edu/index.html </li></ul><ul><ul><li>Use DNS to obtain the IP address of www.eecs.berkeley.edu </li></ul></ul><ul><ul><li>Send an HTTP request to that address: </li></ul></ul>GET /index.html HTTP/1.0
  39. 39. Server Response
      HTTP/1.0 200 OK
      Content-Type: text/html
      Content-Length: 1234
      Last-Modified: Mon, 19 Nov 2001 15:31:20 GMT

      <HTML> <HEAD> <TITLE>EECS Home Page</TITLE> </HEAD> … </BODY> </HTML>
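The request and response formats from the last two slides can be sketched as a pair of helpers: one that builds the GET request text and one that splits a response into status line, headers, and body. The helper names are illustrative, and the sample body is shortened; a real client would also handle partial reads from the socket.

```python
def build_request(path, version="HTTP/1.0"):
    """Build a minimal GET request: request line followed by an empty line."""
    return f"GET {path} {version}\r\n\r\n".encode("ascii")

def parse_response(raw):
    """Split a raw response into (version, code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        key, _, value = line.partition(":")      # split at the first colon only
        headers[key.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body

SAMPLE_RESPONSE = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 1234\r\n"
    b"Last-Modified: Mon, 19 Nov 2001 15:31:20 GMT\r\n"
    b"\r\n"
    b"<HTML> <HEAD> <TITLE>EECS Home Page</TITLE> </HEAD>"
)

request = build_request("/index.html")
version, code, reason, headers, body = parse_response(SAMPLE_RESPONSE)
```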
  40. 40. Example (from Kurose and Ross) <ul><li>http://www.mylife.org/mypictures.htm </li></ul><ul><li>After finding out the IP address of the host… </li></ul><ul><li>HTTP client initiates a TCP connection to the server on port 80 </li></ul><ul><li>Client sends the GET request via the socket established in step 1 </li></ul><ul><li>Server sends the HTML file, which is encapsulated in its response </li></ul><ul><li>HTTP server tells TCP to terminate the connection </li></ul><ul><li>HTTP client receives the file and the browser parses it; it contains ten JPEG images </li></ul><ul><li>Client repeats steps 1-4 for each of the ten images </li></ul>
  41. 41. HTTP/1.0 Example [Figure: client-server timeline: request image 1, transfer image 1; request image 2, transfer image 2; request text, transfer text; finish display page]
  42. 42. HTTP/1.0 Performance <ul><li>Create a new TCP connection for each resource </li></ul><ul><ul><li>Large number of embedded objects in a web page </li></ul></ul><ul><ul><li>Many short lived connections </li></ul></ul><ul><li>TCP transfer </li></ul><ul><ul><li>Too slow for small objects </li></ul></ul><ul><ul><li>May never exit slow-start phase </li></ul></ul><ul><li>Connections may be set up in parallel (5 is default in most browsers) </li></ul>
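A back-of-the-envelope model of the cost of one connection per resource, counted in round-trip times (RTTs). It assumes each object costs one RTT for the TCP handshake plus one RTT for the request and response, and it ignores transmission time and slow start, so it understates the real cost.

```python
import math

def http10_page_rtts(num_embedded, parallel=1):
    """RTTs to fetch a page plus its embedded objects with HTTP/1.0.

    The base page must arrive first; embedded objects are then fetched
    in batches of up to `parallel` simultaneous connections.
    """
    rtts_per_object = 2                      # handshake + request/response
    batches = math.ceil(num_embedded / parallel)
    return rtts_per_object + batches * rtts_per_object

serial = http10_page_rtts(10)                # one connection at a time
five_parallel = http10_page_rtts(10, parallel=5)
```

For a page with ten embedded images, this model gives 22 RTTs with serial connections and 6 RTTs with five parallel connections.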
  43. 43. HTTP/1.0 Caching <ul><li>Exploit locality of reference </li></ul><ul><li>A modifier to the GET request: </li></ul><ul><ul><li>If-modified-since – return a “not modified” response if resource was not modified since specified time </li></ul></ul><ul><li>A response header: </li></ul><ul><ul><li>Expires – specify to the client for how long it is safe to cache the resource </li></ul></ul><ul><li>A request directive: </li></ul><ul><ul><li>No-cache – ignore all caches and get resource directly from server </li></ul></ul><ul><li>These features can be best taken advantage of with HTTP proxies </li></ul><ul><ul><li>Locality of reference increases if many clients share a proxy </li></ul></ul>
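The If-modified-since mechanism can be sketched from the server's side: compare the resource's modification time with the time the client sent, and answer "not modified" (status 304) when the cached copy is still current. The function name is illustrative; date parsing uses `email.utils.parsedate_to_datetime` from the standard library.

```python
from email.utils import parsedate_to_datetime

def conditional_get_status(last_modified, if_modified_since=None):
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_modified_since is None:
        return 200                      # plain GET: always send the resource
    resource_time = parsedate_to_datetime(last_modified)
    cached_time = parsedate_to_datetime(if_modified_since)
    if resource_time <= cached_time:
        return 304                      # not modified: send headers only
    return 200                          # changed since then: send a fresh copy

LAST_MODIFIED = "Mon, 19 Nov 2001 15:31:20 GMT"
```

A 304 response carries no body, so a cache or proxy can revalidate its copy for the cost of one small exchange.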
  44. 44. Web Proxies <ul><li>Intermediaries between client and server </li></ul>[Figure: clients 1, 2, ..., N connect through proxies to the server]
  45. 45. HTTP/1.1 (1996) <ul><li>Performance: </li></ul><ul><ul><li>Persistent connections </li></ul></ul><ul><ul><li>Pipelined requests/responses </li></ul></ul><ul><ul><li>… </li></ul></ul><ul><li>Support for virtual hosting </li></ul><ul><li>Efficient caching support </li></ul><ul><ul><li>Network Cache assumed more explicitly in the design </li></ul></ul><ul><ul><li>Gives more control to the server on how it wants data cached </li></ul></ul>
  46. 46. Persistent Connections <ul><li>Allow multiple transfers over one connection </li></ul><ul><li>Avoid multiple TCP connection setups </li></ul><ul><li>Avoid multiple TCP slow starts </li></ul>
  47. 47. Pipelined Requests/Responses <ul><li>Buffer requests and responses to reduce the number of packets </li></ul><ul><li>Multiple requests can be contained in one TCP segment </li></ul><ul><li>Note: order of responses has to be maintained </li></ul>[Figure: client-server timeline: requests 1, 2, 3 sent back to back, followed by transfers 1, 2, 3 in the same order]
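A rough RTT count shows what persistent connections and pipelining buy. The model assumes one RTT for the single TCP handshake, one RTT per request/response exchange, and that all pipelined requests for the embedded objects fit in one exchange; transmission time and slow start are ignored.

```python
def persistent_page_rtts(num_embedded, pipelined):
    """RTTs to fetch a page plus embedded objects over one HTTP/1.1 connection."""
    setup = 1                                  # single TCP handshake
    base_page = 1                              # request/response for the HTML
    if pipelined and num_embedded > 0:
        embedded = 1                           # all requests sent back to back
    else:
        embedded = num_embedded                # one exchange per object
    return setup + base_page + embedded

without_pipelining = persistent_page_rtts(10, pipelined=False)
with_pipelining = persistent_page_rtts(10, pipelined=True)
```

For a page with ten embedded images, this gives 12 RTTs without pipelining and 3 RTTs with it, under the stated assumptions.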
  48. 48. What You Need to Know <ul><li>DNS: record types, and how they are used </li></ul><ul><li>HTTP basics (and essential differences between 1.0 and 1.1) </li></ul>
  49. 49. What’s the Moral of this Story? <ul><li>QoS and IP Multicast: </li></ul><ul><ul><li>interesting algorithmic and architectural issues </li></ul></ul><ul><ul><li>thousands of academic papers </li></ul></ul><ul><ul><li>ubiquitous in routers, but not deployed by ISPs </li></ul></ul><ul><ul><li>little or no impact on end users </li></ul></ul><ul><li>DNS and the Web: </li></ul><ul><ul><li>no research papers on topic before deployment </li></ul></ul><ul><ul><li>really boring designs </li></ul></ul><ul><ul><li>they changed the world.... </li></ul></ul>