INFO 330 Computer Networking Technology I  Chapter 2 The Application Layer  Glenn Booker INFO 330 Chapter 2
FTP <ul><li>The File Transfer Protocol is one of the oldest Internet applications (now RFC 959, but started as RFC 114 in ...
FTP <ul><li>FTP uses TCP, and usually connects to the server on ports 20 and 21 </li></ul><ul><li>The client sends user ID...
FTP <ul><li>Commands and replies are very basic </li></ul><ul><ul><li>Most commands are three or four-letter abbreviations...
Electronic Mail <ul><li>E-mail is another ancient Internet application, with origins in RFC 772 in 1980 </li></ul><ul><li>...
Electronic Mail <ul><li>Email is composed in a client, which sends it to a mail queue in the sender’s mail server  </li></...
Electronic Mail <ul><li>Each user has a mailbox on the mail server </li></ul><ul><ul><li>Access to the mailbox is controll...
SMTP <ul><li>After the TCP connection is established, SMTP does a handshake with port 25 of  the recipient’s mail server <...
SMTP <ul><li>Other commands include ( with comments in italics ) </li></ul><ul><ul><li>RSET  (abort current transaction) <...
SMTP vs HTTP <ul><li>SMTP and HTTP can both move files using persistent TCP connections </li></ul><ul><ul><li>SMTP  pushes...
Mail Message Formats <ul><li>Email contains header information defined  by RFC 822 (Standard for ARPA Internet Text Messag...
Mail Message Formats <ul><ul><li>Other allowable header fields are:  SUBJECT, COMMENTS, ENCRYPTED, and possibly some exten...
MIME <ul><li>Multipurpose Internet Mail Extensions (MIME) are used for handling non-ASCII contents in email, e.g. non-Lati...
MIME <ul><li>The key three parts of MIME are defining the version of MIME, the encoding scheme, and the type of content </...
MIME <ul><ul><ul><li>Subtype is an ietf-token (An extension token defined by a standards-track RFC and registered with IAN...
MIME <ul><li>The received message also includes a  Received:  header added to the top of  the message </li></ul><ul><li>Th...
Uuencode and uudecode <ul><li>Historic note: </li></ul><ul><ul><li>Before MIME,  uuencode  was used to convert non-ASCII f...
Mail Access Protocols <ul><li>If you log directly into your email server, SMTP is all you need to handle email </li></ul><...
POP3 <ul><li>POP3 is defined in RFC 1939 </li></ul><ul><ul><li>It’s a pretty simple protocol compared to many </li></ul></...
POP3 <ul><li>POP3 uses TCP, and connects to port 110 on the mail server </li></ul><ul><li>POP3 does three things – authori...
POP3 <ul><li>POP3 communicates with the mail server by commands, which get a +OK response if it worked, and an –ERR respon...
POP3 <ul><li>POP3 allows two modes, depending on whether you delete the messages after retrieving them </li></ul><ul><ul><...
POP3 <ul><li>POP3 maintains a little state information during a session, such as which files have been marked for deletion...
IMAP <ul><li>IMAP , defined in RFC 3501, allows folders to be defined on the mail server to organize email there </li></ul...
IMAP <ul><li>Users can also get just the headers of messages, and avoid downloading the  MIME portion </li></ul><ul><ul><l...
Web Email <ul><li>Hotmail  (now owned by Microsoft) introduced web-based email shortly after the Web became popular </li><...
DNS <ul><li>A key need, once the Internet grew beyond a few thousand hosts, was to automate converting human* readable add...
Host vs Domain Names <ul><li>A  hostname  is the name of a particular host computer, such as banner.drexel.edu </li></ul><...
IP Addresses <ul><li>IP addresses have four groups of bytes, each group from 0 to 255, separated by periods  </li></ul><ul...
DNS <ul><li>DNS runs over UDP, port 53  (something uses UDP!) </li></ul><ul><li>DNS is managed by DNS servers, typically r...
Reverse DNS Lookup <ul><li>So if you try to look up a random IP address like 123.45.67.89,  dnsstuff.com  gives </li></ul>...
DNS <ul><li>DNS also provides other key services </li></ul><ul><ul><li>Host aliasing  allows the true or  canonical hostna...
DNS Structure <ul><li>DNS is highly decentralized </li></ul><ul><ul><li>Improves throughput, speed, redundancy, reliabilit...
DNS Structure <ul><ul><li>Top-Level Domain (TLD) DNS Servers – sets of servers are maintained for each of the top level do...
DNS Lookup <ul><li>DNS lookup typically follows the pattern at right </li></ul><ul><ul><li>A request to the local DNS serv...
Recursive vs Iterative Queries <ul><li>DNS queries which ask another server to get information are  recursive </li></ul><u...
DNS Lookup <ul><li>This would be terribly tedious without caching </li></ul><ul><ul><li>Common queries are stored on each ...
DNS Records <ul><li>Data about a hostname, its aliases, domain, and mail servers are captured in resource records (RR) </l...
DNS Records <ul><li>DNS RR types are one of several options </li></ul><ul><ul><li>Type=A gives the IP address Value for a ...
DNS Records <ul><ul><li>Type=MX gives the canonical mail server Value for an alias hostname Name </li></ul></ul><ul><ul><u...
New resource record types <ul><li>There are type AAAA resource records for IPv6 addresses  </li></ul><ul><ul><li>Their syn...
DNS Messages <ul><li>The same format DNS messages are used  to both query a DNS server, and receive  the reply </li></ul><...
nslookup <ul><li>The command nslookup provides basic IP data for a hostname or domain </li></ul><ul><li>Nslookup snip.net ...
DNS Changes <ul><li>A registrar makes changes to the DNS database </li></ul><ul><ul><li>The list of registrars is at  http...
DNS and security <ul><li>DNS is somewhat vulnerable to distributed denial of service (DDoS) attacks </li></ul><ul><ul><li>...
Peer-to-Peer File Sharing <ul><li>Peer-to-Peer (P2P) file sharing occupies much of the volume of Internet traffic </li></u...
P2P File Distribution <ul><li>P2P can be used to distribute a file from one source (e.g. a new Linux kernel) to hundreds o...
BitTorrent <ul><li>Bittorrent.org manages the protocol used by most file sharing (30% of all Internet backbone traffic!) <...
BitTorrent <ul><li>When you join a torrent, you identify up to 50 neighboring peers already in the torrent </li></ul><ul><...
Peer-to-Peer File Sharing <ul><li>TCP connections between the computers and FTP make it possible </li></ul><ul><ul><li>The...
Peer-to-Peer File Sharing <ul><ul><li>More refined  limited scope query flooding  is now done to minimize Internet traffic...
Peer-to-Peer File Sharing <ul><ul><li>More powerful peers are group leaders  ( super peers ) for those around them, acting...
Skype <ul><li>Skype  is a popular P2P Internet telephony app, which goes beyond file distribution and sharing in the P2P w...
Peer-to-Peer File Sharing <ul><li>A massive issue for P2P file sharing is the intellectual property rights of the files be...
Chapter 2

  1. 1. INFO 330 Computer Networking Technology I Chapter 2 The Application Layer Glenn Booker INFO 330 Chapter 2
  2. 2. Application Layer <ul><li>The Application Layer is the reason the rest of the network exists – to serve applications </li></ul><ul><li>Most of the software familiar to end users consists of applications </li></ul><ul><ul><li>Email, FTP, newsgroups, chat, the Web, streaming video, video conferencing, IPTV, etc. </li></ul></ul><ul><li>We focus first on key concepts related to the Application Layer, then discuss some specific applications in detail </li></ul>INFO 330 Chapter 2
  3. 3. Application Layer <ul><li>New applications designed for network implementation need to decide whether the application is based on </li></ul><ul><ul><li>Client-server architecture </li></ul></ul><ul><ul><li>Peer to peer (P2P) </li></ul></ul><ul><ul><li>Or some hybrid combination of the two </li></ul></ul>INFO 330 Chapter 2
  4. 4. Client-server Architecture <ul><li>In client-server architecture, the server </li></ul><ul><ul><li>Handles requests from many clients, and </li></ul></ul><ul><ul><li>Is generally always available </li></ul></ul><ul><ul><li>Often has a fixed IP address </li></ul></ul><ul><li>Clients generally don’t communicate with each other, and may be on or off independently of each other and the server </li></ul><ul><ul><li>Client-server applications include email, FTP, the Web, remote login </li></ul></ul>INFO 330 Chapter 2
  5. 5. Client-server Architecture <ul><li>Complex infrastructure intensive apps might require several types of servers – database, web, etc. </li></ul><ul><li>Multiple servers may be needed to keep up with the volume of client requests, hence the need for a server farm or data center </li></ul>INFO 330 Chapter 2
  6. 6. P2P Architecture <ul><li>P2P architecture assumes the clients are on or off at will, and all are treated equally as potential servers and/or clients </li></ul><ul><ul><li>Apps include Gnutella , Morpheus , BitTorrent , Kazaa and more </li></ul></ul>INFO 330 Chapter 2
  7. 7. P2P Architecture <ul><li>P2P architecture is inherently self-scalable </li></ul><ul><ul><li>Millions of computers may participate, because each computer adds capacity at the same time it adds possible workload </li></ul></ul><ul><li>Managing contents of a P2P application can be difficult </li></ul><ul><ul><li>Only one computer may have a particular file, and there’s no control over when that computer is available </li></ul></ul>INFO 330 Chapter 2
  8. 8. P2P Architecture <ul><li>Key challenges in a good P2P app include </li></ul><ul><ul><li>Being ISP friendly, since most residential connections are designed for far more bandwidth down than up, and P2P traffic doesn’t follow this pattern </li></ul></ul><ul><ul><li>Security, including the danger of over-sharing </li></ul></ul><ul><ul><li>Incentives for people to participate </li></ul></ul>INFO 330 Chapter 2
  9. 9. Hybrid Architecture <ul><li>Client-server and P2P combinations exist </li></ul><ul><ul><li>Napster is the best-known example for file sharing </li></ul></ul><ul><ul><ul><li>Collects file location and description information from its peers, but maintains that information on a central server farm; the files themselves are transferred peer to peer </li></ul></ul></ul><ul><ul><li>Instant messaging (IM) is also hybrid </li></ul></ul><ul><ul><ul><li>Chats are all P2P, but logging into the system is centralized </li></ul></ul></ul><ul><ul><ul><li>Includes ICQ , AOL IM , MSN Messenger , etc. </li></ul></ul></ul>INFO 330 Chapter 2
  10. 10. Process Communication <ul><li>Any network application (no matter which architecture) needs to communicate between hosts using processes </li></ul><ul><ul><li>In this sense, a process is a program running on a client, server, or peer host </li></ul></ul><ul><ul><li>Processes may communicate with other processes on the same host; this is controlled by the host’s operating system (OS) </li></ul></ul><ul><ul><li>We are interested in processes that communicate between hosts </li></ul></ul>INFO 330 Chapter 2
  11. 11. Process Communication <ul><li>Processes exchange messages </li></ul><ul><ul><li>The sending or client process creates a message and sends it into the network </li></ul></ul><ul><ul><li>The receiving or server process gets the message from the network and might reply </li></ul></ul><ul><li>Notice that client and server process only relate to their relative roles in sending a message, not the client-server or other architectures mentioned earlier </li></ul>INFO 330 Chapter 2
  12. 12. Sockets <ul><li>A socket is the doorway through which the process sends a message to the network </li></ul><ul><li>The message goes through a socket on the client process, passes through the network, then enters the server process through another socket </li></ul><ul><li>A socket bridges the application and transport layers within each host </li></ul>INFO 330 Chapter 2
  13. 13. Sockets INFO 330 Chapter 2 Could be UDP on both ends
  14. 14. Sockets <ul><li>A socket is the Application Programming Interface (API) between application and the network </li></ul><ul><ul><li>The API is all the developer sees of the network connection </li></ul></ul><ul><ul><ul><li>The developer can choose to use TCP or UDP, and maybe tweak a few transport layer parameters </li></ul></ul></ul><ul><ul><li>Winsock is the Microsoft socket API </li></ul></ul>INFO 330 Chapter 2
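A minimal sketch of the socket API described above, using Python's standard `socket` module rather than Winsock. The helper names `run_echo_server` and `echo_once` are made up for this illustration, and both ends run on the loopback address so that no real network is needed. Choosing `SOCK_STREAM` is how the developer asks for TCP.

```python
import socket
import threading


def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one TCP connection and echo back whatever arrives."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def echo_once(message: bytes) -> bytes:
    """Send `message` to a local echo server through a TCP socket."""
    # SOCK_STREAM selects TCP; port 0 lets the OS pick any free port.
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen(1)
    host, port = server_sock.getsockname()

    # The server process is simulated by a thread in this sketch.
    worker = threading.Thread(target=run_echo_server, args=(server_sock,))
    worker.start()

    # The client only sees the socket: connect, send, receive.
    with socket.create_connection((host, port), timeout=5) as client_sock:
        client_sock.sendall(message)
        reply = client_sock.recv(1024)

    worker.join()
    server_sock.close()
    return reply


if __name__ == "__main__":
    print(echo_once(b"hello"))
```

Everything below the socket calls (segments, acknowledgments, retransmission) stays hidden from the application, which is the point the slide makes about the API.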
  15. 15. Addressing Processes <ul><li>For the server process to get the message, it has to be addressed correctly </li></ul><ul><li>The host address and receiving process are the key parts of the address </li></ul><ul><ul><li>The host address is its IP address (the 32- or 128-bit address of the host’s network interface) </li></ul></ul><ul><ul><li>The receiving process is identified by its port number , since many processes can be running at once </li></ul></ul>INFO 330 Chapter 2
  16. 16. Addressing Processes INFO 330 Chapter 2 Sockets send packets Ports listen for them
  17. 17. Port Number <ul><li>Port numbers follow default values, set by the IANA , unless specified otherwise </li></ul><ul><ul><li>21 = FTP </li></ul></ul><ul><ul><li>23 = Telnet </li></ul></ul><ul><ul><li>25 = SMTP </li></ul></ul><ul><ul><li>53 = DNS </li></ul></ul><ul><ul><li>80 = HTTP, http://mine.com implies http://mine.com:80 </li></ul></ul><ul><ul><li>110 = POP3 </li></ul></ul><ul><ul><li>194 = IRC, and hundreds more </li></ul></ul>INFO 330 Chapter 2
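To make the default-port rule concrete, here is a small sketch that resolves the port for a URL. The table simply copies the values from the slide, and `effective_port` is a hypothetical helper name, not a standard library function.

```python
from urllib.parse import urlsplit

# Default ports copied from the slide above (IANA assignments).
WELL_KNOWN_PORTS = {
    "ftp": 21,
    "telnet": 23,
    "smtp": 25,
    "dns": 53,
    "http": 80,
    "pop3": 110,
    "irc": 194,
}


def effective_port(url: str) -> int:
    """Return the explicit port in `url`, or the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return WELL_KNOWN_PORTS[parts.scheme]


if __name__ == "__main__":
    print(effective_port("http://mine.com"))       # 80
    print(effective_port("http://mine.com:8080"))  # 8080
```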
  18. 18. More Protocols <ul><li>Application-layer protocols define how a particular application’s processes are structured </li></ul><ul><ul><li>What types of messages are allowed </li></ul></ul><ul><ul><li>The syntax of those messages </li></ul></ul><ul><ul><li>The meaning of the fields in the syntax </li></ul></ul><ul><ul><li>Rules for processing messages – when and how to send messages, how to reply, etc. </li></ul></ul>INFO 330 Chapter 2
  19. 19. Application vs its protocols <ul><li>A single application often needs to use several application-layer protocols </li></ul><ul><ul><li>A web browser might use HTTP, but also FTP, telnet, gopher, etc. </li></ul></ul><ul><ul><li>An email application might use POP3, SMTP, IMAP, etc. </li></ul></ul><ul><li>Many app protocols are defined in RFCs </li></ul><ul><ul><li>But some application-layer protocols are proprietary </li></ul></ul>INFO 330 Chapter 2
  20. 20. RFC Summary <ul><li>For an RFC which lists the current RFC standards, look in the RFC Index for “Internet Official Protocol Standards” </li></ul><ul><ul><li>The current one is RFC 5000, dated May 2008 </li></ul></ul>INFO 330 Chapter 2
  21. 21. Application Services <ul><li>The transport layer connects the application layer to everything else </li></ul><ul><li>Have a choice of two protocols, TCP and UDP, unless you want to write your own! </li></ul><ul><li>Key services include </li></ul><ul><ul><li>Reliable data transfer – how important is it? Or is your app loss-tolerant? </li></ul></ul>INFO 330 Chapter 2
  22. 22. Application Services <ul><li>How much bandwidth or throughput does your app need? </li></ul><ul><ul><li>Does sending rate have to equal receiving rate? </li></ul></ul><ul><ul><li>Some apps are elastic – can tolerate wide ranges of available bandwidth </li></ul></ul><ul><li>How sensitive is your app to timing? </li></ul><ul><ul><li>Games and telephony tend to be sensitive to slow or erratic transmission delays </li></ul></ul><ul><li>How important is security? </li></ul>INFO 330 Chapter 2
  23. 23. TCP Services <ul><li>TCP provides a connection-oriented service, where the sockets of the client and server recognize a connection for the duration of the session </li></ul><ul><ul><li>Connection is duplex – messages can go both ways at once </li></ul></ul><ul><ul><li>TCP is highly reliable – the bits leaving one side all get to the other side, and get put back in the original order </li></ul></ul>INFO 330 Chapter 2
  24. 24. TCP Services <ul><li>TCP also provides congestion control, for benefit of the Internet </li></ul><ul><ul><li>This throttles the sending processes when the connection is congested, and can limit bandwidth </li></ul></ul><ul><li>TCP does not guarantee any level of transmission rate, or provide delay guarantees </li></ul><ul><li>So you’ll get your data across, but we don’t know when </li></ul>INFO 330 Chapter 2
  25. 25. UDP Services <ul><li>UDP is a lightweight protocol – meaning it doesn’t do much! </li></ul><ul><ul><li>UDP is connectionless </li></ul></ul><ul><ul><li>UDP is unreliable – data may never get there </li></ul></ul><ul><ul><li>UDP packets may arrive out of order, and UDP neither detects nor corrects this </li></ul></ul><ul><ul><li>There are no transmission rate guarantees </li></ul></ul>INFO 330 Chapter 2
  26. 26. Services NOT Provided <ul><li>TCP and UDP do not provide guarantees of throughput or timing </li></ul><ul><li>TCP does nothing for security per se, but SSL can be layered on top of TCP to add it </li></ul><ul><ul><li>See Chapter 7 for INFO 331 </li></ul></ul>INFO 330 Chapter 2
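For contrast with TCP, a minimal sketch of UDP's connectionless service using Python's standard `socket` module. The function name `udp_round_trip` is invented for this example. Note that there is no `connect`/`accept` step: the sender just addresses a datagram to an IP address and port. On the loopback interface the datagram normally arrives, but nothing in UDP itself promises that, hence the timeout.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a local UDP socket and return what arrived."""
    # SOCK_DGRAM selects UDP; port 0 lets the OS pick any free port.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(5)  # UDP may lose data, so never wait forever

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No handshake: the destination address travels with each datagram.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(udp_round_trip(b"ping"))
```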
  27. 27. Application Protocols <ul><li>We’ll examine protocols for Internet-based applications </li></ul><ul><ul><li>HTTP </li></ul></ul><ul><ul><li>FTP </li></ul></ul><ul><ul><li>SMTP </li></ul></ul><ul><ul><li>POP3 </li></ul></ul><ul><ul><li>IMAP </li></ul></ul><ul><ul><li>DNS </li></ul></ul>INFO 330 Chapter 2
  28. 28. The Web and HTTP <ul><li>Through the 1980’s, the Internet was used mostly for remote login, file transfer, newsgroups, and email </li></ul><ul><li>The World Wide Web changed all that, and made the Internet visible to the public </li></ul><ul><ul><li>Comparable in significance to inventing movable type, the telephone, radio, or TV </li></ul></ul><ul><ul><li>Web provides demand-based information, vs. broadcast info on radio and TV </li></ul></ul>INFO 330 Chapter 2
  29. 29. HTTP <ul><li>The HyperText Transfer Protocol ( HTTP ) is the heart of the Web </li></ul><ul><ul><li>Defined by RFCs 1945 (v1.0) and 2616 (v1.1) </li></ul></ul><ul><ul><li>Has client and server programs which communicate via HTTP messages </li></ul></ul><ul><li>Web pages contain objects – files of various sorts, such as a base HTML file, which cites JPG and/or GIF images, etc. </li></ul><ul><li>The client app that uses HTTP is a browser </li></ul>INFO 330 Chapter 2
  30. 30. HTTP <ul><li>A Web server houses the objects </li></ul><ul><ul><li>Apache and Microsoft Internet Information Services ( IIS ) are common Web server apps </li></ul></ul><ul><li>HTTP defines the messages that pass between client and server </li></ul><ul><ul><li>Uses TCP for transport protocol </li></ul></ul><ul><ul><li>HTTP has no memory of previous actions (a stateless protocol ) – so if you ask for a file 126 times, it will send the file 126 times </li></ul></ul>INFO 330 Chapter 2
  31. 31. HTTP <ul><li>HTTP can use persistent or non-persistent connections – persistent is the default, but non-persistent can be specified </li></ul><ul><li>A non-persistent connection to get a web page might work like this: </li></ul><ul><ul><li>Client requests a TCP connection to web server on port 80 </li></ul></ul><ul><ul><li>Client requests the HTML page </li></ul></ul><ul><ul><li>Server retrieves the HTML page, and sends it </li></ul></ul>INFO 330 Chapter 2
  32. 32. HTTP <ul><ul><li>Server closes the TCP connection </li></ul></ul><ul><ul><li>Client closes the TCP connection </li></ul></ul><ul><ul><li>Client reads the HTML file, and finds 10 JPGs referenced </li></ul></ul><ul><ul><li>Client repeats steps 1-4 ten times (!) to download each of the JPG images </li></ul></ul><ul><li>Not very efficient! </li></ul><ul><li>Browser can determine how many parallel TCP connections are used (typically 5-10) </li></ul>INFO 330 Chapter 2
  33. 33. More Delays! <ul><li>How long does this process take? </li></ul><ul><ul><li>The round-trip time (RTT) is the time for a small packet to go from client to server and back </li></ul></ul><ul><ul><li>Includes propagation, queuing, and processing delays </li></ul></ul><ul><li>The first two messages of the TCP three-way handshake between client (C) and server (S) take one RTT: C-S, then S-C </li></ul><ul><li>The client then sends its request for the file along with the final handshake acknowledgment (C-S), and gets the file from the server (S-C) </li></ul>INFO 330 Chapter 2
  34. 34. RTT Delay <ul><li>So the time for getting one file is two times the RTT, plus the transmission time for sending the file from the server (Fig. 2.7, p. 104, 5th ed.) </li></ul><ul><li>In the non-persistent connection example, this is done 11 times for one HTML file and 10 JPGs </li></ul>INFO 330 Chapter 2
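The arithmetic above can be sketched as a short function. This is only a back-of-the-envelope model: it assumes every object has the same transmission time, and (optimistically) that parallel connections do not slow each other down. The function name, the millisecond values, and the `parallel` parameter are illustrative assumptions.

```python
import math


def nonpersistent_fetch_ms(rtt_ms: int, transmit_ms: int,
                           referenced_objects: int, parallel: int = 1) -> int:
    """Estimate page load time with non-persistent HTTP connections.

    Each object costs 2 RTTs (TCP handshake + request/response) plus its
    transmission time. The base HTML file is fetched first; the referenced
    objects follow in rounds of `parallel` simultaneous connections.
    """
    per_object = 2 * rtt_ms + transmit_ms
    rounds = math.ceil(referenced_objects / parallel)
    return per_object * (1 + rounds)


if __name__ == "__main__":
    # One HTML file plus 10 JPGs, RTT = 100 ms, 10 ms to transmit each file.
    print(nonpersistent_fetch_ms(100, 10, 10))              # 2310 (11 fetches)
    print(nonpersistent_fetch_ms(100, 10, 10, parallel=5))  # 630 (3 rounds)
```

With one connection at a time the page costs 11 × 2 = 22 RTTs plus transmission time, which is the figure the non-persistent example leads to.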
  35. 35. Persistent Connection <ul><li>If there’s a persistent connection, the TCP connection stays, so the handshake is done once not only for the web page in the example, but for many HTTP requests </li></ul><ul><ul><li>Connection is closed after some period of inactivity </li></ul></ul><ul><li>Persistent connections can be with or without pipelining </li></ul>INFO 330 Chapter 2
  36. 36. Persistent Connection <ul><li>Without pipelining , the client requests a new object only after the previous request has been filled </li></ul><ul><li>With pipelining , the client requests new objects as soon as it finds references to them, and may be waiting for several responses at once </li></ul><ul><ul><li>This is the default setting for web browsers </li></ul></ul><ul><ul><li>Could cut the RTT cost of all the referenced objects to about one RTT in total, vs. two RTTs per object (22 for the whole page) with non-persistent connections! </li></ul></ul>INFO 330 Chapter 2
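A rough sketch of how the two persistent modes compare, under the same simplified model as the RTT discussion: one handshake for the whole page, equal transmission time for every object, and no server processing delay. The function name and the sample numbers are assumptions made for illustration.

```python
def persistent_fetch_ms(rtt_ms: int, transmit_ms: int,
                        referenced_objects: int, pipelined: bool) -> int:
    """Estimate page load time over a single persistent TCP connection."""
    # Handshake (1 RTT) plus request/response for the base HTML (1 RTT).
    base_page = 2 * rtt_ms + transmit_ms
    if pipelined:
        # All requests go out back to back, so the referenced objects
        # share roughly one RTT; their transmissions still add up.
        rest = rtt_ms + referenced_objects * transmit_ms
    else:
        # Each object waits for the previous response: one RTT apiece.
        rest = referenced_objects * (rtt_ms + transmit_ms)
    return base_page + rest


if __name__ == "__main__":
    # One HTML file plus 10 JPGs, RTT = 100 ms, 10 ms to transmit each file.
    print(persistent_fetch_ms(100, 10, 10, pipelined=False))  # 1310
    print(persistent_fetch_ms(100, 10, 10, pipelined=True))   # 410
```

Under these assumptions the pipelined page needs about 3 RTTs in all (2 for the base page, 1 for the ten images), against 12 RTTs without pipelining.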
  37. 37. HTTP vs HTML <ul><li>Don’t confuse HTTP with HTML </li></ul><ul><ul><li>HTTP is the protocol used to define how files are requested and transferred between server and clients </li></ul></ul><ul><ul><li>HTML is the format of web pages </li></ul></ul><ul><li>So an HTML file might be the structure of an entity body transferred using HTTP </li></ul>INFO 330 Chapter 2
  38. 38. HTTP Messages <ul><li>HTTP messages come in two types, request messages (from client) and response messages (from server) </li></ul><ul><ul><li>All HTTP messages are plain ASCII text </li></ul></ul><ul><ul><ul><li>‘Both types of message consist of a start-line, zero or more header fields (also known as &quot;headers&quot;), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body.’ [RFC 2616, para 4.1] </li></ul></ul></ul><ul><ul><ul><li>CRLF is a “carriage return and line feed” </li></ul></ul></ul>INFO 330 Chapter 2
  39. 39. HTTP Messages <ul><li>There are many headers which could appear in requests or responses </li></ul><ul><ul><li>Cache-Control, Connection , Date , Pragma, Trailer, Transfer-Encoding, Upgrade, Via, and/or Warning [RFC 2616, para 4.5] </li></ul></ul><ul><ul><li>Disclaimer : RFC 2616 is 176 pages long – so we’re just providing a summary of where to look for info if you’re curious about the details of these messages </li></ul></ul>INFO 330 Chapter 2
  40. 40. HTTP Requests <ul><li>Request messages have variable number of lines, depending on the method called </li></ul><ul><li>General request syntax is </li></ul><ul><ul><li>Method Request-URI HTTP-Version </li></ul></ul><ul><ul><li>Methods are OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, or CONNECT [RFC 2616, para 5.1.1] </li></ul></ul><ul><ul><ul><li>Most commonly used is GET </li></ul></ul></ul><ul><ul><li>Request-URI is the desired Uniform Resource Identifier (URI, commonly called a URL) </li></ul></ul>INFO 330 Chapter 2
  41. 41. HTTP Requests <ul><ul><li>HTTP-Version is what it sounds like, e.g. HTTP/1.1 </li></ul></ul><ul><li>There are many possible request headers </li></ul><ul><ul><li>Accept, Accept-Charset, Accept-Encoding, Accept-Language, Authorization, Expect, From, Host , If-Match, If-Modified-Since, If-None-Match, If-Range, If-Unmodified-Since, Max-Forwards, Proxy-Authorization, Range, Referer, TE (extension transfer-codings), and/or User-Agent [RFC 2616, para 5.3] </li></ul></ul>INFO 330 Chapter 2
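As an illustration of the request syntax, the sketch below assembles a request message as text: the request line (Method, Request-URI, HTTP-Version), one line per header, and the empty line that ends the header section. `build_request` is a hypothetical helper, and the host name and path are placeholders.

```python
def build_request(method: str, request_uri: str, headers: dict,
                  version: str = "HTTP/1.1") -> str:
    """Assemble an HTTP request with no message body."""
    # Request line: Method SP Request-URI SP HTTP-Version
    lines = [f"{method} {request_uri} {version}"]
    # One "Name: value" line per request header.
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CRLF; an empty line closes the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    message = build_request("GET", "/index.html", {"Host": "mine.com"})
    print(repr(message))
    # 'GET /index.html HTTP/1.1\r\nHost: mine.com\r\n\r\n'
```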
  42. 42. HTTP Responses <ul><li>HTTP responses go from server to client </li></ul><ul><li>General syntax starts with </li></ul><ul><ul><li>HTTP-Version Status-Code Reason-Phrase [RFC 2616, para 6.1] </li></ul></ul><ul><ul><li>The Status-Code could be dozens of values </li></ul></ul><ul><ul><ul><li>&quot;200&quot; OK </li></ul></ul></ul><ul><ul><ul><li>&quot;403&quot; Forbidden </li></ul></ul></ul><ul><ul><ul><li>&quot;404&quot; Not Found </li></ul></ul></ul><ul><ul><li>The Reason-Phrase is any text phrase assigned </li></ul></ul>INFO 330 Chapter 2
  43. 43. HTTP Responses <ul><li>Response headers can include </li></ul><ul><ul><li>Accept-Ranges, Age, ETag, Location, Proxy-Authenticate, Retry-After, Server , Vary, and/or WWW-Authenticate [RFC 2616, para 6.2] </li></ul></ul><ul><li>Responses usually include entities, unless the HEAD method was used </li></ul>INFO 330 Chapter 2
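The response format can be read back with a few lines of string handling. This is a simplified sketch, not a full RFC 2616 parser: it assumes the whole response is already in memory as text and ignores details such as folded or repeated headers. `parse_response` is an invented name.

```python
def parse_response(raw: str):
    """Split an HTTP response into status line fields, headers, and body."""
    # The first empty line separates the headers from the entity body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # Status line: HTTP-Version SP Status-Code SP Reason-Phrase
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(code), reason, headers, body


if __name__ == "__main__":
    raw = "HTTP/1.1 404 Not Found\r\nServer: Apache\r\nContent-Length: 0\r\n\r\n"
    print(parse_response(raw))
```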
  44. 44. HTTP Entities <ul><li>An entity is the object sent or returned with an HTTP message </li></ul><ul><li>Entities can be with requests or responses </li></ul><ul><ul><li>Entity headers include Allow, Content-Encoding, Content-Language, Content-Length (bytes), Content-Location, Content-MD5, Content-Range, Content-Type , Expires, Last-Modified , and/or extension-header [RFC 2616, para 7.1] </li></ul></ul><ul><ul><ul><li>Where extension-header is any allowable message-header for that kind of message </li></ul></ul></ul>INFO 330 Chapter 2
  45. 45. HTTP <ul><li>So HTTP describes request and response message formats </li></ul><ul><ul><li>Both types typically have a first line which tells its purpose (the request or status line) </li></ul></ul><ul><ul><li>There can be many header lines </li></ul></ul><ul><ul><li>There might be an entity attached </li></ul></ul>INFO 330 Chapter 2
  46. 46. Cookies! <ul><li>HTTP is stateless </li></ul><ul><li>But some would like to remember a little information about web site visitors, hence cookies were defined with RFC 2965 </li></ul><ul><li>Cookies require four parts </li></ul><ul><ul><li>A cookie header in HTTP responses </li></ul></ul><ul><ul><li>A cookie header in HTTP requests </li></ul></ul><ul><ul><li>Cookie files on the user’s computer </li></ul></ul><ul><ul><li>A database on the web server </li></ul></ul>INFO 330 Chapter 2
  47. 47. Cookies <ul><li>When a user visits a site that uses cookies for the first time, they are assigned a unique ID number, which is stored in the database </li></ul><ul><li>A Set-cookie header (a header line, not an HTTP method) in the response carries that ID number </li></ul><ul><ul><li>Set-cookie: 1678 </li></ul></ul><ul><li>All subsequent HTTP requests to that site, even years later, will carry that cookie number and identify the user </li></ul>INFO 330 Chapter 2
  48. 48. Cookies <ul><ul><li>Cookie: 1678 </li></ul></ul><ul><li>This provides a way for web sites to automate login for repeat customers, and track browsing and spending patterns </li></ul><ul><ul><li>One-click shopping is only possible with cookies </li></ul></ul><ul><ul><li>The price for convenience is the lack of privacy </li></ul></ul><ul><li>Ads on web sites can be targeted to match the user’s preferences </li></ul>INFO 330 Chapter 2
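The four cookie parts described above can be sketched in a few lines of Python. This is a simplified model, not a real web server: the class name `CookieSite`, the in-memory dictionary standing in for the server database, and the bare ID values (as on the slides) are all assumptions for illustration. Real cookies are name=value pairs with extra attributes.

```python
import itertools


class CookieSite:
    """Toy model of a site that identifies visitors with a cookie ID."""

    def __init__(self, first_id=1678):
        self._next_id = itertools.count(first_id)
        self.database = {}  # cookie ID -> list of paths the user requested

    def handle(self, path, request_headers):
        """Process one request and return the response headers."""
        response_headers = {}
        user_id = request_headers.get("Cookie")
        if user_id is None or user_id not in self.database:
            # First visit: assign an ID and tell the browser to store it.
            user_id = str(next(self._next_id))
            self.database[user_id] = []
            response_headers["Set-Cookie"] = user_id
        # Every request is recorded against the user's ID.
        self.database[user_id].append(path)
        return response_headers


site = CookieSite()
print(site.handle("/", {}))                      # first visit
print(site.handle("/cart", {"Cookie": "1678"}))  # repeat visit
print(site.database)
```

The browser's cookie file is represented here only by the caller passing the stored ID back in the `Cookie` request header.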
  49. 49. Other HTTP Content <ul><li>So far we assumed the file content for HTTP was HTML files, JPGs, GIFs, etc. </li></ul><ul><li>Entities can be many other file formats </li></ul><ul><ul><li>XML files, which are structured text </li></ul></ul><ul><ul><li>VoiceXML , WML (web pages for mobile phones), streaming audio and video, and P2P file sharing </li></ul></ul>INFO 330 Chapter 2
  50. 50. Web Caching <ul><li>A Web cache, or proxy server, acts as an intermediate between clients and servers </li></ul><ul><ul><li>The cache stores recently used files, so they don’t have to be requested again </li></ul></ul><ul><ul><li>The cache acts as client and server </li></ul></ul><ul><li>ISPs typically use web caching to cut down on outgoing web traffic (to the servers) and lower request response time </li></ul>INFO 330 Chapter 2
  51. 51. Web Caching <ul><li>Tends to work well when the client-cache connection is faster than the cache-server connection </li></ul><ul><li>Often helps avoid upgrading the cache-server connection speed, which saves money </li></ul><ul><li>Implement by using a conditional GET method in HTTP </li></ul><ul><ul><li>With the If-Modified-Since request header </li></ul></ul><ul><ul><li>If the cache is still current, don’t download the file </li></ul></ul>INFO 330 Chapter 2
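The conditional GET decision can be sketched from the server's side as below. The function name `conditional_get_status` is hypothetical, and the 304 (Not Modified) status code is standard HTTP rather than something shown on the slides. The sketch assumes the header carries a valid HTTP date in GMT.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since, last_modified):
    """Return 304 if the cached copy is still current, otherwise 200.

    if_modified_since: value of the If-Modified-Since request header.
    last_modified: timezone-aware datetime of the file's last change.
    """
    cached_copy_time = parsedate_to_datetime(if_modified_since)
    if last_modified <= cached_copy_time:
        return 304  # Not Modified: no entity is sent, the cache reuses its copy
    return 200      # OK: the full, newer file is sent


file_changed = datetime(2015, 9, 9, 9, 23, 24, tzinfo=timezone.utc)
print(conditional_get_status("Wed, 09 Sep 2015 09:23:24 GMT", file_changed))
print(conditional_get_status("Tue, 08 Sep 2015 09:23:24 GMT", file_changed))
```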
  52. 52. FTP <ul><li>The File Transfer Protocol is one of the oldest Internet applications (now RFC 959, but started as RFC 114 in 1971) </li></ul><ul><li>While HTTP and FTP both send files </li></ul><ul><ul><li>FTP uses two connections – one for control, one for data (control information is out-of-band ) </li></ul></ul><ul><ul><ul><li>User login and commands are on the control connection, files move on the data connection </li></ul></ul></ul><ul><ul><li>HTTP uses one connection for both purposes (control information is in-band ) </li></ul></ul>INFO 330 Chapter 2
  53. 53. FTP <ul><li>FTP uses TCP; the server usually listens on port 21 for the control connection and uses port 20 for the data connection </li></ul><ul><li>The client sends user ID and password </li></ul><ul><ul><li>Some sites accept a generic ID, known as anonymous FTP </li></ul></ul><ul><li>Once logged in, the user may navigate and view directories, and upload (STOR or PUT) or download (RETR or GET) files </li></ul>INFO 330 Chapter 2
  54. 54. FTP <ul><li>Commands and replies are very basic </li></ul><ul><ul><li>Most commands are three or four-letter abbreviations </li></ul></ul><ul><ul><li>Replies are three-digit codes, followed by text </li></ul></ul><ul><li>Command connection is based on Telnet, incidentally [RFC 959, para 2.3] </li></ul><ul><li>Due to its age, FTP has provisions for a huge range of data types (ASCII or EBCDIC) and file, record, and page structures </li></ul>INFO 330 Chapter 2
  55. 55. Electronic Mail <ul><li>E-mail is another ancient Internet application, with origins in RFC 772 in 1980 </li></ul><ul><li>It provides asynchronous text communication and allows files to be attached to messages </li></ul><ul><ul><li>Even voice and video messages </li></ul></ul><ul><li>Main elements are users (sender and recipient), mail servers, and the Simple Mail Transfer Protocol (SMTP, RFC 2821) </li></ul><ul><ul><li>Careful, there’s also an SNTP for network time </li></ul></ul>INFO 330 Chapter 2
  56. 56. Electronic Mail <ul><li>Email is composed in a client, which sends it to a mail queue in the sender’s mail server </li></ul><ul><li>The sending mail server uses SMTP to send the message to the recipient’s mail server </li></ul><ul><ul><li>If mail can’t be sent successfully, the sender’s mail server will put the message in a queue, and keep trying (typically for 3 days) </li></ul></ul><ul><li>The recipient is notified that the message is present, which they read with their client </li></ul>INFO 330 Chapter 2
  57. 57. Electronic Mail <ul><li>Each user has a mailbox on the mail server </li></ul><ul><ul><li>Access to the mailbox is controlled with user name and password </li></ul></ul><ul><li>SMTP is the main protocol to get email from one mail server to another </li></ul><ul><ul><li>It uses TCP, not surprisingly </li></ul></ul><ul><ul><li>Defined in proposed standard RFC 2821 </li></ul></ul><ul><ul><li>Only uses 7-bit ASCII for message headers AND body </li></ul></ul><ul><ul><ul><li>Forces binary files to be converted to ASCII and back </li></ul></ul></ul>INFO 330 Chapter 2
  58. 58. SMTP <ul><li>After the TCP connection is established, SMTP does a handshake with port 25 of the recipient’s mail server </li></ul><ul><li>The client then sends the message </li></ul><ul><li>Multiple messages can be sent if needed, then the connection is closed </li></ul><ul><li>Client commands include HELO, MAIL FROM:, RCPT TO:, DATA (then the message body), and QUIT </li></ul>INFO 330 Chapter 2
  59. 59. SMTP <ul><li>Other commands include ( with comments in italics ) </li></ul><ul><ul><li>RSET (abort current transaction) </li></ul></ul><ul><ul><li>SEND FROM:<reverse-path> </li></ul></ul><ul><ul><li>SOML FROM:<reverse-path> (send or mail) </li></ul></ul><ul><ul><li>SAML FROM:<reverse-path> (send and mail) </li></ul></ul><ul><ul><li>VRFY <string> (verify a user name) </li></ul></ul><ul><ul><li>EXPN <string> (expand mailing list) </li></ul></ul><ul><ul><li>HELP [ <string>] </li></ul></ul><ul><ul><li>NOOP (just send an OK reply) </li></ul></ul><ul><ul><li>TURN (your turn to be client or server) </li></ul></ul>INFO 330 Chapter 2
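To make the basic client command sequence concrete (HELO, MAIL FROM:, RCPT TO:, DATA, QUIT), here is a sketch that only builds the lines a client would send; it does not open a network connection. The function name and the host and mailbox names are invented for the example. The terminating line with a single period is part of SMTP itself, and the sketch leaves out details such as escaping body lines that begin with a period.

```python
def smtp_client_commands(client_host, sender, recipient, body_lines):
    """Return the lines an SMTP client sends for one message, in order."""
    commands = [
        f"HELO {client_host}",      # introduce the client to the server
        f"MAIL FROM:<{sender}>",    # reverse-path (who the mail is from)
        f"RCPT TO:<{recipient}>",   # forward-path (who it is for)
        "DATA",                     # everything after this is the message
    ]
    commands.extend(body_lines)
    commands.append(".")            # a line with only a period ends the message
    commands.append("QUIT")         # close the session
    return commands


for line in smtp_client_commands(
    "client.example.org", "alice@example.org", "bob@example.net",
    ["Subject: Hello", "", "Just testing SMTP."],
):
    print(line)
```

In a real session the client waits for a three-digit reply from the server after each command before sending the next one.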
  60. 60. SMTP vs HTTP <ul><li>SMTP and HTTP can both move files using persistent TCP connections </li></ul><ul><ul><li>SMTP pushes messages to the recipient’s mail server </li></ul></ul><ul><ul><ul><li>HTTP pulls contents when desired from a web server </li></ul></ul></ul><ul><ul><li>SMTP incorporates attachments into the body of the message as one big object </li></ul></ul><ul><ul><ul><li>HTTP downloads attachments in separate responses </li></ul></ul></ul><ul><ul><li>SMTP requires messages in 7-bit ASCII text </li></ul></ul><ul><ul><ul><li>HTTP doesn’t </li></ul></ul></ul>INFO 330 Chapter 2
  61. 61. Mail Message Formats <ul><li>Email contains header information defined by RFC 822 (Standard for ARPA Internet Text Messages), now RFC 5322 </li></ul><ul><ul><li>The sender headers can include: FROM, SENDER, REPLY-TO, RESENT-FROM, RESENT-SENDER, and RESENT-REPLY-TO </li></ul></ul><ul><ul><li>Receiver headers can be: TO, CC, and BCC </li></ul></ul><ul><ul><li>Reference headers can be: MESSAGE-ID, IN-REPLY-TO, REFERENCES and KEYWORDS </li></ul></ul>INFO 330 Chapter 2
  62. 62. Mail Message Formats <ul><ul><li>Other allowable header fields are: SUBJECT, COMMENTS, ENCRYPTED, and possibly some extension fields or user-defined fields </li></ul></ul><ul><li>While many of these headers also sound like SMTP commands, they are part of the email message </li></ul><ul><li>This works fine for ASCII data </li></ul><ul><ul><li>For anything outside of that, call a MIME </li></ul></ul>INFO 330 Chapter 2
  63. 63. MIME <ul><li>Multipurpose Internet Mail Extensions (MIME) are used for handling non-ASCII contents in email, e.g. non-Latin character sets, binary files, images, audio, video, etc. </li></ul><ul><li>MIME (RFC 2045) adds the ability to handle </li></ul><ul><ul><li>(1) textual message bodies in character sets other than US-ASCII, (2) an extensible set of different formats for non-textual message bodies, (3) multi-part message bodies, and (4) textual header information in character sets other than US-ASCII. </li></ul></ul>INFO 330 Chapter 2
  64. 64. MIME <ul><li>The three key parts of MIME are defining the version of MIME, the encoding scheme, and the type of content </li></ul><ul><ul><li>MIME-Version: 1.0 </li></ul></ul><ul><ul><li>Content-Transfer-Encoding: can be &quot;7bit&quot; / &quot;8bit&quot; / &quot;binary&quot; / &quot;quoted-printable&quot; / &quot;base64&quot; </li></ul></ul><ul><ul><li>Content-Type: describes the type and subtype </li></ul></ul><ul><ul><ul><li>Type is discrete (&quot;text&quot; / &quot;image&quot; / &quot;audio&quot; / &quot;video&quot; / &quot;application&quot;) or composite (&quot;message&quot; / &quot;multipart&quot;) </li></ul></ul></ul>INFO 330 Chapter 2
  65. 65. MIME <ul><ul><ul><li>Subtype is an ietf-token (An extension token defined by a standards-track RFC and registered with IANA) or an X-token (The two characters &quot;X-&quot; or &quot;x-&quot; followed, with no intervening white space, by an ASCII text string) </li></ul></ul></ul><ul><li>There are many other variations of type and subtype (see RFC 2046), including for </li></ul><ul><ul><li>Other character sets (Content-type: text/plain; charset=iso-8859-1), or proprietary formats (image/JPEG, application/postscript, etc.) </li></ul></ul>INFO 330 Chapter 2
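One way to see the three MIME headers in practice is to let Python's standard `email` package build a message with a binary attachment. This is only a sketch: the addresses, file name, and attachment bytes are placeholders, and the function name `build_message` is made up for this example.

```python
from email.message import EmailMessage


def build_message():
    """Build a two-part message: a text body plus a (fake) JPEG attachment."""
    msg = EmailMessage()
    msg["From"] = "alice@example.org"
    msg["To"] = "bob@example.net"
    msg["Subject"] = "Picture"
    msg.set_content("See the attached image.")
    # Binary data cannot travel as 7-bit ASCII, so it is base64-encoded.
    msg.add_attachment(
        b"\xff\xd8\xff\xe0 placeholder bytes",
        maintype="image",
        subtype="jpeg",
        filename="pic.jpg",
    )
    return msg


message = build_message()
attachment = next(message.iter_attachments())
print(message["MIME-Version"])
print(message.get_content_type())
print(attachment.get_content_type())
print(attachment["Content-Transfer-Encoding"])
```

The top-level Content-Type becomes the composite type multipart/mixed, while the attachment carries its own discrete type and its own Content-Transfer-Encoding.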
  66. 66. MIME <ul><li>The received message also includes a Received: header added to the top of the message </li></ul><ul><li>This is familiar in email if you look at the full headers </li></ul>INFO 330 Chapter 2
  67. 67. Uuencode and uudecode <ul><li>Historic note: </li></ul><ul><ul><li>Before MIME, uuencode was used to convert non-ASCII files to text </li></ul></ul><ul><ul><ul><li>Doing so expanded the file in size 35%, because of the conversion from 7 bit to 8 bit, plus control information </li></ul></ul></ul><ul><ul><li>Uudecode reversed the operation after the file was received </li></ul></ul><ul><ul><li>These commands still exist under UNIX </li></ul></ul>INFO 330 Chapter 2
  68. 68. Mail Access Protocols <ul><li>If you log directly into your email server, SMTP is all you need to handle email </li></ul><ul><li>But if you wish to access email from a local host, you need to use a mail access protocol </li></ul><ul><li>The biggies at present are </li></ul><ul><ul><li>Post Office Protocol version 3 (POP3) and </li></ul></ul><ul><ul><li>Internet Message Access Protocol (IMAP) </li></ul></ul>INFO 330 Chapter 2
  69. 69. POP3 <ul><li>POP3 is defined in RFC 1939 </li></ul><ul><ul><li>It’s a pretty simple protocol compared to many </li></ul></ul><ul><li>SMTP sends mail between mail servers, and from the user agent (email app) to their mail server </li></ul><ul><li>POP3 transfers mail from your mail server to your user agent </li></ul><ul><li>From a user’s view, SMTP handles outgoing email, and POP3 handles incoming email </li></ul>INFO 330 Chapter 2
  70. 70. POP3 <ul><li>POP3 uses TCP, and connects to port 110 on the mail server </li></ul><ul><li>POP3 does three things – authorization, transaction, and update </li></ul><ul><ul><li>Authorization verifies the user identity </li></ul></ul><ul><ul><li>Transaction retrieves email, marks messages for deletion, and gets mail statistics </li></ul></ul><ul><ul><li>Update ends the session, and deletes flagged messages </li></ul></ul>INFO 330 Chapter 2
  71. 71. POP3 <ul><li>POP3 communicates with the mail server by commands; each gets a +OK response if it worked, and a -ERR response if it didn’t </li></ul><ul><ul><li>Authorization uses commands ‘user’ and ‘pass’ </li></ul></ul><ul><ul><li>Transaction uses commands </li></ul></ul><ul><ul><ul><li>‘list’ to see the list of messages </li></ul></ul></ul><ul><ul><ul><li>‘dele x’ to delete message number x </li></ul></ul></ul><ul><ul><ul><li>‘retr x’ to retrieve message number x </li></ul></ul></ul><ul><ul><ul><li>‘quit’ ends the session </li></ul></ul></ul>INFO 330 Chapter 2
  72. 72. POP3 <ul><li>POP3 allows two modes, depending on whether you delete the messages after retrieving them </li></ul><ul><ul><li>If you download-and-delete messages from the server, you only download them to one local host </li></ul></ul><ul><ul><li>If you download-and-keep the messages on the server, then you can download them to more than one local host (e.g. home and work) </li></ul></ul><ul><ul><ul><li>Disadvantage is that the volume of mail on the server can be too big </li></ul></ul></ul>INFO 330 Chapter 2
  73. 73. POP3 <ul><li>POP3 maintains a little state information during a session, such as which files have been marked for deletion </li></ul><ul><li>However after a session is over, all state information is gone </li></ul><ul><ul><li>This makes a POP3 server a fairly simple beast </li></ul></ul><ul><li>Users use folders locally (on their email app) to store and organize messages </li></ul>INFO 330 Chapter 2
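The transaction and update phases can be imitated with a small in-memory mailbox. This is a toy model under stated assumptions: the class name `Pop3Mailbox` is invented, authorization (‘user’ and ‘pass’) is skipped, and the reply texts after +OK or -ERR are illustrative. It shows how deletion marks are per-session state that only takes effect at ‘quit’.

```python
class Pop3Mailbox:
    """Toy POP3 mailbox: handles list, retr, dele, and quit."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.marked = set()  # message numbers flagged for deletion this session

    def command(self, line):
        parts = line.split()
        if not parts:
            return "-ERR empty command"
        verb = parts[0].lower()
        if verb == "list":
            # One "number size" line per undeleted message, ended by ".".
            listing = [
                f"{number} {len(text)}"
                for number, text in enumerate(self.messages, 1)
                if number not in self.marked
            ]
            return "\n".join(["+OK"] + listing + ["."])
        if verb in ("retr", "dele"):
            if len(parts) < 2 or not parts[1].isdigit():
                return "-ERR message number required"
            number = int(parts[1])
            if not 1 <= number <= len(self.messages) or number in self.marked:
                return "-ERR no such message"
            if verb == "retr":
                return "+OK\n" + self.messages[number - 1] + "\n."
            self.marked.add(number)
            return "+OK message marked for deletion"
        if verb == "quit":
            # Update phase: flagged messages are actually removed now.
            self.messages = [
                text
                for number, text in enumerate(self.messages, 1)
                if number not in self.marked
            ]
            self.marked.clear()
            return "+OK bye"
        return "-ERR unknown command"


box = Pop3Mailbox(["hello", "world!"])
for cmd in ["list", "retr 1", "dele 1", "list", "quit"]:
    print(f"C: {cmd}")
    print(f"S: {box.command(cmd)}")
```

This is the download-and-delete mode; leaving out the ‘dele’ commands gives download-and-keep.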
  74. 74. IMAP <ul><li>IMAP , defined in RFC 3501, allows folders to be defined on the mail server to organize email there </li></ul><ul><ul><li>Messages are associated with a folder – first the generic INBOX, then moved by the user </li></ul></ul><ul><ul><li>Hence state information about the folder for each message must be saved across sessions </li></ul></ul><ul><li>IMAP also provides search capability within the mailbox </li></ul>INFO 330 Chapter 2
  75. 75. IMAP <ul><li>Users can also get just the headers of messages, and avoid downloading the MIME portion </li></ul><ul><ul><li>Handy when on a low speed connection </li></ul></ul>INFO 330 Chapter 2
  76. 76. Web Email <ul><li>Hotmail (now owned by Microsoft) introduced web-based email shortly after the Web became popular </li></ul><ul><ul><li>Mail is accessed by HTTP not POP3 or IMAP </li></ul></ul><ul><ul><li>But the server-to-server connection is still SMTP </li></ul></ul><ul><li>Very convenient for accessing mail with limited bandwidth or from many locations </li></ul><ul><li>Widely imitated ( Gmail , Yahoo , AOL , etc.) </li></ul>INFO 330 Chapter 2
  77. 77. DNS <ul><li>A key need, once the Internet grew beyond a few thousand hosts, was to automate converting human-readable* hostnames (www.microsoft.com) to IP addresses (207.46.198.60) </li></ul><ul><li>That is the purpose of the Domain Name System (DNS) </li></ul><ul><ul><li>Before DNS, really big lookup tables were used! </li></ul></ul>* Humans who read English, at least! INFO 330 Chapter 2
  78. 78. Host vs Domain Names <ul><li>A hostname is the name of a particular host computer, such as banner.drexel.edu </li></ul><ul><ul><li>May really represent multiple computers, but logically they are all the same host </li></ul></ul><ul><li>A domain name is the top level domain and the specific domain name, like drexel.edu </li></ul><ul><li>Top level domains are com, edu, gov, mil, org, net, and the country codes uk, de, fr, etc. </li></ul>INFO 330 Chapter 2
  79. 79. IP Addresses <ul><li>IPv4 addresses have four bytes, each with a value from 0 to 255, separated by periods </li></ul><ul><ul><li>Why called bytes? Each value from 0 to 255 lies in the range 0 to (2^8 - 1), and a byte is eight bits </li></ul></ul><ul><li>IP addresses are typically static (fixed) for servers and other semi-permanent Internet connections, and dynamic for temporary connections (e.g. dial-up, wireless) </li></ul>INFO 330 Chapter 2
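Since each of the four groups is one byte, a dotted address is just a 32-bit number written byte by byte. A short sketch of that conversion (the function name `ipv4_to_int` is chosen for this example):

```python
def ipv4_to_int(address):
    """Convert a dotted-quad IPv4 address to its 32-bit integer value."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        # Shift the bytes seen so far left by 8 bits, then append this byte.
        value = (value << 8) | octet
    return value


print(ipv4_to_int("207.46.198.60"))
print(ipv4_to_int("255.255.255.255") == 2**32 - 1)
```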
  80. 80. DNS <ul><li>DNS runs over UDP, port 53 (something uses UDP!) </li></ul><ul><li>DNS is managed by DNS servers, typically running Berkeley Internet Name Domain ( BIND ) software </li></ul><ul><li>DNS is used by other applications (HTTP, SMTP, FTP) to translate host names to IP addresses </li></ul><ul><ul><li>You can also do a reverse DNS lookup (convert 205.188.97.2 to www-vd03.evip.aol.com) </li></ul></ul>INFO 330 Chapter 2
  81. 81. Reverse DNS Lookup <ul><li>So if you try to look up a random IP address like 123.45.67.89, dnsstuff.com gives </li></ul><ul><ul><li>The reverse DNS entry for an IP is found by reversing the IP, adding it to &quot;in-addr.arpa&quot;, and looking up the PTR record. So, the reverse DNS entry for 123.45.67.89 is found by looking up the PTR record for 89.67.45.123.in-addr.arpa. </li></ul></ul><ul><ul><ul><li>“ tinnie.arin.net (an authoritative nameserver for 123.in-addr.arpa., which is in charge of the reverse DNS for 123.45.67.89) says that there are no PTR records for 123.45.67.89.” </li></ul></ul></ul>INFO 330 Chapter 2
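The name used for the PTR lookup can be produced with Python's standard `ipaddress` module, as in this sketch. It only constructs the in-addr.arpa name; it does not query any DNS server. The wrapper name `reverse_lookup_name` is made up for the example.

```python
import ipaddress


def reverse_lookup_name(ip):
    """Return the in-addr.arpa name whose PTR record holds the reverse entry."""
    return ipaddress.ip_address(ip).reverse_pointer


print(reverse_lookup_name("123.45.67.89"))
```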
  82. 82. DNS <ul><li>DNS also provides other key services </li></ul><ul><ul><li>Host aliasing allows the true or canonical hostname to have aliases </li></ul></ul><ul><ul><ul><li>When blah.com works to get to www.blah.com, it’s because blah.com is a host alias of www.blah.com </li></ul></ul></ul><ul><ul><li>Mail server aliasing – same concept, but for mail server names </li></ul></ul><ul><ul><li>Load distribution across many servers for the same hostname – so everyone in the world doesn’t use one IP address for microsoft.com </li></ul></ul>INFO 330 Chapter 2
  83. 83. DNS Structure <ul><li>DNS is highly decentralized </li></ul><ul><ul><li>Improves throughput, speed, redundancy, reliability, security </li></ul></ul><ul><li>There are three levels of structure – the job of looking up a given address is partitioned among them </li></ul><ul><ul><li>Root DNS Servers – are 13 sets of servers around the world that provide top level delegation of DNS information </li></ul></ul>INFO 330 Chapter 2
  84. 84. DNS Structure <ul><ul><li>Top-Level Domain (TLD) DNS Servers – sets of servers are maintained for each of the top level domains, including country codes </li></ul></ul><ul><ul><ul><li>Network Solutions Inc maintains the .COM domain </li></ul></ul></ul><ul><ul><li>Authoritative DNS Servers – everyone who has publicly visible web or mail servers has to maintain DNS records </li></ul></ul><ul><ul><ul><li>Drexel, large ISPs, etc. all can maintain DNS servers </li></ul></ul></ul><ul><ul><li>Local DNS servers – are used to forward to the nearest authoritative DNS server </li></ul></ul>INFO 330 Chapter 2
  85. 85. DNS Lookup <ul><li>DNS lookup typically follows the pattern at right </li></ul><ul><ul><li>A request to the local DNS server finds the TLD server from root </li></ul></ul><ul><ul><li>Then get the auth. server from the TLD server, who gives the desired IP address </li></ul></ul>INFO 330 Chapter 2
  86. 86. Recursive vs Iterative Queries <ul><li>DNS queries which ask another server to get the information on the requester’s behalf are recursive </li></ul><ul><ul><li>Query 1 on previous slide is recursive </li></ul></ul><ul><li>DNS queries which get the information directly are iterative </li></ul><ul><ul><li>Queries 2, 4, and 6 are iterative </li></ul></ul><ul><li>All DNS queries can, in general, be recursive or iterative – the example shown is typical </li></ul>INFO 330 Chapter 2
  87. 87. DNS Lookup <ul><li>This would be terribly tedious without caching </li></ul><ul><ul><li>Common queries are stored on each level of DNS server, so they don’t have to be looked up constantly </li></ul></ul><ul><ul><li>Cached values are cleared typically every two days or less, in case the data changes </li></ul></ul>INFO 330 Chapter 2
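A minimal sketch of the caching idea follows: an answer is stored with an expiry time and thrown away once that time passes. The class name `DnsCache` and the explicit `now` parameter (seconds on any clock, passed in to keep the example deterministic) are assumptions for illustration, not how any particular DNS server is written.

```python
class DnsCache:
    """Toy DNS cache: remembers answers until their time to live runs out."""

    def __init__(self):
        self._entries = {}  # hostname -> (ip_address, expiry_time)

    def put(self, name, value, ttl, now):
        """Store an answer that stays valid for ttl seconds after now."""
        self._entries[name] = (value, now + ttl)

    def get(self, name, now):
        """Return the cached answer, or None if it is missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires = entry
        if now >= expires:
            del self._entries[name]  # stale: force a fresh lookup
            return None
        return value


cache = DnsCache()
two_days = 2 * 24 * 60 * 60
cache.put("www.example.com", "192.0.2.10", ttl=two_days, now=0)
print(cache.get("www.example.com", now=3600))          # still cached
print(cache.get("www.example.com", now=two_days + 1))  # expired
```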
  88. 88. DNS Records <ul><li>Data about a hostname, its aliases, domain, and mail servers are captured in resource records (RR) </li></ul><ul><li>Each RR is a line with four fields </li></ul><ul><ul><li>(Name, Value, Type, and TTL) </li></ul></ul><ul><ul><ul><li>Name is a hostname, domain name, or canonical host or mail server name (depending on the Type) </li></ul></ul></ul><ul><ul><ul><li>Value is the IP address, mail server, or canonical hostname for the Name (depending on the Type) </li></ul></ul></ul><ul><ul><ul><li>Type is the record type </li></ul></ul></ul><ul><ul><ul><li>TTL is the time to live: how long the record may stay in a cache before it should be removed (in seconds) </li></ul></ul></ul>INFO 330 Chapter 2
  89. 89. DNS Records <ul><li>DNS RR types are one of several options </li></ul><ul><ul><li>Type=A gives the IP address Value for a hostname Name </li></ul></ul><ul><ul><ul><li>(relay1.bar.foo.com, 145.37.93.126, A) (TTL not shown) </li></ul></ul></ul><ul><ul><li>Type=NS (name server) gives the authoritative DNS server Value for a domain Name </li></ul></ul><ul><ul><ul><li>(foo.com, dns.foo.com, NS) </li></ul></ul></ul><ul><ul><li>Type=CNAME defines the alias Name for the canonical hostname Value </li></ul></ul><ul><ul><ul><li>(foo.com, relay1.bar.foo.com, CNAME) </li></ul></ul></ul>INFO 330 Chapter 2
  90. 90. DNS Records <ul><ul><li>Type=MX gives the canonical mail server Value for an alias hostname Name </li></ul></ul><ul><ul><ul><li>(foo.com, mail.bar.foo.com, MX) </li></ul></ul></ul><ul><ul><li>Most hostnames have many RRs </li></ul></ul>The Start of Authority (SOA) resource record indicates that this DNS name server is the best source of information for the data within this DNS domain INFO 330 Chapter 2
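The example records from these slides can be put in a small table to show how the Type decides what the Value means. This sketch stores (Name, Value, Type) tuples without the TTL, as the slides do, and follows CNAME records until it reaches an A record. The function names and the hop limit are choices made for this example.

```python
# (Name, Value, Type) tuples taken from the examples above; TTL omitted.
RECORDS = [
    ("relay1.bar.foo.com", "145.37.93.126", "A"),
    ("foo.com", "dns.foo.com", "NS"),
    ("foo.com", "relay1.bar.foo.com", "CNAME"),
    ("foo.com", "mail.bar.foo.com", "MX"),
]


def lookup(name, record_type):
    """Return every Value stored for this Name and Type."""
    return [value for n, value, t in RECORDS if n == name and t == record_type]


def resolve(name, max_hops=8):
    """Follow CNAME aliases until an A record gives an IP address."""
    for _ in range(max_hops):  # hop limit guards against alias loops
        addresses = lookup(name, "A")
        if addresses:
            return addresses[0]
        aliases = lookup(name, "CNAME")
        if not aliases:
            return None
        name = aliases[0]  # continue with the canonical hostname
    return None


print(resolve("foo.com"))
print(lookup("foo.com", "MX"))
```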
  91. 91. New resource record types <ul><li>There are type AAAA resource records for IPv6 addresses </li></ul><ul><ul><li>Their syntax is like an A type record turtle.mytrek.com IN AAAA FC00::8:800:200C:417A </li></ul></ul><ul><li>An experimental A6 resource record is used for chains of related IPv6 addresses </li></ul>From Ubuntu Server Admin and Reference , R Peterson, 2009 INFO 330 Chapter 2
  92. 92. DNS Messages <ul><li>The same format DNS messages are used to both query a DNS server, and receive the reply </li></ul><ul><li>The messages have a header section, the question, the answer, a section for other authoritative servers, and possibly additional information (such as A records for mail servers) </li></ul>INFO 330 Chapter 2
  93. 93. nslookup <ul><li>The command nslookup provides basic IP data for a hostname or domain </li></ul><ul><li>nslookup snip.net </li></ul><ul><ul><li>Server: ns2.snip.net </li></ul></ul><ul><ul><li>Address: 209.204.64.3 </li></ul></ul><ul><ul><li>Name: snip.net </li></ul></ul><ul><ul><li>Address: 216.83.103.123 </li></ul></ul>INFO 330 Chapter 2
  94. 94. DNS Changes <ul><li>A registrar makes changes to the DNS database </li></ul><ul><ul><li>The list of registrars is at http://www.internic.net/ (the text is full of typos!) </li></ul></ul><ul><li>Changes to DNS records typically take hours to a couple days to become available – less if lots of people are requesting a new domain </li></ul><ul><ul><li>Likewise, email won’t find you right away </li></ul></ul>INFO 330 Chapter 2
  95. 95. DNS and security <ul><li>DNS is somewhat vulnerable to distributed denial of service (DDoS) attacks </li></ul><ul><ul><li>The Root servers were attacked in 2002, but they block incoming ping messages </li></ul></ul><ul><ul><li>TLD servers are more vulnerable, but local caching would reduce its impact </li></ul></ul><ul><li>Another approach is to send many DNS requests to authoritative servers, and spoof the source as a local DNS server </li></ul>INFO 330 Chapter 2
  96. 96. Peer-to-Peer File Sharing <ul><li>Peer-to-Peer (P2P) file sharing occupies much of the volume of Internet traffic </li></ul><ul><li>It allows a user to find a file on another user’s computer, and download it directly </li></ul><ul><ul><li>Everyone can be client and server, even at the same time </li></ul></ul><ul><ul><li>Napster used a centralized index , but true P2P just indexes the files you will share </li></ul></ul><ul><ul><ul><li>Please don’t share your entire hard drive! </li></ul></ul></ul>INFO 330 Chapter 2
  97. 97. P2P File Distribution <ul><li>P2P can be used to distribute a file from one source (e.g. a new Linux kernel) to hundreds of peer servers </li></ul><ul><li>P2P is inherently scalable </li></ul><ul><ul><li>Client-server file distribution time increases linearly with the number of nodes on the network </li></ul></ul><ul><ul><li>P2P distribution time levels off asymptotically </li></ul></ul>INFO 330 Chapter 2
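The scaling claim can be checked numerically with the usual textbook lower bounds on distribution time. These formulas are an assumption brought in from the standard fluid model, not something stated on the slide: with N peers, file size F, server upload rate u_s, slowest download rate d_min, and every peer uploading at the same rate u, client-server time is at least max(N·F/u_s, F/d_min), and P2P time is at least max(F/u_s, F/d_min, N·F/(u_s + N·u)).

```python
def client_server_time(n, file_size, server_up, min_down):
    """Lower bound on distribution time when the server sends every copy."""
    return max(n * file_size / server_up, file_size / min_down)


def p2p_time(n, file_size, server_up, min_down, peer_up):
    """Lower bound when all n peers also upload at rate peer_up."""
    total_upload = server_up + n * peer_up
    return max(
        file_size / server_up,          # server must send the file once
        file_size / min_down,           # slowest peer must receive it
        n * file_size / total_upload,   # n copies over the pooled upload rate
    )


# Illustrative numbers: F = 1, u = 1, u_s = 10u, d_min = u_s.
for n in (10, 100, 1000):
    print(n, client_server_time(n, 1.0, 10.0, 10.0), p2p_time(n, 1.0, 10.0, 10.0, 1.0))
```

With these numbers the client-server bound grows in proportion to N, while the P2P bound approaches F/u and never exceeds it.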
  98. 98. BitTorrent <ul><li>Bittorrent.org manages the protocol used by most file sharing (30% of all Internet backbone traffic!) </li></ul><ul><ul><li>µTorrent is a commercial version; see also Azureus/Vuze , BitComet , etc. </li></ul></ul><ul><li>A torrent is the set of peers participating in distribution of a file </li></ul><ul><ul><li>A tracker node keeps track of which nodes are in the torrent </li></ul></ul>INFO 330 Chapter 2
  99. 99. BitTorrent <ul><li>When you join a torrent, you identify up to 50 neighboring peers already in the torrent </li></ul><ul><ul><li>Then find what chunks of the file each has, and get the rarest first </li></ul></ul><ul><li>When responding to requests for file chunks, focus on neighbors with the highest data rate </li></ul><ul><ul><li>Peers also send chunks to random neighbors </li></ul></ul><ul><ul><li>In order to get good download rates, must share nicely with others! (no free-riding !) </li></ul></ul>INFO 330 Chapter 2
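The "rarest first" rule can be sketched as a simple count over what the neighbors hold. This is an illustration only, not the BitTorrent client algorithm in full: the function name, the peer labels, and the tie-break by lowest chunk number are assumptions made for the example.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Pick the missing chunk that the fewest neighbors hold.

    my_chunks: set of chunk numbers already downloaded.
    neighbor_chunks: dict mapping peer name -> set of chunk numbers it has.
    Returns a chunk number, or None if no neighbor has anything new.
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks)  # each peer contributes one copy per chunk
    candidates = [chunk for chunk in counts if chunk not in my_chunks]
    if not candidates:
        return None
    # Fewest copies first; ties go to the lowest chunk number.
    return min(candidates, key=lambda chunk: (counts[chunk], chunk))


neighbors = {"peer_a": {0, 1, 2}, "peer_b": {1, 2}, "peer_c": {2, 3}}
print(rarest_first({2}, neighbors))
```

Fetching rare chunks first spreads them to more peers, so the file stays complete even if the few peers holding them leave the torrent.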
  100. 100. Peer-to-Peer File Sharing <ul><li>TCP connections between the computers and FTP make it possible </li></ul><ul><ul><li>The server computer is a transient Web server </li></ul></ul><ul><li>Gnutella has a proprietary protocol (not everything is an RFC!) </li></ul><ul><ul><li>A request for a file produces query flooding to find that file in neighboring peers, and collects query hits; from those hits, an HTTP GET command downloads the file </li></ul></ul>INFO 330 Chapter 2
  101. 101. Peer-to-Peer File Sharing <ul><ul><li>More refined limited scope query flooding is now done to minimize Internet traffic required per user </li></ul></ul><ul><ul><ul><li>Only looks at nearby peers in decreasing numbers </li></ul></ul></ul><ul><ul><li>Gnutella also manages how people find peers on the network ( bootstrapping ), and maintain whether they are still online by pinging them </li></ul></ul><ul><li>KaZaA and Morpheus borrowed from both Napster and Gnutella </li></ul><ul><ul><li>It searches nearby peers, but not all are equal </li></ul></ul><ul><ul><li>Some have higher bandwidth and more to share </li></ul></ul>INFO 330 Chapter 2
  102. 102. Peer-to-Peer File Sharing <ul><ul><li>More powerful peers are group leaders ( super peers ) for those around them, acting like mini hubs of the network </li></ul></ul><ul><ul><ul><li>Group leaders connect via TCP, and map out what’s available from their local peers </li></ul></ul></ul><ul><ul><li>Other tricks include </li></ul></ul><ul><ul><ul><li>Limiting the number of simultaneous downloads </li></ul></ul></ul><ul><ul><ul><li>Giving priority to those who upload more than download </li></ul></ul></ul><ul><ul><ul><li>Download parts of the same file in parallel from multiple sources at once </li></ul></ul></ul>INFO 330 Chapter 2
  103. 103. Skype <ul><li>Skype is a popular P2P Internet telephony app, which goes beyond file distribution and sharing in the P2P world </li></ul><ul><li>Nodes in Skype are in a hierarchical overlay (like the super peer concept), which makes it faster to locate a user </li></ul><ul><li>Skype uses relays to establish calls across NAT-hidden local networks </li></ul>INFO 330 Chapter 2
  104. 104. Peer-to-Peer File Sharing <ul><li>A massive issue for P2P file sharing is the intellectual property rights of the files being shared </li></ul><ul><ul><li>Music and video industry lawyers have claimed enormous losses from file sharing, and have vigorously fought file sharing applications </li></ul></ul><ul><ul><li>Napster, BearShare, Grokster, Morpheus, iMesh, DVDxCopy, KaZaA, and others are involved in such ongoing disputes </li></ul></ul>INFO 330 Chapter 2
