    • INFO 330 Computer Networking Technology I Chapter 2 The Application Layer Glenn Booker INFO 330 Chapter 2
    • Application Layer
      • The Application Layer is the reason the rest of the network exists – to serve applications
      • Most of the software familiar to end users consists of applications
        • Email, FTP, newsgroups, chat, the Web, streaming video, video conferencing, IPTV, etc.
      • We focus first on key concepts related to the Application Layer, then discuss some specific applications in detail
    • Application Layer
      • New applications designed for network implementation need to decide whether the application is based on
        • Client-server architecture
        • Peer to peer (P2P)
        • Or some hybrid combination of the two
    • Client-server Architecture
      • In client-server architecture, the server
        • Handles requests from many clients, and
        • Is generally always available
        • Often has a fixed IP address
      • Clients generally don’t communicate with each other, and may be on or off independently of each other and the server
        • Client-server applications include email, FTP, the Web, remote login
    • Client-server Architecture
      • Complex infrastructure intensive apps might require several types of servers – database, web, etc.
      • Multiple servers may be needed to keep up with the volume of client requests, hence the need for a server farm or data center
    • P2P Architecture
      • P2P architecture assumes the clients are on or off at will, and all are treated equally as potential servers and/or clients
        • Apps include Gnutella, Morpheus, BitTorrent, Kazaa and more
    • P2P Architecture
      • P2P architecture is inherently self-scalable
        • Millions of computers may participate, because each computer adds capacity at the same time it adds possible workload
      • Managing contents of a P2P application can be difficult
        • Only one computer may have a particular file, and there’s no control over when that computer is available
    • P2P Architecture
      • Key challenges in a good P2P app include
        • Being ISP friendly: most residential connections provide far more downstream than upstream bandwidth, and P2P traffic doesn’t fit that asymmetry
        • Security, danger of over-sharing
        • Incentives for people to participate
    • Hybrid Architecture
      • Client-server and P2P combinations exist
        • Napster is the best known for file sharing
          • Files are transferred directly between peers, but the file location and description information is maintained on a central server farm
        • Instant messaging (IM) is also hybrid
          • Chats are all P2P, but logging into the system is centralized
          • Includes ICQ, AOL IM, MSN Messenger, etc.
    • Process Communication
      • Any network application (no matter which architecture) needs to communicate between hosts using processes
        • In this sense, a process is a program running on a client, server, or peer host
        • Processes may communicate with other processes on the same host; this is controlled by the host’s operating system (OS)
        • We are interested in processes that communicate between hosts
    • Process Communication
      • Processes exchange messages
        • The sending or client process creates a message and sends it into the network
        • The receiving or server process gets the message from the network and might reply
      • Notice that “client process” and “server process” refer only to the relative roles in exchanging a message, not to the client-server or other architectures mentioned earlier
    • Sockets
      • A socket is the doorway through which the process sends a message to the network
      • The message goes through a socket on the client process, passes through the network, then enters the server process through another socket
      • A socket bridges the application and transport layers within each host
    • Sockets
      • [Figure omitted] Note: the transport could be UDP on both ends
    • Sockets
      • A socket is the Application Programming Interface (API) between application and the network
        • The API is all the developer sees of the network connection
          • The developer can choose to use TCP or UDP, and maybe tweak a few transport layer parameters
        • Winsock is the Microsoft socket API
    • Addressing Processes
      • For the server process to get the message, it has to be addressed correctly
      • The host address and receiving process are the key parts of the address
        • The host address is its IP address (the 32- or 128-bit address of the host’s network interface)
        • The receiving process is identified by its port number, since many processes can be running at once
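The addressing idea above can be tried on a single machine. The following is a minimal sketch, assuming Python's standard `socket` module and the loopback address; the port is chosen by the operating system purely for the demo, and the message text is made up.

```python
import socket
import threading

HOST = "127.0.0.1"  # loopback IP address, so the demo stays on one machine

# Server side: binding to port 0 lets the OS pick any free port (a demo
# assumption; a real service would listen on a well-known port).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((HOST, 0))
server.listen(1)
port = server.getsockname()[1]

def serve_once():
    # Accept one connection, read the request, and send a reply.
    conn, _addr = server.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"reply to: " + request)

worker = threading.Thread(target=serve_once)
worker.start()

# Client side: the (IP address, port number) pair addresses the server process.
with socket.create_connection((HOST, port)) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

worker.join()
server.close()
print(reply.decode())
```

The client never names the server process directly; the IP address selects the host and the port number selects the process on it.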
    • Addressing Processes
      • [Figure omitted] Sockets send packets; ports listen for them
    • Port Number
      • Port numbers follow default values, set by the IANA, unless specified otherwise
        • 21 = FTP
        • 23 = Telnet
        • 25 = SMTP
        • 53 = DNS
        • 80 = HTTP, http://mine.com implies http://mine.com:80
        • 110 = POP3
        • 194 = IRC, and hundreds more
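As a small illustration of default ports, the sketch below uses Python's standard `urllib.parse` to show that a URL without an explicit port falls back to the protocol default. The `DEFAULT_PORTS` table and the `effective_port` helper are hypothetical names introduced here, filled in from the list above.

```python
from urllib.parse import urlsplit

# Default ports copied from the list above (keys are illustrative labels).
DEFAULT_PORTS = {"ftp": 21, "telnet": 23, "smtp": 25, "dns": 53,
                 "http": 80, "pop3": 110, "irc": 194}

def effective_port(url):
    """Return the explicit port in the URL, or the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]

implied = effective_port("http://mine.com")        # no port given
explicit = effective_port("http://mine.com:8080")  # port overrides the default
print(implied, explicit)
```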
    • More Protocols
      • Application-layer protocols define how a particular application’s processes are structured
        • What types of messages are allowed
        • The syntax of those messages
        • The meaning of the fields in the syntax
        • Rules for processing messages – when and how to send messages, how to reply, etc.
    • Application vs its protocols
      • A single application often needs to use several application-layer protocols
        • A web browser might use HTTP, but also FTP, telnet, gopher, etc.
        • An email application might use POP3, SMTP, IMAP, etc.
      • Many app protocols are defined in RFCs
        • But some application-layer protocols are proprietary
    • RFC Summary
      • For an RFC which lists the current RFC standards, look in the RFC Index for “Internet Official Protocol Standards”
        • The current one is RFC 5000, dated May 2008
    • Application Services
      • The transport layer connects the application layer to everything else
      • Have a choice of two protocols, TCP and UDP, unless you want to write your own!
      • Key services include
        • Reliable data transfer – how important is it? Or is your app loss-tolerant?
    • Application Services
      • How much bandwidth or throughput does your app need?
        • Does sending rate have to equal receiving rate?
        • Some apps are elastic – can tolerate wide ranges of available bandwidth
      • How sensitive is your app to timing?
        • Games and telephony tend to be sensitive to slow or erratic transmission delays
      • How important is security?
    • TCP Services
      • TCP provides a connection-oriented service, where the sockets of the client and server recognize a connection for the duration of the session
        • Connection is duplex – messages can go both ways at once
        • TCP is highly reliable – the bits leaving one side all get to the other side, and get put back in the original order
    • TCP Services
      • TCP also provides congestion control, for benefit of the Internet
        • This throttles the sending processes when the connection is congested, and can limit bandwidth
      • TCP does not guarantee any level of transmission rate, or provide delay guarantees
      • So you’ll get your data across, but we don’t know when
    • UDP Services
      • UDP is a lightweight protocol – meaning it doesn’t do much!
        • UDP is connectionless
        • UDP is unreliable – data may never get there
        • UDP packets may arrive out of order and not realize it
        • There are no transmission rate guarantees
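To make "connectionless" concrete, here is a minimal sketch with Python's standard `socket` module: no connection is set up, and the sender simply addresses each datagram. The loopback address, OS-chosen port, and timeout value are demo assumptions; on a real network the datagram might never arrive, which is why the receive is guarded by a timeout.

```python
import socket

# Receiver: bind a UDP socket; port 0 lets the OS choose a free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # UDP gives no delivery guarantee, so don't wait forever
address = receiver.getsockname()

# Sender: no handshake, just address the datagram and send it.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"datagram 1", address)

try:
    data, _source = receiver.recvfrom(1024)
except socket.timeout:
    data = None  # the application itself must decide what to do about loss
finally:
    sender.close()
    receiver.close()

print(data)
```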
    • Services NOT Provided
      • TCP and UDP do not provide guarantees of throughput or timing
      • TCP does nothing for security per se, but SSL can be layered on top of TCP to add it
        • See Chapter 7 for INFO 331
    • Application Protocols
      • We’ll examine protocols for Internet-based applications
        • HTTP
        • FTP
        • SMTP
        • POP3
        • IMAP
        • DNS
    • The Web and HTTP
      • Through the 1980’s, the Internet was used mostly for remote login, file transfer, newsgroups, and email
      • The World Wide Web changed all that, and made the Internet visible to the public
        • Comparable in significance to inventing movable type, the telephone, radio, or TV
        • Web provides demand-based information, vs. broadcast info on radio and TV
    • HTTP
      • The HyperText Transfer Protocol (HTTP) is the heart of the Web
        • Defined by RFCs 1945 (v1.0) and 2616 (v1.1)
        • Has client and server programs which communicate via HTTP messages
      • Web pages contain objects – files of various sorts, such as a base HTML file, which cites JPG and/or GIF images, etc.
      • App to use HTTP is a browser
    • HTTP
      • A Web server houses the objects
        • Apache and Microsoft Internet Information Services (IIS) are common Web server apps
      • HTTP defines the messages that pass between client and server
        • Uses TCP for transport protocol
        • HTTP has no memory of previous actions (a stateless protocol) – so if you ask for a file 126 times, it will send the file 126 times
    • HTTP
      • HTTP can use persistent or non-persistent connections – persistent is the default, but non-persistent can be specified
      • A non-persistent connection to get a web page might work like this:
        • Client requests a TCP connection to web server on port 80
        • Client requests the HTML page
        • Server retrieves the HTML page, and sends it
    • HTTP
        • Server closes the TCP connection
        • Client closes the TCP connection
        • Client reads the HTML file, and finds 10 JPGs referenced
        • Client repeats the connection, request, and response steps ten times (!) to download each of the JPG images
      • Not very efficient!
      • Browser can determine how many parallel TCP connections are used (typically 5-10)
    • More Delays!
      • How long does this process take?
        • The round-trip time (RTT) is for a packet to go from client to server and back
        • Includes propagation delays, queuing delays, processing delays
      • The first two parts of the TCP three-way handshake are messages between client (C) and server (S): C-S, S-C; the third part travels with the request
      • Then request the file (C-S), and get the file from the server (S-C)
    • RTT Delay
      • So the time for getting one file is two times the RTT, plus the transmission time for sending the file from the server (Fig. 2.7, p. 104, 5th ed.)
      • In the non-persistent connection example, this is done 11 times for one HTML file and 10 JPGs
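The arithmetic above can be written out as a short sketch. The RTT and transmission-time values are made-up numbers for illustration, and the model assumes every object takes the same time to transmit and that objects are fetched one after another.

```python
def fetch_time(rtt, transmit_time):
    """Time for one object over a fresh TCP connection: one RTT for the
    handshake, one RTT for request/response, plus the transmission time."""
    return 2 * rtt + transmit_time

RTT = 0.100       # seconds, assumed
TRANSMIT = 0.020  # seconds per object, assumed equal for all objects

one_object = fetch_time(RTT, TRANSMIT)
whole_page = 11 * one_object  # one HTML file plus 10 JPGs, fetched serially

print(f"one object: {one_object:.2f} s, whole page: {whole_page:.2f} s")
```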
    • Persistent Connection
      • If there’s a persistent connection, the TCP connection stays, so the handshake is done once not only for the web page in the example, but for many HTTP requests
        • Connection is closed after some period of inactivity
      • Persistent connections can be with or without pipelining
    • Persistent Connection
      • Without pipelining, the client requests a new object only after the previous request has been filled
      • With pipelining, the client requests new objects as needed, and may be waiting for several responses at once
        • This is the default setting for web browsers
        • Could reduce the delay to as little as one RTT for all the referenced objects of a web page, vs. 22 RTTs for fetching the page and its 10 images one at a time over non-persistent connections!
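A rough way to compare the three strategies is to count round trips only. This is a simplified sketch: it ignores transmission time and parallel connections, and it counts the handshake and the base HTML request separately from the referenced objects. The function names are hypothetical.

```python
def rtts_non_persistent(num_objects):
    # Every object pays one RTT for the handshake and one for request/response.
    return 2 * num_objects

def rtts_persistent(num_referenced, pipelined):
    handshake = 1   # done once for the whole connection
    base_page = 1   # the HTML file must arrive before its references are known
    # Pipelined requests go out back to back; otherwise one RTT per object.
    referenced = 1 if pipelined else num_referenced
    return handshake + base_page + referenced

serial = rtts_non_persistent(11)                  # HTML file + 10 JPGs
no_pipeline = rtts_persistent(10, pipelined=False)
pipeline = rtts_persistent(10, pipelined=True)
print(serial, no_pipeline, pipeline)
```

Under these assumptions the referenced objects cost a single RTT with pipelining, while the connection setup and base page still add their own round trips.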
    • HTTP vs HTML
      • Don’t confuse HTTP with HTML
        • HTTP is the protocol used to define how files are requested and transferred between server and clients
        • HTML is the format of web pages
      • So an HTML file might be the structure of an entity body transferred using HTTP
    • HTTP Messages
      • HTTP messages are two types, request messages (from client) and response messages (from server)
        • The start-line and headers of every HTTP message are plain ASCII text; the message body may carry binary data
          • ‘Both types of message consist of a start-line, zero or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body.’ [RFC 2616, para 4.1]
          • CRLF is a “carriage return and line feed”
    • HTTP Messages
      • There are many headers which could appear in requests or responses
        • Cache-Control, Connection, Date, Pragma, Trailer, Transfer-Encoding, Upgrade, Via, and/or Warning [RFC 2616, para 4.5]
        • Disclaimer : RFC 2616 is 176 pages long – so we’re just providing a summary of where to look for info if you’re curious about the details of these messages
    • HTTP Requests
      • Request messages have variable number of lines, depending on the method called
      • General request syntax is
        • Method Request-URI HTTP-Version
        • Methods are OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, or CONNECT [RFC 2616, para 5.1.1]
          • Most commonly used is GET
        • Request-URI is the desired Uniform Resource Identifier (URI, commonly called a URL)
    • HTTP Requests
        • HTTP-Version is what it sounds like, e.g. HTTP/1.1
      • There are many possible request headers
        • Accept, Accept-Charset, Accept-Encoding, Accept-Language, Authorization, Expect, From, Host, If-Match, If-Modified-Since, If-None-Match, If-Range, If-Unmodified-Since, Max-Forwards, Proxy-Authorization, Range, Referer, TE (extension transfer-codings), and/or User-Agent [RFC 2616, para 5.3]
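The request syntax described above can be assembled by hand. This is a minimal sketch: `build_request` is a hypothetical helper, and the URI, host name, and User-Agent value are made up for illustration.

```python
CRLF = "\r\n"  # carriage return and line feed

def build_request(method, uri, headers, version="HTTP/1.1"):
    """Assemble a request: request line, header lines, then an empty line."""
    lines = [f"{method} {uri} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The extra CRLF produces the empty line that ends the header section.
    return CRLF.join(lines) + CRLF + CRLF

request = build_request("GET", "/index.html",
                        {"Host": "mine.com", "User-Agent": "demo/0.1"})
print(request)
```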
    • HTTP Responses
      • HTTP responses go from server to client
      • General syntax starts with
        • HTTP-Version Status-Code Reason-Phrase [RFC 2616, para 6.1]
        • The Status-Code could be dozens of values
          • "200" OK
          • "403" Forbidden
          • "404" Not Found
        • The Reason-Phrase is any text phrase assigned
    • HTTP Responses
      • Response headers can include
        • Accept-Ranges, Age, ETag, Location, Proxy-Authenticate, Retry-After, Server, Vary, and/or WWW-Authenticate [RFC 2616, para 6.2]
      • Responses usually include entities, unless the HEAD method was used
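Going the other way, a response can be split into its status line, headers, and entity body. The sketch below is deliberately simplified (it ignores repeated headers and treats header names as case-sensitive), and the sample response bytes are invented.

```python
def parse_response(raw):
    """Split raw response bytes into status line fields, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # empty line ends the headers
    lines = head.decode("ascii").split("\r\n")
    version, status_code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(status_code), reason, headers, body

raw = (b"HTTP/1.1 404 Not Found\r\n"
       b"Server: demo\r\n"
       b"Content-Length: 9\r\n"
       b"\r\n"
       b"Not here.")
version, status, reason, headers, body = parse_response(raw)
print(version, status, reason, headers, body)
```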
    • HTTP Entities
      • An entity is the object sent or returned with an HTTP message
      • Entities can be with requests or responses
        • Entity headers include Allow, Content-Encoding, Content-Language, Content-Length (bytes), Content-Location, Content-MD5, Content-Range, Content-Type, Expires, Last-Modified, and/or extension-header [RFC 2616, para 7.1]
          • Where extension-header is any allowable message-header for that kind of message
    • HTTP
      • So HTTP describes request and response message formats
        • Both types typically have a first line which tells its purpose (the request or status line)
        • There can be many header lines
        • There might be an entity attached
    • Cookies!
      • HTTP is stateless
      • But some would like to remember a little information about web site visitors, hence cookies were defined with RFC 2965
      • Cookies require four parts
        • A cookie header in HTTP responses
        • A cookie header in HTTP requests
        • Cookie files on the user’s computer
        • A database on the web server
    • Cookies
      • When a user visits a cookied web site the first time, they are assigned a unique ID number, which is stored in the database
      • A Set-cookie header is included in the response to convey that ID number
        • Set-cookie: 1678
      • All subsequent HTTP interaction with that site, even years later, will flag that cookie number and identify the user
    • Cookies
        • Cookie: 1678
      • This provides a way for web sites to automate login for repeat customers, and track browsing and spending patterns
        • One-click shopping is only possible with cookies
        • The price for convenience is the lack of privacy
      • Ads on web sites can be targeted to match the user’s preferences
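The four parts of the cookie mechanism can be mimicked in memory. This is a toy sketch, not how any real server or browser stores cookies: the `CookiedSite` class, the site name, and the paths are hypothetical, and the starting ID 1678 is borrowed from the example above.

```python
import itertools

class CookiedSite:
    """Toy web site that assigns visitor IDs and tracks pages per ID."""

    def __init__(self):
        self._next_id = itertools.count(1678)
        self.database = {}  # server-side database: ID -> pages visited

    def handle(self, path, cookie=None):
        response_headers = {}
        if cookie is None or cookie not in self.database:
            # First visit: assign an ID and send it in a Set-cookie header.
            cookie = str(next(self._next_id))
            self.database[cookie] = []
            response_headers["Set-cookie"] = cookie
        self.database[cookie].append(path)
        return response_headers

site = CookiedSite()
cookie_file = {}  # browser-side cookie file: site name -> cookie value

first = site.handle("/home", cookie_file.get("shop.example"))
cookie_file["shop.example"] = first["Set-cookie"]

# Later requests carry the stored value in a Cookie header.
second = site.handle("/cart", cookie_file.get("shop.example"))
print(first, second, site.database)
```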
    • Other HTTP Content
      • So far we assumed the file content for HTTP was HTML files, JPGs, GIFs, etc.
      • Entities can be many other file formats
        • XML files, which are structured text
        • VoiceXML, WML (web pages for mobile phones), streaming audio and video, and P2P file sharing
    • Web Caching
      • A Web cache, or proxy server, acts as an intermediary between clients and servers
        • The cache stores recently used files, so they don’t have to be requested again
        • The cache acts as client and server
      • ISPs typically use web caching to cut down on outgoing web traffic (to the servers) and lower request response time
    • Web Caching
      • Tends to work well when the client-cache connection is faster than the cache-server connection
      • Often helps avoid upgrading the cache-server connection speed, which saves money
      • Implement by using a conditional GET method in HTTP
        • With the If-Modified-Since request header
        • If the cached copy is still current, the server answers 304 Not Modified and the file is not downloaded again
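The conditional GET can be simulated without a network. In this sketch the origin server is a dictionary, the cache always revalidates with the origin (a real cache would also serve fresh copies without asking), and the path, dates, and byte strings are invented.

```python
from datetime import datetime, timezone

# Hypothetical origin server content: path -> (last modified, body).
origin = {"/logo.gif": (datetime(2008, 5, 1, tzinfo=timezone.utc), b"GIF-v1")}

def origin_get(path, if_modified_since=None):
    """Return (status, last_modified, body); 304 carries no body."""
    last_modified, body = origin[path]
    if if_modified_since is not None and last_modified <= if_modified_since:
        return 304, last_modified, b""
    return 200, last_modified, body

cache = {}  # path -> (last modified, body)

def cache_get(path):
    cached = cache.get(path)
    since = cached[0] if cached else None  # If-Modified-Since value, if any
    status, last_modified, body = origin_get(path, since)
    if status == 200:
        cache[path] = (last_modified, body)
    return status, cache[path][1]

first = cache_get("/logo.gif")    # cache miss: full download
second = cache_get("/logo.gif")   # unchanged: origin says 304, cache serves it
origin["/logo.gif"] = (datetime(2008, 6, 1, tzinfo=timezone.utc), b"GIF-v2")
third = cache_get("/logo.gif")    # changed at the origin: downloaded again
print(first, second, third)
```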
    • FTP
      • The File Transfer Protocol is one of the oldest Internet applications (now RFC 959, but started as RFC 114 in 1971)
      • While HTTP and FTP both send files, they use connections differently
        • FTP uses two connections – one for control, one for data (control information is out-of-band)
          • User login and commands are on the control connection, files move on the data connection
        • HTTP uses one connection for both purposes (control information is in-band)
    • FTP
      • FTP uses TCP, and usually connects to the server on ports 20 and 21
      • The client sends user ID and password
        • FTP may be done to some sites with generic ID, known as anonymous FTP
      • Once logged in, the user may navigate and view directories, and upload (STOR or PUT) or download (RETR or GET) files
    • FTP
      • Commands and replies are very basic
        • Most commands are three or four-letter abbreviations
        • Replies are three-digit codes, followed by text
      • Command connection is based on Telnet, incidentally [RFC 959, para 2.3]
      • Due to its age, FTP has provisions for a huge range of data types (ASCII or EBCDIC) and file, record, and page structures
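The reply format (a three-digit code followed by text) is simple enough to parse in a few lines. This sketch handles single-line replies only; `parse_reply` is a hypothetical helper, and the sample line uses the 331 reply text from RFC 959.

```python
def parse_reply(line):
    """Split an FTP reply into its three-digit code and the text after it."""
    code, _, text = line.partition(" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an FTP reply: {line!r}")
    return int(code), text

code, text = parse_reply("331 User name okay, need password.")
print(code, text)
```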
    • Electronic Mail
      • E-mail is another ancient Internet application, with origins in RFC 772 in 1980
      • It provides asynchronous text communication and allows files to be attached to messages
        • Even voice and video messages
      • Main elements are users (sender and recipient), mail servers, and the Simple Mail Transfer Protocol (SMTP, RFC 2821)
        • Careful, there’s also an SNTP for network time
    • Electronic Mail
      • Email is composed in a client, which sends it to a mail queue in the sender’s mail server
      • The sending mail server uses SMTP to send the message to the recipient’s mail server
        • If mail can’t be sent successfully, the sender’s mail server will put the message in a queue, and keep trying (typically for 3 days)
      • The recipient is notified that the message is present, which they read with their client
    • Electronic Mail
      • Each user has a mailbox on the mail server
        • Access to the mailbox is controlled with user name and password
      • SMTP is the main protocol to get email from one mail server to another
        • It uses TCP, not surprisingly
        • Defined in proposed standard RFC 2821
        • Only uses 7-bit ASCII for message headers AND body
          • Forces binary files to be converted to ASCII & back
    • SMTP
      • After the TCP connection to port 25 of the recipient’s mail server is established, SMTP does its own handshake, in which client and server introduce themselves
      • The client then sends the message
      • Multiple messages can be sent if needed, then the connection is closed
      • Client commands include HELO, MAIL FROM:, RCPT TO:, DATA (then the message body), and QUIT
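The order of the client commands can be sketched as a function that produces the lines a client would send. Server replies are left out, and the host name and addresses are placeholders; the single period on its own line is how SMTP marks the end of the message data.

```python
def smtp_commands(client_host, sender, recipients, body_lines):
    """Return the lines an SMTP client sends for one message, in order."""
    commands = [f"HELO {client_host}", f"MAIL FROM:<{sender}>"]
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    commands.append("DATA")
    commands += body_lines
    commands.append(".")     # a line with a single period ends the message
    commands.append("QUIT")
    return commands

dialogue = smtp_commands("client.example.com", "alice@example.com",
                         ["bob@example.org"], ["Subject: Hi", "", "Hello Bob"])
print("\n".join(dialogue))
```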
    • SMTP
      • Other commands include (with comments in parentheses)
        • RSET (abort current transaction)
        • SEND FROM:<reverse-path>
        • SOML FROM:<reverse-path> (send or mail)
        • SAML FROM:<reverse-path> (send and mail)
        • VRFY <string> (verify a user name)
        • EXPN <string> (expand mailing list)
        • HELP [ <string>]
        • NOOP (just send an OK reply)
        • TURN (your turn to be client or server)
    • SMTP vs HTTP
      • SMTP and HTTP can both move files using persistent TCP connections
        • SMTP pushes messages to the recipient’s mail server
          • HTTP pulls contents when desired from a web server
        • SMTP incorporates attachments into the body of the message as one big object
          • HTTP downloads attachments in separate responses
        • SMTP requires messages in 7-bit ASCII text
          • HTTP doesn’t
    • Mail Message Formats
      • Email contains header information defined by RFC 822 (Standard for ARPA Internet Text Messages), now RFC 5322
        • The sender headers can include: FROM, SENDER, REPLY-TO, RESENT-FROM, RESENT-SENDER, and RESENT-REPLY-TO
        • Receiver headers can be: TO, CC, and BCC
        • Reference headers can be: MESSAGE-ID, IN-REPLY-TO, REFERENCES and KEYWORDS
    • Mail Message Formats
        • Other allowable header fields are: SUBJECT, COMMENTS, ENCRYPTED, and possibly some extension fields or user-defined fields
      • While many of these headers also sound like SMTP commands, they are part of the email message
      • This works fine for ASCII data
        • For anything outside of that, call a MIME
    • MIME
      • Multipurpose Internet Mail Extensions (MIME) are used for handling non-ASCII contents in email, e.g. non-Latin character sets, binary files, images, audio, video, etc.
      • MIME (RFC 2045) adds the ability to handle
        • (1) textual message bodies in character sets other than US-ASCII, (2) an extensible set of different formats for non-textual message bodies, (3) multi-part message bodies, and (4) textual header information in character sets other than US-ASCII.
    • MIME
      • The three key parts of MIME are defining the version of MIME, the encoding scheme, and the type of content
        • MIME-Version: 1.0
        • Content-Transfer-Encoding: can be "7bit" / "8bit" / "binary" / "quoted-printable" / "base64"
        • Content-Type: describes the type and subtype
          • Type is discrete ("text" / "image" / "audio" / "video" / "application") or composite ("message" / "multipart")
    • MIME
          • Subtype is an ietf-token (an extension token defined by a standards-track RFC and registered with IANA) or an X-token (the two characters "X-" or "x-" followed, with no intervening white space, by an ASCII text string)
      • There are many other variations of type and subtype (see RFC 2046), including for
        • Other character sets (Content-type: text/plain; charset=iso-8859-1), or proprietary formats (image/JPEG, application/postscript, etc.)
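Python's standard `email` package produces these MIME headers, which gives an easy way to see them. A minimal sketch, assuming made-up addresses and a fake image payload:

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Picture"
msg.set_content("See the attached image.")  # text/plain body

# Attaching binary data turns the message into a multipart message, and the
# bytes are encoded as ASCII text (base64) so they can travel through SMTP.
msg.add_attachment(b"\x89PNG fake image bytes",
                   maintype="image", subtype="png", filename="pic.png")

attachment = next(msg.iter_attachments())
mime_version = msg["MIME-Version"]
top_type = msg.get_content_type()
part_type = attachment.get_content_type()
part_encoding = attachment["Content-Transfer-Encoding"]
print(mime_version, top_type, part_type, part_encoding)
```

Printing `msg.as_string()` shows the full message, including the boundary lines that separate the parts.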
    • MIME
      • The received message also includes a Received: header added to the top of the message
      • This is familiar in email if you look at the full headers
    • Uuencode and uudecode
      • Historic note:
        • Before MIME, uuencode was used to convert non-ASCII files to text
          • Doing so expanded the file in size about 35%, because every 3 bytes of input become 4 printable characters, plus control information
        • Uudecode reversed the operation after the file was received
        • These commands still exist under UNIX
    • Mail Access Protocols
      • If you log directly into your email server, SMTP is all you need to handle email
      • But if you wish to access email from a local host, you need to use a mail access protocol
      • The biggies at present are
        • Post Office Protocol version 3 (POP3) and
        • Internet Message Access Protocol (IMAP)
    • POP3
      • POP3 is defined in RFC 1939
        • It’s a pretty simple protocol compared to many
      • SMTP sends mail between mail servers, and from the user agent (email app) to their mail server
      • POP3 transfers mail from your mail server to your user agent
      • From a user’s view, SMTP handles outgoing email, and POP3 handles incoming email
    • POP3
      • POP3 uses TCP, and connects to port 110 on the mail server
      • POP3 does three things – authorization, transaction, and update
        • Authorization verifies the user identity
        • Transaction retrieves email, marks messages for deletion, and gets mail statistics
        • Update ends the session, and deletes flagged messages
    • POP3
      • POP3 communicates with the mail server by commands, which get a +OK response if it worked, and an -ERR response if it didn’t work
        • Authorization uses commands ‘user’ and ‘pass’
        • Transaction uses commands
          • ‘list’ to see list of messages
          • ‘dele x’ to delete message number x
          • ‘retr x’ to retrieve message number x
          • ‘quit’ ends the session
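The transaction and update behaviour can be imitated with a small in-memory mailbox. This is a toy sketch, not a protocol implementation: authorization is skipped, the class and method names are hypothetical, and the reply texts after +OK and -ERR are invented.

```python
class Pop3Mailbox:
    """Toy mailbox: messages are only marked during the transaction phase
    and actually removed when the session ends (the update phase)."""

    def __init__(self, messages):
        self.messages = dict(enumerate(messages, start=1))
        self.marked = set()

    def list_messages(self):
        return [(n, len(body)) for n, body in self.messages.items()
                if n not in self.marked]

    def retr(self, n):
        if n not in self.messages or n in self.marked:
            return "-ERR no such message"
        return "+OK " + self.messages[n]

    def dele(self, n):
        if n not in self.messages or n in self.marked:
            return "-ERR no such message"
        self.marked.add(n)
        return "+OK message marked for deletion"

    def quit(self):
        for n in self.marked:
            del self.messages[n]  # update phase: flagged messages are deleted
        self.marked.clear()
        return "+OK"

mailbox = Pop3Mailbox(["Hi Bob", "Lunch at noon?"])
listing = mailbox.list_messages()
first = mailbox.retr(1)
deleted = mailbox.dele(1)
again = mailbox.retr(1)   # marked messages can no longer be retrieved
mailbox.quit()
remaining = mailbox.messages
print(listing, first, deleted, again, remaining)
```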
    • POP3
      • POP3 allows two modes, depending on whether you delete the messages after retrieving them
        • If you download-and-delete messages from the server, you only download them to one local host
        • If you download-and-keep the messages on the server, then you can download them to more than one local host (e.g. home and work)
          • Disadvantage is that the volume of mail on the server can be too big
    • POP3
      • POP3 maintains a little state information during a session, such as which messages have been marked for deletion
      • However after a session is over, all state information is gone
        • This makes a POP3 server a fairly simple beast
      • Users use folders locally (on their email app) to store and organize messages
    • IMAP
      • IMAP, defined in RFC 3501, allows folders to be defined on the mail server to organize email there
        • Messages are associated with a folder – first the generic INBOX, then moved by the user
        • Hence state information about the folder for each message must be saved across sessions
      • IMAP also provides search capability within the mailbox
    • IMAP
      • Users can also get just the headers of messages, and avoid downloading the MIME portion
        • Handy when on a low speed connection
    • Web Email
      • Hotmail (now owned by Microsoft) introduced web-based email shortly after the Web became popular
        • Mail is accessed by HTTP not POP3 or IMAP
        • But the server-to-server connection is still SMTP
      • Very convenient for accessing mail with limited bandwidth or from many locations
      • Widely imitated (Gmail, Yahoo, AOL, etc.)
    • DNS
      • A key need, once the Internet grew beyond a few thousand hosts, was to automate converting human* readable addresses or hostnames (www.microsoft.com) to IP addresses (207.46.198.60)
      • That is the purpose of the Domain Name System (DNS)
        • Before DNS, really big lookup tables were used!
      * Humans who read English, at least!
    • Host vs Domain Names
      • A hostname is the name of a particular host computer, such as banner.drexel.edu
        • May really represent multiple computers, but logically they are all the same host
      • A domain name is the top level domain and the specific domain name, like drexel.edu
      • Top level domains are com, edu, gov, mil, org, net, and the country codes uk, de, fr, etc.
    • IP Addresses
      • IPv4 addresses are written as four bytes, each from 0 to 255, separated by periods
        • Why called bytes? Each value from 0 to 255 is a value from 0 to (2^8 - 1), and a byte is eight bits
      • IP addresses are typically static (fixed) for servers and other semi-permanent Internet connections, and dynamic for temporary connections (e.g. dial-up, wireless)
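To see why four bytes make a 32-bit address, the dotted form can be folded into a single integer. A minimal sketch; `dotted_to_int` is a hypothetical helper, checked here against Python's standard `ipaddress` module.

```python
import ipaddress

def dotted_to_int(address):
    """Combine the four bytes of a dotted IPv4 address into one 32-bit value."""
    value = 0
    for part in address.split("."):
        byte = int(part)
        if not 0 <= byte <= 255:
            raise ValueError(f"byte out of range: {part}")
        value = value * 256 + byte  # shift left by 8 bits, then add the byte
    return value

as_int = dotted_to_int("207.46.198.60")
from_stdlib = int(ipaddress.IPv4Address("207.46.198.60"))
print(as_int, as_int.bit_length() <= 32)
```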
    • DNS
      • DNS runs over UDP, port 53 (something uses UDP!)
      • DNS is managed by DNS servers, typically running Berkeley Internet Name Domain (BIND) software
      • DNS is used by other applications (HTTP, SMTP, FTP) to translate host names to IP addresses
        • You can also do a reverse DNS lookup (convert 205.188.97.2 to www-vd03.evip.aol.com)
    • Reverse DNS Lookup
      • So if you try to look up a random IP address like 123.45.67.89, dnsstuff.com gives
        • The reverse DNS entry for an IP is found by reversing the IP, adding it to "in-addr.arpa", and looking up the PTR record. So, the reverse DNS entry for 123.45.67.89 is found by looking up the PTR record for 89.67.45.123.in-addr.arpa.
          • “tinnie.arin.net (an authoritative nameserver for 123.in-addr.arpa., which is in charge of the reverse DNS for 123.45.67.89) says that there are no PTR records for 123.45.67.89.”
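Building the name that a reverse lookup queries is a matter of reversing the bytes. The sketch below only constructs the name and performs no DNS query; `reverse_name` is a hypothetical helper, compared against the `reverse_pointer` attribute of Python's standard `ipaddress` module.

```python
import ipaddress

def reverse_name(ip):
    """Reverse the four bytes and append the in-addr.arpa suffix."""
    return ".".join(reversed(ip.split("."))) + ".in-addr.arpa"

name = reverse_name("123.45.67.89")
from_stdlib = ipaddress.ip_address("123.45.67.89").reverse_pointer
print(name)
```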
    • DNS
      • DNS also provides other key services
        • Host aliasing allows the true or canonical hostname to have aliases
          • When blah.com works to get to www.blah.com, it’s because blah.com is a host alias of www.blah.com
        • Mail server aliasing – same concept, but for mail server names
        • Load distribution across many servers for the same hostname – so everyone in the world doesn’t use one IP address for microsoft.com
    • DNS Structure
      • DNS is highly decentralized
        • Improves throughput, speed, redundancy, reliability, security
      • There are three levels of structure – the job of looking up a given address is partitioned among them
        • Root DNS Servers – are 13 sets of servers around the world that provide top level delegation of DNS information
    • DNS Structure
        • Top-Level Domain (TLD) DNS Servers – sets of servers are maintained for each of the top level domains, including country codes
          • Verisign (which took over the registry from Network Solutions Inc) maintains the .com domain
        • Authoritative DNS Servers – everyone who has publicly visible web or mail servers has to maintain DNS records
          • Drexel, large ISPs, etc. all can maintain DNS servers
        • Local DNS servers – are used to forward to the nearest authoritative DNS server
    • DNS Lookup
      • DNS lookup typically follows the pattern at right
        • The host's request goes to the local DNS server, which asks a root server and is referred to the TLD server
        • The local server then gets the authoritative server from the TLD server, and the authoritative server gives the desired IP address
    • Recursive vs Iterative Queries
      • A recursive query asks another server to obtain the answer on the requester's behalf
        • Query 1 on previous slide is recursive
      • An iterative query gets a reply directly from the server queried (either the answer or a referral to another server)
        • Queries 2, 4, and 6 are iterative
      • All DNS queries can, in general, be recursive or iterative – the example shown is typical
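A toy simulation may help separate the two query styles. This is a sketch under stated assumptions, not real DNS: the three dictionaries stand in for root, TLD, and authoritative servers, and every server name, hostname, and address in it is made up.

```python
# Toy stand-ins for the server hierarchy (all names and addresses are invented).
ROOT = {"com": "tld-com"}                                   # root refers to TLD servers
TLD = {"tld-com": {"example.com": "auth-example"}}          # TLD refers to authoritative servers
AUTH = {"auth-example": {"www.example.com": "192.0.2.10"}}  # authoritative servers answer


def local_dns_resolve(hostname, log):
    """Answer the host's recursive query by issuing iterative queries."""
    labels = hostname.split(".")
    top_level = labels[-1]
    domain = ".".join(labels[-2:])

    log.append("host -> local server (recursive)")
    tld_server = ROOT[top_level]
    log.append("local -> root (iterative): referred to " + tld_server)
    auth_server = TLD[tld_server][domain]
    log.append("local -> TLD (iterative): referred to " + auth_server)
    address = AUTH[auth_server][hostname]
    log.append("local -> authoritative (iterative): answer " + address)
    return address


query_log = []
resolved = local_dns_resolve("www.example.com", query_log)
```

The host makes one request and waits, while the local server does the legwork: three iterative queries, each answered directly with a referral or the final address.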
    • DNS Lookup
      • This would be terribly tedious without caching
        • Common queries are stored on each level of DNS server, so they don’t have to be looked up constantly
        • Cached values are cleared typically every two days or less, in case the data changes
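Caching with expiry can be sketched as a small Python class. The class name `DnsCache` and its methods are hypothetical, and the current time is passed in explicitly so the behaviour is easy to follow; a real cache would read the system clock.

```python
class DnsCache:
    """Minimal hostname-to-address cache whose entries expire."""

    def __init__(self):
        self._entries = {}  # hostname -> (address, expiry time in seconds)

    def put(self, hostname, address, ttl_seconds, now):
        self._entries[hostname] = (address, now + ttl_seconds)

    def get(self, hostname, now):
        entry = self._entries.get(hostname)
        if entry is None:
            return None              # never cached: a full lookup is needed
        address, expires_at = entry
        if now >= expires_at:
            del self._entries[hostname]
            return None              # expired: look it up again in case it changed
        return address


TWO_DAYS = 2 * 24 * 60 * 60          # 172800 seconds

cache = DnsCache()
cache.put("www.example.com", "192.0.2.10", ttl_seconds=TWO_DAYS, now=0)
fresh = cache.get("www.example.com", now=3600)          # one hour later
stale = cache.get("www.example.com", now=TWO_DAYS + 1)  # just past two days
```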
    • DNS Records
      • Data about a hostname, its aliases, domain, and mail servers are captured in resource records (RR)
      • Each RR is a line with four fields
        • (Name, Value, Type, and TTL)
          • Name is a hostname, domain name, or canonical host or mail server name (depending on the Type)
          • Value is the IP address, name server, mail server, or canonical hostname for the Name (depending on the Type)
          • Type is the record type
          • TTL (time to live) is how long the record may stay in a cache before it is removed (in seconds)
    • DNS Records
      • DNS RR types are one of several options
        • Type=A gives the IP address Value for a hostname Name
          • (relay1.bar.foo.com, 145.37.93.126, A) (TTL not shown)
        • Type=NS (name server) gives the authoritative DNS server Value for a domain Name
          • (foo.com, dns.foo.com, NS)
        • Type=CNAME defines the alias Name for the canonical hostname Value
          • (foo.com, relay1.bar.foo.com, CNAME)
    • DNS Records
        • Type=MX gives the canonical mail server Value for an alias hostname Name
          • (foo.com, mail.bar.foo.com, MX)
        • Most hostnames have many RRs
      The Start of Authority (SOA) resource record indicates that this DNS name server is the best source of information for the data within this DNS domain
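The (Name, Value, Type, TTL) records from these slides can be modelled directly in Python. This is only an illustrative sketch: the field name `rtype`, the helper `lookup`, and the TTL of 3600 seconds are assumptions, since the slides do not show TTL values.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    name: str
    value: str
    rtype: str   # the Type field
    ttl: int     # seconds; 3600 below is an assumed value


RECORDS = [
    ResourceRecord("relay1.bar.foo.com", "145.37.93.126", "A", 3600),
    ResourceRecord("foo.com", "dns.foo.com", "NS", 3600),
    ResourceRecord("foo.com", "relay1.bar.foo.com", "CNAME", 3600),
    ResourceRecord("foo.com", "mail.bar.foo.com", "MX", 3600),
]


def lookup(name, rtype):
    """Return every Value stored for the given Name and Type."""
    return [rr.value for rr in RECORDS if rr.name == name and rr.rtype == rtype]


mail_servers = lookup("foo.com", "MX")
canonical = lookup("foo.com", "CNAME")
address = lookup(canonical[0], "A")   # follow the alias to its A record
```

The last two lines show why most hostnames need several records: reaching foo.com means following the CNAME to the canonical name, then reading that name's A record.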
    • New resource record types
      • There are type AAAA resource records for IPv6 addresses
        • Their syntax is like an A type record: turtle.mytrek.com IN AAAA FC00::8:800:200C:417A
      • An experimental A6 resource record is used for chains of related IPv6 addresses
      From Ubuntu Server Admin and Reference, R Peterson, 2009
    • DNS Messages
      • DNS queries and replies use the same message format
      • The messages have a header section, the question, the answer, a section for other authoritative servers, and possibly additional information (such as A records for mail servers)
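As a sketch of the header and question sections, the following Python builds the bytes of a simple query by hand. The function names and the query ID 0x1234 are arbitrary choices for illustration; the layout assumed is the standard one (a 12-byte header of six 16-bit fields, then the name as length-prefixed labels, then the query type and class).

```python
import struct


def encode_name(hostname):
    """Encode a hostname as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in hostname.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(hostname, query_id=0x1234, qtype=1, qclass=1):
    """Build a query message: header plus one question (type A, class IN by default)."""
    flags = 0x0100  # only the "recursion desired" bit is set
    # Header: ID, flags, then counts of questions, answers, authority, additional.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(hostname) + struct.pack("!HH", qtype, qclass)
    return header + question


message = build_query("snip.net")
```

A reply would reuse the same header layout with the answer, authority, and additional counts filled in, followed by the corresponding records.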
    • nslookup
      • The command nslookup provides basic IP data for a hostname or domain
      • nslookup snip.net
        • Server: ns2.snip.net
        • Address: 209.204.64.3
        • Name: snip.net
        • Address: 216.83.103.123
    • DNS Changes
      • A registrar makes changes to the DNS database
        • The list of registrars is at http://www.internic.net/ (the text is full of typos!)
      • Changes to DNS records typically take from hours to a couple of days to become available – less if lots of people are requesting a new domain
        • Likewise, email won’t find you right away
    • DNS and security
      • DNS is somewhat vulnerable to distributed denial of service (DDoS) attacks
        • The Root servers were attacked in 2002, but they block incoming ping messages
        • TLD servers are more vulnerable, but local caching would reduce the impact of an attack
      • Another approach is to send many DNS requests to authoritative servers, and spoof the source as a local DNS server
    • Peer-to-Peer File Sharing
      • Peer-to-Peer (P2P) file sharing occupies much of the volume of Internet traffic
      • It allows a user to find a file on another user’s computer, and download it directly
        • Everyone can be client and server, even at the same time
        • Napster used a centralized index, but true P2P just indexes the files you will share
          • Please don’t share your entire hard drive!
    • P2P File Distribution
      • P2P can be used to distribute a file from one source (e.g. a new Linux kernel) to hundreds of peer servers
      • P2P is inherently scalable
        • Client-server file distribution time increases linearly with the number of nodes on the network
        • P2P distribution time levels off asymptotically
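The scaling claim can be illustrated with the usual lower-bound model for distribution time, which is an assumption here since the slides do not give formulas. With file size F, server upload rate u_s, slowest download rate d_min, and N peers each uploading at rate u, client-server time is at least max(N·F/u_s, F/d_min) and P2P time is at least max(F/u_s, F/d_min, N·F/(u_s + N·u)). The numbers below are made up.

```python
def client_server_time(n_peers, file_size, server_up, min_down):
    """Lower bound: the server must upload all N copies itself."""
    return max(n_peers * file_size / server_up, file_size / min_down)


def p2p_time(n_peers, file_size, server_up, peer_up, min_down):
    """Lower bound: peers add their own upload capacity to the server's."""
    total_upload = server_up + n_peers * peer_up
    return max(file_size / server_up,
               file_size / min_down,
               n_peers * file_size / total_upload)


# Made-up values: sizes and rates only need consistent units.
F, U_SERVER, U_PEER, D_MIN = 10.0, 10.0, 1.0, 20.0

cs_10 = client_server_time(10, F, U_SERVER, D_MIN)
cs_100 = client_server_time(100, F, U_SERVER, D_MIN)
p2p_10 = p2p_time(10, F, U_SERVER, U_PEER, D_MIN)
p2p_100 = p2p_time(100, F, U_SERVER, U_PEER, D_MIN)
p2p_10000 = p2p_time(10000, F, U_SERVER, U_PEER, D_MIN)
```

Going from 10 to 100 peers multiplies the client-server time by ten, while the P2P time stays below F/u (10 time units here) however many peers join.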
    • BitTorrent
      • Bittorrent.org manages the protocol used by most file sharing (30% of all Internet backbone traffic!)
        • µTorrent is a commercial version; see also Azureus/Vuze, BitComet, etc.
      • A torrent is the set of peers participating in distribution of a file
        • A tracker node keeps track of which nodes are in the torrent
    • BitTorrent
      • When you join a torrent, you identify up to 50 neighboring peers already in the torrent
        • Then find what chunks of the file each has, and get the rarest first
      • When responding to requests for file chunks, focus on neighbors with the highest data rate
        • Peers also send chunks to random neighbors
        • To get good download rates, you must share nicely with others (no free-riding!)
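Rarest-first chunk selection can be sketched in a few lines of Python. The function name, the peer labels, and the tie-breaking rule (lowest chunk number) are assumptions for illustration, not part of the BitTorrent specification.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Pick the missing chunk held by the fewest neighbors, or None if nothing is missing."""
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks - my_chunks)   # count only chunks we still need
    if not counts:
        return None
    # Fewest holders first; ties go to the lowest chunk number (an arbitrary choice).
    return min(counts, key=lambda chunk: (counts[chunk], chunk))


have = {0}
neighbors = {
    "peer_a": {0, 1, 2},
    "peer_b": {1, 2},
    "peer_c": {1, 3},
}
next_chunk = rarest_first(have, neighbors)
```

Chunk 3 is held by only one neighbor, so it is requested before the more widely available chunks 1 and 2, which helps keep rare chunks from disappearing when a peer leaves.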
    • Peer-to-Peer File Sharing
      • TCP connections between the computers and FTP make it possible
        • The server computer is a transient Web server
      • Gnutella has a proprietary protocol (not everything is an RFC!)
        • A request for a file produces query flooding to find that file in neighboring peers, and collects query hits; from those hits, an HTTP GET command downloads the file
    • Peer-to-Peer File Sharing
        • More refined limited scope query flooding is now done to minimize Internet traffic required per user
          • Only looks at nearby peers in decreasing numbers
        • Gnutella also manages how people find peers on the network (bootstrapping), and checks whether they are still online by pinging them
      • KaZaA and Morpheus borrowed from both Napster and Gnutella
        • They search nearby peers, but not all peers are equal
        • Some have higher bandwidth and more to share
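Limited-scope query flooding, as described for Gnutella on this slide, can be sketched as a breadth-first search with a hop limit. The overlay below, the peer names, and the `max_hops` parameter are all made up for illustration; real Gnutella messages carry a counter that is decremented at each hop.

```python
from collections import deque


def flood_query(overlay, start, wanted_file, max_hops):
    """Return peers within max_hops of start that hold wanted_file (the query hits)."""
    hits = []
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        peer, hops = queue.popleft()
        if peer != start and wanted_file in overlay[peer]["files"]:
            hits.append(peer)
        if hops == max_hops:
            continue                      # scope limit reached: do not forward further
        for neighbor in overlay[peer]["neighbors"]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, hops + 1))
    return hits


# A simple chain of peers: a - b - c - d
overlay = {
    "a": {"neighbors": ["b"], "files": set()},
    "b": {"neighbors": ["a", "c"], "files": set()},
    "c": {"neighbors": ["b", "d"], "files": {"song.mp3"}},
    "d": {"neighbors": ["c"], "files": {"song.mp3"}},
}
nearby_hits = flood_query(overlay, "a", "song.mp3", max_hops=2)
wider_hits = flood_query(overlay, "a", "song.mp3", max_hops=3)
```

A smaller hop limit means fewer messages on the network but can miss copies that are farther away, which is the trade-off behind limiting the scope.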
    • Peer-to-Peer File Sharing
        • More powerful peers are group leaders (super peers) for those around them, acting like mini hubs of the network
          • Group leaders connect via TCP, and map out what’s available from their local peers
        • Other tricks include
          • Limiting the number of simultaneous downloads
          • Giving priority to those who upload more than download
          • Downloading parts of the same file in parallel from multiple sources at once
    • Skype
      • Skype is a popular P2P Internet telephony app, which goes beyond file distribution and sharing in the P2P world
      • Nodes in Skype are in a hierarchical overlay (like the super peer concept), which makes it faster to locate a user
      • Skype uses relays to establish calls across NAT-hidden local networks
    • Peer-to-Peer File Sharing
      • A massive issue for P2P file sharing is the intellectual property rights of the files being shared
        • Music and video industry lawyers have claimed enormous losses from file sharing, and have vigorously fought file sharing applications
        • Napster, BearShare, Grokster, Morpheus, iMesh, DVDxCopy, KaZaA, and others are involved in such ongoing disputes