The document provides information about various protocols and concepts in the Application Layer of the OSI model. It discusses DNS and how it maps domain names to IP addresses using a hierarchical naming scheme and distributed database. It also summarizes FTP for file transfer, SMTP for email transfer, HTTP for web communication, and multimedia applications.
3. DNS - The Domain Name System
Way back in the ARPANET, there was simply a file, hosts.txt, that listed all the
hosts and their IP addresses. Every night, all the hosts would fetch it from the site
at which it was maintained. For a network of a few hundred large timesharing
machines, this approach worked reasonably well.
However, when thousands of minicomputers and PCs were connected to the net,
this approach could not continue to work forever. For one thing, the size of the
file would become too large. However, even more important, host name conflicts
would occur constantly unless names were centrally managed, something
unthinkable in a huge international network due to the load and latency. To solve
these problems, DNS (the Domain Name System) was invented.
The essence of DNS is the invention of a hierarchical, domain-based naming
scheme and a distributed database system for implementing this naming scheme.
It is primarily used for mapping host names and e-mail destinations to IP
addresses.
4. DNS - The Domain Name System
• The DNS Name Space
• Resource Records
• Name Servers
To map a name onto an IP address, an application program calls a library
procedure called the resolver, passing it the name as a parameter. The resolver
sends a UDP packet to a local DNS server, which then looks up the name and
returns the IP address to the resolver, which then returns it to the caller.
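The resolver step described above can be sketched in Python. This is a minimal illustration, not a full resolver: the function names build_query and send_query, the fixed query ID, and the 512-byte receive buffer are assumptions made for the example, and the reply is returned unparsed.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query asking for the A record of `name`."""
    # Header: ID, flags (only "recursion desired" set), 1 question,
    # and zero answer, authority and additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each component is length-prefixed; a zero byte
    # stands for the (unnamed) root.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A (address), QCLASS 1 = IN (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def send_query(name: str, server: str, timeout: float = 2.0) -> bytes:
    """Send the query in one UDP packet to port 53 and return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        reply, _ = sock.recvfrom(512)
        return reply
```

In everyday code an application would not build packets itself; it would call the system resolver, for example through socket.gethostbyname(name).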
5. The DNS Name Space
A portion of the Internet domain name space.
The Internet is divided into over 250 top-level domains, where each domain covers many
hosts. The top-level domains come in two flavors: generic and countries. The original
generic domains were com (commercial), edu (educational institutions), gov (the U.S.
Federal Government), int (certain international organizations), mil (the U.S. armed
forces), net (network providers), and org (nonprofit organizations). The country domains
include one entry for every country, as defined in ISO 3166.
• Generic domains added later include biz (businesses), info (information),
name (people's names), pro (professions), aero (aerospace industry), coop (co-operatives),
and museum (museums).
6. The DNS Name Space
• Each domain is named by the path upward from it to the (unnamed)
root.
• Domain names can be either absolute or relative. An absolute domain
name always ends with a period (e.g., eng.sun.com.), whereas a
relative one does not.
• Domain names are case insensitive, so edu, Edu, and EDU mean the
same thing. Component names can be up to 63 characters long, and
full path names must not exceed 255 characters.
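The naming rules on this slide can be checked mechanically. The sketch below is an illustration only: the function names are made up for the example, and the 255-character limit is applied to the textual form of the name, which is a simplification.

```python
MAX_LABEL = 63   # longest allowed component name
MAX_NAME = 255   # longest allowed full path name


def is_absolute(name: str) -> bool:
    """An absolute domain name always ends with a period."""
    return name.endswith(".")


def same_name(a: str, b: str) -> bool:
    """Domain names are case insensitive, so compare them lowercased."""
    return a.lower() == b.lower()


def is_valid_name(name: str) -> bool:
    """Check the component and full-name length limits."""
    if len(name) > MAX_NAME:
        return False
    relative = name[:-1] if name.endswith(".") else name
    return all(1 <= len(label) <= MAX_LABEL for label in relative.split("."))
```

For example, is_absolute("eng.sun.com.") is true, and same_name("EDU", "edu") is true.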
7. Resource Records
• Every domain, whether it is a single host or a top-level domain, can
have a set of resource records associated with it.
• When a resolver gives a domain name to DNS, what it gets back are
the resource records associated with that name. Thus, the primary
function of DNS is to map domain names onto resource records.
• A resource record is a five-tuple. Although they are encoded in binary
for efficiency, in most expositions, resource records are presented as
ASCII text, one line per resource record. The format we will use is as
follows:
Domain_name Time_to_live Class Type Value
• The Domain_name tells the domain to which this record applies.
• The Time_to_live field gives an indication of how stable the record is.
Information that is highly stable is assigned a large value, such as 86400
(the number of seconds in 1 day). Information that is highly volatile is
assigned a small value, such as 60.
8. Resource Records
• The third field is Class. For Internet information, it is always IN.
• The Type field tells what kind of record this is. The most important types are:
• An SOA record provides the name of the primary source of information
about the name server's zone, the e-mail address of its administrator, a
unique serial number, and various flags and timeouts.
• The most important record type is the A (Address) record. Every Internet
host must have at least one IP address so that other machines can
communicate with it. Some hosts have two or more network connections,
in which case multiple A records exist.
9. Resource Records
• The next most important record type is the MX record. It specifies the name
of the host prepared to accept e-mail for the specified domain, since not
every machine is prepared to accept e-mail.
• The NS records specify name servers. For example, every DNS database
normally has an NS record for each of the top-level domains.
• CNAME records allow aliases to be created.
cs.mit.edu 86400 IN CNAME lcs.mit.edu
• Like CNAME, PTR points to another name. PTR is a regular DNS datatype
whose interpretation depends on the context in which it is found. In
practice, it is nearly always used to associate a name with an IP address to
allow lookups of the IP address and return the name of the corresponding
machine. These are called reverse lookups.
• HINFO records allow people to find out what kind of machine and
operating system a domain corresponds to. TXT records allow domains to
identify themselves in arbitrary ways. Both of these record types are for
user convenience; neither is required.
• Finally, we have the Value field. This field can be a number, a domain
name, or an ASCII string. The semantics depend on the record type.
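As a small illustration of the five-tuple text format, the sketch below parses one record line into its fields. The ResourceRecord type and parse_record function are names invented for this example; only the one-line ASCII presentation is handled, not the binary encoding.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    domain_name: str
    time_to_live: int   # seconds
    record_class: str   # IN for Internet information
    record_type: str    # SOA, A, MX, NS, CNAME, PTR, HINFO, TXT, ...
    value: str          # meaning depends on the record type


def parse_record(line: str) -> ResourceRecord:
    """Split one text line into the five resource record fields."""
    # The first four fields contain no spaces; the value field may,
    # so split at most four times.
    name, ttl, rclass, rtype, value = line.split(None, 4)
    return ResourceRecord(name, int(ttl), rclass.upper(), rtype.upper(), value)


record = parse_record("cs.mit.edu 86400 IN CNAME lcs.mit.edu")
```

Here record.record_type is "CNAME" and record.time_to_live is 86400, that is, one day.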
11. Name Servers
• To avoid the problems associated with having only a single source of
information, the DNS name space is divided into non-overlapping
zones.
• Each zone contains some part of the tree and also contains name
servers holding the information about that zone.
• Normally, a zone will have one primary name server, which gets its
information from a file on its disk, and one or more secondary name
servers, which get their information from the primary name server.
• When a resolver has a query about a domain name, it passes the query
to one of the local name servers. If the domain being sought falls
under the jurisdiction of the name server, such as ai.cs.yale.edu falling
under cs.yale.edu, it returns the authoritative resource records. An
authoritative record is one that comes from the authority that
manages the record and is thus always correct.
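The jurisdiction test in the last bullet amounts to comparing trailing name components. The sketch below shows that comparison under a simplifying assumption: it ignores the possibility that part of a zone has been delegated to a separate child zone. The function name falls_under is invented for the example.

```python
def falls_under(name: str, zone: str) -> bool:
    """Return True if `name` is equal to or below `zone` in the name tree."""
    name_labels = name.lower().rstrip(".").split(".")
    zone_labels = zone.lower().rstrip(".").split(".")
    # Compare whole components from the right, never raw string suffixes,
    # so that xcs.yale.edu is not mistaken for a name under cs.yale.edu.
    return name_labels[-len(zone_labels):] == zone_labels
```

With this check, falls_under("ai.cs.yale.edu", "cs.yale.edu") is true, while falls_under("xcs.yale.edu", "cs.yale.edu") is false.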
14. File Transfer Protocol
• The greatest volume of data exchange in the Internet today is due to
file transfer. File Transfer Protocol (FTP) is the standard mechanism
provided by TCP/IP for copying a file from one host to another.
• FTP differs from other client/server applications in that it establishes
two connections between the hosts. One connection is used for data
transfer, the other for control information (commands and responses).
Separation of commands and data transfer makes FTP more efficient.
• FTP uses the services of TCP. It needs two TCP connections. Port 21 is
used for the control connection, and port 20 is used for the data
connection.
• The control connection remains connected during the entire
interactive FTP session. The data connection is opened and then closed
for each file transferred.
15. FTP: Communication over Control Connection
• FTP uses the same approach as SMTP to communicate across the
control connection. It uses the 7-bit ASCII character set.
Communication is achieved through commands and responses. This
simple method is adequate for the control connection because we
send one command (or response) at a time.
• Each command or response is only one short line, so we need not
worry about file format or file structure.
• Each line is terminated with a two-character (carriage return and
line feed) end-of-line token.
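A short sketch of how one control-connection line might be produced and read. The helper names and the file name are invented for the example, and the assumption that a response starts with a three-digit code followed by text comes from the FTP standard rather than from this slide.

```python
def encode_command(verb: str, argument: str = "") -> bytes:
    """Encode one command as a single ASCII line ending in CR LF."""
    line = f"{verb} {argument}" if argument else verb
    # Encoding as ASCII raises UnicodeEncodeError on characters
    # outside the 7-bit character set.
    return line.encode("ascii") + b"\r\n"


def parse_reply(raw: bytes) -> tuple[int, str]:
    """Split a one-line response into its numeric code and its text."""
    text = raw.decode("ascii").rstrip("\r\n")
    return int(text[:3]), text[4:]
```

For instance, encode_command("RETR", "notes.txt") yields the bytes of "RETR notes.txt" followed by carriage return and line feed.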
16. FTP: Communication over Data Connection
File transfer in FTP means one of three things:
• A file is to be copied from the server to the client. This is called
retrieving a file. It is done under the supervision of the RETR
command.
• A file is to be copied from the client to the server. This is called
storing a file. It is done under the supervision of the STOR
command.
• A list of directory or file names is to be sent from the server to the
client. This is done under the supervision of the LIST command.
Note that FTP treats a list of directory or file names as a file. It is
sent over the data connection.
17. FTP: Communication over Data Connection
• The client must define the type of file to be transferred, the structure of the
data, and the transmission mode.
• File Type: FTP can transfer an ASCII file, EBCDIC file, or image file. The ASCII
file is the default format for transferring text files. Each character is encoded
using 7-bit ASCII.
• Data Structure: FTP can transfer a file across the data connection by using
one of the following interpretations about the structure of the data: file
structure, record structure, and page structure. In the file structure format,
the file is a continuous stream of bytes. In the record structure, the file is
divided into records. This can be used only with text files. In the page
structure, the file is divided into pages, with each page having a page
number and a page header. The pages can be stored and accessed randomly
or sequentially.
• Transmission Mode: FTP can transfer a file across the data connection by
using one of the following three transmission modes: stream mode, block
mode, and compressed mode. The stream mode is the default mode. Data
are delivered from FTP to TCP as a continuous stream of bytes. In block
mode, data can be delivered from FTP to TCP in blocks. In this case, each
block is preceded by a 3-byte header. The first byte is called the block
descriptor; the next 2 bytes define the size of the block in bytes. In the
compressed mode, if the file is big, the data can be compressed. The
compression method normally used is run-length encoding.
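The 3-byte block-mode header can be illustrated as follows. This is a sketch under assumptions: the function names are invented, the descriptor is treated as an opaque byte, and the 2-byte size is assumed to be in network (big-endian) byte order.

```python
import struct


def make_block(descriptor: int, payload: bytes) -> bytes:
    """Prefix the payload with a 1-byte descriptor and a 2-byte size."""
    # "!BH" = one unsigned byte, then one unsigned 16-bit integer in
    # network byte order. struct.error is raised if the payload is
    # longer than 65535 bytes.
    return struct.pack("!BH", descriptor, len(payload)) + payload


def read_block(data: bytes) -> tuple[int, bytes]:
    """Read the header and return the descriptor and the block payload."""
    descriptor, size = struct.unpack("!BH", data[:3])
    return descriptor, data[3:3 + size]
```

For example, make_block(0, b"hello") produces the three header bytes 00 00 05 followed by the five payload bytes.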
18. SMTP
• Simple Mail Transfer Protocol is the standard e-mail protocol on the Internet
and part of the TCP/IP protocol suite. SMTP defines the message format and
the message transfer agent (MTA), which stores and forwards the mail. SMTP
was originally designed for only plain text (ASCII text), but MIME and other
encoding methods enable executable programs and multimedia files to be
attached to and transported with the e-mail message.
• SMTP is a relatively simple, text-based protocol, where one or more
recipients of a message are specified and then the message text is
transferred. SMTP uses TCP port 25.
• The primary purpose of SMTP is to transfer e-mail between mail servers. In
order to send e-mail, the client sends the message to an outgoing mail server,
which in turn contacts the destination mail server for delivery. For this
reason, it is necessary to specify an SMTP server when configuring an e-mail
client.
• One important point to make about the SMTP protocol is that, in its original
form, it does not require authentication. This allows anyone on the Internet to
send e-mail to anyone else or even to large groups of people. It is this
characteristic of SMTP that makes junk e-mail or spam possible.
19. SMTP
SMTP clients and servers have two main components:
• User Agents - Prepare the message and enclose it in an envelope
(e.g., Thunderbird, Eudora).
• Mail Transfer Agents - Transfer the mail across the Internet
(e.g., Sendmail, Exim).
The arrangement is analogous to the postal system in many ways.
20. How are messages sent to the SMTP server?
• E-mail communication using relaying
  - Used during the initial days of SMTP.
  - SMTP routing information is included along with the e-mail address.
  - There are problems with this method.
• Using DNS
  - This method is used at present.
  - The sender's SMTP server uses DNS to find the MX
record of the domain to which the e-mail is to be sent.
21. SMTP Basic commands
SMTP defines a small required command set, with several optional
commands included for convenience. The minimal set
required for an SMTP sending client is:
• HELO - Initial State Identification
• MAIL - Mail Sender Reverse Path
• RCPT - One Recipient's Forward Path
• DATA - Mail Message Text State
• RSET - Abort Transaction and Reset all buffers
• NOOP - No Operation
• QUIT - Commit Message and Close Channel
22. SMTP Limitations
• SMTP security is weak: the base protocol provides no authentication
or encryption.
• Its usefulness is limited by its simplicity.
• Executable and other binary files cannot be transmitted over SMTP
without first being converted into text. MIME is used to send mail
in other formats.
• It cannot transmit text data that contains national language
characters. These national language characters use 8-bit codes with
values of 128 decimal or more.
• It is limited to 7-bit ASCII characters only.
• SMTP servers may reject mail messages beyond some specific
length.
23. HTTP
• HTTP stands for Hypertext Transfer Protocol.
• It is a TCP/IP-based communication protocol which is used to
deliver virtually all files and other data, collectively called
resources, on the World Wide Web. These resources could be
HTML files, image files, query results, or anything else.
• It is a client-server protocol.
• The browser works as an HTTP client because it sends
requests to an HTTP server, which is called a Web server. The Web
server then sends responses back to the client.
• The standard and default port for HTTP servers to listen on is 80.
24. WHY HTTP?
• HTTP is like SMTP because the data transferred between the client
and server are similar in appearance to SMTP messages. Also, the
format of the messages is controlled by MIME-like headers. Unlike
SMTP, however, HTTP does not store messages at intermediate points;
it transmits them directly.
• HTTP is like FTP because both transfer files and use the
services of TCP. Unlike FTP, which keeps its control connection open
for the whole session, HTTP uses a single TCP connection, and in
HTTP/1.0 that connection is non-persistent: it is closed after each response.
• Thus HTTP incorporates features of both FTP and SMTP and can
loosely be regarded as an augmented combination of the two.
25. • A client sends a request message to a server. The server returns a
response message.
• The HTTP client first initiates a TCP connection with the server. Once the
connection is established, the browser and the server processes access TCP
through their socket interfaces.
• HTTP is a stateless protocol. In other words, the current request does not know
what has been done in the previous requests.
26. Features of HTTP
• HTTP is connectionless: In HTTP/1.0, the client opens a connection, sends
a request, and receives the response over that same connection, which is
then closed. A new connection is opened for the next request.
• HTTP is media independent: Any type of data can be sent by HTTP as
long as both the client and server know how to handle the data
content.
• HTTP is stateless: This follows from HTTP's being connectionless.
The server and client are aware of each other only during a request.
Afterwards, each forgets the other. For this reason neither the client
nor the server retains information between different requests
across web pages.
27. HTTP - URLs
URL - Uniform Resource Locator
A URL is used to uniquely identify a resource on the web.
Syntax:
protocol://hostname:port/path-and-file-name
Example: http://xxx.myplace.com:80/cgi-bin/t.html
• protocol (http, ftp, news, etc.)
• host name (name.domain name)
• port (usually 80, but often 8080)
• directory path to the resource
• resource name
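The example URL can be taken apart with Python's standard urllib.parse module. This is a minimal sketch; splitting the path into a directory and a resource name at the last slash is an assumption made for the illustration.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://xxx.myplace.com:80/cgi-bin/t.html")

protocol = parts.scheme    # "http"
host_name = parts.hostname # "xxx.myplace.com"
port = parts.port          # 80
# Everything before the last "/" is the directory path,
# everything after it is the resource name.
directory, _, resource = parts.path.rpartition("/")
```

After this runs, directory is "/cgi-bin" and resource is "t.html".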
28. HTTP Messages
• HTTP messages act as the language in which web clients and web
servers talk to each other.
• Each message, whether a request or a response, has three parts:
1. The request or the response line
2. A header section
3. The body of the message
29. What does the client do?
• The client sends a message to the server at a particular port
(80 is the default)
• The first part of the message is the Request line containing:
  - A method (HTTP command) such as GET or POST
  - A document address, and
  - An HTTP version number
• Example:
  - GET /index.html HTTP/1.0
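The three parts of the Request line can be separated with a simple split. The sketch below is illustrative: the function name is invented, and it assumes the parts are separated by single spaces, as in the example on this slide.

```python
def parse_request_line(line: str) -> tuple[str, str, str]:
    """Split a Request line into method, document address and version."""
    method, address, version = line.rstrip("\r\n").split(" ")
    if not version.startswith("HTTP/"):
        raise ValueError(f"not an HTTP version: {version!r}")
    return method, address, version


method, address, version = parse_request_line("GET /index.html HTTP/1.0")
```

A line with too few or too many parts makes the unpacking fail with ValueError, which is a reasonable outcome for a malformed request.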
30. HTTP Request Methods
• GET
  - Retrieves whatever information is identified by the Request-URI
  - Can get static content and data produced by a program
  - The majority of HTTP request messages use the GET method.
• POST
  - Submits information to the Web server
  - E.g., posting to a blog, submitting a user form, ...
  - Information is included in the message body
  - The actual function depends on the Request-URI
Example:
POST /phonebook.cgi HTTP/1.0
Date:
User-Agent:
Accept-Language: en-us
Content-Length: 14

98490 55266
This request looks up the phone book for the number.
The same could have been achieved with GET,
but in that case the number would have been part of
the resource URL,
which would have been stored in the server log.
31. Request Methods
• HEAD
  - The server's response does not include a message body
  - Useful for getting resource metadata without transferring the resource
  - Also useful for debugging and for checking validity, accessibility and
modification
• PUT
  - Requests that the server store the enclosed data under the supplied
Request-URI.
  - Creates the resource if it does not exist
  - Not commonly used for web publishing (FTP is preferred for security purposes)
• DELETE
  - Removes the Web object
  - Needs to be used carefully for security reasons
32. Request Methods
• TRACE
  - Invokes a remote application-layer loop-back of the request
message
  - Useful for testing what is being received at the server
  - Can also be forwarded to intermediaries for debugging
purposes
• OPTIONS
  - Requests information about the communication options available
for a resource or server
34. Response Message
• The server response is also in three parts
• The first part is the Status line, which tells:
  - The HTTP version
  - A status code
  - A short description of what the status code means
• Example: HTTP/1.1 404 Not Found
• Status codes are in groups:
100-199 Informational
200-299 The request was successful
300-399 The request was redirected
400-499 The request failed
500-599 A server error occurred
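A sketch of reading a Status line and mapping the code to its group. The function names are invented for the example; the group descriptions are the ones listed on this slide.

```python
STATUS_GROUPS = {
    1: "Informational",
    2: "The request was successful",
    3: "The request was redirected",
    4: "The request failed",
    5: "A server error occurred",
}


def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split a Status line into version, status code and description."""
    # The description may contain spaces, so split at most twice.
    version, code, description = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), description


def status_group(code: int) -> str:
    """Look up the group by the hundreds digit of the status code."""
    return STATUS_GROUPS[code // 100]


version, code, description = parse_status_line("HTTP/1.1 404 Not Found")
```

For the example line, code is 404 and status_group(code) returns "The request failed".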
35. Common status codes
• 200 OK
  - Everything worked, here's the data
• 301 Moved Permanently
  - URI was moved, but here's the new address for your records
• 302 Moved Temporarily
  - URL temporarily out of service, keep the old one but use this one for
now
• 400 Bad Request
  - There is a syntax error in your request
• 403 Forbidden
  - You can't do this, and we won't tell you why
• 404 Not Found
  - No such document
• 408 Request Time-out, 504 Gateway Time-out
  - Request took too long to fulfill for some reason
38. Advantages & Disadvantages of HTTP
• Advantages
  - Platform independent: allows straightforward cross-platform porting.
  - No runtime support required to run properly.
  - Usable across firewalls, so global applications are possible.
  - Not connection oriented: no network overhead to create and
maintain session state and information.
• Disadvantages
  - Privacy issues: anyone can see the content; there is no security by default.
  - Integrity issues: someone might alter the content. HTTP is insecure
because no encryption is used, so it is subject to man-in-the-middle
attacks and eavesdropping on sensitive information.
  - Authentication: it is not clear who you are talking with. Authentication
data is sent in the clear, so anyone who intercepts the request can
determine the username and password being used.
39. World Wide Web (WWW)
• The term WWW refers to the World Wide Web or simply the Web.
• The World Wide Web consists of all the public Web sites connected
to the Internet worldwide, including the client devices that access
Web content.
• The WWW is just one of many applications of the Internet and
computer networks.
• Founded by: Tim Berners-Lee at the European Laboratory for
Particle Physics (CERN).
• He developed a software application for a hypertext server
program, which stores documents and makes them available on the
Internet, and named it the World Wide Web.
• Developed to: improve CERN's research document
handling and sharing mechanism.
40. WWW
• In 1993, Marc Andreessen wrote a program named "Mosaic" which
could read documents in hypertext format; programs of this kind came
to be called web browsers.
• Because of the web browser, people could easily access the Internet.
• It was free software.
• Netscape Navigator, one of the first commercial web browsers, was
developed by members of the Mosaic team.
41. Internet vs WWW
Aspect | Internet | World Wide Web
Estimated year of origin | 1969, though opening of the network to commercial interests began only in 1988 | 1991
Name of the first version | ARPANET | WorldWideWeb (the first browser, written at CERN)
Comprises | Network of computers, copper wires, fibre-optic cables & wireless networks | Files, folders & documents stored in various computers
Governed by | Internet Protocol | Hyper Text Transfer Protocol
Dependency | This is the base, independent of the World Wide Web | It depends on the Internet to work
Nature | Hardware | Software
42. Web server / portal
• A web server is a program running on a server computer. It hosts
a website consisting of a number of web pages.
• Web pages are written in a language called HTML.
• Each web page can contain text, graphics, sound, video and
animation.
• A web portal is most often one specially designed Web page which
brings information together from diverse sources in a uniform way.
• Usually, each information source gets its dedicated area on the page
for displaying information.
43. Search engine
A program that searches for and identifies items in a database that
correspond to keywords or characters specified by the user,
used especially for finding particular sites on the World Wide Web.
44. Working of web server
• The web server receives a request for a web page from a browser
program running on the client.
• It then locates the corresponding page and sends it to the
client computer.
• Over this TCP connection the client sends one request and the server
returns one response.
• This request-response model is governed by a protocol called the
Hypertext Transfer Protocol (HTTP).
• The most popular web servers are Apache and IIS.
45. MIME - Multipurpose Internet Mail Extensions
Problems with international languages:
• Languages with accents (French, German).
• Languages in non-Latin alphabets (Hebrew, Russian).
• Languages without alphabets (Chinese, Japanese).
• Messages not containing text at all (audio or images).
A solution was proposed in RFC 1341 and updated in RFCs
2045-2049. This solution, called MIME (Multipurpose
Internet Mail Extensions), is now widely used.
46. • The basic idea of MIME is to continue to use the RFC 822 format but
to add structure to the message body and define encoding
rules for the transfer of non-ASCII messages.
• The correct way to encode binary messages is to use base64 encoding,
sometimes called ASCII armor.
• For messages that are almost entirely ASCII but with a few non-ASCII characters,
an encoding known as quoted-printable encoding is used.
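Both encodings are available in Python's standard library, which makes the difference easy to see. The sample bytes and the sample text below are made up for the illustration, and Latin-1 is assumed as the character set of the text.

```python
import base64
import quopri

# Binary data: every 3 input bytes become 4 ASCII characters.
binary = bytes([0, 255, 16, 128])
armored = base64.b64encode(binary)

# Mostly-ASCII text: only the non-ASCII byte is rewritten, here the
# Latin-1 byte for "é" becomes the three characters =E9.
text = "café au lait".encode("latin-1")
quoted = quopri.encodestring(text)

# Both encodings are reversible.
restored_binary = base64.b64decode(armored)
restored_text = quopri.decodestring(quoted)
```

The base64 result is the ASCII string AP8QgA==, while the quoted-printable result stays readable because the ASCII characters pass through unchanged.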
48. Message Transfer - SMTP
• Within the Internet, e-mail is delivered by having the source
machine establish a TCP connection to port 25 of the
destination machine. Listening to this port is an e-mail
daemon that speaks SMTP (Simple Mail Transfer Protocol).
This daemon accepts incoming connections and copies
messages from them into the appropriate mailboxes. If a
message cannot be delivered, an error report containing the
first part of the undeliverable message is returned to the
sender.
• SMTP is a simple ASCII protocol. Using ASCII text makes
protocols easy to develop, test, and debug. After establishing
the TCP connection to port 25, the sending machine,
operating as the client, waits for the receiving machine,
operating as the server, to talk first.
50. ESMTP
• SMTP has problems with authentication,
encryption, inefficient use of bandwidth
in the case of non-ASCII transmissions, message
sizes, etc.
• To address these problems, SMTP was revised to include
an extension mechanism, which is a mandatory part of the
RFC 5321 standard. SMTP used with extensions is called
Extended SMTP (ESMTP).
51. Final Delivery
• Both the communicating parties may not be online all the
time. One solution is to have a message transfer agent on an
ISP machine accept e-mail for its customers and store it in
their mailboxes on an ISP machine. Since this agent can be
online all the time, e-mail can be sent to it 24 hours a day.
• POP3 (Post Office Protocol Version 3) is a protocol that allows
user transfer agents (on client PCs) to contact the message
transfer agent (on the ISP's machine) and allow e-mail to be
copied from the ISP to the user.
• POP3 begins when the user starts the mail reader. The mail
reader calls up the ISP (unless there is already a connection)
and establishes a TCP connection with the message transfer
agent at port 110.
52. Final Delivery
(a) Sending and reading mail when the receiver has a permanent
Internet connection and the user agent runs on the same machine as
the message transfer agent. (b) Reading e-mail when the receiver has
a dial-up connection to an ISP.
53. • Once the connection has been established, the
POP3 protocol goes through three states in
sequence:
  - Authorization.
  - Transaction.
  - Update.
• The authorization state deals with having the user
log in. The transaction state deals with the user
collecting the e-mails and marking them for
deletion from the mailbox. The update state
actually causes the e-mails to be deleted.
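The three states can be modeled as a small state machine. This sketch is a toy model, not the POP3 wire protocol: the class, its method names and the dictionary used as a mailbox are all invented for the illustration.

```python
from enum import Enum


class State(Enum):
    AUTHORIZATION = 1
    TRANSACTION = 2
    UPDATE = 3


class Pop3Session:
    def __init__(self, mailbox: dict[int, str]):
        self.mailbox = mailbox          # message number -> message text
        self.marked: set[int] = set()   # messages marked for deletion
        self.state = State.AUTHORIZATION

    def login(self, credentials_ok: bool) -> None:
        # A successful login moves the session to the transaction state.
        if self.state is State.AUTHORIZATION and credentials_ok:
            self.state = State.TRANSACTION

    def retrieve(self, number: int) -> str:
        self._require_transaction()
        return self.mailbox[number]

    def mark_deleted(self, number: int) -> None:
        # Marking does not remove anything yet.
        self._require_transaction()
        self.marked.add(number)

    def quit(self) -> None:
        # Entering the update state is what actually deletes the mail.
        self._require_transaction()
        self.state = State.UPDATE
        for number in self.marked:
            del self.mailbox[number]
        self.marked.clear()

    def _require_transaction(self) -> None:
        if self.state is not State.TRANSACTION:
            raise RuntimeError("only allowed in the transaction state")
```

Keeping the deletions pending until quit mirrors the slide: a message marked during the transaction state is still in the mailbox until the update state runs.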
54. IMAP (Internet Message Access Protocol)
• IMAP is an improvement over an earlier final delivery
protocol, POP3 (Post Office Protocol, version 3),
which is specified in RFC 1939. POP3 is a simpler
protocol but supports fewer features and is less
secure in typical usage.
• With POP3, mail is usually downloaded to the user agent
computer instead of remaining on the mail server.
This makes life easier on the server, but harder on
the user. It is not easy to read mail on multiple
computers, and if the user agent computer breaks,
all e-mail may be lost permanently.
57. Webmail
• Webmail, an increasingly popular alternative
to IMAP and SMTP for providing e-mail service,
uses the Web as an interface for sending
and receiving mail.
• Widely used Webmail systems include Google
Gmail, Microsoft Hotmail and Yahoo! Mail.
Webmail is one example of software (in this
case, a mail user agent) that is provided as a
service using the Web.