IT Unit 1

HISTORY OF INTERNET
The Internet has its roots in the 1960s as a project of the
United States government's Department of Defense to create a
non-centralized network.
This project was called ARPANET (Advanced Research Projects
Agency Network). It was created by the Pentagon's Advanced
Research Projects Agency and went into operation in 1969 to provide a
secure and survivable communications network for
organizations engaged in defense-related research.

Packet switching
To make the network more global, a new, sophisticated, standard
protocol was needed. Researchers developed IP (Internet Protocol)
technology, which defined how electronic messages were packaged,
addressed, and sent over the network. The resulting standard, first
described in 1974, came to be called TCP/IP (Transmission Control
Protocol/Internet Protocol).
TCP/IP allowed users to link branches of other complex networks
directly to the ARPANET, which soon came to be called the Internet.
In 1969, ARPANET delivered its first message: a “node-to-node”
communication from one computer to another. The system crashed
before the full message was received. The Internet was yet to be born.


Without packet switching, the government’s computer
network—now known as the ARPANET—would have been just
as vulnerable to enemy attacks as the phone system.

Internet in the 1970s
• By the end of 1969, just four computers were connected to
the ARPANET, but the network grew steadily during the 1970s.
• As packet-switched computer networks multiplied, however, it
became more difficult for them to integrate into a single
worldwide “Internet.”

• By the end of the 1970s, a computer scientist named
Vinton Cerf had begun to solve this problem by
developing a way for all of the computers on all of the
world’s mini-networks to communicate with one
another.
• He called his invention “Transmission Control Protocol,”
or TCP. (Later, he added an additional protocol, known
as “Internet Protocol.” The acronym we use to refer to
these today is TCP/IP.)

TCP/IP has been described as the “handshake”
between computers all over the world. It
enabled each computer to have its own identity.

Cerf’s protocol transformed the Internet into a worldwide
network. Throughout the 1980s, researchers and
scientists used it to send files and data from one
computer to another.
However, the network was still used mainly by scientists and
researchers at universities and labs.

In 1991, however, the Internet changed again.
• Tim Berners-Lee introduced the World Wide Web: an Internet that
was not simply a way to send files from one place to another but was
itself a “web” of information that anyone on the Internet could
retrieve.
• Berners-Lee created the first web browser and, with it, the Web
that we know today.

In 1992, a group of students and researchers at the University
of Illinois developed a sophisticated browser that they called
Mosaic. (It later became Netscape.)
Mosaic offered a user-friendly way to search the Web: it
allowed users to see words and pictures on the same page for
the first time and to navigate using scrollbars and clickable
links. HTML is the language used to create web pages, and HTML5
is the upgraded version of that language.

• That same year, it was decided that the Web could be used for
commercial purposes. As a result, companies of all kinds hurried to
set up websites of their own, and e-commerce entrepreneurs began
to use the Internet to sell goods directly to customers.
• More recently, social networking sites like Facebook have become a
popular way for people of all ages to stay connected.

Timeline
1967 – Plans for ARPANET were published
1971 – ARPANET was successfully developed with 23 host computers
1972 – ARPANET went ‘public’
    • First program for person-to-person communication (e-mail)
1973
    • 75% of all ARPANET traffic is e-mail
    • First international connection (University College London)
1974 – TCP/IP
    • Each network should work on its own
    • Within each network there would be a ‘gateway’ / router
    • During congestion, packets would be routed through the fastest available route
    • Large mainframe computers
1991 – Internet is commercialized

IP Address
Each device attached to a TCP/IP-based network must be given a
unique address called an IP address.
These addresses are carried in the IP packet to identify the source
and destination hosts.
Each IP address has two components: a network identifier (NETID)
and a host identifier (HOSTID).
The NETID identifies the specific network to which the host is
attached. The HOSTID uniquely identifies a host within that
network.
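
The NETID/HOSTID split can be sketched with Python's standard ipaddress module. This is only an illustration: the address 192.168.1.10 and the /24 prefix length are assumptions chosen for the example, and the helper name split_address is hypothetical.

```python
import ipaddress


def split_address(cidr):
    """Return (NETID, HOSTID) for an address written as 'a.b.c.d/prefix'."""
    iface = ipaddress.ip_interface(cidr)
    # The network address is the NETID: the host bits are all zero.
    netid = str(iface.network.network_address)
    # The host mask has 1-bits exactly where the HOSTID lives.
    hostid = int(iface.ip) & int(iface.network.hostmask)
    return netid, hostid


print(split_address("192.168.1.10/24"))
```

With a /24 prefix the first three numbers form the NETID and the last number is the HOSTID; a shorter prefix moves more bits into the HOSTID.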

Internet Protocol
IP is a connectionless, unreliable, best-effort
delivery protocol.
IP accepts whatever data is passed down to
it from the upper layers and forwards the
data in the form of IP Packets.
All the nodes are identified using an IP
address.
Packets are delivered from the source to the
destination using the IP address.

IP Address and Class details..

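OSI vs. TCP/IP
[Figure: OSI vs. TCP/IP layers. Labels recovered from the figure: HTTP, SMTP, … (application); TCP, UDP (transport); IP (network)]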
Transport Layer
End-to-end data transfer
• Transmission Control Protocol (TCP)
    - connection oriented
    - reliable delivery of data
    - ordering of delivery
• User Datagram Protocol (UDP)
    - connectionless service
    - delivery is not guaranteed
TCP
Transmission Control Protocol
• End-to-end protocol
• Reliable connection: provides flow and error control
• In TCP terms, a connection is a temporary association between entities in different systems
• TCP PDU
    - Called a “TCP segment”
    - Includes source and destination ports, which identify the respective users (applications)
    - A pair of ports (together with the IP addresses) uniquely identifies a connection; such
      identification is necessary in order for TCP to track segments between entities.
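
The idea that a pair of ports plus the two IP addresses identifies a connection can be observed with a small loopback experiment. The sketch below assumes a machine where the loopback address 127.0.0.1 is usable; the function name tcp_connection_demo is hypothetical.

```python
import socket


def tcp_connection_demo():
    """Open one TCP connection on the loopback interface and report its endpoints."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        server.listen(1)
        server_addr = server.getsockname()

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(server_addr)       # connection establishment
            conn, peer = server.accept()
            with conn:
                client.sendall(b"hello")      # carried in a TCP segment
                data = conn.recv(1024)
                return {
                    "client": client.getsockname(),   # (IP, port) of the client side
                    "server": client.getpeername(),   # (IP, port) of the server side
                    "seen_by_server": peer,
                    "data": data,
                }


print(tcp_connection_demo())
```

The two (IP, port) pairs printed for "client" and "server" together form the identifier of this one connection.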

HTTP
The Hypertext Transfer Protocol (HTTP) is a protocol
used mainly to access data on the World Wide Web.
HTTP functions as a combination of FTP and SMTP.
Topics discussed in this section:
HTTP Transaction
Persistent Versus Nonpersistent Connection

HTTP
The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed,
collaborative, hypermedia information systems. It has been the foundation for data
communication on the World Wide Web since 1990. HTTP is a generic and
stateless protocol which can be used for other purposes as well, using extensions of its
request methods, error codes, and headers.
Basically, HTTP is a TCP/IP-based communication protocol that is used to deliver data
(HTML files, image files, query results, etc.) on the World Wide Web. The default port is TCP
80, but other ports can be used as well.
It provides a standardized way for computers to communicate with each other. The HTTP
specification defines how clients' request data is constructed and sent to the server,
and how servers respond to these requests.

Basic Features
There are three basic features that make HTTP a simple but powerful protocol:
HTTP is connectionless: The HTTP client, i.e., a browser, initiates an HTTP request and then
waits for the response. The server processes the request and sends a response back, after
which the client closes the connection. So the client and server know about each other only during the current request
and response. Further requests are made on a new connection, as if the client and server were new to each
other.
HTTP is media independent: Any type of data can be sent by HTTP as long as both the client and
the server know how to handle the data content. Both the client and the server are required to specify
the content type using the appropriate MIME type.
HTTP is stateless: As mentioned above, HTTP is connectionless, and this is a direct result of HTTP being a
stateless protocol. The server and client are aware of each other only during the current request. Afterwards,
both of them forget about each other. Because of this, neither the client nor the server
can retain information between different requests across web pages.

HTTP uses the services of TCP on well-known port 80.
HTTP is a pull protocol: the
client pulls information from
the server (rather than the server
pushing information down to
the client).

HTTP is a request/response protocol based on a client/server
architecture, where web browsers, robots, search engines, etc. act as HTTP clients and
the Web server acts as the server.
Client
The HTTP client sends a request to the server in the form of a request method, URI, and
protocol version, followed by a MIME-like message containing request modifiers, client
information, and possible body content over a TCP/IP connection.
Server
The HTTP server responds with a status line, including the message's protocol version and a
success or error code, followed by a MIME-like message containing server information, entity
meta information, and possible entity-body content.

HTTP Protocol
As mentioned, whenever you enter a URL in the address box of the browser, the browser translates the URL into a
request message according to the specified protocol and sends the request message to the server.
For example, the browser translates the URL http://www.nowhere123.com/docs/index.html into the following
request message:
GET /docs/index.html HTTP/1.1
Host: www.nowhere123.com
Accept: image/gif, image/jpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
(blank line)
When this request message reaches the server, the server can take one of these actions:
• The server interprets the request received, maps the request into a file under the server's document directory,
and returns the file requested to the client.
• The server interprets the request received, maps the request into a program kept in the server, executes the
program, and returns the output of the program to the client.
• The request cannot be satisfied, and the server returns an error message.
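
The translation from URL to request message can be sketched in a few lines of Python. This is a simplified illustration, not what any particular browser does: the helper name build_get_request is hypothetical, and only a minimal set of headers is produced.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Build a minimal HTTP/1.1 GET request message for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path means the site root
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",       # request line: method, path, version
        f"Host: {parts.netloc}",      # host comes from the URL's host address
        "Accept: */*",
        "",                           # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


print(build_get_request("http://www.nowhere123.com/docs/index.html"))
```

Each line of the message ends with a carriage return and line feed, and the blank line marks the end of the headers.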

An example of the HTTP response message is as shown:
HTTP/1.1 200 OK
Date: Sun, 18 Oct 2009 08:56:53 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Sat, 20 Nov 2004 07:16:26 GMT
ETag: "10000000565a5-2c-3e94b66c2e680"
Accept-Ranges: bytes
Content-Length: 44
Connection: close
Content-Type: text/html
X-Pad: avoid browser bug
(blank line)
<html><body><h1>It works!</h1></body></html>
The browser receives the response message, interprets the message and displays the
contents of the message in the browser's window according to the media type of
the response (as in the Content-Type response header). Common media types include
"text/plain", "text/html", "image/gif", "image/jpeg", "audio/mpeg", "video/mpeg",
"application/msword", and "application/pdf".

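How a client might take a response message apart can be sketched as follows. The parser is deliberately minimal and assumes a well-formed message; the function name parse_response is hypothetical, and the sample message reuses a few headers from the response shown above.

```python
def parse_response(raw):
    """Split a raw HTTP response into status line parts, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")      # blank line separates headers and body
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Length: 44\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body><h1>It works!</h1></body></html>"
)
version, code, reason, headers, body = parse_response(raw)
print(code, reason, headers["content-type"], len(body))
```

The Content-Type header tells the client how to display the body, and Content-Length tells it how many bytes of body to expect.
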
HTTP Request Methods
The HTTP protocol defines a set of request methods. A client can use one of these request methods to send a
request message to an HTTP server. The methods are:
•GET: A client can use the GET request to get a web resource from the server.
•HEAD: A client can use the HEAD request to get the header that a GET request would have obtained. Since
the header contains the last-modified date of the data, this can be used to check against the local cache
copy.
•POST: Used to post data up to the web server.
•PUT: Ask the server to store the data.
•DELETE: Ask the server to delete the data.
•TRACE: Ask the server to return a diagnostic trace of the actions it takes.
•Other extension methods.

GET Method
A GET request retrieves data from a web server by specifying
parameters in the URL portion of the request. This is the main
method used for document retrieval.
HEAD Method
The HEAD method is functionally similar to GET, except that the
server replies with a response line and headers, but no entity-
body.
POST Method
The POST method is used when you want to send some data to
the server, for example, file update, form data, etc.

This example retrieves a document. We use the GET method to retrieve an image
with the path /usr/bin/image1. The request line shows the method (GET), the URL,
and the HTTP version (1.1). The header has two lines that show that the client can
accept images in the GIF or JPEG format. The request does not have a body. The
response message contains the status line and four lines of header. The header lines
define the date, server, MIME version, and length of the document. The body of the
document follows the header.

What is Domain Name System (DNS)
The Domain Name System (DNS) is the phone book of the Internet - it links host names
to IP addresses. In this manner, users can easily use a domain name.
When you type a domain name into your web browser, your computer checks whether it
knows the IP address to which this request should go. If it doesn't know, it asks its DNS
servers.
These DNS servers are set in the network settings and are provided by default by your
Internet service provider. The servers will check if they know the correct IP address, and
if they don't, they will go through a number of steps to determine the correct IP
address.

Domain Name System (DNS) in the Application Layer
DNS is a host name to IP address translation service. DNS is a distributed database
implemented in a hierarchy of name servers.
Requirement
Every host is identified by its IP address, but remembering numbers is very difficult for
people, and IP addresses are not static. A mapping is therefore required to
translate a domain name into an IP address. DNS is used to convert the domain names of
websites to their numerical IP addresses.

Domain:
There are various kinds of domains:
1. Generic domains: .com (commercial), .edu (educational), .mil (military),
.org (non-profit organization), .net (similar to commercial).
2. Country domains: .in (India), .us, .uk
3. Inverse domain: used when we want to find the domain name that belongs to
an IP address (IP to domain name mapping). So DNS can provide both
mappings.
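
For the inverse (IP to name) direction, DNS looks up a specially constructed name under the in-addr.arpa domain. Python's standard ipaddress module can build that name; the helper name reverse_lookup_name is hypothetical, and the sketch only builds the name without contacting any DNS server.

```python
import ipaddress


def reverse_lookup_name(ip):
    """Return the domain name that a reverse (IP to name) DNS query asks about."""
    # The octets are reversed so that the most general part comes last,
    # matching the way ordinary domain names are read.
    return ipaddress.ip_address(ip).reverse_pointer


print(reverse_lookup_name("68.178.157.132"))
```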

Authoritative & Non-authoritative DNS name servers
DNS name servers are commonly split into two categories:
• Authoritative name servers
• Non-authoritative name servers
The difference between these two is that the authoritative name server
holds the record for a domain, and the non-authoritative name server does
not. Instead, it just asks other name servers for their records.

It is difficult to find the IP address associated with a website because there
are millions of websites, and for any of them we should be able to obtain
the IP address immediately. To avoid long delays, the organization of the database is
very important.
DNS record – Holds the domain name, the IP address, the validity period (time to
live, TTL), and all other information related to that domain name. These records are
stored in a tree-like structure.
Namespace – The set of possible names, flat or hierarchical. A naming system
maintains a collection of bindings of names to values: given a name, a resolution
mechanism returns the corresponding value.
Name server – An implementation of the resolution mechanism. DNS
(Domain Name System) is the name service of the Internet. A zone is an administrative
unit; a domain is a subtree.
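
A record with a time to live can be modelled as a small cache whose entries expire. The following is a minimal sketch rather than how real resolvers are implemented: the class names DnsRecord and RecordCache are hypothetical, example.com and 192.0.2.10 are placeholder values, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class DnsRecord:
    name: str
    address: str
    ttl: int            # validity period in seconds
    stored_at: float    # time at which the record was stored

    def is_valid(self, now):
        return now - self.stored_at < self.ttl


class RecordCache:
    """Name-to-address bindings that are forgotten once their TTL has passed."""

    def __init__(self):
        self._records = {}

    def put(self, name, address, ttl, now):
        self._records[name.lower()] = DnsRecord(name.lower(), address, ttl, now)

    def get(self, name, now):
        record = self._records.get(name.lower())
        if record is None or not record.is_valid(now):
            self._records.pop(name.lower(), None)   # drop expired entries
            return None
        return record.address


cache = RecordCache()
cache.put("example.com", "192.0.2.10", ttl=300, now=0)
print(cache.get("example.com", now=100))   # still within the TTL
print(cache.get("example.com", now=300))   # TTL has run out
```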

Hierarchy of Name Servers
Root name servers – These are contacted by name servers that cannot resolve a name. A root server contacts the
authoritative name server if the name mapping is not known. It then gets the mapping and returns the
IP address to the host.
Top-level servers – These are responsible for com, org, edu, etc. and all top-level country domains such as uk,
fr, ca, in, etc. They have information about authoritative domain servers and know the names and IP addresses
of each authoritative name server for the second-level domains.
Authoritative name servers – This is the organization’s DNS server, providing the authoritative host name to
IP mapping for the organization's servers. It can be maintained by the organization or by a service provider. In
order to reach cse.dtu.in we have to ask the root DNS server, which points to the top-level
domain server, which in turn points to the authoritative name server that actually holds the IP
address. The authoritative name server then returns the associated IP address.

The client machine sends a request to the local name server, which, if
it does not find the address in its database, sends a request to the
root name server, which in turn routes the query to an
intermediate or authoritative name server.
The root name server can also contain some host name to IP address
mappings.
The intermediate name server always knows which server is the authoritative
name server. Finally, the IP address is returned to the local name
server, which in turn returns it to the host.
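
The walk from the local name server through the root, top-level and authoritative servers can be simulated with a few dictionaries. Everything in this sketch is made up for illustration: the server labels ("tld-in", "auth-dtu"), the tables, and the address 192.0.2.25 assigned to cse.dtu.in are hypothetical, and no real DNS traffic is involved.

```python
# Hypothetical tables standing in for the three levels of name servers.
ROOT = {"in": "tld-in"}                               # root: TLD -> top-level server
TLD = {"tld-in": {"dtu.in": "auth-dtu"}}              # TLD server: domain -> authoritative server
AUTH = {"auth-dtu": {"cse.dtu.in": "192.0.2.25"}}     # authoritative server: host -> IP address


def resolve(hostname, local_cache):
    """Resolve hostname as a local name server would; return (address, servers asked)."""
    if hostname in local_cache:
        return local_cache[hostname], ["local cache"]

    labels = hostname.split(".")
    tld = labels[-1]
    domain = ".".join(labels[-2:])

    steps = ["root"]
    tld_server = ROOT[tld]                  # the root points to the top-level server
    steps.append(tld_server)
    auth_server = TLD[tld_server][domain]   # which points to the authoritative server
    steps.append(auth_server)
    address = AUTH[auth_server][hostname]   # which holds the actual mapping

    local_cache[hostname] = address         # remember the answer for later queries
    return address, steps


cache = {}
print(resolve("cse.dtu.in", cache))
print(resolve("cse.dtu.in", cache))
```

The second call is answered from the local name server's own database, so no other server needs to be asked.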

Intranet
An intranet is defined as a private network of computers within an organization, with its own server
and firewall. We can describe an intranet as follows:
•An intranet is a system in which multiple PCs are networked to each other. PCs in an
intranet are not available to the world outside of the intranet.
•Usually each company or organization has its own intranet, and
members/employees of that company can access the computers in it.
•Every computer on the Internet is identified by a unique IP address.
•Each computer in an intranet is also identified by an IP address, which is unique among the
computers in that intranet.

Benefits
An intranet is a very efficient and reliable network system for any organization.
It is beneficial in many respects, such as collaboration, cost-effectiveness, security,
productivity and much more.
Communication
An intranet offers easy and cheap communication within an
organization. Employees can communicate using chat, e-mail or
blogs.
Time Saving
Information on an intranet is shared in real time.
Collaboration
Information is distributed among the employees according to
requirement and can be accessed by authorized users,
resulting in enhanced teamwork.

Platform Independence
An intranet can connect computers and other devices with different architectures.
Cost Effective
Employees can view data and other documents in a browser rather than printing them and distributing duplicate copies
among the employees, which decreases cost.
Workforce Productivity
Data is available at all times and can be accessed from a company workstation. This helps employees work faster.
Business Management
It is also possible to deploy applications that support business operations.
Security
Since information shared on an intranet can only be accessed within the organization, there is little chance of it being
stolen.
Specific Users
An intranet targets only specific users within an organization, so each user knows exactly whom they are interacting with.
Immediate Updates
Any changes made to information are reflected immediately to all the users.

Uniform Resource Locator (URL)
Every document on the Web has a unique address. This address is known as Uniform
Resource Locator (URL).
Several HTML and other markup language tags include a URL attribute value, including
hyperlinks, inline images, and forms. All of them use the same syntax to specify the location
of a web resource, regardless of the type or content of that resource. That's why it is known
as a Uniform Resource Locator.
URL Elements
A URL is made up of several parts, each of which offers information to the web browser to
help find the page. The parts are easiest to learn from an example. The URL
given below has three key parts: the scheme, the host address, and the file path. The
following sections discuss each of them:
http://www.tutorialspoint.com/index.htm

The Scheme
The scheme identifies the type of protocol and URL you are linking to and
therefore, how the resource should be retrieved. For example, most web
browsers use Hypertext Transfer Protocol (HTTP) to pass information to
communicate with the web servers and this is the reason a URL starts with
http://.
There are other schemes available, and you can use any of them based on
your requirement:

Sr.No. Scheme and Description
1. http://
   Hypertext Transfer Protocol (HTTP) is used to request pages from Web servers and send them back
   from Web servers to browsers.
2. https://
   Hypertext Transfer Protocol Secure (HTTPS) encrypts the data sent between the browser and the
   Web server using a digital certificate.
3. ftp://
   File Transfer Protocol (FTP) is another method for transferring files on the Web. While HTTP is a lot more
   popular for viewing Web sites because of its integration with browsers, FTP is still a commonly used
   protocol for transferring large files across the Web and uploading source files to your Web server.
4. file://
   Used to indicate that a file is on the local hard disk or a shared directory on a LAN.

The Host Address
The host address is where a website can be found, either the IP address (four sets of
numbers between 0 and 255, for example 68.178.157.132 ) or more commonly the
domain name for a site such as www.tutorialspoint.com.
The File Path
The file path always begins with a forward slash character and may consist of one or more
directory or folder names. Directory names are separated by forward slash characters,
and the file path may end with a filename. Here index.htm is the filename, which
is located in the html directory:
https://www.tutorialspoint.com/html/index.htm

Other Parts of the URL
Credentials are a way of specifying a username and password for a password-protected
part of a site. The credentials come before the host address and are separated from the
host address by an @ sign. Note how the username is separated from the password by a colon.
The following URL shows the username admin and the password admin123:
https://admin:admin123@tutorialspoint.com/admin/index.htm
Using the above URL, you can authenticate as the administrator; if the provided ID and password are
correct, the administrator will have access to the index.htm file in the admin directory.
You can use a telnet URL to connect to a server as follows:
telnet://user:password@tutorialspoint.com:port/
Another important piece of information is the web server port number. By default, an HTTP server runs on
port number 80. If you are running a server on any other port number, it can be
provided as follows, assuming the server is running on port 8080:
https://www.tutorialspoint.com:8080/index.htm
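
The parts of a URL can be pulled apart with Python's standard urllib.parse module. The sketch below is only an illustration: the helper name url_parts is hypothetical, and the sample URL combines the credentials and port examples above into a single address for demonstration.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Return the main parts of a URL as a dictionary."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,       # how the resource should be retrieved
        "username": parts.username,   # None when no credentials are given
        "password": parts.password,
        "host": parts.hostname,       # the host address
        "port": parts.port,           # None when the default port is used
        "path": parts.path,           # the file path
    }


print(url_parts("https://admin:admin123@tutorialspoint.com:8080/admin/index.htm"))
```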

Absolute and Relative URLs
You may address a URL in one of the following two ways:
•Absolute − An absolute URL is the complete address of a
resource. For example
http://www.tutorialspoint.com/html/html_text_links.htm
•Relative − A relative URL indicates where the resource is in
relation to the current page. It is combined with a base URL (the
address of the current page, or the one given by the <base> element)
to form a complete URL. For example
/html/html_text_links.htm
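
How a relative URL is combined with a base URL can be sketched with urljoin from Python's standard library. The base address used here (the index.htm page in the html directory) is an assumption for the example, and the helper name resolve_url is hypothetical.

```python
from urllib.parse import urljoin

BASE = "http://www.tutorialspoint.com/html/index.htm"   # assumed current page


def resolve_url(relative, base=BASE):
    """Turn a relative URL into an absolute one, relative to the base URL."""
    return urljoin(base, relative)


print(resolve_url("/html/html_text_links.htm"))   # path starting from the site root
print(resolve_url("html_text_links.htm"))         # file in the same directory
print(resolve_url("../index.htm"))                # file one directory up
```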

Email
Email is a service that allows us to send messages electronically over the Internet.
It offers an efficient, inexpensive, and near real-time method of distributing information among
people.
E-mail Address
Each user of email is assigned a unique name for their email account. This name is known as the
e-mail address.
Users send and receive messages according to their e-mail addresses.
• The username and the domain name are separated by the @ (at)
symbol.
• E-mail addresses are generally not case sensitive.
• Spaces are not allowed in an e-mail address.
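
These rules can be turned into a small checking function. This is a simplified sketch, not a full validator: the function name split_email_address is hypothetical, and it lowercases only the domain part, which is always case-insensitive, leaving the username as typed since its treatment is up to the mail provider.

```python
def split_email_address(address):
    """Split an e-mail address into (username, domain), applying the basic rules."""
    if " " in address:
        raise ValueError("spaces are not allowed in an e-mail address")
    if address.count("@") != 1:
        raise ValueError("exactly one @ must separate username and domain")
    username, domain = address.split("@")
    if not username or not domain:
        raise ValueError("both username and domain are required")
    return username, domain.lower()


print(split_email_address("student@Example.COM"))
```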

E-mail System
An e-mail system comprises the following three components:
•Mailer
•Mail Server
•Mailbox
Mailer
It is also called a mail program, mail application or mail client. It allows us to
manage, read and compose e-mail.
Mail Server
The function of a mail server is to receive, store and deliver e-mail. Mail servers
must be running all the time, because if a server crashes or is down,
e-mail can be lost.
Mailbox
A mailbox is generally a folder that contains e-mails and information about
them.

Working of E-mail
Email follows the client-server approach. The client is the mailer, i.e., the mail application or mail program, and
the server is a device that manages emails.
The following example walks through the basic steps involved in sending and receiving emails:
• Suppose person A wants to send an email message to person B.
• Person A composes the message using a mailer program, i.e., a mail client, and then selects the Send option.
• The message is routed using the Simple Mail Transfer Protocol (SMTP) to person B’s mail server.
• The mail server stores the email message on disk in an area designated for person B.
• The disk space area on the mail server is called the mail spool.
• Now, suppose person B is running a POP client that knows how to communicate with B’s mail server.
• The client periodically polls the POP server to check whether any new email has arrived for B. Since person A has sent an
email to person B, the email is forwarded over the network to B’s PC. The message is now stored on person B’s PC.

E-mail Message Components
E-mail message comprises of different
components: E-mail Header, Greeting, Text,
and Signature. These components are
described in the following diagram:

E-mail Header
The first lines of an e-mail message are called the e-mail header. The header
comprises the following fields:
• From
• Date
• To
• Subject
• CC
• BCC

From: The From field indicates the sender’s address, i.e., who sent the e-mail.
Date: The Date field indicates the date when the e-mail was sent.
To: The To field indicates the recipient’s address, i.e., to whom the e-mail is sent.
Subject: The Subject field indicates the purpose of the e-mail. It should be precise and to the point.
CC: CC stands for Carbon Copy. It includes the addresses of recipients whom we want to keep informed but who are not the
intended recipients.
BCC: BCC stands for Blind Carbon Copy. It is used when we do not want one or more of the recipients to know that
someone else was copied on the message.
Greeting: The greeting is the opening of the actual message, e.g., Hi Sir or Hi Guys.
Text: It represents the actual content of the message.
Signature: This is the final part of an e-mail message. It includes the name of the sender, address, and contact number.
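
These components can be seen together by composing a message with Python's standard email package. The sketch below only builds the message and does not send it; the addresses, subject, date and wording are placeholders invented for the example, and build_message is a hypothetical helper name.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime


def build_message():
    """Compose a message with header fields, greeting, text and signature."""
    msg = EmailMessage()
    # Header fields
    msg["From"] = "a@example.com"
    msg["To"] = "b@example.com"
    msg["Cc"] = "c@example.com"
    msg["Subject"] = "Project meeting"
    msg["Date"] = format_datetime(datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc))
    # Body: greeting, text and signature
    msg.set_content(
        "Hi Sir,\n"
        "\n"
        "The project meeting starts at 10 am.\n"
        "\n"
        "Regards,\n"
        "Person A\n"
    )
    return msg


print(build_message())
```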

Advantages
E-mail has proven to be a powerful and reliable medium of communication.
Here are the benefits of e-mail:
•Reliable
•Convenience
•Speed
•Inexpensive
•Printable
•Global
•Generality
Reliable
Many mail systems notify the sender if an e-mail message was undeliverable.
Convenience
Sending or receiving e-mail requires no stationery or stamps, and one does not
have to go to the post office.

Speed
E-mail is very fast. However, the speed also depends upon the underlying network.
Inexpensive
The cost of sending e-mail is very low.
Printable
It is easy to obtain a hard copy of an e-mail. An electronic copy of an e-mail can also be saved for records.
Global
E-mail can be sent and received by a person sitting across the globe.
Generality
It is also possible to send graphics, programs and sounds with an e-mail.
Disadvantages
Apart from its several benefits, e-mail also has some disadvantages, as discussed below:
•Forgery
•Overload
•Misdirection
•Junk
•No response

Forgery
E-mail does not prevent forgery, that is, someone impersonating the sender,
since the sender is usually not authenticated in any way.
Overload
Convenience of E-mail may result in a flood of mail.
Misdirection
It is possible that you may send e-mail to an unintended recipient.
Junk
Junk emails are undesirable and inappropriate emails. Junk emails are sometimes
referred to as spam.
No Response
It may be frustrating when the recipient does not read and respond to e-mail on a
regular basis.

E-mail protocols are sets of rules that help the client properly transmit information to or
from the mail server. Here, we discuss the protocols SMTP, POP, and IMAP.
Email Protocol 1: SMTP
SMTP stands for Simple Mail Transfer Protocol. It was first proposed in 1982.
It is a standard protocol used for sending e-mail efficiently and reliably over the Internet.
Key Points:
•SMTP is an application-level protocol (port 25).
•It handles the exchange of messages between e-mail servers over a TCP/IP network.
•Apart from transferring e-mail, SMTP also provides notification regarding incoming mail.
•When you send e-mail, your e-mail client sends it to your e-mail server, which then contacts the
recipient's mail server using an SMTP client.
•SMTP commands specify the sender’s and receiver’s e-mail addresses, along with the message
to be sent.
•If a message cannot be delivered, an error report is sent to the sender, which makes SMTP
a reliable protocol.

S.N.  SMTP Command: Description
1  HELO: This command initiates the SMTP conversation.
2  EHLO: This is an alternative command to initiate the conversation. It indicates that the
   sender server wants to use the extended SMTP (ESMTP) protocol.
3  MAIL FROM: This indicates the sender’s address.
4  RCPT TO: It identifies the recipient of the mail. To deliver the same message to multiple
   users, this command can be repeated multiple times.
5  SIZE: This lets the server know the size of the attached message in bytes.
6  DATA: The DATA command signifies that a stream of data will follow. Here the stream of data
   refers to the body of the message.
7  QUIT: This command is used to terminate the SMTP connection.
8  VRFY: This command is used by the receiving server to verify whether the given
   username is valid.
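
The order in which these commands appear in a typical session can be sketched by generating the client's side of the dialogue. On the wire the greeting commands are spelled HELO and EHLO. The function name smtp_commands and all host names and addresses are hypothetical, and the sketch only produces the text of the commands without contacting a server.

```python
def smtp_commands(client_host, sender, recipients, body):
    """Return the client commands for sending one message, in order."""
    commands = [
        f"EHLO {client_host}",            # start the conversation (extended SMTP)
        f"MAIL FROM:<{sender}>",          # sender's address
    ]
    for recipient in recipients:          # RCPT TO is repeated per recipient
        commands.append(f"RCPT TO:<{recipient}>")
    commands.append("DATA")               # the message follows
    commands.append(body + "\r\n.")       # a line holding only "." ends the data
    commands.append("QUIT")               # terminate the connection
    return commands


for line in smtp_commands("client.example.com", "a@example.com",
                          ["b@example.com", "c@example.com"], "Hello B and C"):
    print(line)
```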

Email Protocol 2: IMAP
IMAP stands for Internet Message Access Protocol.
Key Points:
IMAP allows the client program to manipulate e-mail messages on the
server without downloading them to the local computer.
The e-mail is stored and maintained by the remote server.
It enables us to take actions such as downloading or deleting a mail
without reading it. It also enables us to create, manipulate and delete
remote message folders called mailboxes.

Email Protocol 3: POP
POP stands for Post Office Protocol. It is generally used to support a single client. There are several
versions of POP, but POP3 is the current standard.
Key Points:
•POP is an application layer Internet standard protocol.
•Since POP supports offline access to messages, it requires less Internet usage time.
•POP does not provide a search facility.
•In order to access the messages, it is necessary to download them.
•It allows only one mailbox to be created on the server.

Search Engine
A search engine is a website that allows users to look up information on the World Wide Web (WWW).
The search engine achieves this by looking at many web pages to find matches to the user's search
inputs. It returns results ranked by relevancy and popularity.
Some popular search engines are Google, Yahoo!, Ask.com, AltaVista, AOL Search and Bing.
To use a search engine you must enter at least one keyword into the search box. Usually an on-screen
button must be clicked to submit the search. The search engine looks for matches between the
keyword(s) entered and its database of websites and words.
After the user enters a search or query into the search bar, a list of results appears on the screen,
known as the search engine results page (SERP).
This list of web pages contains matches related to the user's query, in an order determined by a
ranking system.
Most search engines remove "spam" pages from the list to provide better results.
The user can then click on any of the links to go to that web page.

Search engines are some of the most advanced websites on the web. They use
special computer code to sort the web pages on SERPs. The most popular or
highest quality web pages will be near the top of the list.
When a user types words into the search engine, it looks for web pages with those
words. There could be thousands, or even millions, of web pages with those
words. So, the search engine helps users by putting the web pages it thinks the
user wants first.
Search engines are very useful to find information about anything quickly and
easily. Using more keywords or different keywords improves the results of
searches.

Search Engine Components
Generally, there are three basic components of a search engine:
• Web Crawler
• Database
• Search Interfaces
Web Crawler: - It is also known as a spider or bot. It is a software component that traverses the web to gather
information.
Database: - The information gathered from the web is stored in a database. It holds a huge collection of web resources.
Search Interfaces: - This component is an interface between the user and the database. It helps the user to search
through the database.
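
The three components can be illustrated with a toy sketch. It assumes a hypothetical in-memory "web" (the TOY_WEB dictionary and its example addresses are made up for illustration) rather than real network access, and the function names are not taken from any real search engine.

```python
from collections import deque

# A hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("internet history and arpanet", ["b.example", "c.example"]),
    "b.example": ("email protocols pop and imap", ["c.example"]),
    "c.example": ("search engines crawl the internet", ["a.example"]),
}


def crawl(start_url, web):
    """Web crawler: follow links breadth-first and collect each page's text."""
    database = {}                      # the "database" component: URL -> text
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in database or url not in web:
            continue                   # already visited, or a dead link
        text, links = web[url]
        database[url] = text
        queue.extend(links)
    return database


def search(database, keyword):
    """Search interface: return the URLs whose stored text contains the keyword."""
    keyword = keyword.lower()
    return sorted(url for url, text in database.items() if keyword in text.split())
```

Calling crawl("a.example", TOY_WEB) fills the database once; search then answers queries from that stored copy without visiting the pages again.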

Search Engine Working: -
The search engine looks for the keyword in the index of its predefined database instead of going directly
to the web to search for the keyword.
That database is built in advance by the web crawler, the software component that gathers pages from the
web. Search software then looks up the keyword in the database.
Once the matching pages are found, the search engine shows the relevant web pages as results.
Each retrieved result generally includes the title of the page, the size of the text portion, the first several sentences, etc.
These details may vary from one search engine to another. The retrieved information is ranked
according to various factors such as frequency of keywords, relevance of information, links, etc.
The user can click on any of the search results to open it.
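
The index lookup and ranking described above can be sketched as follows. This is a simplified illustration that ranks only by keyword frequency; real search engines combine many more factors. The page names and texts are invented for the example.

```python
from collections import Counter, defaultdict


def build_index(pages):
    """Build an inverted index: word -> {url: number of occurrences}."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] += 1
    return index


def lookup(index, query):
    """Look the query words up in the index and rank pages by total frequency."""
    scores = Counter()
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical pages, standing in for the crawled database.
pages = {
    "p1": "python tutorial python examples",
    "p2": "python news",
    "p3": "cooking recipes",
}
index = build_index(pages)
print(lookup(index, "python"))
```

The query never touches the page texts directly: it is answered entirely from the index, which is why the lookup stays fast even when the database is large.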

There are several reasons why Google works so fast:
1. Google operates its own extremely powerful computers.
2. Google's RankBrain works intelligently and helps it understand the meaning
of a search query.
3. Importantly, Google does not always access its main server; it
accesses local servers chosen according to the user's location and query.
For these reasons, Google responds faster than other search engines.

Newsgroup :
A newsgroup is an Internet-based discussion around an individual, entity, organization or topic. Newsgroups
enable remotely connected users to share, discuss and learn about their topic of interest by exchanging text
messages, images, videos and other forms of digital content.
Newsgroups are also referred to as Usenet newsgroups.
Newsgroups were initially created in 1979 by some university students to exchange messages.
Users can subscribe for free by submitting an email address, and the group generally consists of several
topics/categories based around a main theme.

The user/subscriber can post a message in a particular topic/category, which is
either automatically visible in open newsgroups, or can only be viewed by
approved members in moderated groups.
All subscribers participating in or following a particular topic/newsgroup will be
notified of new messages and updates.
Moreover, news/stories/topics in the newsgroup can be read through a
downloadable news reader application.

Newsgroup Classification
There exist a number of newsgroups distributed all around the world. These are identified using a hierarchical naming
system in which each newsgroup is assigned a unique name that consists of alphabetic strings separated by periods.
The leftmost portion of the name represents the top-level category of the newsgroup followed by subtopic. The subtopic
can further be subdivided and subdivided even further (if needed).
For example, the newsgroup comp.lang.C++ contains discussion of the C++ language.
The leftmost part, comp, classifies the newsgroup as one that contains discussion of computer-related topics.
The second part, lang, identifies a subtopic: computer languages.
The third part identifies one particular computer language, in this case C++.
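
The hierarchical naming scheme can be sketched as a small parser that splits a name at its periods. The function name parse_newsgroup is illustrative only.

```python
def parse_newsgroup(name):
    """Split a newsgroup name into its top-level category and subtopics."""
    parts = name.split(".")
    # A valid name needs a top-level category, at least one subtopic,
    # and no empty parts (such as in "comp..lang").
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"not a hierarchical newsgroup name: {name!r}")
    return {"top_level": parts[0], "subtopics": parts[1:]}


print(parse_newsgroup("comp.lang.C++"))
```

For comp.lang.C++, the top-level category is comp and the subtopics, from general to specific, are lang and C++.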

The following list shows the top-level hierarchies of Usenet newsgroups, with examples:
comp.*: Computer-related topics, including computer hardware, software, languages etc. Example: comp.database.oracle
news.*: Newsgroup and Usenet topics. Example: news.software.nntp
rec.*: Artistic activities, hobbies, or recreational activities such as books, movies etc. Example: rec.arts.animation
sci.*: Scientific topics. Example: sci.bio.botany
soc.*: Social issues and various cultures. Examples: soc.culture.india, soc.politics.india
talk.*: Conventional subjects such as religion, politics etc.
humanities.*: Art, literature, philosophy and culture. Example: humanities.classics
misc.*: Miscellaneous topics, i.e. issues that may not fit into other categories. Examples: misc.answers, misc.books.technical

What Is a Directory Service?
A directory service is a customizable information store that functions as a single
point from which users can locate resources and services distributed throughout
the network. This customizable information store also gives administrators a
single point for managing its objects and their attributes.
Although this information store appears as a single point to the users of the
network, it is actually most often stored in a distributed form.
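
The idea of a single access point over distributed storage can be sketched as below. This is a toy model under stated assumptions: the class DirectoryService, its methods, and the sample objects are all hypothetical, and plain dictionaries stand in for the separate storage locations.

```python
class DirectoryService:
    """Single lookup point over several partitions of directory data."""

    def __init__(self, partitions):
        # Each partition is a dict: object name -> attributes (a dict).
        self._partitions = partitions

    def locate(self, name):
        """Return the attributes of a named object, whichever partition holds it."""
        for partition in self._partitions:
            if name in partition:
                return partition[name]
        return None

    def set_attribute(self, name, key, value):
        """Administrative update through the same single entry point."""
        attributes = self.locate(name)
        if attributes is None:
            raise KeyError(name)
        attributes[key] = value


# Hypothetical data held in two separate locations.
site_a = {"printer-2f": {"type": "printer", "location": "floor 2"}}
site_b = {"alice": {"type": "user", "mail": "alice@example.com"}}
directory = DirectoryService([site_a, site_b])
print(directory.locate("alice"))
```

Users and administrators only ever talk to the directory object; which partition actually stores an entry stays hidden from them.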

The database that forms a directory service is not designed for transactional data.
(For this reason, many people prefer to use the phrase “information store” in their
definitions of a directory service.)
The data stored in your directory service should be fairly stable and should change
only as frequently as the objects in your network. For example, the data that
forms a directory service changes much less frequently than a sales database.
