World Wide Web
From Wikipedia, the free encyclopaedia
"WWW" and "The web" redirect here. For other uses of WWW, see WWW (disambiguation).
For other uses of web, see Web (disambiguation).
Not to be confused with the Internet.
A visualization of routing paths through a portion of the Internet
World Wide Web
The web's logo designed by Robert Cailliau
The World Wide Web (abbreviated as WWW or W3, commonly known as the web) is
a system of interlinked hypertext documents accessed via the Internet. With a web browser,
one can view web pages that may contain text, images, videos, and
other multimedia and navigate between them via hyperlinks.
The web was developed between March 1989 and December 1990. Using concepts from
his earlier hypertext systems such as ENQUIRE, British engineer and computer scientist Tim
Berners-Lee, at that time an employee of CERN and now Director of the World
Wide Web Consortium (W3C), wrote a proposal in March 1989 for what would eventually
become the World Wide Web. The 1989 proposal was meant for a more effective CERN
communication system but Berners-Lee eventually realised the concept could be
implemented throughout the world.  At CERN, a European research organisation
near Geneva straddling the border between France and Switzerland, Berners-Lee and
Belgian computer scientist Robert Cailliau proposed in 1990 to use hypertext "to link and
access information of various kinds as a web of nodes in which the user can browse at
will", and Berners-Lee finished the first website in December that year. Berners-Lee
posted the project on the alt.hypertext newsgroup on 6 August 1991.
Main article: History of the World Wide Web
The NeXT Computer used by Berners-Lee. The handwritten label declares, "This machine is
a server. DO NOT POWER IT DOWN!!"
In the May 1970 issue of Popular Science magazine, Arthur C. Clarke predicted that satellites
would someday "bring the accumulated knowledge of the world to your fingertips" using a
console that would combine the functionality of the photocopier, telephone, television and a
small computer, allowing data transfer and video conferencing around the globe.
In March 1989, Tim Berners-Lee wrote a proposal that referenced ENQUIRE, a database and
software project he had built in 1980, and described a more elaborate information
management system.
With help from Robert Cailliau, he published a more formal proposal (on 12 November
1990) to build a "Hypertext project" called "WorldWideWeb" (one word, also "W3") as a
"web" of "hypertext documents" to be viewed by "browsers" using a client–server
architecture. This proposal estimated that a read-only web would be developed within three
months and that it would take six months to achieve "the creation of new links and new
material by readers, [so that] authorship becomes universal" as well as "the automatic
notification of a reader when new material of interest to him/her has become available."
While the read-only goal was met, accessible authorship of web content took longer to
mature, with the wiki concept, blogs, Web 2.0 and RSS/Atom.
The proposal was modeled after the SGML reader Dynatext by Electronic Book Technology,
a spin-off from the Institute for Research in Information and Scholarship at Brown
University. The Dynatext system, licensed by
CERN, was a key player in the extension of SGML ISO 8879:1986 to Hypermedia
within HyTime, but it was considered too expensive and had an inappropriate licensing
policy for use in the general high energy physics community, namely a fee for each document
and each document alteration.
The CERN datacenter in 2010 housing some WWW servers
A NeXT Computer was used by Berners-Lee as the world's first web server and also to write
the first web browser, WorldWideWeb, in 1990. By Christmas 1990, Berners-Lee had built
all the tools necessary for a working Web: the first web browser (which was a web editor
as well); the first web server; and the first web pages, which described the project itself.
The first web page may be lost, but Paul Jones of UNC-Chapel Hill in North Carolina
revealed in May 2013 that he has a copy of a page sent to him by Berners-Lee which is the
oldest known web page. Jones stored it on a floppy disk and on his NeXT computer.
On 6 August 1991, Berners-Lee posted a short summary of the World Wide Web project on
the alt.hypertext newsgroup. This date also marked the debut of the Web as a publicly
available service on the Internet, although new users could only access it after 23 August.
For this reason, the date is considered the Internaut's Day. Many news media have reported that the first
photo on the web was uploaded by Berners-Lee in 1992, an image of the CERN house
band Les Horribles Cernettes taken by Silvano de Gennaro; de Gennaro has disclaimed this
story, writing that media were "totally distorting our words for the sake of cheap
sensationalism".
The first server outside Europe was set up at the Stanford Linear Accelerator Center (SLAC)
in Palo Alto, California, to host the SPIRES-HEP database. Accounts differ substantially as
to the date of this event. The World Wide Web Consortium says December 1992, whereas
SLAC itself claims 1991. This is supported by a W3C document titled A Little History
of the World Wide Web.
The crucial underlying concept of hypertext originated with older projects from the 1960s,
such as the Hypertext Editing System (HES) at Brown University, Ted Nelson's Project
Xanadu, and Douglas Engelbart's oN-Line System (NLS). Both Nelson and Engelbart were in
turn inspired by Vannevar Bush's microfilm-based "memex", which was described in the
1945 essay "As We May Think".
Berners-Lee's breakthrough was to marry hypertext to the Internet. In his book Weaving The
Web, he explains that he had repeatedly suggested that a marriage between the two
technologies was possible to members of both technical communities, but when no one took
up his invitation, he finally assumed the project himself. In the process, he developed three
essential technologies:
1. a system of globally unique identifiers for resources on the Web and elsewhere, the
universal document identifier (UDI), later known as uniform resource locator (URL)
and uniform resource identifier (URI);
2. the publishing language HyperText Markup Language (HTML);
3. the Hypertext Transfer Protocol (HTTP).
The World Wide Web had a number of differences from other hypertext systems available at
the time. The web required only unidirectional links rather than bidirectional ones, making it
possible for someone to link to another resource without action by the owner of that resource.
It also significantly reduced the difficulty of implementing web servers and browsers (in
comparison to earlier systems), but in turn presented the chronic problem of link rot. Unlike
predecessors such as HyperCard, the World Wide Web was non-proprietary, making it
possible to develop servers and clients independently and to add extensions without licensing
restrictions. On 30 April 1993, CERN announced that the World Wide Web would be free to
anyone, with no fees due. Coming two months after the announcement that the server
implementation of the Gopher protocol was no longer free to use, this produced a rapid shift
away from Gopher and towards the Web. An early popular web browser
was ViolaWWW for Unix and the X Windowing System.
Robert Cailliau, Jean-François Abramatic of IBM, and Tim Berners-Lee at the 10th
anniversary of the World Wide Web Consortium.
Scholars generally agree that a turning point for the World Wide Web began with the
introduction of the Mosaic web browser in 1993, a graphical browser developed by a
team at the National Center for Supercomputing Applications at the University of Illinois at
Urbana-Champaign (NCSA-UIUC), led by Marc Andreessen. Funding for Mosaic came from
the U.S. High-Performance Computing and Communications Initiative and the High
Performance Computing and Communication Act of 1991, one of several computing
developments initiated by U.S. Senator Al Gore. Prior to the release of Mosaic, graphics
were not commonly mixed with text in web pages and the web's popularity was less than
older protocols in use over the Internet, such as Gopher and Wide Area Information
Servers (WAIS). Mosaic's graphical user interface allowed the Web to become, by far, the
most popular Internet protocol.
The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee after he left the
European Organization for Nuclear Research (CERN) in October 1994. It was founded at
the Massachusetts Institute of Technology Laboratory for Computer Science (MIT/LCS) with
support from the Defense Advanced Research Projects Agency (DARPA), which had
pioneered the Internet; a year later, a second site was founded at INRIA (a French national
computer research lab) with support from the European Commission DG InfSo; and in 1996,
a third continental site was created in Japan at Keio University. By the end of 1994, while the
total number of websites was still minute compared to present standards, quite a number
of notable websites were already active, many of which are the precursors or inspiration for
today's most popular services.
Connected by the existing Internet, other websites were created around the world, adding
international standards for domain names and HTML. Since then, Berners-Lee has played an
active role in guiding the development of web standards (such as the markup languages in
which web pages are composed), and has advocated his vision of a Semantic Web. The
World Wide Web enabled the spread of information over the Internet through an easy-to-use
and flexible format. It thus played an important role in popularizing use of the
Internet. Although the two terms are sometimes conflated in popular use, World Wide
Web is not synonymous with Internet. The web is a collection of documents and both
client and server software using Internet protocols such as TCP/IP and HTTP.
Tim Berners-Lee was knighted in 2004 by Queen Elizabeth II for his contribution to the
World Wide Web.
The terms Internet and World Wide Web are often used in everyday speech without much
distinction. However, the Internet and the World Wide Web are not the same. The Internet is
a global system of interconnected computer networks. In contrast, the web is one of the
services that runs on the Internet. It is a collection of text documents and other resources,
linked by hyperlinks and URLs, usually accessed by web browsers from web servers. In
short, the web can be thought of as an application "running" on the Internet.
Viewing a web page on the World Wide Web normally begins either by typing the URL of
the page into a web browser or by following a hyperlink to that page or resource. The web
browser then initiates a series of communication messages, behind the scenes, in order to
fetch and display it. In the 1990s, using a browser to view web pages—and to move from one
web page to another through hyperlinks—came to be known as 'browsing,' 'web surfing,' or
'navigating the web'. Early studies of this new behavior investigated user patterns in using
web browsers. One study, for example, found five user patterns: exploratory surfing, window
surfing, evolved surfing, bounded navigation and targeted navigation.
The following example demonstrates how a web browser works. Consider accessing a page
with the URL http://example.org/wiki/World_Wide_Web.
First, the browser resolves the server-name portion of the URL (example.org) into an Internet
Protocol address using the globally distributed database known as the Domain Name
System (DNS); this lookup returns an IP address such as 18.104.22.168. The browser then
requests the resource by sending an HTTP request across the Internet to the computer at that
particular address. It makes the request to a particular application port in the
underlying Internet Protocol Suite so that the computer receiving the request can distinguish
an HTTP request from other network protocols it may be servicing such as e-mail delivery;
the HTTP protocol normally uses port 80. The content of the HTTP request can be as simple
as two lines of text:

GET /wiki/World_Wide_Web HTTP/1.1
Host: example.org
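The way a browser derives that request from a URL can be sketched in a few lines of Python. This is an illustrative sketch using the standard library's urllib.parse, not a full HTTP client:

```python
from urllib.parse import urlsplit

def build_get_request(url: str) -> str:
    """Form the minimal HTTP/1.1 request a browser would send for a URL."""
    parts = urlsplit(url)          # splits into scheme, netloc, path, query, fragment
    path = parts.path or "/"       # an empty path means the site root
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"

request = build_get_request("http://example.org/wiki/World_Wide_Web")
print(request)
# GET /wiki/World_Wide_Web HTTP/1.1
# Host: example.org
```

A real browser would then open a TCP connection to the resolved IP address on port 80 and write these bytes to it.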
The computer receiving the HTTP request delivers it to web server software listening for
requests on port 80. If the web server can fulfill the request it sends an HTTP response back
to the browser indicating success, which can be as simple as:

HTTP/1.0 200 OK
Content-Type: text/html; charset=UTF-8

followed by the content of the requested page. The Hypertext Markup Language for a basic
web page looks like:

<html>
  <head>
    <title>Example.org – The World Wide Web</title>
  </head>
  <body>
    <p>The World Wide Web, abbreviated as WWW and commonly known ...</p>
  </body>
</html>
The web browser parses the HTML, interpreting the markup (<title>, <p> for paragraph, and
such) that surrounds the words in order to draw the text on the screen.
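A minimal sketch of this parsing step, using Python's built-in html.parser to pull the title and paragraph text out of markup like the example above (the page content here is illustrative):

```python
from html.parser import HTMLParser

class TitleAndTextExtractor(HTMLParser):
    """Collect <title> and <p> text while parsing, roughly as a browser
    does before drawing the words on screen."""
    def __init__(self):
        super().__init__()
        self._current = None      # tag whose text we are currently inside
        self.title = ""
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._current = tag
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "p":
            self.paragraphs[-1] += data

page = ("<html><head><title>Example.org – The World Wide Web</title></head>"
        "<body><p>The World Wide Web, abbreviated as WWW ...</p></body></html>")
parser = TitleAndTextExtractor()
parser.feed(page)
print(parser.title)          # Example.org – The World Wide Web
print(parser.paragraphs[0])  # The World Wide Web, abbreviated as WWW ...
```

A real browser builds a full document tree from the same event stream rather than extracting only these two tags.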
Many web pages use HTML to reference the URLs of other resources such as images, other
embedded media, scripts that affect page behavior, and Cascading Style Sheets that affect
page layout. The browser will make additional HTTP requests to the web server for these
other Internet media types. As it receives their content from the web server, the browser
progressively renders the page onto the screen as specified by its HTML and these additional
Linking
Most web pages contain hyperlinks to other related pages and perhaps to downloadable files,
source documents, definitions and other web resources. In the underlying HTML, a hyperlink
looks like <a href="http://example.org/wiki/Main_Page">Example.org, a free
encyclopedia</a>.
Graphic representation of a minute fraction of the WWW, demonstrating hyperlinks
Such a collection of useful, related resources, interconnected via hypertext links is dubbed
a web of information. Publication on the Internet created what Tim Berners-Lee first called
the WorldWideWeb (in its original CamelCase, which was subsequently discarded) in
November 1990.
The hyperlink structure of the WWW is described by the webgraph: the nodes of the
webgraph correspond to the web pages (or URLs) and the directed edges between them to the
hyperlinks.
Over time, many web resources pointed to by hyperlinks disappear, relocate, or are replaced
with different content. This makes hyperlinks obsolete, a phenomenon referred to in some
circles as link rot and the hyperlinks affected by it are often called dead links. The ephemeral
nature of the Web has prompted many efforts to archive web sites. The Internet Archive,
active since 1996, is the best known of such efforts.
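The webgraph, and the dead links within it, can be modelled simply. A sketch in Python with hypothetical page names, where any link target that is no longer an available page counts as rotted:

```python
# A webgraph as an adjacency mapping: each page maps to the pages it links to.
# Because web links are unidirectional, a target page never knows who links to
# it, so rot can only be found by checking targets against available pages.
webgraph = {
    "home.html": ["about.html", "news.html", "old-project.html"],
    "about.html": ["home.html"],
    "news.html": ["home.html", "archive-1996.html"],
}

def dead_links(graph):
    """Return link targets that are not themselves available pages."""
    available = set(graph)
    return sorted({target
                   for targets in graph.values()
                   for target in targets
                   if target not in available})

print(dead_links(webgraph))  # ['archive-1996.html', 'old-project.html']
```

Archiving efforts such as the Internet Archive address exactly these unavailable targets by keeping dated copies of pages.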
Dynamic updates of web pages
Main article: Ajax (programming)
JavaScript is a scripting language that was initially developed in 1995 by Brendan Eich, then
of Netscape, for use within web pages. The standardised version is ECMAScript. To make
web pages more interactive, some web applications use JavaScript techniques such as Ajax.
Client-side script is delivered with the page and
can make additional HTTP requests to the server, either in response to user actions such as
mouse movements or clicks, or based on lapsed time. The server's responses are used to
modify the current page rather than creating a new page with each response, so the server
needs only to provide limited, incremental information. Multiple Ajax requests can be
handled at the same time, and users can interact with the page while data is being retrieved.
Web pages may also regularly poll the server to check whether new information is available.
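The incremental-update idea behind Ajax can be sketched from the server's side in Python; the message store and timestamps here are hypothetical stand-ins for a real data source:

```python
# Server-side view of Ajax-style polling: instead of re-sending the whole
# page, return only items added since the client's last request.
# (Hypothetical data; a real service would query a database.)
messages = [
    {"time": 100, "text": "first post"},
    {"time": 160, "text": "update"},
    {"time": 220, "text": "breaking news"},
]

def updates_since(last_seen: int):
    """The limited, incremental information the server needs to provide."""
    return [m for m in messages if m["time"] > last_seen]

# First poll: the client has seen nothing, so it gets everything.
print(len(updates_since(0)))   # 3
# A later poll: only the one new item crosses the wire.
print(updates_since(160))      # [{'time': 220, 'text': 'breaking news'}]
```

The browser-side script would merge each small response into the current page instead of loading a new one.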
Many domain names used for the World Wide Web begin with www because of the longstanding practice of naming Internet hosts (servers) according to the services they provide.
The hostname for a web server is often www, in the same way that it may be ftp for an FTP
server, and news or nntp for a USENET news server. These host names appear as Domain
Name System (DNS) subdomain names, as in www.example.com. The use of 'www' as a
subdomain name is not required by any technical or policy standard and many web sites do
not use it; indeed, the first ever web server was called nxoc01.cern.ch. According to Paolo
Palazzi, who worked at CERN along with Tim Berners-Lee, the popular use of 'www'
subdomain was accidental; the World Wide Web project page was intended to be published at
www.cern.ch while info.cern.ch was intended to be the CERN home page; however, the DNS
records were never switched, and the practice of prepending 'www' to an institution's website
domain name was subsequently copied. Many established websites still use 'www', or they
invent other subdomain names such as 'www2', 'secure', etc. Many such web servers are set
up so that both the domain root (e.g., example.com) and the www subdomain (e.g.,
www.example.com) refer to the same site; others require one form or the other, or they may
map to different web sites.
The use of a subdomain name is useful for load balancing incoming web traffic by creating
a CNAME record that points to a cluster of web servers. Since, currently, only a subdomain
can be used in a CNAME, the same result cannot be achieved by using the bare domain root.
When a user submits an incomplete domain name in a browser's address bar, some web
browsers automatically try adding the prefix "www" to the beginning of it
and possibly ".com", ".org" and ".net" at the end, depending on what might be missing. For
example, entering 'microsoft' may be transformed to http://www.microsoft.com/ and
'openoffice' to http://www.openoffice.org. This feature started appearing in early versions of
Mozilla Firefox, when it still had the working title 'Firebird' in early 2003, from an earlier
practice in browsers such as Lynx. It is reported that Microsoft was granted a US patent
for the same idea in 2008, but only for mobile devices.
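A toy Python version of this completion heuristic; the set of "known" hostnames is a hypothetical stand-in for an actual DNS lookup, and real browsers use more elaborate rules:

```python
# Stand-in for DNS: hostnames that would successfully resolve.
KNOWN_HOSTS = {"www.microsoft.com", "www.openoffice.org"}

def complete_hostname(fragment: str):
    """Guess a full URL from an incomplete name, as some browsers do:
    prepend 'www.' and try common top-level domains in turn."""
    for suffix in (".com", ".org", ".net"):
        candidate = "www." + fragment + suffix
        if candidate in KNOWN_HOSTS:
            return "http://" + candidate + "/"
    return None  # no guess resolved; the browser might fall back to a search

print(complete_hostname("microsoft"))   # http://www.microsoft.com/
print(complete_hostname("openoffice"))  # http://www.openoffice.org/
```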
In English, www is usually read as double-u double-u double-u. Some users pronounce it
dub-dub-dub, particularly in New Zealand. Stephen Fry, in his "Podgrammes" series of podcasts,
pronounces it wuh wuh wuh. The English writer Douglas Adams once quipped in The
Independent on Sunday (1999): "The World Wide Web is the only thing I know of whose
shortened form takes three times longer to say than what it's short for". In Mandarin
Chinese, World Wide Web is commonly translated via a phono-semantic matching to wàn wéi
wǎng (万维网), which satisfies www and literally means "myriad dimensional net", a
translation that very appropriately reflects the design concept and proliferation of the World
Wide Web. Tim Berners-Lee's web-space states that World Wide Web is officially spelled as
three separate words, each capitalised, with no intervening hyphens.
Use of the www prefix is declining as Web 2.0 web applications seek to brand their domain
names and make them easily pronounceable. As the mobile web grows in popularity,
services like Gmail.com, MySpace.com, Facebook.com and Twitter.com are most often
discussed without adding www to the domain (or, indeed, the .com).
Scheme specifiers: http and https
The scheme specifiers http:// and https:// at the start of a web URI refer to Hypertext Transfer
Protocol or HTTP Secure respectively. Unlike www, which has no specific purpose, these
specify the communication protocol to be used for the request and response. The HTTP
protocol is fundamental to the operation of the World Wide Web and the added encryption
layer in HTTPS is essential when confidential information such as passwords or banking
information is to be exchanged over the public Internet. Web browsers usually prepend
http:// to addresses too, if omitted.
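The scheme-to-protocol mapping can be illustrated in Python; this sketch assumes only the standard default ports (80 for HTTP, 443 for HTTPS) and the browser habit, noted above, of prepending http:// when the scheme is omitted:

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed above.
DEFAULT_PORTS = {"http": 80, "https": 443}

def connection_details(url: str):
    """Return the (scheme, host, port) a browser would use for this URL."""
    if "://" not in url:
        url = "http://" + url        # browsers prepend http:// if omitted
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port

print(connection_details("https://example.org/login"))  # ('https', 'example.org', 443)
print(connection_details("example.org"))                # ('http', 'example.org', 80)
```

For the https case, the browser would additionally negotiate the encryption layer before sending the HTTP request.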
Main article: Web server
The primary function of a web server is to deliver web pages on request to clients. This
means delivery of HTML documents and any additional content that may be included by a
document, such as images, style sheets and scripts.
Main article: Internet privacy
Every time a web page is requested from a web server the server can identify, and usually it
logs, the IP address from which the request arrived. Equally, unless set not to do so, most
web browsers record the web pages that have been requested and viewed in a history feature,
and usually cache much of the content locally. Unless HTTPS encryption is used, web
requests and responses travel in plain text across the internet and they can be viewed,
recorded and cached by intermediate systems.
When a web page asks for, and the user supplies, personally identifiable information such as
their real name, address, e-mail address, etc., then a connection can be made between the
current web traffic and that individual. If the website uses HTTP cookies, username and
password authentication, or other tracking techniques, then it will be able to relate other web
visits, before and after, to the identifiable information provided. In this way it is possible for a
web-based organization to develop and build a profile of the individual people who use its
site or sites. It may be able to build a record for an individual that includes information about
their leisure activities, their shopping interests, their profession, and other aspects of
their demographic profile. These profiles are obviously of potential interest to marketers,
advertisers and others. Depending on the website's terms and conditions and the local laws
that apply information from these profiles may be sold, shared, or passed to other
organizations without the user being informed. For many ordinary people, this means little
more than some unexpected e-mails in their in-box, or some uncannily relevant advertising
on a future web page. For others, it can mean that time spent indulging an unusual interest
can result in a deluge of further targeted marketing that may be unwelcome. Law
enforcement, counter terrorism and espionage agencies can also identify, target and track
individuals based on what appear to be their interests or proclivities on the web.
Social networking sites make a point of trying to get the user to truthfully expose their real
names, interests and locations. This makes the social networking experience more realistic
and therefore engaging for all their users. On the other hand, photographs uploaded and
unguarded statements made will be identified to the individual, who may regret some
decisions to publish these data. Employers, schools, parents and other relatives may be
influenced by aspects of social networking profiles that the posting individual did not intend
for these audiences. On-line bullies may make use of personal information to harass
or stalk users. Modern social networking websites allow fine grained control of the privacy
settings for each individual posting, but these can be complex and not easy to find or use,
especially for beginners.
Photographs and videos posted onto websites have caused particular problems, as they can
add a person's face to an on-line profile. With modern and potential facial recognition
technology, it may then be possible to relate that face with other, previously anonymous,
images, events and scenarios that have been imaged elsewhere. Because of image caching,
mirroring and copying, it is difficult to remove an image from the World Wide Web.
Main article: Intellectual property
The intellectual property rights for any creative work initially rests with its creator. Web
users who want to publish their work onto the World Wide Web, however, need to be aware
of the details of the way they do it. If artwork, photographs, writings, poems, or technical
innovations are published by their creator onto a privately owned web server, then they may
choose the copyright and other conditions freely themselves. This is unusual though; more
commonly work is uploaded to websites and servers that are owned by other organizations. It
depends upon the terms and conditions of the site or service provider to what extent the
original owner automatically signs over rights to their work by the choice of destination and
by the act of uploading.
Some users of the web erroneously assume that everything they may find online is freely
available to them as if it was in the public domain, which is not always the case. Content
owners that are aware of this widespread belief, may expect that their published content will
probably be used in some capacity somewhere without their permission. Some content
publishers therefore embed digital watermarks in their media files, sometimes charging users
to receive unmarked copies for legitimate use. Digital rights management includes forms of
access control technology that further limit the use of digital content even after it has been
bought or downloaded.
The web has become criminals' preferred pathway for spreading malware. Cybercrime carried
out on the web can include identity theft, fraud, espionage and intelligence
gathering. Web-based vulnerabilities now outnumber traditional computer security
concerns, and as measured by Google, about one in ten web pages may contain
malicious code. Most web-based attacks take place on legitimate websites, and most, as
measured by Sophos, are hosted in the United States, China and Russia. The most common
of all malware threats are SQL injection attacks against websites. Through HTML and URIs,
the web was vulnerable to attacks like cross-site scripting (XSS) that came with the
introduction of JavaScript and were exacerbated by Web 2.0 and Ajax web design that
favors the use of scripts. Today, by one estimate, 70% of all websites are open
to XSS attacks on their users.
Proposed solutions vary to extremes. Large security vendors like McAfee already design
governance and compliance suites to meet post-9/11 regulations, and some,
like Finjan have recommended active real-time inspection of code and all content regardless
of its source. Some have argued that for enterprise to see security as a business opportunity
rather than a cost center, "ubiquitous, always-on digital rights management" enforced in
the infrastructure by a handful of organizations must replace the hundreds of companies that
today secure data and networks. Jonathan Zittrain has said users sharing responsibility for
computing safety is far preferable to locking down the Internet.
Standards
Main article: Web standards
Many formal standards and other technical specifications and software define the operation of
different aspects of the World Wide Web, the Internet, and computer information exchange.
Many of the documents are the work of the World Wide Web Consortium (W3C), headed by
Berners-Lee, but some are produced by the Internet Engineering Task Force (IETF) and other
organisations. Usually, when web standards are discussed, the following publications are
seen as foundational:
Recommendations for markup languages, especially HTML and XHTML, from the
W3C. These define the structure and interpretation of hypertext documents.
Recommendations for stylesheets, especially CSS, from the W3C.
Recommendations for the Document Object Model, from W3C.
Additional publications provide definitions of other essential technologies for the World
Wide Web, including, but not limited to, the following:
Uniform Resource Identifier (URI), which is a universal system for referencing resources
on the Internet, such as hypertext documents and images. URIs, often called URLs, are
defined by the IETF's RFC 3986 / STD 66: Uniform Resource Identifier (URI): Generic
Syntax, as well as its predecessors and numerous URI scheme-defining RFCs;
HyperText Transfer Protocol (HTTP), especially as defined by RFC
2616: HTTP/1.1 and RFC 2617: HTTP Authentication, which specify how the browser
and server authenticate each other.
Main article: Web accessibility
There are methods available for accessing the web in alternative mediums and formats, so as
to enable use by individuals with disabilities. These disabilities may be visual, auditory,
physical, speech related, cognitive, neurological, or some combination thereof. Accessibility
features also help others with temporary disabilities like a broken arm or the aging population
as their abilities change. The Web is used for receiving information as well as providing
information and interacting with society. The World Wide Web Consortium claims it is
essential that the Web be accessible in order to provide equal access and equal opportunity to
people with disabilities. Tim Berners-Lee once noted, "The power of the Web is in its
universality. Access by everyone regardless of disability is an essential aspect." Many
countries regulate web accessibility as a requirement for websites. International
cooperation in the W3C Web Accessibility Initiative led to simple guidelines that web
content authors as well as software developers can use to make the Web accessible to persons
who may or may not be using assistive technology.
The W3C Internationalization Activity assures that web technology will work in all
languages, scripts, and cultures. Beginning in 2004 or 2005, Unicode gained ground and
eventually in December 2007 surpassed both ASCII and Western European as the Web's
most frequently used character encoding. Originally RFC 3986 allowed resources to be
identified by URI in a subset of US-ASCII. RFC 3987 allows more characters (any character
in the Universal Character Set), and now a resource can be identified by IRI in any language.
Between 2005 and 2010, the number of web users doubled, and was expected to surpass two
billion in 2010. Early studies in 1998 and 1999 estimating the size of the web using
capture/recapture methods showed that much of the web was not indexed by search engines
and the web was much larger than expected. According to a 2001 study, there were a
massive number, over 550 billion, of documents on the Web, mostly in the invisible Web,
or Deep Web. A 2002 survey of 2,024 million web pages determined that by far the
most web content was in the English language: 56.4%; next were pages in German (7.7%),
French (5.6%), and Japanese (4.9%). A more recent study, which used web searches in 75
different languages to sample the web, determined that there were over 11.5 billion web
pages in the publicly indexable web as of the end of January 2005. As of March 2009, the
indexable web contains at least 25.21 billion pages. On 25 July 2008, Google software
engineers Jesse Alpert and Nissan Hajaj announced that Google Search had discovered one
trillion unique URLs. As of May 2009, over 109.5 million domains operated. Of these,
74% were commercial or other domains operating in the .com generic top-level domain.
Statistics measuring a website's popularity are usually based either on the number of page
views or on associated server 'hits' (file requests) that it receives.
Frustration over congestion issues in the Internet infrastructure and the high latency that
results in slow browsing has led to a pejorative name for the World Wide Web: the World
Wide Wait. Speeding up the Internet is an ongoing discussion over the use
of peering and QoS technologies. Other solutions to reduce the congestion can be found
at W3C. Guidelines for web response times are:
0.1 second (one tenth of a second). Ideal response time. The user does not sense any
interruption.
1 second. Highest acceptable response time. Download times above 1 second interrupt
the user experience.
10 seconds. Unacceptable response time. The user experience is interrupted and the user
is likely to leave the site or system.
Main article: Web cache
If a user revisits a web page after only a short interval, the page data may not need to be reobtained from the source web server. Almost all web browsers cache recently obtained data,
usually on the local hard drive. HTTP requests sent by a browser will usually ask only for
data that has changed since the last download. If the locally cached data are still current, they
will be reused. Caching helps reduce the amount of web traffic on the Internet. The decision
about expiration is made independently for each downloaded file, whether image, style
sheet, script or HTML. Even on sites with highly dynamic content, many of the basic
resources need to be refreshed only occasionally. Web site designers find it worthwhile to
collate resources such as CSS data and JavaScript into a few site-wide files so that they can
be cached efficiently. This helps reduce page download times and lowers demands on the
Web server.
There are other components of the Internet that can cache web content. Corporate and
academic firewalls often cache Web resources requested by one user for the benefit of all.
(See also caching proxy server.) Some search engines also store cached content from
websites. Apart from the facilities built into web servers that can determine when files have
been updated and so need to be re-sent, designers of dynamically generated web pages can
control the HTTP headers sent back to requesting users, so that transient or sensitive pages
are not cached. Internet banking and news sites frequently use this facility. Data requested
with an HTTP 'GET' is likely to be cached if other conditions are met; data obtained in
response to a 'POST' is assumed to depend on the data that was POSTed and so is not cached.
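The freshness rules sketched above can be illustrated with a small decision function. This is a simplified sketch, not a full implementation of the HTTP caching rules; the function name and the subset of Cache-Control directives handled are illustrative only:

```python
def may_reuse_cached(method: str, cache_control: str, age_seconds: int) -> bool:
    """Decide whether a cached response may be reused without revalidation.

    A deliberately simplified sketch of the behaviour described above:
    - only responses to GET are normally cached; POST responses are assumed
      to depend on the data that was POSTed;
    - 'no-store' and 'no-cache' directives (used by banking and news sites
      for transient or sensitive pages) forbid silent reuse;
    - otherwise the copy is fresh while its age is below max-age.
    """
    if method.upper() != "GET":
        return False
    directives = [d.strip() for d in cache_control.lower().split(",")]
    if "no-store" in directives or "no-cache" in directives:
        return False
    for d in directives:
        if d.startswith("max-age="):
            return age_seconds < int(d.split("=", 1)[1])
    return False  # no freshness information: revalidate with the server
```

A real browser additionally revalidates stale copies with conditional requests (If-Modified-Since, If-None-Match), reusing the cached body when the server answers 304 Not Modified.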
1. ^ a b Quittner, Joshua (29 March 1999). "Tim Berners Lee – Time 100 People of the
Century". Time Magazine. Retrieved 17 May 2010. "He wove the World Wide Web
and created a mass medium for the 21st century. The World Wide Web is Berners-Lee's alone. He designed it. He loosed it on the world. And he more than anyone else
has fought to keep it open, nonproprietary and free."
2. ^ Tim Berners-Lee. "Frequently asked questions". World Wide Web Consortium.
Retrieved 22 July 2010.
3. ^ "World Wide Web Consortium". "The World Wide Web Consortium (W3C)..."
4. ^ "Frequently asked questions" W3.org. Retrieved 15 June 2013.
5. ^ "Inventing the Web: Tim Berners-Lee's 1990 Christmas Baby". Seeing the Picture.
Retrieved 15 June 2013.
6. ^ WorldWideWeb: Proposal for a HyperText Project. W3.org (1990-11-12).
Retrieved on 2013-07-17.
7. ^ "Le Web a été inventé... en France!". Le Point. 1 January 2012. Retrieved 5 April 2013.
8. ^ a b c Berners-Lee, Tim; Cailliau, Robert (12 November 1990). "WorldWideWeb:
Proposal for a HyperText Project". Retrieved 27 July 2009.
9. ^ Berners-Lee, Tim. "Pre-W3C Web and Internet Background". World Wide Web
Consortium. Retrieved 21 April 2009.
10. ^ "Aug. 7, 1991: Ladies and Gentlemen, the World Wide Web". Wired. Retrieved 15
11. ^ von Braun, Wernher (May 1970). "TV Broadcast Satellite". Popular Science: 65–
66. Retrieved 12 January 2011.
12. ^ Berners-Lee, Tim (March 1989). "Information Management: A Proposal". W3C.
Retrieved 27 July 2009.
13. ^ "Tim Berners-Lee's original World Wide Web browser". "With recent phenomena
like blogs and wikis, the web is beginning to develop the kind of collaborative nature
that its inventor envisaged from the start."
14. ^ "Tim Berners-Lee: client". W3.org. Retrieved 27 July 2009.
15. ^ "First Web pages". W3.org. Retrieved 27 July 2009.
16. ^ Murawski, John (24 May 2013). "Hunt for world's oldest WWW page leads to
UNC Chapel Hill". News & Observer.
17. ^ "Short summary of the World Wide Web project". Google. 6 August 1991.
Retrieved 27 July 2009.
18. ^ "Silvano de Gennaro disclaims 'the first photo on the web'". Retrieved 27 July
2012. "If you read well our website, it says that it was, to our knowledge, the 'first
photo of a band'. Dozens of media are totally distorting our words for the sake of
cheap sensationalism. Nobody knows which was the first photo on the web."
19. ^ "W3C timeline". Retrieved 30 March 2010.
20. ^ "About SPIRES". Retrieved 30 March 2010.
21. ^ "The Early World Wide Web at SLAC".
22. ^ "A Little History of the World Wide Web".
23. ^ Conklin, Jeff (1987), IEEE Computer 20 (9): 17–41
24. ^ "Inventor of the Week Archive: The World Wide Web". Massachusetts Institute of
Technology: MIT School of Engineering. Retrieved 23 July 2009.
25. ^ "Ten Years Public Domain for the Original Web Software". Tenyears-www.web.cern.ch. 30 April 2003. Retrieved 27 July 2009.
26. ^ "Mosaic Web Browser History – NCSA, Marc Andreessen, Eric Bina".
Livinginternet.com. Retrieved 27 July 2009.
27. ^ "NCSA Mosaic – September 10, 1993 Demo". Totic.org. Retrieved 27 July 2009.
28. ^ "Vice President Al Gore's ENIAC Anniversary Speech". Cs.washington.edu. 14
February 1996. Retrieved 27 July 2009.
29. ^ "Internet legal definition of Internet". West's Encyclopedia of American Law,
edition 2. Free Online Law Dictionary. 15 July 2009. Retrieved 25 November 2008.
30. ^ "WWW (World Wide Web) Definition". TechTerms. Retrieved 19 February 2010.
31. ^ "The W3C Technology Stack". World Wide Web Consortium. Retrieved 21 April
32. ^ Muylle, Steve; Moenaert, Rudy; Despont, Marc (1999). "A grounded theory of
World Wide Web search behaviour". Journal of Marketing Communications 5 (3):
33. ^ a b Hamilton, Naomi (31 July 2008). "The A-Z of Programming Languages:
34. ^ Buntin, Seth (23 September 2008). "jQuery Polling plugin". Retrieved 2009-08-22.
35. ^ Berners-Lee, Tim. "Frequently asked questions by the Press". W3C. Retrieved 27
36. ^ Palazzi, P (2011) 'The Early Days of the WWW at CERN'
37. ^ "automatically adding www.___.com". mozillaZine. 16 May 2003. Retrieved 27
38. ^ Masnick, Mike (7 July 2008). "Microsoft Patents Adding 'www.' And '.com' To
Text". Techdirt. Retrieved 27 May 2009.
39. ^ "MDBG Chinese-English dictionary – Translate". Retrieved 27 July 2009.
40. ^ "Frequently asked questions by the Press – Tim BL". W3.org. Retrieved 27 July
41. ^ "It's not your grandfather's Internet". Strategic Finance. 2010.
42. ^ boyd, danah; Hargittai, Eszter (July 2010). "Facebook privacy settings: Who
cares?". First Monday (University of Illinois at Chicago) 15 (8).
43. ^ a b Ben-Itzhak, Yuval (18 April 2008). "Infosecurity 2008 – New defence strategy
in battle against e-crime". ComputerWeekly (Reed Business Information). Retrieved
20 April 2008.
44. ^ Christey, Steve and Martin, Robert A. (22 May 2007)."Vulnerability Type
Distributions in CVE (version 1.1)". MITRE Corporation. Retrieved 7 June 2008.
45. ^ Symantec Internet Security Threat Report: Trends for July–December 2007
(Executive Summary) (PDF) XIII. Symantec Corp. April 2008. pp. 1–2. Retrieved 11
46. ^ "Google searches web's dark side". BBC News. 11 May 2007. Retrieved 26 April
47. ^ "Security Threat Report" (PDF). Sophos. Q1 2008. Retrieved 24 April 2008.
48. ^ "Security threat report" (PDF). Sophos. July 2008. Retrieved 24 August 2008.
49. ^ Fogie, Seth, Jeremiah Grossman, Robert Hansen, and Anton Rager (2007). Cross
Site Scripting Attacks: XSS Exploits and Defense (PDF). Syngress, Elsevier Science
& Technology. pp. 68–69, 127. ISBN 1-59749-154-3. Archived from the original on
25 June 2008. Retrieved 6 June 2008.
50. ^ O'Reilly, Tim (30 September 2005). "What Is Web 2.0". O'Reilly Media. pp. 4–5.
Retrieved 4 June 2008. and AJAX web applications can introduce security
vulnerabilities like "client-side security controls, increased attack surfaces, and new
possibilities for Cross-Site Scripting (XSS)", in Ritchie, Paul (March 2007). "The
security risks of AJAX/web 2.0 applications" (PDF). Infosecurity (Elsevier).
Archived from the original on 25 June 2008. Retrieved 6 June 2008. which
cites Hayre, Jaswinder S. and Kelath, Jayasankar (22 June 2006). "Ajax Security
Basics". SecurityFocus. Retrieved 6 June 2008.
51. ^ Berinato, Scott (1 January 2007). "Software Vulnerability Disclosure: The Chilling
Effect". CSO (CXO Media). p. 7. Archived from the original on 18 April 2008.
Retrieved 7 June 2008.
52. ^ Prince, Brian (9 April 2008). "McAfee Governance, Risk and Compliance Business
Unit". eWEEK (Ziff Davis Enterprise Holdings). Retrieved 25 April 2008.
53. ^ Preston, Rob (12 April 2008). "Down To Business: It's Past Time To Elevate The
Infosec Conversation". InformationWeek (United Business Media). Retrieved 25 April
54. ^ Claburn, Thomas (6 February 2007). "RSA's Coviello Predicts Security
Consolidation". InformationWeek (United Business Media). Retrieved 25 April 2008.
55. ^ Duffy Marsan, Carolyn (9 April 2008). "How the iPhone is killing the
'Net". Network World (IDG). Retrieved 17 April 2008.
56. ^ a b c "Web Accessibility Initiative (WAI)". World Wide Web Consortium. Retrieved
7 April 2009.[dead link]
57. ^ "Developing a Web Accessibility Business Case for Your Organization:
Overview". World Wide Web Consortium. Retrieved 7 April 2009.[dead link]
58. ^ "Legal and Policy Factors in Developing a Web Accessibility Business Case for
Your Organization". World Wide Web Consortium. Retrieved 7 April 2009.
59. ^ "Web Content Accessibility Guidelines (WCAG) Overview". World Wide Web
Consortium. Retrieved 7 April 2009.
60. ^ "Internationalization (I18n) Activity". World Wide Web Consortium. Retrieved 10
61. ^ Davis, Mark (5 April 2008). "Moving to Unicode 5.1". Google. Retrieved 10 April
62. ^ "World Wide Web Consortium Supports the IETF URI Standard and IRI Proposed
Standard" (Press release). World Wide Web Consortium. 26 January 2005. Retrieved
10 April 2009.
63. ^ Lynn, Jonathan (19 October 2010). "Internet users to exceed 2 billion ...". Reuters.
Retrieved 9 February 2011.
64. ^ S. Lawrence, C.L. Giles, "Searching the World Wide Web," Science, 280(5360),
65. ^ S. Lawrence, C.L. Giles, "Accessibility of Information on the Web," Nature, 400,
66. ^ "The 'Deep' Web: Surfacing Hidden Value". Brightplanet.com. Archived from the
original on 4 April 2008. Retrieved 27 July 2009.
67. ^ "Distribution of languages on the Internet". Netz-tipp.de. Retrieved 27 July 2009.
68. ^ Alessio Signorini. "Indexable Web Size". Cs.uiowa.edu. Retrieved 27 July 2009.
69. ^ "The size of the World Wide Web". Worldwidewebsize.com. Retrieved 27 July
70. ^ Alpert, Jesse; Hajaj, Nissan (25 July 2008). "We knew the web was big...". The
Official Google Blog.
71. ^ a b "Domain Counts & Internet Statistics". Name Intelligence. Retrieved 17 May
72. ^ "World Wide Wait". TechEncyclopedia. United Business Media. Retrieved 10
73. ^ Khare, Rohit and Jacobs, Ian (1999). "W3C Recommendations Reduce 'World
Wide Wait'". World Wide Web Consortium. Retrieved 10 April 2009.
74. ^ Nielsen, Jakob (from Miller 1968; Card et al. 1991) (1994). "5". Usability
Engineering: Response Times: The Three Important Limits. Morgan Kaufmann.
Retrieved 10 April 2009.
Niels Brügger, ed. Web History (2010) 362 pages; Historical perspective on the World
Wide Web, including issues of culture, content, and preservation.
Fielding, R.; Gettys, J.; Mogul, J.; Frystyk, H.; Masinter, L.; Leach, P.; Berners-Lee, T.
(June 1999). Hypertext Transfer Protocol – HTTP/1.1. Request For Comments 2616.
Information Sciences Institute.[dead link]
Berners-Lee, Tim; Bray, Tim; Connolly, Dan; Cotton, Paul; Fielding, Roy; Jeckle, Mario;
Lilley, Chris; Mendelsohn, Noah; Orchard, David; Walsh, Norman; Williams, Stuart (15
December 2004).Architecture of the World Wide Web, Volume One. Version 20041215.
Polo, Luciano (2003). "World Wide Web Technology Architecture: A Conceptual
Analysis". New Devices. Retrieved 31 July 2005.
Skau, H.O. (March 1990). "The World Wide Web and Health Information". New
Devices. Retrieved 1989.
Early archive of the first Web site
Internet Statistics: Growth and Usage of the Web and the Internet
Living Internet A comprehensive history of the Internet, including the World Wide Web.
Web Design and Development at the Open Directory Project
World Wide Web Consortium (W3C)
W3C Recommendations Reduce "World Wide Wait"
World Wide Web Size Daily estimated size of the World Wide Web.
Antonio A. Casilli, Some Elements for a Sociology of Online Interactions
The Erdős Webgraph Server offers weekly updated graph representation of a constantly
increasing fraction of the WWW.
History of the World Wide Web
Today, the Web and the Internet allow connectivity from literally everywhere on earth—even ships at sea and in outer space.
The World Wide Web ("WWW" or simply the "Web") is a global information medium which users can read
and write via computers connected to the Internet. The term is often mistakenly used as a synonym for the
Internet itself, but the Web is a service that operates over the Internet, just as e-mail also does. The history
of the Internet dates back significantly further than that of the World Wide Web.
The hypertext portion of the Web in particular has an intricate intellectual history; notable influences and
precursors include Vannevar Bush's Memex, IBM's Generalized Markup Language, and Ted
Nelson's Project Xanadu.
The concept of a home-based global information system goes at least as far back as "A Logic Named Joe",
a 1946 short story by Murray Leinster, in which computer terminals, called "logics," were in every home.
Although the computer system in the story is centralized, the story captures some of the feeling of the
ubiquitous information explosion driven by the Web.
1 1979–1991: Development of the World Wide Web
2 1992–1995: Growth of the WWW
2.1 Early browsers
2.2 Web organization
3 1996–1998: Commercialization of the WWW
4 1999–2001: "Dot-com" boom and bust
5 2002–present: The Web becomes ubiquitous
5.1 Web 2.0
6 See also
9 External links
1979–1991: Development of the World Wide Web
"In August, 1984 I wrote a proposal to the SW Group Leader, Les Robertson, for the establishment of a pilot project to install
and evaluate TCP/IP protocols on some key non-Unix machines at CERN ... By 1990 CERN had become the largest Internet
site in Europe and this fact... positively influenced the acceptance and spread of Internet techniques both in Europe and
elsewhere... A key result of all these happenings was that by 1989 CERN's Internet facility was ready to become the medium
within which Tim Berners-Lee would create the World Wide Web with a truly visionary idea..."
Ben Segal. Short History of Internet Protocols at CERN, April 1995 
The NeXTcube used by Tim Berners-Lee at CERN became the first Web server.
In 1980, Tim Berners-Lee, an independent contractor at the European Organization for Nuclear
Research (CERN) in Switzerland, built ENQUIRE as a personal database of people and software models,
but also as a way to play with hypertext; each new page of information in ENQUIRE had to be linked to an
existing page.
In 1984 Berners-Lee returned to CERN, and considered its problems of information presentation: physicists
from around the world needed to share data, yet with no common machines and no common presentation
software. He wrote a proposal in March 1989 for "a large hypertext database with typed links", but it
generated little interest. His boss, Mike Sendall, encouraged Berners-Lee to begin implementing his system
on a newly acquired NeXT workstation. He considered several names, including Information Mesh, The
Information Mine (turned down as it abbreviates to TIM, the WWW's creator's name) or Mine of
Information (turned down because it abbreviates to MOI, which is "me" in French), but settled on World
Wide Web.
Robert Cailliau, Jean-François Abramatic and Tim Berners-Lee at the 10th anniversary of the WWW Consortium.
He found an enthusiastic collaborator in Robert Cailliau, who rewrote the proposal (published on November
12, 1990) and sought resources within CERN. Berners-Lee and Cailliau pitched their ideas to the European
Conference on Hypertext Technology in September 1990, but found no vendors who could appreciate their
vision of marrying hypertext with the Internet.
By Christmas 1990, Berners-Lee had built all the tools necessary for a working Web: the HyperText
Transfer Protocol (HTTP) 0.9, the HyperText Markup Language (HTML), the first Web
browser (named WorldWideWeb, which was also a Web editor), the first HTTP server software (later
known as CERN httpd), the first web server (http://info.cern.ch), and the first Web pages that described the
project itself. The browser could access Usenet newsgroups and FTP files as well. However, it could run
only on the NeXT; Nicola Pellow therefore created the Line Mode Browser, a simple text browser
that could run on almost any computer. To encourage use within CERN, Bernd Pollermann put the
CERN telephone directory on the web — previously users had to log onto the mainframe to look up
phone numbers.
According to Tim Berners-Lee, the Web was mainly invented in Building 31 at CERN ( 46.2325°N
6.0450°E ) but also at home, in the two houses he lived in during that time (one in France, one in
Switzerland). In January 1991 the first Web servers outside CERN itself were switched on. 
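The HTTP 0.9 protocol built for that first Web was strikingly simple: a request was a single GET line naming a document, and the response was the raw HTML with no status line or headers. The sketch below contrasts it with a minimal HTTP/1.1 request; the host and path are illustrative only:

```python
def http09_request(path: str) -> bytes:
    # An HTTP/0.9 request is just "GET <path>" followed by CRLF;
    # there is no version token, no headers, and no request body.
    return f"GET {path}\r\n".encode("ascii")

def http11_request(host: str, path: str) -> bytes:
    # For contrast, HTTP/1.1 added a version token, headers
    # (Host is mandatory in 1.1), and a blank line ending the header block.
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"\r\n").encode("ascii")
```

An HTTP/0.9 server would answer such a request by streaming the document bytes and closing the connection, which is all the first WorldWideWeb browser needed.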
The first web page may be lost, but Paul Jones of UNC-Chapel Hill in North Carolina revealed in May 2013
that he has a copy of a page sent to him in 1991 by Berners-Lee which is the oldest known web page.
Jones stored the plain-text page, with hyperlinks, on a floppy disk and on his NeXT computer.
On August 6, 1991, Berners-Lee posted a short summary of the World Wide Web project on the
alt.hypertext newsgroup. This date also marked the debut of the Web as a publicly available service on
the Internet, although new users could only access it after August 23. For this reason, that date is
considered Internaut's Day.
"The WorldWideWeb (WWW) project aims to allow all links to be made to any information anywhere. [...]
The WWW project was started to allow high energy physicists to share data, news, and documentation. We
are very interested in spreading the web to other areas, and having gateway servers for other data.
Collaborators welcome!" —from Tim Berners-Lee's first message
Paul Kunz from the Stanford Linear Accelerator Center visited CERN in September 1991, and was
captivated by the Web. He brought the NeXT software back to SLAC, where librarian Louise Addis adapted
it for the VM/CMS operating system on the IBM mainframe as a way to display SLAC’s catalog of online
documents; this was the first web server outside of Europe and the first in North America. The www-talk
mailing list was started in the same month.
An early CERN-related contribution to the Web was the parody band Les Horribles Cernettes, whose
promotional image is believed to be among the Web's first five pictures.
1992–1995: Growth of the WWW
In keeping with its birth at CERN, early adopters of the World Wide Web were primarily university-based
scientific departments or physics laboratories such as Fermilab and SLAC. By January 1993 there were
fifty Web servers across the world; by October 1993 there were over five hundred.
Early websites intermingled links for both the HTTP web protocol and the then-popular Gopher protocol,
which provided access to content through hypertext menus presented as a file system rather than
through HTML files. Early Web users would navigate either by bookmarking popular directory pages, such
as Berners-Lee's first site at http://info.cern.ch/, or by consulting updated lists such as the NCSA "What's
New" page. Some sites were also indexed by WAIS, enabling users to submit full-text searches similar to
the capability later provided by search engines.
There was still no graphical browser available for computers besides the NeXT. This gap was discussed in
January 1992, and filled in April 1992 with the release of Erwise, an application developed at the Helsinki
University of Technology, and in May by ViolaWWW, created by Pei-Yuan Wei, which included advanced
features such as embedded graphics, scripting, and animation. ViolaWWW was originally an application
for HyperCard. Both programs ran on the X Window System for Unix.
Students at the University of Kansas adapted an existing text-only hypertext browser, Lynx, to access the
web. Lynx was available on Unix and DOS, and some web designers, unimpressed with glossy graphical
websites, held that a website not accessible through Lynx wasn’t worth visiting.
Early browsers
The turning point for the World Wide Web was the introduction of the Mosaic web browser in 1993, a
graphical browser developed by a team at the National Center for Supercomputing Applications (NCSA) at
the University of Illinois at Urbana-Champaign (UIUC), led by Marc Andreessen. Funding for Mosaic came
from the High-Performance Computing and Communications Initiative, a funding program initiated by
then-Senator Al Gore's High Performance Computing and Communication Act of 1991, also known as the Gore Bill.
Remarkably, the first Mosaic browser lacked a "back button", a feature proposed in 1992–93 by the same
individual who invented the concept of clickable text documents. The request was emailed from the
University of Texas computing facility. The browser was intended to be an editor and not simply a viewer,
but was to work with computer generated hypertext lists called "search engines".
The origins of Mosaic date to 1992. In November 1992, the NCSA at the University of Illinois (UIUC)
established a website. In December 1992, Andreessen and Eric Bina, students attending UIUC and
working at the NCSA, began work on Mosaic. They released an X Window browser in February 1993. It
gained popularity due to its strong support of integrated multimedia, and the authors’ rapid response to
user bug reports and recommendations for new features.
The first Microsoft Windows browser was Cello, written by Thomas R. Bruce for the Legal Information
Institute at Cornell Law School to provide legal information, since more lawyers had access to
Windows than to Unix. Cello was released in June 1993. The NCSA released Mac Mosaic and
WinMosaic in August 1993.
After graduation from UIUC, Andreessen and James H. Clark, former CEO of Silicon Graphics, met and
formed Mosaic Communications Corporation to develop the Mosaic browser commercially. The company
changed its name to Netscape in April 1994, and the browser was developed further as Netscape Navigator.
Web organization
In May 1994, the first International WWW Conference, organized by Robert Cailliau, was held at
CERN; the conference has been held every year since. In April 1993, CERN had agreed that anyone
could use the Web protocol and code royalty-free; this was in part a reaction to the perturbation caused by
the University of Minnesota's announcement that it would begin charging license fees for its implementation
of the Gopher protocol.
In September 1994, Berners-Lee founded the World Wide Web Consortium (W3C) at the Massachusetts
Institute of Technology with support from the Defense Advanced Research Projects Agency (DARPA) and
the European Commission. It comprised various companies that were willing to create standards and
recommendations to improve the quality of the Web. Berners-Lee made the Web available freely, with no
patent and no royalties due. The W3C decided that its standards must be based on royalty-free technology,
so they can be easily adopted by anyone.
By the end of 1994, while the total number of websites was still minute compared to present standards,
quite a number of notable websites were already active, many of which are the precursors or inspiring
examples of today's most popular services.
1996–1998: Commercialization of the WWW
Main article: Web marketing
By 1996 it became obvious to most publicly traded companies that a public Web presence was no longer
optional. Though at first people saw mainly the possibilities of free publishing and
instant worldwide information, increasing familiarity with two-way communication over the "Web" led to the
possibility of direct Web-based commerce (e-commerce) and instantaneous group communications
worldwide. More dotcoms, displaying products on hypertext webpages, were added to the Web.
1999–2001: "Dot-com" boom and bust
Low interest rates in 1998–99 facilitated an increase in start-up companies. Although a number of these
new entrepreneurs had realistic plans and administrative ability, most of them lacked these characteristics
but were able to sell their ideas to investors because of the novelty of the dot-com concept.
Historically, the dot-com boom can be seen as similar to a number of other technology-inspired booms of
the past including railroads in the 1840s, automobiles in the early 20th century, radio in the 1920s,
television in the 1940s, transistor electronics in the 1950s, computer time-sharing in the 1960s, and home
computers and biotechnology in the early 1980s.
In 2001 the bubble burst, and many dot-com startups went out of business after burning through
their venture capital and failing to become profitable. Many others, however, did survive and thrive in the
early 21st century. Many companies which began as online retailers blossomed and became highly
profitable. More conventional retailers found online merchandising to be a profitable additional source of
revenue. While some online entertainment and news outlets failed when their seed capital ran out, others
persisted and eventually became economically self-sufficient. Traditional media outlets (newspaper
publishers, broadcasters and cablecasters in particular) also found the Web to be a useful and profitable
additional channel for content distribution, and an additional means to generate advertising revenue. The
sites that survived and eventually prospered after the bubble burst had two things in common; a sound
business plan, and a niche in the marketplace that was, if not unique, particularly well-defined and well-served.
2002–present: The Web becomes ubiquitous
In the aftermath of the dot-com bubble, telecommunications companies had a great deal of overcapacity as
many Internet business clients went bust. That, plus ongoing investment in local cell infrastructure, kept
connectivity charges low and helped make high-speed Internet connectivity more affordable. During this
time, a handful of companies found success developing business models that helped make the World Wide
Web a more compelling experience. These include airline booking sites, Google's search engine and its
profitable approach to simplified, keyword-based advertising, as well as eBay's do-it-yourself auction site
and Amazon.com's online department store.
This new era also begot social networking websites, such as MySpace and Facebook, which, though
unpopular at first, rapidly gained acceptance and became a major part of youth culture.
Web 2.0
Beginning in 2002, new ideas for sharing and exchanging content ad hoc, such as Weblogs and RSS,
rapidly gained acceptance on the Web. This new model for information exchange, primarily
featuring DIY user-edited and user-generated websites, came to be known as Web 2.0.
The Web 2.0 boom saw many new service-oriented startups catering to a new, democratized Web. Some
believe it will be followed by the full realization of a Semantic Web.
Tim Berners-Lee originally expressed the vision of the Semantic Web as follows:
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and
transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when
it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines.
The ‘intelligent agents’ people have touted for ages will finally materialize.
— Tim Berners-Lee, 1999
Predictably, as the World Wide Web became easier to query, attained a higher degree of usability, and
shed its esoteric reputation, it gained a sense of organization and unsophistication which opened the
floodgates and ushered in a rapid period of popularization. New sites such as Wikipedia and its sister
projects proved revolutionary in executing the user-edited content concept. In 2005, three ex-PayPal
employees formed a video viewing website called YouTube. Only a year later, YouTube had become
the most quickly popularized website in history, and even started a new concept of user-submitted content
in major events, as in the CNN-YouTube Presidential Debates.
The popularity of YouTube, Facebook, etc., combined with the increasing availability and affordability of
high-speed connections has made video content far more common on all kinds of websites. Many
video-content hosting and creation sites provide an easy means for their videos to be embedded on third party
websites without payment or permission.
This combination of more user-created or edited content, and easy means of sharing content, such as via
RSS widgets and video embedding, has led to many sites with a typical "Web 2.0" feel. They have articles
with embedded video, user-submitted comments below the article, and RSS boxes to the side, listing some
of the latest articles from other sites.
Continued extension of the World Wide Web has focused on connecting devices to the Internet,
coined Intelligent Device Management. As Internet connectivity becomes ubiquitous, manufacturers have
started to leverage the expanded computing power of their devices to enhance their usability and
capability. Through Internet connectivity, manufacturers are now able to interact with the devices they have
sold and shipped to their customers, and customers are able to interact with the manufacturer (and other
providers) to access new content.
Lending credence to the idea of the ubiquity of the web, Web 2.0 has found a place in the global English
lexicon. On June 10, 2009 the Global Language Monitor declared it to be the one-millionth English word.
See also
History of computing
Hardware before 1960
Hardware 1960s to present
Hardware in Soviet Bloc countries
Graphical user interface
World Wide Web
Timeline of computing
more timelines ...
Computer Lib / Dream Machines
History of hypertext
History of the web browser
References
Berners-Lee, Tim; Fischetti, Mark (1999) Weaving the Web: The Original Design and Ultimate Destiny
of the World Wide Web by Its Inventor, ISBN 978-0-06-251586-5, HarperSanFrancisco (pbk. ISBN 0-06-251587-X, 2000)
Cailliau, Robert; Gillies, James (2000) How the Web Was Born: The Story of the World Wide
Web, ISBN 978-0-19-286207-5, Oxford University Press
Graham, Ian S. (1995) The HTML Sourcebook: The Complete Guide to HTML. New York: John Wiley
Herman, Andrew (2000) The World Wide Web and Contemporary Cultural Theory : Magic, Metaphor,
Power, ISBN 978-0-415-92502-0, Routledge
Raggett, Dave; Lam, Jenny; Alexander, Ian (1996) HTML 3. Electronic Publishing on the World Wide
Web, ISBN 978-0-201-87693-0, Addison-Wesley
Footnotes
^ a b c Berners-Lee, Tim. "Frequently asked questions - Start of the web: Influences". World Wide Web Consortium.
Retrieved 22 July 2010.
^ Berners-Lee, Tim. "Frequently asked questions - Why the //, #, etc?". World Wide Web Consortium. Retrieved 22
^ A Short History of Internet Protocols at CERN by Ben Segal. 1995
^ The Next Crossroad of Web History by Gregory Gromov
^ Berners-Lee, Tim (May 1990). "Information Management: A Proposal". World Wide Web Consortium. Retrieved 24
^ Tim Berners-Lee, Weaving the Web, HarperCollins, 2000, p.23
^ a b c d e f Berners-Lee, Tim (ca 1993/1994). "A Brief History of the Web". World Wide Web Consortium. Retrieved 17
^  Tim Berners-Lee's account of the exact locations at CERN where the Web was invented
^ a b c d e Raggett et al, 1996. p. 21
10. ^ Murawski, John (24 May 2013). "Hunt for world's oldest WWW page leads to UNC Chapel Hill". News & Observer.
11. ^ How the web went world wide, Mark Ward, Technology Correspondent, BBC News. Retrieved 24 January 2011
12. ^ Berners-Lee, Tim. "Qualifiers on Hypertext links... - alt.hypertext". Retrieved 11 July 2012.
13. ^ Tim Berners-Lee, Weaving the Web, HarperCollins, 2000, p.46
14. ^ Heather McCabe (1999-02-09). "Grrl Geeks Rock Out". Wired magazine.
15. ^ Mosaic Web Browser History – NCSA, Marc Andreessen, Eric Bina
16. ^ NCSA Mosaic – September 10, 1993 Demo
17. ^ Vice President Al Gore's ENIAC Anniversary Speech.
18. ^ Robert Cailliau (21 July 2010). "A Short History of the Web". NetValley. Retrieved 21 July 2010.
19. ^ Tim Berners-Lee. "Frequently asked questions - Robert Cailliau's role". World Wide Web Consortium. Retrieved 22
20. ^ "IW3C2 - Past and Future Conferences". International World Wide Web Conferences Steering Committee. 2010-05-02. Retrieved 16 May 2010.
21. ^ Berners-Lee, Tim; Fischetti, Mark (1999). Weaving the Web. HarperSanFrancisco. chapter 12. ISBN 978-0-06-251587-2.
22. ^ "'Millionth English Word' declared". NEWS.BBC.co.uk
Berghel, H., and Blank, D. (1999) The World Wide Web. In Advances in
Computing, M. Zelkowitz (Ed). Academic Press, NY.
THE WORLD WIDE WEB
Hal Berghel & Douglas Blank
Department of Computer Science
University of Arkansas
This article provides a high-level overview of the World Wide Web in the context
of a wide range of other Internet information access and delivery services. This
overview will include client-side, server-side and "user-side" perspectives.
Underlying Web technologies as well as current technology extensions to the Web
will also be covered. Social implications of Web technology will also be discussed.
TABLE OF CONTENTS
THE INTERNET INFRASTRUCTURE
THE SUCCESS OF THE WEB
4.1 END USERS' PERSPECTIVE
4.2 HISTORICAL PERSPECTIVE
5. THE UNDERLYING TECHNOLOGIES
5.1 HYPERTEXT MARKUP LANGUAGE (HTML)
5.2 HYPERTEXT TRANSFER PROTOCOL (HTTP)
6. DYNAMIC WEB TECHNOLOGIES
6.1 COMMON GATEWAY INTERFACE
6.5 EXECUTABLE CONTENT
6.8 SERVER SIDE INCLUDES
6.9 PUSH TECHNOLOGIES
6.10 STATE PARAMETERS
7. SECURITY and PRIVACY
7.1 SECURE SOCKET LAYER
7.2 SECURE HTTP (S-HTTP)
8. THE WEB AS A SOCIAL PHENOMENON
The World Wide Web, or "the Web," is a "finite but unbounded" collection of
media-rich digital resources which are connected through high-speed digital
networks. It relies upon an Internet protocol suite which supports the
cross-platform transmission and rendering of a wide variety of media types (i.e.,
multimedia). This cross-platform delivery environment represents an important
departure from more traditional network communications protocols like Email,
Telnet and FTP because it is content-centric. It is also to be distinguished from
earlier document acquisition systems such as Gopher and WAIS (Wide Area
Information Systems) which accommodated a narrower range of media formats
and failed to include hyperlinks within their network navigation protocols.
Following Gopher, the Web quickly extended and enriched the metaphor of
integrated browsing and navigation. This made it possible to navigate and peruse a
wide variety of media types on the Web effortlessly, which in turn led to the Web's
hegemony as an Internet protocol.
Thus, while earlier network protocols were special-purpose in terms of both
function and media formats, the Web is highly versatile. It became the first
convenient form of digital communication which had sufficient rendering and
browsing utilities to allow any person or group with network access to share
media-rich information with their peers. It also became the standard for
hyperlinking cybermedia (cyberspace multimedia), connecting concept to source in
manifold directions identified primarily by Uniform Resource Locators (URLs).
In a formal sense, the Web is a client-server model for packet-switched, networked
computer systems defined by the protocol pair Hypertext Transfer Protocol
(HTTP) and Hypertext Markup Language (HTML). HTTP is the primary transport
protocol of the Web, while HTML defines the organization and structure of the
Web documents to be exchanged. At this writing, the current HTTP standard is at
version 1.0, and the current HTML version is 4.0.
HTTP and HTML are higher-order Internet protocols specifically created for the
Web. In addition, the Web must also utilize the lower-level Internet protocols,
Internet Protocol (IP) and Transmission Control Protocol (TCP). The basic Internet
protocol suite is thus designated TCP/IP. IP determines how datagrams will be
exchanged via packet-switched networks while TCP builds upon IP by adding
control and reliability checking.
According to NSFNET Backbone statistics, the Web moved into first place both in
terms of the percentage of total packets moved (21%) and percentage of total bytes
moved (26%) along the NSF backbone in the first few months of 1995. This placed
the Web well ahead of the traditional Internet activity leaders, FTP (14%/21%) and
Telnet (7.5%/2.5%), as the most popular Internet service. A comparison of the
evolutionary patterns of the Web, Gopher and FTP is graphically depicted in
Figure 1.
Figure 1: Merit NIC Backbone statistics for the Web, Gopher and FTP from 1993-1995 in
terms of both packet and byte counts. (Source: Merit NIC and Jim Pitkow; used with
permission.)
2. THE INTERNET INFRASTRUCTURE
The Web evolution should be thought of as an extension of the digital computer
network technology which began in the 1960's. Localized, platform-dependent,
low-performance networks became prevalent in the 1970's. These LANs (local
area networks) were largely independent of, and incompatible with, each other. In
a quest for technology which could integrate these individual LANs, the U.S.
Department of Defense, through its Advanced Research Projects Agency (ARPA,
nee DARPA), funded research in inter-networking - or inter-connecting LANs via
a Wide Area Network (aka WAN). The first national network which resulted from
this project was called, not surprisingly, ARPANET. For most of the 1970's and
1980's ARPANET served as the primary network backbone in use for
interconnecting LANs for both the research community and the U.S. Government.
At least two factors considerably advanced the interest in the ARPANET project.
First, and foremost, it was an "open" architecture: its underlying technological
requirements and software specifications were available for anyone to see. As a
result, it became an attractive alternative for developers who bristled at the notion of
developing software which would run on only a certain subset of available platforms.
Second, ARPANET was built upon a robust, highly versatile and enormously
popular protocol suite: TCP/IP (Transmission Control Protocol / Internet Protocol).
The success and stability of TCP/IP elevated it to the status of the de facto standard
for inter-networking. The U.S. military began using ARPANET in earnest in the
early 1980's, with the research community following suit. Since TCP/IP software
was in essence in the public domain, a frenzy of activity in deploying TCP/IP soon
resulted in both government and academe. One outgrowth was the NSF-sponsored
CSNET which linked computer science departments together. By the end of the
1980's, virtually everyone who wanted to be inter-networked could gain access
through government or academic institutions. ARPANET gradually evolved into
the Internet, and the rest, as they say, is history.
Not unexpectedly, the rapid (in fact, exponential) growth produced some problems.
First and foremost was the problem of scalability. The original ARPANET
backbone was unable to carry the network traffic by the mid-1980's. It was
replaced by a newer backbone supported by the NSF, a backbone operated by a
consortium of IBM, MCI and Merit shortly thereafter, and finally by the privatized,
not-for-profit corporation, Advanced Networks and Services (ANS), which
consisted of the earlier NSFNET consortium members. In the mid 1990's, MCI
corporation deployed the very-high-speed Backbone Network Service (vBNS) which
completed the trend toward privatization of the digital backbone networks. The
next stage of evolution for inter-network backbones is likely to be an outgrowth of
the much-discussed Internet II project proposed to the U.S. Congress in 1997. This
"next generation" Internet is expected to increase the available bandwidth over the
backbone by two orders of magnitude.
Of course, there were other network environments besides the Internet which have
met with varying degrees of success. Bitnet was a popular alternative for IBM
mainframe customers during the 1970's and early 1980's, as was UUCP for the
Unix environment and the Email-oriented FIDONET. Europeans, meanwhile, used
an alternative network protocol, X.25, for several of their networks (Joint
Academic Network (JANET), European Academic and Research Network (EARN)).
By 1991, however, the enormous popularity of the Internet drove even recalcitrant
foreign network providers into the Internet camp. High-speed, reliable Internet
connectivity was assured with the European Backbone (EBONE) project. At this
writing all but a handful of developing countries have some form of Internet
connectivity.
3. THE SUCCESS OF THE WEB
It has been suggested that the rapid deployment of the Web is a result of a
unique combination of characteristics:
1. the Web is an enabling technology - The Web was the first widespread
network technology to extend the notion of "virtual network machine" to
multimedia. While the ability to execute programs on, and retrieve content
from, distributed computers was not new (e.g., Telnet and FTP were already
in wide use by the time that the Web was conceived), the ability to produce
and distribute media-rich documents via a common, platform-independent
document structure, was new to the Web.
2. the Web is a unifying technology - The unification came through the Web's
accommodation of a wide range of multimedia formats. Since such audio
(e.g., .WAV,.AU), graphics (e.g., .GIF,.JPG) and animation (e.g., MPEG)
formats are all digital, they were already unified in desktop applications
prior to the Web. The Web, however, unified them for distributed, network
applications. One Web "browser", as it later became called, would correctly
render dozens of media formats regardless of network source. In addition,
the Web unifies not only the access to many differing multimedia formats,
but provides a platform-independent protocol which allows anyone,
regardless of hardware or operating system, access to that media.
3. the Web is a social phenomenon - The Web social experience evolved in
three stages. Stage one was the phenomenon of Web "surfing". The richness
and variety of Web documents and the novelty of the experience made Web
surfing the de facto standard for curiosity-driven networking behavior in the
1990's. The second stage involved such Web interactive communication
forums as Internet Relay Chat (IRC), which provided a new outlet for
interpersonal but not-in-person communication. The third stage, which is in its
infancy as of this writing, involves the notion of virtual community. The
widespread popularity and social implications of such network-based,
interactive communication is becoming an active area in computing research.
4.1 END USERS' PERSPECTIVE
Extensive reporting on Web use and Web users may be found in a number of Web
survey sites. Perhaps the most thorough of these is the biannual, self-selection
World Wide Web Survey which began in January, 1994 (see reference, below). As
this article is being written, the most current Web Survey is the eighth (October,
1997). Selected summary data appear in the table, below:
TABLE 1. Summary Information on Web use
average age of Web user = 35.7 years
male:female ratio of users = 62:38
% users with college degrees = 46.9
% in computing field = 20.6; 23.4% are in education; 11.7% in management
% of users from U.S. = 80.5 (and slowly decreasing)
% of users who connect via modems with transmission speeds of
33.5Kb/sec or less = 55
% of respondents who reported using the Web for purchases exceeding $100 =
% of users for whom English is the primary language = 93.1
% of users who have Internet bank accounts = 5.5
% of Microsoft Windows platforms = 64.5 (Apple = 25.6%)
% of users who plan to use Netscape = 60 (Internet Explorer = 15%)
Source: GVU's WWW User Surveys, http://www.cc.gatech.edu/gvu/user_surveys/.
Used with permission.
Of course a major problem with self-selection surveys, where subjects determine
whether, or to what degree, they wish to participate in the survey, is that the
samples are likely to be biased. In the case of the Web survey, for example, the
authors recommend that the readers assume biases towards the experienced users.
As a consequence, they recommend that readers confirm the results through
random sample surveys. Despite these limitations, however, the Web Surveys are
widely used and referenced and are among our best sources of information on Web use.
An interesting byproduct of these surveys will be an increased understanding of the
difference between traditional and electronic surveying methodologies and a
concern over possible population distortions under a new, digital lens. One may
only conjecture at this point whether telephone respondents behave similarly to
network respondents in survey settings. In addition, Web surveyors will develop
new techniques for non-biased sampling which avoid the biases inherent in
self-selection. The science and technology behind such electronic sampling may well
be indispensable for future generations of Internet marketers, communicators, and
4.2 HISTORICAL PERSPECTIVE
The Web was conceived by Tim Berners-Lee and his colleagues at CERN (now
called the European Laboratory for Particle Physics) in 1989 as a shared
information space which would support collaborative work. Berners-Lee defined
HTTP and HTML at that time. As a proof-of-concept prototype, he developed the
first Web client navigator-browser in 1990 for the NeXTStep platform. Nicola
Pellow developed the first cross-platform Web browser in 1991 while Berners-Lee
and Bernd Pollerman developed the first server application - a phone book
database. By 1992, the interest in the Web was sufficient to produce four additional
browsers - Erwise, Midas, and Viola for X Windows, and Cello for Windows. The
following year, Marc Andreessen of the National Center for Supercomputing
Applications (NCSA) wrote Mosaic for the X Windows System which soon became
the browser standard against which all others would be compared. Andreessen
went on to co-found Netscape Communications in 1994 whose current browser,
Netscape Navigator, remains the de facto standard Web browser, despite
continuous loss of market share to Microsoft's Internet Explorer in recent years
(see Figure 2). Netscape has also announced plans to license without cost the
source code for version 5.0 to be released in Spring, 1998. At this point it is
unclear what effect the move to "open sources" may have. (see Figures 2 and 3).
Figure 2: Market share of the three dominant Web browsers from 1994 through 1997.
Despite the original design goal of supporting collaborative work, Web use has
become highly variegated. The Web has been extended into a wide range of
products and services offered by individuals and organizations, for commerce,
education, entertainment, "edutainment", and even propaganda. A partial list of
popular Web applications includes:
individual and organizational homepages
sales prospecting via interactive forms-based surveys
advertising and the distribution of product promotional material
new product information, product updates, product recall notices
product support - manuals, technical support, frequently asked questions
corporate record-keeping - usually via local area networks (LANs) and
electronic commerce made possible with the advent of several secure HTTP
transmission protocols and electronic banking which can handle small
charges (perhaps at the level of millicents)
Figure 3: Navigator 4.x is a recent generic
"navigator/browser" from Netscape Corporation.
Displayed is a vanilla "splash page" of the World
Wide Web Test Pattern - a test bench for
determining the level of HTML compliance of a browser.
Most Web resources at this writing are still set up for non-interactive, multimedia
downloads (e.g., non-interactive Java animation applets, movie clips, real-time
audio transmissions, text with graphics). This will change in the next decade
as software developers and Web content-providers shift their attention to the
interactive and participatory capabilities of the Internet, the Web, and their
successor technologies. Already, the Web is eating into television's audience and
will probably continue to do so. Since it seems inevitable that some aspects of both
television and the Web will merge in the 21st century, they are said to be
convergent technologies. But as of this writing, the dominant Web theme seems to
remain static HTML documents and non-interactive animations.
As mentioned above, the uniqueness of the Web as a network technology is a
product of two protocols: HTML and HTTP. We elaborate on these protocols below.
5. THE UNDERLYING TECHNOLOGIES
5.1 HYPERTEXT MARKUP LANGUAGE (HTML)
HTML is the business part of document preparation for the Web. Two not-for-profit organizations play a major role in standardizing HTML: the World Wide
Web Consortium (www.w3.org) and the Internet Engineering Task Force
(www.ietf.org). Any document which conforms to the W3C/IETF HTML
standards is called a Web message entity. HTML is about the business of defining
Web message entities.
The hypertext orientation of HTML derives from the pioneering and independent
visions of Vannevar Bush in the mid-1940's, and Doug Engelbart and Ted
Nelson in the 1960's. Bush proposed mechanical and computational aids in
support of associative memory - i.e., the linking together of concepts which shared
certain properties. Engelbart sought to integrate variegated documents and their
references through a common core document in a project called Augment. Nelson,
who coined the terms "hypertext" and "hypermedia," added to the work of Bush
and Engelbart the concept of non-linear document traversal, as his proposed
project Xanadu (www.xanadu.net/the.project) attempted to "create, access and
manipulate this literature of richly formatted and connected information cheaply,
reliably and securely from anywhere in the world." Subsequently, Nelson has also
defined the notions of "transclusion," or virtual copies of collections of documents,
and "transcopyright" which enables the aggregation of information regardless of
ownership by automating the procedure by means of which creators are paid for
their intellectual property. We won't comment beyond saying that the Web is an
ideal test bed for Nelson's ideas.
From a technical perspective, HTML is a sequence of "extensions" to the original
concept of Berners-Lee - which was text-oriented. By early 1993, when the NCSA
Mosaic navigator- browser client was released for the X Windows System, HTML
had been extended to include still-frame graphics. Soon audio and other forms of multimedia followed.
After 1993, however, HTML standards were a moving target. Marc Andreessen, the
NCSA Mosaic project leader, left the NCSA to form what would become Netscape
Corporation. Under his technical supervision, Netscape went its own way in
offering new features which were not endorsed by W3C/IETF, and at times were
inconsistent with the Standard Generalized Markup Language (SGML) orientation
intended by the designers of HTML. SGML is a document definition language
which is independent of any particular structure - i.e. layout is defined by the
presentation software based upon the document's structural markup. Under pressure to gain market share,
navigator/browser developers attempted to add as many useful "extensions" to the
HTML standards as could be practicably supported. This competition has been
called the "Mosaic War," which persists in altered form even to this day.
Although not complete, Table 1 provides a technical perspective of the evolution of HTML.
Table 1: HTML Evolution.
note: (1) Version 3.2 is actually a subset of Version 3.0, the latter of which failed
to get endorsed by W3C/IETF. (2) Dates are only approximate because of the time
lag between the introduction of the technology and the subsequent endorsement as
a standard. In some cases this delay is measured in years.
GML - Generalized Markup Language
  Developed by IBM in 1969 to separate form from content in documents
SGML - ISO 8879 Standard Generalized Markup Language
HTML Version 1 (circa 1992-3)
  basic HTML structure
HTML Version 2 (circa 1994)
HTML Version 3.2 (circa 1996-7)
  advanced CGI programming
  text flow around graphics
HTML Version 4.x (early 1998)
  format via cascading style sheets (vs. HTML tags)
  compound documents with hierarchy of alternate rendering strategies
  tty + braille support
  client-side image maps
XML - Extensible Markup Language. Subset of SGML
Among the many Netscape innovations are: typographical enhancements and
fonts; alignment and colorization controls for text and graphics; dynamic updating
(continuous refresh without reload); server push/client pull; frames; cookies;
plug-ins; scripts; Java applets; and layers. Many of these have become part of
subsequent HTML standards. In addition to these formal standards, discussion is
already underway for a radical extension of HTML called XML
(www.w3.org/XML/Activity). In many ways, HTML evolved away from its nicely
thought-out roots. GML, or Generalized Markup Language, was developed in the
1960's at IBM to describe many different kinds of documents. Standard
Generalized Markup Language, or SGML, was based on GML and became an ISO
standard years later in the 1980's. SGML still stands today as the mother of all
markup languages. Its designers were very careful to not confuse form and content,
and created a wonderfully rich language. HTML became a patchwork of ideas as it
quickly evolved over the last few years, and muddied the difference between form
and content. XML is an effort to reunite HTML with its SGML roots. The
development of XML, which began in late 1996, deals with the non-extensibility
of HTML to handle advanced page design and a full range of new multimedia.
XML will accomplish this by using
1. a more SGML-like markup language (vs. HTML) that allows "personal" or
"group"-oriented tags, and
2. a low-level syntax for data definition
To see how XML differs from HTML, we examine a page of HTML code:
<p>Smith, Aaron S. (1999). <i>Understanding the Web</i>.
Web Books, Inc. </p>
This code, when rendered by an appropriate browser, would appear similar to the following:
Smith, Aaron S. (1999). Understanding the Web. Web Books, Inc.
Tags are special symbols in HTML and XML, and are indicated by the
surrounding less-than and greater-than symbols. The majority of tags are paired -
i.e., they surround the text that they affect. For example, <I> and </I> indicate that
italics should be turned on and off, respectively.
Now, contrast the HTML example with sample XML code:
<?XML version="1.0" ?>
<ref-name> Smith-1999b </ref-name>
<name>
  <last> Smith </last>
  <first> Aaron </first>
  <mi> S </mi>
</name>
<title> Understanding the Web </title>
<year> 1999 </year>
<publisher> Web Books, Inc. </publisher>
<type> Book </type>
Like the HTML code, XML is made up of tags. However, XML does not describe
how to render the data, it merely indicates the structure and content of the data.
HTML does have some of these kinds of tags (for example, <title> in the above
HTML example) but, for the most part, HTML has evolved completely away from
its SGML roots.
XML was designed to be compatible with current HTML (and SGML, for that
matter). Today's most common web browsers (Microsoft's Internet Explorer and
Netscape's Navigator) do not support XML directly. Instead, most XML processors
have been implemented as Java applications or applets (see the Web Consortium's
website for a list of XML processors at www.w3.org). Such a Java processor could
be instructed to render the XML inside the browser exactly like the rendered HTML above.
One of the nice properties of XML is the separation of content and format. This
distinction will surely help tame the Wild Web as it will allow easier searching,
better structuring, and greater assistance to software agents in general. However,
this isn't XML's greatest virtue: what makes XML a great leap forward for the Web
is its ability to create new tags. Much like a modern database management system
can define new fields, XML can create a new tag. In addition, XML tags can also
have structure: the name field above, for example, is composed of first, last, and middle
initial. As long as client and server agree on the structure of the data, they can
freely create and share new data fields, types, and content via XML.
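The idea of client and server agreeing on a shared record structure can be sketched in C (the language used for the CGI example later in this article). The struct layout and function name below are hypothetical illustrations of the article's reference record, not part of any XML standard or tool:

```c
#include <stdio.h>

/* Hypothetical record mirroring the XML reference entry above. */
struct reference {
    const char *ref_name;
    const char *last, *first, *mi;   /* the structured "name" field */
    const char *title;
    int year;
    const char *publisher;
    const char *type;
};

/* Serialize a reference into XML resembling the article's example.
   Returns the number of characters written, or a negative value on error. */
int reference_to_xml(const struct reference *r, char *buf, size_t len)
{
    return snprintf(buf, len,
        "<ref-name> %s </ref-name>\n"
        "<name>\n"
        "  <last> %s </last>\n"
        "  <first> %s </first>\n"
        "  <mi> %s </mi>\n"
        "</name>\n"
        "<title> %s </title>\n"
        "<year> %d </year>\n"
        "<publisher> %s </publisher>\n"
        "<type> %s </type>\n",
        r->ref_name, r->last, r->first, r->mi,
        r->title, r->year, r->publisher, r->type);
}
```

Any peer that knows this structure can parse the tags back into the same fields, which is exactly the agreement the text describes.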
Some have said that XML "does for data what Java does for programs." Examples
of XML applications are the math-formula markup language, MathML
(http://www.w3.org/TR/WD-math/), which combines the ability to define content
with a less-powerful suite of features to define presentation. Another example is
RDF, a resource description format for meta-data
(http://www.w3.org/RDF/Overview.html), which is used in both PICS, the
Platform for Internet Content Selection (http://www.w3.org/PICS/) and SMIL, the
Synchronized Multimedia Integration Language, which is a declarative language
for synchronizing multimedia on the Web. The XML prototype client is Jumbo
Although XML will help make marking up Web pages easier, there is still a battle
raging over which system should be responsible for the details of rendering pages.
Current HTML coders must take responsibility for exact placement and page
layout, and getting a standard look across browsers is non-trivial. However, SGML
leaves the page layout details up to the browser. Exactly how this important issue
will play out remains to be seen.
5.2 HYPERTEXT TRANSFER PROTOCOL (HTTP)
HTTP is a platform-independent protocol based upon the client-server model of
computing which runs on any TCP/IP, packet switched digital network - e.g., the
Internet. HTTP stands for Hyper Text Transfer Protocol and is the communication
protocol with which browsers request data, and servers provide it. This data can be
of many types including video, sound, graphics, and text. In addition, HTTP is
extensible in that it can be augmented to transfer types of data that do not yet exist.
HTTP is an application layer protocol, and sits directly on top of TCP
(Transmission Control Protocol). It is similar in many ways to the File
Transfer Protocol (FTP) and TELNET. HTTP follows these logical steps:
1. A connection from the client's browser is made to a server, typically by the
user having clicked on a link.
2. A request is made of the server. This request could be for data (i.e., a
"GET") or could be a request to process data (i.e., "POST" or "PUT").
3. The server attempts to fulfill the request. If successful, the client's browser
will receive additional data to render. Otherwise, an error occurs.
4. The connection is then closed.
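These four steps can be sketched in C with the network connection abstracted away. The handler below is a toy stand-in of my own devising, not real server code; a real server would accept the connection over TCP (step 1) and close it afterward (step 4):

```c
#include <string.h>

/* A toy stand-in for steps 2-4: one request arrives, one response is
   returned, and the exchange ends. */
const char *handle_request(const char *request)
{
    /* Step 2: inspect the client's request line, e.g. "GET / HTTP/1.0". */
    if (strncmp(request, "GET / ", 6) == 0)
        /* Step 3 (success): return a status line, headers, and data. */
        return "HTTP/1.0 200 OK\r\n"
               "Content-type: text/html\r\n"
               "\r\n"
               "<html><body>Hello</body></html>";
    /* Step 3 (failure): an error response instead of data. */
    return "HTTP/1.0 404 Not Found\r\n\r\n";
}
```

Note that each call is independent: the handler keeps no memory of earlier requests, which anticipates the "stateless" character of HTTP discussed below.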
HTTP uses the same underlying communication protocols as do all the applications
that sit on top of TCP. For this reason, one can use the TELNET application to
make an HTTP request. Other TCP-based applications include FTP, TFTP (Trivial
File Transfer Protocol), and SMTP (Simple Mail Transfer Protocol) to name just a
few. Consider the following example:
% telnet www.uark.edu 80
GET / HTTP/1.0
Accept: text/plain
Accept: text/html

This command made from any operating system with access to the TELNET
program requests to talk to port 80, the standard HTTP port, of a machine running
a web server (TELNET normally uses port 23). A request is made to get the root
document (GET /), in a particular protocol (HTTP/1.0), and accepting either text or
HTML. The data (i.e., HTML codes) are returned, and the connection is closed.
Note: the ending empty line is required.
HTTP/1.0 200 OK
Server: Netscape-Enterprise
Date: Sun, 03 May 1998 22:25:37 GMT
Content-type: text/html

These are the data returned from the previous request. First, the server responds
with the protocol (HTTP/1.0 in this example), gives the corresponding code (200
OK), provides details of the server (Netscape-Enterprise), date and time, and the
format of the following data (text/html). Finally, an empty line separates the header
from the actual HTML code.
This type of processing is called "stateless". This makes HTTP only slightly, yet
importantly, different from FTP. FTP has "state"; an FTP session has a series of
settings that may be altered during the course of a dialog between client and server.
For example, the "current directory" and "download data type" settings may be
changed during an FTP dialog. HTTP, on the other hand, has no such interaction -
the conversation is limited to a simple request and response. This has been the
most limiting aspect of HTTP. Much current Web development has centered
around dealing with this particular limitation of the protocol (i.e., cookies).
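Cookies restore state by having the client echo small name=value pairs back to the server on every request. A minimal sketch in C of extracting one such value follows; the function name and its simplifications are mine (real cookie handling also deals with quoting, paths, and expiry):

```c
#include <string.h>
#include <stdlib.h>

/* Extract one cookie's value from a "name=value; name=value" header body.
   Returns a newly allocated copy of the value, or NULL if not found. */
char *cookie_value(const char *header, const char *name)
{
    size_t nlen = strlen(name);
    const char *p = header;
    while (p && *p) {
        while (*p == ' ' || *p == ';') p++;          /* skip separators   */
        if (strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            const char *v = p + nlen + 1;            /* start of value    */
            size_t vlen = strcspn(v, ";");           /* up to next cookie */
            char *out = malloc(vlen + 1);
            if (out) { memcpy(out, v, vlen); out[vlen] = '\0'; }
            return out;
        }
        p = strchr(p, ';');                          /* try next pair     */
    }
    return NULL;
}
```

The server-side state lives entirely in these replayed pairs, so the request/response exchange itself can remain stateless.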
Although HTTP is very limited, it has shown its flexibility through what must be
one of the most explosive and rapidly changing technological landscapes ever.
This flexibility is made possible via the protocol's format negotiations. The
negotiation begins with the client identifying the types of formats it can
understand. The server responds with data in any of those formats that it can
supply (text/html in the above example). In this manner, the client and server can
agree on file types yet to be invented, or which depend on proprietary formats. If
the client and server cannot agree on a format, the data is simply ignored.
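That negotiation can be sketched in C as a hypothetical helper that picks the first format the server can supply which also appears in the client's Accept list (real HTTP negotiation also weighs quality parameters and wildcards such as "text/*"):

```c
#include <string.h>

/* Return the first server-supplied format that also appears in the
   client's Accept list, or NULL if client and server cannot agree. */
const char *negotiate(const char *accept, const char *formats[], int nformats)
{
    for (int i = 0; i < nformats; i++)
        if (strstr(accept, formats[i]) != NULL)
            return formats[i];    /* first mutually understood type */
    return NULL;                  /* no agreement: data is ignored  */
}
```

Because the match is by name, client and server can agree on media types invented long after either program was written.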
6. DYNAMIC WEB TECHNOLOGIES
Web technologies evolved beyond the original concept in several important
respects. We examine HTML forms, the Common Gateway Interface, plug-ins,
executable content, and push technologies.
6.1 COMMON GATEWAY INTERFACE
The support of the Common Gateway Interface (CGI) within HTTP in 1993 added
interactive computing capability to the Web. Here is a one-line C program that
formats the standard greeting in basic HTML. (Note: make sure the binary is
marked executable. Also, often the binary will need to have a .cgi extension to tell