World wide web



Published in: Technology, Education

World Wide Web
From Wikipedia, the free encyclopaedia

Not to be confused with the Internet.

[Image: The web's logo, designed by Robert Cailliau]

Invented by: Tim Berners-Lee[1][2]
Company: CERN
Availability: Worldwide

The World Wide Web (abbreviated as WWW or W3,[3] commonly known as the web) is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images, videos, and other multimedia and navigate between them via hyperlinks.

The web was developed between March 1989 and December 1990.[4][5] Using concepts from his earlier hypertext systems such as ENQUIRE, British engineer Tim Berners-Lee, a computer scientist who was then an employee of CERN and is now Director of the World Wide Web Consortium (W3C), wrote a proposal in March 1989 for what would eventually become the World Wide Web.[1] The 1989 proposal was meant for a more effective CERN communication system, but Berners-Lee eventually realised the concept could be implemented throughout the world.[6] At CERN, a European research organisation near Geneva straddling the border between France and Switzerland,[7] Berners-Lee and Belgian computer scientist Robert Cailliau proposed in 1990 to use hypertext "to link and access information of various kinds as a web of nodes in which the user can browse at will",[8] and Berners-Lee finished the first website in December that year.[9] Berners-Lee posted the project on the alt.hypertext newsgroup on 6 August 1991.[10]

History
Main article: History of the World Wide Web

[Image: The NeXT Computer used by Berners-Lee. The handwritten label declares, "This machine is a server. DO NOT POWER IT DOWN!!"]

In the May 1970 issue of Popular Science magazine, Arthur C. Clarke predicted that satellites would someday "bring the accumulated knowledge of the world to your fingertips" using a console that would combine the functionality of the photocopier, telephone, television and a small computer, allowing data transfer and video conferencing around the globe.[11]

In March 1989, Tim Berners-Lee wrote a proposal that referenced ENQUIRE, a database and software project he had built in 1980, and described a more elaborate information management system.[12] With help from Robert Cailliau, he published a more formal proposal (on 12 November 1990) to build a "Hypertext project" called "WorldWideWeb" (one word, also "W3") as a "web" of "hypertext documents" to be viewed by "browsers" using a client–server architecture.[8] This proposal estimated that a read-only web would be developed within three months and that it would take six months to achieve "the creation of new links and new material by readers, [so that] authorship becomes universal" as well as "the automatic notification of a reader when new material of interest to him/her has become available."
While the read-only goal was met, accessible authorship of web content took longer to mature, with the wiki concept, blogs, Web 2.0 and RSS/Atom.[13]

The proposal was modeled after the SGML reader Dynatext by Electronic Book Technology, a spin-off from the Scholarship at Brown University. The Dynatext system, licensed by CERN, was a key player in the extension of SGML ISO 8879:1986 to Hypermedia within HyTime, but it was considered too expensive and had an inappropriate licensing policy for use in the general high energy physics community, namely a fee for each document and each document alteration.

[Image: The CERN datacenter in 2010, housing some WWW servers]

A NeXT Computer was used by Berners-Lee as the world's first web server and also to write the first web browser, WorldWideWeb, in 1990. By Christmas 1990, Berners-Lee had built all the tools necessary for a working Web:[14] the first web browser (which was a web editor as well); the first web server; and the first web pages,[15] which described the project itself. The first web page may be lost, but Paul Jones of UNC-Chapel Hill in North Carolina revealed in May 2013 that he has a copy of a page sent to him by Berners-Lee which is the oldest known web page. Jones stored it on a floppy disk and on his NeXT computer.[16]

On 6 August 1991, Berners-Lee posted a short summary of the World Wide Web project on the alt.hypertext newsgroup.[17] This date also marked the debut of the Web as a publicly available service on the Internet, although new users could only access it after 23 August. For this reason, 23 August is considered the Internet's day.

Many news media have reported that the first photo on the web was uploaded by Berners-Lee in 1992, an image of the CERN house band Les Horribles Cernettes taken by Silvano de Gennaro; Gennaro has disclaimed this story, writing that media were "totally distorting our words for the sake of cheap sensationalism."[18]

The first server outside Europe was set up at the Stanford Linear Accelerator Center (SLAC) in Palo Alto, California, to host the SPIRES-HEP database. Accounts differ substantially as to the date of this event. The World Wide Web Consortium says December 1992,[19] whereas SLAC itself claims 1991.[20][21] The 1991 date is supported by a W3C document titled A Little History of the World Wide Web.[22]

The crucial underlying concept of hypertext originated with older projects from the 1960s, such as the Hypertext Editing System (HES) at Brown University, Ted Nelson's Project Xanadu, and Douglas Engelbart's oN-Line System (NLS). Both Nelson and Engelbart were in turn inspired by Vannevar Bush's microfilm-based "memex", which was described in the 1945 essay "As We May Think".[23]

Berners-Lee's breakthrough was to marry hypertext to the Internet. In his book Weaving The Web, he explains that he had repeatedly suggested to members of both technical communities that a marriage between the two technologies was possible, but when no one took up his invitation, he finally took on the project himself. In the process, he developed three essential technologies:

1. a system of globally unique identifiers for resources on the Web and elsewhere, the universal document identifier (UDI), later known as uniform resource locator (URL) and uniform resource identifier (URI);
2. the publishing language HyperText Markup Language (HTML);
3. the Hypertext Transfer Protocol (HTTP).[24]

The World Wide Web had a number of differences from other hypertext systems available at the time. The web required only unidirectional links rather than bidirectional ones, making it possible for someone to link to another resource without action by the owner of that resource. It also significantly reduced the difficulty of implementing web servers and browsers (in comparison to earlier systems), but in turn presented the chronic problem of link rot. Unlike predecessors such as HyperCard, the World Wide Web was non-proprietary, making it possible to develop servers and clients independently and to add extensions without licensing restrictions.

On 30 April 1993, CERN announced that the World Wide Web would be free to anyone, with no fees due.[25] Coming two months after the announcement that the server implementation of the Gopher protocol was no longer free to use, this produced a rapid shift away from Gopher and towards the Web. An early popular web browser was ViolaWWW for Unix and the X Window System.

[Image: Robert Cailliau, Jean-François Abramatic of IBM, and Tim Berners-Lee at the 10th anniversary of the World Wide Web Consortium]

Scholars generally agree that a turning point for the World Wide Web began with the introduction[26] of the Mosaic web browser[27] in 1993, a graphical browser developed by a team at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign (NCSA-UIUC), led by Marc Andreessen. Funding for Mosaic came from the U.S. High-Performance Computing and Communications Initiative and the High Performance Computing and Communication Act of 1991, one of several computing developments initiated by U.S. Senator Al Gore.[28] Prior to the release of Mosaic, graphics were not commonly mixed with text in web pages, and the web was less popular than older protocols in use over the Internet, such as Gopher and Wide Area Information Servers (WAIS). Mosaic's graphical user interface allowed the Web to become, by far, the most popular Internet protocol.

The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee after he left the European Organization for Nuclear Research (CERN) in October 1994. It was founded at the Massachusetts Institute of Technology Laboratory for Computer Science (MIT/LCS) with support from the Defense Advanced Research Projects Agency (DARPA), which had pioneered the Internet; a year later, a second site was founded at INRIA (a French national computer research lab) with support from the European Commission DG InfSo; and in 1996, a third continental site was created in Japan at Keio University. By the end of 1994, while the total number of websites was still minute compared to present standards, quite a number of notable websites were already active, many of which are the precursors or inspiration for today's most popular services.

Connected by the existing Internet, other websites were created around the world, adding international standards for domain names and HTML.
Since then, Berners-Lee has played an active role in guiding the development of web standards (such as the markup languages in which web pages are composed), and has advocated his vision of a Semantic Web. The World Wide Web enabled the spread of information over the Internet through an easy-to-use and flexible format. It thus played an important role in popularizing use of the Internet.[29] Although the two terms are sometimes conflated in popular use, World Wide Web is not synonymous with Internet.[30] The web is a collection of documents and both client and server software using Internet protocols such as TCP/IP and HTTP.

Tim Berners-Lee was knighted in 2004 by Queen Elizabeth II for his contribution to the World Wide Web.

Function

The terms Internet and World Wide Web are often used in everyday speech without much distinction. However, the Internet and the World Wide Web are not the same. The Internet is a global system of interconnected computer networks. In contrast, the web is one of the services that run on the Internet. It is a collection of text documents and other resources, linked by hyperlinks and URLs, usually accessed by web browsers from web servers. In short, the web can be thought of as an application "running" on the Internet.[31]

Viewing a web page on the World Wide Web normally begins either by typing the URL of the page into a web browser or by following a hyperlink to that page or resource. The web browser then initiates a series of communication messages, behind the scenes, in order to fetch and display it. In the 1990s, using a browser to view web pages, and to move from one web page to another through hyperlinks, came to be known as 'browsing,' 'web surfing,' or 'navigating the web'. Early studies of this new behavior investigated user patterns in using web browsers. One study, for example, found five user patterns: exploratory surfing, window surfing, evolved surfing, bounded navigation and targeted navigation.[32]

The following example demonstrates how a web browser works. Consider accessing a page by its URL. First, the browser resolves the server-name portion of the URL into an Internet Protocol address using the globally distributed database known as the Domain Name System (DNS); this lookup returns the server's IP address. The browser then requests the resource by sending an HTTP request across the Internet to the computer at that particular address. It makes the request to a particular application port in the underlying Internet Protocol Suite so that the computer receiving the request can distinguish an HTTP request from other network protocols it may be servicing, such as e-mail delivery; the HTTP protocol normally uses port 80. The content of the HTTP request can be as simple as the two lines of text

    GET /wiki/World_Wide_Web HTTP/1.1
    Host:

where the Host line names the server. The computer receiving the HTTP request delivers it to web server software listening for requests on port 80.
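
The request-building step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_get_request` is made up for this example, the host `example.org` is a placeholder rather than the server used in the article, and only plain `http` URLs are handled. Real browsers send many more headers.

```python
from typing import Tuple
from urllib.parse import urlsplit


def build_get_request(url: str) -> Tuple[str, int, bytes]:
    """Return (host, port, raw_request) for a minimal HTTP/1.1 GET.

    Hypothetical helper for illustration; it does not open a connection.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("this sketch only handles plain http URLs")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no server name")
    # HTTP normally uses port 80 unless the URL names another port.
    port = parts.port or 80
    path = parts.path or "/"
    if parts.query:
        path = f"{path}?{parts.query}"
    host_header = host if port == 80 else f"{host}:{port}"
    # The request line and the Host header, each ended by CRLF,
    # followed by an empty line that ends the header section.
    request = f"GET {path} HTTP/1.1\r\nHost: {host_header}\r\n\r\n"
    return host, port, request.encode("ascii")


if __name__ == "__main__":
    # example.org is a placeholder host chosen for this sketch.
    host, port, raw = build_get_request("http://example.org/wiki/World_Wide_Web")
    print(host, port)
    print(raw.decode("ascii"))
    # To actually send it, socket.create_connection((host, port)) would
    # perform the DNS lookup and open the TCP connection.
```

The function stops short of any network activity, so the DNS lookup and the connection to the server's port remain separate steps, as in the description above.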

If the web server can fulfill the request, it sends an HTTP response back to the browser indicating success, which can be as simple as

    HTTP/1.0 200 OK
    Content-Type: text/html; charset=UTF-8

followed by the content of the requested page. The Hypertext Markup Language for a basic web page looks like

    <html>
      <head>
        <title> – The World Wide Web</title>
      </head>
      <body>
        <p>The World Wide Web, abbreviated as WWW and commonly known ...</p>
      </body>
    </html>

The web browser parses the HTML, interpreting the markup (<title>, <p> for paragraph, and such) that surrounds the words in order to draw the text on the screen.

Many web pages use HTML to reference the URLs of other resources such as images, other embedded media, scripts that affect page behavior, and Cascading Style Sheets that affect page layout. The browser will make additional HTTP requests to the web server for these other Internet media types. As it receives their content from the web server, the browser progressively renders the page onto the screen as specified by its HTML and these additional resources.

Linking

Most web pages contain hyperlinks to other related pages and perhaps to downloadable files, source documents, definitions and other web resources. In the underlying HTML, a hyperlink looks like

    <a href="">, a free encyclopedia</a>

[Image: Graphic representation of a minute fraction of the WWW, demonstrating hyperlinks]

Such a collection of useful, related resources, interconnected via hypertext links, is dubbed a web of information. Publication on the Internet created what Tim Berners-Lee first called the WorldWideWeb (in its original CamelCase, which was subsequently discarded) in November 1990.[8]

The hyperlink structure of the WWW is described by the webgraph: the nodes of the webgraph correspond to the web pages (or URLs), and the directed edges between them correspond to the hyperlinks.

Over time, many web resources pointed to by hyperlinks disappear, relocate, or are replaced with different content. This makes hyperlinks obsolete, a phenomenon referred to in some circles as link rot, and the hyperlinks affected by it are often called dead links. The ephemeral nature of the Web has prompted many efforts to archive web sites. The Internet Archive, active since 1996, is the best known of such efforts.

Dynamic updates of web pages
Main article: Ajax (programming)

JavaScript is a scripting language that was initially developed in 1995 by Brendan Eich, then of Netscape, for use within web pages.[33] The standardised version is ECMAScript.[33] To make web pages more interactive, some web applications also use JavaScript techniques such as Ajax (asynchronous JavaScript and XML). Client-side script is delivered with the page and can make additional HTTP requests to the server, either in response to user actions such as mouse movements or clicks, or based on elapsed time. The server's responses are used to modify the current page rather than creating a new page with each response, so the server needs only to provide limited, incremental information. Multiple Ajax requests can be handled at the same time, and users can interact with the page while data is being retrieved. Web pages may also regularly poll the server to check whether new information is available.[34]

WWW prefix

Many domain names used for the World Wide Web begin with www because of the longstanding practice of naming Internet hosts (servers) according to the services they provide. The hostname for a web server is often www, in the same way that it may be ftp for an FTP server, and news or nntp for a USENET news server. These host names appear as Domain Name System (DNS) subdomain names. The use of 'www' as a subdomain name is not required by any technical or policy standard and many web sites do not use it; indeed, the first ever web server did not.[35] According to Paolo Palazzi,[36] who worked at CERN along with Tim Berners-Lee, the popular use of the 'www' subdomain was accidental; the World Wide Web project page and the CERN home page were intended to be published under different host names, but the DNS records were never switched, and the practice of prepending 'www' to an institution's website domain name was subsequently copied. Many established websites still use 'www', or they invent other subdomain names such as 'www2', 'secure', etc. Many such web servers are set up so that both the domain root and the www subdomain refer to the same site; others require one form or the other, or they may map to different web sites.

The use of a subdomain name is useful for load balancing incoming web traffic by creating a CNAME record that points to a cluster of web servers. Since, currently, only a subdomain can be used in a CNAME, the same result cannot be achieved by using the bare domain root.
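
The naming convention described above, where a domain root and its www subdomain often (but not always) refer to the same site, can be illustrated with a small hostname comparison. This is a sketch under stated assumptions: the function names and the `example.org` hosts are hypothetical, and the code only compares names. Whether two names actually serve the same site depends on how the servers and DNS records are configured.

```python
def strip_www(hostname: str) -> str:
    """Lower-case a hostname and drop a leading 'www.' label, if any.

    Hypothetical helper: the prefix is only removed when a domain with at
    least one dot remains, so a name such as 'www.com' is left unchanged.
    """
    host = hostname.lower().rstrip(".")
    if host.startswith("www.") and "." in host[4:]:
        return host[4:]
    return host


def differ_only_by_www(a: str, b: str) -> bool:
    """True if the two hostnames are equal once a 'www.' prefix is ignored."""
    return strip_www(a) == strip_www(b)


if __name__ == "__main__":
    # example.org is a placeholder domain chosen for this sketch.
    print(differ_only_by_www("example.org", "www.example.org"))
    print(differ_only_by_www("www2.example.org", "example.org"))
```

Other service-style subdomains mentioned above, such as 'www2' or 'secure', are deliberately treated as different hosts, since nothing in the naming convention implies they map to the same site.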

When a user submits an incomplete domain name to a web browser in its address bar input field, some web browsers automatically try adding the prefix "www" to the beginning of it and possibly ".com", ".org" and ".net" at the end, depending on what might be missing. For example, entering 'microsoft' or 'openoffice' may be expanded into a full web address in this way. This feature started appearing in early versions of Mozilla Firefox, when it still had the working title 'Firebird' in early 2003, from an earlier practice in browsers such as Lynx.[37] It is reported that Microsoft was granted a US patent for the same idea in 2008, but only for mobile devices.[38]

In English, www is usually read as double-u double-u double-u. Some users pronounce it dub-dub-dub, particularly in New Zealand. Stephen Fry, in his "Podgrammes" series of podcasts, pronounces it wuh wuh wuh. The English writer Douglas Adams once quipped in The Independent on Sunday (1999): "The World Wide Web is the only thing I know of whose shortened form takes three times longer to say than what it's short for". In Mandarin Chinese, World Wide Web is commonly translated via a phono-semantic matching to wàn wéi wǎng (万维网), which satisfies www and literally means "myriad dimensional net",[39] a translation that very appropriately reflects the design concept and proliferation of the World Wide Web. Tim Berners-Lee's web-space states that World Wide Web is officially spelled as three separate words, each capitalised, with no intervening hyphens.[40]

Use of the www prefix is declining as Web 2.0 web applications seek to brand their domain names and make them easily pronounceable.[41] As the mobile web grows in popularity, many services are most often discussed without adding www to the domain (or, indeed, the .com).

Scheme specifiers: http and https

The scheme specifiers http:// and https:// at the start of a web URI refer to Hypertext Transfer Protocol and HTTP Secure respectively. Unlike www, which has no specific purpose, these specify the communication protocol to be used for the request and response. The HTTP protocol is fundamental to the operation of the World Wide Web, and the added encryption layer in HTTPS is essential when confidential information such as passwords or banking information is to be exchanged over the public Internet. Web browsers usually prepend http:// to addresses too, if omitted.

Web servers
Main article: Web server

The primary function of a web server is to deliver web pages to clients on request. This means delivery of HTML documents and any additional content that may be included by a document, such as images, style sheets and scripts.

Privacy
Main article: Internet privacy

Every time a web page is requested from a web server, the server can identify, and usually logs, the IP address from which the request arrived.
Equally, unless set not to do so, most web browsers record the web pages that have been requested and viewed in a history feature, and usually cache much of the content locally. Unless HTTPS encryption is used, web requests and responses travel in plain text across the Internet and they can be viewed, recorded and cached by intermediate systems.

When a web page asks for, and the user supplies, personally identifiable information such as their real name, address, e-mail address, etc., then a connection can be made between the current web traffic and that individual. If the website uses HTTP cookies, username and password authentication, or other tracking techniques, then it will be able to relate other web visits, before and after, to the identifiable information provided. In this way it is possible for a web-based organization to develop and build a profile of the individual people who use its site or sites. It may be able to build a record for an individual that includes information about their leisure activities, their shopping interests, their profession, and other aspects of their demographic profile. These profiles are obviously of potential interest to marketers, advertisers and others. Depending on the website's terms and conditions and the local laws that apply, information from these profiles may be sold, shared, or passed to other organizations without the user being informed. For many ordinary people, this means little more than some unexpected e-mails in their in-box, or some uncannily relevant advertising on a future web page. For others, it can mean that time spent indulging an unusual interest can result in a deluge of further targeted marketing that may be unwelcome. Law enforcement, counter terrorism and espionage agencies can also identify, target and track individuals based on what appear to be their interests or proclivities on the web.

Social networking sites make a point of trying to get the user to truthfully expose their real names, interests and locations. This makes the social networking experience more realistic and therefore engaging for all their users. On the other hand, photographs uploaded and unguarded statements made will be attributed to the individual, who may regret some decisions to publish these data. Employers, schools, parents and other relatives may be influenced by aspects of social networking profiles that the posting individual did not intend for these audiences. On-line bullies may make use of personal information to harass or stalk users.
Modern social networking websites allow fine-grained control of the privacy settings for each individual posting, but these can be complex and not easy to find or use, especially for beginners.[42]

Photographs and videos posted onto websites have caused particular problems, as they can add a person's face to an on-line profile. With modern and potential facial recognition technology, it may then be possible to relate that face with other, previously anonymous, images, events and scenarios that have been imaged elsewhere. Because of image caching, mirroring and copying, it is difficult to remove an image from the World Wide Web.

Intellectual property
Main article: Intellectual property

The intellectual property rights for any creative work initially rest with its creator. Web users who want to publish their work onto the World Wide Web, however, need to be aware of the details of the way they do it. If artwork, photographs, writings, poems, or technical innovations are published by their creator onto a privately owned web server, then they may choose the copyright and other conditions freely themselves. This is unusual though; more commonly work is uploaded to websites and servers that are owned by other organizations. It depends upon the terms and conditions of the site or service provider to what extent the original owner automatically signs over rights to their work by the choice of destination and by the act of uploading.[citation needed]

Some users of the web erroneously assume that everything they may find online is freely available to them as if it were in the public domain, which is not always the case. Content owners that are aware of this widespread belief may expect that their published content will probably be used in some capacity somewhere without their permission. Some content publishers therefore embed digital watermarks in their media files, sometimes charging users to receive unmarked copies for legitimate use. Digital rights management includes forms of access control technology that further limit the use of digital content even after it has been bought or downloaded.[citation needed]

Security

The web has become criminals' preferred pathway for spreading malware. Cybercrime carried out on the web can include identity theft, fraud, espionage and intelligence gathering.[43] Web-based vulnerabilities now outnumber traditional computer security concerns,[44][45] and as measured by Google, about one in ten web pages may contain malicious code.[46] Most web-based attacks take place on legitimate websites, and most, as measured by Sophos, are hosted in the United States, China and Russia.[47] The most common of all malware threats is SQL injection attacks against websites.[48] Through HTML and URIs the web was vulnerable to attacks like cross-site scripting (XSS) that came with the introduction of JavaScript[49] and were exacerbated to some degree by Web 2.0 and Ajax web design that favors the use of scripts.[50] Today by one estimate, 70% of all websites are open to XSS attacks on their users.[51]

Proposed solutions vary to extremes.
Large security vendors like McAfee already design governance and compliance suites to meet post-9/11 regulations,[52] and some, like Finjan, have recommended active real-time inspection of code and all content regardless of its source.[43] Some have argued that for enterprises to see security as a business opportunity rather than a cost center,[53] "ubiquitous, always-on digital rights management" enforced in the infrastructure by a handful of organizations must replace the hundreds of companies that today secure data and networks.[54] Jonathan Zittrain has said users sharing responsibility for computing safety is far preferable to locking down the Internet.[55]

Standards
Main article: Web standards

Many formal standards and other technical specifications and software define the operation of different aspects of the World Wide Web, the Internet, and computer information exchange. Many of the documents are the work of the World Wide Web Consortium (W3C), headed by Berners-Lee, but some are produced by the Internet Engineering Task Force (IETF) and other organizations.

Usually, when web standards are discussed, the following publications are seen as foundational:

  • Recommendations for markup languages, especially HTML and XHTML, from the W3C. These define the structure and interpretation of hypertext documents.
  • Recommendations for stylesheets, especially CSS, from the W3C.
  • Standards for ECMAScript (usually in the form of JavaScript), from Ecma International.
  • Recommendations for the Document Object Model, from W3C.

Additional publications provide definitions of other essential technologies for the World Wide Web, including, but not limited to, the following:

  • Uniform Resource Identifier (URI), which is a universal system for referencing resources on the Internet, such as hypertext documents and images. URIs, often called URLs, are defined by the IETF's RFC 3986 / STD 66: Uniform Resource Identifier (URI): Generic Syntax, as well as its predecessors and numerous URI scheme-defining RFCs;
  • HyperText Transfer Protocol (HTTP), especially as defined by RFC 2616: HTTP/1.1 and RFC 2617: HTTP Authentication, which specify how the browser and server authenticate each other.

Accessibility
Main article: Web accessibility

There are methods available for accessing the web in alternative mediums and formats, so as to enable use by individuals with disabilities. These disabilities may be visual, auditory, physical, speech related, cognitive, neurological, or some combination thereof.
Accessibility features also help others with temporary disabilities, like a broken arm, and the aging population as their abilities change.[56] The Web is used for receiving information as well as providing information and interacting with society. The World Wide Web Consortium claims it is essential that the Web be accessible in order to provide equal access and equal opportunity to people with disabilities.[57] Tim Berners-Lee once noted, "The power of the Web is in its universality. Access by everyone regardless of disability is an essential aspect."[56] Many countries regulate web accessibility as a requirement for websites.[58] International cooperation in the W3C Web Accessibility Initiative led to simple guidelines that web content authors as well as software developers can use to make the Web accessible to persons who may or may not be using assistive technology.[56][59]

Internationalization

The W3C Internationalization Activity assures that web technology will work in all languages, scripts, and cultures.[60] Beginning in 2004 or 2005, Unicode gained ground and eventually in December 2007 surpassed both ASCII and Western European as the Web's most frequently used character encoding.[61] Originally RFC 3986 allowed resources to be identified by URI in a subset of US-ASCII. RFC 3987 allows more characters (any character in the Universal Character Set), and now a resource can be identified by IRI in any language.[62]

Statistics

Between 2005 and 2010, the number of web users doubled, and was expected to surpass two billion in 2010.[63] Early studies in 1998 and 1999 estimating the size of the web using capture/recapture methods showed that much of the web was not indexed by search engines and the web was much larger than expected.[64][65] According to a 2001 study, there were over 550 billion documents on the Web, mostly in the invisible Web, or Deep Web.[66] A 2002 survey of 2,024 million web pages[67] determined that by far the most web content was in the English language: 56.4%; next were pages in German (7.7%), French (5.6%), and Japanese (4.9%).
A more recent study, which used web searches in 75 different languages to sample the web, determined that there were over 11.5 billion web pages in the publicly indexable web as of the end of January 2005.[68] As of March 2009, the indexable web contains at least 25.21 billion pages.[69] On 25 July 2008, Google software engineers Jesse Alpert and Nissan Hajaj announced that Google Search had discovered one trillion unique URLs.[70] As of May 2009, over 109.5 million domains operated.[71][not in citation given] Of these, 74% were commercial or other domains operating in the .com generic top-level domain.[71]

Statistics measuring a website's popularity are usually based either on the number of page views or on associated server 'hits' (file requests) that it receives.

Speed issues

Frustration over congestion issues in the Internet infrastructure and the high latency that results in slow browsing has led to a pejorative name for the World Wide Web: the World Wide Wait.[72] How to speed up the Internet is the subject of ongoing discussion, including the use of peering and QoS technologies. Other solutions to reduce the congestion can be found at W3C.[73] Guidelines for web response times are:[74]
0.1 second (one tenth of a second). Ideal response time. The user does not sense any interruption.
1 second. Highest acceptable response time. Download times above 1 second interrupt the user experience.
10 seconds. Unacceptable response time. The user experience is interrupted and the user is likely to leave the site or system.

Caching

If a user revisits a web page after only a short interval, the page data may not need to be re-obtained from the source web server. Almost all web browsers cache recently obtained data, usually on the local hard drive. HTTP requests sent by a browser will usually ask only for data that has changed since the last download. If the locally cached data are still current, they will be reused. Caching helps reduce the amount of web traffic on the Internet. The decision about expiration is made independently for each downloaded file, whether image, style sheet, JavaScript, HTML, or other web resource. Thus even on sites with highly dynamic content, many of the basic resources need to be refreshed only occasionally. Web site designers find it worthwhile to collate resources such as CSS data and JavaScript into a few site-wide files so that they can be cached efficiently. This helps reduce page download times and lowers demands on the Web server.

There are other components of the Internet that can cache web content. Corporate and academic firewalls often cache Web resources requested by one user for the benefit of all. (See also caching proxy server.) Some search engines also store cached content from websites.

Apart from the facilities built into web servers that can determine when files have been updated and so need to be re-sent, designers of dynamically generated web pages can control the HTTP headers sent back to requesting users, so that transient or sensitive pages are not cached. Internet banking and news sites frequently use this facility.
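The caching decisions described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular browser implements its cache: the names CacheEntry, is_storable, plan_request and apply_response are hypothetical, freshness is reduced to a single expiry time, and only the no-store directive of the Cache-Control header is considered. The If-Modified-Since request header and the 304 Not Modified status are standard HTTP; real caches follow many more rules than shown here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class CacheEntry:
    """One stored response (hypothetical structure for this sketch)."""
    body: bytes
    last_modified: str  # Last-Modified header value, kept verbatim
    expires_at: float   # epoch seconds until which the copy counts as fresh


def is_storable(method: str, cache_control: str = "") -> bool:
    """Store only GET responses that the server has not marked no-store."""
    if method.upper() != "GET":
        return False  # e.g. POST: the response depends on the posted data
    directives = {d.strip().lower() for d in cache_control.split(",")}
    return "no-store" not in directives


def plan_request(entry: Optional[CacheEntry], now: float) -> Tuple[str, Dict[str, str]]:
    """Decide what to do for a URL: fetch it, reuse the copy, or revalidate."""
    if entry is None:
        return "fetch", {}   # nothing cached yet: plain request
    if now < entry.expires_at:
        return "reuse", {}   # cached data still current: no request at all
    # Stale copy: ask the server to send the data only if it has changed.
    return "revalidate", {"If-Modified-Since": entry.last_modified}


def apply_response(entry: Optional[CacheEntry], status: int, body: bytes,
                   last_modified: str, expires_at: float) -> CacheEntry:
    """Update the cache after a response arrives."""
    if status == 304 and entry is not None:
        # Not Modified: keep the stored body, only extend its freshness.
        return CacheEntry(entry.body, entry.last_modified, expires_at)
    return CacheEntry(body, last_modified, expires_at)
```

Because each resource has its own entry and its own expiry time, a site-wide style sheet can stay in the "reuse" state across many page views while a dynamic page marked no-store is never stored at all.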
Data requested with an HTTP 'GET' is likely to be cached if other conditions are met; data obtained in response to a 'POST' is assumed to depend on the data that was POSTed and so is not cached.

References

1. Quittner, Joshua (29 March 1999). "Tim Berners Lee – Time 100 People of the Century". Time Magazine. Retrieved 17 May 2010. "He wove the World Wide Web and created a mass medium for the 21st century. The World Wide Web is Berners-Lee's alone. He designed it. He loosed it on the world. And he more than anyone else has fought to keep it open, nonproprietary and free."
2. Tim Berners-Lee. "Frequently asked questions". World Wide Web Consortium. Retrieved 22 July 2010.
3. "World Wide Web Consortium". "The World Wide Web Consortium (W3C)..."
4. "Frequently asked questions". Retrieved 15 June 2013.
5. "Inventing the Web: Tim Berners-Lee's 1990 Christmas Baby". Seeing the Picture. Retrieved 15 June 2013.
6. WorldWideWeb: Proposal for a HyperText Project. (1990-11-12). Retrieved on 2013-07-17.
7. "Le Web a été inventé... en France!". Le Point. 1 January 2012. Retrieved 2013-04-05.
8. Berners-Lee, Tim; Cailliau, Robert (12 November 1990). "WorldWideWeb: Proposal for a HyperText Project". Retrieved 27 July 2009.
9. Berners-Lee, Tim. "Pre-W3C Web and Internet Background". World Wide Web Consortium. Retrieved 21 April 2009.
10. "Aug. 7, 1991: Ladies and Gentlemen, the World Wide Web". Wired. Retrieved 15 June 2013.
11. von Braun, Wernher (May 1970). "TV Broadcast Satellite". Popular Science: 65–66. Retrieved 12 January 2011.
12. Berners-Lee, Tim (March 1989). "Information Management: A Proposal". W3C. Retrieved 27 July 2009.
13. "Tim Berners-Lee's original World Wide Web browser". "With recent phenomena like blogs and wikis, the web is beginning to develop the kind of collaborative nature that its inventor envisaged from the start."
14. "Tim Berners-Lee: client". Retrieved 27 July 2009.
15. "First Web pages". Retrieved 27 July 2009.
16. Murawski, John (24 May 2013). "Hunt for world's oldest WWW page leads to UNC Chapel Hill". News & Observer.
17. "Short summary of the World Wide Web project". Google. 6 August 1991. Retrieved 27 July 2009.
18. "Silvano de Gennaro disclaims 'the first photo on the web'". Retrieved 27 July 2012. "If you read well our website, it says that it was, to our knowledge, the 'first photo of a band'. Dozens of media are totally distorting our words for the sake of cheap sensationalism. Nobody knows which was the first photo on the web."
19. "W3C timeline". Retrieved 30 March 2010.
20. "About SPIRES". Retrieved 30 March 2010.
21. "The Early World Wide Web at SLAC".
22. "A Little History of the World Wide Web".
23. Conklin, Jeff (1987), IEEE Computer 20 (9): 17–41
24. "Inventor of the Week Archive: The World Wide Web". Massachusetts Institute of Technology: MIT School of Engineering. Retrieved 23 July 2009.
25. "Ten Years Public Domain for the Original Web Software". 30 April 2003. Retrieved 27 July 2009.
26. "Mosaic Web Browser History – NCSA, Marc Andreessen, Eric Bina". Retrieved 27 July 2009.
27. "NCSA Mosaic – September 10, 1993 Demo". Retrieved 27 July 2009.
28. "Vice President Al Gore's ENIAC Anniversary Speech". 14 February 1996. Retrieved 27 July 2009.
29. "Internet legal definition of Internet". West's Encyclopedia of American Law, edition 2. Free Online Law Dictionary. 15 July 2009. Retrieved 25 November 2008.
30. "WWW (World Wide Web) Definition". TechTerms. Retrieved 19 February 2010.
31. "The W3C Technology Stack". World Wide Web Consortium. Retrieved 21 April 2009.
32. Muylle, Steve; Rudy Moenaert, Marc Despont (1999). "A grounded theory of World Wide Web search behaviour". Journal of Marketing Communications 5 (3): 143.
33. Hamilton, Naomi (31 July 2008). "The A-Z of Programming Languages: JavaScript". Computerworld. IDG. Retrieved 12 May 2009.
34. Buntin, Seth (23 September 2008). "jQuery Polling plugin". Retrieved 2009-08-22.
35. Berners-Lee, Tim. "Frequently asked questions by the Press". W3C. Retrieved 27 July 2009.
36. Palazzi, P (2011) 'The Early Days of the WWW at CERN'
37. "automatically adding". mozillaZine. 16 May 2003. Retrieved 27 May 2009.
38. Masnick, Mike (7 July 2008). "Microsoft Patents Adding 'www.' And '.com' To Text". Techdirt. Retrieved 27 May 2009.
39. "MDBG Chinese-English dictionary – Translate". Retrieved 27 July 2009.
40. "Frequently asked questions by the Press – Tim BL". Retrieved 27 July 2009.
41. "It's not your grandfather's Internet". Strategic Finance. 2010.
42. boyd, danah; Hargittai, Eszter (July 2010). "Facebook privacy settings: Who cares?". First Monday (University of Illinois at Chicago) 15 (8).
43. Ben-Itzhak, Yuval (18 April 2008). "Infosecurity 2008 – New defence strategy in battle against e-crime". ComputerWeekly (Reed Business Information). Retrieved 20 April 2008.
44. Christey, Steve and Martin, Robert A. (22 May 2007). "Vulnerability Type Distributions in CVE (version 1.1)". MITRE Corporation. Retrieved 7 June 2008.
45. Symantec Internet Security Threat Report: Trends for July–December 2007 (Executive Summary) (PDF) XIII. Symantec Corp. April 2008. pp. 1–2. Retrieved 11 May 2008.
46. "Google searches web's dark side". BBC News. 11 May 2007. Retrieved 26 April 2008.
47. "Security Threat Report" (PDF). Sophos. Q1 2008. Retrieved 24 April 2008.
48. "Security threat report" (PDF). Sophos. July 2008. Retrieved 24 August 2008.
49. Fogie, Seth, Jeremiah Grossman, Robert Hansen, and Anton Rager (2007). Cross Site Scripting Attacks: XSS Exploits and Defense (PDF). Syngress, Elsevier Science & Technology. pp. 68–69, 127. ISBN 1-59749-154-3. Archived from the original on 25 June 2008. Retrieved 6 June 2008.
50. O'Reilly, Tim (30 September 2005). "What Is Web 2.0". O'Reilly Media. pp. 4–5. Retrieved 4 June 2008. and AJAX web applications can introduce security vulnerabilities like "client-side security controls, increased attack surfaces, and new possibilities for Cross-Site Scripting (XSS)", in Ritchie, Paul (March 2007). "The security risks of AJAX/web 2.0 applications" (PDF). Infosecurity (Elsevier). Archived from the original on 25 June 2008. Retrieved 6 June 2008. which cites Hayre, Jaswinder S. and Kelath, Jayasankar (22 June 2006). "Ajax Security Basics". SecurityFocus. Retrieved 6 June 2008.
51. Berinato, Scott (1 January 2007). "Software Vulnerability Disclosure: The Chilling Effect". CSO (CXO Media). p. 7. Archived from the original on 18 April 2008. Retrieved 7 June 2008.
52. Prince, Brian (9 April 2008). "McAfee Governance, Risk and Compliance Business Unit". eWEEK (Ziff Davis Enterprise Holdings). Retrieved 25 April 2008.
53. Preston, Rob (12 April 2008). "Down To Business: It's Past Time To Elevate The Infosec Conversation". InformationWeek (United Business Media). Retrieved 25 April 2008.
54. Claburn, Thomas (6 February 2007). "RSA's Coviello Predicts Security Consolidation". InformationWeek (United Business Media). Retrieved 25 April 2008.
55. Duffy Marsan, Carolyn (9 April 2008). "How the iPhone is killing the 'Net". Network World (IDG). Retrieved 17 April 2008.
56. "Web Accessibility Initiative (WAI)". World Wide Web Consortium. Retrieved 7 April 2009.
57. "Developing a Web Accessibility Business Case for Your Organization: Overview". World Wide Web Consortium. Retrieved 7 April 2009.
58. "Legal and Policy Factors in Developing a Web Accessibility Business Case for Your Organization". World Wide Web Consortium. Retrieved 7 April 2009.
59. "Web Content Accessibility Guidelines (WCAG) Overview". World Wide Web Consortium. Retrieved 7 April 2009.
60. "Internationalization (I18n) Activity". World Wide Web Consortium. Retrieved 10 April 2009.
61. Davis, Mark (5 April 2008). "Moving to Unicode 5.1". Google. Retrieved 10 April 2009.
62. "World Wide Web Consortium Supports the IETF URI Standard and IRI Proposed Standard" (Press release). World Wide Web Consortium. 26 January 2005. Retrieved 10 April 2009.
63. Lynn, Jonathan (19 October 2010). "Internet users to exceed 2 billion ...". Reuters. Retrieved 9 February 2011.
64. S. Lawrence, C.L. Giles, "Searching the World Wide Web," Science, 280(5360), 98–100, 1998.
65. S. Lawrence, C.L. Giles, "Accessibility of Information on the Web," Nature, 400, 107–109, 1999.
66. "The 'Deep' Web: Surfacing Hidden Value". Archived from the original on 4 April 2008. Retrieved 27 July 2009.
67. "Distribution of languages on the Internet". Retrieved 27 July 2009.
68. Alessio Signorini. "Indexable Web Size". Retrieved 27 July 2009.
69. "The size of the World Wide Web". Retrieved 27 July 2009.
70. Alpert, Jesse; Hajaj, Nissan (25 July 2008). "We knew the web was big...". The Official Google Blog.
71. "Domain Counts & Internet Statistics". Name Intelligence. Retrieved 17 May 2009.
72. "World Wide Wait". TechEncyclopedia. United Business Media. Retrieved 10 April 2009.
73. Khare, Rohit and Jacobs, Ian (1999). "W3C Recommendations Reduce 'World Wide Wait'". World Wide Web Consortium. Retrieved 10 April 2009.
74. Nielsen, Jakob (from Miller 1968; Card et al. 1991) (1994). "5". Usability Engineering: Response Times: The Three Important Limits. Morgan Kaufmann. Retrieved 10 April 2009.

Further reading
Niels Brügger, ed. Web History (2010) 362 pages; Historical perspective on the World Wide Web, including issues of culture, content, and preservation.
Fielding, R.; Gettys, J.; Mogul, J.; Frystyk, H.; Masinter, L.; Leach, P.; Berners-Lee, T. (June 1999). Hypertext Transfer Protocol – HTTP/1.1. Request For Comments 2616. Information Sciences Institute.
Berners-Lee, Tim; Bray, Tim; Connolly, Dan; Cotton, Paul; Fielding, Roy; Jeckle, Mario; Lilley, Chris; Mendelsohn, Noah; Orchard, David; Walsh, Norman; Williams, Stuart (15 December 2004). Architecture of the World Wide Web, Volume One. Version 20041215. W3C.
Polo, Luciano (2003). "World Wide Web Technology Architecture: A Conceptual Analysis". New Devices. Retrieved 31 July 2005.
Skau, H.O. (March 1990). "The World Wide Web and Health Information". New Devices. Retrieved 1989.

External links

Early archive of the first Web site
Internet Statistics: Growth and Usage of the Web and the Internet
Living Internet: A comprehensive history of the Internet, including the World Wide Web.
Web Design and Development at the Open Directory Project
World Wide Web Consortium (W3C)
W3C Recommendations Reduce "World Wide Wait"
World Wide Web Size: Daily estimated size of the World Wide Web.
Antonio A. Casilli, Some Elements for a Sociology of Online Interactions
The Erdős Webgraph Server offers weekly updated graph representation of a constantly increasing fraction of the WWW.

History of the World Wide Web

From Wikipedia, the free encyclopedia
Today, the Web and the Internet allow connectivity from literally everywhere on earth, even ships at sea and in outer space.

The World Wide Web ("WWW" or simply the "Web") is a global information medium which users can read and write via computers connected to the Internet. The term is often mistakenly used as a synonym for the Internet itself, but the Web is a service that operates over the Internet, just as e-mail also does. The history of the Internet dates back significantly further than that of the World Wide Web.

The hypertext portion of the Web in particular has an intricate intellectual history; notable influences and precursors include Vannevar Bush's Memex,[1] IBM's Generalized Markup Language,[2] and Ted Nelson's Project Xanadu.[1]

The concept of a home-based global information system goes at least as far back as "A Logic Named Joe", a 1946 short story by Murray Leinster, in which computer terminals, called "logics," were in every home. Although the computer system in the story is centralized, the story captures some of the feeling of the ubiquitous information explosion driven by the Web.

Contents

1 1979–1991: Development of the World Wide Web
2 1992–1995: Growth of the WWW
2.1 Early browsers
2.2 Web organization
3 1996–1998: Commercialization of the WWW
4 1999–2001: "Dot-com" boom and bust
5 2002–present: The Web becomes ubiquitous
5.1 Web 2.0
6 See also
7 References
8 Footnotes
9 External links

1979–1991: Development of the World Wide Web

"In August, 1984 I wrote a proposal to the SW Group Leader, Les Robertson, for the establishment of a pilot project to install and evaluate TCP/IP protocols on some key non-Unix machines at CERN ... By 1990 CERN had become the largest Internet site in Europe and this fact... positively influenced the acceptance and spread of Internet techniques both in Europe and elsewhere... A key result of all these happenings was that by 1989 CERN's Internet facility was ready to become the medium within which Tim Berners-Lee would create the World Wide Web with a truly visionary idea..."
Ben Segal. Short History of Internet Protocols at CERN, April 1995 [3]

The NeXTcube used by Tim Berners-Lee at CERN became the first Web server.

In 1980, Tim Berners-Lee, an independent contractor at the European Organization for Nuclear Research (CERN), Switzerland, built ENQUIRE as a personal database of people and software models, but also as a way to play with hypertext; each new page of information in ENQUIRE had to be linked to an existing page.[1]

In 1984 Berners-Lee returned to CERN and considered its problems of information presentation: physicists from around the world needed to share data, yet had no common machines and no common presentation software. He wrote a proposal in March 1989 for "a large hypertext database with typed links", but it generated little interest. His boss, Mike Sendall, encouraged Berners-Lee to begin implementing his system on a newly acquired NeXT workstation.[4] He considered several names, including Information Mesh,[5] The Information Mine (turned down because it abbreviates to TIM, the name of the WWW's creator) and Mine of Information (turned down because it abbreviates to MOI, which is "Me" in French), but settled on World Wide Web.[6]
Robert Cailliau, Jean-François Abramatic and Tim Berners-Lee at the 10th anniversary of the WWW Consortium.

He found an enthusiastic collaborator in Robert Cailliau, who rewrote the proposal (published on November 12, 1990) and sought resources within CERN. Berners-Lee and Cailliau pitched their ideas to the European Conference on Hypertext Technology in September 1990, but found no vendors who could appreciate their vision of marrying hypertext with the Internet.

By Christmas 1990, Berners-Lee had built all the tools necessary for a working Web: the HyperText Transfer Protocol (HTTP) 0.9, the HyperText Markup Language (HTML), the first Web browser (named WorldWideWeb, which was also a Web editor), the first HTTP server software (later known as CERN httpd), the first web server, and the first Web pages, which described the project itself. The browser could access Usenet newsgroups and FTP files as well. However, it could run only on the NeXT; Nicola Pellow therefore created a simple text browser, called the Line Mode Browser, that could run on almost any computer.[7] To encourage use within CERN, Bernd Pollermann put the CERN telephone directory on the web; previously users had to log onto the mainframe in order to look up phone numbers.[7]

According to Tim Berners-Lee, the Web was mainly invented in Building 31 at CERN (46.2325°N 6.0450°E) but also at home, in the two houses he lived in during that time (one in France, one in Switzerland).[8]

In January 1991 the first Web servers outside CERN itself were switched on.[9]

The first web page may be lost, but Paul Jones of UNC-Chapel Hill in North Carolina revealed in May 2013 that he has a copy of a page sent to him in 1991 by Berners-Lee, which is the oldest known web page.
Jones stored the plain-text page, with hyperlinks, on a floppy disk and on his NeXT computer.[10]

On August 6, 1991,[11] Berners-Lee posted a short summary of the World Wide Web project on the alt.hypertext newsgroup.[12] This date also marked the debut of the Web as a publicly available service on the Internet, although new users could access it only after August 23. For this reason August 23 is considered Internaut Day.

"The WorldWideWeb (WWW) project aims to allow all links to be made to any information anywhere. [...] The WWW project was started to allow high energy physicists to share data, news, and documentation. We are very interested in spreading the web to other areas, and having gateway servers for other data. Collaborators welcome!"
(from Tim Berners-Lee's first message)

Paul Kunz from the Stanford Linear Accelerator Center visited CERN in September 1991, and was captivated by the Web. He brought the NeXT software back to SLAC, where librarian Louise Addis adapted it for the VM/CMS operating system on the IBM mainframe as a way to display SLAC's catalog of online documents;[7] this was the first web server outside of Europe and the first in North America.[13] The www-talk mailing list was started in the same month.[9]

An early CERN-related contribution to the Web was the parody band Les Horribles Cernettes, whose promotional image is believed to be among the Web's first five pictures.[14]

1992–1995: Growth of the WWW

In keeping with its birth at CERN, early adopters of the World Wide Web were primarily university-based scientific departments or physics laboratories such as Fermilab and SLAC. By January 1993 there were fifty Web servers across the world; by October 1993 there were over five hundred.[9]

Early websites intermingled links for both the HTTP web protocol and the then-popular Gopher protocol, which provided access to content through hypertext menus presented as a file system rather than through HTML files. Early Web users would navigate either by bookmarking popular directory pages, such as Berners-Lee's first site, or by consulting updated lists such as the NCSA "What's New" page. Some sites were also indexed by WAIS, enabling users to submit full-text searches similar to the capability later provided by search engines.

There was still no graphical browser available for computers besides the NeXT.
This gap was discussed in January 1992,[9] and filled in April 1992 with the release of Erwise, an application developed at the Helsinki University of Technology, and in May by ViolaWWW, created by Pei-Yuan Wei, which included advanced features such as embedded graphics, scripting, and animation.[7] ViolaWWW was originally an application for HyperCard. Both programs ran on the X Window System for Unix.[7]

Students at the University of Kansas adapted an existing text-only hypertext browser, Lynx, to access the web. Lynx was available on Unix and DOS, and some web designers, unimpressed with glossy graphical websites, held that a website not accessible through Lynx wasn't worth visiting.

Early browsers

The turning point for the World Wide Web was the introduction[15] of the Mosaic web browser[16] in 1993, a graphical browser developed by a team at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign (UIUC), led by Marc Andreessen. Funding for Mosaic came from the High-Performance Computing and Communications Initiative, a funding program initiated by then-Senator Al Gore's High Performance Computing and Communication Act of 1991, also known as the Gore Bill.[17]
Remarkably, the first Mosaic browser lacked a "back button", a feature proposed in 1992–93 by the same individual who invented the concept of clickable text documents. The request was emailed from the University of Texas computing facility. The browser was intended to be an editor and not simply a viewer, but was to work with computer-generated hypertext lists called "search engines".

The origins of Mosaic date to 1992. In November 1992, the NCSA at the University of Illinois (UIUC) established a website. In December 1992, Andreessen and Eric Bina, students attending UIUC and working at the NCSA, began work on Mosaic. They released an X Window browser in February 1993. It gained popularity due to its strong support of integrated multimedia, and the authors' rapid response to user bug reports and recommendations for new features.

The first Microsoft Windows browser was Cello, written by Thomas R. Bruce for the Legal Information Institute at Cornell Law School to provide legal information, since lawyers more often had access to Windows than to Unix. Cello was released in June 1993.[7] The NCSA released Mac Mosaic and WinMosaic in August 1993.[9]

After graduation from UIUC, Andreessen and James H. Clark, former CEO of Silicon Graphics, met and formed Mosaic Communications Corporation to develop the Mosaic browser commercially. The company changed its name to Netscape in April 1994, and the browser was developed further as Netscape Navigator.

Web organization

In May 1994, the first International WWW Conference, organized by Robert Cailliau,[18][19] was held at CERN;[20] the conference has been held every year since. In April 1993, CERN had agreed that anyone could use the Web protocol and code royalty-free; this was in part a reaction to the perturbation caused by the University of Minnesota's announcement that it would begin charging license fees for its implementation of the Gopher protocol.
In September 1994, Berners-Lee founded the World Wide Web Consortium (W3C) at the Massachusetts Institute of Technology with support from the Defense Advanced Research Projects Agency (DARPA) and the European Commission. It comprised various companies that were willing to create standards and recommendations to improve the quality of the Web. Berners-Lee made the Web available freely, with no patent and no royalties due. The W3C decided that its standards must be based on royalty-free technology, so they can be easily adopted by anyone.

By the end of 1994, while the total number of websites was still minute compared to present standards, quite a number of notable websites were already active, many of which are the precursors or inspiring examples of today's most popular services.

1996–1998: Commercialization of the WWW
By 1996 it became obvious to most publicly traded companies that a public Web presence was no longer optional. Though at first people saw mainly the possibilities of free publishing and instant worldwide information, increasing familiarity with two-way communication over the "Web" led to the possibility of direct Web-based commerce (e-commerce) and instantaneous group communications worldwide. More dotcoms, displaying products on hypertext webpages, were added to the Web.

1999–2001: "Dot-com" boom and bust

Low interest rates in 1998–99 facilitated an increase in start-up companies. Although a number of these new entrepreneurs had realistic plans and administrative ability, most of them lacked these characteristics but were able to sell their ideas to investors because of the novelty of the dot-com concept.

Historically, the dot-com boom can be seen as similar to a number of other technology-inspired booms of the past, including railroads in the 1840s, automobiles in the early 20th century, radio in the 1920s, television in the 1940s, transistor electronics in the 1950s, computer time-sharing in the 1960s, and home computers and biotechnology in the early 1980s.

In 2001 the bubble burst, and many dot-com startups went out of business after burning through their venture capital and failing to become profitable. Many others, however, did survive and thrive in the early 21st century. Many companies which began as online retailers blossomed and became highly profitable. More conventional retailers found online merchandising to be a profitable additional source of revenue. While some online entertainment and news outlets failed when their seed capital ran out, others persisted and eventually became economically self-sufficient.
Traditional media outlets (newspaper publishers, broadcasters and cablecasters in particular) also found the Web to be a useful and profitable additional channel for content distribution, and an additional means to generate advertising revenue. The sites that survived and eventually prospered after the bubble burst had two things in common: a sound business plan, and a niche in the marketplace that was, if not unique, particularly well-defined and well-served.

2002–present: The Web becomes ubiquitous

In the aftermath of the dot-com bubble, telecommunications companies had a great deal of overcapacity as many Internet business clients went bust. That, plus ongoing investment in local cell infrastructure, kept connectivity charges low and helped to make high-speed Internet connectivity more affordable. During this time, a handful of companies found success developing business models that helped make the World Wide Web a more compelling experience. These include airline booking sites, Google's search engine and its profitable approach to simplified, keyword-based advertising, as well as eBay's do-it-yourself auction site and an online department store.

This new era also begot social networking websites, such as MySpace and Facebook, which, though unpopular at first, very rapidly gained acceptance and became a major part of youth culture.

Web 2.0
Beginning in 2002, new ideas for sharing and exchanging content ad hoc, such as Weblogs and RSS, rapidly gained acceptance on the Web. This new model for information exchange, primarily featuring DIY user-edited and user-generated websites, was dubbed Web 2.0.

The Web 2.0 boom saw many new service-oriented startups catering to a new, democratized Web. Some believe it will be followed by the full realization of a Semantic Web.

Tim Berners-Lee originally expressed the vision of the Semantic Web as follows:[21]

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A 'Semantic Web', which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The 'intelligent agents' people have touted for ages will finally materialize.
Tim Berners-Lee, 1999

Predictably, as the World Wide Web became easier to query, attained a higher degree of usability, and shed its esoteric reputation, it gained a sense of organization and unsophistication which opened the floodgates and ushered in a rapid period of popularization. New sites such as Wikipedia and its sister projects proved revolutionary in executing the user-edited content concept. In 2005, three ex-PayPal employees formed a video viewing website called YouTube. Only a year later, YouTube had proved to be the most quickly popularized website in history, and even started a new concept of user-submitted content in major events, as in the CNN-YouTube Presidential Debates.

The popularity of YouTube, Facebook, etc., combined with the increasing availability and affordability of high-speed connections, has made video content far more common on all kinds of websites.
Many video-content hosting and creation sites provide an easy means for their videos to be embedded on third-party websites without payment or permission.

This combination of more user-created or edited content and easy means of sharing content, such as via RSS widgets and video embedding, has led to many sites with a typical "Web 2.0" feel. They have articles with embedded video, user-submitted comments below the article, and RSS boxes to the side, listing some of the latest articles from other sites.

Continued extension of the World Wide Web has focused on connecting devices to the Internet, an area termed Intelligent Device Management. As Internet connectivity becomes ubiquitous, manufacturers have started to leverage the expanded computing power of their devices to enhance their usability and capability. Through Internet connectivity, manufacturers are now able to interact with the devices they have sold and shipped to their customers, and customers are able to interact with the manufacturer (and other providers) to access new content.

Lending credence to the idea of the ubiquity of the web, Web 2.0 has found a place in the global English lexicon. On June 10, 2009 the Global Language Monitor declared it to be the one-millionth English word.[22]

See also
Hypermedia
Linked Data
Semantic Web
Tim Berners-Lee
Computer Lib / Dream Machines
History of hypertext
History of the web browser

References

Berners-Lee, Tim; Fischetti, Mark (1999) Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor, ISBN 978-0-06-251586-5, HarperSanFrancisco (978-0-06251587-X (pbk.), HarperSanFrancisco, 2000)
Cailliau, Robert; Gillies, James (2000) How the Web Was Born: The Story of the World Wide Web, ISBN 978-0-19-286207-5, Oxford University Press
Graham, Ian S. (1995) The HTML Sourcebook: The Complete Guide to HTML. New York: John Wiley and Sons
Herman, Andrew (2000) The World Wide Web and Contemporary Cultural Theory: Magic, Metaphor, Power, ISBN 978-0-415-92502-0, Routledge
Raggett, Dave; Lam, Jenny; Alexander, Ian (1996) HTML 3. Electronic Publishing on the World Wide Web, ISBN 978-0-201-87693-0, Addison-Wesley

Footnotes

1. Berners-Lee, Tim. "Frequently asked questions - Start of the web: Influences". World Wide Web Consortium. Retrieved 22 July 2010.
2. ^ Berners-Lee, Tim. "Frequently asked questions - Why the //, #, etc?". World Wide Web Consortium. Retrieved 22 July 2010.
3. ^ A Short History of Internet Protocols at CERN by Ben Segal. 1995
4. ^ The Next Crossroad of Web History by Gregory Gromov
5. ^ Berners-Lee, Tim (May 1990). "Information Management: A Proposal". World Wide Web Consortium. Retrieved 24 August 2010.
6. ^ Tim Berners-Lee, Weaving the Web, HarperCollins, 2000, p.23
7. ^ a b c d e f Berners-Lee, Tim (ca 1993/1994). "A Brief History of the Web". World Wide Web Consortium. Retrieved 17 August 2010.
8. ^ [1] Tim Berners-Lee's account of the exact locations at CERN where the Web was invented
9. ^ a b c d e Raggett et al, 1996. p. 21
10. ^ Murawski, John (24 May 2013). "Hunt for world's oldest WWW page leads to UNC Chapel Hill". News & Observer.
11. ^ How the web went world wide, Mark Ward, Technology Correspondent, BBC News. Retrieved 24 January 2011
12. ^ Berners-Lee, Tim. "Qualifiers on Hypertext links... - alt.hypertext". Retrieved 11 July 2012.
13. ^ Tim Berners-Lee, Weaving the Web, HarperCollins, 2000, p.46
14. ^ Heather McCabe (1999-02-09). "Grrl Geeks Rock Out". Wired magazine.
15. ^ Mosaic Web Browser History – NCSA, Marc Andreessen, Eric Bina
16. ^ NCSA Mosaic – September 10, 1993 Demo
17. ^ Vice President Al Gore's ENIAC Anniversary Speech.
18. ^ Robert Cailliau (21 July 2010). "A Short History of the Web". NetValley. Retrieved 21 July 2010.
19. ^ Tim Berners-Lee. "Frequently asked questions - Robert Cailliau's role". World Wide Web Consortium. Retrieved 22 July 2010.
20. ^ "IW3C2 - Past and Future Conferences". International World Wide Web Conferences Steering Committee. 2010-05-02. Retrieved 16 May 2010.
21. ^ Berners-Lee, Tim; Fischetti, Mark (1999). Weaving the Web. HarperSanFrancisco. chapter 12. ISBN 978-0-06251587-2.
22. ^ "'Millionth English Word' declared"
Berghel, H., and Blank, D. (1999) The World Wide Web. In Advances in Computing, M. Zelkowitz (Ed). Academic Press, NY.

THE WORLD WIDE WEB
Hal Berghel & Douglas Blank
PREPRINT
Department of Computer Science, University of Arkansas

ABSTRACT

This article provides a high-level overview of the World Wide Web in the context of a wide range of other Internet information access and delivery services. This overview will include client-side, server-side and "user-side" perspectives. Underlying Web technologies as well as current technology extensions to the Web will also be covered. Social implications of Web technology will also be addressed.

TABLE OF CONTENTS

1. INTRODUCTION
2. THE INTERNET INFRASTRUCTURE
3. THE SUCCESS OF THE WEB
4. PERSPECTIVES
   4.1 END USERS' PERSPECTIVE
   4.2 HISTORICAL PERSPECTIVE
5. THE UNDERLYING TECHNOLOGIES
   5.1 HYPERTEXT MARKUP LANGUAGE (HTML)
   5.2 HYPERTEXT TRANSFER PROTOCOL (HTTP)
6. DYNAMIC WEB TECHNOLOGIES
   6.1 COMMON GATEWAY INTERFACE
   6.2 FORMS
   6.3 HELPER-APPS
   6.4 PLUG-INS
   6.5 EXECUTABLE CONTENT
   6.6 PROGRAMMING
   6.7 DHTML
   6.8 SERVER SIDE INCLUDES
   6.9 PUSH TECHNOLOGIES
   6.10 STATE PARAMETERS
7. SECURITY and PRIVACY
   7.1 SECURE SOCKET LAYER
   7.2 SECURE HTTP (S-HTTP)
   7.3 COOKIES
8. THE WEB AS A SOCIAL PHENOMENON
9. CONCLUSION

1. INTRODUCTION

The World Wide Web, or "the Web," is a "finite but unbounded" collection of media-rich digital resources which are connected through high-speed digital networks. It relies upon an Internet protocol suite (see [9][19]) which supports the cross-platform transmission and rendering of a wide variety of media types (i.e., multimedia). This cross-platform delivery environment represents an important departure from more traditional network communications protocols like Email, Telnet and FTP because it is content-centric. It is also to be distinguished from earlier document acquisition systems such as Gopher and WAIS (Wide Area Information Systems), which accommodated a narrower range of media formats and failed to include hyperlinks within their network navigation protocols. Following Gopher, the Web quickly extended and enriched the metaphor of integrated browsing and navigation. This made it possible to navigate and peruse a wide variety of media types on the Web effortlessly, which in turn led to the Web's hegemony as an Internet protocol.

Thus, while earlier network protocols were special-purpose in terms of both function and media formats, the Web is highly versatile. It became the first convenient form of digital communication which had sufficient rendering and browsing utilities to allow any person or group with network access to share media-rich information with their peers. It also became the standard for hyperlinking cybermedia (cyberspace multimedia), connecting concept to source in manifold directions identified primarily by Uniform Resource Locators (URLs).

In a formal sense, the Web is a client-server model for packet-switched, networked computer systems defined by the protocol pair Hypertext Transfer Protocol (HTTP) and Hypertext Markup Language (HTML). HTTP is the primary transport protocol of the Web, while HTML defines the organization and structure of the Web documents to be exchanged. At this writing, the current HTTP standard is at version 1.0, and the current HTML version is 4.0.

HTTP and HTML are higher-order Internet protocols specifically created for the Web. In addition, the Web must also utilize the lower-level Internet protocols, Internet Protocol (IP) and Transmission Control Protocol (TCP). The basic Internet protocol suite is thus designated TCP/IP. IP determines how datagrams will be exchanged via packet-switched networks, while TCP builds upon IP by adding control and reliability checking [9][20].

According to NSFNET Backbone statistics, the Web moved into first place both in terms of the percentage of total packets moved (21%) and percentage of total bytes moved (26%) along the NSF backbone in the first few months of 1995. This placed the Web well ahead of the traditional Internet activity leaders, FTP (14%/21%) and Telnet (7.5%/2.5%), as the most popular Internet service. A comparison of the evolutionary patterns of the Web, Gopher and FTP is depicted graphically in Figure 1.

Figure 1: Merit NIC Backbone statistics for the Web, Gopher and FTP from 1993-1995 in terms of both packet and byte counts. (source: Merit NIC and Jim Pitkow [18], used with permission)
2. THE INTERNET INFRASTRUCTURE

The Web's evolution should be thought of as an extension of the digital computer network technology which began in the 1960's. Localized, platform-dependent, low-performance networks became prevalent in the 1970's. These LANs (local area networks) were largely independent of, and incompatible with, each other. In a quest for technology which could integrate these individual LANs, the U.S. Department of Defense, through its Advanced Research Projects Agency (ARPA, later DARPA), funded research in inter-networking - or inter-connecting LANs via a Wide Area Network (aka WAN). The first national network which resulted from this project was called, not surprisingly, ARPANET. For most of the 1970's and 1980's ARPANET served as the primary network backbone in use for interconnecting LANs for both the research community and the U.S. Government.

At least two factors considerably advanced the interest in the ARPANET project. First, and foremost, it was an "open" architecture: its underlying technological requirements and software specifications were available for anyone to see. As a result, it became an attractive alternative to developers who bristled at the notion of developing software which would run on only a certain subset of available platforms. Second, ARPANET was built upon a robust, highly versatile and enormously popular protocol suite: TCP/IP (Transmission Control Protocol / Internet Protocol). The success and stability of TCP/IP elevated it to the status of de facto standard for inter-networking. The U.S. military began using ARPANET in earnest in the early 1980's, with the research community following suit. Since TCP/IP software was in essence in the public domain, a frenzy of activity in deploying TCP/IP soon resulted in both government and academe. One outgrowth was the NSF-sponsored CSNET which linked computer science departments together.
By the end of the 1980's, virtually everyone who wanted to be inter-networked could gain access through government or academic institutions. ARPANET gradually evolved into the Internet, and the rest, as they say, is history.

Not unexpectedly, the rapid (in fact, exponential) growth produced some problems. First and foremost was the problem of scalability. The original ARPANET backbone was unable to carry the network traffic by the mid-1980's. It was replaced by a newer backbone supported by the NSF, a backbone operated by a consortium of IBM, MCI and Merit shortly thereafter, and finally by the privatized, not-for-profit corporation, Advanced Networks and Services (ANS), which consisted of the earlier NSFNET consortium members. In the mid 1990's, MCI corporation deployed the very high-speed Backbone Network Service (vBNS), which completed the trend toward privatization of the digital backbone networks. The next stage of evolution for inter-network backbones is likely to be an outgrowth of the much-discussed Internet II project proposed to the U.S. Congress in 1997. This
"next generation" Internet is expected to increase the available bandwidth over the backbone by two orders of magnitude.

Of course, there were other network environments besides the Internet which have met with varying degrees of success. Bitnet was a popular alternative for IBM mainframe customers during the 1970's and early 1980's, as was UUCP for the Unix environment and the Email-oriented FIDONET. Europeans, meanwhile, used an alternative network protocol, X.25, for several of their networks (the Joint Academic Network (JANET) and the European Academic and Research Network (EARN)). By 1991, however, the enormous popularity of the Internet drove even recalcitrant foreign network providers into the Internet camp. High-speed, reliable Internet connectivity was assured with the European Backbone (EBONE) project. At this writing all but a handful of developing countries have some form of Internet connectivity. For the definitive overview of the Internet, see [9].

3. THE SUCCESS OF THE WEB

It has been suggested [2] that the rapid deployment of the Web is a result of a unique combination of characteristics:

1. the Web is an enabling technology - The Web was the first widespread network technology to extend the notion of "virtual network machine" to multimedia. While the ability to execute programs on, and retrieve content from, distributed computers was not new (e.g., Telnet and FTP were already in wide use by the time that the Web was conceived), the ability to produce and distribute media-rich documents via a common, platform-independent document structure was new to the Web.

2. the Web is a unifying technology - The unification came through the Web's accommodation of a wide range of multimedia formats. Since such audio (e.g., .WAV, .AU), graphics (e.g., .GIF, .JPG) and animation (e.g., MPEG) formats are all digital, they were already unified in desktop applications prior to the Web. The Web, however, unified them for distributed, network applications.
One Web "browser", as it later came to be called, would correctly render dozens of media formats regardless of network source. In addition, the Web not only unifies access to many differing multimedia formats, but also provides a platform-independent protocol which allows anyone, regardless of hardware or operating system, access to that media.

3. the Web is a social phenomenon - The Web social experience evolved in three stages. Stage one was the phenomenon of Web "surfing". The richness and variety of Web documents and the novelty of the experience made Web surfing the de facto standard for curiosity-driven networking behavior in the 1990's. The second stage involved such Web interactive communication forums as Internet Relay Chat (IRC), which provided a new outlet for interpersonal but not-in-person communication. The third stage, which is in its infancy as of this writing, involves the notion of virtual community. The
widespread popularity and social implications of such network-based, interactive communication is becoming an active area in computing research.

4. PERSPECTIVES

4.1 END USERS' PERSPECTIVE

Extensive reporting on Web use and Web users may be found in a number of Web survey sites. Perhaps the most thorough of these is the biannual, self-selection World Wide Web Survey which began in January, 1994 (see reference, below). As this article is being written, the most current Web Survey is the eighth (October, 1997). Selected summary data appear in the table below:

TABLE 1. Summary Information on Web use

   average age of Web user = 35.7 years
   male:female ratio of users = 62:38
   % of users with college degrees = 46.9
   % in computing field = 20.6; 23.4% are in education; 11.7% in management
   % of users from U.S. = 80.5 (and slowly decreasing)
   % of users who connect via modems with transmission speeds of 33.5Kb/sec or less = 55
   % of respondents who reported using the Web for purchases exceeding $100 = 39
   % of users for whom English is the primary language = 93.1
   % of users who have Internet bank accounts = 5.5
   % of Microsoft Windows platforms = 64.5 (Apple = 25.6%)
   % of users who plan to use Netscape = 60 (Internet Explorer = 15%)

Source: GVU's WWW User Surveys, used with permission.

Of course a major problem with self-selection surveys, where subjects determine whether, or to what degree, they wish to participate in the survey, is that the samples are likely to be biased. In the case of the Web survey, for example, the authors recommend that readers assume biases towards experienced users. As a consequence, they recommend that readers confirm the results through random sample surveys. Despite these limitations, however, the Web Surveys are widely used and referenced and are among our best sources of information on Web use.
An interesting byproduct of these surveys will be an increased understanding of the difference between traditional and electronic surveying methodologies and a concern over possible population distortions under a new, digital lens. One may
only conjecture at this point whether telephone respondents behave similarly to network respondents in survey settings. In addition, Web surveyors will develop new techniques for non-biased sampling that avoid the biases inherent in self-selection. The science and technology behind such electronic sampling may well be indispensable for future generations of Internet marketers, communicators, and organizers.

4.2 HISTORICAL PERSPECTIVE

The Web was conceived by Tim Berners-Lee and his colleagues at CERN (now called the European Laboratory for Particle Physics) in 1989 as a shared information space which would support collaborative work. Berners-Lee defined HTTP and HTML at that time. As a proof-of-concept prototype, he developed the first Web client navigator-browser in 1990 for the NeXTStep platform. Nicola Pellow developed the first cross-platform Web browser in 1991, while Berners-Lee and Bernd Pollerman developed the first server application - a phone book database. By 1992, the interest in the Web was sufficient to produce four additional browsers - Erwise, Midas, and Viola for X Windows, and Cello for Windows. The following year, Marc Andreessen of the National Center for Supercomputing Applications (NCSA) wrote Mosaic for the X Windows System, which soon became the browser standard against which all others would be compared. Andreessen went on to co-found Netscape Communications in 1994, whose browser, Netscape Navigator, remains the current de facto standard Web browser despite continuous loss of market share to Microsoft's Internet Explorer in recent years (see Figure 2). Netscape has also announced plans to license without cost the source code for version 5.0, to be released in Spring, 1998. At this point it is unclear what effect the move to "open sources" may have (see Figures 2 and 3).
Figure 2: Market share of the three dominant Web browsers from 1994 through 1997.

Despite the original design goal of supporting collaborative work, Web use has become highly variegated. The Web has been extended into a wide range of products and services offered by individuals and organizations, for commerce, education, entertainment, "edutainment", and even propaganda. A partial list of popular Web applications includes:

   individual and organizational homepages
   sales prospecting via interactive forms-based surveys
   advertising and the distribution of product promotional material
   new product information, product updates, product recall notices
   product support - manuals, technical support, frequently asked questions (FAQs)
   corporate record-keeping - usually via local area networks (LANs) and intranets
   electronic commerce made possible with the advent of several secure HTTP transmission protocols and electronic banking which can handle small charges (perhaps at the level of millicents)
   religious proselytizing
   propagandizing
   digital politics

Figure 3: Navigator 4.x is a recent generic "navigator/browser" from Netscape Corporation. Displayed is a vanilla "splash page" of the World Wide Web Test Pattern - a test bench for determining the level of HTML compliance of a browser.

Most Web resources at this writing are still set up for non-interactive, multimedia downloads (e.g., non-interactive Java [21] animation applets, movie clips, real-time audio transmissions, text with graphics). This will change in the next decade as software developers and Web content-providers shift their attention to the interactive and participatory capabilities of the Internet, the Web, and their successor technologies.

Already, the Web is eating into television's audience and will probably continue to do so. Since it seems inevitable that some aspects of both television and the Web will merge in the 21st century, they are said to be convergent technologies. But as of this writing, the dominant Web theme seems to remain static HTML documents and non-interactive animations.

As mentioned above, the uniqueness of the Web as a network technology is a product of two protocols: HTML and HTTP. We elaborate on these protocols below.

5. THE UNDERLYING TECHNOLOGIES

5.1 HYPERTEXT MARKUP LANGUAGE (HTML)

HTML is the business part of document preparation for the Web. Two not-for-profit organizations play a major role in standardizing HTML: the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF). Any document which conforms to the W3C/IETF HTML standards is called a Web message entity. HTML is about the business of defining Web message entities.

The hypertext orientation of HTML derives from the pioneering and independent visions of Vannevar Bush [8] in the mid-1940's, and Doug Engelbart [10] and Ted Nelson [14] in the 1960's. Bush proposed mechanical and computational aids in support of associative memory - i.e., the linking together of concepts which shared certain properties. Engelbart sought to integrate variegated documents and their references through a common core document in a project called Augment.
Nelson, who coined the terms "hypertext" and "hypermedia," added to the work of Bush and Engelbart the concept of non-linear document traversal, as his proposed project Xanadu attempted to "create, access and manipulate this literature of richly formatted and connected information cheaply, reliably and securely from anywhere in the world." Subsequently, Nelson has also defined the notions of "transclusion," or virtual copies of collections of documents, and "transcopyright," which enables the aggregation of information regardless of ownership by automating the procedure by means of which creators are paid for their intellectual property. We won't comment beyond saying that the Web is an ideal test bed for Nelson's ideas.

From a technical perspective, HTML is a sequence of "extensions" to the original concept of Berners-Lee - which was text-oriented. By early 1993, when the NCSA Mosaic navigator-browser client was released for the X Windows System, HTML
had been extended to include still-frame graphics. Soon audio and other forms of multimedia followed.

After 1993, however, HTML standards were a moving target. Marc Andreessen, the NCSA Mosaic project leader, left the NCSA to form what would become Netscape Corporation. Under his technical supervision, Netscape went its own way in offering new features which were not endorsed by W3C/IETF, and at times were inconsistent with the Standard Generalized Markup Language (SGML) orientation intended by the designers of HTML. SGML is a document definition language which is independent of any particular structure - i.e. layout is defined by the presentation software based upon

Under pressure to gain market share, navigator/browser developers attempted to add as many useful "extensions" to the HTML standards as could be practicably supported. This competition has been called the "Mosaic War," [3] which persists in altered form even to this day.

Although not complete, Table 1 provides a technical perspective of the evolution of HTML.

Table 1: HTML Evolution.
note: (1) Version 3.2 is actually a subset of Version 3.0, the latter of which failed to get endorsed by W3C/IETF. (2) Dates are only approximate because of the time lag between the introduction of the technology and the subsequent endorsement as a standard. In some cases this delay is measured in years.

   GML - Generalized Markup Language
      Developed by IBM in 1969 to separate form from content in displaying documents
   SGML - ISO 8879 Standard Generalized Markup Language
      Adopted 1986
   HTML Version 1 (circa 1992-3)
      basic HTML structure
      rudimentary graphics
      hypertext
   HTML Version 2 (circa 1994)
      forms
      lists
   HTML Version 3.2 (circa 1996-7)
      tables
      applets
      scripts
      advanced CGI programming
      security
      text flow around graphics
   HTML Version 4.x (early 1998)
      inline frames
      format via cascading style sheets (vs. HTML tags)
      compound documents with hierarchy of alternate rendering strategies
      internationalization
      tty + braille support
      client-side image maps
      advanced forms/tables
   XML (1998)
      Extensible Markup Language. Subset of SGML

Among the many Netscape innovations are:

   typographical enhancements and fonts
   alignment and colorization controls for text and graphics
   dynamic updating (continuous refresh without reload)
   server push/client pull
   frames
   cookies
   plug-ins
   scripts
   Java applets
   layers

Many of these have become part of subsequent HTML standards. In addition to these formal standards, discussion is already underway for a radical extension of HTML called XML.

In many ways, HTML evolved away from its nicely thought-out roots. GML, or Generalized Markup Language, was developed in the 1960's at IBM to describe many different kinds of documents. Standard Generalized Markup Language, or SGML, was based on GML and became an ISO standard years later in the 1980's. SGML still stands today as the mother of all markup languages. Its designers were very careful not to confuse form and content, and created a wonderfully rich language. HTML became a patchwork of ideas as it quickly evolved over the last few years, and muddied the difference between form and content. XML is an effort to reunite HTML with its SGML roots.

The development of XML, which began in late 1996, deals with the non-extensibility of HTML to handle advanced page design and a full range of new multimedia. XML will accomplish this by using

1. a more SGML-like markup language (vs. HTML) which allows "personal" or "group"-oriented tags, and
2. a low-level syntax for data definition

To see how XML differs from HTML, we examine a page of HTML code:

   <html>
   <head>
   <title>Bibliography</title>
   </head>
   <body>
   <p>Smith, Aaron S. (1999). <i>Understanding the Web</i>. Web Books, Inc. </p>
   </body>
   </html>

This code, when rendered by an appropriate browser, would appear similar to the following:

   Smith, Aaron S. (1999). Understanding the Web. Web Books, Inc.
Tags are special symbols in HTML and XML, and are indicated by the surrounding less-than and greater-than symbols. The majority of tags are paired, i.e., they surround the text that they affect. For example, <i> and </i> indicate that italics should be turned on and off, respectively. Now, contrast the HTML example with sample XML code:
   <?xml version="1.0" ?>
   <xmldoc>
     <bibliography>
       <ref-name> Smith-1999b </ref-name>
       <name>
         <last> Smith </last>
         <first> Aaron </first>
         <mi> S </mi>
       </name>
       <title> Understanding the Web </title>
       <year> 1999 </year>
       <publisher> Web Books, Inc. </publisher>
       <type> Book </type>
     </bibliography>
   </xmldoc>

Like the HTML code, XML is made up of tags. However, XML does not describe how to render the data; it merely indicates the structure and content of the data. HTML does have some of these kinds of tags (for example, <title> in the above HTML example) but, for the most part, HTML has evolved completely away from its SGML roots.

XML was designed to be compatible with current HTML (and SGML, for that matter). Today's most common web browsers (Microsoft's Internet Explorer and Netscape's Navigator) do not support XML directly. Instead, most XML processors have been implemented as Java applications or applets (see the Web Consortium's website for a list of XML processors). Such a Java processor could be instructed to render the XML inside the browser exactly like the rendered HTML.

One of the nice properties of XML is the separation of content and format. This distinction will surely help tame the Wild Web as it will allow easier searching, better structuring, and greater assistance to software agents in general. However, this isn't XML's greatest virtue: what makes XML a great leap forward for the Web is its ability to create new tags. Much as a modern database management system can define new fields, XML can create a new tag. In addition, XML tags can also have structure, as the name field above is composed of first, last, and middle initial. As long as client and server agree on the structure of the data, they can freely create and share new data fields, types, and content via XML. Some have said that XML "does for data what Java does for programs."
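The separation of content from format described above can be illustrated with a short sketch. It assumes Python and its standard `xml.etree.ElementTree` parser, neither of which appears in the original article; the function names `format_reference` and `render_bibliography` are hypothetical. The XML carries only the structure of the bibliography entry, and a separate piece of code decides how to render it, here reproducing the line that the earlier HTML example displays.

```python
import xml.etree.ElementTree as ET

# The bibliography entry from the text, with a lowercase XML declaration.
BIBLIOGRAPHY_XML = """<?xml version="1.0"?>
<xmldoc>
  <bibliography>
    <ref-name> Smith-1999b </ref-name>
    <name>
      <last> Smith </last>
      <first> Aaron </first>
      <mi> S </mi>
    </name>
    <title> Understanding the Web </title>
    <year> 1999 </year>
    <publisher> Web Books, Inc. </publisher>
    <type> Book </type>
  </bibliography>
</xmldoc>
"""


def format_reference(entry):
    """Render one <bibliography> element as a plain-text citation.

    The layout is chosen here, by the consumer of the data; nothing in
    the XML itself says how the entry should look.
    """
    name = entry.find("name")
    last = name.findtext("last").strip()
    first = name.findtext("first").strip()
    middle = name.findtext("mi").strip()
    title = entry.findtext("title").strip()
    year = entry.findtext("year").strip()
    publisher = entry.findtext("publisher").strip()
    return f"{last}, {first} {middle}. ({year}). {title}. {publisher}"


def render_bibliography(xml_text):
    """Parse the document and format every bibliography entry in it."""
    root = ET.fromstring(xml_text)
    return [format_reference(entry) for entry in root.iter("bibliography")]


if __name__ == "__main__":
    for line in render_bibliography(BIBLIOGRAPHY_XML):
        print(line)
```

A different consumer could sort the same entries by year or index them by `ref-name` without any change to the XML, which is the point the text makes about structure being independent of rendering.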
Examples of XML applications are the math-formula markup language, MathML, which combines the ability to define content with a less-powerful suite of features to define presentation. Another example is RDF, a resource description format for meta-data, which is used in both PICS, the Platform for Internet Content Selection, and SMIL, the Synchronized Multimedia Integration Language, which is a declarative language for synchronizing multimedia on the Web. The XML prototype client is Jumbo.
Although XML will help make marking up Web pages easier, there is still a battle raging over which system should be responsible for the details of rendering pages. Current HTML coders must take responsibility for exact placement in page layout, and getting a standard look across browsers is non-trivial. However, SGML leaves the page layout details up to the browser. Exactly how this important issue will play out remains to be seen.

5.2 HYPERTEXT TRANSFER PROTOCOL (HTTP)

HTTP is a platform-independent protocol based upon the client-server model of computing which runs on any TCP/IP, packet-switched digital network - e.g., the Internet. HTTP stands for Hyper Text Transfer Protocol and is the communication protocol with which browsers request data, and servers provide it. This data can be of many types including video, sound, graphics, and text. In addition, HTTP is extensible in that it can be augmented to transfer types of data that do not yet exist.

HTTP is an application layer protocol, and sits directly on top of TCP (Transmission Control Protocol). It is similar in many ways to the File Transfer Protocol (FTP) and TELNET. HTTP follows this logical flow:

1. A connection from the client's browser is made to a server, typically by the user having clicked on a link.
2. A request is made of the server. This request could be for data (i.e., a "GET") or could be a request to process data (i.e., "POST" or "PUT").
3. The server attempts to fulfill the request. If successful, the client's browser will receive additional data to render. Otherwise, an error occurs.
4. The connection is then closed.

HTTP uses the same underlying communication protocols as do all the applications that sit on top of TCP. For this reason, one can use the TELNET application to make an HTTP request. Other TCP-based applications include FTP and SMTP (Simple Mail Transfer Protocol), to name just a few.
Consider the following example:

   % telnet 80
   GET / HTTP/1.0
   Accept: text/html
   Accept: text/plain
   User-Agent: TELNET/1.0

This command, made from any operating system with access to the TELNET program, requests to talk to port 80, the standard HTTP port, of a machine running a web server (TELNET normally uses port 23). A request is made to get the root document (GET /), in a particular protocol (HTTP/1.0), and accepting either text or HTML. The data (i.e., HTML codes) are returned, and the connection is closed. Note: the ending empty line is required.
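The same exchange can be sketched without TELNET. The following is a minimal illustration in Python (an assumption; the article itself uses no programming language at this point) that opens a TCP connection, sends the request shown above, and reads until the server closes the connection. The names `build_http10_request` and `fetch` are hypothetical, and the host is left as a parameter because the original example does not preserve one.

```python
import socket


def build_http10_request(path="/", accept=("text/html", "text/plain"),
                         user_agent="TELNET/1.0"):
    """Build the bytes of an HTTP/1.0 GET request like the one in the text.

    Header lines end in CRLF, and one extra empty line marks the end of
    the request; without it the server keeps waiting for more headers.
    """
    lines = [f"GET {path} HTTP/1.0"]
    lines += [f"Accept: {media_type}" for media_type in accept]
    lines.append(f"User-Agent: {user_agent}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch(host, port=80, path="/"):
    """Follow the four-step flow: connect, request, receive, close."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_http10_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Print the raw request; calling fetch() requires a reachable web server.
    print(build_http10_request().decode("ascii"))
```

Calling `fetch` with the name of a machine running a web server should return the raw status line, headers and body as a single byte string, since an HTTP/1.0 server normally closes the connection after responding.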
Conversely, consider:

   HTTP/1.0 200 OK
   Server: Netscape-Enterprise/3.0K
   Date: Sun, 03 May 1998 22:25:37 GMT
   Content-type: text/html
   Connection: close

   <HTML> <HEAD> ...

These are the data returned from the previous request. First, the server responds with the protocol (HTTP/1.0 in this example), gives the corresponding code (200 OK), provides details of the server (Netscape-Enterprise), date and time, and the format of the following data (text/html). Finally, an empty line separates the header from the actual HTML code.

This type of processing is called "stateless". This makes HTTP only slightly, yet importantly, different from FTP. FTP has "state"; an FTP session has a series of settings that may be altered during the course of a dialog between client and server. For example, the "current directory" and "download data type" settings may be changed during an FTP dialog. HTTP, on the other hand, has no such interaction: the conversation is limited to a simple request and response. This has been the most limiting aspect of HTTP. Much current Web development has centered around dealing with this particular limitation of the protocol (e.g., with cookies).

Although HTTP is very limited, it has shown its flexibility through what must be one of the most explosive and rapidly changing technological landscapes ever. This flexibility is made possible via the protocol's format negotiations. The negotiation begins with the client identifying the types of formats it can understand. The server responds with data in any of those formats that it can supply (text/html in the above example). In this manner, the client and server can agree on file types yet to be invented, or which depend on proprietary formats. If the client and server cannot agree on a format, the data is simply ignored.

6. DYNAMIC WEB TECHNOLOGIES

Web technologies evolved beyond the original concept in several important respects.
We examine HTML forms, the Common Gateway Interface, plug-ins, executable content, and push technologies. 6.1 COMMON GATEWAY INTERFACE The support of the Common Gateway Interface (CGI) within HTTP in 1993 added interactive computing capability to the Web. Here is a one-line C program that formats the standard greeting in basic HTML. (Note: make sure the binary is marked executable. Also, often the binary will need to have a .cgi extension to tell