An Introduction to the Cyber World for a Newbie

Internet

Back in the 1960s, the Internet that we use today began to take shape through the contributions of several people. The initial idea is credited to Leonard Kleinrock, a computer science professor at the University of California, Los Angeles (UCLA), after he published his first paper, titled "Information Flow in Large Communication Nets". Initially, the Internet was not public; its forerunner was ARPAnet (Advanced Research Projects Agency Network). The TCP/IP communication standard, which governs data transfer on the Internet today, grew out of ARPAnet.

What is a Protocol?

A protocol is a standardized means of communication among machines across a network. These rules, or sets of established procedures, determine the format and transmission of data. Protocols allow data to be taken apart for faster transmission, transmitted, and then reassembled at the destination in the correct order.

WWW, HTML, URL, HTTP

Tim Berners-Lee led the development of the World Wide Web (WWW), the definition of the Hypertext Markup Language (HTML) used to create web pages, the Hypertext Transfer Protocol (HTTP) and Uniform Resource Locators (URLs). All of these developments took place between 1989 and 1991. Tim Berners-Lee founded and became Director of the World Wide Web Consortium (W3C), the group that sets technical standards for the Web.

WWW

WWW is an abbreviation of World Wide Web, sometimes simply called the Web. It is a system of all the resources (such as FTP, telnet, Usenet) and users on Internet servers that support
specially formatted documents called Web pages, including hyperlinked text, audio, and video files, that can be accessed and searched by browsers based on standards such as HTTP and TCP/IP.

It was created in 1989 by the British scientist Tim Berners-Lee while he was working at the European Particle Physics Laboratory (CERN) in Switzerland.

The World Wide Web consists of all the public Web sites connected to the Internet worldwide, including the client devices (such as computers and cell phones) that access Web content. The WWW is just one of many applications of the Internet and computer networks.

A broader definition comes from the World Wide Web Consortium or W3C (the organization founded by Tim Berners-Lee): "The World Wide Web is the universe of network-accessible information, an embodiment of human knowledge." Several applications called Web browsers make it easy to access the World Wide Web.

The World Wide Web is based on these technologies:

• HTML - Hypertext Markup Language
• HTTP - Hypertext Transfer Protocol
• Web browsers and web servers

HTML

HTML stands for HyperText Markup Language. Also known as the mother tongue of the browser, it is the authoring language used to create documents on the World Wide Web. Hence, it is the publishing language of the World Wide Web. HTML is similar to SGML (Standard Generalized Markup Language), although it is not a strict subset.

HTML is a language that makes it possible to present information (e.g. scientific research) on the Internet.

Developed by the scientist Tim Berners-Lee in 1990, HTML is the "hidden" code that helps us communicate with others on the World Wide Web (WWW). The purpose was to make it easier for scientists at different universities to gain access to each other's research documents. The project became a
bigger success than Tim Berners-Lee had ever imagined. By inventing HTML he laid the foundation for the web as it is known today.

H-T-M-L

• Hyper is the opposite of linear. Old-fashioned computer programs were necessarily linear, that is, they had a specific order. But with a "hyper" language such as HTML, the user can go anywhere on the web page at any time.
• Text is just what you're looking at now: English characters used to make up ordinary words.
• Mark-up: HTML defines the structure and layout of a Web document by using a variety of tags and attributes. The correct structure for an HTML document starts with <HTML><HEAD> (enter here what the document is about) </HEAD><BODY> and ends with </BODY></HTML>. All the information one would like to include in the Web page fits in between the <BODY> and </BODY> tags. There are many other tags used to format and lay out the information in a Web page. Tags are also used to specify hypertext links. These allow Web developers to direct users to other Web pages with only a click of the mouse on either an image or word(s).
• Language is just that. HTML is the language that computers read in order to understand web pages.

The first version of HTML was described by Tim Berners-Lee in late 1991. For its first five years (1990-1995), HTML went through a number of revisions and experienced a number of extensions, primarily hosted first at CERN, and then at the IETF.

With the creation of the World Wide Web Consortium (W3C), HTML's development changed venue again. HTML is a formal recommendation by the W3C and is generally followed by the major browsers, such as Microsoft's Internet Explorer.

A first abortive attempt at extending HTML in 1995, known as HTML 3.0, then gave way to a more pragmatic approach known as HTML 3.2, which was completed in 1997. HTML4 followed, reaching completion in 1998.
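The tag structure described in the Mark-up item above (an HTML element wrapping HEAD and BODY, with links expressed as tags) can be explored with a short Python sketch. The sample page, the class name OutlineParser and the function outline are illustrative assumptions, not part of the original text; only the standard library html.parser module is used.

```python
from html.parser import HTMLParser

# A hypothetical minimal page following the HTML/HEAD/BODY structure.
SAMPLE = """<html>
<head><title>My first page</title></head>
<body>
<p>Hello! Visit <a href="http://www.example.com/index.html">the example site</a>.</p>
</body>
</html>"""


class OutlineParser(HTMLParser):
    """Collects opening tags in document order and the targets of links."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Called once for every opening tag; tag names arrive in lower case.
        self.tags.append(tag)
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def outline(document):
    """Return (list of opening tags, list of hyperlink targets)."""
    parser = OutlineParser()
    parser.feed(document)
    parser.close()
    return parser.tags, parser.links


if __name__ == "__main__":
    tags, links = outline(SAMPLE)
    print(tags)
    print(links)
```

Running the sketch lists the tags in the order a browser would meet them, which makes the nesting of head and body inside html easy to see.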
HTML 4.0 was followed by HTML 4.01 and, later, by HTML5. Significant features in HTML 4 are sometimes described in general as dynamic HTML.

HTML5 is sometimes confused with Extensible Hypertext Markup Language (XHTML), but the two are different: XHTML is an XML-based reformulation of HTML 4, while HTML5 is the successor to HTML 4 and became a W3C Recommendation in 2014. HTML5 is supported by all major browsers.

What we see when we view a page on the Internet is the browser's interpretation of HTML. To see the HTML code of a page on the Internet, simply right-click on the page and choose "View Source" (the exact wording varies between browsers).

HTTP

HTTP is the protocol by which information is passed back and forth between web servers and clients.

HTTP allows information to be transmitted and received across the Internet.

If a website communicates with the browser over plain HTTP, the connection is unencrypted, and anyone on the network path can snoop on the computer's conversation with the website.

All the user information is contained in the HTTP headers, cookies and query parameters.

HTTPS

HTTPS (= HTTP + "S") is a URI scheme identical in syntax to the http scheme, where the "S" stands for "secure".

It is a simple layering of HTTP over the SSL/TLS protocols to protect the traffic, thus adding security capabilities to standard HTTP communications. It provides authentication of the website and the associated web server that one is dealing with.

It protects the user from man-in-the-middle attacks by providing:

• Bidirectional encryption of information between the client and the server, which protects against spying on and tampering with the data, and against the forging of communication.
• Assurance that the communication between the user and the website is not forged by a third person or an imposter.

HTTPS is especially important over unencrypted networks (such as open Wi-Fi), as anyone on the same local network can "packet sniff" and discover sensitive information. In addition, many free and even paid WLAN networks use "packet injection" to serve their own advertisements on webpages or just for tricks. This can also be exploited maliciously, e.g. by injecting malware and spying on users.

Whenever a website is loaded over HTTP instead of HTTPS, the user information and the session are exposed. It is therefore essential to check for HTTPS before filling in and submitting information to the server.

Another case where HTTPS is important is connections over Tor (the anonymity network), as malicious Tor exit nodes can damage or alter insecure content passing through them and inject malware into the connection. For this security reason the Electronic Frontier Foundation and the Tor Project developed HTTPS Everywhere, which was included in the Tor Browser Bundle.

HTTPS signals the browser to use an added encryption layer of SSL/TLS to protect the traffic.

By examining the server's certificate, a client can find out whether the server is the one it claims to be.

A Stark Contrast between HTTP and HTTPS

HTTP

• HTTP URLs begin with http://
• Operates on port 80 by default
• Is vulnerable and insecure, and is subject to man-in-the-middle and spying attacks
• Is generally faster than HTTPS, since nothing has to be encrypted; the performance difference becomes evident when large amounts of data are processed over a port
• Works at the application layer
HTTPS

• HTTPS URLs begin with https://
• Uses port 443 by default
• Is secured over the Internet connection and protects against man-in-the-middle attacks, as all the information is encrypted before being sent to the server
• Is not a separate protocol but ordinary HTTP over encrypted SSL/TLS (SSL/TLS authentication comes in two options: mutual and single, i.e. server only)
• Also works at the application layer, with SSL/TLS sitting between HTTP and the transport layer

The web server has to be prepared to accept HTTPS connections.

Note: There is a sophisticated type of man-in-the-middle attack called the SSL stripping attack, which was presented at the Black Hat Conference in 2009. This type of attack defeats the security provided by HTTPS by changing the https: link into an http: link. It takes advantage of the fact that few Internet users actually type "https" into their browser interface: they get to a secure site by clicking on a link, and are thus deceived into believing that they are using secure HTTP when in fact they are using plain HTTP. The attacker then communicates in the clear with the client. This encouraged the development of a countermeasure in HTTP called HTTP Strict Transport Security.

Web Server

The main function of a web server is to deliver web pages to clients on request. This means delivery of HTML documents and any additional content that may be included by a document, such as images, style sheets and scripts. Not all Internet servers are part of the World Wide Web.
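To make the web server's role concrete, here is a minimal sketch using Python's standard http.server and http.client modules. It starts a tiny server on the local machine, requests one page over plain HTTP and returns what came back. The page content, the class name DemoHandler and the function fetch_demo_page are assumptions made for illustration; a real web server does far more (files, HTTPS, many clients at once).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page that our demo server delivers to every client.
PAGE = b"<html><head><title>Demo</title></head><body>Hello, Web!</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET request with the same HTML document.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        # Keep the demo quiet: suppress the default request log.
        pass


def fetch_demo_page():
    """Start the server on a free local port, fetch '/', then shut down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 = any free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # The client side: what a browser does when it requests a page.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, content_type, body = fetch_demo_page()
    print(status, content_type)
    print(body.decode("utf-8"))
```

Because this exchange uses plain HTTP, everything in it travels unencrypted, which is exactly the situation the HTTPS section above warns about.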
Web Browser

A web browser is a software application, installed on the computer itself, that serves as the gateway to the Internet. It is used to locate, retrieve and display content on the World Wide Web, including Web pages, images, video and other files. In the client/server model, the browser is the client, running on a computer that contacts the Web server and requests information. The Web server sends the information back to the Web browser, which displays the results on the computer or other Internet-enabled device that supports a browser.

A web browser communicates with a web server using the HTTP protocol to download the pages requested by the user, usually by clicking on a hyperlink. A browser can translate HTML, the language used to create web pages, into the content displayed in the browser window.

Popular web browsers include Google Chrome, Mozilla Firefox, Opera, and Internet Explorer.

Web sites and Web browsing exploded in popularity during the mid-1990s.

URL

URL is an abbreviation of Uniform Resource Locator. It was developed in 1994 by Tim Berners-Lee and the Internet Engineering Task Force (IETF) URI working group.

It is a reference to documents and other resources on some machine on the network on the World Wide Web. In other words, it is the global or unique address of a file that is accessible on the Internet. It is also sometimes referred to as a link.

Such a file might be any Web (HTML) page other than the home page, an image file, or a program such as a Common Gateway Interface application or Java applet.
It takes the form of a formatted text string used by Web browsers, email clients and other software to identify a network resource on the Internet.

On the Web (which uses the Hypertext Transfer Protocol, or HTTP), an example of a URL is:

http://www.example.com/abc/xyz.txt

It specifies the use of HTTP (the protocol used by Web browsers), a unique computer named www.example.com, and the location of a text file or page to be accessed on that computer, whose pathname is /abc/xyz.txt.

A URL for a particular image on a Web site might look like this:

http://work.example.com/assets/images/pic.gif

A URL for a file meant to be downloaded using the File Transfer Protocol (FTP) would specify the "ftp" protocol in place of "http". The examples above use the Hypertext Transfer Protocol (HTTP), which is typically used to serve up hypertext documents.

This is how a computer locates the web page that you are trying to find. URLs can also point to other resources on the network, such as database queries and command output. Network resources are files that can be plain Web pages, other text documents, graphics, or programs.

A URL is a formatted string which consists of three parts (substrings):

1. Network protocol
2. Host name or address
3. File or resource location

These substrings are separated by special characters as follows:

protocol :// host / location

http://www.example.com/abc/xyz.txt
http://work.example.com/assets/images/pic.gif
http://www.example.com/widgets/tool.ps
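The three-part split described above can be tried out with Python's standard urllib.parse module. This is only a minimal sketch: the helper name split_url is an assumption made for illustration, and the sample URLs are the ones used in this section.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return the (protocol, host, location) substrings of a URL."""
    parts = urlsplit(url)
    # scheme   -> the network protocol, e.g. "http"
    # hostname -> the host name or address
    # path     -> the file or resource location on that host
    return parts.scheme, parts.hostname, parts.path


if __name__ == "__main__":
    for url in (
        "http://www.example.com/abc/xyz.txt",
        "http://work.example.com/assets/images/pic.gif",
    ):
        print(split_url(url))
```

For the first URL the function returns "http", "www.example.com" and "/abc/xyz.txt", matching the protocol, host and location pattern shown above.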
URL Protocol

The protocol substring defines the network protocol to be used to access a resource. These strings are short names usually followed by the three characters :// (a simple naming convention to denote a protocol definition). Typical URL protocols include http://, https://, and ftp://. Some, such as mailto:, are followed by a colon only.

The protocol substring indicates which protocol is to be used to fetch the resource.

For example, the two URLs below point to two different files at the domain example.com. The first specifies an executable file that should be fetched using the FTP protocol; the second specifies a Web page that should be fetched using the HTTP protocol:

ftp://www.example.com/stuff.exe
http://www.example.com/index.html

URL Host

The host substring identifies a computer or other network device. Host names come from standard Internet databases such as DNS, and the host can be given either as a domain name or as the IP address where the resource is located. The resource name is the complete address of the resource. The format of the resource name depends entirely on the protocol used, but for many protocols, including HTTP, the resource name contains one or more of the following components:

• Host Name: The name of the machine on which the resource lives.
• Filename: The pathname to the file on the machine.
• Port Number: The port number to which to connect (typically optional).
• Reference: A reference to a named anchor within a resource that usually identifies a specific location within a file (typically optional).

URL Location

The location substring contains a path to one specific network resource on the host. Resources are normally located in a host directory or folder. For example, /bin/accessibleobject/build-url.htm is the location of a Web page, consisting of two subdirectories and the file name.

When the location element is omitted, as in http://work.example.com/, the URL conventionally points to the root directory of the host and often to a home page (like index.html).

In simple terms, the location is a pathname, a hierarchical description that specifies the location of a file on that computer.

An example of a URL is http://www.example.com/index.html. In this example URL, example.com is called the domain name, and "index.html" refers to the specific page.

Note: The protocol identifier and the resource name are separated by a colon and two forward slashes.

HTTP is just one of many different protocols used to access different types of resources on the net. Other protocols include File Transfer Protocol (FTP), Gopher, File, and News.

For many protocols, the host name and the filename are required, while the port number and reference are optional. For example, the resource name for an HTTP URL must specify a server on the network (Host Name) and the path to the document on that machine (Filename); it can also specify a port number and a reference.
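The required and optional components of a resource name (host name, filename, port number, reference) can be seen by taking a URL apart with Python's standard urllib.parse module. The following is a sketch under stated assumptions: the function name describe_url, the port 8080 and the anchor "top" are made up for illustration, and the default ports are the ones given earlier in this document (80 for HTTP, 443 for HTTPS).

```python
from urllib.parse import urlsplit

# Default ports as stated in the HTTP/HTTPS comparison.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url):
    """Break a URL into protocol, host name, port, filename and reference."""
    parts = urlsplit(url)
    # When no port is written in the URL, fall back to the protocol's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": port,
        # An omitted location conventionally means the root directory.
        "filename": parts.path or "/",
        # The reference (named anchor) is optional; None when absent.
        "reference": parts.fragment or None,
    }


if __name__ == "__main__":
    # Hypothetical URL with an explicit port and a reference.
    print(describe_url("http://www.example.com:8080/index.html#top"))
    # Location, port and reference all omitted.
    print(describe_url("http://work.example.com"))
```

The second call shows the conventions described above: with nothing after the host, the filename falls back to the root directory "/" and the port to the protocol default.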
Absolute vs. Relative URLs

Full URLs featuring all three substrings are called absolute URLs. In some cases, such as within Web pages, URLs can contain only the location element. These are called relative URLs. Relative URLs are used for efficiency by Web servers and a few other programs when they already know the correct URL protocol and host.

URI

URI is the generic term for all types of names and addresses that refer to objects on the World Wide Web. The term "Web address" is a synonym for a URL that uses the HTTP or HTTPS protocol.

A URL is a type of URI (Uniform Resource Identifier, formerly called Universal Resource Identifier).

The URL format is specified in RFC 1738, Uniform Resource Locators (URL).

Institutions on the Web

The chart below lists the types of institutions that people may come across while accessing the Internet, as indicated by the extension at the end of the host name. The terminal portion of the host name can also identify the country in which the host is registered. For example, http://www.bbc.co.uk is the web address for the commercial business (.co) called BBC residing in the United Kingdom (.uk).
Extensions

.com      commercial institution
.biz      commercial institution
.net      network organization (originally); now in general use
.edu      educational institution
.org      not-for-profit organization
.gov      government institution
.mil      military
.int      international institution
.info     unrestricted use
.museum   museums
.name     names of individuals
.pro      lawyers, accountants, and doctors
.aero     aeronautical industry
.coop     cooperative organizations
.jobs     job advertisements
.mobi     mobile-device compatible sites
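As a small illustration of how the chart can be used, the sketch below takes a host name, picks out its terminal portion and looks it up. The function name institution_type and the dictionary EXTENSIONS are assumptions made for this example, and only a subset of the chart is included. Country codes such as .uk are not in the chart, so the lookup returns None for them.

```python
# A subset of the extensions chart above.
EXTENSIONS = {
    "com": "commercial institution",
    "edu": "educational institution",
    "org": "not-for-profit organization",
    "gov": "government institution",
    "mil": "military",
    "int": "international institution",
}


def institution_type(host):
    """Look up the terminal portion of a host name in the chart.

    Returns None when the extension is not listed (for example a
    country code such as "uk").
    """
    # The terminal portion is whatever follows the last dot.
    extension = host.rstrip(".").rsplit(".", 1)[-1].lower()
    return EXTENSIONS.get(extension)


if __name__ == "__main__":
    print(institution_type("www.example.com"))
    print(institution_type("www.bbc.co.uk"))
```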