The World Wide Web CSCE 101 – Spring 2010



  1. 1. The World Wide Web CSCE 101 – Spring 2010
  2. 2. The Internet and the Web <ul><li>Internet: A worldwide computer network that connects hundreds of thousands of smaller networks. “The mother of all networks”. </li></ul><ul><li>World Wide Web: The interconnected system of servers that supports multimedia documents, i.e. the multimedia part of the Internet. </li></ul><ul><li>Timeline: </li></ul><ul><ul><li>Early 1960s: introduction of the network concept </li></ul></ul><ul><ul><li>1969: ARPANET, a network aimed at scholarly and research use </li></ul></ul><ul><ul><ul><li>62 computers in 1974 </li></ul></ul></ul><ul><ul><ul><li>500 computers in 1983 </li></ul></ul></ul><ul><ul><ul><li>28,000 computers in 1987 </li></ul></ul></ul><ul><ul><li>1975: Ethernet developed by Robert Metcalfe </li></ul></ul><ul><ul><li>1980: TCP/IP </li></ul></ul><ul><ul><li>1982: The first computer virus, Elk Cloner, spread via Apple II floppy disks </li></ul></ul><ul><ul><li>1989: Web invented by Tim Berners-Lee </li></ul></ul><ul><ul><li>1990: First Web browser based on HTML developed by Berners-Lee </li></ul></ul><ul><ul><li>Early 1990s: Marc Andreessen developed the first widely used graphical browser (Mosaic) </li></ul></ul><ul><ul><li>1993: The White House launches its Web site </li></ul></ul>
  3. 3. Web Browsers <ul><li>Web Browser: A software application for retrieving, presenting, and traversing information resources on the World Wide Web. It lets users view Web pages and jump from one page to another, e.g. Internet Explorer, Mozilla Firefox, Safari, etc. </li></ul><ul><ul><li>Which browser is better? Why? </li></ul></ul><ul><li>Web Page: A document on the Web that can include multimedia data </li></ul><ul><li>Web Site: A collection of related Web pages, usually designed or controlled by the same individual or company, that generally shares a common domain name. </li></ul><ul><li>Practical Browser Tools: </li></ul><ul><ul><li>Status Bar: security info, page load progress </li></ul></ul><ul><ul><li>Favorites (bookmarks) </li></ul></ul><ul><ul><li>View → Source: view the code of a Web page </li></ul></ul><ul><ul><li>Tools → Internet Options: history, temporary Internet files, home page, auto complete, security settings, programs, etc. </li></ul></ul>
  4. 4. Domain Names <ul><li>URL (Uniform Resource Locator): The human-friendly address of a Web page </li></ul><ul><ul><li>String of characters that points to a piece of information on the Internet </li></ul></ul><ul><ul><li>Syntax: protocol://domain name/directory/file </li></ul></ul><ul><ul><li>The domain name includes the domain type and sometimes a country extension </li></ul></ul><ul><ul><li>Have you ever mistyped a URL and gone to a website you weren’t expecting? </li></ul></ul><ul><li>ICANN, a non-profit organization, was established to regulate human-friendly domain names </li></ul><ul><li>DNS (Domain Name System): A distributed set of servers storing domain information in hierarchical fashion </li></ul><ul><ul><li>DNS provides the mapping between the domain names of Internet sites and their IP addresses </li></ul></ul><ul><ul><li>DNS records generally assume static IP addresses </li></ul></ul><ul><ul><li>DNS poisoning: forged DNS data that sends users to the wrong site </li></ul></ul><ul><ul><li>Domain names must be registered to ensure uniqueness; registration fees vary </li></ul></ul><ul><ul><li>Cybersquatting </li></ul></ul>
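To make the URL syntax on this slide concrete, here is a minimal Python sketch that splits an address into protocol, domain name, directory, and file. The address is a hypothetical example (the slide's own example was not preserved), and treating the last label of the host name as the "domain type" is a simplification that ignores country extensions.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Break a URL into the parts named in the syntax protocol://domain name/directory/file."""
    parts = urlsplit(url)
    # Separate the directory portion of the path from the final file name.
    directory, _, filename = parts.path.rpartition("/")
    labels = parts.hostname.split(".")
    return {
        "protocol": parts.scheme,
        "domain_name": parts.hostname,
        "domain_type": labels[-1],  # last label only; a simplification
        "directory": directory or "/",
        "file": filename,
    }

# Hypothetical address used only for illustration.
info = split_url("http://www.example.edu/courses/csce101/index.html")
for key, value in info.items():
    print(f"{key}: {value}")
```

Turning the domain name into an IP address is the job of DNS; in Python, `socket.gethostbyname` performs such a lookup, but it needs a network connection and is left out of the sketch.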
  5. 5. Domain Names <ul><li>Main domain extension types (suffix and description): </li></ul><ul><li>.com (commercial) is a generic top-level domain. It was one of the original top-level domains, and has grown to be the largest in use. </li></ul><ul><li>.org (organization) is a generic top-level domain, and is mostly associated with non-profit organizations. It is also used in the charitable field and by the open-source movement. Some political parties in the US have domain names ending in .org. </li></ul><ul><li>.net (network) is a generic top-level domain and is one of the original top-level domains. It was initially intended only for network providers (such as Internet service providers). It is still popular with network operators, and it is often treated as a second .com. It is currently the third most popular top-level domain. </li></ul><ul><li>.edu (education) is the generic top-level domain for educational institutions, primarily those in the United States. One of the first top-level domains, .edu was originally intended for educational institutions anywhere in the world. Today, only post-secondary institutions that are accredited by an agency on the U.S. Department of Education's list of nationally recognized accrediting agencies are eligible to apply for a .edu domain. </li></ul><ul><li>.info (information) is a generic top-level domain intended for informative websites. It is an unrestricted domain, meaning that anyone can obtain a second-level domain under .info. It was one of several extensions meant to take the pressure off the overcrowded .com domain. </li></ul><ul><li>.gov (government) is a generic top-level domain used by government entities in the United States. Other countries typically use a second-level domain for this purpose, as the United Kingdom does. Since the United States controls the .gov top-level domain, it would be impossible for another country to create a domain ending in .gov. </li></ul><ul><li>.biz (business) is a generic top-level domain to be used by businesses; the name is a phonetic spelling of the first syllable of “business.” It was created to relieve the demand for good domain names in the .com top-level domain, and to provide an alternative for businesses whose preferred .com domain name had already been registered by another party. </li></ul>
  6. 6. Cookies <ul><li>Little text files left on your hard disk by some websites you visit </li></ul><ul><li>Cookies are data, not programs; they do not generate pop-ups or behave like viruses </li></ul><ul><li>Can include your log-in name and browser preferences </li></ul><ul><li>Can be convenient </li></ul><ul><li>But they can be used to gather information about you and your browsing habits </li></ul><ul><ul><li>“Third-party” cookies: used by advertising companies to track users across multiple sites </li></ul></ul><ul><ul><li>People share machines, so one user's cookies can be visible to the next </li></ul></ul><ul><li>Sample cookie (name and value pairs): </li></ul><ul><ul><li>session-id-time: 954242000 </li></ul></ul><ul><ul><li>session-id: 002-4135256-7625846 </li></ul></ul><ul><ul><li>x-main: eKQIfwnxuF7qtmX52x6VWAXh@Ih6Uo5H </li></ul></ul><ul><ul><li>ubid-main: 077-9263437-9645324 </li></ul></ul>
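The point that cookies are data rather than programs can be illustrated with a short Python sketch that reads cookie text into name and value pairs using the standard `http.cookies` module. The header string is an assumption: it reuses three of the values from the slide's sample cookie, written in the `name=value; name=value` form a browser would send.

```python
from http.cookies import SimpleCookie

# Cookie text modeled on the slide's sample values (format assumed).
raw = (
    "session-id=002-4135256-7625846; "
    "session-id-time=954242000; "
    "ubid-main=077-9263437-9645324"
)

jar = SimpleCookie()
jar.load(raw)  # parses the text; nothing in it is executed

# Each cookie is just a name paired with a string value.
cookies = {name: morsel.value for name, morsel in jar.items()}
for name, value in cookies.items():
    print(f"{name} = {value}")
```

A site that sees the same `session-id` value on a later visit can recognize the browser, which is both the convenience and the privacy concern noted on the slide.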
  7. 7. E-mail <ul><li>E-mail Software and Carriers: </li></ul><ul><ul><li>Free Web-based e-mail services (e.g. Yahoo Mail) or e-mail clients bundled with software (e.g. Microsoft Outlook) </li></ul></ul><ul><li>E-mail Privacy: </li></ul><ul><ul><li>How did they find my e-mail address? </li></ul></ul><ul><ul><li>Can anyone read the content of my messages? </li></ul></ul><ul><ul><li>What happens to my deleted e-mail messages? </li></ul></ul><ul><ul><li>What are my rights? Basically none. </li></ul></ul><ul><ul><li>Can anything be done to enhance e-mail privacy? </li></ul></ul><ul><li>E-mail Security: </li></ul><ul><ul><li>Dangers of attachments and HTML graphics </li></ul></ul><ul><li>Useful E-mail Tools: Mailing lists, filters (rules) </li></ul>
  8. 8. Deciphering Spam <ul><li>Spam: Unsolicited e-mail in the form of advertisements or chain letters. </li></ul><ul><ul><li>Waste of storage space, processing power, bandwidth, and time </li></ul></ul><ul><ul><li>E-mail address spoofing, disposable e-mail addresses or anonymous re-mailers, and zombies are techniques used in spamming </li></ul></ul><ul><ul><li>E-mail address harvesting </li></ul></ul><ul><li>Motives: </li></ul><ul><ul><li>Marketing </li></ul></ul><ul><ul><li>Chain letters & hoaxes </li></ul></ul><ul><ul><li>Malicious intent </li></ul></ul><ul><ul><li>Theft of confidential information (e.g. phishing) </li></ul></ul><ul><li>Spam Filters: </li></ul><ul><ul><li>Pattern-based or content-based </li></ul></ul><ul><ul><li>Challenge-based </li></ul></ul><ul><ul><li>Blacklist- and whitelist-based </li></ul></ul><ul><li>Fight back by reporting new spammers to spam-reporting services </li></ul>
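Two of the filter types listed on this slide, list-based and pattern-based, can be sketched together in a few lines of Python. Everything specific here is an assumption made for illustration: the addresses, the patterns, and the order in which the checks run. Real filters are far more elaborate.

```python
import re

WHITELIST = {"advisor@example.edu"}   # hypothetical trusted senders
BLACKLIST = {"offers@spam.example"}   # hypothetical known spammers
SPAM_PATTERNS = [                     # hypothetical content patterns
    re.compile(r"free money", re.IGNORECASE),
    re.compile(r"forward this to \d+ (friends|people)", re.IGNORECASE),
    re.compile(r"verify your account", re.IGNORECASE),
]

def classify(sender, body):
    """Return 'inbox' or 'spam' for a message."""
    sender = sender.lower()
    if sender in WHITELIST:   # whitelist: always accept
        return "inbox"
    if sender in BLACKLIST:   # blacklist: always reject
        return "spam"
    # Pattern-based check on the message content.
    if any(pattern.search(body) for pattern in SPAM_PATTERNS):
        return "spam"
    return "inbox"

print(classify("stranger@example.com", "Please forward this to 10 friends today"))
print(classify("advisor@example.edu", "Free money for tuition: scholarship deadlines"))
```

Checking the whitelist first means a trusted sender is never filtered, even when the message happens to match a spam pattern. Because spammers spoof addresses, a list-based check alone is easy to defeat.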
  9. 9. Searching for Information <ul><li>Search engine databases are often compiled using software programs called spiders </li></ul><ul><ul><li>Spiders crawl through the Web, following links from one page to another </li></ul></ul><ul><ul><li>They index the words on each page they visit </li></ul></ul><ul><ul><li>Indexing techniques </li></ul></ul><ul><ul><li>Influencing search results (paid, malicious e.g. Google bombs), link rot </li></ul></ul><ul><li>If you publish an embarrassing web page and then take it down, is it REALLY gone? </li></ul><ul><li>Guidelines to evaluate Web resources </li></ul><ul><ul><li>Should you trust information you find online? </li></ul></ul><ul><ul><li>Does the information appear on a professional site maintained by a professional organization? </li></ul></ul><ul><ul><li>Does the website authority appear to be legitimate? </li></ul></ul><ul><ul><li>Is the website objective, complete, and current? </li></ul></ul>
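The spider behavior described on this slide, following links from page to page and indexing the words found, can be sketched in Python over a tiny in-memory "Web". The pages, their text, and their links are all made up for illustration; a real spider would fetch pages over the network and handle far more cases.

```python
from collections import defaultdict, deque
import re

# Hypothetical mini-Web: page name -> (page text, outgoing links)
PAGES = {
    "home": ("welcome to the computer science department", ["courses", "news"]),
    "courses": ("computer science courses and computing labs", ["home", "syllabus"]),
    "news": ("department news and science events", ["home"]),
    "syllabus": ("course syllabus for web computing", []),
    "orphan": ("no page links here", []),
}

def crawl_and_index(start):
    """Follow links breadth-first from `start`, building a word -> pages index."""
    index = defaultdict(set)
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        text, links = PAGES[page]
        # Index every word on the page.
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(page)
        # Queue each linked page that has not been visited yet.
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl_and_index("home")
print(sorted(index["science"]))
```

The page named `orphan` never enters the index because no other page links to it, which is one reason a page can exist on the Web without showing up in search results.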
  10. 10. Search Engines <ul><li>Types of Search Engines: </li></ul><ul><ul><li>Human-organized: Documents are categorized by subject-area experts; databases are smaller but search results are more accurate, e.g. Open Directory, About </li></ul></ul><ul><ul><li>Computer-created: Software spiders crawl the web for documents and categorize pages; databases are larger and rely on ranking systems, e.g. Google </li></ul></ul><ul><ul><li>Hybrid: Combines the two categories above </li></ul></ul><ul><ul><li>Metasearch or clustering: Direct queries to multiple search engines and cluster results, e.g. Copernic, Vivisimo, Mamma </li></ul></ul><ul><ul><li>Topic-specific, e.g. WebMD </li></ul></ul><ul><li>Advanced Search Options: </li></ul><ul><ul><li>Searches for various information formats & types, e.g. image search, scholarly search </li></ul></ul><ul><ul><li>Advanced query operators and wild cards </li></ul></ul><ul><ul><ul><li>? (uncertain spelling, e.g. science? searches for the keyword “science” when you are not sure of its spelling) </li></ul></ul></ul><ul><ul><ul><li>* (wildcard, e.g. comput* searches for keywords starting with “comput” combined with any word ending) </li></ul></ul></ul><ul><ul><ul><li>x AND y (both terms must be present) </li></ul></ul></ul><ul><ul><ul><li>x OR y (at least one of the terms must be present) </li></ul></ul></ul>
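The `*`, AND, and OR operators from this slide map naturally onto set operations. The Python sketch below assumes a small hypothetical keyword index (keyword to document numbers) and uses the standard `fnmatch` module for the wildcard. It does not attempt the `?` spelling operator, and it assumes query terms contain only letters and `*`.

```python
from fnmatch import fnmatchcase

# Hypothetical inverted index: keyword -> set of document ids
INDEX = {
    "computer": {1, 2},
    "computing": {2, 4},
    "science": {1, 3},
    "web": {4},
}

def lookup(term):
    """Return the documents matching one term; '*' matches any word ending."""
    docs = set()
    for word, ids in INDEX.items():
        if fnmatchcase(word, term):
            docs |= ids
    return docs

def search_and(x, y):
    """x AND y: both terms must be present (set intersection)."""
    return lookup(x) & lookup(y)

def search_or(x, y):
    """x OR y: at least one term must be present (set union)."""
    return lookup(x) | lookup(y)

print(sorted(lookup("comput*")))
print(sorted(search_and("comput*", "science")))
print(sorted(search_or("web", "science")))
```

AND can only narrow a result set and OR can only widen it, which is why combining them is a practical way to tune a search that returns too much or too little.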
  11. 11. More Web Resources <ul><li>Wikis: </li></ul><ul><ul><li>A Wiki is a website on which authoring and editing can be done by anyone at any time using a simple browser. </li></ul></ul><ul><ul><li>Wikipedia, Wikimedia, Wikibooks, Citizendium, etc. </li></ul></ul><ul><ul><li>Allow individuals to edit content collaboratively </li></ul></ul><ul><ul><li>Accuracy concerns </li></ul></ul><ul><li>Internet Telephony (VoIP): </li></ul><ul><ul><li>Providers include Vonage, Verizon, Skype, etc. </li></ul></ul><ul><ul><li>Uses the Internet to make phone calls and hold videoconferences </li></ul></ul><ul><ul><li>Long-distance calls are either very inexpensive or free </li></ul></ul><ul><ul><li>Quality, security, and reliability concerns </li></ul></ul>
  12. 12. More Web Resources <ul><li>Social Networks: </li></ul><ul><ul><li>MySpace, Facebook, Friendster, Orkut, etc. </li></ul></ul><ul><ul><li>What are some features of today’s popular social networks? </li></ul></ul><ul><ul><li>Anti-social networks? </li></ul></ul><ul><ul><li>Social networks as “study groups”, Courses 2.0 </li></ul></ul><ul><ul><li>Privacy and safety concerns </li></ul></ul><ul><li>Plagiarism in the Internet Age: </li></ul><ul><ul><li>In a recent survey, 60% of students revealed that they have cheated in the past </li></ul></ul><ul><ul><li>Websites offering course material </li></ul></ul><ul><ul><li>Use of portable electronic devices for cheating </li></ul></ul><ul><ul><li>Services used to combat cheating </li></ul></ul>
  13. 13. More Web Resources <ul><li>Instant messaging (IM) and real-time chat (RTC) software </li></ul><ul><ul><li>Multi-protocol IM clients (AIM) </li></ul></ul><ul><ul><li>Web-based IM systems (Forum, chat room) </li></ul></ul><ul><li>Podcasting </li></ul><ul><li>Blogs </li></ul><ul><ul><li>Blogger, Xanga, LiveJournal, etc. </li></ul></ul><ul><ul><li>Microblog, vlog, photoblog, sketchblog, linklog, etc. </li></ul></ul><ul><ul><li>Blog search engines </li></ul></ul><ul><ul><li>Blogs and advertising, implications of ad blocking software </li></ul></ul><ul><ul><li>Do bloggers have the same rights as journalists? </li></ul></ul><ul><li>Really Simple Syndication (RSS) </li></ul><ul><ul><li>FireAnt, i-Fetch, RSS Captor, etc. </li></ul></ul><ul><ul><li>Built-in Web browser RSS features </li></ul></ul><ul><ul><li>Search for keyword: “RSS Readers” </li></ul></ul>
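An RSS feed is an XML document that a reader program downloads and turns into a list of headlines. The Python sketch below shows that reading step with the standard `xml.etree.ElementTree` module. The feed text, its titles, and its links are invented for illustration, and only the basic RSS 2.0 `channel`, `item`, `title`, and `link` elements are handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content; a real reader would download this text.
FEED = """<rss version="2.0">
  <channel>
    <title>CSCE 101 Announcements</title>
    <item>
      <title>Lab 3 posted</title>
      <link>http://www.example.edu/csce101/lab3.html</link>
    </item>
    <item>
      <title>Midterm review session</title>
      <link>http://www.example.edu/csce101/review.html</link>
    </item>
  </channel>
</rss>"""

def read_feed(xml_text):
    """Return the channel title and a list of items with title and link."""
    channel = ET.fromstring(xml_text).find("channel")
    items = [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in channel.findall("item")
    ]
    return channel.findtext("title"), items

feed_title, items = read_feed(FEED)
print(feed_title)
for entry in items:
    print("-", entry["title"], "->", entry["link"])
```

A reader that repeats this on a schedule and remembers which links it has already shown is, in essence, what the RSS tools listed on the slide do.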