CSE509 Lecture 1
1. CSE509: Introduction to Web Science and Technology Lecture 1: Introduction Muhammad Atif Qureshi Web Science Research Group Institute of Business Administration (IBA)
2. Outline What is Web Science? Why We Need Web Science? Implications of Web Science CSE509 Administrivia Course Contents July 09, 2011
3. Science of the Web Introduction Why do we need Web Science as a research field? Because we need a systems-level understanding of the Web. – Prof. Nigel Shadbolt, one of the pioneers of the Web Science program, University of Southampton
4. Web Science Social and engineering dimensions (New York Times at launch of Web Science Program at Univ. of Southampton and MIT in 2006) Extends well beyond traditional Computer Science Introduction The Web isn’t about what you can do with computers. It’s people and, yes, they are connected by computers. But computer science, as the study of what happens in a computer, doesn’t tell you about what happens on the Web. – Tim Berners-Lee, one of the founders of the WWW
5. What is the Web? A distributed document delivery system implemented through application-level protocols on the Internet A tool for collaborative writing and community building A framework of protocols that support e-commerce A network of co-operating computers A large, cyclical, directed graph made up of Web pages and links Introduction
23. Synthetic discipline that creates mechanisms (e.g., formalisms, algorithms, etc.) in order to support particular desired behavior Introduction
24. Which Science Explains the Web? Given Neither the Web nor the world is static The Web evolves in response to various pressures from Science Commerce The public Politics Etc. Introduction
25. Web Science The Web is a new technical and social phenomenon and a growing organism The Web needs to be studied and understood as an entity in its own right Web Science is a new field of science that involves a multi-disciplinary study and inquiry for the understanding of the Web and its relationships to us Introduction
26. Why Web Science? Dynamics and evolution The “deep (or dark) Web” Sampling, lack of complete enumeration Scale (e.g., What is the percentage of Web pages updated daily?) Search (e.g., What percentage of Web pages are indexed by search engines?) Web topology Artifacts of social interactions (blogs, etc.), Web sociology Importance
27. Web Science vs. Computer Science Metrics Computer Science: Moore’s Law, O(n) algorithm analysis, Gigabytes Web Science: Page views, Unique visitors/month, No. of songs/videos Topics Computer Science: Computer networks, Programming languages, Database systems, Operating systems, Compilers, Graphics Web Science: Social networks, Relationships (users, web pages, etc.), Web 2.0 applications, E-*, Creating/sharing multimedia Focus Computer Science: Technology, Computers, HPC, Proficient programmers Web Science: Applications, Users, Mobile interactivity, Universal accessibility
28. What Could Scientific Theories for the Web Look Like? Every page on the Web can be reached by following fewer than 10 links The average number of words per search query is greater than 3 A Wikipedia page on average contains 0.03 false facts The Web is a “scale-free” graph Importance
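A claim such as "every page can be reached by following fewer than 10 links" is testable in principle: treat pages as nodes and links as directed edges, then measure shortest link distances with breadth-first search. The following is only a minimal sketch under stated assumptions. The `toy_web` graph and the function names are hypothetical illustrations, not data or code from the lecture, and a real study would have to work from a sampled crawl.

```python
from collections import deque


def link_distances(graph, start):
    """Breadth-first search: fewest links needed to reach each page from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, ()):
            if target not in dist:  # first visit is the shortest route
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def max_link_distance(graph):
    """Largest shortest-path length over all reachable ordered pairs of pages."""
    return max(max(link_distances(graph, page).values()) for page in graph)


# Hypothetical toy Web: each key is a page, each value lists the pages it links to.
toy_web = {
    "home": ["news", "blog"],
    "news": ["article"],
    "blog": ["article", "home"],
    "article": ["home"],
}

print(link_distances(toy_web, "home"))
print(max_link_distance(toy_web))
```

On this toy graph the longest shortest route is from "news" to "blog" (via "article" and "home"). The same measurement on a large crawl is what would support or refute the "fewer than 10 links" hypothesis.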
30. The proper discipline of interest is not only Web Science but “Web Science and Technology”
31. Web’s Relation with Entrepreneurship Implication Web Science represents a pretty big next step in the evolution of information. This kind of research is likely to have a lot of influence on the next generation of researchers, scientists and most importantly, the next generation of entrepreneurs who will build new companies from this. – Eric Schmidt, Ex-CEO, Google Inc.
32. For Pakistan Web Science and Technology The job market is dominated by Web solution technologies Remote industry (e.g., Google, Yahoo, Microsoft) is investing heavily in it Business is gaining a sizeable share from the Web Social media reaches far more people than traditional media Implication
33. Course Objectives Have insight on the future direction of the Web How technological changes affect the Web as a system Learn design principles for complex Web applications and systems Prepare for the new era of Web science and technology
34. Course Information Instructors Muhammad Atif Qureshi, Arjumand Younus Class Hours Saturdays 6:00 pm to 8:15 pm Office Hours Mondays 1:00 pm to 3:00 pm Evaluation Assignments (50%) Mid-Term Exam (30%) Research Project (20%)
35. Course Organization Session One Information Retrieval Session Two Large-Scale Web Mining Session Three Social Web Mining
36. Information Retrieval Principles and Theories behind Web Search Engines Basic IR models, data structures and algorithms Topic-based models Link-based ranking Search engine architecture
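One of the basic IR data structures this session refers to is the inverted index, which maps each term to the set of documents containing it. The sketch below is an assumption-laden illustration rather than the course's own material: the sample documents, the whitespace tokenizer, and the function names are all hypothetical, and a real search engine adds ranking, stemming, and compressed posting lists on top.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # naive whitespace tokenizer
            index[term].add(doc_id)
    return index


def search_all_terms(index, query):
    """Boolean AND retrieval: ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample collection.
docs = {
    1: "web science studies the web",
    2: "search engines index the web",
    3: "social networks and web science",
}
index = build_inverted_index(docs)

print(search_all_terms(index, "web science"))
```

Looking up posting sets and intersecting them avoids scanning every document per query, which is the main reason search engines are built around this structure.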
37. Large-Scale Web Mining MapReduce Design Patterns Big data Larger amount of data means useful applications Algorithms using MapReduce Distributed File Systems (GFS) There is substantial promise in this new paradigm of computing, but unwarranted hype by the media and popular sources threatens its credibility in the long run. In some ways, cloud computing is simply brilliant marketing – Jimmy Lin, Twitter Scientist and Maryland Professor
38. Social Web Mining Social Web Crawling Mining for Information in Social Networks Trend analysis Dynamics and evolution patterns Temporal analysis Community detection and analysis Social Search
39. EXAMPLE OF WEB SCIENCE PROJECT: Diff-IE (courtesy Jaime Teevan, Microsoft Research)
Nigel Shadbolt – Prof. at Univ. of Southampton who initiated the Web Science program in collaboration with MIT
Processing the enormous quantities of data necessary for these advances requires large clusters, making distributed computing paradigms more crucial than ever. MapReduce is a programming model for expressing distributed computations on massive datasets and an execution framework for large-scale data processing on clusters of commodity servers. The programming model provides an easy-to-understand abstraction for designing scalable algorithms, while the execution framework transparently handles many system-level details, ranging from scheduling to synchronization to fault tolerance. MapReduce+Cloud Computing Debate
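The MapReduce programming model described in these notes can be illustrated with the classic word-count example. What follows is a single-process sketch, assuming nothing about any particular framework: the function names and sample documents are hypothetical, and the `shuffle` step stands in for the grouping that a real execution framework performs across machines, along with scheduling and fault tolerance.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Mapper: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reducer: combine all values for one key into a single total."""
    return word, sum(counts)


def word_count(docs):
    """Run map, shuffle, and reduce in sequence inside one process."""
    mapped = (
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    )
    grouped = shuffle(mapped)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


# Hypothetical input: document id -> document text.
sample_docs = {"a": "the web is the web", "b": "web science"}
print(word_count(sample_docs))
```

Because each mapper call sees only one document and each reducer call sees only one key, both phases can run in parallel on separate machines, which is what makes the abstraction scale.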