Lecture 6: How do we study the Social Web (2013)

Transcript

  • 1. Social Web, Lecture VI. How can we STUDY the Social Web?: The Web Science. Lora Aroyo, The Network Institute, VU University Amsterdam (based on slides from Les Carr, Nigel Shadbolt). Monday, March 11, 13
  • 2. The Web: the most used and one of the most transformative applications in the history of computing, e.g. how the Social Web has transformed the world's communication • approximately 10^10 people • more than 10^11 web documents
  • 3. Web is NOT a Thing • it’s not a verb, or a noun • it’s a performance, not an object • co-constructed with society • activity of individuals who create interlinked content that both reflects and reinforces the interlinkedness of society and social interaction ... and a record of that performance
  • 4. The Web: a great success as a technology, built on significant computing infrastructure, but as an entity surprisingly unstudied
  • 5. Science & Engineering • physical science: analytic discipline to find laws that generate or explain observed phenomena • CS is mainly synthetic: formalisms & algorithms are created to support specific desired behaviors • Web Science: web needs to be studied & understood as a phenomenon but also to be engineered for future growth and capabilities
  • 6. Simple micro rules give rise to complex macro phenomena • at microscale, an infrastructure of artificial languages and protocols: a piece of engineering • however, the interaction of people creating, linking and consuming information generates the Web’s behavior as emergent properties at macroscale • these properties require new analytic methods to be understood • some properties are desirable and are to be engineered in, others are undesirable and, if possible, engineered out
  • 7. A new way of software development • software applications are designed based on appropriate technology (algorithm, design) and with an envisioned social construct • usually tested in the small, testing microscale properties • a macrosystem evolving from people using the microsystem and interacting in often unpredicted ways is far more interesting and must be analyzed in different ways • the macrosystems also exhibit challenges that do not exist at microscale
  • 8. Evolution of Search Engines 1: techniques were designed to rank documents 2: people gamed the algorithms to improve their search rank 3: search technologies were adapted to defeat this influence
  • 9. The Web Graph • to understand the web, in good CS tradition, we look at the graph • nodes are web pages (HTML) • edges are hypertext links between nodes • first analyses showed that in-degree and out-degree follow a power-law distribution, which was shown to hold for large samples • this gave insight into the growth of the web
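The graph view on this slide can be made concrete with a small sketch. It assumes a made-up five-page toy web (`toy_edges` is purely illustrative, not data from the lecture) and only tabulates how many pages have each in-degree and out-degree; checking for a power law would need a far larger sample and a proper statistical fit.

```python
from collections import Counter


def degree_distributions(edges):
    """Return (in_hist, out_hist): number of pages having each in-/out-degree."""
    in_deg = Counter()
    out_deg = Counter()
    nodes = set()
    for src, dst in edges:
        out_deg[src] += 1   # one more outgoing hyperlink from src
        in_deg[dst] += 1    # one more incoming hyperlink to dst
        nodes.update((src, dst))
    # Pages missing from a Counter have degree 0, so they are counted too.
    in_hist = Counter(in_deg[n] for n in nodes)
    out_hist = Counter(out_deg[n] for n in nodes)
    return in_hist, out_hist


# Hypothetical toy web: most links point at page "a".
toy_edges = [("b", "a"), ("c", "a"), ("d", "a"), ("e", "a"), ("a", "b"), ("c", "b")]
in_hist, out_hist = degree_distributions(toy_edges)

for degree in sorted(in_hist):
    print(f"in-degree {degree}: {in_hist[degree]} page(s)")
```

Even in this tiny example the in-degree counts are skewed: most pages receive no links while one page receives most of them, which is the shape the power-law observation describes at scale.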
  • 10. Search Algorithms • the Web graph is also the basis of algorithms for search engines: • HITS or PageRank assume that inserting a hyperlink symbolizes an endorsement of the authority of the page linked to
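As an illustration of the "hyperlink as endorsement" idea, here is a minimal power-iteration sketch of PageRank. The damping factor 0.85, the iteration count, the dangling-page handling and the four-page `toy_links` graph are all assumptions chosen for the example; this is not how any production search engine is implemented.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly over its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outgoing links): spread its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: three pages link to "a", and "a" links to "b".
toy_links = {"a": ["b"], "b": ["a"], "c": ["a"], "d": ["a"]}
ranks = pagerank(toy_links)

for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

Page "a" ends up with the highest score because it collects the most endorsements, and "b" inherits much of that score through the single link from "a".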
  • 11. User State is Important • the original Web graph is too simple: it starts from quasi-static HTML • for personalization or customization, different representations (of sources) may be served to different requesters, e.g. via cookies • graph-based models often do not account for this sort of user-dependent state, and are not fit for all the information behind the servers in the Deep Web • the basis for defining nodes in the graph is no longer a simple HTTP GET, but an HTTP POST or an HTTP GET with a complex URI • URIs that carry user state are heavily used in Web applications, but are not in the model and are largely unanalyzed
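A small sketch of why user state complicates the graph model: if every distinct URI is a node, two requests for the same resource that differ only in session data become two nodes. The parameter names in `STATE_PARAMS` and the example.org URIs are hypothetical; real applications encode state in many other ways (cookies, POST bodies) that this sketch does not cover.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters treated as user state; real sites vary.
STATE_PARAMS = {"sessionid", "user", "token"}


def node_id(uri, strip_state=True):
    """Derive a graph-node identifier from a URI, optionally dropping user state."""
    parts = urlsplit(uri)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if strip_state:
        query = [(k, v) for k, v in query if k.lower() not in STATE_PARAMS]
    query.sort()  # make the identifier independent of parameter order
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, urlencode(query), ""))


u1 = "http://example.org/shop?item=42&sessionid=abc"
u2 = "http://example.org/shop?sessionid=xyz&item=42"

print(node_id(u1, strip_state=False))
print(node_id(u2, strip_state=False))
print(node_id(u1), node_id(u2))
```

With state kept, the two requests map to different nodes; with state stripped, they collapse into one. Which choice is right depends on whether the served representation actually differs per user, which the URI alone cannot reveal.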
  • 12. According to Google, each day 20-25% of searches have not been seen before, i.e. they generate a new identifier and thus a new node in the graph • more than 20 million new links per day, 200 per second • do they follow the same power laws & growth models?
  • 13. validating such models is hard • exponential growth of content • changes in number & power of servers • increasing diversity in users • According to Google, each day 20-25% of searches have not been seen before, i.e. they generate a new identifier and thus a new node in the graph • more than 20 million new links per day, 200 per second • do they follow the same power laws & growth models?
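A quick arithmetic check of the per-second figure, assuming the new links are spread evenly over a 24-hour day (an assumption made here, not stated on the slide):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(events_per_day):
    """Average rate per second, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


links_per_second = per_second(20_000_000)
print(round(links_per_second, 1))
```

Twenty million links per day works out to about 231 per second, so the slide's "200 per second" is a reasonable round figure of the same order.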
  • 14. Social Web Sites • modern websites (on the social web) have large script systems running in the browser and store personal information • many Social Web sites are not part of the (open) graph model • do these systems show a similar behavior (macro)? • are they stable? • are they fair? • do they need to be regulated? • are the access restrictions for personal information assured? • there is a need for understanding and intervening/engineering
  • 15. Wikipedia • purely mathematical (technology-based) models do not capture the whole story • the Wikipedia structure (link labels) shows a Zipf-like distribution, just like other tag-based systems • Wikipedia is built on MediaWiki software • but other MediaWiki-based applications did not generate such significant use • a purely technological explanation cannot account for this • it must be related to the social model of how Wikipedia is organized • this is referred to as the dynamics of a social machine (already in TBL’s original vision of the WWW)
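One rough way to check whether label counts look "Zipf-like" is to fit a straight line to log(frequency) against log(rank); a slope near -1 is the classic Zipf case. The sketch below uses synthetic counts generated as 1000 / rank, not Wikipedia data, and a plain least-squares fit, which is a simplification of how such distributions are usually tested.

```python
import math


def zipf_slope(counts):
    """Least-squares slope of log(frequency) versus log(rank).

    Assumes at least two counts and that every count is positive.
    """
    freqs = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic (made-up) label counts that follow frequency = 1000 / rank.
synthetic_counts = [1000 / rank for rank in range(1, 51)]
slope = zipf_slope(synthetic_counts)
print(round(slope, 3))
```

For real tag or link-label counts the fitted slope would only approximate a constant, and the fit says nothing about why the distribution arises, which is the slide's point about social rather than purely technological explanations.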
  • 16. Collective Intelligence • why do people contribute? • how to maintain the connected content? • how are trust & provenance represented, maintained and repaired on the Web?
  • 17. Collective Intelligence
    Motivation | Example | Mean
    Fun | “Writing in Wikipedia is fun” | 6.10
    Ideology | “I think information should be free” | 5.59
    Values | “I feel it’s important to help others” | 3.96
    Understanding | “Writing in Wikipedia allows me to gain a new perspective on things” | 3.92
    Enhancement | “Writing in Wikipedia makes me feel needed” | 2.97
    Protective | “By writing in Wikipedia I feel less lonely” | 1.97
    Career | “I can make new contacts that might help my career” | 1.67
    Social | “People I am close to want me to write in Wikipedia” | 1.51
  • 18. Social Machines • today’s interactive applications are very early social machines, limited by being largely isolated from one another • more effective social machines can be expected • social processes in society interlink, so they should also interlink on the web • technology is needed to allow user communities to construct, share & adapt social machines, to get success through trial, use & refinement
  • 19. Next Generation Social Machines • what are the fundamental theoretical properties of social machines, and what algorithms are needed to create them? • what underlying architectural principles are needed to effectively engineer new web components for this social software? • how can we extend the current web infrastructure with mechanisms that make the social properties of information sharing explicit and conform to relevant social-policy expectations? • how do cultural differences affect the development and use of social mechanisms?
  • 20. Modeling the Social Machines • trustworthiness, reliability or silent expectations about use of information • privacy, copyright, legal rules • we lack structures for formally representing & reasoning over such properties • thus, without scalable models for these issues it is hard to help the web go in the best possible direction
  • 22. L.A. Carr, C.J. Pope, W. Hall, N.R. Shadbolt http://webscience.ecs.soton.ac.uk/
  • 23. Web Science is about additionality: not the union of disciplines, but intersection
  • 24. Society is Diverse: different parts of society have different objectives and hence incompatible Web requirements, e.g. openness, security, transparency, privacy
  • 25. Understanding the Socio-Cultural • POWER DISTANCE: The extent to which power is distributed equally within a society and the degree to which society accepts this distribution. • UNCERTAINTY AVOIDANCE: The degree to which individuals require set boundaries and clear structures. • INDIVIDUALISM vs COLLECTIVISM: The degree to which individuals base their actions on self-interest versus the interests of the group. • MASCULINITY vs FEMININITY: A measure of a society’s goal orientation. • TIME ORIENTATION: The degree to which a society does or does not value long-term commitments and respect for tradition.
  • 26. Understanding the variation • Ecology of the Web - structure of the environment, producers and consumers • Populations (individuals and species), traits/characteristics, heredity, genotypes and phenotypes • Mechanisms - variation (mutation, migration, genetic drift), selection • Outcomes - adaptation, co-evolution, competition, co-operation, speciation, extinction
  • 27. Understanding the variation • Ecology of the Web - structure of the environment, producers and consumers • Populations (individuals and species), traits/characteristics, heredity, genotypes and phenotypes • Mechanisms - variation (mutation, migration, HGT, genetic drift), selection • Outcomes - adaptation, co-evolution, competition, co-operation, speciation, extinction
  • 30. but How to do the Science?
  • 31. it’s relationships, stupid! not attributes • “All the world’s a net” by David Cohen, April, 2002 • May, 2007
  • 32. Leveraging recent advances in: • Theories: about the social motivations for creating, maintaining, dissolving and re-creating links in multidimensional networks and about emergence of macro-structures • Data: Semantic Web/Web 2.0 provide the technological capability to capture, store, merge, and query relational metadata needed to more effectively understand and enable communities • Methods: qualitative and quantitative methods to enable theoretically grounded network predictions • Computational infrastructure: Cloud computing and petascale applications are critical to face the computational challenges in analyzing the data
  • 33. Network Analysis • is about linking social actors, e.g. systematically understanding and identifying connections • uses empirical data • draws on graphic imagery • relies on mathematical/computational models • Jacob Moreno - one of the founders of social network analysis; produced some of the earliest graphical depictions of social networks (1933)
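As one example of the "mathematical/computational models" mentioned here, the sketch below computes degree centrality (the fraction of other actors each actor is directly tied to) for a tiny undirected network. The actors and ties are invented for illustration, and degree centrality is just one of many measures used in social network analysis.

```python
from collections import defaultdict


def degree_centrality(ties):
    """Degree centrality for an undirected network given as (actor, actor) pairs.

    Assumes at least one tie between two distinct actors.
    """
    neighbours = defaultdict(set)
    for a, b in ties:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    # Normalise by the maximum possible number of ties per actor, n - 1.
    return {actor: len(peers) / (n - 1) for actor, peers in neighbours.items()}


# Hypothetical sociogram: "ann" is tied to everyone else.
ties = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]
centrality = degree_centrality(ties)

for actor, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(actor, round(score, 2))
```

The measure depends only on who is connected to whom, not on any attribute of the actors themselves, which is the relational perspective this part of the lecture emphasises.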
  • 34. Think Networks! Albert-László Barabási: Linked: The New Science of Networks (April, 2002) • everything is connected to everything else • networks are pervasive - from the human brain to the Internet to the economy to our group of friends • networks have an underlying order and follow simple laws • "new cartographers" are mapping networks in a wide range of scientific disciplines • social networks, corporations, and cells are more similar than they are different • new insights into the interconnected world • new insights on robustness of the Internet, spread of fads and viruses, even the future of democracy
  • 35. NYT, 26 February 2007
  • 36. Networks: another perspective :-) • Social Networks: It’s not what you know, it’s who you know. • Cognitive Social Networks: It’s not who you know, it’s who they think you know. • Knowledge Networks: It’s not what you know, it’s what they think you know.
  • 37. Big Data Owners: Who can do macro analysis? • Google, Bing, Yahoo!, Baidu • Large scale, comprehensive data • New forms of research alliance • How Billions of Trivial Data Points can Lead to Understanding
  • 39. Open Data • common standards for release of public data • common terms for data where necessary • licenses - CC variants • exploitation & publication of distributed and decentralized information assets
  • 40. Web Observatory
  • 41. slides from: David De Roure
  • 42. slides from: David De Roure
  • 43. Web Science Reflections • Is the Web changing faster than our ability to observe it? • How to measure or instrument the Web? • How to identify behaviors and patterns? • How to analyze the changing structure of the Web?
  • 44. Big Bang: Web Information • the assumption of the open exchange of information is being imposed on society • do the Web, open access, open data, and scientific and creative commons offer a beneficial opportunity or a dangerous cul-de-sac?
  • 45. Open Questions • How is the world changing as other parts of society impose their requirements on the Web? e.g. current examples with SOPA/PIPA and ACTA: requirements for security and policing taking over free exchange of information and unrestricted transfer of knowledge • Are the public and open aspects of the Web a fundamental change in society’s information processes, or just a temporary glitch? e.g. are open source, open access, open science & creative commons efficient alternatives to fee-based knowledge transfer?
  • 46. Open Questions • do we take the Web for granted as a provider of free and unrestricted information exchange? • is Web Science the response to the pressure for the Web to change - to respond to the issues of security, commerce, criminality and privacy? • What are the challenges for Web science? • to explain how the Web impacts society? • to predict the outcomes of proposed changes to Web infrastructure on business & society?
  • 47. What can you do as a Computer Scientist? specifically for the Social Web
  • 48. Hands-on Teaser • Q&A on Assignments • Pitch of the Social Web Apps • image source: http://www.flickr.com/photos/bionicteaching/1375254387/