Social Information Access Tutorial at UMAP 2014


The power of the modern Web, often called the Social Web or Web 2.0, is frequently traced to users acting as contributors of various kinds of content through wikis, blogs, and resource-sharing sites. However, community power affects not only the production of Web content but also access to it. A number of research groups worldwide explore what we call social information access: techniques that help users get to the right information using “collective wisdom” distilled from the actions of those who worked with this information earlier.

Social information access can be formally defined as a stream of research that explores methods for organizing users' past interaction with an information system (known as explicit and implicit feedback), in order to provide better access to information to the future users of the system. It covers a range of rather different systems and technologies from social navigation to collaborative filtering. An important feature of all social information access systems is self-organization. Social information access systems are able to work with little or no involvement of human indexers, organizers, or other kinds of experts. They are truly powered by a community of users. Due to this feature, social information access technologies are frequently considered as an alternative to the traditional (content-oriented) technologies. The goal of this tutorial is to provide an overview of the emerging social information access research stream and to provide some practical guidelines for building social information access systems.

Published in: Technology

  1. 1. Social Information Access Peter Brusilovsky with Rosta Farzan, Jaewook Ahn, Sharon Hsiao, Denis Parra, Michael Yudelson, Chirayu Wongchokprasitti, Sherry Sahebi School of Information Sciences University of Pittsburgh
  2. 2. Personalized and Adaptive Web Systems Lab (at UMAP 2012)
  3. 3. The New Web: the Web of People
  4. 4. Web 2.0: Fast Start, Broad Spread • Term was introduced following the first O'Reilly Media Web 2.0 conference in 2004 • By September 2005, a Google search for Web 2.0 returned more than 9.5 million results • In 2013 similar search returned over 1390 million results
  5. 5. Social Web Web 2.0 Social Web or Web 2.0?
  6. 6. The Social Web
  7. 7. Key Elements • The Users’ Web • Collective Intelligence: Wisdom of Crowds • The power of the user • Applications powered by user community • Stigmergy • User as a first-class participant, contributor, author
  8. 8. Amazon: Reviews and ratings
  9. 9. eBay: Driving a marketplace
  10. 10. Wikipedia: Providing content
  11. 11. Delicious: Sharing + Organization
  12. 12. Social Linking: Identity + Links
  13. 13. Publish Your Self: [Micro]Blogs
  14. 14. The Other Side of the Social Web User content User interaction Which wisdom of crowds?
  15. 15. Social Information Access Methods for organizing users’ past interaction with an information system (known as explicit and implicit feedback), in order to provide better access to information to the future users of the system
  16. 16. Critical Questions • What kind of past interaction to take into account? • How to process it to produce “wisdom of crowds” ? • In which context to reveal it to end users? • How to make wisdom of crowds useful in this context?
  17. 17. Social Information Access: Contexts Social Navigation – Social support of user browsing Social Recommendation (Collaborative Filtering) – Proactive information access Social Search – Social support of search Social Visualization – Social support for visualization-based access to information Social Bookmarking – Access to bookmarked/shared information facilitated with tags
  18. 18. What is Social Search? - Social Information Access in Search context - A set of techniques focusing on: • collecting, processing, and organizing traces of users’ past interactions • applying this “community wisdom” in order to improve search-based access to information
  19. 19. Variables Defining Social Search I Which users? • Creators • Consumers What kind of interaction is considered? • Browsing • Searching • Annotation • Tagging
  20. 20. Variables Defining Social Search II Which subset of the interactions is used to assist the current user? • All • Group • Selected Peers (similar, socially connected, etc) What kind of search process improvement? • Off-line improvement of search engine performance • On-line user assistance
  21. 21. The Case of Google PageRank Which users? Which activity? What is affected? How it is affected? How it improves search?
  22. 22. How Search Could be Changed? Let’s classify potential impact by stages Before search During search After search
  23. 23. Improving Search Engine Work Search Engine = Crawling + Indexing + Ranking Can we improve crawling? Can we improve indexing? Can we improve ranking?
  24. 24. Improving Indexing What is the problem with the classic approach to indexing? How indexing can be improved?
  25. 25. Social Indexing: Improve Finding Use social data to expand document index (document expansion) What we can get from page authors? Anchor text provided on a link to the page What we can get from searchers? Page selection in response to the query (Scholer, 2002) Query sequences (Amitay, 2005) What we can get from other page visitors? Page annotations (Dmitriev et al., 2006) Page tags (Yanbe, 2007)
  26. 26. Search Engines: Improve Ranking What we can get from page authors? Links (Page Rank) What we can get from searchers? Page selection in response to the query (DirectHit) What we can get from page visitors beyond search context? Page visit count Page tags (Yanbe, 2007; Bao, 2007) Page annotations Combined approaches PageRate (Zhu, 2001), (Agichtein, 2006)
  27. 27. Using Social Wisdom Before Search Can be done by both search engines and external interfaces Query checking - now standard Suggesting improved/related queries Example: query networks (Glance, 2001) Automatic query refinement and query expansion Using past queries and query sequences - what the user is really looking for (Fitzpatrick, 1997; Billerbeck, 2003; Huang, 2003) Using anchors (Kraft, 2004) Using annotations, tags
  28. 28. Using Social Wisdom After Search Better ranking, link promotion • Link re-ordering using social wisdom (based on the result selection traces by earlier searchers) Suggesting additional results • Suggest results (or sites!) found by earlier searchers (link generation) Providing social annotations of search results • Link popularity • Past link selection by socially connected users
  29. 29. Challenges of Social Search • Matching similar users • Number of page hits is not reliable (DirectHit failure) • Using “everyone” social data is a bad idea: what is needed is not pages that are good overall, but pages that match the query • Even matching with users who issue the same query is not reliable enough – same query, very different goals! • Reliability of social feedback • A click on a result link is not reliable evidence of quality and relevance • Need careful mining of search sessions and sequences • Fusing query relevance and social wisdom • Single ranking is not the best way to express two dimensions of relevance
  30. 30. Some Advanced Approaches Improving precision by considering more similar users “Quest” approach Community-based search Combining community-based search and navigation Adjusting the precision to the quality of data Site-level recommendation
  31. 31. AntWorld: Quest-Based Approach – Quests establish similarities between users – Relevance between documents and quests is provided by explicit feedback
  32. 32. Quest Approach to Social Search Quests establish similarities between users Relevance between documents and quests is provided by explicit feedback Search is enhanced by cross-recommending or stressing good documents
  33. 33. Evaluation of Quest approach SERF (Jung, 2004) – Results with recommendations were shown on over 40% searches. – In about 40% of cases the users clicked and 71.6% of these clicks were on recommended links! If only Google results are shown users clicked in only 24.4% of cases – The length of the session is significantly shorter (1.6 vs 2.2) when recommendations are shown – Ratings of the first visited document are higher if it was recommended (so, appeal and quality both better)
  34. 34. Quest-Based Approach What is good/important? Critique?
  35. 35. I-SPY: Community-Based Search
  36. 36. I-SPY: Mechanism Community-query-hit matrix User similarity defined by communities and queries Result selection provides implicit feedback
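The mechanism on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the I-SPY implementation: it assumes that the relevance of a page for a query within one community is the share of that community's past result selections for the query that went to the page. The class and method names are hypothetical.

```python
from collections import defaultdict


class CommunityHitMatrix:
    """Hypothetical query-hit matrix kept for a single community."""

    def __init__(self):
        # hits[query][page] = how often the page was selected for the query
        self.hits = defaultdict(lambda: defaultdict(int))

    def record_selection(self, query, page):
        """Store implicit feedback: a community member clicked this result."""
        self.hits[query][page] += 1

    def relevance(self, query, page):
        """Share of past selections for the query that went to the page."""
        row = self.hits.get(query)
        if not row:
            return 0.0
        return row.get(page, 0) / sum(row.values())

    def promote(self, query, results):
        """Re-order engine results; ties keep the engine's original order."""
        return sorted(results, key=lambda page: -self.relevance(query, page))


def demo_ispy():
    matrix = CommunityHitMatrix()
    for page in ["b", "b", "c"]:
        matrix.record_selection("jaguar", page)
    return matrix.promote("jaguar", ["a", "b", "c"]), matrix.relevance("jaguar", "b")
```

Keeping one matrix per community is what makes the same query ("jaguar") promote different pages for, say, a car club and a wildlife group.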
  37. 37. I-SPY: Proxy Version
  38. 38. I-SPY Approach What is good/important? Critique?
  39. 39. From I-SPY to HeyStaks: Folders
  40. 40. Site-level Social Search • Moving from single query to query sequences • What the user selected at the end • Moving from page recommendation to site recommendation White, R., Bilenko, M., and Cucerzan, S. (2007) Studying the use of popular destinations to enhance web search interaction. In: SIGIR '07, Amsterdam, The Netherlands, July 23 - 27, 2007, ACM Press, pp. 159-166
  41. 41. Site-level search What Kinds of Social Wisdom? How Social Wisdom is Used?
  42. 42. What is Social Navigation? - Social Information Access in Web/hyperspace browsing context - A set of techniques focusing on: • collecting, processing, and organizing traces of users’ past interactions • applying this “community wisdom” in order to help future users to navigate hyperspace
  43. 43. Social Navigation in Real World You are in Japan and it's lunchtime. There are many food places around, but you do not know how good they are or what these places are actually offering. All signs are in Japanese. “…without knowing much, we joined the longest existing queue formed for a sushi restaurant. looking at faces of people (both young and old) filled with expectations despite the long wait in the cold weather, we were sure that the food would be worth every minute of waiting time. well, it was”. (A comment on Flickr image, used in Rosta Farzan’s Thesis)
  44. 44. Social Navigation in Real World What would you do…? • Walking by the cinema you feel like watching a movie, but none of the movies seems familiar • You missed a lecture and want to do your readings. You have a textbook and 100 assigned pages to read, but do not know what was most important in the lecture and what can be skipped • You are hiking along a trail to a famous waterfall. You reached an unmarked road split and you have no map
  45. 45. Social Navigation: The Motivation • Natural tendency of people to follow each other Making use of “direct” and “indirect” cues about the activities of others Following trails Footsteps in sand or snow Worn-out carpet Using dog-ears and annotations Giving direction or guidance • Navigation driven by the actions from one or more “advice providers”
  46. 46. Navigation Support: Predefined or Social Walking down a path in forest Walking down a road in a city Reading a sign at the airport to find the baggage claim Talking to a person at the airport help desk to find the baggage claim
  47. 47. The Lost Interaction History What is the difference between walking in a real world and browsing the Web? – Footprints – Worn-out carpet – People presence What is the difference between buying and borrowing a book? – Notes in the margins – Highlights & underlines – Dog-eared pages – Opens more easily to more used places
  48. 48. Edit Wear and Read Wear (1992) The pioneering idea of asynchronous indirect social navigation Developed for collaborative writing and editing Indicated read/edited places in a large document
  49. 49. Footprints (1997) Wexelblat & Maes, 1997 Allowing users to create history-rich objects Providing history-rich navigation in complex information space Showing what percentage of users have followed each link
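The last point, showing what percentage of users followed each link, can be illustrated with a short sketch. It is not the Footprints code: the input format and function name are hypothetical, and a page's visitors are approximated here as the users who followed at least one link from it, so users who left without clicking are not counted.

```python
from collections import defaultdict


def link_follow_percentages(trails):
    """Percentage of a page's visitors who followed each outgoing link.

    trails: dict mapping a user id to a list of (source_page, target_page)
    link traversals (a hypothetical log format).
    """
    visitors = defaultdict(set)   # page -> users seen leaving the page by a link
    followers = defaultdict(set)  # (page, target) -> users who took that link
    for user, steps in trails.items():
        for source, target in steps:
            visitors[source].add(user)
            followers[(source, target)].add(user)
    return {
        link: 100.0 * len(users) / len(visitors[link[0]])
        for link, users in followers.items()
    }
```

Counting distinct users rather than raw clicks keeps one very active user from dominating the cue.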
  50. 50. Juggler (1998) Dieberger, 1998 Textual virtual environment (MOO) History-enriched environment Showing access-counter for rooms Recognizing URLs in the output of a communication tool Hiding it from user Popping out the page Integrating with social navigation Supporting interaction between teachers and students
  51. 51. Ideas for Social Navigation on WWW Awareness of presence of other users – Discussion of an article – Location attracting large crowds of users Relevant objects – Links visited by similar users – Items appreciated by similar users Recency – How long ago the page was created/visited Attitude – What other users did/thought about an item
  52. 52. Example: CoWeb
  53. 53. SN in Information Space: The History History-enriched environments – Edit Wear and Read Wear (1992) – Social navigation systems • Footprints, Juggler, Kalas Collaborative filtering – Manual push and pull • Tapestry, LN Recommender – Modern automatic CF recommender systems Social bookmarking – Collaborative tagging systems Social Search
  54. 54. Social Navigation in Information Space Synchronous: communication in real time. Asynchronous: using the interaction of past users. Direct: direct communication between people. Indirect: relying on user presence and traces of user behavior. Examples by quadrant: direct synchronous: chats; direct asynchronous: recommenders, Q/A systems; indirect synchronous: presence of other people; indirect asynchronous: history-enriched environments
  55. 55. EDUCO: Synchronous, Indirect SN
  56. 56. Direct Asynchronous SN Asynchronous discussion forums Recommending information to friends and community Directly asking questions for getting information Sharing bookmarks with others
  57. 57. Comtella: Direct Asynchronous SN
  58. 58. CourseAgent: Direct, Asynchronous • Adaptive community-based course planning system – Provides personalized access to course information – Provides social recommendation about courses • Recommendation in the form of in-context adaptive annotation – Visual cues • Expected course workload • Expected relevance to students’ career goals – Course Schedule – Course Catalog
  59. 59. CourseAgent: Direct, Asynchronous • Adaptive community-based course planning system –Provides social navigation through visual cues
  60. 60. Ratings: Raw Social Data
  61. 61. Generating Social Navigation Overall workload: averaging over all ratings of the community. Overall relevance: average does not work (a course can be irrelevant to many but very relevant to one); goal-centered algorithm, 16 rules
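The contrast on this slide can be made concrete with a small sketch. It only illustrates why a community-wide average suits workload but hides goal-specific relevance; it does not reproduce the 16-rule goal-centered algorithm, which the slide does not spell out. The rating fields (`goal`, `workload`, `relevance`) and function names are assumptions.

```python
from collections import defaultdict
from statistics import mean


def overall_workload(ratings):
    """Community-wide average of the workload ratings for one course."""
    return mean(r["workload"] for r in ratings)


def relevance_by_goal(ratings):
    """Average relevance separately for each career goal (illustrative only)."""
    by_goal = defaultdict(list)
    for r in ratings:
        by_goal[r["goal"]].append(r["relevance"])
    return {goal: mean(values) for goal, values in by_goal.items()}


# Hypothetical ratings of one course by three students.
sample_ratings = [
    {"goal": "librarian", "workload": 2, "relevance": 0},
    {"goal": "librarian", "workload": 3, "relevance": 0},
    {"goal": "web developer", "workload": 4, "relevance": 4},
]
```

With these sample ratings the plain relevance average is about 1.3, which suggests a weak course for everyone, while the per-goal view shows it is highly relevant to the one student aiming at web development.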
  62. 62. Trade-offs for Direct Approach • Reasonably reliable • Feedback directly provided • No need to deduce and guess • Explicit feedback is hard to obtain • Takes time to provide and requires commitment • “One out of a hundred” • A social system that relies extensively on explicit feedback needs either a large community of users or special approaches to motivate direct contributions
  63. 63. Direct Asynchronous SN Asynchronous discussion forums Recommending information to friends and community Directly asking questions for getting information Sharing bookmarks with others
  64. 64. Motivation for Direct SN Extrinsic Motivation Adding some external reasons to contribute Recognition (badges, status – busy bee) Social comparison, social gaming Incentives (more power, functions) Intrinsic Motivation Encouraging natural reasons to contribute Personal values Engineering the system to turn contribution into user-beneficial actions
  65. 65. Comtella Stars Extrinsic
  66. 66. Career Planning: Intrinsic Motivation
  67. 67. The Intrinsic Motivation Works • Career Planning was not advertised and was not noticed and used by half of the students • Contribution of experimental users who did not use Career Planning (experimental group I) is close to control group • Significant increase of all contributions for those who had and used Career Planning (experimental group II)
  68. 68. More about CourseAgent Farzan, R. and Brusilovsky, P. (2006) Social navigation support in a course recommendation system. In: V. Wade, H. Ashman and B. Smyth (eds.) Proceedings of 4th International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems (AH'2006), Dublin, Ireland, June 21-23, 2006, Springer Verlag, pp. 91-100. Farzan, R. and Brusilovsky, P. (2011) Encouraging User Participation in a Course Recommender System: An Impact on User Behavior. Computers in Human Behavior 27 (1), 276-284.
  69. 69. Amazon: Asynchronous, Indirect • Compare with direct SN in Amazon ratings or reviews: “the remake of this movie is horrible, I recommend to watch the original version instead” Traces of viewing and purchasing decisions are valuable collective wisdom!
  70. 70. Knowledge Sea II: Asynchronous, Indirect •Social Navigation to support course readings
  71. 71. Knowledge Sea II (+ AnnotatEd)
  72. 72. Trade-off for indirect approach • Feedback is easy to get • Users provide feedback simply by navigating and doing other regular actions • It works quite well • Most useful pages tend to rise as socially important • Social navigation cues attract users • Indirect feedback might not be reliable • A click or other action in the interface is a small commitment, may be a result of error • “Tar pits” • Main challenge of systems based on indirect approach: increase the reliability of indirect feedback • Better processing of unreliable events (time, scrolling) • Use more reliable events (cf. browsing vs. purchase)
  73. 73. Knowledge Sea II: Beyond clicks • Make better use of existing feedback • Switched from click-based calculation of user traffic to time-based • Time and patterns can provide more reliable evidence • Added annotation-based social navigation • Annotations are more reliable • Users are eager to provide annotations and even categorize them into positive/regular
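The switch from click-based to time-based traffic can be sketched as follows. This is an illustration under stated assumptions, not the Knowledge Sea II calculation: the log format, the function names, and both thresholds (`min_seconds`, `cap_seconds`) are hypothetical.

```python
def click_traffic(visits):
    """Click-based traffic: every page visit counts the same."""
    counts = {}
    for page, _seconds in visits:
        counts[page] = counts.get(page, 0) + 1
    return counts


def time_traffic(visits, min_seconds=5.0, cap_seconds=300.0):
    """Time-based traffic: sum reading time, ignoring very short visits.

    Visits shorter than min_seconds are treated as accidental clicks, and
    each visit is capped at cap_seconds so an abandoned browser tab does
    not inflate a page's traffic. Both thresholds are assumed values.
    """
    totals = {}
    for page, seconds in visits:
        if seconds < min_seconds:
            continue
        totals[page] = totals.get(page, 0.0) + min(seconds, cap_seconds)
    return totals
```

A page that attracts many two-second visits looks popular under `click_traffic` but disappears under `time_traffic`, which is the kind of "tar pit" the indirect approach needs to filter out.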
  74. 74. Spatial Annotation Interface A Spatial Annotation Interface adds social navigation on the page level Staking a space Commenting (BooksOnline'08)
  75. 75. Page-level Navigation Support Visual cues: annotation background and border. Background style • Background filling: ownership • Background color: owner’s attitude Border style • Border color: positiveness • Border thickness: number of comments • Border stroke: public or personal
  76. 76. Annotation-based SN does work • Usage • With additional navigation support map-based and browsing-based access emerged as the primary access way • Effect on navigation • Significant increase of link following (pro-rated normalized access) • Impact • Annotation leads students to valuable pages
  77. 77. Back to Motivation Issue Annotations are explicit actions used for implicit feedback and, as with all explicit actions, they come with motivation problems.
  78. 78. More on KS-II and AnnotatEd Farzan, R. and Brusilovsky, P. (2005) Social navigation support through annotation-based group modeling. In: L. Ardissono, P. Brna and A. Mitrovic (eds.) Proceedings of 10th International User Modeling Conference, Berlin, July 24-29, 2005, Springer Verlag, pp. 463-472. Farzan, R. and Brusilovsky, P. (2008) AnnotatEd: A social navigation and annotation service for web-based educational resources. New Review of Hypermedia and Multimedia 14 (1), 3-32. Brusilovsky, P. and Kim, J. (2009) Enhancing Electronic Books with Spatial Annotation and Social Navigation Support. In: Proceedings of the 5th International Conference on Universal Digital Library (ICUDL 2009), Pittsburgh, PA, November 6-8, 2009.
  79. 79. CoMeT: Asynchronous, Indirect
  80. 80. Fighting for SN reliability in CoMeT • Broader set of evidence • View, annotate, tag, schedule talks, send to friends, connect to peers • Declare affiliations (similarity!) • Join and post links to a set of communities • Combining in-context (visual cues) and out-of-context (ranking) guidance • Exploring the power of “top N” • Powerful, but dangerous! • Handling of Top N in Conference Navigator
  81. 81. Known Problems and Challenges Concept drift Snowball effects Bootstrapping
  82. 82. Concept Drift Old history information becomes less relevant History decays differently for very popular and less popular information Shift of interest
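One common way to let old history lose influence is exponential decay, sketched below. The slide does not say which decay scheme the systems in this tutorial use, so this is an assumed technique; the function name and the `half_life_days` parameter are hypothetical.

```python
def decayed_traffic(visit_days, today, half_life_days=30.0):
    """Sum of visits where each visit's weight halves every half_life_days.

    visit_days: day numbers on which the page was visited.
    today: current day number; visits dated after today are ignored.
    """
    return sum(
        0.5 ** ((today - day) / half_life_days)
        for day in visit_days
        if day <= today
    )
```

Because history decays differently for very popular and less popular information, `half_life_days` could be chosen per item rather than globally, for example a longer half-life for rarely visited pages so that their few traces do not vanish.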
  83. 83. Snowball effect Just one visit before the current visit can turn the page ‘hot’ The page could be useful or useless Next users follow the same path Snowball gets bigger and bigger
  84. 84. Conference Navigator Project • Social conference support system – combining social and personalized guidance • Fighting snowball effect in top 20 page
  85. 85. Bootstrapping Social navigation works with many users What if there are very few users? How to match a new user against an already populated system? How to encourage users to leave their trails (commenting, …)? How to make new information visible in an already populated system?
  86. 86. Social Visualization • Using social wisdom in visualization-based access • Visualization provides a “big picture” of the information space in 2D or 3D and uses topology and visual cues to direct attention to the right information • Shares many aspects with social navigation, but focuses on a visual, holistic interface
  87. 87. Social Visualization with VIBE
  88. 88. Social Visualization in CN3 TalkExplorer: Integrating recommendations and SNS visually
  89. 89. Social Visualization in E-learning • Progressor and Progressor+ projects • Problem: guide students to most appropriate educational content – examples, problems, etc. • Using reliable indicators of student progress (problem solving success) • Provide visualization to better support guidance • Explore peer-based and community-based SNS
  90. 90. Parallel Introspective Views Hsiao, I.-H., Bakalov, F., Brusilovsky, P., and König-Ries, B. (2011) Open Social Student Modeling: Visualizing Student Models with Parallel Introspective Views. Proceedings of 19th International Conference on User Modeling, Adaptation, and Personalization, Girona, Spain, July 11-15, 2011, Springer-Verlag, pp. 171-182.
  91. 91. Progressor Hsiao, I.-H., Bakalov, F., Brusilovsky, P., and König-Ries, B. (2013) Progressor: social navigation support through open social student modeling. New Review of Hypermedia and Multimedia 19 (2), 112-131.
  92. 92. Students spent more time in Progressor+ Quiz: 5 hours Example: 5 hours 20 mins [Bar chart: total time spent (minutes) on quizzes and examples in QuizJET, JavaGuide, Progressor, and Progressor+. Quiz: 60.04, 150.19, 224.7, 296.9. Example: 69.52, 121.23, 110.66, 321.1.]
  93. 93. Students achieved higher Success Rate [Bar chart: success rate. QuizJET 42.63%, JavaGuide 58.31%, Progressor 68.39%, Progressor+ 71.20%; p<.01]
  94. 94. How Social Guidance Works [Figure labels: non-adaptive; adaptive; social, adaptive, single content; Progressor+]
  95. 95. Advancing SN: Beyond Click Clicks are not reliable signs of interest! What other kinds of user activities can be tracked? – Annotation – Bookmarking – Sending e-mail – Solving a problem – Downloading – Purchasing – Rating and liking
  96. 96. Advancing SN: One Size Fits All? Which users’ actions are taken into account for social navigation? – All users – Coherent, like-minded group of users Group-level social navigation – KnowledgeSea II, Progressor – a class – CourseAgent – users with similar goals – CoFIND – Facebook – social network – Amazon - context
  97. 97. SIA Challenges across Contexts • Increasing reliability of indirect sources • Time spent reading vs. simple click • Query sequences vs. simple result access • Adding more reliable evidences of relevance/quality/interests • Annotation vs. browsing • Purchasing/downloading vs. viewing • May add the problem of motivation! • Basis for user similarity (not “all for all”) • Co-rating in recommender systems (sparsity!) • Users with similar goals (CourseAgent) • Single class in Knowledge Sea II (still topic drift!) • Quest or community in AntWorld and iSpy
  98. 98. More Challenges: Merging the Technologies • Different branches of SIA have little connection to each other • Social navigation uses navigation data to assist navigation • Social search uses search traces to assist future searchers • Many opportunities to merge two or more SIA technologies • Social Web system with broader SIA • Use several kinds of user traces to support a specific SIA technology • Offer several kinds of SIA • Earlier work: Social Navigation + Social Search – ASSIST ACM – ASSIST YouTube • Social Navigation + Recommendation • Adding Social Visualization
  99. 99. Social Search with Visual Cues General annotation Question Praise Negative Positive Similarity score Document with high traffic (higher rank) Document with positive annotation (higher rank) Query relevance and social relevance shown separately: rank/annotation
  100. 100. Annotation-Based Search: Impact Acceptance – Users noticed and applied social visual cues • Frequency of usage - viewed more documents per query with social visual cues – Users agreed with the need for social search • Survey results Performance – Social visual cues are taken into account for navigation • Social navigation cues are twice as influential as high rank in affecting users' navigation decisions – Social visual cues are a better predictor of page quality than high rank More information – Ahn, J.-w., Farzan, R., and Brusilovsky, P. (2006) Social search in the context of social navigation. Journal of the Korean Society for Information Management 23 (2), 147-165.
  101. 101. ASSIST-ACM: Social Search + Nav Re-ranking result-list based on search and browsing history information Augmenting the links based on search and browsing history information Farzan, R., et al. (2007) ASSIST: adaptive social support for information space traversal. In: Proceedings of 18th conference on Hypertext and hypermedia, HT '07, pp. 199-208