Social Web 2.0 Class Week 8: Social Metadata, Ratings, Social Tagging


Week 8 slides from the class "Social Web 2.0" I taught at the University of Washington's Masters in Communication program in 2007. Most of the content is still very relevant today. Topics: Social metadata, ratings, and social tagging.

Published in: Education, Technology

  1. 1. Social Web 2.0 Implications of Social Technologies for Digital Media Shelly Farnham, Ph.D. Com 597 Winter 2007
  2. 2. Week 8 <ul><li>Social Metadata: from Rater Systems to Folksonomies </li></ul><ul><ul><ul><ul><li>Social Navigation, Social Tagging </li></ul></ul></ul></ul>
  3. 3. Class exercise <ul><li>You as a community of practice: </li></ul><ul><ul><li>Students/professionals in digital media </li></ul></ul><ul><li>Tag yourself </li></ul><ul><ul><li>Five most significant people you interact with, related to areas of interest in digital media </li></ul></ul><ul><ul><li>Five most significant organizations/events (e.g., where you work, events you go to, professional orgs you are a part of, volunteer orgs, projects you are a part of) </li></ul></ul><ul><ul><li>Five tag words that best express your areas of interest </li></ul></ul><ul><li>Note: will read aloud </li></ul>
  4. 4. Problem Statement <ul><li>Attention economy </li></ul><ul><ul><li>Wealth of information creates poverty of attention </li></ul></ul><ul><ul><li>Need to allocate attention efficiently </li></ul></ul><ul><li>Copious amounts of online content </li></ul><ul><ul><li>Much of it user generated content (UGC) </li></ul></ul><ul><ul><li>Social metadata layer </li></ul></ul><ul><ul><ul><li>Authorship (who said what) </li></ul></ul></ul><ul><ul><ul><li>Activity/history/transaction data (who did what) </li></ul></ul></ul><ul><ul><ul><li>Relationships, network/groups data (who knows who) </li></ul></ul></ul><ul><ul><ul><li>Semantic tagging (who said what it is about) </li></ul></ul></ul><ul><li>How can we use social metadata around content to help users spend their limited attention wisely? </li></ul>
  5. 5. [Diagram: social metadata between content and UI] Content: blogs, home pages, digital libraries, news, photos, music. Social metadata: authorship, user activity, tags, groups/networks, ratings, co-presence. Filters: collaborative filtering, prominence, network/group affinity (filtering by preference similarity, by social proximity, by highest rank). UI: search, hyperlinked browsing, readers, meme maps, tag clouds. User.
  6. 6. Challenges <ul><li>Keeping up with proliferation of content </li></ul><ul><ul><li>Very dynamic: constantly updating, changing </li></ul></ul><ul><li>Integrating data across sites </li></ul><ul><li>Extracting semantic meaning for digital objects </li></ul><ul><li>Holy grail: Semantic web </li></ul><ul><ul><li>Common formats for integration and combination of data drawn from diverse sources </li></ul></ul><ul><ul><li>Language for how data relates to real world objects </li></ul></ul><ul><ul><li>Actionable information, interoperability based on meaning </li></ul></ul><ul><ul><li>RDF (resource description framework), XML </li></ul></ul>
  7. 7. Social Navigation <ul><li>Assumption </li></ul><ul><ul><li>Where people spend their time is a good approximation of value </li></ul></ul><ul><li>User behavior guides us to interesting/relevant information </li></ul><ul><ul><li>Google: what do people link to? </li></ul></ul><ul><ul><li>BlueDot: what do people bookmark? </li></ul></ul>
  8. 8. Automated Recommendation Systems <ul><li>Collaborative Filtering </li></ul><ul><ul><li>Provide some information about your preferences </li></ul></ul><ul><ul><li>Get recommendations based on what else people who shared your preferences also liked </li></ul></ul><ul><li>Proximity/clustering analyses </li></ul><ul><ul><li>Objects that frequently occur near each other must be related </li></ul></ul>
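
The collaborative filtering idea on this slide can be sketched in a few lines. This is a minimal illustration, not a production recommender: the `likes` data, the `overlap` measure, and the function names are all assumptions made up for the example.

```python
from math import sqrt

# Hypothetical data: each user maps to the set of items they liked.
likes = {
    "ann": {"flickr", "delicious", "bluedot"},
    "bob": {"flickr", "delicious", "digg"},
    "cat": {"youtube"},
}

def overlap(a, b):
    """Cosine-style overlap between two sets of liked items (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

def recommend(user, likes):
    """Rank items the user has not liked yet, weighted by how similar
    the people who liked them are to this user."""
    scores = {}
    for other, items in likes.items():
        if other == user:
            continue
        weight = overlap(likes[user], items)
        if weight == 0:
            continue  # ignore users with no shared preferences
        for item in items - likes[user]:
            scores[item] = scores.get(item, 0.0) + weight
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda i: (-scores[i], i))

print(recommend("ann", likes))
```

Here "ann" shares two items with "bob" and none with "cat", so only bob's remaining item is suggested.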
  9. 9. Social Tagging <ul><li>Add tags to objects to bookmark </li></ul><ul><ul><li>Individual motivation, organizes in your list of favorites </li></ul></ul><ul><ul><li>Aggregated collective knowledge, importance emerges </li></ul></ul><ul><ul><ul><li>can see most popular tags, most popular items with same tags </li></ul></ul></ul><ul><li>A.k.a. </li></ul><ul><ul><li>User generated meta-data </li></ul></ul><ul><ul><li>Collaborative tagging </li></ul></ul><ul><ul><li>Ethnoclassification </li></ul></ul><ul><ul><li>Folksonomies </li></ul></ul><ul><ul><ul><li>No hierarchy, just automatically generated related tags </li></ul></ul></ul><ul><ul><ul><li>Categorization vs. classification (inclusive, not exclusive) </li></ul></ul></ul>
  10. 10. Delicious <ul><li>Where social tagging started </li></ul>
  11. 12. Flickr <ul><li>Tag cloud </li></ul><ul><li>Alphabetic order </li></ul><ul><li>Font size: </li></ul><ul><ul><li>Prominence </li></ul></ul><ul><ul><li>Related </li></ul></ul>
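
A tag cloud of the kind described here lists tags alphabetically and scales font size with prominence. The sketch below assumes a simple linear scaling between a minimum and maximum pixel size; the function name, the size limits, and the sample counts are illustrative assumptions, not Flickr's actual method.

```python
def tag_cloud(counts, min_px=10, max_px=32):
    """Return (tag, font_size_px) pairs in alphabetical order.

    Font size grows linearly with how often the tag was used.
    Assumes `counts` is a non-empty dict of tag -> usage count.
    """
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return [
        (tag, round(min_px + (counts[tag] - lo) * (max_px - min_px) / span))
        for tag in sorted(counts)
    ]

# Hypothetical usage counts.
print(tag_cloud({"travel": 20, "cat": 11, "art": 2}))
```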
  12. 13. Flickr <ul><li>Geo tagging </li></ul>
  13. 14. Meme Maps
  14. 15. BlueDot <ul><li>Tagging + social network </li></ul>
  15. 16. The Power of Social Tagging <ul><li>Enables people to organize and access info that is high in relevance </li></ul><ul><li>Can enable social connectivity </li></ul><ul><li>Low cost, shared workload, increases scalability </li></ul><ul><li>low barrier to entry, no expertise required to participate </li></ul><ul><li>User defined diction, terminology, in any knowledge domain, enables non-professionals to participate in system </li></ul><ul><li>Responsive to dynamic changes in terminology, new terminology, changes and innovation in resources, emergent taxonomy through “desire lines” </li></ul>
  16. 20. The Power of Social Tagging (cont’d) <ul><li>Accommodates relational info structures (as opposed to hierarchical) </li></ul><ul><li>Can use frequency of terms across people to approximate prominence of idea, or prominence of tagged item </li></ul><ul><ul><li>(if lots of people tagged it, must be important tag, or must be important/interesting object) </li></ul></ul><ul><li>Enables serendipitous discovery of new terminology/resources through browsing </li></ul>
  17. 21. Problems with Social Tagging <ul><li>Polysemy – single word has multiple meanings </li></ul><ul><li>Synonymy – different words same meaning </li></ul><ul><li>“Basic level” – continuum of specificity </li></ul><ul><ul><li>Individual difference </li></ul></ul><ul><li>Idiosyncratic, meta-noise, low quality of tags </li></ul><ul><li>Inexpert taggers </li></ul><ul><li>Keyword vs. keyword phrases, dealing with spaces and multiple words </li></ul>
  18. 22. Primary Objects of Social Tagging <ul><li>Resource </li></ul><ul><li>Semantic Tags </li></ul><ul><li>People </li></ul>
  19. 23. Social Uses of Tags <ul><li>Conversation/communication </li></ul><ul><li>Group formation </li></ul><ul><li>Event collections </li></ul><ul><li>New terms, e.g. “sometaithurts”, “flicktion” </li></ul><ul><li>Some self-referential: “me”, “to read” </li></ul>
  20. 24. Kinds of Tags <ul><li>Identifying what it is about (dogs) </li></ul><ul><li>Identifying what it is (book, article) </li></ul><ul><li>Who owns it </li></ul><ul><li>Refining (2001) </li></ul><ul><li>Qualities (stupid) </li></ul><ul><li>Self reference </li></ul><ul><li>Task organizing </li></ul>
  21. 25. Design Considerations <ul><li>Tagging rights </li></ul><ul><ul><li>Self vs. free for all </li></ul></ul><ul><li>Tagging support </li></ul><ul><ul><li>blind (new terminology), viewable, suggestive (convergence) </li></ul></ul><ul><li>Aggregation </li></ul><ul><ul><li>Bag vs. set (no repetition) </li></ul></ul><ul><li>Tagger: author, user, expert </li></ul><ul><ul><li>With user get collective wisdom/evaluation </li></ul></ul><ul><li>Immediate feedback: see associated items as soon as you tag </li></ul><ul><li>Type of object/resource </li></ul><ul><li>Resource connectivity (linked, grouped) </li></ul><ul><li>Social connectivity (linked, grouped) </li></ul>
  22. 26. User incentives <ul><li>organizational or social </li></ul><ul><li>Future retrieval </li></ul><ul><li>Contribution and sharing </li></ul><ul><li>Attract attention </li></ul><ul><li>Self-presentation </li></ul><ul><li>Opinion expression </li></ul>
  23. 27. Developing algorithms <ul><li>Start simple, become complicated </li></ul><ul><li>Expect a lot of tweaking to get to what seems to fit </li></ul><ul><ul><li>Different data, different models more appropriate </li></ul></ul><ul><ul><li>Use “known” information space to evaluate </li></ul></ul><ul><li>Keep data in its rawest form (for now) </li></ul><ul><ul><li>User, tag, tag resource, timestamp, tagging context? Metatag? Tagging type? </li></ul></ul><ul><li>As reasonably as possible, develop algorithms that map onto understanding of information space </li></ul><ul><ul><li>Social networks, associative models of memory, etc. </li></ul></ul><ul><li>Associative data structure </li></ul><ul><ul><li>Authors, resources, tags all objects that are associated by particular instances of tagging/bookmarking – could include time of content creation as well </li></ul></ul><ul><li>Weighting associations/similarity measures </li></ul><ul><ul><li>Expect to differentially weight associations </li></ul></ul><ul><ul><ul><li>Author, resource, high weight </li></ul></ul></ul><ul><ul><ul><li>Tagger, resource, lower weight </li></ul></ul></ul><ul><ul><ul><li>Bookmarker, resource, medium weight </li></ul></ul></ul><ul><li>Expect will be transforming data to tweak shape of distributions </li></ul>
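
One way to picture the raw data and differential weighting on this slide is sketched below. The slide gives only an ordering of weights (author high, bookmarker medium, tagger lower), so the numbers in `ROLE_WEIGHT` are placeholder assumptions, as are the `TagEvent` record and the function name.

```python
from collections import defaultdict
from typing import NamedTuple

class TagEvent(NamedTuple):
    """One raw tagging instance, kept unaggregated."""
    user: str
    tag: str
    resource: str
    timestamp: int  # assumed here to be seconds since the Unix epoch

# Illustrative values only: the slide specifies an ordering, not numbers.
ROLE_WEIGHT = {"author": 1.0, "bookmarker": 0.6, "tagger": 0.3}

def association_weights(links):
    """Sum role weights for each (person, resource) pair.

    `links` is an iterable of (person, resource, role) triples.
    """
    totals = defaultdict(float)
    for person, resource, role in links:
        totals[(person, resource)] += ROLE_WEIGHT[role]
    return dict(totals)

event = TagEvent(user="ann", tag="folksonomy", resource="r1", timestamp=1170000000)
links = [
    ("ann", "r1", "author"),
    ("ann", "r1", "tagger"),
    ("bob", "r1", "bookmarker"),
]
print(event)
print(association_weights(links))
```

Keeping the events raw and deriving weights in a separate step makes it cheap to retune the weights later, which fits the slide's expectation of a lot of tweaking.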
  24. 28. Inferring Prominence <ul><li>Frequency of occurrence </li></ul><ul><ul><li>(see Dubinko as used on Flickr, “interestingness”) </li></ul></ul><ul><li>Spectral analysis: Social network analysis style looking for hubs </li></ul><ul><ul><li>Applied to tags (Wu et al.) </li></ul></ul><ul><li>SNA: betweenness centrality p. 188 </li></ul><ul><ul><li>Items all connect to this central item </li></ul></ul>
  25. 29. Inferring Tag Relatedness <ul><li>Frequency co-occurrence </li></ul><ul><ul><li>Between tags: </li></ul></ul><ul><ul><ul><li>Co-occurrence in resources </li></ul></ul></ul><ul><ul><ul><li>Co-occurrence in author </li></ul></ul></ul><ul><ul><li>Between resources: </li></ul></ul><ul><ul><ul><li>Co-occurrence of tags </li></ul></ul></ul><ul><ul><ul><li>Co-occurrence of authors of tags, bookmarks </li></ul></ul></ul><ul><ul><li>Between users: </li></ul></ul><ul><ul><ul><li>Relatedness of tags </li></ul></ul></ul><ul><ul><ul><li>Frequency of co-occurrence in tags </li></ul></ul></ul><ul><li>Indirect </li></ul><ul><ul><li>Through analysis of emergent semantics (Wu, 2006), dimensions </li></ul></ul><ul><ul><li>Neural net semantic “spread”, via link weights </li></ul></ul><ul><ul><li>THESUS: distance in ontological tree (Halkidi, p. 323); semantic proximity, using WordNet or known corpus </li></ul></ul>
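
As a small illustration of the first item, co-occurrence of tags in resources, the sketch below counts how many resources carry each pair of tags. The `resource_tags` mapping and the function name are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(resource_tags):
    """Count how many resources carry each unordered pair of tags.

    Pairs are stored as alphabetically sorted tuples, so ("cat", "dog")
    and ("dog", "cat") are the same key.
    """
    pairs = Counter()
    for tags in resource_tags.values():
        for a, b in combinations(sorted(set(tags)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical tagged resources.
resource_tags = {
    "r1": ["dog", "cat"],
    "r2": ["cat", "dog", "vet"],
    "r3": ["vet"],
}
print(cooccurrence(resource_tags).most_common())
```

The same function applies to the other cases on the slide by swapping what plays the role of the container, for example tags grouped per author instead of per resource.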
  26. 30. Similarity Measures <ul><li>Sim(a, b) = Nab/sqrt(Na * Nb) </li></ul><ul><li>Or asymmetric/bidirectional: Sim(a, b) = Nab/Na </li></ul><ul><li>Weight count for # of tags in doc? Assumption: two tags are more highly related when they are the only two describing a document than when 8 tags describe it: </li></ul><ul><ul><li>Sim(dog, cat) in {dog, cat} = 1 </li></ul></ul><ul><ul><li>Sim(dog, cat) in {dog, cat, vet, mouse} = .5 </li></ul></ul><ul><li>(Weighted sums)/(sum of weights) for adding multiple similarity measures </li></ul><ul><ul><li>(WeightA*SimA + WeightB*SimB + WeightC*SimC)/(WeightA + WeightB + WeightC) </li></ul></ul><ul><li>Weight some keywords as more important (Halkidi), e.g. depending on position (Golder shows that 1st is usually more important) </li></ul>
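
The measures on this slide can be tried out directly. The sketch below assumes Na is the number of resources tagged with a, Nb the number tagged with b, and Nab the number tagged with both; the slide does not define these symbols, so that reading and all function names are assumptions.

```python
from math import sqrt

def counts(a, b, resource_tags):
    """Return (Na, Nb, Nab): resources tagged a, tagged b, and tagged with both."""
    na = sum(1 for tags in resource_tags.values() if a in tags)
    nb = sum(1 for tags in resource_tags.values() if b in tags)
    nab = sum(1 for tags in resource_tags.values() if a in tags and b in tags)
    return na, nb, nab

def sim_symmetric(a, b, resource_tags):
    """Sim(a, b) = Nab / sqrt(Na * Nb)."""
    na, nb, nab = counts(a, b, resource_tags)
    return nab / sqrt(na * nb) if na and nb else 0.0

def sim_asymmetric(a, b, resource_tags):
    """Sim(a, b) = Nab / Na, which depends on the direction a -> b."""
    na, _, nab = counts(a, b, resource_tags)
    return nab / na if na else 0.0

def combine(sims, weights):
    """(Weighted sum of similarities) / (sum of weights)."""
    return sum(w * s for s, w in zip(sims, weights)) / sum(weights)

# Hypothetical tagged resources.
resource_tags = {
    "r1": {"dog", "cat"},
    "r2": {"dog", "vet"},
    "r3": {"dog", "cat"},
    "r4": {"dog"},
}
print(sim_symmetric("dog", "cat", resource_tags))
print(sim_asymmetric("dog", "cat", resource_tags))
print(sim_asymmetric("cat", "dog", resource_tags))
print(combine([0.5, 1.0], [1, 3]))
```

In this data every "cat" resource is also tagged "dog" but not the reverse, which is the situation the asymmetric form is meant to capture.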
  27. 31. Connectionist Models <ul><li>Based on neural net models </li></ul><ul><ul><li>Set/net of nodes with activation levels and weights </li></ul></ul><ul><ul><li>Activation on any node is the weighted sum of its inputs, including external input </li></ul></ul><ul><ul><li>Activation of nodes occurs upon co-occurrence, or query? </li></ul></ul><ul><ul><ul><li>Instance of Female and Cancer added together as tags </li></ul></ul></ul><ul><ul><li>Spread activation across paths </li></ul></ul><ul><ul><ul><li>Female and Cancer to Breast </li></ul></ul></ul><ul><ul><li>Update path weights and activation values </li></ul></ul><ul><ul><li>Iterate until activations stabilize (decay at each iteration) </li></ul></ul><ul><ul><li>Weights decay with time and lack of activation across links </li></ul></ul><ul><li>In search, filter for most similar items using link weights multiplied out 2 degrees </li></ul><ul><li>McClelland and Rumelhart in PDP vol. 2 </li></ul><ul><li>Read and Miller p. 27 on the Dynamic Construction of Meaning </li></ul><ul><li>Gelgi, spreading activation </li></ul>
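
The spreading step can be illustrated with a deliberately simplified sketch. It propagates activation outward for a fixed number of hops with a constant decay, and it does not update link weights or iterate to a stable state as a full connectionist model would. The graph, the decay value, and the function name are assumptions for illustration.

```python
def spread(weights, seeds, decay=0.5, steps=2):
    """Spread activation from seed nodes along weighted links.

    weights: dict mapping node -> {neighbor: link weight}
    seeds:   dict mapping node -> initial (external) activation
    Each hop multiplies activation by the link weight and by `decay`.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        incoming = {}
        for node, act in frontier.items():
            for neighbor, w in weights.get(node, {}).items():
                incoming[neighbor] = incoming.get(neighbor, 0.0) + act * w * decay
        for node, act in incoming.items():
            activation[node] = activation.get(node, 0.0) + act
        frontier = incoming  # only newly received activation travels further
    return activation

# Hypothetical link weights, echoing the slide's Female + Cancer -> Breast example.
weights = {
    "female": {"breast": 0.8},
    "cancer": {"breast": 0.6, "lung": 0.4},
    "breast": {"mammogram": 1.0},
}
result = spread(weights, {"female": 1.0, "cancer": 1.0})
print(sorted(result.items(), key=lambda kv: -kv[1]))
```

With two hops, "breast" receives input from both seeds and ends up more active than "lung", and "mammogram" is reached at the second degree.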
  28. 32. Automated tagging? <ul><li>(Brooks & Montanez) </li></ul><ul><li>Using standard keyword extraction methods (TF-IDF score, extracting the three words most frequent relative to standard frequency in a corpus) </li></ul><ul><li>They argue this is more effective than social tagging for developing similarity measures across documents </li></ul>
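
A bare-bones version of TF-IDF keyword extraction looks like the sketch below. It uses one common TF-IDF variant (relative term frequency times the log of inverse document frequency), which is an assumption; it is not necessarily the exact scoring Brooks and Montanez used. The toy corpus and function name are made up for the example.

```python
import math
from collections import Counter

def top_keywords(doc, corpus, k=3):
    """Return the k terms in `doc` with the highest TF-IDF score.

    doc:    list of tokens
    corpus: list of token lists (should include doc itself)
    """
    tf = Counter(doc)
    n_docs = len(corpus)

    def idf(term):
        df = sum(1 for d in corpus if term in d)
        return math.log(n_docs / df) if df else 0.0

    scores = {t: (c / len(doc)) * idf(t) for t, c in tf.items()}
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

# Hypothetical three-document corpus, already tokenized.
corpus = [
    ["the", "dog", "chased", "the", "cat"],
    ["the", "cat", "sat"],
    ["the", "bird", "sang"],
]
print(top_keywords(corpus[0], corpus))
```

The word "the" is the most frequent term in the first document but appears in every document, so its score is zero and it is not chosen as a tag.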
  29. 33. Inferring Quality of Tags <ul><li>Frequency of occurrence </li></ul><ul><li>Filter out “one-offs” </li></ul><ul><li>Author of tag </li></ul><ul><ul><li>Time in system </li></ul></ul><ul><ul><li>Has contributed content </li></ul></ul><ul><ul><li>Is part of community of interest (e.g., group) </li></ul></ul><ul><li>“Seed” tags? </li></ul>
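
Filtering out one-offs can be as simple as requiring that a tag be used by more than one person. The threshold, the event format, and the function name below are assumptions for illustration.

```python
from collections import defaultdict

def reliable_tags(events, min_users=2):
    """Keep tags applied by at least `min_users` distinct users.

    events: iterable of (user, tag, resource) triples.
    """
    users_by_tag = defaultdict(set)
    for user, tag, _resource in events:
        users_by_tag[tag].add(user)
    return {tag for tag, users in users_by_tag.items() if len(users) >= min_users}

# Hypothetical tagging events.
events = [
    ("ann", "seattle", "r1"),
    ("bob", "seattle", "r2"),
    ("ann", "xyzzy123", "r1"),   # idiosyncratic one-off
    ("ann", "seattle", "r3"),    # repeat use by the same person
]
print(reliable_tags(events))
```

Counting distinct users rather than raw occurrences keeps one prolific tagger from making a private tag look widely shared.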
  30. 34. Inferring Structures <ul><li>Cluster analysis </li></ul><ul><ul><li>Agglomerative, average linkage </li></ul></ul><ul><ul><li>Wu et al.: keywords at each level of the hierarchy… </li></ul></ul><ul><li>Wu 2006: Separable Mixture Model (some form of MDS?) </li></ul><ul><ul><li>Emergent semantics </li></ul></ul><ul><ul><li>Based on co-occurrence data (users, resources, and tags) </li></ul></ul><ul><ul><li>Say it works reasonably up to 40 dimensions </li></ul></ul><ul><li>Mapping to existing structures, e.g. WordNet (Halkidi) </li></ul>
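
Agglomerative clustering with average linkage can be sketched without any libraries. The version below repeatedly merges the two clusters with the highest average pairwise similarity and stops when that average falls below a threshold. The threshold, the similarity table, and the names are assumptions, and the quadratic search is only suitable for small examples.

```python
def average_linkage(items, sim, threshold):
    """Greedy agglomerative clustering using average linkage.

    items:     list of item labels
    sim:       function (a, b) -> similarity, higher means more alike
    threshold: stop merging once the best average similarity drops below this
    """
    clusters = [[item] for item in items]

    def avg(c1, c2):
        return sum(sim(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = None, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = avg(clusters[i], clusters[j])
                if best is None or s > best:
                    best, pair = s, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical tag similarities; unlisted pairs count as 0.
table = {
    frozenset(("dog", "cat")): 0.9,
    frozenset(("dog", "vet")): 0.6,
    frozenset(("cat", "vet")): 0.5,
}

def tag_sim(a, b):
    return table.get(frozenset((a, b)), 0.0)

print(average_linkage(["dog", "cat", "vet", "python"], tag_sim, threshold=0.4))
```

Recording the order of merges instead of stopping at a threshold would give the full hierarchy.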
  31. 35. Inferring Structures <ul><li>Machine tags, a.k.a. Triple tags </li></ul><ul><li>Flickr, mostly at API level </li></ul><ul><li>Metatag for a tag </li></ul><ul><ul><li>Namespace – class, or facet </li></ul></ul><ul><ul><li>Predicate – name of the property </li></ul></ul><ul><ul><li>Value </li></ul></ul><ul><ul><ul><li>Flora:tree=coniferous </li></ul></ul></ul>
  32. 36. Design Implications <ul><li>Use tags as another way to link people </li></ul><ul><li>Social presence indicators increase tagging (Lee): </li></ul><ul><ul><li>Profile </li></ul></ul><ul><ul><li>Subscribing to others’ bookmarks </li></ul></ul><ul><ul><li>Awareness of who else tagged like you, etc. </li></ul></ul><ul><li>Use as way to associate informal lay language and formal terminology </li></ul><ul><li>Importance of showing authors of tags </li></ul>
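
A machine tag such as Flora:tree=coniferous splits into namespace, predicate, and value. The parser below is a minimal sketch: the regular expression is an assumed, simplified pattern and not Flickr's official validation rule, and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

class MachineTag(NamedTuple):
    namespace: str  # class or facet, e.g. "Flora"
    predicate: str  # name of the property, e.g. "tree"
    value: str      # e.g. "coniferous"

# Assumed pattern: word-like namespace and predicate, any non-empty value.
_PATTERN = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")

def parse_machine_tag(raw: str) -> Optional[MachineTag]:
    """Return the parsed triple, or None if `raw` is an ordinary tag."""
    match = _PATTERN.match(raw)
    return MachineTag(*match.groups()) if match else None

print(parse_machine_tag("Flora:tree=coniferous"))
print(parse_machine_tag("sunset"))
```

Returning None for ordinary tags lets the same tag field hold both free-form tags and structured triples.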
  33. 37. UI considerations <ul><li>Add tag widget </li></ul><ul><ul><li>Add tags at time of bookmarking (delicious) </li></ul></ul><ul><ul><li>Add tags at time of content creation (profile, stories, journal entries) </li></ul></ul><ul><ul><li>View my tagged items OR all tagged items </li></ul></ul><ul><ul><ul><li>Chronological order </li></ul></ul></ul><ul><ul><ul><li>By tag </li></ul></ul></ul>
  34. 38. References <ul><ul><li>Aldenderfer, M., and Blashfield, R. (1984). Cluster Analysis. Sage Publications, Newbury Park. </li></ul></ul><ul><ul><li>Bechtel, W., Abrahamsen, A. (1991). Connectionism and the Mind: An introduction to parallel processing in networks. Blackwell, Oxford. </li></ul></ul><ul><ul><li>Brooks, C., Montanez, N. (2006). Improved annotation of the blogosphere via autotagging and hierarchical clustering. WWW 2006, Edinburgh, Scotland. </li></ul></ul><ul><ul><li>Dubinko, M., Kumar, R., Magnani, J. (2006). Visualizing tags over time. WWW 2006, Edinburgh, Scotland. </li></ul></ul><ul><ul><li>Gelgi, F., Vadrevu, S., Davulcu, H. (2005). Improving web data annotations with spreading activation. WISE 2005, New York, NY. </li></ul></ul><ul><ul><li>Golder, S. A., Huberman, B. A. (2006?). The structure of collaborative tagging systems. </li></ul></ul><ul><ul><li>Halkidi, M., Nguyen, B., Varlamis, I., Vazirgiannis, M. (2003). THESUS: Organizing web document collections based on link semantics. VLDB Journal 12: 320-332. </li></ul></ul><ul><ul><li>Kim, J., Candan, K. (2006). CP/CV: Concept similarity mining without frequency information from domain describing taxonomies. CIKM 2006. </li></ul></ul><ul><ul><li>Lee, Kathy. (2006). What goes around comes around: an analysis of del.icio.us as social space. CSCW 2006. </li></ul></ul><ul><ul><li>Marlow, C., Naaman, M., boyd, D., Davis, M. (2006). HT06, tagging paper, taxonomy, flickr, academic article, to read. HT 2006. </li></ul></ul><ul><ul><li>Mathes, A. Folksonomies – Cooperative classification and communication through metadata. </li></ul></ul><ul><ul><li>McClelland and Rumelhart, Parallel Distributed Processing. </li></ul></ul><ul><ul><li>Read, S. J. and Miller, L. C., eds. (1998). Connectionist Models of Social Reasoning and Social Behavior. Lawrence Erlbaum, New Jersey. </li></ul></ul><ul><ul><li>Shirky, Clay. (??) Ontology is Overrated: Categories, Links, and Tags. </li></ul></ul><ul><ul><li>Wasserman, S., and Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, UK. </li></ul></ul><ul><ul><li>Wu, X., Zhang, L., Yu, Y. (2006). Exploring social annotations for the semantic web. WWW 2006, Edinburgh, Scotland. </li></ul></ul><ul><ul><li>Wu, H., Zubair, M., Maly, K. (2006). Harvesting social knowledge from folksonomies. HT 2006, Odense, Denmark. </li></ul></ul><ul><ul><li>Xue, G., Zeng, H., Chen, Z., Ma, W., and Yu, Y. (2004). Similarity spreading: A unified framework for similarity calculation of interrelated objects. WWW 2004, New York, New York. </li></ul></ul>