Folksonomies - Indexing and Retrieval for Web 2.0


A presentation for CMN 5150 at the University of Ottawa on the book Folksonomies by Dr. Isabella Peters. It covers the book's material with a practical focus on knowledge management and communications.

Published in: Technology, Education

  • Executive Summary: This presentation summarizes the monograph “Folksonomies: Indexing and Retrieval in Web 2.0” by Isabella Peters. The summary has been tailored for Communications post-graduate students studying Knowledge Management. Background on the author and the book is treated briefly, followed by an overview of the book’s contents. The material is then broken into 3 sections. First, the use of folksonomies is examined, with relevant examples highlighted. Second, the use of folksonomies for Knowledge Management is treated, with a focus on the data collected, the resulting data set, user behaviour, and the advantages and disadvantages. Lastly, the use of folksonomies for information retrieval is examined.
  • Curtis added this page because it is not in Folksonomies but it is relevant nowadays
  • 3 pieces of data constitute a folksonomy. Other pieces of data can add value but are not an intrinsic part, e.g. direct associations between users and resources, followers, other metadata. Community has been suggested as a 4th piece, which is the virtual sum of all knowledge drawn from the graph.
  • Resources tend to be links or some public content, like
  • Tend to be private content, like flickr, YouTube, Amazon, etc. Can’t measure popularity or recommend tags based on what others have tagged. Tagging might be possible via graph-based calculations.
  • More useful in the analysis of broad folksonomies. The math isn’t important. The key here is understanding how tags tend to distribute themselves over a given resource. Useful to understand typical high-frequency tags versus low-frequency tags. Low-frequency tags are especially important for an optimal user experience, since they tend to be what gives folksonomies an edge over traditional search.
  • Key here is gaining an understanding of the problems that folksonomies face and the limitations in addressing these problems
  • Authentic language – speaks the vernacular, adapts to idioms rapidly. Actuality – stays current, adapts with new knowledge. Multiple interpretations – not restricted to a single classification or point of view. Cheap – no professional curator, no need to understand the data set before deployment. Scalability – becomes a better system with more use. Neologisms – adapts to new words well. Identify communities – data mining techniques can perform sophisticated cluster analysis to gain knowledge of user communities. Recommendation systems – collective knowledge can be tapped to make tagging easier. Familiarize users – tagging is so simple and intuitive that. Faster – studies show users tag much faster than they can classify properly in a professional taxonomy. Recollection – studies show that users can recall which tags were used to classify resources much more accurately than they can recall a taxonomical classification.
  • Lack of a controlled vocabulary – useless tags proliferate if gardening is not successful. The context of indexing is lost – the link between tag and resource is assumed, and the reason for indexing is not clear. Users may tag for different reasons. Not all users are community minded; some tag just for themselves. Languages are mixed – different languages get treated the same, and it is hard to adapt across languages. Hidden relations are unexploited – relations that would be useful are hard to tap into. Spam tags, user-specific tags, unclear keywords – not all tags contribute useful knowledge, and some are malicious. Resources are indexed as a whole – there is no way to apply tags to just part of a resource. Social character of tags is mostly invisible – there are not many standard interfaces for working with the collective tag cloud, mostly specialist tools that aren’t very productive. Cold start problem – the system is totally unpopulated at the start, so until it reaches critical mass many benefits are unrealized. Thus, it may be less useful for specialist systems.
  • Search – Tags make text-based searching a little more intelligent by placing great weight on text matches with tags associated with an item. There is still the same difficulty of overcoming tag problems like synonyms, language issues, etc. Browse – A retrieval method unique to folksonomies. Allows for discovery without knowing what you are looking for. Basically, you are leaning on the collective intelligence, captured as the tag cloud. Visualize – Discover patterns in the data, hitherto unknown similarities and relations. For instance, you might discover through visualization tools that people interested in KM tend to be interested in medicine, which might be new information for some.

    1. Folksonomies: Indexing and Retrieval in Web 2.0
       By Isabella Peters
       Presented by: Curtis Naphan & Shahid Zia Qaisrani
       CMN 5150, Fall 2011
    2. On the Author
       • Dr. Isabella Peters, M.A.
       • Specializes in Information Science
       • Researcher and Lecturer at Heinrich-Heine-Universität in Düsseldorf, Germany
       Source:
    3. On the Book
       • Published in 2009
       • Part of the Knowledge & Information book series
       • Originally in German
       • Thorough and “sober” analysis of folksonomies
       • Not casual reading but a good resource for those who need to know
       Source:
    4. How the Book is Structured
       • Overview of Collaborative Information Services
         – Where are folksonomies being used today?
         – What are the various characteristics?
       • Overview of Terminology and Models
         – What are some relevant concepts in folksonomies?
         – What are the alternatives?
       • Folksonomies for Knowledge Representation
         – How can folksonomies help capture knowledge?
         – What are the benefits and drawbacks?
       • Folksonomies for Information Retrieval
         – How can folksonomies help retrieve knowledge?
         – How do they compare with traditional methods?
    5. Overview of Folksonomies
       A look at how folksonomies are being used today
    6. What are they?
       FOLKSONOMY
       • The use of tags to index and retrieve content
       • Example tags: dog, spot, funny, gun
    7. Why are they used?
       • Web 2.0
         – User-generated content
         – Little formal curation
       • Taxonomies too restrictive
         – “If hierarchies were a good way to organize links, Yahoo would be king of the hill and Google an also-ran service.” (Shirky, 2004)
       • Full-text search not enough
         – Non-textual resources
         – Collaborative browsing
    8. Where are they used?
       • Incorporated into many applications
       • Some differences:
         – Tag my stuff vs. tag everyone’s stuff
         – Content belongs to me vs. content is public
    9. Social Bookmarking
       • Users add bookmarks
       • User can tag bookmarks
       • Link can be tagged by multiple users
       • Tags aid:
         – Personal retrieval
         – Collaborative browsing
         – Search
       • Often used for PKM
       • Examples:
         – Diigo
         – Bibsonomy
         – CiteULike
    10. [Screenshot with callouts: Link, Tags, Recommended Tags]
    11. E-Commerce
       • Users can tag products
       • Complements search and professional directory
       • Example:
    12. Knowledge Bank
       • For researchers and engineers
       • Tag Widget
       • Simple and Advanced Search
       • Boolean AND
       • Multi-user tagging
       • Example: Engineering Village
    13. Streaming Radio
       • Example:
       • Songs streamed and played up to 3 times
       • Remunerated for playback
       • Collaborative rating system
       • Taste and listening habits
       • Tag-based recommender system
    14. Libraries, Museums
       • Tagging real-life objects via web
       • Complements traditional indexing methods
       • Examples:
         – LibraryThing
         – Stevemuseum
       • Example tags: roman, cicero, marble, bust
    15. Photosharing
       • Users can tag any photo
       • Aids search, browsing
       • Examples:
         – Flickr
    16.
       • Tagged and rated blogs
       • Search engine and directory
       • Tag generator code
    17. Twitter
       • Slightly different implementation
         – Tags extracted from #hashed keywords
       • Twitter adds:
         – Users following users
         – Messages linked to @users
       [Figure labels: link to other user, user, hash-tag]
    18. Tagging Games
    19. Overall Remarks
       • Each application’s implementation of folksonomies is different
       • Subject matter is crucial
         – Altruism is rare (Wikipedia)
         – Personal gain is important motivation (
       • Implementation is important
         – Must be easy to use
         – Often few features
       • Usefulness tends to increase when alternative indexing and retrieval methods are insufficient
    20. Knowledge Representation
       How folksonomies are used to capture knowledge
    21. Overview of Knowledge Representation
       • Types of Data
       • Broad versus Narrow
       • Tag Distribution
       • Tag Gardening
       • User Behaviour
       • Advantages
       • Disadvantages
    22. The Tripartite Hypergraph
       • 3 types of data
         – Users/Identity
         – Resources/Object
         – Tags/Metadata
       • 3 types of graphs
         – User-Tag-Resource
         – User-Tag-User
         – Resource-Tag-Resource
       Source:
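One way to make the three data types concrete is to store every tagging action as a (user, tag, resource) triple and derive views from that set. The following is a minimal Python sketch under that assumption; the user names, tags, and resource identifiers are invented for illustration and do not come from the book.

```python
from collections import defaultdict

# Each tagging action is one (user, tag, resource) triple.
# All names below are hypothetical sample data.
assignments = {
    ("alice", "folksonomy", "r1"),
    ("alice", "web2.0", "r1"),
    ("bob", "folksonomy", "r1"),
    ("bob", "folksonomy", "r2"),
    ("carol", "km", "r2"),
}


def resources_by_user(triples):
    """User-Tag-Resource view: which resources relate to which user?"""
    view = defaultdict(set)
    for user, _tag, resource in triples:
        view[user].add(resource)
    return dict(view)


def users_by_tag(triples):
    """Which users have applied each tag at least once?"""
    view = defaultdict(set)
    for user, tag, _resource in triples:
        view[tag].add(user)
    return dict(view)


by_user = resources_by_user(assignments)
by_tag = users_by_tag(assignments)
```

Keeping the raw triples and computing views on demand means no information about who tagged what is lost, at the cost of recomputing the views when the data changes.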
    23. User-Tag-Resource Graph
       [Figure: graph connecting User, Tag, and Resource nodes]
       • Answers the question “Which resources relate to which user?”
       • Useful for PKM and browsing through interesting users’ resources
    24. User-Tag-User Graph
       [Figure: graph connecting User nodes through Tag nodes; callout “Very similar users (e=2)”]
       • Answers the question “Which users are similar?”
       • Useful for finding users with similar interests
       • Similarity can be measured by connected edges
    25. Resource-Tag-Resource Graph
       [Figure: graph connecting Resource nodes through Tag nodes; callout “Highly related resources! (e=2)”]
       • Answers the question “Which resources are similar?”
       • Useful for finding related resources
       • Similarity can be measured by connected edges
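The edge-counting idea behind the User-Tag-User and Resource-Tag-Resource graphs can be sketched in Python. This sketch assumes that "connected edges" means the number of distinct tags two users (or two resources) have in common, which is one plausible reading of the slides rather than a definition taken from the book; the sample triples are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shared_tag_counts(triples, key):
    """Count distinct tags shared by each pair of users or resources.

    key=0 compares users, key=2 compares resources
    (positions within the (user, tag, resource) triple).
    """
    tags_of = defaultdict(set)
    for triple in triples:
        tags_of[triple[key]].add(triple[1])
    counts = {}
    for a, b in combinations(sorted(tags_of), 2):
        shared = len(tags_of[a] & tags_of[b])
        if shared:
            counts[(a, b)] = shared
    return counts


# Hypothetical sample data.
triples = [
    ("alice", "dog", "photo1"),
    ("alice", "funny", "photo1"),
    ("bob", "dog", "photo2"),
    ("bob", "funny", "photo2"),
    ("carol", "dog", "photo3"),
]

user_similarity = shared_tag_counts(triples, key=0)
resource_similarity = shared_tag_counts(triples, key=2)
```

Here alice and bob share two tags, matching the "e=2" callout for very similar users, while carol shares only one tag with each of them.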
    26. Broad Folksonomies
       • A resource can be tagged with the same tag more than once
         – E.g., CiteULike, Connotea, Bibsonomy
         – Tend to be link-based resources
       • Can calculate tag frequency per item
       • Can enable tag recommender systems
       Source: Thomas Vander Wal, 2005
    27. Narrow Folksonomies
       • A resource can be tagged with a certain tag only once
         – E.g., flickr, Amazon, YouTube
         – Tend to be non-textual resources
         – Resources are inherently unique
         – Duplicates cannot be detected easily
       • Tag occurrence for a resource is either 0 or 1
       Source: Thomas Vander Wal, 2005
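The difference between broad and narrow folksonomies shows up directly in what can be counted. The Python sketch below assumes tagging data is available as (user, tag, resource) triples; the function names and sample bookmarks are hypothetical, and the recommender is only the simplest possible variant (most frequent tags first).

```python
from collections import Counter


def broad_tag_frequencies(triples, resource):
    """Broad folksonomy: how many users applied each tag to the resource."""
    return Counter(tag for _user, tag, res in triples if res == resource)


def narrow_tag_set(triples, resource):
    """Narrow folksonomy: a tag is either present or absent (1 or 0)."""
    return {tag for _user, tag, res in triples if res == resource}


def recommend_tags(triples, resource, n=3):
    """Suggest the n tags most often applied to this resource by others."""
    frequencies = broad_tag_frequencies(triples, resource)
    return [tag for tag, _count in frequencies.most_common(n)]


# Hypothetical bookmarks for a single link.
bookmarks = [
    ("alice", "folksonomy", "url1"),
    ("bob", "folksonomy", "url1"),
    ("carol", "folksonomy", "url1"),
    ("alice", "tagging", "url1"),
    ("bob", "tagging", "url1"),
    ("carol", "toread", "url1"),
]

frequencies = broad_tag_frequencies(bookmarks, "url1")
present = narrow_tag_set(bookmarks, "url1")
suggested = recommend_tags(bookmarks, "url1", n=2)
```

In the narrow view the three tags are indistinguishable in weight, which is why frequency-based recommendation is only available in the broad case.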
    28. Tag Distribution
       • Tends to follow the “Power Law” (a few tags are used very often, then frequency drops off steeply into a long tail of rarely used tags)
       • Long Tail tags tend to be either useless (personal, synonyms, general) or high value discriminators
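To see how tags distribute themselves over a resource, it is enough to rank them by frequency and look at where the long tail begins. The Python sketch below uses made-up tag counts, and the cut-off for "long tail" (tags used at most once) is an arbitrary assumption for illustration, not a threshold from the book.

```python
from collections import Counter


def rank_frequency(tags):
    """Return (rank, tag, count) rows, most frequent tag first."""
    counts = Counter(tags)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, tag, count) for rank, (tag, count) in enumerate(ordered, start=1)]


def long_tail(tags, max_count=1):
    """Tags used at most max_count times (the threshold is a free choice)."""
    counts = Counter(tags)
    return sorted(tag for tag, count in counts.items() if count <= max_count)


# Hypothetical tags applied to one resource by many users.
observed = ["dog"] * 8 + ["funny"] * 4 + ["pet"] * 2 + ["spot", "toread", "uotawa"]

table = rank_frequency(observed)
tail = long_tail(observed)
```

In this sample the tail mixes a potentially useful discriminator (spot) with a personal tag (toread) and a misspelling (uotawa), which is the mix the slide describes.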
    29. Tag Gardening
       • Is the attempt to address tagging problems, such as:
         – Synonyms (dog, doggy, dogs)
         – Multilingualism (dog, chien, Hund, perro)
         – Homonyms (jaguar[cat], jaguar[car])
         – “Spagging”
         – Semantic Enrichment (dog is a mammal, poodle is a type of dog, london and paris are cities)
         – Personalisms (toread, willbuy, cmn5150)
         – Misspellings and orthographic variation (uottawa, u-ottawa, u_ottawa, uotawa)
       • Must be either:
         – User-guided and personal
         – Community-wide and automatic but invisible
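A small part of automatic tag gardening can be sketched in Python: collapsing orthographic variants and mapping synonyms or translations onto one preferred tag. The synonym table here is a hypothetical, hand-maintained dictionary built from the slide's examples; real systems would need far more than this, and the sketch does not attempt homonyms, spam, or misspellings such as uotawa.

```python
import re

# Hypothetical synonym table: variant -> preferred tag.
SYNONYMS = {
    "doggy": "dog",
    "dogs": "dog",
    "chien": "dog",
    "hund": "dog",
    "perro": "dog",
}


def normalize(tag):
    """Lowercase a tag and drop separators so u-ottawa and u_ottawa collapse."""
    return re.sub(r"[\s_\-]+", "", tag.strip().lower())


def garden(tags, synonyms=SYNONYMS):
    """Normalize each tag, then replace known variants with the preferred tag."""
    cleaned = []
    for tag in tags:
        tag = normalize(tag)
        cleaned.append(synonyms.get(tag, tag))
    return cleaned


raw = ["Dogs", "chien", "Hund", "u-ottawa", "u_ottawa", "uottawa", "uotawa"]
gardened = garden(raw)
```

The misspelling uotawa passes through unchanged, which illustrates why purely rule-based gardening only addresses some of the problems listed above.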
    30. Summary of Advantages
       • Authentic Language
       • Actuality/Neologisms
       • Multiple interpretations
       • Cheap indexing – distributed workload
       • More taggers, better effect – scales well
       • Identify communities and “small worlds”
       • Recommendation systems
       • Familiarize users with indexing system
       • Faster than classifying in a taxonomy
       • Good user recollection
    31. Summary of Disadvantages
       • Lack of a controlled vocabulary
       • The context of indexing is lost
       • Languages are mixed
       • Hidden relations are unexploited
       • Spam tags, user-specific tags, unclear keywords
       • Resources are indexed as a whole
       • Social character of tags is mostly invisible
       • Cold start problem
    32. Information Retrieval
       How folksonomies are used to retrieve information
    33. Retrieval with Folksonomies
       • Search
         – Works much like full-text search
         – Puts more weight on tag hits
       • Browse
         – Filter by tag
         – Uses tag clouds and other tools
         – Allows for “serendipitous” discovery
       • Visualize
         – Discover patterns in tags
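The idea that tag-aware search works like full-text search but weights tag hits more heavily can be illustrated with a toy scoring function in Python. The weight of 3.0, the resource layout, and the sample data are all assumptions made for this sketch; the slides do not specify how any particular service ranks results.

```python
def score(query_terms, text, tags, tag_weight=3.0):
    """Score one resource: each text hit counts 1, each tag hit counts tag_weight."""
    words = text.lower().split()
    tag_set = {tag.lower() for tag in tags}
    total = 0.0
    for term in query_terms:
        term = term.lower()
        total += words.count(term)
        if term in tag_set:
            total += tag_weight
    return total


def search(query, resources, tag_weight=3.0):
    """Return ids of matching resources, best score first."""
    terms = query.split()
    scored = [
        (score(terms, res["text"], res["tags"], tag_weight), res["id"])
        for res in resources
    ]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [res_id for value, res_id in ranked if value > 0]


# Hypothetical resources.
resources = [
    {"id": "a", "text": "a dog in the park", "tags": ["park", "walk"]},
    {"id": "b", "text": "spot chasing a ball", "tags": ["dog", "funny"]},
    {"id": "c", "text": "city skyline", "tags": ["newyork"]},
]

results = search("dog", resources)
```

Resource b never mentions the word in its text, yet it ranks first because a user tagged it, which is the benefit tags bring for content that full-text search describes poorly.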
    34. Tag Filtering
       • Tag filtering is the mechanism for filtering a list of resources by tag
         – Mine, a person’s or the community’s
       • Usually assume AND relation between tags
       • Can be implemented with clicks-only or text
       • Could support more advanced filtering
         – e.g. newyork & (cats | dogs)
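Both the implicit AND filter and the more advanced example, newyork & (cats | dogs), can be sketched in a few lines of Python. The photo identifiers and tags are hypothetical, and expressing the advanced filter as a Python predicate is a simplification chosen for this sketch; a real interface would parse the expression from user input.

```python
def filter_all(resources, required_tags):
    """Keep resources that carry every required tag (implicit AND)."""
    required = set(required_tags)
    return [res_id for res_id, tags in resources.items() if required <= set(tags)]


def filter_expr(resources, predicate):
    """Keep resources whose tag set satisfies an arbitrary boolean predicate."""
    return [res_id for res_id, tags in resources.items() if predicate(set(tags))]


# Hypothetical tagged photos.
photos = {
    "p1": ["newyork", "cats"],
    "p2": ["newyork", "dogs"],
    "p3": ["newyork", "skyline"],
    "p4": ["paris", "dogs"],
}

# Clicking two tags in sequence: newyork AND dogs.
and_result = filter_all(photos, ["newyork", "dogs"])

# The advanced example from the slide: newyork & (cats | dogs).
advanced_result = filter_expr(
    photos,
    lambda tags: "newyork" in tags and ("cats" in tags or "dogs" in tags),
)
```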
    35. Searching on
    36. Browsing on Diigo
    37. Visualizing with Delicious Soup
    38. Topigraphy
       Source: Fujimura, 2008
    39. Concluding Remarks
       With applications to Knowledge Management
    40. Folksonomies and KM
       • Combination
         – Tag gardening
         – Automatic processing
       • Internalization
         – Information retrieval via tags
         – Tag clouds, tag search
         – Visualization tools
       • Externalization
         – Adding tags to resources
       • Socialization
         – Familiarization with tags
         – Recommender systems
    41. Folksonomies and Ba
       How can the environment (ba) contribute to the management of knowledge?
       • User Issues
         – Insight into community mind via tag clouds, visualizations, recommender systems
         – Promotion of “tagiquette”
         – Leveraging selfishness
         – Integration into traditional taxonomies
       • Technical Issues
         – Intra- and inter-linguistic issues
         – Inter-platform issues
         – Spam detection
         – Fair relevance rankings
         – Integrated visualization tools
    42. Conclusion
       • Folksonomies are a powerful, and sometimes necessary, way of managing Web 2.0
       • Functionality, not an application itself
       • Can complement traditional techniques, like ontologies, hierarchies, full-text search, etc.
       • Success depends on:
         – Number of users
         – Quality of implementation
         – Suitability of resource for tagging
         – Automatic tag management algorithms
         – Unsuitability of alternative classification and retrieval mechanisms
    43. References
       • Fujimura, K. “Topigraphy: Visualization for Large-Scale Tag Clouds” (2008), WWW2008.
       • Nonaka, I. “The Concept of ‘Ba’: Building a Foundation for Knowledge Creation”, California Management Review, Vol. 40, No. 3, Spring 1998, p. 40-54.
       • Peters, I. “Folksonomies: Indexing and Retrieval in Web 2.0” (2009), De Gruyter.
       • Peters, I. “Folksonomies Indexing Und Retrieval In Bibliotheken” (2010). Retrieved from und-retrieval-in-bibliotheken
       • Peters, I. & Weller, K. “Tag Gardening for Folksonomy Enrichment and Maintenance” (2008). Retrieved from
       • Smith, G. “Visual Folksonomy Explanation” (2005). Retrieved from
       • Vander Wal, T. “Explaining and Showing Broad and Narrow Folksonomies” (2005). Retrieved from
    44. Questions?
       Source: Larson, 1987