Hybrid Approaches to Taxonomy & Folksonomy


Published in: Technology

  • It’s not all hugs & puppies though. Tagging does have downsides.
  • Here’s an interesting image. If you haven’t seen a hype cycle before, Gartner puts these out regularly to show the adoption trend of new technologies. The cycle goes like this: first the technology is created and takes off. People start talking about it, there’s a lot of excitement about its potential, and a few innovative companies try their hand at implementing it. As more organizations join, it hits what is called the peak of inflated expectations: before data starts rolling in about how successful everyone has been with the technology, people write blogs and articles on a few star case studies, and everyone thinks it will solve all of their problems. Then we hit the trough of disillusionment, where the data starts coming in and we realize that these technologies don’t work out of the box, that they need work and don’t apply in all situations. Those who don’t give up start building best practices, the technology matures, more stable vendors appear, and it enters the slope of enlightenment. We are just starting to climb out of the trough, realizing what we can realistically do with social tagging in the enterprise. As with any trend, it isn’t the panacea that people thought it would be, but it is a useful technique.
  • Appliance that works out of the box (OOTB); also has OOTB connectors for FAST and Google, and integrates with LDAP and most intranet portal search
  • More of a standalone system
  • Now called Lotus Connections

    1. Hybrid Approaches to Taxonomy & Folksonomy. Semantic Technology, 2009. Stephanie Lemieux, Earley & Associates. [email_address] www.earley.com
    2. Agenda
       • The taxonomy/folksonomy debate
       • Tagging pitfalls
       • Social tagging & the enterprise
       • Hybrid approaches to taxonomy/folksonomy
       • Corporate tagging tools
    3. About me
       • Stephanie Lemieux
         • Senior Consultant at Earley & Associates, Inc.
         • Masters in Library and Information Studies (MLIS), specializing in taxonomy development, content management, search, IA
         • Developed enterprise taxonomies and helped a variety of clients through CMS deployments
         • Projects include: Motorola, Ford Foundation, Best Buy, American Greetings, Urban Land Institute
         • Blog: http://sethearley.wordpress.com/
    4. The tired debate
       Taxonomy                 Folksonomy
       Control                  Democracy
       Top-down                 Bottom-up
       Arduous process          Just do it
       Accurate                 Good enough
       Restrictive              Flexible
       Static                   Evolving
       Expensive to maintain    Low cost (“crowdsourced”)
    5. The relevance problem
       • Search results should be relevant to what a searcher wants, but technology can only determine whether a result is relevant to a search term*
       • Taxonomies and folksonomies are two approaches to the problem of relevance, with the common goal of describing content, each with particular gaps
       * Billy Cripe: Folksonomy, Keywords & Tags: Social & Democratic User Interaction in Enterprise Content Management http://www.oracle.com/technology/products/content-management/pdf/OracleSocialTaggingWhitePaper.pdf
    6. Taxonomy
       • Added by a small number of individuals: authors/originators or “authorized” persons (e.g. a librarian)
       • Describes the meaning or purpose of content from a set viewpoint, for a specific audience, using a controlled vocabulary
       • Relationships between terms are defined
         • Hierarchical (e.g. Computer hardware > Keyboard)
         • Associative (e.g. Computer hardware – Software)
         • Equivalent (e.g. Laptop = Notebook Computer)
    7. Tags
       • Added by authors and consumers (individual motivation)
       • Can connote any type of meaning or purpose
       • No compression around a single viewpoint, no control of vocabulary
       • Self-correcting through volume
    8. Why tagging is so interesting…
       • Adding individual value to the act of classification: user control over findability
       • Reducing the cognitive burden (i.e. it’s easy)
       • Reduced technological investment (i.e. it’s cheap)
       • Can leverage emergent structure (folksonomy)
       Reno | Tags
    9. The downside…
       • Neither tags nor taggers are perfect…
       • No language control
         Misspellings:     Library vs. libary, Plam pilot
         Compound words:   TimBernersLee
         Case & number:    Folksonomy, Folksonomies
         Personal tags:    To read, My dog, @work
         Single-use tags:  Billybobsdog
       • Study: 40% of Flickr tags and 28% of del.icio.us tags were flawed in these ways
         • Guy & Tonkin, 2006.
         • http://www.dlib.org/dlib/january06/guy/01guy.html
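Several of the flaw types listed on this slide (case, number, compound words, known misspellings) can be reduced with light automatic cleanup. The following is a minimal sketch only: `normalize_tag`, `split_compound` and the `CORRECTIONS` table are hypothetical names made up for illustration, and the plural folding is deliberately naive. None of it comes from the Guy & Tonkin study.

```python
import re

# Hypothetical, hand-maintained table of known misspellings (assumption).
CORRECTIONS = {"libary": "library", "plam pilot": "palm pilot"}


def split_compound(tag: str) -> str:
    """Insert a space at each lower-to-upper case boundary (TimBernersLee -> Tim Berners Lee)."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", tag)


def normalize_tag(tag: str) -> str:
    """Return a lowercase, whitespace-collapsed, roughly singular form of a tag."""
    tag = split_compound(tag.strip())
    tag = re.sub(r"\s+", " ", tag).lower()
    tag = CORRECTIONS.get(tag, tag)
    # Naive plural folding; a real system would use a stemmer or lemmatizer.
    if tag.endswith("ies") and len(tag) > 4:
        tag = tag[:-3] + "y"
    elif tag.endswith("s") and not tag.endswith("ss") and len(tag) > 3:
        tag = tag[:-1]
    return tag
```

With this sketch, "Folksonomy" and "Folksonomies" collapse to the same key, while personal tags such as "@work" pass through unchanged; deciding what to do with those is a policy question rather than a cleanup step.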
    10. The downside…
       • Varying levels of granularity
       • Same tag, different meanings
       • Lack of relationships between tags: which is broader? Narrower?
       • Lack of consistency/approach to change: even a single user can change language and hamper their own personal retrieval
       E.g. Robin, Bird, Turdus migratorius
       … Known as “tag noise”
    11. The downside…
       • Most tag search does not account for stemming, plurals, etc.
       E.g. search on Delicious:
         Folksonomy: 16049
         Folksonomies: 4404
         Both: 2642
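The Delicious counts on this slide can be turned into a rough recall estimate. The sketch below assumes that "Both" means items carrying both tags, so the number of distinct items is the two counts minus the overlap. The function name `recall_without_stemming` is made up for illustration.

```python
def recall_without_stemming(count_a: int, count_b: int, count_both: int) -> dict:
    """Estimate what a search on only one tag variant finds and misses.

    Assumes count_both is the number of items carrying both variants.
    """
    total = count_a + count_b - count_both  # inclusion-exclusion
    return {
        "total": total,
        "missed_by_a": count_b - count_both,  # items tagged only with variant B
        "missed_by_b": count_a - count_both,  # items tagged only with variant A
        "recall_a": count_a / total,
        "recall_b": count_b / total,
    }


stats = recall_without_stemming(16049, 4404, 2642)
```

Under that assumption there are 17811 distinct items; a search on "folksonomy" alone misses 1762 of them, and a search on "folksonomies" alone finds fewer than a quarter.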
    12. The tagging hype cycle http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html
    13. The web vs. the enterprise
       • Shirky: “there is no shelf”
         • Traditional organization schemes are built to deal with physical collections and constraints.
         • They don’t work well on the web
           • large corpus
           • no clear edges
           • no formal categories
           • no authority
         Social tagging works well in this context (e.g. Flickr, Delicious)
       • The enterprise is much more defined
           • smaller corpuses
           • formal entities
           • coordinated users, clear tasks
           • need for reliable retrieval
         Social tagging is more of a challenge, needs clear arena
    14. Role of folksonomy in the enterprise?
       • Tagging external links
         • Seeing what colleagues are interested in
         • Sharing links with a specific team
         • Subscribing to link feeds
         • Monitoring news/blog coverage of the company
         • Consumer/competitor research
         • Tracking industry trends
       • Tagging internal links
         • Finding/facilitating access to most popular pages on the intranet
         • Seeing what intranet pages mean to staff
    15. Role of folksonomy in the enterprise?
       • Social aspects
         • Identifying subject matter experts
         • Connecting people who share interests
         • Encouraging collaboration & resource sharing
       • Improve your taxonomy, information retrieval
         • User tagging to refine the corporate taxonomy
           • New concepts
           • New terminology
         • Seeing what employees find interesting
         • Distributing tagging tasks
    16. The downside…
       • Potential issues of security, inappropriateness
         • Can implement some level of vetting
       • Privacy concerns
         • Tagging can be anonymous, although this removes some social value
         • Can create role- or team-based collections
       • Need a higher ratio of active participants because the population is smaller
    17. The content continuum
       Key concept: Not all content is created equal
       [Figure: content types placed along a continuum from Unfiltered to Reviewed/Vetted/Approved, with tagging/organizing processes going from lower cost to higher cost and content going from lower value to higher value. Labels shown, in order: Message text, External News Reports, Discussion postings, Links, Engineering document repositories, Success Stories, Policies, Approved Methods, Best Practices]
    18. What if we blended the two?
       Folksonomy               Taxonomy
       Low cost                 Findability
       Flexible                 Structured relationships
       User terminology         Oversight
       Social sharing           Consistency
    19. Hybrid approaches
       • Co-existence
       • Tag-influenced taxonomy
       • Taxonomy-influenced tagging
       • Tag hierarchies/ontologies
    20. Co-existence
       • Taxonomy and folksonomy are used side by side
       • Strengths of each approach preserved, philosophy of each kept “pure”
       Web example: Flickr & Library of Congress: http://www.flickr.com/photos/library_of_congress/
    21. Co-existence: public library
    22. Raytheon: corporate example
       • Used in the Raytheon employee portal for website lists (“Suggested Sites” feature box)
       • How does it work:
         • “Suggested Sites” are inserted in a “feature” box to the right of the regularly ranked results
         • Website suggestions (URLs) are submitted along with recommended tags/keywords, which are then verified and approved by librarians
       http://www.slideshare.net/CJMConnors/i-kms-singapore-presentation
    23. Variation: Tag mediation
       • Vetting & editing tags
       • Pros:
         • Weeds out potentially inappropriate tags
         • Eliminates misspellings, plural issues, etc.
         • Some can be done automatically (e.g. spell-checker)
         • Enhances findability
       • Cons:
         • Higher effort/cost
         • Perceived lack of trust
         • Who knows better?
    24. Tag-influenced taxonomy
       • Taxonomy & tagging co-exist; tags serve as a pool of candidate terms to enrich the taxonomy and keep it current
         • Find new terminology (synonyms, popular language)
         • Find new concepts
       • Performed as separate processes (taxonomy tagging = formal, tagging = informal) or combined in a single interface
    25. Tag-influenced taxonomy
       • Requires formal vetting process
       • Can be supported by automation (e.g. candidate tags pulled & filtered with script to remove taxonomy terms, stop words)
       • Evaluate candidates based on
         • Frequency (“literary warrant”)
         • Salience within context
       • Look at tags used in conjunction with taxonomy
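The filtering script mentioned on this slide could look roughly like the sketch below. It is an illustration under stated assumptions, not a description of any particular tool: `candidate_terms`, the tiny `STOP_WORDS` set and the `min_count` threshold are all hypothetical, and frequency is the only evaluation criterion applied (salience still needs a human reviewer).

```python
from collections import Counter

# Illustrative stop-word list only; a real deployment would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "to", "and"}


def candidate_terms(tag_log, taxonomy_terms, min_count=2):
    """Return (tag, count) pairs that are not taxonomy terms or stop words.

    Results are ordered by descending frequency, then alphabetically.
    """
    known = {term.lower() for term in taxonomy_terms}
    counts = Counter(tag.strip().lower() for tag in tag_log)
    candidates = [
        (tag, n)
        for tag, n in counts.items()
        if tag and tag not in known and tag not in STOP_WORDS and n >= min_count
    ]
    return sorted(candidates, key=lambda pair: (-pair[1], pair[0]))


tag_log = ["Laptop", "netbook", "Netbook", "the", "the",
           "netbook", "smartphone", "smartphone", "tablet"]
candidates = candidate_terms(tag_log, ["Laptop", "Keyboard"])
```

The output is a short review list for the taxonomist, which keeps the formal vetting step in place while removing the manual sifting.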
    26. Taxonomy-influenced tagging
       • Presenting choices/suggestions to the user from a controlled set of terms/tags
         • Sometimes users prefer an easy choice
           • Drop-down menus
           • Check boxes
           • Type ahead
           • Tree view
         • “Influenced”: option to enter own tag? Good source of new terms
         • Enforces consistency
         • Offers structure
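The type-ahead option can be sketched as a prefix lookup over a controlled vocabulary in which synonyms resolve to a preferred term. This is a minimal illustration: the `TypeAhead` class and its method names are hypothetical, and the vocabulary reuses the "Laptop = Notebook Computer" equivalence from the Taxonomy slide.

```python
from bisect import bisect_left


class TypeAhead:
    """Suggest preferred taxonomy terms for a typed prefix (hypothetical helper)."""

    def __init__(self, synonyms_to_preferred):
        # Map every lowercase label (synonyms and preferred terms) to its preferred term.
        self._labels = {k.lower(): v for k, v in synonyms_to_preferred.items()}
        for preferred in set(synonyms_to_preferred.values()):
            self._labels.setdefault(preferred.lower(), preferred)
        self._sorted = sorted(self._labels)

    def suggest(self, prefix, limit=5):
        """Return up to `limit` distinct preferred terms whose labels start with prefix."""
        prefix = prefix.lower()
        start = bisect_left(self._sorted, prefix)  # first label >= prefix
        seen, out = set(), []
        for label in self._sorted[start:]:
            if not label.startswith(prefix):
                break
            preferred = self._labels[label]
            if preferred not in seen:
                seen.add(preferred)
                out.append(preferred)
            if len(out) == limit:
                break
        return out


lookup = TypeAhead({
    "notebook computer": "Laptop",
    "notebook": "Laptop",
    "keyboard": "Keyboard",
})
```

Typing "note" suggests "Laptop", so the user's own wording still leads to the controlled term. An empty result is the natural place to offer the "enter your own tag" option.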
    27. WWW example: ZigTag (2008)
       • Definitions from Wikipedia & WordNet
       • Tagging with type-ahead against database of 3M unique concepts & 8M synonyms
    28. Zigtag
       • Type ahead & synonyms encourage consistency
       • Users can enter new tags
       • Synonyms based on Wikipedia, so can be “dirty data”
       • No hierarchy, only equivalent relationships so far
    29. Zigtag search
       • Still get problems with uncontrolled tags & recall
       • Interesting relationships from Wikipedia
       • Browsable tag cloud
    30. Example: myedna (Education.au)
       • Fully taxonomy-directed tagging
       http://www.educationau.edu.au/jahia/webdav/site/myjahiasite/shared/papers/tagging_hayman.ppt
    31. Buzzillions.com
       • Review site: tags are “controlled” not against a taxonomy, but against other tags, which reduces redundancy
       • Only popular tags exposed as faceted navigation
    32. SharePoint?
       • Plugins make taxonomy easy and present it like tags
       • E.g. KWizCom: plugin manages taxonomy and tags in an easy interface… can opt out of letting users create their own tags
    33. Taxonomy-directed tagging
       • Pros:
         • More consistency
         • Better support for findability
         • Relationships, definitions leveraged: adding meaning to the tags
         • Realistic for the enterprise
       • Cons:
         • Not really folksonomy anymore…
         • Can be forcing terminology on user
         • Need to develop reference list of concepts: manually through taxonomy, or need large corpus to derive automatically
    34. Tag hierarchies
       • 2 flavors: user-powered, automatic derivation
       • User-powered
         • Social approach
         • Bogus hierarchies possible
         • Small population will contribute
       • RawSugar tried it (no longer around): taggers could specify a hierarchy in their own account, and tags were clustered based on common groups
    35. RawSugar example
    36. More user-powered tag relationships
       • E.g. LibraryThing
       LibraryThing allows any user to combine (or uncombine) 2 tags that are semantically equivalent. www.librarything.com
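User-declared equivalence between tags can be modeled as a small alias table that maps each combined tag to a canonical one. The sketch below is an assumption about how such a feature might work, not LibraryThing's implementation; `TagAliases` and its methods are hypothetical names.

```python
class TagAliases:
    """Keep user-declared tag equivalences as alias -> canonical mappings."""

    def __init__(self):
        self._canonical = {}  # alias tag -> canonical tag

    def resolve(self, tag):
        """Return the canonical form of a tag, or the tag itself if uncombined."""
        return self._canonical.get(tag, tag)

    def combine(self, alias, canonical):
        """Declare `alias` equivalent to `canonical`."""
        canonical = self.resolve(canonical)
        if alias == canonical:
            return
        # Re-point anything previously combined into `alias`.
        for key, value in list(self._canonical.items()):
            if value == alias:
                self._canonical[key] = canonical
        self._canonical[alias] = canonical

    def uncombine(self, alias):
        """Undo a combination; the tag stands on its own again."""
        self._canonical.pop(alias, None)


aliases = TagAliases()
aliases.combine("sci-fi", "science fiction")
aliases.combine("sf", "sci-fi")
```

Because combining is reversible, a bad merge by one user can be undone by another, which fits the social approach described on the previous slides.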
    37. Automatic derivation
       • Tag hierarchies, facets, ontologies, or “folksontology”
       • Done through statistical/clustering algorithms
       http://www.pui.ch/phred/automated_tag_clustering/
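To give a feel for statistical derivation, here is a minimal sketch of a co-occurrence subsumption heuristic: tag A is proposed as broader than tag B when B almost always appears together with A, but not the other way round. This is a stand-in chosen for brevity, not the clustering method described at the linked page; `subsumption_pairs` and the 0.8 threshold are assumptions.

```python
from collections import Counter
from itertools import permutations


def subsumption_pairs(tagged_items, threshold=0.8):
    """Return sorted (broader, narrower) tag pairs inferred from co-occurrence."""
    tag_count = Counter()
    pair_count = Counter()
    for tags in tagged_items:
        unique = set(tags)
        tag_count.update(unique)
        pair_count.update(permutations(unique, 2))  # ordered pairs (a, b)
    pairs = []
    for (a, b), both in pair_count.items():
        # P(a | b) high and P(b | a) low suggests a is the broader tag.
        if both / tag_count[b] >= threshold and both / tag_count[a] < threshold:
            pairs.append((a, b))
    return sorted(pairs)


items = [
    ["bird", "robin"],
    ["bird", "robin"],
    ["bird", "sparrow"],
    ["bird", "sparrow"],
    ["bird"],
]
hierarchy = subsumption_pairs(items)
```

On small or skewed collections this kind of heuristic produces unreliable pairs, so its output is better treated as a list of candidates for review than as a finished hierarchy.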
    38. Delicious & CiteULike hierarchy http://heymann.stanford.edu/taghierarchy.html
    39. Clustering at Flickr
    40. Auto clustering/facets
       • Still not very mature
       • Time-sensitive
       • Community-sensitive
       • Ambiguous tags
       • Improve with volume (self-correcting)
       http://www.pui.ch/phred/automated_tag_clustering/
    41. Intelligent tags
       • Moving toward more semantic tagging with machine-readable tags
         • Flickr: can tag images with machine tags
         • e.g. geo:quartier="SoHo"
         • Syntax: namespace:predicate=value
         • e.g. lastfm:event=34640, which makes your photo appear on a lastfm event page
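The namespace:predicate=value shape makes machine tags easy to pick apart programmatically. The parser below is a sketch: the regular expression is an assumption about which characters are allowed in the namespace and predicate, not Flickr's exact grammar, and `parse_machine_tag` is a hypothetical helper.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumed grammar: namespace and predicate start with a letter, then
# letters, digits or underscores; the value is everything after "=".
_PATTERN = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.*)$")


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Split a machine tag into its parts, or return None for an ordinary tag."""
    match = _PATTERN.match(tag.strip())
    if match is None:
        return None
    namespace, predicate, value = match.groups()
    # Drop one pair of surrounding double quotes, if present.
    if len(value) >= 2 and value[0] == value[-1] == '"':
        value = value[1:-1]
    return MachineTag(namespace, predicate, value)
```

Ordinary tags such as "robin" return None, so the same tag field can hold both free-form and machine-readable tags.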
    42. Intelligent tags
       • MOAT (Meaning Of A Tag): part of the linked data movement, mapping tags to the semantic web
         • http://moat-project.org/
       • Adding to the triplet
         • User – resource – tag – meaning
         • Meaning = URI to a resource containing meaning (e.g. DBpedia)
       <tag:RestrictedTagging>
         <tag:taggedResource rdf:resource="http://example.org/post/1"/>
         <foaf:maker rdf:resource="http://apassant.net/alex"/>
         <tag:associatedTag rdf:resource="http://tags.moat-project.org/tag/apple"/>
         <moat:tagMeaning rdf:resource="http://dbpedia.org/resource/Apple_Records"/>
       </tag:RestrictedTagging>
    43. Conclusion
       • Not all content is created equal: tags and taxonomies have their sweet spots
       • Hybrid approaches are emerging
         • taxonomy-influenced tagging leading the pack in popularity on the web
         • co-existence in the enterprise
       • Look for more developments on the semantic web/linked data front for making tags more intelligent
    44. Corporate social tagging tools
    45. Corporate social tagging software http://www.connectbeam.com/
    46. Corporate social tagging software http://www.cogenz.com/
    47. Corporate social tagging software http://www-306.ibm.com/software/lotus/products/connections/dogear.html
    48. Corporate social tagging software
       • BEA AquaLogic Pathways
         • http://www.bea.com/framework.jsp?CNT=index.jsp&FP=/content/products/aqualogic/pathways/
    49. Corporate social tagging software
         • http://www.newsgator.com/business/socialsites/default.aspx
    50. Questions? Stephanie Lemieux, [email_address], www.earley.com, 781-444-0287. Blog: sethearley.wordpress.com. Twitter: stephlemieux. Send an email to [email_address] for a free pass to one of our conference calls.