Semantic Technology 2009: Hybrid Approaches to Taxonomy and Folksonomy

Tagging isn’t new: it has been around for a dog’s age in internet years. But in the past few years, fresh ideas and tools have reinvigorated the social tagging world. These new approaches include an attempt to improve findability through a bit of structure and control. Adding control to folksonomy may seem to go against the whole selling point of social tagging (flexibility, openness), but it is taking tagging to a new level and making it more viable for practical use in enterprises. This session will present hybrid approaches to formal taxonomies and social tagging. How can they be used in the corporate environment? What type of content is appropriate for social tagging? What kind of software is available for the enterprise? Learn how social tagging is not necessarily anathema to corporate taxonomy programs and how this hybrid approach can bring the best of both worlds: a fresh, up-to-date taxonomy with the structure needed to improve information findability.
Key Takeaways:

Folksonomy and taxonomy defined
Drawbacks of pure social tagging
Social tagging in the enterprise
Hybrid taxonomy & folksonomy approaches: Four models

  • Interesting slideshow, thanks for sharing. FYI, you might want to put Knowledge Plaza (http://www.knowledgeplaza.be) on your radar, as it has been providing a mixed taxonomy/folksonomy approach in its information/knowledge management environment for a couple of years now.

Transcript

  • 1. Hybrid Approaches to Taxonomy & Folksonomy
    • Semantic Technology 2009, San Jose, CA, June 17, 2009
    • Richard Beatch, Paul Wlodarczyk
    • Earley & Associates, www.earley.com
  • 2. Agenda
    • The taxonomy/folksonomy debate
    • Tagging pitfalls
    • Social tagging & the enterprise
    • Hybrid approaches to taxonomy/folksonomy
      • Co-existence
      • Tag-influenced taxonomy
      • Taxonomy-influenced tags
      • Tag hierarchies/ontologies
    • Conclusion
    Copyright © 2009 Earley & Associates Inc. All Rights Reserved
  • 3. About Earley & Associates
    • Founded in 1994, Earley & Associates is an information management (IM) consulting company specializing in
      • Taxonomy development and management
      • Content management strategy
      • Search integration
      • Usability & Information Architecture
    • Some of our recent clients include:
      • American Greetings, Hasbro, Ford Foundation, Astra Zeneca, Motorola, The Hartford Insurance Group, Urban Land Institute
    • Give us your business card
      • For a free pass to one of our Community of Practice conference calls
  • 4. About us
    • Richard Beatch
      • Senior Consultant at Earley & Associates, Inc.
      • Ph.D. in Ontology
      • Specialized in Taxonomy, Search, Metadata, and content architecture.
      • Extensive industry experience leading the implementation and design of taxonomies and search solutions for a range of companies including Apple, McAfee, Allstate, Dell, and AT&T.
      • Blog: http://sethearley.wordpress.com/
  • 5. About us
    • Paul Wlodarczyk
      • Director, Solutions Consulting at Earley & Associates, Inc.
      • MBA with BA in Psychology / Cognitive Science
      • Specialized in unstructured content technologies, with over 20 years of experience in XML / structured authoring, content reuse, ECM, KM, localization, semantic analysis and content enrichment
      • Blogs at http://sethearley.wordpress.com/ and http://thecontentguy.net
  • 6. The tired debate
    • Taxonomy vs. Folksonomy
      • Control vs. Democracy
      • Top-down vs. Bottom-up
      • Arduous process vs. Just do it
      • Accurate vs. Good enough
      • Restrictive vs. Flexible
      • Static vs. Evolving
      • Expensive to maintain vs. Low cost – “crowdsourced”
  • 7. The relevance problem
    • Search results should be relevant to what a searcher wants, but technology can only determine if it is relevant to a search term*
    • Taxonomies and folksonomies = 2 approaches to the problem of relevance with common goal of describing content, each with particular gaps
    *Billy Cripe, “Folksonomy, Keywords & Tags: Social & Democratic User Interaction in Enterprise Content Management”: http://www.oracle.com/technology/products/content-management/pdf/OracleSocialTaggingWhitePaper.pdf
  • 8. Taxonomy
    • Added by a small number of individuals: authors/originators or “authorized” persons (e.g. a librarian)
    • Describes meaning or purpose of content based on a set view point for a specific audience using a controlled vocabulary
    • Relationships between terms defined
      • Hierarchical (e.g. Computer hardware > Keyboard)
      • Associative (e.g. Computer hardware – Software)
      • Equivalent (e.g. Laptop = Notebook Computer)
  • 9. Tags
    • Added by authors and consumers (individual motivation)
    • Can connote any type of meaning or purpose
    • No compression around a single viewpoint, no control of vocabulary
    • Self-correcting through volume
  • 10. Why tagging is so interesting…
    • Adding individual value to the act of classification – user control over findability
    • Reducing the cognitive burden (i.e. it’s easy)
    • Reduced technological investment (i.e. it’s cheap)
    • Can leverage emergent structure (folksonomy)
  • 11. The downside…
    • Neither tags nor taggers are perfect …
    • No language control
      • Misspellings: Library vs. libary, Plam pilot
      • Compound words: TimBernersLee
      • Case & number: Folksonomy, Folksonomies
      • Personal tags: To read, My dog, @work
      • Single-use tags: Billybobsdog
    • Study: 40% of Flickr tags and 28% of del.icio.us tags were flawed in these ways
      • Guy & Tonkin, 2006.
      • http://www.dlib.org/dlib/january06/guy/01guy.html
  • 12. The downside…
    • Varying levels of granularity
    • Same tag, different meanings
    • Lack of relationships between tags – which is broader? Narrower?
    • Lack of consistency/approach to change – even single user can change language and hamper own personal retrieval
    • E.g. Robin, Bird, Turdus migratorius …
    • Known as “tag noise”
  • 13. The downside…
    • Most tag search does not account for stemming, plurals, etc.
    • E.g. Search on Delicious:
      • Folksonomy: 16049
      • Folksonomies: 4404
      • Both: 2642
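The Delicious counts in slide 13 suggest what a tag search loses when it ignores plurals. The sketch below is one possible way to fold case, spacing, and simple plurals before indexing. The `normalize_tag` and `build_index` helpers and the sample taggings are hypothetical, the plural rule is deliberately naive (it is not a real stemmer and does not fix misspellings), and the arithmetic assumes that "Both" counts items carrying both tags.

```python
import re
from collections import defaultdict

def normalize_tag(tag: str) -> str:
    """Fold case, drop spaces/underscores/hyphens, apply a naive plural rule."""
    t = re.sub(r"[\s_\-]+", "", tag.strip().lower())
    if t.endswith("ies") and len(t) > 4:
        return t[:-3] + "y"          # folksonomies -> folksonomy
    if t.endswith("s") and not t.endswith("ss") and len(t) > 3:
        return t[:-1]                # tags -> tag
    return t

def build_index(taggings):
    """Map each normalized tag to the set of item ids that carry it."""
    index = defaultdict(set)
    for item_id, tag in taggings:
        index[normalize_tag(tag)].add(item_id)
    return index

# Hypothetical sample data, not taken from Delicious.
taggings = [
    ("a", "Folksonomy"), ("b", "folksonomies"),
    ("c", "Folksonomy"), ("c", "Folksonomies"),
    ("d", "Tim Berners Lee"), ("e", "TimBernersLee"),
]
index = build_index(taggings)
results = index[normalize_tag("folksonomies")]   # items a, b and c

# Slide 13 figures, assuming "Both" means items tagged with both forms.
singular, plural, both = 16049, 4404, 2642
union = singular + plural - both   # items a plural-aware search could return
missed = plural - both             # items a search on "Folksonomy" alone misses
print(sorted(results), union, missed)
```

Under that assumption, a search on the singular form alone would skip the items tagged only with the plural, which is the recall gap the slide points at.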
  • 14. The tagging hype cycle http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html
  • 15. The web vs. the enterprise
    • Shirky: “there is no shelf”
      • Traditional organization schemes are built to deal with physical collections and constraints.
      • They don’t work well on the web
        • large corpus
        • no clear edges
        • no formal categories
        • no authority
    • The enterprise is much more defined
        • smaller corpuses
        • formal entities
        • coordinated users, clear tasks
        • need for reliable retrieval
    • On the web (e.g. Flickr, Delicious): social tagging works well in this context
    • In the enterprise: social tagging is more of a challenge, needs clear arena
  • 16. Role of folksonomy in the enterprise?
    • Tagging external links
      • Seeing what colleagues are interested in
      • Sharing links with a specific team
      • Subscribing to link feeds
      • Monitoring news/blog coverage of the company
      • Consumer/competitor research
      • Tracking industry trends
    • Tagging internal links
      • Finding/facilitating access to most popular pages on the intranet
      • Seeing what intranet pages mean to staff
  • 17. Role of folksonomy in the enterprise?
    • Social aspects
      • Identifying subject matter experts
      • Connecting people who share interests
      • Encouraging collaboration & resource sharing
    • Improve your taxonomy, information retrieval
      • User tagging to refine the corporate taxonomy
        • New concepts
        • New terminology
      • Seeing what employees find interesting
      • Distributing tagging tasks
  • 18. The downside…
    • Potential issues of security, inappropriateness
      • Can implement some level of vetting
    • Privacy concerns
      • Can be anonymous tagging, although this removes some social value
      • Can create role or team-based collections
    • Need higher ratio of active participants due to population size
  • 19. The content continuum
    • Key concept: Not all content is created equal
    • Content types shown: Message text, External News Reports, Discussion postings, Links, Engineering document repositories, Success Stories, Policies, Approved Methods, Best Practices
    • Axes: Unfiltered to Reviewed/Vetted/Approved; Lower Value to Higher Value; Lower Cost to Higher Cost (Tagging/Organizing Processes)
  • 20. What if we blended the two?
    • Folksonomy: Low cost, Flexible, User terminology, Social sharing
    • Taxonomy: Findability, Structured relationships, Oversight, Consistency
  • 21. Hybrid approaches
    • Co-existence
    • Tag-influenced taxonomy
    • Taxonomy-influenced tagging
    • Tag hierarchies/ontologies
  • 22. Co-existence
    • Taxonomy and folksonomy are used side by side
    • Strengths of each approach preserved, philosophy of each kept “pure”
    • Web example: Flickr & Library of Congress: http://www.flickr.com/photos/library_of_congress/
  • 23. Co-existence – Ann Arbor District Library
  • 24. Raytheon – corporate example
    • Used in the Raytheon employee portal for website lists (“Suggested Sites” feature box)
    • How does it work:
      • “Suggested Sites” appear in a “feature” box to the right of the regularly ranked results
      • Website suggestions (URLs) are submitted along with recommended tags/keywords, which are subsequently verified and approved by librarians
    • http://www.slideshare.net/CJMConnors/i-kms-singapore-presentation
  • 25. Variation: Tag mediation
    • Vetting & editing tags
    • Pros:
      • Weeds out potentially inappropriate tags
      • Eliminates misspellings, plural issues, etc.
      • Some can be done automatically (spell-checker, e.g.)
      • Enhances findability
    • Cons:
      • Higher effort/cost
      • Perceived lack of trust
      • Who knows better?
  • 26. Tag-influenced taxonomy
    • Taxonomy & tagging co-exist, tags serve as pool of candidate terms to enrich taxonomy, keep it current
      • Find new terminology (synonyms, popular language)
      • Find new concepts
    • Performed as separate processes (taxonomy tagging = formal, social tagging = informal) or combined in a single interface
  • 27. Tag-influenced taxonomy
    • Requires formal vetting process
    • Can be supported by automation (e.g. candidate tags pulled & filtered with script to remove taxonomy terms, stop words)
    • Evaluate candidates based on
      • Frequency (“literary warrant”)
      • Salience within context
    • Look at tags used in conjunction with taxonomy
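The automation mentioned in slide 27 (pulling candidate tags and filtering out taxonomy terms and stop words) could look roughly like the sketch below. The `candidate_terms` function, the stop-word list, the `min_count` threshold, and the sample data are all illustrative assumptions, not a description of any particular tool. Frequency stands in for "literary warrant"; salience within context would still need human review.

```python
from collections import Counter

# Illustrative stop list; a real deployment would maintain its own.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "toread"}

def candidate_terms(tags, taxonomy_terms, stop_words=STOP_WORDS, min_count=2):
    """Return (tag, frequency) pairs that are not already in the taxonomy,
    are not stop words, and occur at least min_count times."""
    known = {t.lower() for t in taxonomy_terms}
    counts = Counter(t.strip().lower() for t in tags)
    return [
        (tag, n)
        for tag, n in counts.most_common()   # most frequent first
        if n >= min_count and tag and tag not in known and tag not in stop_words
    ]

# Hypothetical tag stream and taxonomy.
tags = ["Laptop", "netbook", "netbook", "Netbook", "the", "the",
        "cloud", "cloud", "billybobsdog"]
taxonomy = ["Laptop", "Keyboard"]
candidates = candidate_terms(tags, taxonomy)
print(candidates)
```

In this sample, the single-use tag and the term already in the taxonomy drop out, leaving a short list for the vetting process.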
  • 28. Taxonomy-influenced tagging
    • Presenting choices/suggestions to user from controlled set of terms/tags
      • Sometimes users prefer easy choice
        • Drop-down menus
        • Check boxes
        • Type ahead
        • Tree view
      • “Influenced” – is there an option to enter your own tag? This is a good source of new terms
      • Enforces consistency
      • Offers structure
  • 29. WWW example: ZigTag
    • Defined tagging
    • Definitions from Wikipedia & WordNet
    • Tagging with type-ahead against database of 3M unique concepts & 8M synonyms
  • 30. ZigTag
    • Type ahead & synonyms encourage consistency
    • Users can enter new tags
    • Synonyms based on Wikipedia, so can be “dirty data”
    • No hierarchy, only equivalent relationships so far
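Type-ahead against a controlled vocabulary with synonyms, as described in slides 28 to 30, can be sketched as follows. This is a minimal illustration and not ZigTag's implementation: the `SYNONYMS` table, `suggest`, and `resolve` are hypothetical names, and the vocabulary reuses the Laptop = Notebook Computer example from slide 8.

```python
# Maps every accepted label (lowercased) to its preferred term.
SYNONYMS = {
    "laptop": "Laptop",
    "notebook computer": "Laptop",
    "keyboard": "Keyboard",
    "computer hardware": "Computer hardware",
}

def suggest(prefix, vocabulary=SYNONYMS, limit=5):
    """Return preferred terms whose label or synonym starts with the prefix."""
    p = prefix.strip().lower()
    if not p:
        return []
    suggestions = []
    for label in sorted(vocabulary):
        preferred = vocabulary[label]
        if label.startswith(p) and preferred not in suggestions:
            suggestions.append(preferred)
    return suggestions[:limit]

def resolve(tag, vocabulary=SYNONYMS):
    """Return (term, is_new). Unknown tags are kept as typed and flagged,
    so they can feed back into the taxonomy as candidate terms."""
    key = tag.strip().lower()
    if key in vocabulary:
        return vocabulary[key], False
    return tag.strip(), True

print(suggest("note"))               # synonym resolves to the preferred term
print(resolve("Notebook Computer"))
print(resolve("netbook"))            # new tag, flagged for review
```

Letting `resolve` accept unknown tags keeps the "influenced" character of the approach; rejecting them instead would make the tagging fully taxonomy-directed.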
  • 31. ZigTag search
    • Still get problems with uncontrolled tags & recall
    • Interesting relationships from Wikipedia
    • Browse-able tag cloud
  • 32. Example: myedna (Education.au)
    • Fully taxonomy-directed tagging
    • http://www.educationau.edu.au/jahia/webdav/site/myjahiasite/shared/papers/tagging_hayman.ppt
  • 33. TextWise Semantic Cloud
    • Document (URL or text) is submitted to web service for semantic analysis
    • Category tags from subset of the ODP taxonomy
    • Concept tags are derived from document, persisted, related to ODP categories
  • 34. Buzzillions.com
    • Review site: tags are “controlled” not against a taxonomy, but against other tags – reduces redundancy
    • Only popular tags exposed as faceted navigation
  • 35. SharePoint?
    • Plug-ins make taxonomy easy
    • Present the taxonomy like tags
    • E.g. KWizCom: plug-in manages taxonomy and tags in easy interface… can opt-out of letting users create own tags
  • 36. Taxonomy-influenced tagging
    • Pros:
      • More consistency
      • Better support for findability
      • Relationships, definitions leveraged – adding meaning to the tags
      • Realistic for the enterprise
    • Cons:
      • Not really folksonomy anymore
      • Can be forcing terminology on user
      • Need to develop reference list of concepts – manually through taxonomy or need large corpus to derive automatically
  • 37. Tag hierarchies
    • Tag hierarchies come in two flavors:
      • User-powered
      • Automatic derivation
  • 38. User-powered tag hierarchies
    • User-powered
      • Social approach
      • Bogus hierarchies possible
      • Small population will contribute
    • RawSugar tried it
      • (no longer around)
      • Taggers could specify hierarchy in own account, tags clustered based on common groups
  • 39. User-powered tag hierarchies
    • E.g. LibraryThing
    • LibraryThing allows any user to combine (or uncombine) two tags that are semantically equivalent: www.librarything.com
  • 40. User-powered tag hierarchies: Intelligent tags
    • Move toward more semantic tagging with machine-readable tags, e.g. Flickr machine tags in “triple” format: [namespace]:[key]=[value]
      • geo:neighborhood=SoHo, geo:lat=58.41618, etc.
      • flickr:user=mortimer
      • taxonomy:common=grevyszebra  
      • lastfm:event=34640
        • makes your photo appear on a lastfm event page
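The [namespace]:[key]=[value] format from slide 40 is simple enough to parse with a regular expression. The sketch below is an assumption-laden illustration: the `MachineTag` type, the `parse_machine_tag` helper, and the character rules in the pattern are hypothetical and are not Flickr's exact grammar.

```python
import re
from typing import NamedTuple, Optional

class MachineTag(NamedTuple):
    namespace: str
    key: str
    value: str

# Assumed shape: namespace and key are letters, digits or underscores
# starting with a letter; the value is everything after the first "=".
_PATTERN = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")

def parse_machine_tag(raw: str) -> Optional[MachineTag]:
    """Return a MachineTag, or None when the tag is an ordinary free-text tag."""
    match = _PATTERN.match(raw.strip())
    return MachineTag(*match.groups()) if match else None

for raw in ["geo:lat=58.41618", "lastfm:event=34640", "billybobsdog"]:
    print(raw, "->", parse_machine_tag(raw))
```

An application can then branch on `namespace` and `key`, for example plotting `geo` tags on a map while treating unparsed tags as plain folksonomy.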
  • 41. User-powered tag hierarchies: Intelligent tags
    • MOAT: Meaning of a tag – part of linked data movement, mapping tags to semantic web
      • http://moat-project.org/
    • Adding to the triplet
      • User – resource – tag – meaning
      • Meaning = URI to a resource containing meaning (e.g. DBPedia)
    <tag:RestrictedTagging>
      <tag:taggedResource rdf:resource="http://example.org/post/1"/>
      <foaf:maker rdf:resource="http://apassant.net/alex"/>
      <tag:associatedTag rdf:resource="http://tags.moat-project.org/tag/apple"/>
      <moat:tagMeaning rdf:resource="http://dbpedia.org/resource/Apple_Records"/>
    </tag:RestrictedTagging>
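The user, resource, tag, meaning structure from slide 41 can also be pictured as a plain data record. The sketch below is not the MOAT API; the `Tagging` class and the two helper functions are hypothetical. The first record mirrors the RDF snippet on the slide, and the second record (a different user attaching the fruit meaning) is invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tagging:
    user: str       # who tagged
    resource: str   # what was tagged
    tag: str        # the free-text label
    meaning: str    # URI of a resource that defines the intended sense

def meanings_of(tag, taggings):
    """All distinct meanings that users have attached to one label."""
    return {t.meaning for t in taggings if t.tag == tag}

def resources_about(meaning, taggings):
    """Resources tagged with a given meaning, whatever label was used."""
    return {t.resource for t in taggings if t.meaning == meaning}

taggings = [
    # From the slide's RDF example.
    Tagging("http://apassant.net/alex", "http://example.org/post/1",
            "apple", "http://dbpedia.org/resource/Apple_Records"),
    # Hypothetical second tagging with a different sense of "apple".
    Tagging("http://example.org/user/bob", "http://example.org/post/2",
            "apple", "http://dbpedia.org/resource/Apple"),
]

print(meanings_of("apple", taggings))
print(resources_about("http://dbpedia.org/resource/Apple_Records", taggings))
```

Searching by meaning rather than by label is what separates the two senses of "apple" here, which addresses the "same tag, different meanings" problem raised in slide 12.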
  • 42. Automatically derived tag hierarchies
    • Tag hierarchies, facets, ontologies, or “folksontology”
    • Done through statistical/clustering algorithms
    • http://www.pui.ch/phred/automated_tag_clustering/
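As a rough illustration of how a statistical approach can derive broader/narrower relationships from tags, the sketch below uses a simple co-occurrence subsumption heuristic: tag A is treated as broader than tag B when most items tagged B also carry A, but not the other way round. This is a swapped-in teaching example, not the algorithm used by the projects linked from these slides; `derive_broader`, the 0.8 threshold, and the sample items are assumptions.

```python
from collections import Counter
from itertools import permutations

def derive_broader(items, threshold=0.8):
    """Return sorted (broader, narrower) pairs from tag co-occurrence.

    items: iterable of tag collections, one per tagged resource.
    """
    tag_count = Counter()
    pair_count = Counter()
    for tags in items:
        unique = set(tags)
        tag_count.update(unique)
        # Count every ordered pair of distinct tags on the same item.
        pair_count.update(permutations(sorted(unique), 2))

    pairs = []
    for (a, b), together in pair_count.items():
        p_a_given_b = together / tag_count[b]   # share of b-items that also have a
        p_b_given_a = together / tag_count[a]   # share of a-items that also have b
        if p_a_given_b >= threshold and p_b_given_a < threshold:
            pairs.append((a, b))
    return sorted(pairs)

# Hypothetical tagged items.
items = [
    {"bird", "robin"}, {"bird", "robin"},
    {"bird", "eagle"}, {"bird", "eagle"},
    {"bird"},
]
hierarchy = derive_broader(items)
print(hierarchy)
```

With only a handful of items the ratios swing wildly, so a heuristic like this produces bogus relationships at low volume and steadier ones as tagging volume grows.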
  • 43. Delicious & CiteULike hierarchy http://heymann.stanford.edu/taghierarchy.html
  • 44. Clustering at Flickr
  • 45. Auto clustering/facets
    • Still not very mature
    • Time-sensitive
    • Community-sensitive
    • Ambiguous tags
    • Improve with volume (self-correcting)
    • http://www.pui.ch/phred/automated_tag_clustering/
  • 46. Tag hierarchy pros and cons
    • Pros:
      • Relationships, definitions leveraged – adding meaning to the tags
      • Provides a basis for application behavior in the absence of taxonomy (e.g. Flickr maps, clusters)
      • Self-correcting with volume
    • Cons:
      • Automatically derived relationships (clusters) can be bogus or time-sensitive
      • Folksonomic relationships can be esoteric (just like tags)
      • Small population of contributors
  • 47. Conclusion
    • Not all content is created equal – tags and taxonomies have their sweet spots
    • Hybrid approaches are emerging
      • taxonomy-influenced tagging leading the pack in popularity on the web
      • co-existence in the enterprise
    • Look for more developments on the semantic web/linked data front for making tags more intelligent
  • 48. Questions?
    • Richard Beatch [email_address]
    • Paul Wlodarczyk [email_address]
    • Web: www.earley.com
    • Blog: sethearley.wordpress.com
    • Twitter: earleytaxonomy
    • Give us your business card for a free pass to one of our Community of Practice conference calls (a $50 value).
  • 49. Appendix: Corporate social tagging tools
  • 50. Corporate social tagging software http://www.connectbeam.com/
  • 51. Corporate social tagging software http://www.cogenz.com/
  • 52. Corporate social tagging software http://www-306.ibm.com/software/lotus/products/connections/dogear.html
  • 53. Corporate social tagging software
    • BEA AquaLogic Pathways
        • http://www.bea.com/framework.jsp?CNT=index.jsp&FP=/content/products/aqualogic/pathways/
  • 54. Corporate social tagging software
        • http://www.newsgator.com/business/socialsites/default.aspx