Hybrid Approaches to Taxonomy & Folksonomy. Semantic Technology 2009, San Jose, CA, June 17, 2009. Richard Beatch and Paul Wlodarczyk, Earley & Associates, www.earley.com
Agenda: The taxonomy/folksonomy debate; Tagging pitfalls; Social tagging & the enterprise; Hybrid approaches to taxonomy/folksonomy (Co-existence; Tag-influenced taxonomy; Taxonomy-influenced tags; Tag hierarchies/ontologies); Conclusion. Copyright © 2009 Earley & Associates Inc. All Rights Reserved
About Earley & Associates: Founded in 1994, Earley & Associates is an information management (IM) consulting company specializing in taxonomy development and management; content management strategy; search integration; usability & information architecture. Some of our recent clients include: American Greetings, Hasbro, Ford Foundation, Astra Zeneca, Motorola, The Hartford Insurance Group, Urban Land Institute. Give us your business card for a free pass to one of our Community of Practice conference calls.
About us: Richard Beatch. Senior Consultant at Earley & Associates, Inc. Ph.D. in Ontology. Specializes in taxonomy, search, metadata, and content architecture. Extensive industry experience leading the design and implementation of taxonomies and search solutions for a range of companies including Apple, McAfee, Allstate, Dell, and AT&T. Blog: http://sethearley.wordpress.com/
About us: Paul Wlodarczyk. Director, Solutions Consulting at Earley & Associates, Inc. MBA with BA in Psychology / Cognitive Science. Specializes in unstructured content technologies, with over 20 years' experience in XML / structured authoring, content reuse, ECM, KM, localization, semantic analysis and content enrichment. Blogs at http://sethearley.wordpress.com/ and http://thecontentguy.net
The tired debate: Taxonomy vs. folksonomy. Control vs. democracy; top-down vs. bottom-up; arduous process vs. “just do it”; accurate vs. good enough; restrictive vs. flexible; static vs. evolving; expensive to maintain vs. low cost (“crowdsourced”).
The relevance problem: Search results should be relevant to what a searcher wants, but technology can only determine whether a result is relevant to a search term.* Taxonomies and folksonomies are two approaches to the problem of relevance; they share the goal of describing content, and each has particular gaps. *Billy Cripe: Folksonomy, Keywords & Tags: Social & Democratic User Interaction in Enterprise Content Management, http://www.oracle.com/technology/products/content-management/pdf/OracleSocialTaggingWhitePaper.pdf
Taxonomy: Added by a small number of individuals: authors/originators or “authorized” persons (e.g. a librarian). Describes the meaning or purpose of content from a set viewpoint, for a specific audience, using a controlled vocabulary. Relationships between terms are defined: hierarchical (e.g. Computer hardware > Keyboard); associative (e.g. Computer hardware, Software); equivalent (e.g. Laptop = Notebook Computer).
Tags: Added by authors and consumers (individual motivation). Can connote any type of meaning or purpose. No compression around a single viewpoint, no control of vocabulary. Self-correcting through volume.
Why tagging is so interesting… Adds individual value to the act of classification: user control over findability. Reduces the cognitive burden (i.e. it’s easy). Reduces technological investment (i.e. it’s cheap). Can leverage emergent structure (folksonomy).
The downside… Neither tags nor taggers are perfect: there is no language control. Misspellings (Library vs. libary; Plam pilot); compound words (TimBernersLee); case & number (Folksonomy, Folksonomies); personal tags (To read; My dog; @work); single-use tags (Billybobsdog). Study (Guy & Tonkin, 2006, http://www.dlib.org/dlib/january06/guy/01guy.html): 40% of flickr tags and 28% of del.icio.us tags were flawed in these ways.
The downside… Varying levels of granularity (e.g. Bird, Robin, Turdus migratorius). Same tag, different meanings. Lack of relationships between tags: which is broader? Narrower? Lack of consistency or a managed approach to change: even a single user can change language over time and hamper their own retrieval. Together these are known as “tag noise”.
The downside… Most tag search does not account for stemming, plurals, etc. E.g. search on Delicious: Folksonomy: 16049; Folksonomies: 4404; Both: 2642.
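To make the plural problem concrete, here is a minimal sketch of a tag search that folds case and a couple of regular English plural endings before matching. The functions `normalize_tag` and `search`, and the tiny index, are hypothetical and for illustration only; the suffix rules are a crude stand-in for a real stemmer and are not how Delicious or any other service works.

```python
def normalize_tag(tag: str) -> str:
    """Fold case and two regular plural endings (a crude, illustrative stemmer)."""
    t = tag.strip().lower()
    if t.endswith("ies") and len(t) > 4:
        return t[:-3] + "y"          # folksonomies -> folksonomy
    if t.endswith("s") and not t.endswith("ss") and len(t) > 3:
        return t[:-1]                # tags -> tag
    return t


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return every item whose tag matches the query after normalization."""
    key = normalize_tag(query)
    hits: set[str] = set()
    for tag, items in index.items():
        if normalize_tag(tag) == key:
            hits |= items
    return hits


# Hypothetical index: tag -> set of bookmark ids
index = {
    "Folksonomy": {"bookmark-1", "bookmark-2"},
    "folksonomies": {"bookmark-2", "bookmark-3"},
}
print(sorted(search(index, "folksonomies")))
```

If “Both” in the Delicious figures counts bookmarks carrying both tags, a search that merged the two forms would return 16049 + 4404 - 2642 = 17811 bookmarks, rather than either count alone.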
The tagging hype cycle: http://www.pui.ch/phred/archives/2007/05/tag-history-and-gartners-hype-cycles.html
The web vs. the enterprise. Shirky: “there is no shelf”. Traditional organization schemes are built to deal with physical collections and constraints; they don’t work well on the web. The web (e.g. Flickr, Delicious): large corpus; no clear edges; no formal categories; no authority. Social tagging works well in this context. The enterprise is much more defined: smaller corpuses; formal entities; coordinated users, clear tasks; need for reliable retrieval. Social tagging is more of a challenge here and needs a clear arena.
Role of folksonomy in the enterprise? Tagging external links: seeing what colleagues are interested in; sharing links with a specific team; subscribing to link feeds; monitoring news/blog coverage of the company; consumer/competitor research; tracking industry trends. Tagging internal links: finding/facilitating access to the most popular pages on the intranet; seeing what intranet pages mean to staff.
Role of folksonomy in the enterprise? Social aspects: identifying subject matter experts; connecting people who share interests; encouraging collaboration & resource sharing. Improving your taxonomy and information retrieval: user tagging to refine the corporate taxonomy (new concepts, new terminology); seeing what employees find interesting; distributing tagging tasks.
The downside… Potential issues of security and inappropriateness: can implement some level of vetting. Privacy concerns: tagging can be anonymous, although this removes some social value; can create role- or team-based collections. Needs a higher ratio of active participants because the user population is smaller.
The content continuum. Key concept: not all content is created equal. Content types, in the order shown on the slide: message text, external news, reports, discussion postings, links, engineering document repositories, success stories, policies, approved methods, best practices. Axes: tagging/organizing processes run from lower cost to higher cost; content runs from unfiltered to reviewed/vetted/approved and from lower value to higher value.
What if we blended the two? Folksonomy: low cost; flexible; user terminology; social sharing. Taxonomy: findability; structured relationships; oversight; consistency.
Hybrid approaches: Co-existence; Tag-influenced taxonomy; Taxonomy-influenced tagging; Tag hierarchies/ontologies.
Co-existence: Taxonomy and folksonomy are used side by side. The strengths of each approach are preserved, and the philosophy of each is kept “pure”. Web example: Flickr & Library of Congress: http://www.flickr.com/photos/library_of_congress/
Co-existence: Ann Arbor District Library
Raytheon: corporate example. Used in the Raytheon employee portal for website lists. How it works: a “Suggested Sites” feature box appears to the right of the regularly ranked results; website suggestions (URLs) are submitted along with recommended tags/keywords, which are subsequently verified and approved by librarians. http://www.slideshare.net/CJMConnors/i-kms-singapore-presentation
Variation: Tag mediation (vetting & editing tags). Pros: weeds out potentially inappropriate tags; eliminates misspellings, plural issues, etc.; some of it can be done automatically (e.g. with a spell-checker); enhances findability. Cons: higher effort/cost; perceived lack of trust; who knows better?
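As one way to picture the automatic part of tag mediation, the sketch below proposes a correction for a misspelled tag by fuzzy-matching it against tags already in use, using Python's standard `difflib.get_close_matches`. The function name `suggest_tag`, the sample vocabulary and the 0.85 cutoff are assumptions chosen for illustration; a production mediator would still route suggestions to a human reviewer.

```python
from difflib import get_close_matches
from typing import Optional


def suggest_tag(raw_tag: str, known_tags: list[str], cutoff: float = 0.85) -> Optional[str]:
    """Suggest an existing tag for a probable misspelling, or None if no fix is needed/found."""
    tag = raw_tag.strip().lower()
    lowered = {t.lower(): t for t in known_tags}
    if tag in lowered:
        return None  # already matches a known tag (ignoring case)
    matches = get_close_matches(tag, list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


# Hypothetical vocabulary of tags already in use
known = ["library", "folksonomy", "palm pilot"]
print(suggest_tag("libary", known))
```

A high cutoff keeps the suggester conservative: it only proposes a change when the typed tag is nearly identical to an existing one, which limits the "who knows better?" problem noted above.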
Tag-influenced taxonomy: Taxonomy and tagging co-exist; tags serve as a pool of candidate terms to enrich the taxonomy and keep it current. Find new terminology (synonyms, popular language). Find new concepts. Performed as separate processes (taxonomy tagging = formal, tagging = informal) or combined in a single interface.
Tag-influenced taxonomy: Requires a formal vetting process. Can be supported by automation (e.g. candidate tags pulled and filtered with a script to remove existing taxonomy terms and stop words). Evaluate candidates based on: frequency (“literary warrant”); salience within context. Look at tags used in conjunction with taxonomy terms.
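The filtering script mentioned on this slide could look roughly like the sketch below: count tag usage, drop anything already in the taxonomy or on a stop-word list, and keep only tags used often enough to merit review. The function `candidate_terms`, the stop-word list and the `min_count` threshold are hypothetical; salience within context is not modelled and would still need human judgment.

```python
from collections import Counter

# Illustrative stop-word list only; a real one would be much longer
STOP_WORDS = {"the", "a", "of", "to", "and"}


def candidate_terms(tag_events, taxonomy_terms, min_count=2):
    """Return (tag, frequency) pairs that are new to the taxonomy, most frequent first."""
    known = {term.lower() for term in taxonomy_terms}
    counts = Counter(tag.strip().lower() for tag in tag_events)
    return [
        (tag, n)
        for tag, n in counts.most_common()
        if tag and tag not in known and tag not in STOP_WORDS and n >= min_count
    ]


# Hypothetical tagging log and taxonomy
log = ["Netbook", "netbook", "laptop", "the", "the", "ultrabook"]
print(candidate_terms(log, taxonomy_terms=["Laptop"]))
```

The output is a short review queue for the taxonomist rather than an automatic update, which keeps the formal vetting step in place.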
Taxonomy-influenced tagging: Presenting choices/suggestions to the user from a controlled set of terms/tags. Sometimes users prefer an easy choice: drop-down menus; check boxes; type-ahead; tree view. “Influenced” means there may be an option to enter your own tag, which is a good source of new terms. Enforces consistency. Offers structure.
WWW example: ZigTag. “Defined Tagging”: definitions from Wikipedia & WordNet. Tagging with type-ahead against a database of 3M unique concepts & 8M synonyms.
ZigTag: Type-ahead & synonyms encourage consistency. Users can enter new tags. Synonyms are based on Wikipedia, so can be “dirty data”. No hierarchy, only equivalence relationships so far.
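A minimal sketch of type-ahead against a controlled vocabulary with synonyms follows. It is not ZigTag's implementation; the functions `build_lookup` and `type_ahead` and the sample concept are hypothetical. Every label (preferred term or synonym) points at one preferred concept, so whichever wording the user starts typing, the same tag is suggested.

```python
def build_lookup(concepts: dict[str, list[str]]) -> dict[str, str]:
    """Map every lowercased label (preferred term or synonym) to its preferred term."""
    lookup: dict[str, str] = {}
    for preferred, synonyms in concepts.items():
        for label in [preferred, *synonyms]:
            lookup[label.lower()] = preferred
    return lookup


def type_ahead(prefix: str, lookup: dict[str, str], limit: int = 5) -> list[str]:
    """Suggest up to `limit` preferred terms whose labels start with the typed prefix."""
    p = prefix.strip().lower()
    if not p:
        return []
    suggestions: list[str] = []
    for label in sorted(lookup):
        preferred = lookup[label]
        if label.startswith(p) and preferred not in suggestions:
            suggestions.append(preferred)
            if len(suggestions) == limit:
                break
    return suggestions


# Hypothetical vocabulary using the equivalence example from the taxonomy slide
lookup = build_lookup({"Laptop": ["Notebook computer", "Notebook"]})
print(type_ahead("note", lookup))
```

Allowing a free-text fallback when `type_ahead` returns nothing is what keeps the approach "influenced" rather than fully controlled.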
ZigTag search: Uncontrolled tags still cause recall problems. Interesting relationships from Wikipedia. Browsable tag cloud.
Example: myedna (Education.au). Fully taxonomy-directed tagging. http://www.educationau.edu.au/jahia/webdav/site/myjahiasite/shared/papers/tagging_hayman.ppt
TextWise Semantic Cloud: A document (URL or text) is submitted to a web service for semantic analysis. Category tags come from a subset of the ODP taxonomy. Concept tags are derived from the document, persisted, and related to ODP categories.
Buzzillions.com: Review site. Tags are “controlled” not against a taxonomy but against other tags, which reduces redundancy. Only popular tags are exposed as faceted navigation.
SharePoint? Plug-ins make taxonomy easy by presenting the taxonomy like tags. E.g. KWizCom: the plug-in manages taxonomy and tags in an easy interface, and administrators can opt out of letting users create their own tags.
Taxonomy-influenced tagging. Pros: more consistency; better support for findability; relationships and definitions are leveraged, adding meaning to the tags; realistic for the enterprise. Cons: not really folksonomy anymore; can force terminology on the user; need to develop a reference list of concepts, either manually through a taxonomy or automatically from a large corpus.
Tag hierarchies come in two flavors: user-powered; automatic derivation.
User-powered tag hierarchies: Social approach. Bogus hierarchies are possible. Only a small population will contribute. RawSugar tried it (no longer around): taggers could specify a hierarchy in their own account, and tags were clustered based on common groups.
User-powered tag hierarchies. E.g. LibraryThing: allows any user to combine (or uncombine) two tags that are semantically equivalent. www.librarything.com
User-powered tag hierarchies: Intelligent tags. Move toward more semantic tagging with machine-readable tags, e.g. Flickr machine tags in “triple” format: [namespace]:[key]=[value]. Examples: geo:neighborhood=SoHo; geo:lat=58.41618; flickr:user=mortimer; taxonomy:common=grevyszebra; lastfm:event=34640 (makes your photo appear on a lastfm event page).
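The [namespace]:[key]=[value] pattern is easy for software to split apart, which is what makes these tags "intelligent". The sketch below parses the examples from this slide; the regular expression is an assumption about which characters are allowed in the namespace and key, not Flickr's published grammar, and `parse_machine_tag` is a hypothetical helper.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    key: str
    value: str


# Assumed grammar: namespace and key start with a letter, then letters, digits or underscores
_PATTERN = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Split 'namespace:key=value' into its parts; return None for ordinary tags."""
    match = _PATTERN.match(tag.strip())
    return MachineTag(*match.groups()) if match else None


for raw in ["geo:lat=58.41618", "lastfm:event=34640", "robin"]:
    print(raw, "->", parse_machine_tag(raw))
```

An application can then dispatch on the namespace, for example treating `geo` tags as map coordinates and `lastfm` tags as event links, while plain tags fall through as ordinary keywords.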
User-powered tag hierarchies: Intelligent tags. MOAT (Meaning Of A Tag) is part of the linked data movement, mapping tags to the semantic web: http://moat-project.org/ It extends the tagging triple (user, resource, tag) with a fourth element, meaning, where meaning = URI of a resource that carries the meaning (e.g. DBpedia). Example:
    <tag:RestrictedTagging>
      <tag:taggedResource rdf:resource="http://example.org/post/1"/>
      <foaf:maker rdf:resource="http://apassant.net/alex"/>
      <tag:associatedTag rdf:resource="http://tags.moat-project.org/tag/apple"/>
      <moat:tagMeaning rdf:resource="http://dbpedia.org/resource/Apple_Records"/>
    </tag:RestrictedTagging>
Automatically derived tag hierarchies: Tag hierarchies, facets, ontologies, or “folksontology”. Done through statistical/clustering algorithms. http://www.pui.ch/phred/automated_tag_clustering/
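To give a feel for how a hierarchy can be derived statistically, the sketch below uses a simple co-occurrence (subsumption) heuristic: tag B is treated as narrower than tag A when B almost always appears together with A, but A often appears without B. This is a swapped-in illustration, not the algorithm described at the linked pages; the function `derive_parents`, the 0.8 threshold and the sample data are assumptions.

```python
from collections import Counter
from itertools import combinations


def derive_parents(tagged_items, threshold=0.8):
    """Return {narrower_tag: {broader_tags}} inferred from tag co-occurrence."""
    tag_count = Counter()
    pair_count = Counter()
    for tags in tagged_items:
        unique = sorted(set(tags))
        tag_count.update(unique)
        pair_count.update(combinations(unique, 2))

    parents = {}
    for (a, b), both in pair_count.items():
        p_a_given_b = both / tag_count[b]  # how often a appears when b does
        p_b_given_a = both / tag_count[a]  # how often b appears when a does
        if p_a_given_b >= threshold and p_b_given_a < threshold:
            parents.setdefault(b, set()).add(a)  # b is narrower than a
        elif p_b_given_a >= threshold and p_a_given_b < threshold:
            parents.setdefault(a, set()).add(b)  # a is narrower than b
    return parents


# Hypothetical tag sets, one per tagged item
items = [
    {"bird", "robin"},
    {"bird", "robin"},
    {"bird", "eagle"},
    {"bird", "eagle"},
    {"bird"},
]
print(derive_parents(items))
```

With so few items the result is fragile: one stray co-occurrence can create or remove a relationship, which is why such hierarchies tend to improve with volume.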
Delicious & CiteULike hierarchy: http://heymann.stanford.edu/taghierarchy.html
Clustering at Flickr
Auto clustering/facets: Still not very mature. Time-sensitive. Community-sensitive. Ambiguous tags. Improve with volume (self-correcting). http://www.pui.ch/phred/automated_tag_clustering/
Tag hierarchy pros and cons. Pros: relationships and definitions are leveraged, adding meaning to the tags; provides a basis for application behavior in the absence of a taxonomy (e.g. Flickr maps, clusters); self-correcting with volume. Cons: automatically derived relationships (clusters) can be bogus or time-sensitive; folksonomic relationships can be esoteric (just like tags); small population of contributors.
Conclusion: Not all content is created equal; tags and taxonomies each have their sweet spots. Hybrid approaches are emerging, with taxonomy-influenced tagging leading the pack in popularity on the web and co-existence in the enterprise. Look for more developments on the semantic web/linked data front for making tags more intelligent.
Questions? Richard Beatch [email_address]; Paul Wlodarczyk [email_address]. Web: www.earley.com. Blog: sethearley.wordpress.com. Twitter: earleytaxonomy. Give us your business card for a free pass to one of our Community of Practice conference calls (a $50 value).
Appendix: Corporate social tagging tools
Corporate social tagging software: http://www.connectbeam.com/
Corporate social tagging software: http://www.cogenz.com/
Corporate social tagging software: http://www-306.ibm.com/software/lotus/products/connections/dogear.html
Corporate social tagging software: BEA AquaLogic Pathways, http://www.bea.com/framework.jsp?CNT=index.jsp&FP=/content/products/aqualogic/pathways/
Corporate social tagging software: http://www.newsgator.com/business/socialsites/default.aspx
