FaceTag: Integrating Bottom-up and Top-down Classification in a Social Tagging System


FaceTag is a working prototype of a semantic collaborative tagging tool conceived for bookmarking information architecture resources.
It aims to show how the flat, homogeneous keyword space that users create while tagging can be effectively combined with a richer faceted classification scheme to improve the "information scent" and "berrypicking" capabilities of the system. The additional semantic structure is gathered both implicitly, by observing user behaviour, and explicitly, through a compelling user experience that makes it easy for end users to create relationships between tags.
FaceTag's current implementation is written in PHP / SQL and includes an open API that allows querying and integration from other applications.

Published in: Education

    1. 1. FaceTag: Integrating Bottom-up and Top-down Classification in a Social Tagging System
          Emanuele Quintarelli, Andrea Resmini, Luca Rosati
          EuroIA 2006, Berlin
    2. 2. The Evolution of Collaborative Tagging
    3. 3. An Emerging Approach to Distributed Classification
          • Collaborative tagging systems are used to organize, browse and share personal collections of resources through the introduction of simple metadata
          • Folksonomies are user-generated classifications, emerging through bottom-up consensus
          • The basic idea is simply to have people share items annotated with keywords
    4. 4. Collaborative Tagging Examples
          • An incomplete list of online tagging systems includes Simpy, Connotea, Ma.gnolia, Taggly, digg, flickr, YouTube, Technorati and 43things.
          • These are web-based collaborative systems for:
              • building a shared database of items
              • maintaining a flat metadata vocabulary
              • running metadata-driven queries
              • monitoring change in areas of interest
              • discovering emergences or trends
    5. 5. Properties of Folksonomies
          • Advantages
              • Trade-off between simplicity and precision
              • Match the user's real needs and language
              • Inclusive (nothing is left out)
              • Help discovery of information and serendipity
              • May be a forced move (the environment makes the difference)
              • Better than nothing (when traditional classification is not viable)
          • Disadvantages
              • Language issues
              • User Experience issues
    6. 6. Language Issues
          • As a result of intrinsic language variability, tagging systems are also implicitly plagued by:
              • Polysemy (window: is that a hole or a glass pane?)
              • Homonymy (apple, jaguar)
              • Plurals (blog / blogs, folksonomy / folksonomies)
              • Synonymy (tags, tagging, folksonomy)
              • Ego-oriented tags (toread, funny, interesting, etc.)
              • Basic level variations (dog / beagle)
          • These problems can dramatically reduce the effectiveness of the application and the benefits brought by tagging systems
    7. 7. User Experience Issues
          • Tag clouds are visual interfaces for information retrieval that provide a global contextual view of tags assigned to resources in the system
          • Flat tag clouds are not sufficient to provide a semantic and multidimensional browsing experience. They
              • have a low findability quotient and low scalability
              • have high semantic density, where a few well-known topics dominate the scene
              • often follow an alphabetical criterion, which limits the ability to explore the tag cloud
              • cannot visually support semantic relationships
              • often fail to provide complex logical operations
    8. 8. Navigating Large Domains
          • Information seekers in large domains of objects want to deal with meaningful groupings of related items, in order to quickly understand relationships and decide how to proceed [Hearst 2006].
          • How to generate and navigate such groups from a flat set of objects is, however, a different matter altogether.
          • Taxonomies, clustering and faceted classification have been proposed in the past as useful techniques for such purposes
    9. 9. Taxonomies
          • Taxonomies are coherent and complete systems of meaningful labels which systematically organize a domain
          • Taxonomies are organically crafted, before cataloguing starts, by professionals who anticipate future user needs and content types
          • They are authoritative, centralized views and allow for greater precision, avoiding ambiguity: the hierarchy provides context
          • Major drawbacks
              • they cannot match the vocabulary and ways of thinking of different users
              • they are expensive for professional indexers to build and maintain
    10. 10. Clusters
          • Clustering is the act of grouping items according to some measure of similarity
          • It reduces the semantic density and improves the visual consistency of tag clouds
          • But it generates messy groups, conflates many different dimensions and does not allow refinement and follow-up queries
          • Users prefer clear hierarchies with categories at uniform levels of granularity over the unpredictable and unlabeled groupings typical of clustering techniques
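
The clustering slide leaves the similarity measure open. As a minimal sketch only, the snippet below assumes Jaccard similarity over the sets of resources each tag is attached to, merged greedily with a threshold of 0.5. The data, the threshold and the function names are all hypothetical and are not taken from FaceTag.

```python
from itertools import combinations

# Hypothetical data: tag -> ids of the resources bookmarked with that tag.
TAGGED = {
    "blog": {1, 2, 3},
    "blogs": {2, 3, 4},
    "folksonomy": {5, 6},
    "tagging": {5, 6, 7},
}


def jaccard(a, b):
    """Share of resources two tags have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(tagged, threshold=0.5):
    """Greedy single-link grouping: merge tags whose similarity meets the threshold."""
    clusters = [{tag} for tag in tagged]
    for t1, t2 in combinations(tagged, 2):
        if jaccard(tagged[t1], tagged[t2]) >= threshold:
            c1 = next(c for c in clusters if t1 in c)
            c2 = next(c for c in clusters if t2 in c)
            if c1 is not c2:
                c1 |= c2             # fold the second group into the first
                clusters.remove(c2)
    return clusters


for group in cluster(TAGGED):
    print(sorted(group))
```

The groups come back without names or levels, which is the limitation the slide points out: nothing says what a group means or how to refine it.
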
    11. 11. Facets
          • Facets are
              • orthogonal descriptors (categories) within a metadata system
              • Each facet has a name and addresses a different conceptual dimension or feature type relevant to the collection
              • Each object is classified combining labels from different facets
          • Facets can be
              • Flat or hierarchical
              • Assigned single or multiple values
          [Figure labels: "Facet name", "23 resources"]
    12. 12. Facets
          • Hierarchical faceted metadata can be used to
              • add structure and context to tags
              • navigate along several dimensions simultaneously
              • seamlessly integrate browsing and searching
              • refine and broaden filtering criteria
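
The facet properties listed on the two "Facets" slides (named, orthogonal, flat or hierarchical, single or multiple values) can be sketched as a small data model. This is an illustrative assumption, not FaceTag's actual schema: the class names, field names and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Facet:
    """One orthogonal descriptor, e.g. 'People' or 'Language'."""
    name: str
    hierarchical: bool = False   # flat or hierarchical
    multi_valued: bool = True    # single or multiple values per resource


@dataclass
class Resource:
    """A bookmarked item classified by combining labels from several facets."""
    title: str
    labels: Dict[str, List[str]] = field(default_factory=dict)  # facet name -> labels

    def assign(self, facet: Facet, label: str) -> None:
        values = self.labels.setdefault(facet.name, [])
        if not facet.multi_valued:
            values.clear()       # a single-valued facet keeps only the latest label
        if label not in values:
            values.append(label)


people = Facet("People")
language = Facet("Language", multi_valued=False)

doc = Resource("Hypothetical bookmarked article")
doc.assign(people, "morville")
doc.assign(people, "dion hinchcliffe")
doc.assign(language, "eng")
doc.assign(language, "ita")      # replaces "eng" because Language is single-valued
print(doc.labels)
```

Keeping the labels keyed by facet name is what lets a resource be described along several dimensions at once without the values of one facet interfering with another.
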
    13. 13. Benefits of Facets
          • The meaning of tags is easier to understand
          • Large tag clouds become more browsable
          • Reduced mental work (favouring recognition over recall)
          • Better support for exploration, discovery and iterative query refinement
          • Usability studies show that this approach is preferred over single hierarchies and clusters
    14. 14. What do we need then?
          • A middle ground between the pure democracy of bottom-up tagging and the empirical determinism of top-down controlled vocabularies
          • A new metadata ecology which merges and leverages emerging and traditional tools to improve findability and user experience
          • This new metadata ecology has to be a fusion, not merely a process of coexistence
    15. 15. What FaceTag contributes
          • What
              • FaceTag tries to limit the impact of polysemy, homonymy and basic level variation by introducing a multidimensional and semantically richer paradigm
              • FaceTag aims to improve the usability, findability, browsability, serendipity and scalability of the system
          • How
              • FaceTag mixes three contributions to social tagging systems:
                  • tag hierarchies
                  • facets of tags
                  • seamless integration of tagging and searching
              • Users provide the faceted hierarchical structure through an intuitive user experience
    16. 16. Faceted Analysis
    17. 17. How to choose facets
          • Freehand
              • each facet is decided on the spot
              • the subject is freely deconstructed into several aspects
          • Standard based
              • each facet is identified using Ranganathan's or the CRG's guidelines
              • the subject is then deconstructed following a general scheme
              • the general scheme works as a prototype for every particular faceted scheme
    18. 18. Benefits of standard compliance
          • The standard-based approach brings a number of benefits:
              • it's a standard
              • experience demonstrates that it suits several contexts
              • it reduces the risks that come from a subjective perspective
    19. 19. How we chose facets from the CRG general scheme
          FaceTag facets | CRG standard categories
          Document Date | Time
          [Country] | Space
          People, companies | Agent
          Target (e.g. Industry, Health...) | Patient
          -- | Byproduct
          [Deliverables] | Product
          Activities (e.g. competitive analysis, classification) | Operation
          -- | Process
          [Format] | Material
          Language | Property
          -- | Part
          Resources Type (e.g. case study, report...) | Type
          [Documents, resources] | Thing
    20. 20. FaceTag facets
          FaceTag facets | Examples
          Date | document date
          People | dion hinchcliffe; morville
          Usage | industry; public administration; software>companies>google
          Activities/Subjects | discovery>competitive analysis; classification>facets; navigation design>breadcrumbs
          Language | predefined values (based on the ISO 639-2 standard)
          Resources Types | case study; blog>enterprise web
    21. 21. Benefits
          • The blend of facets and tags brings benefits on two axes
          • Vertical (or paradigmatic) axis
              • when a user associates a keyword with a facet, the system suggests similar tags pertaining to the same facet.
          • Horizontal (or syntagmatic) axis
              • the system allows the user to see all the other tags belonging to the same facet.
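
The hierarchical tags in the facet examples (such as software>companies>google) and the same-facet suggestions described on the Benefits slide can be sketched as follows. Everything here is an assumption made for illustration: the in-memory store, the function names, and the use of the standard library's difflib string matching as the notion of "similar". The deck does not say how FaceTag computes similarity.

```python
import difflib

# Hypothetical in-memory store: facet name -> tag paths in "parent>child" notation.
FACET_TAGS = {
    "Usage": ["industry", "public administration", "software>companies>google"],
    "Activities/Subjects": [
        "discovery>competitive analysis",
        "classification>facets",
        "navigation design>breadcrumbs",
    ],
}


def split_path(tag_path):
    """Split 'a>b>c' into its hierarchy levels."""
    return [level.strip() for level in tag_path.split(">")]


def tags_in_facet(facet):
    """List the most specific tag of every path filed under one facet."""
    return [split_path(path)[-1] for path in FACET_TAGS.get(facet, [])]


def suggest(facet, keyword, limit=3):
    """Suggest existing tags from the same facet that resemble the keyword."""
    return difflib.get_close_matches(keyword, tags_in_facet(facet), n=limit, cutoff=0.6)


print(split_path("software>companies>google"))
print(tags_in_facet("Activities/Subjects"))
print(suggest("Activities/Subjects", "facet"))
```

Restricting both the listing and the suggestions to a single facet is the point of the sketch: a keyword is only compared with tags that describe the same dimension.
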
    22. 22. Open Issues
    23. 23. What's "faceted"
          • Metadata (formal properties): Title, Year, Author, Other
          • Facets (semantic properties): Topic, Activity, Geo. Area, Other
    24. 24. Open issues
          • Faceted Classification is a library-science theory that postulates
              • not only a multidimensional approach to an item
              • but also a semantic value for each dimension / facet
              • and a specific citation order for such dimensions / facets
          • In some cases our 'facets' are not proper facets, but metadata
              • nevertheless we need such metadata
              • so we may decide on a hybrid approach (facets & metadata)
          • Labeling of facets / metadata
              • is another issue we still need to evaluate
    25. 25. Technology Preview: FaceTag
    26. 26. Main page
          • Facets and pertaining tags
          • Search box with hinting
          • Resources with pertaining tags
    27. 27. Using facets for searching: engaging 1 tag
          The user chooses 'intranet design', and the facets and tags adjust. Breadcrumbing for the engaged tag appears at the top.
    28. 28. Using facets for searching: intersecting 2 tags
          The user adds 'article' from another facet. Facets and tags adjust again. Breadcrumbing reflects the changes. Tags can be disengaged individually.
    29. 29. Using facets for searching: final result set
          The user adds 'shiv singh' from yet another facet and finds a final result set. Facets and tags adjust and show that there is no further possibility to zoom in. Breadcrumbing lists all engaged tags, ready to be disengaged.
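
The three search slides describe a drill-down: each engaged tag narrows the result set, the remaining tags adjust, and the breadcrumb lists what is engaged. A minimal sketch of that behaviour is shown below. The sample bookmarks, the facet assignments and all function names are hypothetical; the actual FaceTag implementation (PHP / SQL) is not shown in the slides and may work differently.

```python
from collections import Counter

# Hypothetical sample bookmarks: each maps a facet name to its tags.
RESOURCES = [
    {"title": "Bookmark 1", "tags": {"Activities/Subjects": ["intranet design"],
                                     "Resources Types": ["article"],
                                     "People": ["shiv singh"]}},
    {"title": "Bookmark 2", "tags": {"Activities/Subjects": ["intranet design"],
                                     "Resources Types": ["article"],
                                     "People": ["morville"]}},
    {"title": "Bookmark 3", "tags": {"Activities/Subjects": ["intranet design"],
                                     "Resources Types": ["case study"],
                                     "People": ["shiv singh"]}},
    {"title": "Bookmark 4", "tags": {"Activities/Subjects": ["classification"],
                                     "Resources Types": ["article"],
                                     "People": ["morville"]}},
]


def search(resources, engaged):
    """Keep the resources that carry every engaged (facet, tag) pair."""
    return [r for r in resources
            if all(tag in r["tags"].get(facet, []) for facet, tag in engaged)]


def refinements(results, engaged):
    """Count the unengaged tags that would still narrow the result set."""
    counts = Counter()
    for r in results:
        for facet, tags in r["tags"].items():
            for tag in tags:
                if (facet, tag) not in engaged:
                    counts[(facet, tag)] += 1
    # A tag carried by every result cannot narrow the set any further.
    return {pair: n for pair, n in counts.items() if n < len(results)}


def breadcrumb(engaged):
    """List the engaged tags in the order they were chosen."""
    return " > ".join(tag for _, tag in engaged)


engaged = []
for step in [("Activities/Subjects", "intranet design"),
             ("Resources Types", "article"),
             ("People", "shiv singh")]:
    engaged.append(step)
    results = search(RESOURCES, engaged)
    print(breadcrumb(engaged), "->", [r["title"] for r in results],
          "| can zoom in:", bool(refinements(results, engaged)))
```

Disengaging a tag corresponds to removing its pair from `engaged` and running `search` again, which broadens the result set.
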