On the Navigability of Social Tagging Systems

Published in: Education, Technology

  1. On the Navigability of Social Tagging Systems
     Christoph Trattner, 2012
     Knowledge Management Institute and Institute for Information Systems and Computer Media, Graz University of Technology, Austria
     e-mail: ctrattner@iicm.edu
     web: http://www.austria-lexikon.at/af/User/Trattner%20Christoph
     In collaboration with: D. Helic, M. Strohmaier, K. Andrews, Ch. Körner
  2. What is a tagging system, and what are tags?
     - A tagging system is a system that lets users apply tags to resources.
     - Tags are lightweight keywords from a free-form vocabulary, generated by users, for users.
  3. Popular examples of tagging systems are…
  4. Tags
  5. Tags
  6. Tags
  7. Why do system designers like tags?
     - Tags add metadata to resources for which typically only sparse metadata exists (such as pictures and movies).
     - Through tags, system designers can provide users with simple navigational tools that improve the system's information retrieval properties.
     - Tags are cheap!
  8. Why do users like tags?
     - Through tags, users can categorize or describe resources.
     - Users can find information faster through personal tags.
     - Users can find related content faster through related tags.
  9. Navigation with Tags
     Tagging systems typically provide the following information retrieval interfaces for navigating their content [Gupta et al. 2010]:
     - Tag clouds: widely used
     - Tag hierarchies: new, hardly any implementations yet
  10. What does tag (cloud) based navigation look like?
  11. Question: Are tag clouds useful for navigation?
  12. Modelling a tag dataset as a graph (1/2)
     - A tagging dataset is typically modeled as a tripartite hypergraph.
     - The vertex set is V = R ∪ U ∪ T (resources, users, and tags).
     - An annotation is a hyperedge (r, t, u).
     - A tripartite hypergraph can be mapped onto three bipartite graphs connecting users and resources, users and tags, and tags and resources.
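The mapping from hyperedges to bipartite graphs can be sketched in a few lines of Python. This is only an illustration: the function name `project_annotations` and the toy annotations are assumptions, not part of the original slides.

```python
def project_annotations(annotations):
    """Map (resource, tag, user) hyperedges onto three bipartite edge sets."""
    user_resource = set()
    user_tag = set()
    tag_resource = set()
    for resource, tag, user in annotations:
        user_resource.add((user, resource))
        user_tag.add((user, tag))
        tag_resource.add((tag, resource))
    return user_resource, user_tag, tag_resource


# Hypothetical toy data: each triple is one annotation (r, t, u).
annotations = [
    ("r1", "music", "alice"),
    ("r1", "vienna", "bob"),
    ("r2", "music", "alice"),
]
user_resource, user_tag, tag_resource = project_annotations(annotations)
```

The tag-resource graph is the one that matters for tag cloud navigation, since a user moves from a resource to one of its tags and from there to another resource.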
  13. Defining Navigability
     A network is navigable iff there is a short path between all or almost all pairs of nodes in the network. Formally:
     - There exists a giant component.
     - The effective diameter is low (bounded by log n).
     Source: J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999).
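A minimal sketch of how the two conditions could be checked on a small undirected graph follows. Several choices here are assumptions made for illustration and are not taken from the slides: the giant component is taken to be a component holding at least half of all nodes, the effective diameter is taken as the 90th percentile of shortest-path lengths inside that component (without the interpolation some libraries apply), and the logarithm is base 2.

```python
import math
from collections import deque


def make_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def largest_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for node in adj:
        if node not in seen:
            component = set(bfs_distances(adj, node))
            seen |= component
            if len(component) > len(best):
                best = component
    return best


def effective_diameter(adj, nodes, quantile=0.9):
    """Smallest hop count that covers `quantile` of all connected ordered pairs."""
    lengths = sorted(
        d for s in nodes for t, d in bfs_distances(adj, s).items() if t != s
    )
    if not lengths:
        return 0
    return lengths[math.ceil(quantile * len(lengths)) - 1]


def is_navigable(adj, giant_fraction=0.5, quantile=0.9):
    """Giant component exists AND effective diameter <= log2(n)."""
    if not adj:
        return False
    giant = largest_component(adj)
    if len(giant) < giant_fraction * len(adj):
        return False  # assumed threshold for "giant"
    return effective_diameter(adj, giant, quantile) <= math.log2(len(adj))


pairs = make_graph([(0, 1), (2, 3), (4, 5), (6, 7)])  # no giant component
path = make_graph([(i, i + 1) for i in range(7)])     # 8 nodes in a chain
star = make_graph([(0, i) for i in range(1, 10)])     # 10 nodes around a hub
```

With these toy graphs, `is_navigable` rejects the four disconnected pairs and the 8-node chain, and accepts the 10-node star, whose hub keeps every pair within two hops.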
  14. Navigability: Examples
     - Example 1: not navigable, because there is no giant component.
     - Example 2: not navigable. There is a giant component, BUT the effective diameter is 7 > log2(8) = 3.
  15. Navigability: Examples
     - Example 3: navigable. There is a giant component AND the effective diameter is 2 < log2(10) ≈ 3.32.
     Is this efficiently navigable? There are short paths between all nodes, but can an agent or algorithm find them with local knowledge only?
  16. Efficiently navigable
     A network is efficiently navigable iff there is an algorithm that can find a short path with only local knowledge, and the delivery time of the algorithm is bounded by a polynomial in log n, i.e. log^k(n).
     Example 4 (nodes A, B, C): efficiently navigable if the algorithm knows it needs to go through A → B → C.
     Source: J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000.
  17. Navigability of Social Tagging Systems (1/2)
     In general, tags form networks that are navigable from a network-theoretic perspective.
  18. Navigability of Social Tagging Systems (2/2)
     - “Hub” tags
     - Tagging networks are navigable power-law networks.
     - For power-law networks, efficient sub-linear decentralised navigation algorithms exist.
  19. But what about user interface constraints?
     - Tag cloud size n: topN resources (topN is the most common algorithm)
     - Pagination of resources per tag: k resources shown per page (reverse-chronological ordering)
  20. How UI constraints affect navigability
     - Tag cloud size: limiting the tag cloud size n to practically feasible sizes (e.g. 5, 10, or more) does not influence navigability (this is not very surprising).
     - Pagination: BUT limiting the out-degree of high-frequency tags to k (e.g. through pagination with resources sorted in reverse-chronological order) leaves the network vulnerable to fragmentation. This destroys the navigability of prevalent approaches to tag clouds.
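The fragmentation effect of pagination can be illustrated with a toy model. The sketch below assumes that a user only ever sees the first page (the k newest resources) of each tag, and it counts which resources can no longer be reached from any other resource. The function names and the toy data are hypothetical.

```python
def paginated_links(tag_resources, k):
    """Resource -> resources reachable in one tag click, first page only.

    tag_resources maps each tag to its resources, newest first.
    """
    links = {}
    for resources in tag_resources.values():
        visible = resources[:k]  # only the k newest resources are shown
        for resource in resources:
            links.setdefault(resource, set()).update(
                v for v in visible if v != resource
            )
    return links


def unreachable(tag_resources, k):
    """Resources that no other resource links to under page size k."""
    links = paginated_links(tag_resources, k)
    targets = set().union(*links.values()) if links else set()
    return set(links) - targets


# Hypothetical toy data, lists sorted in reverse-chronological order.
toy = {
    "music": ["r6", "r5", "r4", "r3", "r2", "r1"],
    "vienna": ["r3", "r1"],
}
cut_off = unreachable(toy, k=2)
```

With k = 2, the older resources `r2` and `r4` never appear on any first page and drop out of the network; with k = 6 every resource stays reachable.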
  21. Question: How can we recover the navigability of social tagging systems?
     Answer: Through resource-specific resource list construction!
  22. What is a resource-specific resource list?
     - A resource-specific resource list is a resource list that is specific not only to a particular tag but also to a particular resource in the tagging system.
     - Typically, resource lists are calculated as Res(t) = {r1(t), …, rn(t)}.
     - Resource-specific resource lists are calculated as Res(t, r) = {r1(t, r), …, rn(t, r)}.
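The difference between Res(t) and Res(t, r) can be made concrete with a short sketch. The slides do not say how the resource-specific ranking is computed at this point, so the `distance` argument below is a hypothetical placeholder for whatever measure relates the current resource to a candidate.

```python
def res(tag, tag_resources, k):
    """Res(t): the same first page for every user, whatever they are viewing."""
    return tag_resources[tag][:k]


def res_specific(tag, resource, tag_resources, k, distance):
    """Res(t, r): the page depends on the resource the user is currently viewing."""
    candidates = [c for c in tag_resources[tag] if c != resource]
    # sorted() is stable, so ties keep their original (chronological) order.
    return sorted(candidates, key=lambda c: distance(resource, c))[:k]


# Hypothetical toy data: integer ids stand in for resources.
tag_resources = {"music": [6, 5, 4, 3, 2, 1]}


def toy_distance(a, b):
    """Placeholder distance; any resource-to-resource measure could be used."""
    return abs(a - b)


static_page = res("music", tag_resources, k=2)
page_from_1 = res_specific("music", 1, tag_resources, 2, toy_distance)
page_from_6 = res_specific("music", 6, tag_resources, 2, toy_distance)
```

Here the static list always shows resources 6 and 5, while the resource-specific list shows 2 and 3 when viewed from resource 1, and 5 and 4 when viewed from resource 6. Different starting points expose different parts of the tag's resource set.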
  23. Approach: Random Ordering
     - Instead of a reverse-chronological ordering of resources, we apply a random ordering.
     - On each click on a particular tag, a different resource list is generated.
     - Problem: the network is not efficiently navigable.
     Better algorithms can easily be envisioned.
  24. Approach: Hierarchical Ordering
     - Instead of random ordering, we use hierarchical background knowledge for ranking paginated resources [Kleinberg 2001].
     - Kleinberg showed that if the nodes of a network can be organized into a hierarchy, then such a hierarchy provides a probability distribution for connecting the nodes in the network.
     - For such a network, a hierarchical decentralized searcher exists that is able to navigate the network in log(n) steps, so the network is efficiently navigable.
     Source: J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS) 14, MIT Press, 2001.
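One way to picture hierarchical ranking is to measure how far apart two resources sit in a category tree and to show the closest ones first. The sketch below is a deterministic simplification under stated assumptions: Kleinberg's model samples links with a probability that decreases with the height of the lowest common ancestor, whereas this example simply sorts by that height. The category paths and function names are hypothetical.

```python
def lca_height(path_a, path_b):
    """Levels from the deeper leaf up to the lowest common ancestor."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return max(len(path_a), len(path_b)) - common


def hierarchical_resource_list(current, candidates, category_path, k):
    """Rank candidates by hierarchy distance to the current resource."""
    others = [c for c in candidates if c != current]
    ranked = sorted(
        others,
        key=lambda c: (lca_height(category_path[current], category_path[c]), c),
    )
    return ranked[:k]


# Hypothetical category paths, root first.
category_path = {
    "brahms": ("music", "composers", "brahms"),
    "beethoven": ("music", "composers", "beethoven"),
    "opera": ("music", "venues", "opera"),
    "danube": ("geography", "rivers", "danube"),
}
page = hierarchical_resource_list("brahms", list(category_path), category_path, k=2)
```

Viewed from `brahms`, the first page contains `beethoven` (same category) and then `opera` (same top-level branch), while `danube` falls off the page.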
  25. Approach: Hierarchical Ordering [Kleinberg 2001]
  26. Problem: Semantic Penalty
     - The hierarchy was more or less randomly constructed.
     - It does not take semantic similarity between resources into account.
     - Hence, two new approaches were developed.
     - First idea: construct efficiently navigable tag clouds from structured web content [Trattner 2011].
     - Second idea: develop an algorithm that is able to construct semantically sound resource hierarchies from tagging data [Trattner 2011a].
     Sources:
     - [Trattner 2011] C. Trattner, D. Helic, M. Strohmaier, “On the Construction of Efficiently Navigable Tag Clouds Using Knowledge from Structured Web Content,” in JUCS, Volume 17, Issue 4, 565-582, 2011.
     - [Trattner 2011a] C. Trattner, “Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists and Tag Trails,” in CIT, 2011.
  27. On the construction of efficiently navigable tag clouds from structured web content
     - Content on the Web is not always flat.
     - Some websites provide a hierarchical structure.
     - Example: Austria-Forum
  28. Austria-Forum
     - Wiki-based online encyclopedia system
     - Provides over 200,000 information items about Austria
     - Unlike in Wikipedia, articles in Austria-Forum are published, edited, checked, and certified by people who are accepted as experts in a particular field
     - Articles are organized hierarchically into categories
     - Categories are addressable via structured URLs (cf. Open Directory DMOZ)
     Figure labels: AEIOU, Community, Wissenssammlungen
  29. Austria-Forum (figure labels: Resource, Tags)
  30. Approach (1/2): Hierarchical Tag Cloud Construction
  31. Approach (2/2): Hierarchical Resource List Construction
  32. Evaluation
     To evaluate the presented algorithm, a network-theoretical framework [Trattner 2011b] based on the Stanford SNAP Library (http://snap.stanford.edu/) was developed:
     - Network-theoretic module: calculates network properties such as the size of the Largest Strongly Connected Component (LSCC) or the Effective Diameter (ED) of the tag cloud network.
     - Searcher module: implements a hierarchical decentralized searcher to simulate “efficient” tag cloud driven navigation.
     Source: [Trattner 2011b] C. Trattner, “NAVTAG - A Network-Theoretic Framework to Assess and Improve the Navigability of Tagging Systems,” in 11th International Conference on Web Engineering (ICWE 2011), Springer, 2011.
  33. Hierarchical Decentralized Search
     - Background knowledge: e.g. a folksonomy
     - A tag network with a start node and a target node
     - Goal: navigate from START to TARGET using local background knowledge only
     Source: J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000.
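A greedy searcher of this kind can be sketched as follows. At every step it looks only at the neighbours of the current node and moves to the one that lies closest to the target in the background hierarchy. This is an illustrative reading of decentralized search, not the implementation used in the talk; the graph, the hierarchy positions, and the hop limit are assumptions.

```python
def hierarchy_distance(path_a, path_b):
    """Tree distance between two hierarchy positions given as root-first paths."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return len(path_a) + len(path_b) - 2 * common


def greedy_search(adj, position, start, target, max_hops=20):
    """Walk from start to target using only local neighbour information."""
    path = [start]
    visited = {start}
    current = start
    while current != target and len(path) <= max_hops:
        candidates = [n for n in adj[current] if n not in visited]
        if not candidates:
            return None  # dead end: the searcher gives up
        # Pick the neighbour closest to the target; the name breaks ties.
        current = min(
            candidates,
            key=lambda n: (hierarchy_distance(position[n], position[target]), n),
        )
        visited.add(current)
        path.append(current)
    return path if current == target else None


# Hypothetical network and background hierarchy.
adj = {
    "S": {"X", "Y"},
    "X": {"S"},
    "Y": {"S", "Z"},
    "Z": {"Y", "T"},
    "T": {"Z"},
    "Q": set(),
}
position = {
    "S": ("geo", "s"),
    "X": ("geo", "x"),
    "Y": ("arts", "painting", "y"),
    "Z": ("arts", "music", "z"),
    "T": ("arts", "music", "t"),
    "Q": ("geo", "q"),
}
route = greedy_search(adj, position, "S", "T")
```

From `S` the searcher prefers `Y` over `X` because `Y` sits in the same top-level branch as the target, and it then reaches `T` via `Z`. A search for the isolated node `Q` ends in a dead end and returns `None`, which is how a failed navigation attempt would be counted in a success-rate simulation.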
  34. Results: Navigability
     Approaches that calculate resource lists in a random manner form navigable tag cloud networks.
  35. Results: Searcher
     - The best results are obtained with hierarchically constructed tag clouds/resource lists (= HH).
     - The naive approach (= N; TopN + chronologically sorted resource list) performs worst.
     - However, HR performs better than a pure random approach (= R).
  36. User Study
     - To measure the performance of the approach, a between-group test design was used.
     - For that purpose, we randomly split our test users into two groups:
       - Group A: assigned to navigate Austria-Forum with hierarchically constructed resource lists
       - Group B (baseline): assigned to navigate Austria-Forum with reverse-chronologically sorted resource lists
  37. User Study
     - During the study, the users were asked to solve 10 tasks.
     - In particular, the users were asked to navigate from 10 given start resources to 10 given target resources as fast as possible.
     - To get valid results, the start and target resources were selected uniformly at random (the same for all users).
     - Users were allowed to use only tag clouds as a navigation tool.
  38. User Study
     - To ensure that users would have to navigate, we selected the paths in such a way that users had to visit between 0 and 4 intermediate resources to find the target resources.
     - Each user was given a maximum of 3 minutes for each task.
  39. Example: Tag cloud based navigation (figure labels: Brahms, Beethoven, Start resource, Target resource, Resource list)
  40. User Study
     - Since we observed during our pilot test that users had problems finding resources that they did not know, the tags of the target resource were also presented to the users.
     - The variable measured in the experiment was the success rate, i.e. whether or not the user could find the target resources.
  41. Results: User Study
     - In total, 24 test users participated in the experiment.
     - 16 were male and 8 were female.
     - The median age was 33 years, ranging from 22 to 56.
     - All participants were experienced computer users (on average 46 hours per week).
     - 12 of them were experienced with the Austria-Forum test system.
     - To control for this bias, we assigned those users randomly to groups A and B.
  42. Results: User Study
     - On average, users in group A found 55% of their designated target resources.
     - By comparison, users in group B found only 23% of their designated target resources.
     - In other words, using hierarchically constructed resource lists improved the mean success rate in the Austria-Forum tagging system by 32 percentage points (55% - 23%).
     - These results confirm the theoretical assumptions made in previous work in this area [Helic et al. 2011].
     Source: [Helic et al. 2011] Helic, D., Trattner, C., Strohmaier, M. and Andrews, K.: Are Tag Clouds Useful for Navigation? A Network-Theoretic Analysis, Journal of Social Computing and Cyber-Physical Systems, 2011.
  43. Results: User Study
     The experiment showed that the hierarchically constructed tag network is significantly more navigable than the one produced by the naïve approach.
  44. Problem: Predefined Resource Hierarchy
     - A predefined resource hierarchy is not always given.
     - Hence, the presented approach is not completely generic.
     - Another problem: the success rate drops drastically if the provided resource hierarchy is neither balanced nor complete.
  45. Question: How can we automatically construct balanced resource hierarchies with a fixed branching factor from tagging data?
  46. Algorithm: Resource Hierarchy Generation
  47. Algorithm: Resource Hierarchy Labeling
  48. Results: Semantic Evaluation
     - Taxonomic F-Measure and Taxonomic Overlap measure the quality of a given taxonomy against a gold standard via common concepts.
     - Comparison to four popular tag hierarchy induction algorithms
     - The GermaNet ontology was used as the gold standard for the experiment (the Austria-Forum tag dataset contains only German tags).
  49. Results: Empirical Analysis
     - 9 test participants (all of them experienced in the evaluation of concept hierarchies)
     - Resource taxonomy with b = 10
     - Evaluation via online test
     - Users had to classify tag trails
  50. Results: Empirical Evaluation
     Compared to a tag taxonomy comprising only tags, the concept relations of a tag-resource taxonomy with branching factor b = 10 are only 5% less hierarchically arranged than the tag concepts produced by the Deg/Cooc tag taxonomy induction algorithm, which is in theory the most semantically correct tag taxonomy approach.
  51. Results: Tag Cloud Navigability
     To determine the navigability of the approach, several tag networks with different resource list lengths were generated. The branching factors used in the experiment were b = 2, 5, and 10. The resource list length was varied from k = 10 to 50.
     - To determine navigability, the size of the LSCC and the ED were measured.
     - To determine efficiency, a hierarchical decentralized searcher was implemented that uses the resource hierarchy as background knowledge to search the tag networks.
  52. Results: Network Properties
     Simulations show the navigability of the hierarchically constructed tag networks.
  53. Results: Searcher
     Simulations show very high success rates (> 90%) even for “short” resource lists (k = 10).
  54. Conclusions
     - From a network-theoretical perspective (and only looking at tags), tagging systems are navigable.
     - However, if we consider simple user-interface constraints, they are NOT!
     - Problem: current tag cloud algorithms calculate resource lists in a static manner.
     - Pagination splits the tag network into isolated clusters.
     - However, with hierarchically constructed resource lists, navigability can be recovered.
     - Such tag networks are also efficiently navigable if the resources of the tagging system can be arranged into a resource taxonomy with a fixed branching factor.
  55. End of Presentation
     Thank you!
     Christoph Trattner, ctrattner@iicm.edu, Graz University of Technology, Austria
  56. References and Further Readings
     - Trattner, C., Lin, Y., Parra, D., Yue, Z., Brusilovsky, P.: Evaluating Tag-Based Information Access in Image Collections, In Proceedings of the 23rd ACM Conference on Hypertext and Social Media, ACM, New York, NY, USA, 2012.
     - Helic, D., Körner, C., Granitzer, M., Strohmaier, M., Trattner, C.: Navigational efficiency of broad vs. narrow folksonomies, In Proceedings of the 23rd ACM Conference on Hypertext and Social Media, ACM, New York, NY, USA, 2012.
     - Trattner, C., Singer, P., Helic, D. and Strohmaier, M.: Exploring the Differences and Similarities of Hierarchical Decentralized Search and Human Navigation in Information-networks, In Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies, ACM, New York, NY, USA, 2012.
     - Trattner, C.: Linking Related Content in Web Encyclopedias with search query tag clouds, IADIS International Journal on WWW/Internet, Volume 9(2), 2011.
     - Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists and Tag Trails, Journal of Computing and Information Technology, Volume 19(3), 155-167, 2011.
     - Trattner, C., Helic, D. and Strohmaier, M.: On the Construction of Efficiently Navigable Tag Clouds Using Knowledge From Structured Web Content, Journal of Universal Computer Science, Volume 17(4), 565-582, 2011.
  57. References and Further Readings
     - Helic, D., Strohmaier, M., Trattner, C., Muhr M. and Lermann, K.: Pragmatic Evaluation of Folksonomies, In Proceedings of the 20th International Conference on World Wide Web, ACM, New York, NY, USA, 417-426, 2011.
     - Trattner, C., Körner, C., Helic, D.: Enhancing the Navigability of Social Tagging Systems with Tag Taxonomies, In Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies, ACM, 7–9 September 2011, Messe Congress Graz, Austria, 2011.
     - Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists: A Comparative Study, In Proceedings of the 33rd International Conference on Information Technology Interfaces, IEEE, Cavtat / Dubrovnik, Croatia, 2011.
     - Helic, D., Trattner, C., Strohmaier, M., Andrews, K.: On the Navigability of Social Tagging Systems, In Proceedings of the Second IEEE International Conference on Social Computing, Minnesota, USA, 2010.
