This document summarizes research on improving the navigability of social tagging systems. It explains how tagging systems work and why tags are useful for both system designers and users. It then defines navigability in networks and evaluates the navigability of social tagging networks. The document proposes constructing hierarchical resource lists and tag clouds using background knowledge from structured web content such as the Austria-Forum encyclopedia. An evaluation through network analysis and a user study found that this approach significantly improved users' ability to navigate the tagging system compared to reverse-chronological ordering of resources.
1. Graz University of Technology
On the Navigability of Social Tagging
Systems
Christoph Trattner
Knowledge Management Institute and
Institute for Information Systems and Computer Media
Graz University of Technology, Austria
e-mail: ctrattner@iicm.edu
web: http://www.austria-lexikon.at/af/User/Trattner%20Christoph
In collaboration with:
D. Helic, M. Strohmaier, K. Andrews, Ch. Körner
Christoph Trattner 2012
2. Graz University of Technology
What is a tagging system and what are
tags?
What is a tagging system?
A system that lets users apply tags to resources
What are tags?
- lightweight keywords (free form vocabulary)
- generated by users
- for users
3. Graz University of Technology
Popular examples of tagging systems
are…
7. Graz University of Technology
Why do system designers like tags?
- Tags add meta data to resources for which typically only sparse
meta data exists (such as pictures, movies, etc.)
- Through tags, system designers can provide the user with simple
navigational tools that improve the system's information
retrieval properties
- Tags are cheap!!!
8. Graz University of Technology
Why do users like tags?
- Through tags, users can categorize or describe resources
- Can find information faster
  - through personal tags
- Can find related content faster
  - through related tags
9. Graz University of Technology
Navigation with Tags
Typically, tagging systems provide the user with the following
information retrieval interfaces for navigating their content:
1. Tag clouds – widely used
2. Tag hierarchies – new, hardly any implementations yet
Gupta et al. 2010
10. Graz University of Technology
What does tag (cloud) based navigation look like?
11. Graz University of Technology
Questions???
Are Tag Clouds useful for navigation?
12. Graz University of Technology
Modelling a tag dataset as a graph (1/2)
- A tagging dataset is typically modeled as a tripartite
hypergraph
- V = R ∪ U ∪ T (resources, users, tags)
- An annotation is a hyperedge (r, t, u)
- A tripartite hypergraph can be mapped onto three
bipartite graphs connecting users and resources,
users and tags, and tags and resources.
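The mapping above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the `Annotation` type, the function name `project_bipartite` and the sample data are assumptions made for this example.

```python
from collections import namedtuple

# One annotation = one hyperedge (r, t, u); the type name is hypothetical.
Annotation = namedtuple("Annotation", ["resource", "tag", "user"])


def project_bipartite(annotations):
    """Map (r, t, u) hyperedges onto three bipartite edge sets."""
    user_resource = set()
    user_tag = set()
    tag_resource = set()
    for a in annotations:
        user_resource.add((a.user, a.resource))
        user_tag.add((a.user, a.tag))
        tag_resource.add((a.tag, a.resource))
    return user_resource, user_tag, tag_resource


# Made-up sample data for illustration only.
annotations = [
    Annotation("brahms", "composer", "alice"),
    Annotation("brahms", "music", "bob"),
    Annotation("beethoven", "composer", "bob"),
]
user_resource, user_tag, tag_resource = project_bipartite(annotations)
print(sorted(tag_resource))
```

The tag-resource graph is the one that matters for tag cloud navigation: a user moves from a resource to one of its tags and from that tag to another resource.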
13. Graz University of Technology
Defining Navigability
A network is navigable iff:
There is a short path between all or almost all pairs of
nodes in the network.
Formally:
1. There exists a giant component
2. The effective diameter is low (bounded by log n)
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science
Technical Report 99-1776 (October 1999)
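The two conditions can be checked with plain breadth-first search. The sketch below is only an illustration under stated assumptions: the graph is undirected, "giant component" is taken to mean at least 90% of all nodes, and the effective diameter is taken as the integer 90th percentile of shortest path lengths. These thresholds and all function names are choices made for this example, not values from the slides.

```python
import math
from collections import deque


def make_adj(edges):
    """Build an undirected adjacency map from a list of (a, b) edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def largest_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for node in adj:
        if node not in seen:
            comp = set(bfs_distances(adj, node))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best


def effective_diameter(adj, nodes, q=0.9):
    """Smallest d such that a fraction q of connected pairs is within d hops."""
    lengths = []
    for s in nodes:
        dist = bfs_distances(adj, s)
        lengths.extend(d for t, d in dist.items() if t != s)
    if not lengths:
        return 0
    lengths.sort()
    return lengths[math.ceil(q * len(lengths)) - 1]


def is_navigable(adj, giant_fraction=0.9):
    """Giant component exists AND effective diameter <= log2(n)."""
    n = len(adj)
    comp = largest_component(adj)
    if len(comp) < giant_fraction * n:
        return False
    return effective_diameter(adj, comp) <= math.log2(n)


star = make_adj([("hub", f"leaf{i}") for i in range(9)])   # 10 nodes
path = make_adj([(i, i + 1) for i in range(7)])            # 8 nodes in a line
print(is_navigable(star), is_navigable(path))
```

All-pairs BFS is quadratic in the number of nodes, so this is only practical for small illustrative graphs.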
14. Graz University of Technology
Navigability: Examples
Example 1:
Not navigable: No giant component
Example 2:
Not navigable: giant component, BUT
eff.diam: 7 > log2(8)
15. Graz University of Technology
Navigability: Examples
Example 3:
Navigable: Giant component AND
eff.diam: 2 < log2(10)
Is this efficiently navigable?
There are short paths between all nodes, but can an
agent or algorithm find them with local knowledge
only?
16. Graz University of Technology
Efficiently navigable
A network is efficiently navigable iff:
There is an algorithm that can find a short path using
only local knowledge, and the delivery time of the
algorithm is bounded by a polynomial in log n, i.e. log^k(n).
Example 4: a path A → B → C
Efficiently navigable, if the algorithm knows it needs to
go from A through B to C
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science
Technical Report 99-1776 (October 1999)
17. Graz University of Technology
Navigability of Social Tagging Systems (1/2)
In general tags form networks which are navigable
from a network-theoretic perspective
18. Graz University of Technology
Navigability of Social Tagging Systems (2/2)
[Figure: tag network with highlighted „Hub“ tags]
Tagging networks are navigable power-law networks. For power law
networks, efficient sub-linear decentralised navigation algorithms exist.
19. Graz University of Technology
But how about User Interface constraints?
- Tag cloud size n: topN resources (topN most common algorithm)
- Pagination of resources per tag: k resources shown per page
(reverse-chronological ordering)
20. Graz University of Technology
How UI constraints affect Navigability
[Figure panels: Tag Cloud Size; Pagination]
Limiting the tag cloud size n to practically feasible sizes (e.g. 5, 10, or more) does
not influence navigability (this is not very surprising).
BUT: Limiting the out-degree k of high-frequency tags (e.g. through pagination
with resources sorted in reverse-chronological order) leaves the network
vulnerable to fragmentation. This destroys the navigability of prevalent approaches
to tag clouds.
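The fragmentation effect can be reproduced on a toy dataset. The following is a rough sketch, not the analysis from the cited work: the link model (resource → its n most frequent tags, tag → its k newest resources), the function names and the sample annotations are all assumptions made for this example.

```python
from collections import defaultdict, deque


def build_navigation_links(annotations, n, k):
    """annotations: iterable of (resource, tag, timestamp) triples.

    Returns directed links: resource -> tags in its cloud (at most n),
    tag -> resources on its first page (the k newest).
    """
    tags_of = defaultdict(set)
    resources_of = defaultdict(dict)  # tag -> {resource: latest timestamp}
    for resource, tag, ts in annotations:
        tags_of[resource].add(tag)
        previous = resources_of[tag].get(resource, ts)
        resources_of[tag][resource] = max(ts, previous)

    links = defaultdict(set)
    for resource, tags in tags_of.items():
        # Tag cloud: the n tags with the most resources (ties by name).
        top = sorted(tags, key=lambda t: (-len(resources_of[t]), t))[:n]
        links[("r", resource)].update(("t", t) for t in top)
    for tag, stamped in resources_of.items():
        # First page: the k newest resources (reverse-chronological).
        newest = sorted(stamped, key=lambda r: (-stamped[r], r))[:k]
        links[("t", tag)].update(("r", r) for r in newest)
    return links


def reachable_resources(links, start):
    """Resources reachable from `start` by clicking tags and list entries."""
    seen = {("r", start)}
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for nb in links.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return {name for kind, name in seen if kind == "r"}


# Six resources share one tag; r6 is the newest.
annotations = [(f"r{i}", "austria", i) for i in range(1, 7)]
paged = reachable_resources(build_navigation_links(annotations, n=5, k=2), "r1")
full = reachable_resources(build_navigation_links(annotations, n=5, k=6), "r1")
print(len(paged), len(full))
```

With k = 2 the tag only ever leads to the two newest resources, so the four older ones cannot be reached from anywhere, although they can still link out to the tag.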
21. Graz University of Technology
Questions???
How can we recover the navigability of social tagging
systems?
Answer: Through resource-specific resource list
construction!
22. Graz University of Technology
What is a resource-specific resource list?
• A resource-specific resource list is a resource list
that is specific not only to a particular tag but
also to a particular resource in the tagging
system
• Typically, resource lists are calculated as follows:
Res(t) = {r1(t), …, rn(t)}
• Resource-specific resource lists are calculated
as:
Res(t, r) = {r1(t, r), …, rn(t, r)}
23. Graz University of Technology
Approach: Random Ordering
- Instead of reverse-chronological ordering of resources,
we apply a random ordering.
- On each click on a particular tag, a different resource list is
generated
- Problem: network is not efficiently navigable
Better algorithms can easily be envisioned.
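The difference between the two orderings can be sketched as follows. This is an illustrative toy, assuming a tag's resources are given as a dict from resource name to timestamp; the function names `res_chronological` and `res_random` are hypothetical.

```python
import random


def res_chronological(stamped, k):
    """Reverse-chronological page: the same k newest resources on every click."""
    return sorted(stamped, key=lambda r: (-stamped[r], r))[:k]


def res_random(stamped, k, rng):
    """Randomly ordered page: a fresh sample of k resources on each click."""
    pool = sorted(stamped)
    return rng.sample(pool, min(k, len(pool)))


# 100 made-up resources carrying the same tag; r100 is the newest.
stamped = {f"r{i}": i for i in range(1, 101)}
rng = random.Random(0)  # fixed seed only to make the sketch repeatable

shown = set()
for _ in range(200):  # 200 simulated clicks on the tag
    shown.update(res_random(stamped, 5, rng))

print(res_chronological(stamped, 5))
print(len(shown), "distinct resources exposed by random ordering")
```

Over many clicks, random pages expose far more of the tag's resources than the fixed first page does, which is why connectivity is recovered. The searcher still has no guidance on which page entry leads towards its target, which is the efficiency problem noted above.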
24. Graz University of Technology
Approach: Hierarchical Ordering
• Instead of random ordering, we use hierarchical
background knowledge for ranking paginated
resources [Kleinberg 2001].
• Kleinberg showed that if the nodes of a network
can be organized into a hierarchy, then such a
hierarchy provides a probability distribution for
connecting the nodes in the network.
• For such a network, a hierarchical decentralized
searcher exists that is able to navigate the
network in log(n) steps, so the network is efficiently
navigable
J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS), 14. MIT Press,
2001, p. 2001.
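One way to picture hierarchical ordering is to rank the resources of a tag by how close they sit to the current resource in the hierarchy. The sketch below is a deterministic simplification, whereas the cited model draws links from a probability distribution over the hierarchy. The path-tuple representation, the function names and the sample data are assumptions for this example.

```python
def tree_distance(path_a, path_b):
    """Number of tree edges between two leaves given as root-to-leaf paths."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return len(path_a) + len(path_b) - 2 * common


def res_hierarchical(tag_resources, current, paths, k):
    """Res(t, r): the k resources of the tag closest to `current` in the tree."""
    candidates = [r for r in tag_resources if r != current]
    candidates.sort(key=lambda r: (tree_distance(paths[current], paths[r]), r))
    return candidates[:k]


# Hypothetical hierarchy, written as category paths.
paths = {
    "brahms": ("music", "composers", "brahms"),
    "beethoven": ("music", "composers", "beethoven"),
    "waltz": ("music", "dances", "waltz"),
    "vienna": ("geography", "cities", "vienna"),
}
page = res_hierarchical({"beethoven", "waltz", "vienna"}, "brahms", paths, k=2)
print(page)
```

Because the list depends on the current resource, two users clicking the same tag from different resources see different pages.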
25. Graz University of Technology
Approach: Hierarchical Ordering
26. Graz University of Technology
Problem: Semantic Penalty
• Hierarchy was more or less randomly
constructed
• Does not take semantic similarity between
resources into account
• Hence, two new approaches were developed
• First idea: construct efficiently navigable tag clouds
from structured web content [Trattner 2011]
• Second idea: develop an algorithm that constructs
semantically sound resource hierarchies
from tagging data [Trattner 2011a]
C. Trattner, D. Helic, M. Strohmaier, “On the Construction of Efficiently Navigable Tag Clouds Using Knowledge from Structured Web Content,” in JUCS,
Volume 17, Issue 4, 565-582, 2011.
C. Trattner, “Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists and Tag Trails”, in CIT, 2011.
27. Graz University of Technology
On the construction of efficiently navigable tag
clouds from structured web content
• Content on the Web is not always flat
• There are websites that provide a hierarchical
structure
• Example: Austria-Forum
28. Graz University of Technology
Austria-Forum
- Wiki-based online encyclopedia system
- provides over 200,000 information items about
Austria
- unlike Wikipedia, articles in Austria-Forum
are published, edited, checked and certified by
people who are accepted as experts in a particular
field
- articles are organized hierarchically
into categories
- categories are addressable via structured URLs
(cf. Open Directory DMOZ)
[Figure labels: AEIOU, Community, Wissenssammlungen]
29. Graz University of Technology
Resource Austria-Forum
Tags
30. Graz University of Technology
Approach (1/2)
1. Hierarchical Tag Cloud Construction
31. Graz University of Technology
Approach (2/2)
2. Hierarchical Resource List Construction
32. Graz University of Technology
Evaluation
To evaluate the presented algorithm, a network-theoretic
framework [Trattner 2011b] based on the
Stanford SNAP Library (http://snap.stanford.edu/) was
developed:
Network-theoretic module: Calculates network properties
such as the size of the Largest Strongly Connected Component
(LSCC) or the Effective Diameter (ED) of the tag cloud network
Searcher module: Implements a hierarchical decentralized
searcher to simulate “efficient” tag cloud driven navigation
C. Trattner, “NAVTAG - A Network-Theoretic Framework to Assess and Improve the Navigability of Tagging Systems,” in 11th International Conference on
Web Engineering (ICWE 2011), Springer, 2011.
33. Graz University of Technology
Hierarchical Decentralized Search
Background knowledge:
(e.g. a folksonomy)
A tag network:
Goal: Navigate from START to TARGET
using local background knowledge only
start target
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
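A greedy version of such a searcher can be sketched as follows. It is an illustration only, not the searcher module of the framework mentioned earlier: at every step the searcher looks at the neighbours of the current node and moves to the unvisited one that is closest to the target in the background hierarchy. The path-tuple hierarchy, the `max_hops` limit and all names are assumptions for this example.

```python
def tree_distance(path_a, path_b):
    """Number of tree edges between two leaves given as root-to-leaf paths."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return len(path_a) + len(path_b) - 2 * common


def decentralized_search(adj, paths, start, target, max_hops=20):
    """Greedy search using only the current node's links and the hierarchy.

    Returns the visited route, or None if the searcher gets stuck or
    exceeds max_hops.
    """
    current, route, visited = start, [start], {start}
    while current != target and len(route) <= max_hops:
        options = [nb for nb in adj.get(current, ()) if nb not in visited]
        if not options:
            return None  # dead end: nothing new to click on
        # Local decision: pick the neighbour nearest to the target in the tree.
        current = min(
            options,
            key=lambda nb: (tree_distance(paths[nb], paths[target]), nb),
        )
        visited.add(current)
        route.append(current)
    return route if current == target else None


# Hypothetical background hierarchy and navigation network.
paths = {
    "brahms": ("music", "composers", "brahms"),
    "beethoven": ("music", "composers", "beethoven"),
    "waltz": ("music", "dances", "waltz"),
    "vienna": ("geography", "cities", "vienna"),
}
adj = {
    "brahms": {"vienna", "waltz"},
    "waltz": {"beethoven"},
    "vienna": {"brahms"},
}
print(decentralized_search(adj, paths, "brahms", "beethoven"))
```

The searcher never inspects the whole network; it only needs the current node's links and the hierarchy, which is what "local background knowledge only" refers to.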
34. Graz University of Technology
Results: Navigability
Approaches calculating resource lists in a
random manner form navigable tag cloud
networks
35. Graz University of Technology
Results: Searcher
• Best Results are obtained with
hierarchically constructed tag
clouds/resource lists (=HH)
• Naive (=TopN + chron. sorted resource
list) approach performs worst (=N)
• However, HR performs better than a
pure random approach (=R)
36. Graz University of Technology
User Study
To measure the performance of the approach, a
between-group test design was used
For that purpose we randomly split our test
users into two groups:
- Group A: assigned to navigate Austria-Forum with
hierarchically constructed resource lists
- Group B (baseline): assigned to navigate Austria-Forum with
reverse-chronologically sorted resource lists
37. Graz University of Technology
User Study
During the study, the users were asked to solve
10 tasks
In particular, the users were asked to navigate
from 10 given start resources to 10 given target
resources as fast as possible.
To get valid results, the start and target
resources were selected uniformly at random
(same for all users)
As a navigation tool, users were allowed to use
only tag clouds
38. Graz University of Technology
User Study
To ensure that the users would have to navigate,
we selected the paths in such a way that the
users had to visit 0 to 4 intermediate
resources to find the target resources
Each user was given a maximum of 3 minutes
for each task
39. Graz University of Technology
Example: Tag cloud based navigation
[Figure: navigation from the start resource "Brahms" to the target resource "Beethoven" via a resource list]
40. Graz University of Technology
User Study
Since we observed during our pilot test that
users had problems in finding resources that
they did not know, the tags of the target resource
were also presented to the users
The variable measured in the experiment was
success rate, i.e. we measured whether the user
could find the target resources or not!
41. Graz University of Technology
Results: User Study
All in all, 24 test users participated in the experiment
16 male and 8 female
median age = 33 years, ranging from 22 to 56
All participants were experienced computer users (on
average 46 hours per week)
12 of them were experienced with the Austria-Forum
test system
To avoid this bias, we assigned those users
randomly to groups A and B
42. Graz University of Technology
Results: User Study
Regarding the mean success rate, users of group A found on
average 55% of their designated target resources
In comparison, users of group B found only 23% of their
designated target resources
In other words, on average we observed an improvement
of 32 percentage points in the navigability of the Austria-Forum
tagging system when using hierarchically constructed resource lists.
These results confirm the theoretical assumptions
made in previous work in this area [Helic et al. 2011]
Helic, D., Trattner, C., Strohmaier, M. and Andrews, K.: Are Tag Clouds Useful for Navigation? A Network-Theoretic
Analysis, Journal of Social Computing and Cyber-Physical Systems, 2011.
43. Graz University of Technology
Results: User Study
The experiment showed that the hierarchically constructed
tag network is significantly more navigable than the one
produced by the naïve approach.
44. Graz University of Technology
Problem: Predefined Resource Hierarchy
- A predefined resource hierarchy is not always
given
- Hence, the presented approach is not
completely generic
- Another problem:
The success rate drops drastically if the
provided resource hierarchy is neither
balanced nor complete
45. Graz University of Technology
Question?
How can we construct fixed branched and balanced
resource hierarchies from tagging data
automatically???
46. Graz University of Technology
Algorithm: Resource Hierarchy Generation
47. Graz University of Technology
Algorithm: Resource Hierarchy Labeling
48. Graz University of Technology
Results: Semantic Evaluation
- Taxonomic F-Measure and
Taxonomic Overlap measure
the quality of a given taxonomy
against a gold standard via
common concepts.
- Comparison to four popular tag
hierarchy induction algorithms
- As gold standard for the experiment, the GermaNet
ontology was used (the Austria-Forum tag dataset contains
only German tags)
49. Graz University of Technology
Results: Empirical Analysis
- 9 test participants (all of them experienced in the evaluation
of concept hierarchies)
- resource taxonomy with b=10
- Evaluation via online test
- Users had to classify tag trails
50. Graz University of Technology
Results: Empirical Evaluation
Compared to a tag taxonomy comprising only tags, the
concept relations of a tag-resource taxonomy with
branching factor b = 10 are only 5% less hierarchically
arranged than the tag concepts produced by the
Deg/Cooc tag taxonomy induction algorithm, in theory the
semantically best tag taxonomy approach.
51. Graz University of Technology
Results: Tag Cloud Navigability
To determine the navigability of the approach, several
tag networks with different resource list lengths were
generated.
Branching factors used in the experiment: b = 2, 5 and 10.
Resource list length was varied from k = 10 to 50.
- To determine navigability, the size of the LSCC and the ED were measured.
- To determine efficiency, a hierarchical decentralized searcher was
implemented that uses the resource hierarchy as background knowledge to
search the tag networks.
52. Graz University of Technology
Results: Network Properties
Simulations show the navigability of the hierarchically
constructed tag networks.
53. Graz University of Technology
Results: Searcher
Simulations show very high success rates ( > 90%)
even for “short” resource lists (k=10).
54. Graz University of Technology
Conclusions
- From a network-theoretical perspective (and only
looking at tags) tagging systems are navigable
- However, if we consider simple user-interface
constraints, they are NOT!
- Problem: Current tag cloud algorithms calculate resource lists in a
static manner
- Pagination splits the tag network into isolated clusters
- However, with hierarchically constructed resource
lists navigability can be recovered
- Such tag networks are also efficiently navigable, if
the resources of the tagging system can be arranged
into a fixed branched resource taxonomy
55. Graz University of Technology
End of Presentation
Thank you!
Christoph Trattner
ctrattner@iicm.edu
Graz University of Technology, Austria
56. Graz University of Technology
References and Further Readings
Trattner, C., Lin, Y., Parra, D., Yue, Z., Brusilovsky, P.: Evaluating Tag-Based Information
Access in Image Collections, In Proceedings of the 23rd ACM Conference on Hypertext and
Social Media, ACM, New York, NY, USA, 2012.
Helic, D., Körner, C., Granitzer, M., Strohmaier, M., Trattner, C.: Navigational efficiency of broad
vs. narrow folksonomies, In Proceedings of the 23rd ACM Conference on Hypertext and
Social Media, ACM, New York, NY, USA, 2012.
Trattner, C., Singer, P., Helic, D. and Strohmaier, M.: Exploring the Differences and Similarities
of Hierarchical Decentralized Search and Human Navigation in Information-networks, In
Proceedings of the 11th International Conference on Knowledge Management and
Knowledge Technologies, ACM, New York, NY, USA, 2012.
Trattner, C.: Linking Related Content in Web Encyclopedias with search query tag clouds, IADIS
International Journal on WWW/Internet, Volume 9(2), 2011.
Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed
Resource Lists and Tag Trails, Journal of Computing and Information Technology, Volume
19(3), 155-167, 2011.
Trattner, C., Helic, D. and Strohmaier, M.: On the Construction of Efficiently Navigable Tag
Clouds Using Knowledge From Structured Web Content, Journal of Universal Computer
Science, Volume 17(4), 565-582, 2011.
57. Graz University of Technology
References and Further Readings
Helic, D., Strohmaier, M., Trattner, C., Muhr, M. and Lerman, K.: Pragmatic Evaluation of
Folksonomies, In Proceedings of the 20th International Conference on World Wide Web,
ACM, New York, NY, USA, 417-426, 2011.
Trattner, C., Körner, C., Helic, D.: Enhancing the Navigability of Social Tagging Systems with
Tag Taxonomies, In Proceedings of the 11th International Conference on Knowledge
Management and Knowledge Technologies, ACM, 7–9 September 2011, Messe Congress
Graz, Austria, 2011.
Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed
Resource Lists: A Comparative Study, In Proceedings of the 33rd International Conference
on Information Technology Interfaces, IEEE, Cavtat / Dubrovnik, Croatia, 2011.
Helic, D., Trattner, C., Strohmaier, M., Andrews, K.: On the Navigability of Social Tagging
Systems, In Proceedings of the Second IEEE International Conference on Social Computing,
Minnesota, USA, 2010.