On the Navigability of Social Tagging Systems

Christoph Trattner
Knowledge Management Institute and
Institute for Information Systems and Computer Media
Graz University of Technology, Austria
e-mail: ctrattner@iicm.edu
web: http://www.austria-lexikon.at/af/User/Trattner%20Christoph

In collaboration with:
D. Helic, M. Strohmaier, K. Andrews, Ch. Körner

Christoph Trattner 2012


What is a tagging system and what are tags?

What is a tagging system?
A system that lets users apply tags to resources

What are tags?
- lightweight keywords (free form vocabulary)
- generated by users
- for users



Popular examples of tagging systems
are…

[Figures: screenshots of popular tagging systems, with the tags highlighted in each]


Why do system designers like tags?

- Tags add metadata to resources for which
typically only sparse metadata exists
(such as pictures, movies, etc.)

- Through tags, system designers are able to provide the
user with simple navigational tools that improve the
system's information retrieval properties

- Tags are cheap!


Why do users like tags?

- Through tags, users are able to categorize or describe
resources

- Users can find information faster
  - through personal tags
- Users can find related content faster
  - through related tags


Navigation with Tags
Tagging systems typically provide the user with the following
information retrieval interfaces for navigating their content:

1. Tag clouds – widely used

2. Tag hierarchies – new, hardly any implementations yet

Gupta et al. 2010


What does tag (cloud) based navigation look like?


Question

Are tag clouds useful for navigation?


Modelling a tag dataset as a graph (1/2)
- A tagging dataset is typically modeled as a tripartite
hypergraph

- V = R ∪ U ∪ T (resources R, users U, tags T)

- An annotation is a hyperedge (r, t, u)

- A tripartite hypergraph can be mapped onto three
bipartite graphs connecting users and resources,
users and tags, and tags and resources.
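
The mapping onto three bipartite graphs can be sketched as follows. This is a minimal illustration, not code from the talk: the function name `project` and the sample annotations are made up, and each bipartite graph is stored as a plain dictionary of sets.

```python
from collections import defaultdict

def project(annotations):
    """Map (resource, tag, user) hyperedges onto three bipartite graphs."""
    user_resource = defaultdict(set)   # users     <-> resources
    user_tag = defaultdict(set)        # users     <-> tags
    tag_resource = defaultdict(set)    # tags      <-> resources
    for r, t, u in annotations:
        user_resource[u].add(r)
        user_tag[u].add(t)
        tag_resource[t].add(r)
    return user_resource, user_tag, tag_resource

# Hypothetical annotations, each one a hyperedge (r, t, u).
annotations = [
    ("photo1", "vienna", "alice"),
    ("photo1", "opera", "bob"),
    ("photo2", "vienna", "bob"),
]
user_resource, user_tag, tag_resource = project(annotations)
print(dict(tag_resource))
```

The tag-resource graph is the projection that tag cloud navigation operates on: a user moves from a resource to one of its tags, and from that tag to another resource.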



Defining Navigability

A network is navigable iff:
There is a short path between all or almost all pairs of
nodes in the network.

Formally:
1. There exists a giant component
2. The effective diameter is low (bounded by log n)
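
The two conditions can be checked with a few lines of code. The sketch below makes several assumptions that are not on the slide: the graph is undirected and stored as a dictionary of neighbour sets, a component counts as "giant" if it holds at least 90% of the nodes, and the effective diameter is taken as the 90th percentile of all shortest-path lengths inside that component.

```python
import math
from collections import deque

def bfs_distances(graph, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def giant_component(graph):
    """Largest connected component of an undirected graph."""
    seen, best = set(), set()
    for node in graph:
        if node not in seen:
            comp = set(bfs_distances(graph, node))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best

def effective_diameter(graph, nodes, quantile=0.9):
    """Path length within which `quantile` of all connected pairs lie."""
    lengths = sorted(
        d for s in nodes for t, d in bfs_distances(graph, s).items() if t != s
    )
    if not lengths:
        return 0
    return lengths[math.ceil(quantile * len(lengths)) - 1]

def is_navigable(graph, min_fraction=0.9):
    """Giant component exists AND effective diameter <= log2(n)."""
    n = len(graph)
    comp = giant_component(graph)
    if len(comp) < min_fraction * n:
        return False
    return effective_diameter(graph, comp) <= math.log2(n)

# A star with 10 nodes (short paths) and a chain with 8 nodes (long paths).
star = {0: set(range(1, 10)), **{i: {0} for i in range(1, 10)}}
chain = {i: {j for j in (i - 1, i + 1) if 0 <= j < 8} for i in range(8)}
print(is_navigable(star), is_navigable(chain))
```

Both thresholds are parameters, so stricter or looser readings of "almost all pairs" can be tried without changing the functions.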

J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)


Navigability: Examples

Example 1:

Not navigable: no giant component

Example 2:

Not navigable: there is a giant component, BUT
effective diameter: 7 > log2(8) = 3


Navigability: Examples

Example 3:

Navigable: giant component AND
effective diameter: 2 < log2(10) ≈ 3.32

Is this efficiently navigable?
There are short paths between all nodes, but can an
agent or algorithm find them with local knowledge
only?


Efficiently navigable

A network is efficiently navigable iff:
There is an algorithm that can find a short path with
only local knowledge, and the delivery time of the
algorithm is bounded by a polynomial in log(n), i.e. by
log^k(n) for some constant k.

Example 4: [Figure: three nodes A, B, C]

Efficiently navigable, if the algorithm knows it needs to
go through A → B → C
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)


Navigability of Social Tagging Systems (1/2)

In general, tags form networks that are navigable
from a network-theoretic perspective


Navigability of Social Tagging Systems (2/2)

[Figure: tag network with "hub" tags]

Tagging networks are navigable power-law networks. For power-law
networks, efficient sub-linear decentralised navigation algorithms exist.


But how about User Interface constraints?

- Tag cloud size n: topN resources
  (TopN is the most common algorithm)

- Pagination of resources per tag: k resources shown per page
  (reverse-chronological ordering)
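
Both constraints can be written as two small functions. This is a sketch under assumptions: the slide does not say how TopN ranks tags, so ranking by how often each tag was applied is assumed here, and the names `tag_cloud` and `first_page` as well as the sample data are hypothetical.

```python
from collections import Counter

def tag_cloud(resource_tags, tag_counts, n):
    """Top-n tags of a resource, most frequently used first (ties by name)."""
    return sorted(resource_tags, key=lambda t: (-tag_counts[t], t))[:n]

def first_page(tagged, k):
    """The k most recently tagged resources of a tag (reverse-chronological)."""
    newest_first = sorted(tagged, key=lambda item: item[1], reverse=True)
    return [resource for resource, _ in newest_first[:k]]

# Hypothetical usage counts and (resource, timestamp) pairs.
tag_counts = Counter({"vienna": 5, "music": 3, "opera": 2})
cloud = tag_cloud({"vienna", "opera", "music"}, tag_counts, n=2)
page = first_page([("r1", 1), ("r2", 5), ("r3", 3)], k=2)
print(cloud, page)
```

With n = 2 the least used tag is dropped from the cloud, and with k = 2 the oldest resource falls off the first page of its tag.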



How UI constraints affect Navigability
[Figure panels: Tag Cloud Size; Pagination]

Limiting the tag cloud size n to practically feasible sizes (e.g. 5, 10, or more) does
not influence navigability (this is not very surprising).
BUT: Limiting the out-degree k of high-frequency tags (e.g. through pagination
with resources sorted in reverse-chronological order) leaves the network
vulnerable to fragmentation. This destroys the navigability of prevalent approaches
to tag clouds.
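
A toy example makes the fragmentation effect concrete. The sketch assumes that each tag links only to the first k entries of its newest-first resource list; the function name and the four-resource dataset are invented for illustration.

```python
from collections import deque

def reachable_resources(start, resource_tags, tag_resources, k):
    """Resources reachable by clicking tags when each tag shows only k resources."""
    seen = {start}
    queue = deque([start])
    while queue:
        resource = queue.popleft()
        for tag in resource_tags[resource]:
            # Only the first page (k newest resources) of the tag is linked.
            for nxt in tag_resources[tag][:k]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen

# Hypothetical data; each tag's resource list is sorted newest first.
resource_tags = {
    "r1": ["vienna", "opera"],
    "r2": ["vienna"],
    "r3": ["vienna"],
    "r4": ["vienna"],
}
tag_resources = {"vienna": ["r4", "r3", "r2", "r1"], "opera": ["r1"]}

short_pages = reachable_resources("r4", resource_tags, tag_resources, k=2)
long_pages = reachable_resources("r4", resource_tags, tag_resources, k=4)
print(sorted(short_pages), sorted(long_pages))
```

With k = 2 the older resources r1 and r2 can no longer be reached from r4, although all four resources share the tag "vienna".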


Question

How can we recover the navigability of social tagging
systems?

Answer: Through resource-specific resource list
construction!


What is a resource-specific resource list?
• A resource-specific resource list is a resource list
that is specific not only to a particular tag but
also to a particular resource in the tagging
system

• Typically, resource lists are calculated as follows
Res(t) = {r_1(t), …, r_n(t)}
• Resource-specific resource lists are calculated
as
Res(t, r) = {r_1(t, r), …, r_n(t, r)}
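
The difference between the two definitions is easiest to see as two function signatures. This is only a sketch: `score` is a hypothetical ranking function (smaller means closer to the current resource), the numeric resource ids are made up, and leaving the current resource out of its own list is an assumption.

```python
def res(tag, tag_resources, n):
    """Res(t): the same n resources, whichever resource the user came from."""
    return tag_resources[tag][:n]

def res_specific(tag, current, tag_resources, n, score):
    """Res(t, r): n resources of tag t, ranked relative to the current resource r."""
    candidates = [x for x in tag_resources[tag] if x != current]
    return sorted(candidates, key=lambda x: score(current, x))[:n]

# Hypothetical data: resources are numbers, closeness is the numeric difference.
tag_resources = {"vienna": [10, 20, 30, 40]}
closeness = lambda a, b: abs(a - b)

static_list = res("vienna", tag_resources, n=2)
from_40 = res_specific("vienna", 40, tag_resources, n=2, score=closeness)
from_10 = res_specific("vienna", 10, tag_resources, n=2, score=closeness)
print(static_list, from_40, from_10)
```

The static list is identical for every visitor of the tag, while the resource-specific list changes with the resource the tag was clicked on.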



Approach: Random Ordering

- Instead of reverse-chronological ordering of resources,
we apply a random ordering.
- On each click on a particular tag, a different resource list is
generated
- Problem: the network is not efficiently navigable

Better algorithms can easily be envisioned.
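
A minimal sketch of the random ordering, assuming that every click draws k resources uniformly at random from all resources of the tag. The function name and the `rng` parameter (used to make runs reproducible) are illustrative additions.

```python
import random

def random_resource_list(tag, tag_resources, k, rng=random):
    """Draw up to k resources of the tag uniformly at random, without repeats."""
    resources = tag_resources[tag]
    return rng.sample(resources, min(k, len(resources)))

# Hypothetical tag with 100 resources; a seeded generator keeps the demo repeatable.
tag_resources = {"vienna": [f"r{i}" for i in range(100)]}
rng = random.Random(0)
first_click = random_resource_list("vienna", tag_resources, 5, rng)
second_click = random_resource_list("vienna", tag_resources, 5, rng)
print(first_click, second_click)
```

Over many clicks every resource of a tag has a chance of being shown, which keeps the network connected, but the list gives a searcher no hint about which link leads towards the target.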



Approach: Hierarchical Ordering
• Instead of random ordering, we use hierarchical
background knowledge for ranking paginated
resources [Kleinberg 2001].
• Kleinberg showed that if the nodes of a network
can be organized into a hierarchy, then such a
hierarchy provides a probability distribution for
connecting the nodes in the network.
• For such a network, a hierarchical decentralized
searcher exists that is able to navigate the
network in log(n) steps, so the network is efficiently
navigable
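
One way to turn a hierarchy into a resource list is sketched below. It is an illustration, not the implementation behind the slides: it assumes that every resource is a leaf identified by its root-to-leaf path, that the chance of listing a resource decays as b^(-alpha * h) with the height h of the lowest common ancestor, and that sampling without replacement is done with random keys u^(1/weight). All names and the sample paths are hypothetical.

```python
import random

def tree_height(path_a, path_b):
    """Height of the lowest common ancestor above two leaves (root-to-leaf paths)."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return max(len(path_a), len(path_b)) - common

def hierarchical_resource_list(current, candidates, paths, k, b=2, alpha=1.0, rng=random):
    """Pick k candidates, favouring those close to `current` in the hierarchy."""
    def sampling_key(x):
        # Assumed link weight: decays exponentially with the tree height.
        weight = b ** (-alpha * tree_height(paths[current], paths[x]))
        # Weighted sampling without replacement: keep the k largest keys.
        return rng.random() ** (1.0 / weight)
    others = [x for x in candidates if x != current]
    return sorted(others, key=sampling_key, reverse=True)[:k]

# Hypothetical hierarchy positions of three resources.
paths = {
    "brahms": ("music", "classical", "brahms"),
    "beethoven": ("music", "classical", "beethoven"),
    "danube": ("geography", "rivers", "danube"),
}
page = hierarchical_resource_list("brahms", list(paths), paths, k=2, rng=random.Random(1))
print(tree_height(paths["brahms"], paths["beethoven"]), page)
```

Resources in the same branch as the current one are listed most often, yet distant branches keep a small chance of appearing, which is what gives a searcher occasional long-range links.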
J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS) 14. MIT Press, 2001.




Problem: Semantic Penalty
• The hierarchy was more or less randomly
constructed
• It does not take semantic similarity between
resources into account
• Hence, two new approaches were developed
  • First idea: construct efficiently navigable tag clouds
  from structured web content [Trattner 2011]
  • Second idea: develop an algorithm that is able to
  construct semantically sound resource hierarchies
  from tagging data [Trattner 2011a]
C. Trattner, D. Helic, M. Strohmaier, “On the Construction of Efficiently Navigable Tag Clouds Using Knowledge from Structured Web Content,” in JUCS, Volume 17, Issue 4, 565-582, 2011.
C. Trattner, “Improving the Navigability of Tagging Systems with Hierarchically Constructed Resource Lists and Tag Trails,” in CIT, 2011.


On the construction of efficiently navigable tag
clouds from structured web content

• Content on the Web is not always flat
• There are websites that provide a hierarchical
structure

• Example: Austria-Forum


Austria-Forum
- Wiki-based online encyclopedia system
- provides over 200,000 information items about
Austria
- unlike Wikipedia, articles in Austria-Forum
are published, edited, checked and certified by
people who are accepted as experts in a particular
field
- articles are organized hierarchically
into categories (e.g. AEIOU, Community,
Wissenssammlungen (knowledge collections))
- categories are addressable via structured URLs
(cf. Open Directory DMOZ)
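
Because categories are encoded in the URL, the position of a page in the hierarchy can be read off its path. The sketch below uses only the Python standard library; treating the leading `af` segment as a site prefix is an assumption based on the URL shown on the title slide, and `category_path` is a made-up helper name.

```python
from urllib.parse import unquote, urlparse

def category_path(url, prefix="af"):
    """Split a structured URL into its hierarchy levels, outermost first."""
    segments = [unquote(s) for s in urlparse(url).path.split("/") if s]
    if segments and segments[0] == prefix:
        segments = segments[1:]  # drop the assumed site prefix
    return segments

levels = category_path("http://www.austria-lexikon.at/af/User/Trattner%20Christoph")
print(levels)
```

Each resource then comes with a root-to-leaf path, which is the kind of background knowledge a hierarchical ordering of resource lists needs.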



[Figure: a resource page in Austria-Forum with its tags]


Approach (1/2)
1. Hierarchical Tag Cloud Construction



Approach (2/2)
2. Hierarchical Resource List Construction



Evaluation
To evaluate the presented algorithm, a network-theoretic
framework [Trattner 2011b] based on the
Stanford SNAP Library (http://snap.stanford.edu/) was
developed:

Network-theoretic module: Calculates network properties
such as the size of the Largest Strongly Connected Component
(LSCC) or the Effective Diameter (ED) of the tag cloud network

Searcher module: Implements a hierarchical decentralized
searcher to simulate “efficient” tag cloud driven navigation
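
What the network-theoretic module measures can be illustrated without the SNAP library. The sketch below computes the size of the largest strongly connected component of a directed graph with Kosaraju's two-pass algorithm in plain Python; it is a stand-in for illustration, not the framework's code, and the function name and toy graph are made up.

```python
def largest_scc_size(graph):
    """Size of the largest strongly connected component of a directed graph."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    reverse = {v: [] for v in nodes}
    for u, targets in graph.items():
        for v in targets:
            reverse[v].append(u)

    # First pass: record nodes in order of depth-first completion.
    order, seen = [], set()
    for root in nodes:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                order.append(node)
                stack.pop()

    # Second pass: sweep the reversed graph in reverse completion order.
    best, assigned = 0, set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        size, stack = 0, [root]
        while stack:
            node = stack.pop()
            size += 1
            for nxt in reverse[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best

# Toy network: a, b, c form a cycle; d can be reached but never left.
toy = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
print(largest_scc_size(toy))
```

Dividing the result by the total number of nodes gives the fraction of the network in which every node can reach every other node by following links.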

C. Trattner, “NAVTAG - A Network-Theoretic Framework to Assess and Improve the Navigability of Tagging Systems,” in 11th International Conference on
Web Engineering (ICWE 2011), Springer, 2011.


Hierarchical Decentralized Search
Background knowledge: a hierarchy (e.g. a folksonomy)

[Figure: a tag network with a START node and a TARGET node]

Goal: Navigate from START to TARGET
using local background knowledge only

J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
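
A greedy version of such a searcher fits in a few lines. The sketch rests on assumptions not stated on the slide: the background hierarchy stores a root-to-node path for every node, the searcher sees only the neighbours of its current node, it never revisits a node, and it gives up after `max_hops` steps. All names and the four-node example are hypothetical.

```python
def tree_distance(path_a, path_b):
    """Number of hierarchy edges between two nodes given as root-to-node paths."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)

def decentralized_search(network, hierarchy, start, target, max_hops=20):
    """Greedy walk: move to the unvisited neighbour closest to the target."""
    path, visited, current = [start], {start}, start
    while current != target and len(path) <= max_hops:
        options = [n for n in network.get(current, ()) if n not in visited]
        if not options:
            return None  # dead end: no unvisited neighbour left
        current = min(
            options, key=lambda n: tree_distance(hierarchy[n], hierarchy[target])
        )
        visited.add(current)
        path.append(current)
    return path if current == target else None

# Hypothetical background hierarchy and link network.
hierarchy = {
    "a": ("root", "x", "a"),
    "b": ("root", "x", "b"),
    "c": ("root", "y", "c"),
    "d": ("root", "y", "d"),
}
network = {"a": ["b", "c"], "b": ["a"], "c": ["d"], "d": []}
found = decentralized_search(network, hierarchy, "a", "d")
print(found)
```

At node a the searcher prefers c over b because c shares the branch y with the target d, even though b is the closer node to a in the hierarchy.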


Results: Navigability

Approaches calculating resource lists in a
random manner form navigable tag cloud
networks



Results: Searcher

• Best results are obtained with
hierarchically constructed tag
clouds/resource lists (=HH)

• The naive approach (TopN + chron. sorted resource
list) performs worst (=N)

• However, HR performs better than a
pure random approach (=R)


User Study
- To measure the performance of the approach, a
between-group test design was used

- For that purpose, we randomly split our test
users into two groups:

Group A: assigned to navigate Austria-Forum with
hierarchically constructed resource lists

Group B (baseline): assigned to navigate Austria-Forum with
reverse-chronologically sorted resource lists


User Study

- During the study, the users were asked to complete
10 tasks

- In particular, the users were asked to navigate
from 10 given start resources to 10 given target
resources as fast as possible

- To get valid results, the start and target
resources were selected uniformly at random
(same for all users)

- As a navigation tool, users were allowed to use
only tag clouds


User Study

- To ensure that the users would have to navigate,
we selected the paths in such a way that the
users had to visit between 0 and 4 intermediate
resources to find the target resources

- Each user was given a maximum of 3 minutes
for each task


Example: Tag cloud based navigation

[Figure: navigating from the start resource (Brahms) to the target resource (Beethoven) via a tag cloud and its resource list]


User Study

- Since we observed during our pilot test that
users had problems finding resources that
they did not know, the tags of the target resource
were also presented to the users

- The variable measured in the experiment was
success rate, i.e. we measured whether or not the user
could find the target resources


Results: User Study
- All in all, 24 test users participated in the experiment

- 16 male and 8 female

- median age = 33 years, ranging from 22 to 56

- All participants were experienced computer users (on
average 46 hours per week)

- 12 of them were experienced with the Austria-Forum
test system

- To avoid bias from this prior experience, we assigned those users
randomly to groups A and B


Results: User Study
- Regarding the mean success rate, users of group A found on
average 55% of their designated target resources

- In comparison, users of group B found only
23% of their designated target resources

- In other words, on average we observed an improvement
of 32 percentage points in the navigability of the Austria-Forum
tagging system when using hierarchically constructed resource lists

- These results confirm the theoretical assumptions
made in previous work in this area [Helic et al. 2011]
Helic, D., Trattner, C., Strohmaier, M. and Andrews, K.: Are Tag Clouds Useful for Navigation? A Network-Theoretic
Analysis, Journal of Social Computing and Cyber-Physical Systems, 2011.


Results: User Study

The experiment showed that the hierarchically constructed
tag network is significantly more navigable than the one
produced by the naïve approach.


Problem: Predefined Resource Hierarchy
- A predefined resource hierarchy is not always
given
- Hence, the presented approach is not
completely generic

- Other problem:
The success rate drops drastically if the
provided resource hierarchy is neither
balanced nor complete


Question

How can we automatically construct balanced resource
hierarchies with a fixed branching factor from tagging
data?
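
To make "balanced with a fixed branching factor" concrete, the sketch below arranges a list of resources in a complete b-ary tree. It is not the generation algorithm of this talk (which the slides show only as a figure): it assumes the resources arrive in some meaningful order, which is not specified here, and simply places them level by level so that the parent of position i is position (i - 1) // b.

```python
def build_balanced_hierarchy(resources, b):
    """Map each resource to its parent in a complete b-ary tree (root -> None)."""
    return {
        resources[i]: (resources[(i - 1) // b] if i > 0 else None)
        for i in range(len(resources))
    }

def depth(node, parent):
    """Number of edges between a node and the root."""
    d = 0
    while parent[node] is not None:
        node = parent[node]
        d += 1
    return d

# Seven hypothetical resources with branching factor b = 2.
resources = [f"r{i}" for i in range(7)]
parent = build_balanced_hierarchy(resources, b=2)
print(parent["r5"], depth("r6", parent))
```

In such a tree all leaves sit at nearly the same depth, so the tree height used by a hierarchical searcher grows only logarithmically with the number of resources.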



Algorithm: Resource Hierarchy Generation



Algorithm: Resource Hierarchy Labeling



Results: Semantic Evaluation

- Taxonomic F-Measure and
Taxonomic Overlap measure
the quality of a given taxonomy
against a gold standard via
common concepts.

- Comparison to four popular tag
hierarchy induction algorithms

- As gold standard for the experiment, the GermaNet
ontology was used (the Austria-Forum tag dataset contains
only German tags)


Results: Empirical Analysis

- 9 test participants (all of them experienced in the evaluation
of concept hierarchies)
- resource taxonomy with b=10

- Evaluation via online test
- Users had to classify tag trails



Results: Empirical Evaluation

Compared to a tag taxonomy comprising only tags, the
concept relations of a tag-resource taxonomy with
branching factor b = 10 are only 5% less hierarchically
arranged than the tag concepts produced by the Deg/Cooc
tag taxonomy induction algorithm, which is in theory the
most semantically correct tag taxonomy approach.


Results: Tag Cloud Navigability

In order to determine the navigability of the approach several
tag networks with different resource list lengths were
generated.

Branching factors used in the experiment: b = 2, 5, and 10.
Resource list length was varied from k = 10 to 50.

- To determine navigability: Size of LSCC and ED was measured.
- To determine efficiency a hierarchical decentralized searcher was
implemented utilizing the resource hierarchy as background knowledge to
search the tag networks.



Results: Network Properties

Simulations show the navigability of the hierarchically
constructed tag networks.


Results: Searcher

Simulations show very high success rates ( > 90%)
even for “short” resource lists (k=10).


Conclusions
- From a network-theoretic perspective (and only
looking at tags), tagging systems are navigable
- However, if we consider simple user-interface
constraints, they are NOT!
  - Problem: Current tag cloud algorithms calculate resource lists
  statically
  - Pagination fragments the tag network into isolated clusters
- However, with hierarchically constructed resource
lists, navigability can be recovered
- Such tag networks are also efficiently navigable if
the resources of the tagging system can be arranged
into a resource taxonomy with a fixed branching factor


End of Presentation

Thank you!

Christoph Trattner
ctrattner@iicm.edu
Graz University of Technology, Austria



References and Further Readings
Trattner, C., Lin, Y., Parra, D., Yue, Z., Brusilovsky, P.: Evaluating Tag-Based Information
Access in Image Collections, In Proceedings of the 23rd ACM Conference on Hypertext and
Social Media, ACM, New York, NY, USA, 2012.
Helic, D., Körner, C., Granitzer, M., Strohmaier, M., Trattner, C.: Navigational efficiency of broad
vs. narrow folksonomies, In Proceedings of the 23rd ACM Conference on Hypertext and
Social Media, ACM, New York, NY, USA, 2012.
Trattner, C., Singer, P., Helic, D. and Strohmaier, M.: Exploring the Differences and Similarities
of Hierarchical Decentralized Search and Human Navigation in Information-networks, In
Proceedings of the 11th International Conference on Knowledge Management and
Knowledge Technologies, ACM, New York, NY, USA, 2012.
Trattner, C.: Linking Related Content in Web Encyclopedias with search query tag clouds, IADIS
International Journal on WWW/Internet, Volume 9(2), 2011.
Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed
Resource Lists and Tag Trails, Journal of Computing and Information Technology, Volume
19(3), 155-167, 2011.
Trattner, C., Helic, D. and Strohmaier, M.: On the Construction of Efficiently Navigable Tag
Clouds Using Knowledge From Structured Web Content, Journal of Universal Computer
Science, Volume 17(4), 565-582, 2011.
Helic, D., Strohmaier, M., Trattner, C., Muhr, M. and Lerman, K.: Pragmatic Evaluation of
Folksonomies, In Proceedings of the 20th International Conference on World Wide Web,
ACM, New York, NY, USA, 417-426, 2011.
Trattner, C., Körner, C., Helic, D.: Enhancing the Navigability of Social Tagging Systems with
Tag Taxonomies, In Proceedings of the 11th International Conference on Knowledge
Management and Knowledge Technologies, ACM, 7–9 September 2011, Messe Congress
Graz, Austria, 2011.
Trattner, C.: Improving the Navigability of Tagging Systems with Hierarchically Constructed
Resource Lists: A Comparative Study, In Proceedings of the 33rd International Conference
on Information Technology Interfaces, IEEE, Cavtat / Dubrovnik, Croatia, 2011.
Helic, D., Trattner, C., Strohmaier, M., Andrews, K.: On the Navigability of Social Tagging
Systems, In Proceedings of the Second IEEE International Conference on Social Computing,
Minnesota, USA, 2010.
