WEBIST 2009

438 views

Published on

Multimedia content is uploaded, tagged and recommended by users of collaborative systems such as YouTube
and Flickr. These systems can be represented as tagged-graphs, where nodes correspond to users and tagged-
links to recommendations. In this paper we analyze the online computation of user-rankings associated to
a set of tags, called a facet. A simple approach to faceted ranking is to apply an algorithm that calculates
a measure of node centrality, say, PageRank, to a subgraph associated with the given facet. This solution,
however, is not feasible for online computation. We propose an alternative solution: (i) first, a ranking for
each tag is computed offline on the basis of tag-related subgraphs; (ii) then, a faceted order is generated online
by merging rankings corresponding to all the tags in the facet. Based on empirical observations, we show that
step (i) is scalable. We also present efficient algorithms for step (ii), which are evaluated by comparing their
results to those produced by the direct calculation of node centrality based on the facet-dependent graph.

Published in: Technology, Lifestyle
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
438
On SlideShare
0
From Embeds
0
Number of Embeds
4
Actions
Shares
0
Downloads
2
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

WEBIST 2009

  1. 1. Faceted Ranking In Collaborative Tagging Systems J. I. Orlicki12 P. Fierens2 J. I. Alvarez-Hamelin23 1 Core Security Technologies 2 ITBA 3 CONICET WEBIST 2009, Lisbon, Portugal
  2. 2. The Problem (Faceted Reputation) Which ickr photographers are the best regarding a facet, i.e. tag set, { sea, portugal }? Nodes are users/channels, edges are favorites and tags are associated to the favorited content.
  3. 3. Single Ranking (1/3) Basic approach, single rank and ltering. Scales well. Everything is biased to the richer nodes, tags dont inuence the ranking. G goes out, but why is D worstly ranked than A regarding {sea, portugal}? Is D better than C?
  4. 4. Single Ranking (2/3) Basic approach, single rank and ltering. Scales well. Everything is biased to the richer nodes, tags dont inuence the ranking. G goes out, but why is D worstly ranked than A regarding {sea, portugal}? Is D better than C?
  5. 5. Single Ranking (3/3) Basic approach, single rank and ltering. Scales well. Everything is biased to the richer nodes, tags dont inuence the ranking. G goes out, but why is D worstly ranked than A regarding {sea, portugal}? Is D better than C?
  6. 6. Edge-intersection, 1st gold standard (1/3) Filtering edges including the conjunction of tags. Adequate tag bias, slightly restrictive.
  7. 7. Edge-intersection, 1st gold standard (2/3) Filtering edges including the conjunction of tags. Adequate tag bias, slightly restrictive.
  8. 8. Edge-intersection, 1st gold standard (3/3) Filtering edges including the conjunction of tags. Adequate tag bias, slightly restrictive.
  9. 9. Node-intersection, 2nd gold standard (1/2) Filtering edges including the disjunction of tags to rank. Plus ltering conjuntion of nodes involved in every tag edge after ranking. Adequate tag bias, slightly irrestrictive, possibly one tag prevails over the other. c
  10. 10. Node-intersection, 2nd gold standard (2/2) Filtering edges including the disjunction of tags to rank. Plus ltering conjuntion of nodes involved in every tag edge after ranking. Adequate tag bias, slightly irrestrictive, possibly one tag prevails over the other.
  11. 11. The Scalability Problem The previous two algorithms dont scale for online queries. Another possibility is computing singleton facets oine, and later merge the results online. Oine time and spatial complexity will grow linearly on #edges × #tags per edge. Scaling nicely. 100000 YouTube Flickr 10000 1000 # edges 100 10 1 0.1 1 10 100 1000 # tags
  12. 12. Singleton facets, computed oine (1/2) Singleton facet subgraphs used in ranking, after that only best K users stored, where K is small.
  13. 13. Singleton facets, computed oine (2/2) Singleton facet subgraphs used in ranking, after that only best K users stored, where K is small.
  14. 14. Probability-product Inspired by the probability independence rule, multiply PageRank probability of single tags. sea portugal rank! A 0.09 0.02 0.0018 #6 B 0.14 0.04 0.0056 #4 C 0.14 × 0.40 = 0.0560 #2 D 0.38 0.39 0.1482 #1 E 0.14 0.07 0.0098 #3 F 0.09 0.05 0.0045 #5 Possible bias towards the heaviest tag, eclipsing the others.
  15. 15. Rank-sum Lowest accumulated ordinal/position sum gets the best ranks. sea portugal rank! A #3 #6 9 #5 B #2 #5 7 #4 C #2 + #2 = 4 #2 D #1 #1 2 #1 E #2 #3 5 #3 F #3 #4 7 #4 Avoids this kind of topic drift towards one of the tags.
  16. 16. Winners-intersection Top W (small) nodes per singleton facet are used to build a new small graph. W = 500 in experiments (W = 3 in example). sea portugal A #3 B #2 C #2 ∩ #2 = C D #1 #1 D E #2 #3 E F #3
  17. 17. Experiments, comp. with Edge-intersection, OSimdarker is better results
  18. 18. More experiments (ickr)
  19. 19. Conclusions Exist approximate and scalable methods for faceted ranking in collaborative tagging systems. Functional web prototype: Egg-O-Matic http://egg-o-matic.itba.edu.ar Loose Ends Using weighted graphs. Scientic cites dataset (real egos!). Industrial-sized dataset (10^7 instead of 10^5 edges)
  20. 20. Prototype (1/2)
  21. 21. Prototype (2/2, last slide, thanks!)

×