Faceted Ranking In Collaborative Tagging Systems

   J. I. Orlicki (1,2), P. Fierens (2), J. I. Alvarez-Hamelin (2,3)
   (1) Core Security Technologies
   (2) ITBA
   (3) CONICET

   WEBIST 2009, Lisbon, Portugal
The Problem (Faceted Reputation)
      Which Flickr photographers are the best regarding a facet, i.e. a
      tag set such as {sea, portugal}?
      Nodes are users/channels, edges are favorites, and tags are
      associated with the favorited content.
Single Ranking

      Basic approach: compute a single global rank, then filter by tag.
      Scales well.
      Everything is biased towards the richer nodes; tags don't influence
      the ranking.
      G is filtered out, but why is D ranked worse than A regarding
      {sea, portugal}? Is D better than C?
Edge-intersection, 1st gold standard

       Keep only the edges that include the conjunction of the tags
       (every tag in the facet), then rank.
       Adequate tag bias, slightly restrictive.
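
A minimal sketch of Edge-intersection, under stated assumptions: edges are (src, dst, tags) triples meaning that src marked as favorite some content of dst carrying those tags, and the ranking is a plain power-iteration PageRank. The names pagerank and edge_intersection, the damping factor, the iteration count and the toy edges are illustrative and do not come from the slides.

```python
from collections import defaultdict

def pagerank(edges, damping=0.85, iters=50):
    """Plain power-iteration PageRank over (src, dst) pairs."""
    nodes = sorted({n for e in edges for n in e})
    if not nodes:
        return {}
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Mass held by nodes without out-links is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping + damping * dangling) / len(nodes)
        new = {n: base for n in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank

def edge_intersection(tagged_edges, facet):
    """Rank only on the edges that carry every tag of the facet."""
    facet = set(facet)
    kept = [(s, d) for s, d, tags in tagged_edges if facet <= set(tags)]
    return pagerank(kept)

# Hypothetical toy data, not the graph drawn on the slides.
edges = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea", "portugal"}),
    ("C", "D", {"sea", "portugal", "beach"}),
    ("D", "C", {"sea"}),   # dropped: lacks "portugal"
    ("E", "G", {"cats"}),  # dropped: unrelated tag
]
scores = edge_intersection(edges, {"sea", "portugal"})
best = max(scores, key=scores.get)
```

Because the filter runs before the ranking, every query facet needs its own PageRank computation.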
Node-intersection, 2nd gold standard

       Rank on the edges that include the disjunction of the tags
       (at least one tag of the facet).
       After ranking, keep only the conjunction of nodes: those involved
       in edges for every tag.
       Adequate tag bias, slightly permissive; one tag may prevail over
       the other.
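
The two filtering steps of Node-intersection can be sketched as follows. This is an illustration under assumptions: a node counts as "involved" with a tag if it is either endpoint of an edge carrying that tag, and the scorer is passed in as a parameter. To keep the sketch short the demo uses the number of incoming favorites as a stand-in scorer, which is NOT PageRank; all names and the toy edges are hypothetical.

```python
from collections import Counter, defaultdict

def node_intersection(tagged_edges, facet, rank_fn):
    facet = set(facet)
    # Step 1 (disjunction): keep edges carrying at least one facet tag.
    kept = [(s, d, set(t) & facet) for s, d, t in tagged_edges if facet & set(t)]
    # Step 2: rank on the filtered subgraph.
    scores = rank_fn([(s, d) for s, d, _ in kept])
    # Step 3 (conjunction over nodes): keep nodes seen with every facet tag.
    seen = defaultdict(set)
    for s, d, tags in kept:
        seen[s] |= tags
        seen[d] |= tags
    return {n: sc for n, sc in scores.items() if seen[n] == facet}

def in_degree(edges):
    """Stand-in scorer (not PageRank): number of incoming favorites."""
    nodes = {n for e in edges for n in e}
    counts = Counter(d for _, d in edges)
    return {n: counts[n] for n in nodes}

# Hypothetical toy data.
edges = [
    ("A", "D", {"sea"}),
    ("B", "D", {"portugal"}),
    ("C", "D", {"sea", "portugal"}),
    ("D", "C", {"sea"}),
    ("A", "F", {"sea"}),
    ("E", "G", {"cats"}),
]
result = node_intersection(edges, {"sea", "portugal"}, in_degree)
```

Here A, B and F take part in the ranking but are removed afterwards, since each of them appears with only one of the two tags.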
The Scalability Problem
       The previous two algorithms don't scale for online queries.
       Another possibility is to compute the singleton facets offline and
       merge the results online, at query time.
       Offline time and space complexity grow linearly in
       #edges × #tags per edge, which scales nicely.


       [Figure: log-log plot of # edges (vertical axis) against # tags
       (horizontal axis) for YouTube and Flickr.]
Singleton facets, computed offline

      Each singleton-facet subgraph is ranked on its own; after that only
      the best K users are stored, where K is small.
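
A sketch of the offline bookkeeping, assuming the same (src, dst, tags) edge triples. It builds one edge list per tag, which is where the #edges × #tags per edge cost comes from, and trims a per-tag score table to its K best users. The ranking step itself is left out; the function names and the toy edges are hypothetical, while the scores in the demo are the "portugal" column of the Probability-product slide.

```python
import heapq
from collections import defaultdict

def singleton_subgraphs(tagged_edges):
    """Map each tag to the list of (src, dst) edges that carry it."""
    by_tag = defaultdict(list)
    for src, dst, tags in tagged_edges:
        for tag in set(tags):
            by_tag[tag].append((src, dst))
    return dict(by_tag)

def top_k(scores, k):
    """Keep only the k best (user, score) pairs of one singleton facet."""
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])

# Hypothetical toy edges.
edges = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("C", "D", {"portugal", "beach"}),
]
subgraphs = singleton_subgraphs(edges)
# Each edge is stored once per tag it carries: 2 + 1 + 2 = 5 entries.
stored_edges = sum(len(v) for v in subgraphs.values())

portugal_scores = {"A": 0.02, "B": 0.04, "C": 0.40, "D": 0.39, "E": 0.07, "F": 0.05}
best = top_k(portugal_scores, 3)
```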
Probability-product

       Inspired by the independence rule for probabilities: multiply the
       PageRank probabilities obtained for the single tags.

           user   sea    ×   portugal   =   product   rank
           A      0.09       0.02           0.0018    #6
           B      0.14       0.04           0.0056    #4
           C      0.14       0.40           0.0560    #2
           D      0.38       0.39           0.1482    #1
           E      0.14       0.07           0.0098    #3
           F      0.09       0.05           0.0045    #5

       Possible bias towards the heaviest tag, eclipsing the others.
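
The table can be reproduced with a few lines. This is a sketch: the per-tag scores are the values from the slide, while the function name and the choice to drop users missing from any per-tag table are assumptions.

```python
import math

sea = {"A": 0.09, "B": 0.14, "C": 0.14, "D": 0.38, "E": 0.14, "F": 0.09}
portugal = {"A": 0.02, "B": 0.04, "C": 0.40, "D": 0.39, "E": 0.07, "F": 0.05}

def probability_product(*facets):
    """Multiply the per-tag scores of every user present in all tables."""
    users = set.intersection(*(set(f) for f in facets))
    return {u: math.prod(f[u] for f in facets) for u in users}

product = probability_product(sea, portugal)
ranking = sorted(product, key=product.get, reverse=True)
```

With these inputs the ranking comes out as D, C, E, B, F, A, matching the rank column of the table.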
Rank-sum

     The lowest accumulated sum of ordinal positions gets the best rank.

           user   sea    +   portugal   =   sum   rank
           A      #3         #6             9     #5
           B      #2         #5             7     #4
           C      #2         #2             4     #2
           D      #1         #1             2     #1
           E      #2         #3             5     #3
           F      #3         #4             7     #4

     Avoids the drift towards one of the tags that Probability-product
     can show.
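
A sketch of the merge, taking the ordinal positions exactly as listed on the slide. The function name is hypothetical, and giving tied sums the same final position (as B and F get in the table) is an assumption read off the example.

```python
sea = {"A": 3, "B": 2, "C": 2, "D": 1, "E": 2, "F": 3}
portugal = {"A": 6, "B": 5, "C": 2, "D": 1, "E": 3, "F": 4}

def rank_sum(*positions):
    """Add up per-tag positions; equal sums share the same final position."""
    users = set.intersection(*(set(p) for p in positions))
    totals = {u: sum(p[u] for p in positions) for u in users}
    # Distinct sums in increasing order receive positions 1, 2, 3, ...
    order = {t: i for i, t in enumerate(sorted(set(totals.values())), start=1)}
    return totals, {u: order[t] for u, t in totals.items()}

totals, final = rank_sum(sea, portugal)
```

Only positions are combined, so a tag with very peaked scores cannot dominate the result the way it can in a product of probabilities.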
Winners-intersection

        The top W nodes per singleton facet (W small) are used to build a
        new, small graph.
        W = 500 in the experiments (W = 3 in the example).

            user   sea    ∩   portugal   =   winners
            A      #3
            B      #2
            C      #2         #2             C
            D      #1         #1             D
            E      #2         #3             E
            F      #3
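
A sketch of the selection step with W = 3, using the positions from the example. It assumes that "top W" means position at most W, so tied users are all kept, and that the new small graph is the set of edges whose two endpoints are both winners. The slide does not spell out how that graph is then ranked, so the ranking is omitted; the function names and the toy edges are hypothetical.

```python
positions = {
    "sea": {"A": 3, "B": 2, "C": 2, "D": 1, "E": 2, "F": 3},
    "portugal": {"A": 6, "B": 5, "C": 2, "D": 1, "E": 3, "F": 4},
}

def winners(positions_by_tag, facet, w):
    """Users placed within the top w positions for every tag of the facet."""
    tops = [
        {u for u, pos in positions_by_tag[tag].items() if pos <= w}
        for tag in facet
    ]
    return set.intersection(*tops)

def induced_edges(tagged_edges, keep):
    """Edges of the small graph: both endpoints must be winners."""
    return [(s, d, t) for s, d, t in tagged_edges if s in keep and d in keep]

# Hypothetical toy edges.
edges = [
    ("C", "D", {"sea", "portugal"}),
    ("E", "D", {"portugal"}),
    ("A", "D", {"sea"}),
    ("D", "C", {"sea"}),
    ("B", "F", {"sea"}),
]
top = winners(positions, ["sea", "portugal"], w=3)
small_graph = induced_edges(edges, top)
```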
Experiments: comparison with Edge-intersection (OSim)
      Darker means better results.
More experiments (Flickr)
Conclusions

      Approximate and scalable methods exist for faceted ranking in
      collaborative tagging systems.
      Functional web prototype: Egg-O-Matic

                   http://egg-o-matic.itba.edu.ar

      Loose Ends
          Using weighted graphs.
          Scientific citations dataset (real egos!).
          Industrial-sized dataset (10^7 instead of 10^5 edges).
Prototype (1/2)
Prototype (2/2, last slide, thanks!)
