Faceted Ranking In Collaborative Tagging Systems

J. I. Orlicki (1,2), P. Fierens (2), J. I. Alvarez-Hamelin (2,3)

(1) Core Security Technologies
(2) ITBA
(3) CONICET

WEBIST 2009, Lisbon, Portugal
The Problem (Faceted Reputation)
      Which Flickr photographers are the best regarding a facet,
      i.e. a tag set such as {sea, portugal}?
      Nodes are users/channels, edges are favorites, and tags are
      associated with the favorited content.
Single Ranking

      Basic approach: a single ranking, then filtering. Scales well.
      Everything is biased towards the richer nodes; tags don't
      influence the ranking.
      G is filtered out, but why is D ranked worse than A regarding
      {sea, portugal}? Is D better than C?
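The single-ranking baseline can be sketched as follows. This is a minimal illustration, not the authors' implementation: the edge list is hypothetical (the slide's graph is only available as an image), the ranking is a plain power-iteration PageRank, and the filter keeps users whose received favorites cover every tag of the facet, which is an assumption about how the filtering step works.

```python
from collections import defaultdict

# Hypothetical favorites: (fan, photographer, tags of the favorited content).
EDGES = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("C", "D", {"portugal"}),
    ("D", "C", {"portugal"}),
    ("B", "C", {"sea"}),
    ("E", "G", {"city"}),
    ("F", "G", {"city"}),
    ("A", "G", {"city"}),
]


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; tags on the edges are ignored."""
    nodes = sorted({n for e in edges for n in e[:2]})
    out = defaultdict(list)
    for src, dst, *_ in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        new = {n: (1 - damping + damping * dangling) / len(nodes) for n in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank


def single_ranking(edges, facet):
    """Rank once on the whole graph, then filter nodes by the facet."""
    scores = pagerank(edges)
    received = defaultdict(set)
    for _src, dst, tags in edges:
        received[dst] |= set(tags)
    facet = set(facet)
    kept = [(n, s) for n, s in scores.items() if facet <= received[n]]
    return sorted(kept, key=lambda pair: -pair[1])


result = single_ranking(EDGES, {"sea", "portugal"})
```

Note that the scores in `result` come from the whole graph, so favorites tagged `city` still influence the order of the users that survive the filter. That is the bias the slide points out.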
Edge-intersection, 1st gold standard

       Keep only the edges that include the conjunction of tags,
       i.e. every tag of the facet, and rank that subgraph.
       Adequate tag bias, slightly restrictive.
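The edge filter behind edge-intersection can be sketched in a few lines. The edge list and the function name are hypothetical; only the filtering step is shown, and the ranking would then run on the returned edges.

```python
# Hypothetical favorites: (fan, photographer, tags of the favorited content).
EDGES = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("C", "D", {"portugal"}),
    ("D", "C", {"portugal", "sea", "sunset"}),
    ("E", "G", {"city"}),
]


def edge_intersection(edges, facet):
    """Keep the edges whose tag set contains every tag of the facet."""
    facet = set(facet)
    return [(src, dst) for src, dst, tags in edges if facet <= set(tags)]


subgraph = edge_intersection(EDGES, {"sea", "portugal"})
```

Here `subgraph` holds only the A to D and D to C edges. B to D and C to D are dropped because each carries just one of the two tags, which shows why the slide calls the method slightly restrictive.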
Node-intersection, 2nd gold standard

       Rank using the edges that include the disjunction of tags,
       i.e. any tag of the facet.
       After ranking, keep only the nodes involved in edges for
       every tag (conjunction).
       Adequate tag bias, slightly permissive; one tag may prevail
       over the other.
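A rough sketch of node-intersection under stated assumptions: the edge list is hypothetical, "involved in edges for every tag" is read here as "receives at least one favorite for each tag", and in-degree is used as a stand-in centrality so the example stays short (the slides rank with PageRank).

```python
from collections import Counter, defaultdict

# Hypothetical favorites: (fan, photographer, tags of the favorited content).
EDGES = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("C", "D", {"portugal"}),
    ("D", "C", {"portugal"}),
    ("B", "C", {"sea"}),
    ("E", "B", {"sea"}),
    ("E", "G", {"city"}),
]


def in_degree(edges):
    """Stand-in centrality (assumption): number of received favorites."""
    return Counter(dst for _src, dst, *_ in edges)


def node_intersection(edges, facet, rank_fn=in_degree):
    facet = set(facet)
    # Step 1: subgraph of the edges carrying ANY tag of the facet.
    sub = [(s, d, tags) for s, d, tags in edges if facet & set(tags)]
    scores = rank_fn(sub)
    # Step 2: after ranking, keep nodes whose received edges cover ALL tags.
    received = defaultdict(set)
    for _src, dst, tags in sub:
        received[dst] |= set(tags)
    kept = [(n, s) for n, s in scores.items() if facet <= received[n]]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


result = node_intersection(EDGES, {"sea", "portugal"})
# result == [("D", 3), ("C", 2)]
```

C is ranked here although no single edge into C carries both tags, so this method is more permissive than edge-intersection. B is dropped because it only receives `sea` favorites.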
The Scalability Problem
       The previous two algorithms don't scale for online queries.
       Another possibility is to compute the singleton facets
       offline and merge the results online later.
       Offline time and space complexity grow linearly with
       #edges × #tags per edge, which scales nicely.


       [Figure: # edges (y-axis, log scale) versus # tags (x-axis,
       log scale) for YouTube and Flickr.]
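The linear growth can be checked by counting (edge, tag) pairs: each edge appears once in the singleton subgraph of every tag it carries, so the total size of all singleton subgraphs is the sum of the tag counts over the edges. A small sketch with a hypothetical edge list:

```python
from collections import Counter

# Hypothetical favorites: (fan, photographer, tags of the favorited content).
EDGES = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("C", "D", {"portugal"}),
    ("D", "C", {"portugal"}),
    ("B", "C", {"sea"}),
    ("E", "B", {"sea"}),
    ("E", "G", {"city"}),
]


def singleton_facet_sizes(edges):
    """Number of edges in each single-tag subgraph."""
    sizes = Counter()
    for _src, _dst, tags in edges:
        for tag in set(tags):
            sizes[tag] += 1
    return sizes


sizes = singleton_facet_sizes(EDGES)
total_pairs = sum(sizes.values())
# sizes == {"sea": 4, "portugal": 3, "city": 1}; total_pairs == 8
```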
Singleton facets, computed offline

      A subgraph is ranked for each singleton facet (single tag);
      afterwards only the best K users are stored, where K is small.
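The offline step might look like the sketch below. The edge list, the function names and the tie-breaking rule (alphabetical) are assumptions, and in-degree again stands in for the PageRank used in the slides; any function mapping an edge list to node scores can be passed as `rank_fn`.

```python
from collections import Counter, defaultdict

# Hypothetical favorites: (fan, photographer, tags of the favorited content).
EDGES = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("C", "D", {"portugal"}),
    ("D", "C", {"portugal"}),
    ("B", "C", {"sea"}),
    ("E", "B", {"sea"}),
    ("E", "G", {"city"}),
]


def in_degree(edges):
    """Stand-in centrality (assumption): number of received favorites."""
    return Counter(dst for _src, dst in edges)


def build_singleton_index(edges, k, rank_fn=in_degree):
    """For every tag, rank its subgraph and store only the best k users."""
    by_tag = defaultdict(list)
    for src, dst, tags in edges:
        for tag in set(tags):
            by_tag[tag].append((src, dst))
    index = {}
    for tag, sub in by_tag.items():
        scores = rank_fn(sub)
        # Highest score first; ties broken alphabetically for determinism.
        ordered = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
        index[tag] = ordered[:k]
    return index


index = build_singleton_index(EDGES, k=2)
# index["sea"] == [("D", 2), ("B", 1)]
# index["portugal"] == [("D", 2), ("C", 1)]
```

Only `index` needs to be available at query time; the online methods then merge the stored per-tag lists.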
Probability-product


       Inspired by the product rule for independent events: multiply
       the PageRank probabilities obtained for each single tag.

                 sea    ×   portugal   =   product    rank
           A     0.09       0.02           0.0018     #6
           B     0.14       0.04           0.0056     #4
           C     0.14       0.40           0.0560     #2
           D     0.38       0.39           0.1482     #1
           E     0.14       0.07           0.0098     #3
           F     0.09       0.05           0.0045     #5

       Possible bias towards the heaviest tag, eclipsing the others.
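The merge for this example can be reproduced with the per-tag values from the table. The function name is hypothetical, and restricting the result to users present in every per-tag list is an assumption about how missing users are handled.

```python
import math

# Per-tag PageRank probabilities taken from the table.
SEA = {"A": 0.09, "B": 0.14, "C": 0.14, "D": 0.38, "E": 0.14, "F": 0.09}
PORTUGAL = {"A": 0.02, "B": 0.04, "C": 0.40, "D": 0.39, "E": 0.07, "F": 0.05}


def probability_product(*per_tag_scores):
    """Multiply the per-tag scores of users present in every list."""
    common = set.intersection(*(set(s) for s in per_tag_scores))
    product = {n: math.prod(s[n] for s in per_tag_scores) for n in common}
    return sorted(product.items(), key=lambda pair: -pair[1])


ranking = probability_product(SEA, PORTUGAL)
order = [name for name, _ in ranking]
# order == ["D", "C", "E", "B", "F", "A"]
```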
Rank-sum
     The lowest accumulated sum of ordinal positions gets the best
     rank.

                  sea   +   portugal   =   sum    rank
             A    #3        #6             9      #5
             B    #2        #5             7      #4
             C    #2        #2             4      #2
             D    #1        #1             2      #1
             E    #2        #3             5      #3
             F    #3        #4             7      #4

     Avoids topic drift towards one of the tags.
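A short sketch of rank-sum using the per-tag positions from the table. The function name is hypothetical; tied sums share a position (dense ranking), which is how B and F both end up at #4 in the example.

```python
# Per-tag ordinal positions taken from the table.
SEA_POS = {"A": 3, "B": 2, "C": 2, "D": 1, "E": 2, "F": 3}
PORTUGAL_POS = {"A": 6, "B": 5, "C": 2, "D": 1, "E": 3, "F": 4}


def rank_sum(*per_tag_positions):
    """Return {user: (position sum, final rank)}; lower is better."""
    common = set.intersection(*(set(p) for p in per_tag_positions))
    totals = {n: sum(p[n] for p in per_tag_positions) for n in common}
    # Dense ranking: equal sums get the same final rank.
    distinct = sorted(set(totals.values()))
    place = {total: i + 1 for i, total in enumerate(distinct)}
    return {n: (totals[n], place[totals[n]]) for n in sorted(totals)}


merged = rank_sum(SEA_POS, PORTUGAL_POS)
# merged["D"] == (2, 1), merged["B"] == (7, 4), merged["F"] == (7, 4)
```

Because only positions are added, a tag with very concentrated scores cannot dominate the merged order the way it can when raw probabilities are multiplied.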
Winners-intersection

        The top W (small) nodes per singleton facet are used to
        build a new small graph.
        W = 500 in the experiments (W = 3 in the example).

             sea   ∩   portugal   =   winners
        A    #3        -
        B    #2        -
        C    #2        #2             C
        D    #1        #1             D
        E    #2        #3             E
        F    #3        -
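The intersection step of the example can be sketched as follows. The positions come from the tables in these slides, while the edge list and function names are hypothetical. A node counts as a winner when its position is at most W, so tied nodes are all kept, as in the sea column of the example. Ranking the resulting small graph is left out.

```python
# Per-tag ordinal positions from the example.
SEA_POS = {"A": 3, "B": 2, "C": 2, "D": 1, "E": 2, "F": 3}
PORTUGAL_POS = {"A": 6, "B": 5, "C": 2, "D": 1, "E": 3, "F": 4}

# Hypothetical favorites: (fan, photographer, tags of the favorited content).
EDGES = [
    ("A", "D", {"sea"}),
    ("C", "D", {"sea", "portugal"}),
    ("E", "D", {"portugal"}),
    ("D", "C", {"portugal"}),
    ("F", "B", {"sea"}),
]


def winners(positions, w):
    """Nodes whose position in one singleton facet is within the top w."""
    return {n for n, pos in positions.items() if pos <= w}


def winners_intersection(edges, per_tag_positions, w):
    """Intersect the per-tag winners and keep the edges among them."""
    keep = set.intersection(*(winners(p, w) for p in per_tag_positions))
    small = [(s, d, tags) for s, d, tags in edges if s in keep and d in keep]
    return keep, small


keep, small_graph = winners_intersection(EDGES, [SEA_POS, PORTUGAL_POS], w=3)
# keep == {"C", "D", "E"}
```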
Experiments: comparison with Edge-intersection, OSim
Darker means better results.
More experiments (Flickr)
Conclusions


      Approximate and scalable methods exist for faceted ranking in
      collaborative tagging systems.
      Functional web prototype: Egg-O-Matic
      http://egg-o-matic.itba.edu.ar

      Loose Ends
          Using weighted graphs.
          Scientific citations dataset (real egos!).
          Industrial-sized dataset (10^7 instead of 10^5 edges).
Prototype (1/2)
Prototype (2/2, last slide, thanks!)
