Faceted Ranking In Collaborative Tagging Systems

   J. I. Orlicki (1,2), P. Fierens (2), J. I. Alvarez-Hamelin (2,3)
                     (1) Core Security Technologies
                     (2) ITBA
                     (3) CONICET

               WEBIST 2009, Lisbon, Portugal
The Problem (Faceted Reputation)
      Which Flickr photographers are the best regarding a facet, i.e.
      a tag set such as {sea, portugal}?
      Nodes are users/channels, edges are favorites, and tags are
      associated with the favorited content.
Single Ranking
      Basic approach: compute a single ranking, then filter. Scales well.
      Everything is biased towards the richer nodes; tags don't influence
      the ranking.
      G is filtered out, but why is D ranked worse than A regarding
      {sea, portugal}? Is D better than C?
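The single-ranking idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `favorites` list is a hypothetical toy graph (not the one drawn on the slide), the function names are made up for the example, and the filter assumes a node matches the facet when the content favorited from it carries every tag of the facet.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Nodes without out-links spread their rank evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical favorites: (fan, owner, tags of the favorited content).
favorites = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("C", "D", {"portugal"}),
    ("D", "G", {"cats"}),
    ("E", "G", {"cats"}),
    ("F", "G", {"cats"}),
    ("A", "G", {"cats"}),
]


def single_ranking(favorites, facet):
    """Rank the whole graph once, then drop nodes that do not match the facet."""
    scores = pagerank([(fan, owner) for fan, owner, _ in favorites])
    tags_of = {}
    for _, owner, tags in favorites:
        tags_of.setdefault(owner, set()).update(tags)
    # Assumed filter: the owner received every tag of the facet.
    keep = [n for n in scores if facet <= tags_of.get(n, set())]
    return sorted(keep, key=lambda n: scores[n], reverse=True)


print(single_ranking(favorites, {"sea", "portugal"}))  # ['D']
```

In this toy graph G collects the most favorites overall, so it tops the global ranking, yet it disappears once the facet filter is applied. The scores of the surviving nodes still come from the untagged graph, which is the bias the slide points out.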
Edge-intersection, 1st gold standard

       Rank using only the edges that carry the conjunction of the
       tags (every tag in the facet).
       Adequate tag bias, slightly restrictive.
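A minimal sketch of the edge filter, assuming each favorite is stored as a hypothetical `(fan, owner, tags)` triple; the ranking itself (for example PageRank) would then run on the returned subgraph.

```python
def edge_intersection(favorites, facet):
    """Keep the (fan, owner) edges whose tag set contains every tag of the facet."""
    return [(fan, owner) for fan, owner, tags in favorites if facet <= tags]


# Hypothetical toy data, not taken from the slides.
favorites = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("C", "E", {"portugal", "sea", "sunset"}),
    ("D", "C", {"portugal"}),
]

subgraph = edge_intersection(favorites, {"sea", "portugal"})
print(subgraph)  # [('A', 'D'), ('C', 'E')]
```

Edges tagged with only one of the two tags are discarded, which is why the method is slightly restrictive.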
Node-intersection, 2nd gold standard
       Rank using the edges that carry the disjunction of the tags
       (at least one tag in the facet).
       After ranking, keep only the nodes involved in an edge for
       every tag (conjunction over nodes).
       Adequate tag bias, slightly permissive; one tag may
       prevail over the other.
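The two steps can be sketched as below. Several things here are assumptions made for illustration: the `(fan, owner, tags)` triples are hypothetical toy data, "involved in every tag edge" is read as "received at least one edge per tag", and `in_degree` is a deliberately simple stand-in for PageRank so the example stays short.

```python
def in_degree(edges):
    """Stand-in ranker: score each node by its number of incoming edges."""
    scores = {n: 0 for edge in edges for n in edge}
    for _, dst in edges:
        scores[dst] += 1
    return scores


def node_intersection(favorites, facet, rank=in_degree):
    # Step 1: rank on edges carrying at least one tag of the facet.
    edges = [(fan, owner) for fan, owner, tags in favorites if facet & tags]
    scores = rank(edges)
    # Step 2: keep owners that received every tag of the facet,
    # possibly through different edges.
    seen = {}
    for _, owner, tags in favorites:
        seen.setdefault(owner, set()).update(tags & facet)
    keep = [n for n in scores if seen.get(n, set()) == facet]
    return sorted(keep, key=lambda n: scores[n], reverse=True)


# Hypothetical toy data.
favorites = [
    ("A", "D", {"sea"}),
    ("B", "D", {"portugal"}),
    ("C", "D", {"sea"}),
    ("A", "E", {"sea", "portugal"}),
    ("B", "F", {"sea"}),
    ("C", "G", {"cats"}),
]

print(node_intersection(favorites, {"sea", "portugal"}))  # ['D', 'E']
```

D is kept here although no single edge into D carries both tags, so an edge filter requiring both tags on the same edge would have dropped it.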
The Scalability Problem
       The previous two algorithms don't scale for online queries.
       Another possibility is to compute singleton facets offline and
       merge the results online.
       Offline time and space complexity grow linearly with
       #edges × #tags per edge, which scales nicely.


       [Figure: log-log plot of # edges (y axis) against # tags
       (x axis), with one series each for YouTube and Flickr.]
Singleton facets, computed offline

      One subgraph per single tag (singleton facet) is used for ranking;
      afterwards only the best K users are stored, where K is small.
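A minimal sketch of the offline step, assuming hypothetical `(fan, owner, tags)` triples and per-tag scores that some ranker has already produced. The helper names are made up for the example; the portugal scores are the ones shown on the Probability-product slide.

```python
import heapq
from collections import defaultdict


def singleton_subgraphs(favorites):
    """Group edges by tag: one subgraph per singleton facet."""
    by_tag = defaultdict(list)
    for fan, owner, tags in favorites:
        for tag in tags:
            by_tag[tag].append((fan, owner))
    return dict(by_tag)


def top_k(scores, k):
    """Keep only the k best (node, score) pairs of one singleton facet."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Hypothetical toy data.
favorites = [
    ("A", "D", {"sea", "portugal"}),
    ("B", "D", {"sea"}),
    ("C", "E", {"portugal"}),
]
subgraphs = singleton_subgraphs(favorites)
# Each edge is copied once per tag it carries: 2 + 1 + 1 = 4 stored edges.
print(sum(len(edges) for edges in subgraphs.values()))  # 4

portugal = {"A": 0.02, "B": 0.04, "C": 0.40, "D": 0.39, "E": 0.07, "F": 0.05}
print(top_k(portugal, 2))  # [('C', 0.4), ('D', 0.39)]
```

The total number of stored edges equals the sum of tags per edge, which is where the linear growth in #edges × #tags per edge comes from.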
Probability-product


       Inspired by the independence rule for probabilities: multiply
       the PageRank probabilities obtained for each single tag.

           node   sea    portugal   sea × portugal   rank
           A      0.09   0.02       0.0018           #6
           B      0.14   0.04       0.0056           #4
           C      0.14   0.40       0.0560           #2
           D      0.38   0.39       0.1482           #1
           E      0.14   0.07       0.0098           #3
           F      0.09   0.05       0.0045           #5

       Possible bias towards the heaviest tag, eclipsing the others.
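The table can be reproduced with a few lines of code. This is only a sketch: the function name is hypothetical, and it assumes a node must have a score for every tag of the facet to be ranked at all.

```python
import math

# Per-tag PageRank probabilities from the slide.
sea = {"A": 0.09, "B": 0.14, "C": 0.14, "D": 0.38, "E": 0.14, "F": 0.09}
portugal = {"A": 0.02, "B": 0.04, "C": 0.40, "D": 0.39, "E": 0.07, "F": 0.05}


def probability_product(*per_tag_scores):
    """Multiply per-tag scores and return nodes ordered from best to worst."""
    common = set.intersection(*(set(scores) for scores in per_tag_scores))
    product = {
        node: math.prod(scores[node] for scores in per_tag_scores)
        for node in common
    }
    return sorted(product, key=product.get, reverse=True)


print(probability_product(sea, portugal))  # ['D', 'C', 'E', 'B', 'F', 'A']
```

C climbs to second place almost entirely on the strength of its portugal score, which illustrates how one heavy tag can dominate the product.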
Rank-sum
     The lowest accumulated sum of ordinal positions gets the best rank.

         node   sea   portugal   sum   rank
         A      #3    #6         9     #5
         B      #2    #5         7     #4
         C      #2    #2         4     #2
         D      #1    #1         2     #1
         E      #2    #3         5     #3
         F      #3    #4         7     #4

     Avoids this kind of topic drift towards one of the tags.
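A sketch of the merge, taking the ordinal positions exactly as they appear in the table. The function name is hypothetical, and the tie handling (equal sums share a rank, as B and F do) is inferred from the table rather than stated on the slide.

```python
# Ordinal positions per tag, copied from the slide.
sea_pos = {"A": 3, "B": 2, "C": 2, "D": 1, "E": 2, "F": 3}
portugal_pos = {"A": 6, "B": 5, "C": 2, "D": 1, "E": 3, "F": 4}


def rank_sum(*positions):
    """Return {node: (position sum, final rank)}; the lowest sum ranks first."""
    common = set.intersection(*(set(p) for p in positions))
    totals = {node: sum(p[node] for p in positions) for node in common}
    # Equal sums share the same rank (dense ranking).
    distinct = sorted(set(totals.values()))
    return {node: (total, distinct.index(total) + 1) for node, total in totals.items()}


result = rank_sum(sea_pos, portugal_pos)
for node in sorted(result):
    print(node, result[node])
```

Because only positions are added, a very large score on one tag cannot outweigh a poor position on the other.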
Winners-intersection

        The top W (small) nodes per singleton facet are used to build a
        new small graph.
        W = 500 in the experiments (W = 3 in the example).

            node   sea   portugal   in both top-W lists
            A      #3    -          -
            B      #2    -          -
            C      #2    #2         C
            D      #1    #1         D
            E      #2    #3         E
            F      #3    -          -
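The selection step in the example can be sketched as below. It covers only the intersection of the per-tag winners; building and ranking the new small graph from those nodes is not shown. The helper names are hypothetical, and the portugal positions outside the top 3 are taken from the Rank-sum slide.

```python
def winners(positions, w):
    """Nodes whose ordinal position for one tag is within the top w."""
    return {node for node, pos in positions.items() if pos <= w}


def winners_intersection(w, *positions):
    """Nodes that are among the top w for every tag of the facet."""
    return set.intersection(*(winners(p, w) for p in positions))


sea_pos = {"A": 3, "B": 2, "C": 2, "D": 1, "E": 2, "F": 3}
portugal_pos = {"A": 6, "B": 5, "C": 2, "D": 1, "E": 3, "F": 4}

print(sorted(winners_intersection(3, sea_pos, portugal_pos)))  # ['C', 'D', 'E']
```

With W = 3 every node is a sea winner because of the ties, so the portugal list alone decides who survives.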
Experiments: comparison with Edge-intersection, OSim
      Darker means better results.
More experiments (Flickr)
Conclusions


      Approximate and scalable methods exist for faceted ranking in
      collaborative tagging systems.
      Functional web prototype: Egg-O-Matic

                   http://egg-o-matic.itba.edu.ar

      Loose Ends
          Using weighted graphs.
          Scientific citations dataset (real egos!).
          Industrial-sized dataset (10^7 instead of 10^5 edges).
Prototype (1/2)
Prototype (2/2, last slide, thanks!)
