SlideShare a Scribd company logo
1 of 20
Download to read offline
Rank aggregation via nuclear norm minimization


                                                                                     David F. Gleich
                                                                                  Purdue University
                                                                                         @dgleich

                                                                                    Lek-Heng Lim
                                                                             University of Chicago

                                                                                         KDD2011
                                                                                      San Diego, CA



              Lek funded by NSF CAREER award (DMS-1057064); David funded by DOE John von Neumann and NSERC
David F. Gleich (Purdue)                          KDD 2011                                            1/20
Which is a better list of good DVDs?
Lord of the Rings 3: The Return of …         Lord of the Rings 3: The Return of …
Lord of the Rings 1: The Fellowship          Lord of the Rings 1: The Fellowship
Lord of the Rings 2: The Two Towers          Lord of the Rings 2: The Two Towers
Lost: Season 1                               Star Wars V: Empire Strikes Back
Battlestar Galactica: Season 1               Raiders of the Lost Ark
Fullmetal Alchemist                          Star Wars IV: A New Hope
Trailer Park Boys: Season 4                  Shawshank Redemption
Trailer Park Boys: Season 3                  Star Wars VI: Return of the Jedi
Tenchi Muyo!                                 Lord of the Rings 3: Bonus DVD
Shawshank Redemption                         The Godfather


                  Standard                             Nuclear Norm
               rank aggregation                    based rank aggregation
                    (the mean rating)               (not matrix completion on the
                                                        netflix rating matrix)
David F. Gleich (Purdue)                KDD 2011                                    2/20
Rank Aggregation

Given partial orders on subsets of items, rank aggregation is
  the problem of finding an overall ordering.

Voting Find the winning candidate

Program committees Find the best papers given reviews

Dining Find the best restaurant in San Diego (subject to a budget?)




David F. Gleich (Purdue)      KDD 2011                          3/20
Ranking is really hard
                            John Kemeny               Dwork, Kumar, Naor,
      Ken Arrow                                                 Sivikumar




All rank aggregations
involve some measure       A good ranking is the
of compromise              “average” ranking under   NP hard to compute
                           a permutation distance    Kemeny’s ranking


David F. Gleich (Purdue)          KDD 2011                             4/20
Embody chair
                                       John Cantrell (flickr)


    Given a hard problem,
    what do you do?

    Numerically relax!

    It’ll probably be easier.



David F. Gleich (Purdue)    KDD 2011                            5/20
Suppose we had scores
Let   be the score of the ith movie/song/paper/team to rank

Suppose we can compare the ith to jth:

                                

Then                               is skew-symmetric, rank 2.

Also works for                 with an extra log.

              Numerical ranking is intimately intertwined
              with skew-symmetric matrices
                                     Kemeny and Snell, Mathematical Models in Social Sciences (1978)
David F. Gleich (Purdue)           KDD 2011                                                     6/20
Using ratings as comparisons




Ratings induce various skew-symmetric matrices.

                                                            Arithmetic Mean
                            

                                                            Log-odds
                                
                                              David 1988 – The Method of Paired Comparisons
David F. Gleich (Purdue)           KDD 2011                                            7/20
Extracting the scores
Given   with all entries, then                                107

            is the Borda




                                                Movie Pairs
                                                              105
  count, the least-squares
  solution to  

How many                   do we have?                        101
  Most.
                                                                      101           105
                                                                    Number of Comparisons
Do we trust all              ?
  Not really.                                                       Netflix data 17k movies,
                                                                    500k users, 100M ratings–
                                                                    99.17% filled



David F. Gleich (Purdue)                 KDD 2011                                               8/20
Only partial info? Complete it!
Let             be known for                    We trust these scores.

Goal Find the simplest skew-symmetric matrix that matches
  the data  



                            
       noiseless




        noisy
                       
                               Both of these are NP-hard too.
David F. Gleich (Purdue)                  KDD 2011                       9/20
Solution Go Nuclear




             From a French nuclear test in 1970, image from http://picdit.wordpress.com/2008/07/21/8-insane-nuclear-explosions/
David F. Gleich (Purdue)                                   KDD 2011                                                         10
The nuclear norm
    The analog of the 1-norm or      for matrices.

For vectors                               For matrices

                                          Let              be the SVD.

is NP-hard while                           

 

is convex and gives the same                        best convex under-
   answer “under appropriate                   estimator of rank on unit ball.
   circumstances”




David F. Gleich (Purdue)            KDD 2011                               11/20
Only partial info? Complete it!
Let             be known for           We trust these scores.

Goal Find the simplest skew-symmetric matrix that matches
  the data  



                            
              NP hard




                                                                Heuristic
                            
                Convex




David F. Gleich (Purdue)         KDD 2011                        12/20
Solving the nuclear norm problem
Use a LASSO formulation             1.  
                                    2. REPEAT
                                    3.           = rank-k SVD of
                                          

                                    4.
                                    5.  
                                          

                                    6. UNTIL  
Jain et al. propose SVP for
   this problem without
    



                                                       Jain et al. NIPS 2010
David F. Gleich (Purdue)      KDD 2011                                13/20
Skew-symmetric SVDs
Let         be an                skew-symmetric matrix with
  eigenvalues                                      ,
  where                          and       . Then the SVD of   is
  given by




                            
for   and   given in the proof.
Proof Use the Murnaghan-Wintner form and the SVD of a
   2x2 skew-symmetric block

                 This means that SVP will give us the skew-
                 symmetric constraint “for free”
David F. Gleich (Purdue)            KDD 2011                   14/20
Exact recovery results
David Gross showed how to recover Hermitian matrices.
  i.e. the conditions under which we get the exact  

Note that                  is Hermitian. Thus our new result!




                                       
                                                                Gross arXiv 2010.
David F. Gleich (Purdue)                  KDD 2011                          15/20
Recovery Discussion and Experiments
Confession If              , then just look at differences from
     a connected set. Constants? Not very good.
                               Intuition for the truth.
                                               




David F. Gleich (Purdue)        KDD 2011                      16/20
The Ranking Algorithm
0. INPUT   (ratings data) and c
   (for trust on comparisons)
1. Compute   from  
2. Discard entries with fewer than
   c comparisons
3. Set      to be indices and
   values of what’s left
4.                         = SVP(    )
5. OUTPUT  




David F. Gleich (Purdue)                 KDD 2011   17/20
Synthetic Results
Construct an Item Response Theory model. Vary number of
  ratings per user and a noise/error level
                           Our Algorithm!              The Average Rating




David F. Gleich (Purdue)                    KDD 2011                        18/20
Conclusions and Future Work

“aggregate, then complete”         1. Compare against others

Rank aggregation with              2. Noisy recovery! More
the nuclear norm is                   realistic sampling.
  principled
  easy to compute                  3. Skew-symmetric Lanczos
The results are much better           based SVD?
than simple approaches.




David F. Gleich (Purdue)      KDD 2011                       19/20
Google nuclear ranking gleich

More Related Content

More from David Gleich

Correlation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networksCorrelation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networksDavid Gleich
 
Spectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresSpectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresDavid Gleich
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networksDavid Gleich
 
Spacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisSpacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisDavid Gleich
 
Non-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansNon-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansDavid Gleich
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningDavid Gleich
 
Spacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsSpacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsDavid Gleich
 
Localized methods in graph mining
Localized methods in graph miningLocalized methods in graph mining
Localized methods in graph miningDavid Gleich
 
PageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresPageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresDavid Gleich
 
Iterative methods with special structures
Iterative methods with special structuresIterative methods with special structures
Iterative methods with special structuresDavid Gleich
 
Big data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsBig data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsDavid Gleich
 
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...David Gleich
 
Localized methods for diffusions in large graphs
Localized methods for diffusions in large graphsLocalized methods for diffusions in large graphs
Localized methods for diffusions in large graphsDavid Gleich
 
Anti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutAnti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutDavid Gleich
 
Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential David Gleich
 
Fast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and moreFast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and moreDavid Gleich
 
Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...David Gleich
 
MapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsMapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsDavid Gleich
 
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLDavid Gleich
 
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksDavid Gleich
 

More from David Gleich (20)

Correlation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networksCorrelation clustering and community detection in graphs and networks
Correlation clustering and community detection in graphs and networks
 
Spectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structuresSpectral clustering with motifs and higher-order structures
Spectral clustering with motifs and higher-order structures
 
Higher-order organization of complex networks
Higher-order organization of complex networksHigher-order organization of complex networks
Higher-order organization of complex networks
 
Spacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysisSpacey random walks and higher-order data analysis
Spacey random walks and higher-order data analysis
 
Non-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-meansNon-exhaustive, Overlapping K-means
Non-exhaustive, Overlapping K-means
 
Using Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based LearningUsing Local Spectral Methods to Robustify Graph-Based Learning
Using Local Spectral Methods to Robustify Graph-Based Learning
 
Spacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chainsSpacey random walks and higher order Markov chains
Spacey random walks and higher order Markov chains
 
Localized methods in graph mining
Localized methods in graph miningLocalized methods in graph mining
Localized methods in graph mining
 
PageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structuresPageRank Centrality of dynamic graph structures
PageRank Centrality of dynamic graph structures
 
Iterative methods with special structures
Iterative methods with special structuresIterative methods with special structures
Iterative methods with special structures
 
Big data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphsBig data matrix factorizations and Overlapping community detection in graphs
Big data matrix factorizations and Overlapping community detection in graphs
 
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
Anti-differentiating approximation algorithms: A case study with min-cuts, sp...
 
Localized methods for diffusions in large graphs
Localized methods for diffusions in large graphsLocalized methods for diffusions in large graphs
Localized methods for diffusions in large graphs
 
Anti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCutAnti-differentiating Approximation Algorithms: PageRank and MinCut
Anti-differentiating Approximation Algorithms: PageRank and MinCut
 
Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential Fast relaxation methods for the matrix exponential
Fast relaxation methods for the matrix exponential
 
Fast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and moreFast matrix primitives for ranking, link-prediction and more
Fast matrix primitives for ranking, link-prediction and more
 
Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...Gaps between the theory and practice of large-scale matrix-based network comp...
Gaps between the theory and practice of large-scale matrix-based network comp...
 
MapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applicationsMapReduce Tall-and-skinny QR and applications
MapReduce Tall-and-skinny QR and applications
 
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQL
 
Relaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networksRelaxation methods for the matrix exponential on large networks
Relaxation methods for the matrix exponential on large networks
 

Recently uploaded

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Recently uploaded (20)

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Rank aggregation via nuclear norm minimization

  • 1. Rank aggregation via nuclear norm minimization David F. Gleich Purdue University @dgleich Lek-Heng Lim University of Chicago KDD2011 San Diego, CA Lek funded by NSF CAREER award (DMS-1057064); David funded by DOE John von Neumann and NSERC David F. Gleich (Purdue) KDD 2011 1/20
  • 2. Which is a better list of good DVDs? Lord of the Rings 3: The Return of … Lord of the Rings 3: The Return of … Lord of the Rings 1: The Fellowship Lord of the Rings 1: The Fellowship Lord of the Rings 2: The Two Towers Lord of the Rings 2: The Two Towers Lost: Season 1 Star Wars V: Empire Strikes Back Battlestar Galactica: Season 1 Raiders of the Lost Ark Fullmetal Alchemist Star Wars IV: A New Hope Trailer Park Boys: Season 4 Shawshank Redemption Trailer Park Boys: Season 3 Star Wars VI: Return of the Jedi Tenchi Muyo! Lord of the Rings 3: Bonus DVD Shawshank Redemption The Godfather Standard Nuclear Norm rank aggregation based rank aggregation (the mean rating) (not matrix completion on the netflix rating matrix) David F. Gleich (Purdue) KDD 2011 2/20
  • 3. Rank Aggregation Given partial orders on subsets of items, rank aggregation is the problem of finding an overall ordering. Voting Find the winning candidate Program committees Find the best papers given reviews Dining Find the best restaurant in San Diego (subject to a budget?) David F. Gleich (Purdue) KDD 2011 3/20
  • 4. Ranking is really hard John Kemeny Dwork, Kumar, Naor, Ken Arrow Sivikumar All rank aggregations involve some measure A good ranking is the of compromise “average” ranking under NP hard to compute a permutation distance Kemeny’s ranking David F. Gleich (Purdue) KDD 2011 4/20
  • 5. Embody chair John Cantrell (flickr) Given a hard problem, what do you do? Numerically relax! It’ll probably be easier. David F. Gleich (Purdue) KDD 2011 5/20
  • 6. Suppose we had scores Let   be the score of the ith movie/song/paper/team to rank Suppose we can compare the ith to jth:   Then   is skew-symmetric, rank 2. Also works for   with an extra log. Numerical ranking is intimately intertwined with skew-symmetric matrices Kemeny and Snell, Mathematical Models in Social Sciences (1978) David F. Gleich (Purdue) KDD 2011 6/20
  • 7. Using ratings as comparisons Ratings induce various skew-symmetric matrices. Arithmetic Mean   Log-odds   David 1988 – The Method of Paired Comparisons David F. Gleich (Purdue) KDD 2011 7/20
  • 8. Extracting the scores Given   with all entries, then 107   is the Borda Movie Pairs 105 count, the least-squares solution to   How many   do we have? 101 Most. 101 105 Number of Comparisons Do we trust all   ? Not really. Netflix data 17k movies, 500k users, 100M ratings– 99.17% filled David F. Gleich (Purdue) KDD 2011 8/20
  • 9. Only partial info? Complete it! Let   be known for   We trust these scores. Goal Find the simplest skew-symmetric matrix that matches the data     noiseless noisy   Both of these are NP-hard too. David F. Gleich (Purdue) KDD 2011 9/20
  • 10. Solution Go Nuclear From a French nuclear test in 1970, image from http://picdit.wordpress.com/2008/07/21/8-insane-nuclear-explosions/ David F. Gleich (Purdue) KDD 2011 10
  • 11. The nuclear norm The analog of the 1-norm or   for matrices. For vectors For matrices Let   be the SVD. is NP-hard while     is convex and gives the same   best convex under- answer “under appropriate estimator of rank on unit ball. circumstances” David F. Gleich (Purdue) KDD 2011 11/20
  • 12. Only partial info? Complete it! Let   be known for   We trust these scores. Goal Find the simplest skew-symmetric matrix that matches the data     NP hard Heuristic   Convex David F. Gleich (Purdue) KDD 2011 12/20
  • 13. Solving the nuclear norm problem Use a LASSO formulation 1.   2. REPEAT   3.   = rank-k SVD of     4. 5.     6. UNTIL   Jain et al. propose SVP for this problem without   Jain et al. NIPS 2010 David F. Gleich (Purdue) KDD 2011 13/20
  • 14. Skew-symmetric SVDs Let   be an   skew-symmetric matrix with eigenvalues   , where   and   . Then the SVD of   is given by   for   and   given in the proof. Proof Use the Murnaghan-Wintner form and the SVD of a 2x2 skew-symmetric block This means that SVP will give us the skew- symmetric constraint “for free” David F. Gleich (Purdue) KDD 2011 14/20
  • 15. Exact recovery results David Gross showed how to recover Hermitian matrices. i.e. the conditions under which we get the exact   Note that   is Hermitian. Thus our new result!   Gross arXiv 2010. David F. Gleich (Purdue) KDD 2011 15/20
  • 16. Recovery Discussion and Experiments Confession If   , then just look at differences from a connected set. Constants? Not very good.   Intuition for the truth.     David F. Gleich (Purdue) KDD 2011 16/20
  • 17. The Ranking Algorithm 0. INPUT   (ratings data) and c (for trust on comparisons) 1. Compute   from   2. Discard entries with fewer than c comparisons 3. Set   to be indices and values of what’s left 4.   = SVP(  ) 5. OUTPUT   David F. Gleich (Purdue) KDD 2011 17/20
  • 18. Synthetic Results Construct an Item Response Theory model. Vary number of ratings per user and a noise/error level Our Algorithm! The Average Rating David F. Gleich (Purdue) KDD 2011 18/20
  • 19. Conclusions and Future Work “aggregate, then complete” 1. Compare against others Rank aggregation with 2. Noisy recovery! More the nuclear norm is realistic sampling. principled easy to compute 3. Skew-symmetric Lanczos The results are much better based SVD? than simple approaches. David F. Gleich (Purdue) KDD 2011 19/20