0
Rank aggregation via nuclear norm minimization                                                                            ...
Which is a better list of good DVDs?Lord of the Rings 3: The Return of …       Lord of the Rings 3: The Return of …Lord of...
Ranking is really hard                            John Kemeny               Dwork, Kumar, Naor,      Ken Arrow            ...
Embody chair                                            John Cantrell (flickr)     Given a hard problem,     what do you d...
Suppose we had scoresLet   be the score of the ith movie/song/paper/team to rankSuppose we can compare the ith to jth:    ...
Using ratings instead of comparisonsRatings induce various skew-symmetric matrices.                                       ...
Extracting the scoresGiven   with all entries, then                               107            is the Borda             ...
Only partial info? Complete it!Let              be known for                     We trust these scores.Goal Find the simpl...
Solving the nuclear norm problemUse a LASSO formulation                1.                                         2. REPEA...
Skew-symmetric SVDsLet         be an                skew-symmetric matrix with  eigenvalues                               ...
Exact recovery resultsDavid Gross showed how to recover Hermitian matrices.  i.e. the conditions under which we get the ex...
Recovery Discussion and ExperimentsIf                  , then just look at differences from a       connected set. Constan...
The Ranking Algorithm0. INPUT   (ratings data) and c   (for trust on comparisons)1. Compute   from  2. Discard entries wit...
Synthetic ResultsConstruct an Item Response Theory model. Vary number of  ratings per user and a noise/error level        ...
Conclusions and Future Work                                    1. Compare against othersRank aggregation with             ...
Google nuclear ranking gleich          To appear KDD2011
Upcoming SlideShare
Loading in...5
×

Rank aggregation via skew-symmetric matrix completion

772

Published on

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
772
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
9
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Transcript of "Rank aggregation via skew-symmetric matrix completion"

  1. 1. Rank aggregation via nuclear norm minimization David F. Gleich Sandia National Laboratories Lek-Heng Lim University of Chicago Householder Symposium 18 Tahoe City, (Barely) CASandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.David F. Gleich (Sandia) Householder 18 1/15
  2. 2. Which is a better list of good DVDs?Lord of the Rings 3: The Return of … Lord of the Rings 3: The Return of …Lord of the Rings 1: The Fellowship Lord of the Rings 1: The FellowshipLord of the Rings 2: The Two Towers Lord of the Rings 2: The Two TowersLost: Season 1 Star Wars V: Empire Strikes BackBattlestar Galactica: Season 1 Raiders of the Lost ArkFullmetal Alchemist Star Wars IV: A New HopeTrailer Park Boys: Season 4 Shawshank RedemptionTrailer Park Boys: Season 3 Star Wars VI: Return of the JediTenchi Muyo! Lord of the Rings 3: Bonus DVDShawshank Redemption The Godfather Standard Nuclear Norm rank aggregation based rank aggregation (the mean rating)David F. Gleich (Sandia) Householder 18 2/15
  3. 3. Ranking is really hard John Kemeny Dwork, Kumar, Naor, Ken Arrow SivikumarAll rank aggregationsinvolve some measure A good ranking is theof compromise “average” ranking under NP hard to compute a permutation distance Kemeny’s rankingDavid F. Gleich (Sandia) Householder 18 3/15
  4. 4. Embody chair John Cantrell (flickr) Given a hard problem, what do you do? Numerically relax! It’ll probably be easier.David F. Gleich (Sandia) Householder 18 4 of 15
  5. 5. Suppose we had scoresLet   be the score of the ith movie/song/paper/team to rankSuppose we can compare the ith to jth:  Then   is skew-symmetric, rank 2.Also works for   with an extra log. Numerical ranking is intimately intertwined with skew-symmetric matrices Kemeny and Snell, Mathematical Models in Social Sciences (1978)David F. Gleich (Sandia) Householder 18 5/15
  6. 6. Using ratings instead of comparisonsRatings induce various skew-symmetric matrices. Arithmetic Mean   Log-odds   David 1988 – The Method of Paired ComparisonsDavid F. Gleich (Sandia) Householder 18 6/15
  7. 7. Extracting the scoresGiven   with all entries, then 107   is the Borda Movie Pairs 105 count, the least-squares solution to  How many   do we have? 101 Most. 101 105 Number of ComparisonsDo we trust all   ? Not really. Netflix data 17k movies, 500k users, 100M ratings– 99.17% filledDavid F. Gleich (Sandia) Householder 18 7/15
  8. 8. Only partial info? Complete it!Let   be known for   We trust these scores.Goal Find the simplest skew-symmetric matrix that matches the data     NP hard Heuristic   Convex   best convex underestimator of rank on unit ball.David F. Gleich (Sandia) Householder 18 8/15
  9. 9. Solving the nuclear norm problemUse a LASSO formulation 1.   2. REPEAT  3.   = rank-k SVD of    4. 5.     6. UNTIL  Jain et al. propose SVP for this problem without   Jain et al. NIPS 2010David F. Gleich (Sandia) Householder 18 9/15
  10. 10. Skew-symmetric SVDsLet   be an   skew-symmetric matrix with eigenvalues   , where   and   . Then the SVD of   is given by  for   and   given in the proof.Proof Use the Murnaghan-Wintner form and the SVD of a 2x2 skew-symmetric block This means that SVP will give us the skew- symmetric constraint “for free”David F. Gleich (Sandia) Householder 18 10/15
  11. 11. Exact recovery resultsDavid Gross showed how to recover Hermitian matrices. i.e. the conditions under which we get the exact  Note that   is Hermitian. Thus our new result!   Gross arXiv 2010.David F. Gleich (Sandia) Householder 18 11/15
  12. 12. Recovery Discussion and ExperimentsIf   , then just look at differences from a connected set. Constants? Not very good.   Intuition for the truth.    David F. Gleich (Sandia) Householder 18 12/15
  13. 13. The Ranking Algorithm0. INPUT   (ratings data) and c (for trust on comparisons)1. Compute   from  2. Discard entries with fewer than c comparisons3. Set   to be indices and values of what’s left4.   = SVP(  )5. OUTPUT  David F. Gleich (Sandia) Householder 18 13/15
  14. 14. Synthetic ResultsConstruct an Item Response Theory model. Vary number of ratings per user and a noise/error level Our Algorithm! The Average RatingDavid F. Gleich (Sandia) Householder 18 14/15
  15. 15. Conclusions and Future Work 1. Compare against othersRank aggregation with van Dooren (fitting)the nuclear norm is: Massey (direct least principled squares for   ) easy to computeThe results are much better 2. Noisy recovery! Morethan simple approaches. realistic sampling. 3. Skew-symmetric Lanczos based SVD?David F. Gleich (Sandia) Householder 18 15/15
  16. 16. Google nuclear ranking gleich To appear KDD2011
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×