SVD Applied to Collaborative Filtering
~ URUG 7-12-07 ~

Short summary and explanation of LSI (SVD) and how it can be applied to recommendation systems and the Netflix dataset in particular.

SVD and the Netflix Dataset

  1. SVD Applied to Collaborative Filtering ~ URUG 7-12-07 ~
  2-4. Recommendation System. Answers the question: What do I want next?!? Very consumer driven. Must provide good results or a user may not trust the system in the future.
  5. Collaborative Filtering. Base user recommendations on the user’s past history and the history of like-minded users. View the data as a product X user matrix, find a “neighborhood” of similar users for that user, and return the top-N recommendations.
  6. Early Approaches. Goldberg et al. (1992), Using collaborative filtering to weave an information tapestry; Konstan et al. (1997), Applying Collaborative Filtering to Usenet news. Use Pearson correlation or cosine similarity as a measure of similarity to form neighborhoods.
  7-10. Early CF Challenges. Sparsity - no correlation between users can be found, so reduced coverage occurs. Scalability - nearest-neighbor algorithms’ computation time grows with the number of products and users. Synonymy - different names for similar items are not matched.
  11-17. Dimensionality Reduction. Latent Semantic Indexing (LSI): an algorithm from the IR community (late 80s to early 90s). Addresses the problems of synonymy, polysemy, sparsity, and scalability for large datasets. Reduces the dimensionality of a dataset and captures the latent relationships. Easily maps to CF!
  18. Framing LSI for CF. Products X Users matrix instead of Terms X Documents. Netflix dataset: 480,189 users, 17,770 movies, only ~100 million ratings. A 17,770 X 480,189 matrix that is 99% sparse! About 8.5 billion potential ratings.
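
     At this scale the ratings cannot be stored densely. A minimal sketch of slide 18's products X users matrix in sparse form (the toy rating triples are hypothetical; scipy is assumed to be available):

        # Products X Users rating matrix, stored sparsely.
        # The toy triples below are made up; the real matrix is
        # 17,770 movies x 480,189 users with ~100M known ratings.
        import numpy as np
        from scipy.sparse import csr_matrix

        movie_ids = np.array([0, 0, 1, 2])          # row index per rating
        user_ids  = np.array([0, 2, 1, 2])          # column index per rating
        ratings   = np.array([5.0, 3.0, 4.0, 1.0])  # the ratings themselves

        A = csr_matrix((ratings, (movie_ids, user_ids)), shape=(17770, 480189))
        print(A.nnz, 'stored ratings out of', A.shape[0] * A.shape[1], 'cells')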
  19. SVD - the math behind LSI. Singular Value Decomposition: any M x N matrix A of rank r can be decomposed as A = UΣV^T, where U is an M x M orthogonal matrix, V is an N x N orthogonal matrix, and Σ is an M x N diagonal matrix whose first r diagonal entries are the nonzero singular values of A: σ_1 ≥ σ_2 ≥ ... ≥ σ_r > σ_(r+1) = ... = σ_n = 0.
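
     A quick numpy check of slide 19's decomposition on a small random matrix (a stand-in for a real rating matrix):

        # Verify A = U Sigma V^T for a small M x N matrix.
        import numpy as np

        M, N = 5, 3
        A = np.random.rand(M, N)
        U, s, Vt = np.linalg.svd(A, full_matrices=True)   # U: MxM, Vt: NxN
        Sigma = np.zeros((M, N))
        Sigma[:len(s), :len(s)] = np.diag(s)              # singular values on the diagonal
        assert np.allclose(A, U @ Sigma @ Vt)             # exact reconstruction
        print(s)                                          # sigma_1 >= sigma_2 >= ... >= 0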
  20. Related to eigenvalue decomposition (PCA). U is the orthonormal eigenspace of AA^T; it spans the “column space”, and its columns are the left singular vectors. V is the orthonormal eigenspace of A^TA; it spans the “row space”, and its columns are the right singular vectors. The singular values are the square roots of the eigenvalues.
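
     The eigenvalue connection on slide 20 is easy to verify numerically:

        # Singular values are the square roots of the eigenvalues of A^T A.
        import numpy as np

        A = np.random.rand(5, 3)
        s = np.linalg.svd(A, compute_uv=False)       # singular values, descending
        eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]  # eigenvalues, descending
        assert np.allclose(s ** 2, eigvals)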
  21. Reducing Dimensionality. A_k = U_k Σ_k V_k^T is the closest rank-k approximation to A: it minimizes the Frobenius norm ||A − A_k||_F over all rank-k matrices.
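
     Slide 21's truncation in numpy, with a check of the Eckart-Young property that the Frobenius error equals the root-sum-of-squares of the discarded singular values:

        # Rank-k approximation A_k = U_k Sigma_k V_k^T.
        import numpy as np

        A = np.random.rand(6, 4)
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        k = 2
        Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # keep top-k components
        err = np.linalg.norm(A - Ak, 'fro')          # ||A - A_k||_F
        assert np.isclose(err, np.sqrt(np.sum(s[k:] ** 2)))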
  22. Making Recommendations. Cosine similarity is a common way to find the neighborhood: cos(i, j) = (i · j) / (||i||_2 * ||j||_2). Recommendations are then based on that neighborhood and its users. Predictions for individual products can also be made with a simple dot product if the singular values are combined with the singular vectors: CP_prod = C_avg + U_k S_k^(1/2)(c) · S_k^(1/2) V_k^T(p).
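
     A sketch of the neighborhood step with cosine similarity (toy data; in practice the user vectors would live in the reduced k-dimensional space from the truncated SVD):

        # Find the most similar users to a target user by cosine similarity.
        import numpy as np

        def cosine(i, j):
            return (i @ j) / (np.linalg.norm(i) * np.linalg.norm(j))

        users = np.random.rand(100, 10)       # 100 users in a 10-d latent space
        target = users[0]
        sims = np.array([cosine(target, u) for u in users[1:]])
        neighborhood = np.argsort(sims)[::-1][:5] + 1   # 5 nearest (+1 skips the target)
        print(neighborhood)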
  23. Challenges with SVD. Scalability - once again, compute time grows with the number of users and products: O(m^3). Even computing the SVD in an offline stage (with predictions served in an online stage) is not possible for large datasets, so other methods are needed.
  24. Incremental SVD: u_k = u^T V_k Σ_k^(-1), which folds a new user vector u into the existing rank-k basis.
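
     A hedged sketch of slide 24's fold-in, assuming the original matrix was users X movies so that V_k's rows index movies (missing ratings in u would need to be imputed first; names are illustrative):

        # Project a new user's rating vector into an existing rank-k SVD basis.
        import numpy as np

        A = np.random.rand(50, 20)                    # existing users x movies
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        k = 5
        Vk = Vt[:k, :].T                              # movies x k
        Sk_inv = np.diag(1.0 / s[:k])                 # Sigma_k^-1

        u = np.random.rand(20)                        # new user's rating vector
        uk = u @ Vk @ Sk_inv                          # u_k = u^T V_k Sigma_k^-1
        print(uk.shape)                               # (k,)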
  25. Incremental SVD Results
  26. GHA for SVD. Gorrell (2006), GHA for Incremental SVD in NLP; based on Sanger’s (1989) GHA for eigendecomposition. Update rules:
      Δc_i^a = c_i^b · b (a − Σ_(j<i) (a · c_j^a) c_j^a)
      Δc_i^b = c_i^a · a (b − Σ_(j<i) (b · c_j^b) c_j^b)
  27. GHA extended by Funk:
      void train(int user, int movie, real rating)
      {
          // Error between the actual rating and the current prediction,
          // scaled by the learning rate.
          real err = lrate * (rating - predictRating(movie, user));
          // Cache the pre-update user value so the movie update does not
          // see the user value that was just modified.
          real uv = userValue[user];
          userValue[user] += err * movieValue[movie];
          movieValue[movie] += err * uv;
      }
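
     A toy Python rendering of the same loop, showing how train() would be driven over the known ratings (data and hyperparameters are illustrative, not Funk's exact setup):

        # Funk-style SGD over observed ratings, one latent feature.
        import numpy as np

        lrate = 0.001
        ratings = [(0, 1, 5.0), (1, 1, 3.0), (0, 0, 4.0)]   # (user, movie, rating)
        user_value  = np.full(2, 0.1)                       # one feature per user
        movie_value = np.full(2, 0.1)                       # one feature per movie

        for epoch in range(100):
            for user, movie, rating in ratings:
                err = lrate * (rating - user_value[user] * movie_value[movie])
                uv = user_value[user]                       # pre-update value
                user_value[user]  += err * movie_value[movie]
                movie_value[movie] += err * uv

        print(user_value[0] * movie_value[1])               # predicted rating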
  28. Netflix Results. Best RMSEs: 0.9283 and 0.9212, blended to get 0.9189, which is 3.42% better than Netflix’s own Cinematch baseline.
  29. Summary. SVD provides an elegant and automatic recommendation system that has the potential to scale. There are many different algorithms that compute or at least approximate the SVD and can be used in offline stages for websites that need CF. Every dataset is different and requires experimentation to get the best results.