
Music Recommender Systems

Algorithms for recommender systems.

Published in: Technology

Music Recommender Systems
  超群.com
  fuchaoqun@gmail.com
  http://www.fuchaoqun.com

Who is Using Recommender Systems?

Recommender Systems
  • Summary: http://en.wikipedia.org/wiki/Recommender_system
  • Keywords: recommender systems, association rules, collaborative filtering, Slope One, SVD, KNN, ...
Algorithms
  • Association Rules
  • Slope One
  • SVD
  • ...
Association Rules
  • Support
  • Confidence
  • Algorithms: Apriori algorithm, FP-growth algorithm
  • http://en.wikipedia.org/wiki/Association_rule_learning
  • Demo: Python + Orange, http://www.fuchaoqun.com/2008/08/data-mining-with-python-orange-association_rule/
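As a rough illustration of the two measures named on the Association Rules slide, the sketch below uses their standard textbook definitions: support(X) is the fraction of transactions containing every item in X, and confidence(X → Y) is support(X ∪ Y) / support(X). The playlists and song names are made-up sample data, and the helper names `support` and `confidence` are hypothetical; they are not part of Orange or any other library.

```python
from typing import FrozenSet, List

# Hypothetical sample data: each transaction is the set of songs in one playlist.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"song_a", "song_b", "song_c"}),
    frozenset({"song_a", "song_b"}),
    frozenset({"song_a", "song_c"}),
    frozenset({"song_b", "song_c"}),
]


def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions=TRANSACTIONS):
    """Estimate of P(consequent | antecedent): support(X ∪ Y) / support(X)."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    return support(both, transactions) / base


if __name__ == "__main__":
    print(support({"song_a", "song_b"}))       # 2 of 4 playlists
    print(confidence({"song_a"}, {"song_b"}))  # 2 of the 3 playlists with song_a
```

Apriori and FP-growth exist to avoid evaluating these measures for every possible itemset; this sketch only shows what is being measured for a single rule.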
Slope One
  • By Daniel Lemire in 2005
  • http://www.daniel-lemire.com/fr/abstracts/SDM2005.html
  • Simpler Could Be Better
  • Weighted average
  • http://en.wikipedia.org/wiki/Slope_One
  • Implementations: http://taste.sourceforge.net/ (Java), http://code.google.com/p/openslopeone (PHP & MySQL)

Similarity
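To make the Slope One "weighted average" from the earlier slide concrete, here is a minimal sketch of the weighted Slope One predictor as it is usually described: for each pair of items, store the mean rating difference over users who rated both, then predict a missing rating as the average of (difference + known rating), weighted by the number of co-rating users. The ratings and the names `train_slope_one` and `predict` are illustrative assumptions, not taken from the deck or from the implementations it links to.

```python
from collections import defaultdict

# Hypothetical ratings: user -> {song: rating}.
RATINGS = {
    "john": {"song_a": 5, "song_b": 3, "song_c": 2},
    "mark": {"song_a": 3, "song_b": 4},
    "lucy": {"song_b": 2, "song_c": 5},
}


def train_slope_one(ratings):
    """Compute average rating differences between every pair of items.

    Returns (dev, count) where dev[j][i] is the mean of (r_j - r_i) over
    users who rated both items and count[j][i] is the number of such users.
    """
    diff_sum = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for j, r_j in user_ratings.items():
            for i, r_i in user_ratings.items():
                if i == j:
                    continue
                diff_sum[j][i] += r_j - r_i
                count[j][i] += 1
    dev = {
        j: {i: diff_sum[j][i] / count[j][i] for i in diff_sum[j]}
        for j in diff_sum
    }
    return dev, count


def predict(user_ratings, target, dev, count):
    """Weighted Slope One prediction of `target` for one user.

    Each rated item i votes (dev[target][i] + r_i), weighted by how many
    users co-rated target and i. Returns None when no co-rating exists.
    """
    numerator = 0.0
    denominator = 0
    for i, r_i in user_ratings.items():
        if i == target or i not in dev.get(target, {}):
            continue
        weight = count[target][i]
        numerator += (dev[target][i] + r_i) * weight
        denominator += weight
    return numerator / denominator if denominator else None


if __name__ == "__main__":
    dev, count = train_slope_one(RATINGS)
    print(predict(RATINGS["lucy"], "song_a", dev, count))
```

The pairwise table grows with the square of the number of items, which is presumably why the linked PHP & MySQL implementation keeps it in a database; that is an inference, not something the slides state.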
SVD
  (Image copied from here)

SVD in Image Compression
  Original, K=10, K=20
Process SVD
  1. Define the original user-item matrix R, of size m x n, which holds the ratings of m users on n items. r_ij is the rating of user u_i on item i_j.
  2. Preprocess R to eliminate all missing values.
  3. Compute the SVD of R to obtain matrices U, S and V, of size m x m, m x n and n x n respectively, related by R = U * S * V^T.
  4. Reduce dimensionality by keeping only the k largest diagonal entries of S, giving a k x k matrix S_k. Matrices U_k (m x k) and V_k^T (k x n) are truncated to match. The "reduced" user-item matrix is R' = U_k * S_k * V_k^T, and r'_ij is the rating of user u_i on item i_j in this reduced matrix.
  5. Compute sqrt(S_k), then form two products: U_k * sqrt(S_k)^T, which represents the m users, and sqrt(S_k) * V_k^T, which represents the n items, both in the k-dimensional feature space. The latter matrix, of size k x n, is of particular interest.
  6. Run KNN on the user matrix and the item matrix, or multiply them to get each user's predicted rating for every item.
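The process above can be sketched with NumPy roughly as follows. Several details are assumptions made for illustration: missing ratings are filled with the item mean (the slide does not say which imputation was used), cosine similarity stands in for the unspecified KNN distance, and the rating matrix and function names (`fill_missing`, `reduce_svd`, `nearest_items`) are hypothetical.

```python
import numpy as np


def fill_missing(ratings):
    """Replace NaN entries with the item (column) mean.

    One simple choice for the preprocessing step; other imputations work too.
    """
    filled = ratings.copy()
    col_means = np.nanmean(filled, axis=0)
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = col_means[cols]
    return filled


def reduce_svd(ratings, k):
    """Return (user_factors, item_factors) in the k-dimensional space.

    user_factors = U_k * sqrt(S_k)   has shape (m, k)
    item_factors = sqrt(S_k) * V_k^T has shape (k, n)
    Their product equals the reduced matrix R' = U_k * S_k * V_k^T.
    """
    u, s, vt = np.linalg.svd(ratings, full_matrices=False)
    root = np.sqrt(s[:k])
    user_factors = u[:, :k] * root
    item_factors = root[:, None] * vt[:k, :]
    return user_factors, item_factors


def nearest_items(item_factors, item, top_n=1):
    """Indices of the items most similar to `item` by cosine similarity."""
    cols = item_factors.T
    norms = np.linalg.norm(cols, axis=1)
    sims = cols @ cols[item] / (norms * norms[item] + 1e-12)
    sims[item] = -np.inf  # never return the query item itself
    return [int(i) for i in np.argsort(-sims)[:top_n]]


if __name__ == "__main__":
    nan = np.nan
    # Hypothetical 4 users x 4 items rating matrix; NaN marks a missing rating.
    R = np.array([
        [5.0, 4.0, 1.0, nan],
        [4.0, 5.0, nan, 1.0],
        [1.0, nan, 5.0, 4.0],
        [nan, 1.0, 4.0, 5.0],
    ])
    users, items = reduce_svd(fill_missing(R), k=2)
    predicted = users @ items  # reduced matrix R'
    print(np.round(predicted, 2))
    print(nearest_items(items, item=0))
```

A dense `np.linalg.svd` is fine for a toy matrix; real user-item data is large and sparse, which is where sparse solvers such as SVDLIBC come in.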
Demo
  • Which two people have the most similar tastes?
  • Which two seasons are the closest?
  • (from here)
SVD
  • SVD:
      • MATLAB
      • LAPACK, BLAS (Fortran)
      • numpy, scipy (Python)
      • SVDLIBC, Meschach (C)
      • http://en.wikipedia.org/wiki/Singular_value_decomposition
      • ...
  • KNN:
      • MATLAB
      • FLANN
      • ...
  • All-in-one solution:
      • DIVISI
      • ...

MAGIC DIVISI!
    #!/usr/bin/env python
    # coding=utf-8
    import divisi
    from divisi.cnet import *

    data = divisi.SparseLabeledTensor(ndim=2)
    # read some ratings into data, e.g.:
    # data[user_id, song_id] = 4
    svd_result = data.svd(k=128)
    # get songs that the user may like
    # predict_features(svd_result, user_id).top_items(100)
    # get similar songs
    # feature_similarity(svd_result, song_id).top_items(100)
    # get users that have similar tastes
    # concept_similarity(svd_result, user_id).top_items(100)
Music Recommender Systems
  • Data Collection
  • Data Cleaning
  • Data Preprocessing
  • Data Mining
  • Tracking & Optimization

Data Collection
  • User ratings
  • User collections
  • User listening logs
  • User view logs
  • ...

Data Cleaning
  • Missing data
  • Wrong data
  • Noisy data
  • Duplicate data
  • ...

Data Preprocessing
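The cleaning and preprocessing stages above might look something like this in practice. It is a sketch under stated assumptions: the `(user_id, song_id, rating)` record layout, the 1 to 5 rating scale, and the function names `clean` and `to_user_item` are all hypothetical, and noisy data (for example, ratings from bots) is not handled because detecting it depends on the service.

```python
from collections import defaultdict

MIN_RATING, MAX_RATING = 1, 5  # assumed rating scale


def clean(records):
    """Drop records with missing fields, out-of-range ratings, or duplicates.

    Each record is a (user_id, song_id, rating) tuple. When the same
    (user_id, song_id) pair appears more than once, the first one is kept.
    """
    seen = set()
    cleaned = []
    for user_id, song_id, rating in records:
        if user_id is None or song_id is None or rating is None:
            continue  # missing data
        if not MIN_RATING <= rating <= MAX_RATING:
            continue  # wrong data
        key = (user_id, song_id)
        if key in seen:
            continue  # duplicate data
        seen.add(key)
        cleaned.append((user_id, song_id, rating))
    return cleaned


def to_user_item(records):
    """Preprocess cleaned records into a user -> {song: rating} mapping."""
    matrix = defaultdict(dict)
    for user_id, song_id, rating in records:
        matrix[user_id][song_id] = rating
    return dict(matrix)


if __name__ == "__main__":
    raw = [
        ("u1", "s1", 5),
        ("u1", "s1", 5),    # duplicate
        ("u2", None, 3),    # missing song id
        ("u2", "s2", 9),    # rating outside the assumed scale
        ("u2", "s1", 4),
    ]
    print(to_user_item(clean(raw)))
```

The resulting mapping is the shape of input that a Slope One or SVD step would consume in the data mining stage.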
Data Mining

Tracking & Optimization
  • Recommended results
  • Users view and click what they like
  • Store users' clicks
  • Data mining
  • Better recommendations

That's it, thanks.
  Q&A
